Short Course 2

Circuits and Systems for AI and Computing

Organizers: Koji Nii, TSMC Design Technology Japan, Inc.; Yan Li, Western Digital Corp.

Chairpersons: Tomohiro Nezuka, MIRISE Technologies Corp.; Vanessa Chen, Carnegie Mellon Univ.

Date & Time: Monday, June 9, 8:25 A.M.-5:25 P.M.

8:25 Opening
8:30 Hardware Accelerator Design for Generative AI (Tentative), Leland Chang, IBM

Abstract:
The advent of large language models and generative AI has ushered in enormous demand for hardware accelerators to perform AI training, fine-tuning, and inference. The design of such accelerators depends on holistic optimization of technology, circuits, and systems, but also fundamentally upon the models and use cases that this hardware needs to serve. Achieving the proper balance of compute vs. communication to optimize latency and throughput in AI workloads will require tradeoffs across the hardware/software stack to reconcile the long development cycles needed to build chips and systems with the torrid pace of innovation in AI models and algorithms. This talk will provide an overview of the landscape for AI hardware accelerators and discuss research roadmaps to improve both compute efficiency and communication bandwidth, particularly as Generative AI evolves towards Agentic AI and smaller, fit-for-purpose models.

9:20 Architecture Trends for AI Hardware Platforms, Norman James, AMD

Abstract:
AI has received a large amount of press in recent years, and the underlying AI hardware is an important part of that popularity. New AI models are continually released and push the limits of the hardware's capability. Thousands of software developers work solely on extracting as much performance from the hardware as possible so that AI platforms can do more at lower cost. Bandwidth per dollar and picojoules per bit are key metrics. To optimize these metrics, GPUs are ideally connected with passive copper, which drives denser packaging at all levels so that the GPUs remain in close proximity. The high-performance computing (HPC) segment has faced these issues before, but there are differences from the AI market segment. The promise of lower-power, lower-cost optics could change the density paradigms in the future. This presentation covers these trends toward higher density, many of the resulting complications, and gives a glimpse into the future of AI platforms.

10:10 Break
10:25 Modular chiplet approaches for scalable and efficient machine learning, Zhengya Zhang, University of Michigan

Abstract:
Machine learning models are rapidly increasing in size and complexity, surpassing the pace of accelerator chip upgrades. The development of monolithic chips to match these evolving models is both expensive and challenging. Alternatively, modular chiplets can be designed and reused to create multi-chip packages (MCPs) capable of addressing diverse models and tasks. The future success of chiplet technology hinges on advancements in chiplets that offer high utilization and flexibility, efficient high-bandwidth die-to-die interfaces, and high-density packaging. In this presentation, I will introduce two MCPs resulting from our collaboration with Intel and the Institute of Microelectronics in Singapore. The first MCP, Arvon, utilizes Embedded Multi-die Interconnect Bridge (EMIB) to integrate an FPGA chiplet and two DSP chiplets. As a programmable MCP, Arvon can adapt to evolving workloads over time. The second MCP, NetFlex, integrates four neural network chiplets using high-density fan-out wafer level packaging (HD-FOWLP). NetFlex’s streamlined architecture enables scalability for larger configurations.

11:25 EDA for AI, AI enhanced EDA (Tentative), Igor Markov, Synopsys

Abstract:
TBA

12:05 Lunch
12:55 Connectivity Technologies to Accelerate AI, Tony Chan Carusone, University of Toronto / Alphawave Semi

Abstract:
The rapid scaling of AI is reshaping large-scale computing and communication hardware, driving new demands for wireline connectivity. Chiplet-based architectures are emerging as a key enabler, integrating logic, memory, and connectivity to reduce the cost and time-to-market of custom AI hardware optimized for specific workloads. These architectures depend on high-density die-to-die interfaces, which are evolving rapidly. At the same time, the increasing compute density within a package is accelerating demand for high-speed off-package connectivity. Scaling AI clusters at the datacentre and inter-datacentre level requires new organizational paradigms, with optical data transmission playing an expanding role in meeting these challenges.

13:45 3D Optical Engine Design Challenges and Opportunities, Frank Lee, TSMC

Abstract:
With the recent rapid advances in AI large language models, the demand for high-speed data communication links has increased dramatically. Conventional copper-based data links are reaching their limits. Optical interconnects, due to their high bandwidth and lower power consumption, have emerged as a promising solution for next-generation data links.
In this short course, we will introduce the operating principles of basic silicon photonic devices. We will then discuss design considerations for photonic transmitter modulators and receiver filters, along with their associated electronic driver and amplifier designs. High-speed electrical circuits, optical circuits and 3D packaging co-design will be illustrated as well.

14:35 Break
14:50 HBM for AI computing, Jinhyung Lee, SK Hynix

Abstract:
High bandwidth memory (HBM) has become a key enabler for AI computing, offering high bandwidth and low power consumption in a small form factor. Its 3D-stacked architecture with a wide memory interface significantly enhances performance for AI workloads. However, eliminating the risks inherent in the 3D stack structure requires many design techniques, such as DFT/BIST, redundancy, and power delivery for HBM.
In particular, HBM demands careful consideration during the design process because the environment in which it is tested differs from the environment in which it is actually used. These HBM design topics will be discussed.

15:40 Semiconductor Storage for Further Evolution of Generative AI, Jun Deguchi, KIOXIA Corp.

Abstract:
In accordance with Moore's Law, the advancement of semiconductors has been a driving force behind the evolution of AI, including the current boom in generative AI. The size of mainstream generative AI models has significantly increased to enhance their reasoning, interpretative capabilities, and memory capacity. This has subsequently led to a rising demand for semiconductor devices such as CPU, GPU, and DRAM to efficiently operate these massive AI models. However, if AI models continue to grow at this rate, the costs and power consumption required for training and inference will escalate dramatically. This indicates the need for a shift towards a different trajectory for the evolution of AI.
Under these circumstances, the primary role of semiconductor storage devices such as flash memory and SSDs has been to store the vast amounts of data necessary for AI model training. However, they have not directly contributed to resolving the aforementioned challenges. To address these issues, even before the generative AI boom, our company proposed the concept of "Memory-Centric AI," which separates and advances the memory and reasoning/interpretative functions of AI. By appropriately allocating the required semiconductors to each function, we aim to overcome these challenges.
In this talk, we will discuss the issues highlighted above, provide an overview of Memory-Centric AI, and introduce the new roles of semiconductor storage within this context. Additionally, we will outline our open-source software technology, "KIOXIA AiSAQ™ (All-in-Storage Approximate Nearest Neighbor Search with Product Quantization)," which we developed to promote the utilization of SSDs in Retrieval Augmented Generation (RAG), one of the implementation forms of Memory-Centric AI, which enhances generative AI's response accuracy and database scalability. Finally, we will also discuss the future prospects of Memory-Centric AI.

16:30 Advancements in Power Architectures for AI Computing, Ke-Horng Chen, National Yang Ming Chiao Tung University

Abstract:
The ever-increasing demand for efficient and high-quality AI computing power systems has accelerated the transition from centralized power supplies to distributed architectures. To address the challenges posed by high input voltages and large driving currents, a new Intermediate Bus design has gained significant traction. This architecture offers lower cost, superior power quality, and enhanced efficiency while leveraging the latest advancements in power components, particularly GaN HEMT devices. This talk provides a comprehensive overview of the historical evolution of high-reliability power systems, followed by an in-depth exploration of the benefits and design challenges of Intermediate Bus Converters (IBCs). A practical example will illustrate the control requirements for an IBC utilizing GaN devices. Additionally, the hybrid power system will be introduced as an effective strategy to reduce power distribution costs and enable point-of-load (POL) regulators to be placed directly adjacent to the corresponding load. This approach minimizes supply-plane parasitics and significantly improves high di/dt and dv/dt transient response. With high power density, cost efficiency, and robust driving capability, these solutions are designed to meet the stringent power demands of modern AI computing systems.