Friday, March 20, 2026 10:00am to 11:30am
About this Event
210 South Bouquet Street, Pittsburgh, PA 15260
Abstract: With the rise of large language models (LLMs), artificial intelligence (AI) is transforming the world at an unprecedented speed. At the same time, AI consumes a great deal of energy due to its high computational demands. As a remedy, a variety of narrower data formats have been proposed over the past decades. Examples include, but are not limited to, BF16, FP8, FP6, FP4, and microscaling (MX), many of which have been successfully commercialized in the latest generation of GPUs from major vendors. This trend has spurred much interesting work in the computer architecture community on designing new hardware that efficiently accelerates these narrow data formats.
This talk focuses on exploiting a new form of parallel computing, dubbed value-level parallelism (VLP), whose opportunities arise from these aggressively narrow data formats. The key insight is that narrow data with fewer bits produces only a limited number of distinct outputs, which can be reused across different inputs during computation. VLP draws inspiration from neuromorphic computing, illustrating how neuromorphic ideas can improve the efficiency of traditional computer architectures. This talk will cover two VLP architectures, one for batched GEMM and one for non-linear operations, which together cover the full landscape of LLM operations.
Bio: Di Wu is an assistant professor in the Department of ECE at the University of Central Florida. He earned his PhD from the Department of ECE at the University of Wisconsin-Madison in 2023, with BS and MS degrees from Fudan University. His research interests span emerging areas of computer architecture and systems, such as brain-inspired computing, machine learning systems, and quantum error correction, with work featured in ASPLOS, HPCA, ISCA, MICRO
Please let us know if you require an accommodation in order to participate in this event. Accommodations may include live captioning, ASL interpreters, and/or captioned media and accessible documents from recorded events. We recommend submitting requests at least 5 days in advance.