Viviane received her BSc and MSc degrees in Electrical Engineering and Information Technology from ETH Zurich in 2020 and 2022, respectively. She is currently pursuing a PhD in the Digital Circuits and Systems group of Prof. Benini. Her research focuses on heterogeneous architectures for energy-efficient multi-modal AI fusion and on innovative data representation strategies that improve the computational efficiency and adaptability of devices at the extreme edge, from high-performance to resource-constrained environments.
Optimizing Transformer Model Inference on a RISC-V Platform
This presentation will cover optimization techniques for deploying transformer-based foundation models, including Large Language Models (LLMs) and Vision Transformers (ViTs), on an open-source many-core RISC-V platform. The talk will detail how we implemented distributed softmax primitives, SIMD extensions, and specialized DMA engines to achieve significant speedups and improved power efficiency. The hardware architecture features a scalable, hierarchical structure whose compute clusters are organized for efficient data flow, minimizing latency and maximizing floating-point unit utilization. We also compare the performance of this open-source platform against state-of-the-art accelerators, highlighting its potential for scalable, energy-efficient AI inference in natural language processing and computer vision applications.
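To give a flavor of the kind of kernel the talk addresses, below is a minimal, platform-agnostic C sketch of a numerically stable softmax over one row of attention scores. It is illustrative only and not the presented implementation: the function name and structure are hypothetical, and the max-subtraction trick is the standard way to avoid overflow in the exponentials. In a distributed variant of the kind the abstract describes, rows would plausibly be partitioned across compute clusters, the reduction loops mapped to packed-SIMD lanes, and DMA engines used to stream tiles into cluster-local memory while the cores compute.

```c
#include <math.h>
#include <stdio.h>

/* Numerically stable softmax over one row of scores (illustrative sketch).
 * On a many-core cluster, each core would typically own a disjoint slice
 * of rows; the two reduction loops below (max and sum) are the natural
 * targets for packed-SIMD vectorization. */
static void softmax_row(const float *x, float *y, int n) {
    /* 1) Row maximum, subtracted later so expf() never overflows. */
    float m = x[0];
    for (int i = 1; i < n; i++)
        if (x[i] > m) m = x[i];

    /* 2) Exponentiate the shifted scores and accumulate the sum. */
    float s = 0.0f;
    for (int i = 0; i < n; i++) {
        y[i] = expf(x[i] - m);
        s += y[i];
    }

    /* 3) Normalize with a single division, then a multiply per element. */
    float inv = 1.0f / s;
    for (int i = 0; i < n; i++)
        y[i] *= inv;
}

int main(void) {
    float scores[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    float probs[4];
    softmax_row(scores, probs, 4);
    for (int i = 0; i < 4; i++)
        printf("%f\n", probs[i]);  /* probabilities summing to 1.0 */
    return 0;
}
```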