LLM Inference on RISC-V Embedded CPUs
By Yueh-Feng Lee, Andes Technology
Advances in large language models (LLMs) have significantly improved natural language processing, enabling complex text understanding and generation tasks. This presentation focuses on optimizing the open-source llama.cpp project for the RISC-V P extension. The TinyLlama 1.1B model is run on the Andes Voyager development board, whose quad-core CPU supports the RISC-V P extension, and the performance results show that the model can achieve near real-time response. This work highlights the potential of RISC-V as an efficient platform for deploying advanced AI models in resource-constrained environments, contributing to the growing field of edge computing and embedded AI applications.
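As a rough illustration of the kind of inner loop such an optimization would target, the sketch below computes an 8-bit integer dot product four lanes at a time. It is not taken from the presentation or from llama.cpp: the abstract does not say which kernels were tuned, and the names `lane_i8`, `pack4_i8`, `mac4_i8`, and `dot_i8` are hypothetical. `mac4_i8` emulates, in portable C, the behavior of a packed-SIMD instruction that multiplies four signed 8-bit lanes held in one register and adds the products to a 32-bit accumulator; on real hardware that helper would presumably map to a single instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: extract lane 0..3 of a packed word as a signed value. */
static int32_t lane_i8(uint32_t word, int lane)
{
    uint32_t u = (word >> (8 * lane)) & 0xFFu;
    return u >= 128u ? (int32_t)u - 256 : (int32_t)u;
}

/* Hypothetical helper: pack four signed 8-bit values into one 32-bit word. */
static uint32_t pack4_i8(const int8_t *p)
{
    return (uint32_t)(uint8_t)p[0]
         | ((uint32_t)(uint8_t)p[1] << 8)
         | ((uint32_t)(uint8_t)p[2] << 16)
         | ((uint32_t)(uint8_t)p[3] << 24);
}

/* Portable emulation (an assumption, not a real intrinsic) of a packed
 * multiply-accumulate: acc += a0*b0 + a1*b1 + a2*b2 + a3*b3. */
static int32_t mac4_i8(int32_t acc, uint32_t a, uint32_t b)
{
    for (int lane = 0; lane < 4; lane++) {
        acc += lane_i8(a, lane) * lane_i8(b, lane);
    }
    return acc;
}

/* Dot product of two int8 vectors; n must be a multiple of 4. */
int32_t dot_i8(const int8_t *x, const int8_t *y, size_t n)
{
    assert(n % 4 == 0);
    int32_t acc = 0;
    for (size_t i = 0; i < n; i += 4) {
        acc = mac4_i8(acc, pack4_i8(x + i), pack4_i8(y + i));
    }
    return acc;
}
```

Processing four products per step instead of one is where a packed-SIMD extension could plausibly save cycles on a CPU without a full vector unit; the actual gain depends on the instructions and memory system of the target core.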