LLM Inference on RISC-V Embedded CPUs
By Yueh-Feng Lee, Andes Technology
The advancement of large language models (LLMs) has significantly enhanced natural language processing capabilities, enabling complex text understanding and generation tasks. This presentation focuses on optimizing the open-source llama.cpp project for the RISC-V P (packed-SIMD) extension. Performance results from running the TinyLlama 1.1B model on the Andes Voyager development board, using a quad-core CPU that supports the RISC-V P extension, show that the model can achieve near real-time response. This work highlights the potential of RISC-V as an efficient platform for deploying advanced AI models in resource-constrained environments, contributing to the growing field of edge computing and embedded AI applications.