SiFive Accelerates RISC-V Vector Integration in XNNPACK for Optimized AI Inference
In this blog, we’ll begin by introducing XNNPACK and exploring its current status on RISC-V platforms. Next, to bridge the technical gap for contributors, we will provide a step-by-step guide on integrating RVV-optimized microkernels into XNNPACK, using F32-GEMM (single-precision floating-point general matrix multiplication) as a practical example. Finally, we will highlight the performance improvements achieved through these optimizations.
XNNPACK and the status of its RVV backend
XNNPACK is a crucial library-based solution for neural network inference on ARM, x86, and RISC-V platforms. It serves as a low-level acceleration backend for machine learning frameworks, such as TensorFlow Lite, PyTorch, ONNX Runtime, and MediaPipe. XNNPACK enhances performance by decomposing operations into microkernels and applying target-specific optimizations for each architecture.
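To make the idea of decomposing an operation into microkernels concrete, here is a minimal scalar sketch in C. It is not XNNPACK's actual API: the names `f32_gemm_tile` and `f32_gemm`, the tile sizes `MR` and `NR`, and the plain row-major layout are all assumptions chosen for illustration. The driver walks the output matrix in small tiles, and the microkernel computes one tile at a time.

```c
#include <stddef.h>

/* Hypothetical tile sizes for illustration; a real backend tunes these per target. */
enum { MR = 2, NR = 4 };

/* Microkernel sketch: computes one mr x nr tile of C = A * B.
 * A is mr x k, B is k x nr, all row-major with the given strides. */
void f32_gemm_tile(size_t mr, size_t nr, size_t k,
                   const float *a, size_t lda,
                   const float *b, size_t ldb,
                   float *c, size_t ldc)
{
    float acc[MR][NR] = {{0.0f}};

    /* Accumulate the tile in local storage across the shared dimension k. */
    for (size_t p = 0; p < k; p++) {
        for (size_t i = 0; i < mr; i++) {
            const float a_ip = a[i * lda + p];
            for (size_t j = 0; j < nr; j++) {
                acc[i][j] += a_ip * b[p * ldb + j];
            }
        }
    }

    /* Write the finished tile back to C. */
    for (size_t i = 0; i < mr; i++) {
        for (size_t j = 0; j < nr; j++) {
            c[i * ldc + j] = acc[i][j];
        }
    }
}

/* Operator-level driver: splits an m x n output into tiles of at most
 * MR x NR and hands each tile to the microkernel. */
void f32_gemm(size_t m, size_t n, size_t k,
              const float *a, const float *b, float *c)
{
    for (size_t i = 0; i < m; i += MR) {
        const size_t mr = (m - i < MR) ? (m - i) : MR;
        for (size_t j = 0; j < n; j += NR) {
            const size_t nr = (n - j < NR) ? (n - j) : NR;
            f32_gemm_tile(mr, nr, k, a + i * k, k, b + j, n, c + i * n + j, n);
        }
    }
}
```

In this arrangement only the tile function would need a target-specific version. The driver stays the same, which is the property that lets a library swap in optimized microkernels per architecture.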
Historically, XNNPACK offered limited support for the RISC-V Vector (RVV) extension, providing only a small number of RVV-optimized microkernels. As a result, RISC-V users often had to fall back on generic C implementations. To make use of vector hardware, they could only rely on compiler auto-vectorization, or on translation headers and tools that adapt intrinsics written for other platforms.
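The difference between a generic C loop and a hand-written RVV loop can be sketched with a simple y += alpha * x routine. This is an illustrative example, not code from XNNPACK: the function name `f32_axpy` is hypothetical, and the RVV path assumes a toolchain that implements the `__riscv_`-prefixed v1.0 intrinsics from `<riscv_vector.h>`. On any other target, the generic C fallback is compiled instead.

```c
#include <stddef.h>

#if defined(__riscv_vector)
#include <riscv_vector.h>
#endif

/* Computes y[i] += alpha * x[i] for i in [0, n). */
void f32_axpy(size_t n, float alpha, const float *x, float *y)
{
#if defined(__riscv_vector)
    /* RVV path: strip-mined loop. vsetvl returns how many elements
     * the hardware will process in this iteration (vl <= n). */
    while (n > 0) {
        size_t vl = __riscv_vsetvl_e32m1(n);
        vfloat32m1_t vx = __riscv_vle32_v_f32m1(x, vl);
        vfloat32m1_t vy = __riscv_vle32_v_f32m1(y, vl);
        vy = __riscv_vfmacc_vf_f32m1(vy, alpha, vx, vl); /* vy += alpha * vx */
        __riscv_vse32_v_f32m1(y, vy, vl);
        x += vl;
        y += vl;
        n -= vl;
    }
#else
    /* Generic C path: any vectorization is left to the compiler. */
    for (size_t i = 0; i < n; i++) {
        y[i] += alpha * x[i];
    }
#endif
}
```

Because `vsetvl` handles the tail, the RVV loop needs no separate remainder code, and the same binary adapts to different hardware vector lengths.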
To enhance AI inference performance on RISC-V, SiFive has contributed several RVV-optimized floating-point microkernels to XNNPACK; they are listed in Table 1. These contributions significantly improve performance, making RISC-V a more viable platform for neural network inference. We welcome everyone to join us in contributing to the XNNPACK RVV backend.