One Instruction Stream, Infinite Possibilities: The Cervell™ Approach to Reinventing the NPU
AI is evolving faster than the chips designed to run it. Models like large language transformers and generative networks are shifting rapidly, while silicon development cycles remain long and rigid. Traditional NPUs, built around proprietary instruction sets and opaque compilers, simply can’t keep up.
That’s why Semidynamics built Cervell™, a new kind of Neural Processing Unit IP based entirely on RISC-V. Cervell is configurable, transparent, and ready to run modern AI workloads with unmatched flexibility.
Cervell is a neural processing unit IP that can deliver up to 256 TOPS per core at 2 GHz, depending on configuration. But the point isn’t just raw performance; it’s adaptability.
Rather than shipping a fixed-function design, Semidynamics uses its IP building blocks to customize the architecture to the customer’s specific needs. Standard configurations are available, but Cervell is intended to be tailored, because real-world workloads vary and optimization matters.
The target applications include full-stack AI workloads, from convolutional networks to transformers and generative AI. Importantly, Cervell is designed to handle end-to-end inference, including preprocessing and postprocessing, not just the main matrix multiplication.
Most NPUs today are locked into secret instruction sets. That’s a problem when AI models evolve on a six-month cycle and silicon takes three years to develop. By the time you ship, your NPU might not even support the latest models.
With Cervell, the instruction set is based on the RISC-V vector extension, which is designed for parallel data workloads. This allows AI developers to use familiar general-purpose programming environments while retaining the flexibility to support future model changes.
RISC-V vector instructions also play the role of what many would recognize as GPU-style general-purpose compute (GPGPU), but within a unified, open framework.
One Instruction Stream, All Compute Types
What sets Cervell apart architecturally is that it doesn’t split CPU, vector, and tensor units into separate islands. Everything is handled through a single instruction stream. Whether you’re doing scalar, vector, or tensor operations, it all runs through the same programmable infrastructure. That unification makes the hardware easier to adapt and the software stack much more manageable.
Cervell configurations can range from 8 to 64 TOPS at 1 GHz (or up to 256 TOPS at 2 GHz), and the design supports precision scaling, including FP64 for data center inference and lower-precision formats like INT8 for mobile and embedded use.
This configurability is critical: in many cases, the AI workload determines what really matters. For example, some large language models do most of their heavy lifting on the GPU or vector units, not the NPU. Cervell lets designers dial in the performance where it counts, without overbuilding.
Cervell is supported by a complete SDK called Aliado, which builds on standard RISC-V tools like GCC and LLVM. It includes:
- A full math library for common AI operations
- Transparent compiler toolchains with and without vectorization
- Support for ONNX models and runtimes
- Tools for model optimization, including quantization recommendations for bandwidth-limited deployments
Importantly, Semidynamics does not rely on a proprietary machine learning compiler. Developers work with standard tools they already know, and everything remains open and inspectable.
Earlier this year, the team took a PyTorch version of DeepSeek, converted it to ONNX, and had it up and running on Cervell in under three days. That’s not a simulation; it’s a working implementation of a real large language model, deployed on real IP.
Benchmarks on additional models are available under NDA, and the company has already built a growing library of supported workloads.
AI models will continue to evolve rapidly. Whether you're building for automotive, data center inference, robotics, or edge AI, the hardware needs to be flexible enough to adapt. Cervell’s RISC-V foundation and instruction-level configuration give chip designers a new level of control without sacrificing performance or software portability.
And because it’s IP, not a monolithic SoC, you can integrate it into your own silicon, scale it to match your needs, and take it to production on your schedule.