One Instruction Stream, Infinite Possibilities: The Cervell™ Approach to Reinventing the NPU
AI is evolving faster than the chips designed to run it. Models like large language transformers and generative networks are shifting rapidly, while silicon development cycles remain long and rigid. Traditional NPUs, built around proprietary instruction sets and opaque compilers, simply can’t keep up.
That’s why Semidynamics built Cervell™, a new kind of Neural Processing Unit IP based entirely on RISC-V. Cervell is configurable, transparent, and ready to run modern AI workloads with unmatched flexibility.
Cervell is a neural processing unit IP that can deliver up to 256 TOPS per core at 2 GHz, depending on configuration. But the point isn’t just raw performance; it’s adaptability.
Rather than shipping a fixed-function design, Semidynamics uses its IP building blocks to customize the architecture to the customer’s specific needs. Standard configurations are available, but Cervell is intended to be tailored, because real-world workloads vary and optimization matters.
The target applications include full-stack AI workloads, from convolutional networks to transformers and generative AI. Importantly, Cervell is designed to handle end-to-end inference, including preprocessing and postprocessing, not just the main matrix multiplication.
Most NPUs today are locked into secret instruction sets. That’s a problem when AI models evolve on a six-month cycle and silicon takes three years to develop. By the time you ship, your NPU might not even support the latest models.
With Cervell, the instruction set is based on the RISC-V vector extension, which is designed for parallel data workloads. This allows AI developers to use familiar general-purpose programming environments while retaining the flexibility to support future model changes.
RISC-V vector instructions also play the role of what many would recognize as GPU-style general-purpose compute (GPGPU), but within a unified, open framework.
One Instruction Stream, All Compute Types
What sets Cervell apart architecturally is that it doesn’t split CPU, vector, and tensor units into separate islands. Everything is handled through a single instruction stream. Whether you’re doing scalar, vector, or tensor operations, it all runs through the same programmable infrastructure. That unification makes the hardware easier to adapt and the software stack much more manageable.
Cervell configurations can range from 8 to 64 TOPS at 1 GHz (or up to 256 TOPS at 2 GHz), and the design supports precision scaling, including FP64 for data center inference and lower-precision formats like INT8 for mobile and embedded use.
This configurability is critical: in many cases, the AI workload determines what really matters. For example, some large language models do most of their heavy lifting on the GPU or vector units, not the NPU. Cervell lets designers dial in the performance where it counts, without overbuilding.
Cervell is supported by a complete SDK called Aliado, which builds on standard RISC-V tools like GCC and LLVM. It includes:
- A full math library for common AI operations
- Transparent compiler toolchains with and without vectorization
- Support for ONNX models and runtimes
- Tools for model optimization, including quantization recommendations for bandwidth-limited deployments
Importantly, Semidynamics does not rely on a proprietary machine learning compiler. Developers work with standard tools they already know, and everything remains open and inspectable.
Earlier this year, the team took a PyTorch version of DeepSeek, converted it to ONNX, and had it up and running on Cervell in under three days. That’s not a simulation; it’s a working implementation of a real large language model, deployed on real IP.
Benchmarks on additional models are available under NDA, and the company has already built a growing library of supported workloads.
AI models will continue to evolve rapidly. Whether you're building for automotive, data center inference, robotics, or edge AI, the hardware needs to be flexible enough to adapt. Cervell’s RISC-V foundation and instruction-level configuration give chip designers a new level of control without sacrificing performance or software portability.
And because it’s IP, not a monolithic SoC, you can integrate it into your own silicon, scale it to match your needs, and take it to production on your schedule.