Overview
A 32-bit RISC processor with a 10-stage pipeline, delivering over 2GHz performance for embedded applications that require high performance, large memories, and compute-intensive processing.
The Cadence® Tensilica® Xtensa® NX processor platform is the newest addition to the Xtensa family of customizable processors, delivering over 2GHz performance for embedded applications that require high performance and large memories, and for compute-intensive tasks. The Xtensa NX processor builds on the highly successful, energy-efficient Xtensa Instruction Set Architecture (ISA) and adds architectural enhancements including a deeper pipeline, a new interrupt architecture, and branch prediction, while offering the same configurability and extensibility that Xtensa processors are known for.
Provider
Cadence Design Systems, Inc.
HQ: USA
If you want to achieve silicon success, let Cadence help you choose the right IP solution and capture its full value in your SoC design. Cadence® IP solutions offer the combined advantages of a high-quality portfolio, an open platform, a modern IP factory approach to quality, and a strong ecosystem.
Now you can tackle IP-to-SoC development in a system context, focus your internal effort on differentiation, and leverage multi-function cores to do more, faster.
The Cadence IP Portfolio includes silicon-proven Tensilica® IP cores, analog PHY interfaces, standards-based IP cores, verification IP cores, and other solutions as well as customization services for current and emerging industry standards. The Cadence IP Factory provides you with an automated approach to the customization, delivery, and verification of SoC IP. As a result, you can spend more time on differentiation, with the assurance that you'll meet your performance, power, and area requirements.
Choosing Cadence IP enables you to design with confidence because you have more freedom to innovate your SoCs with less risk and faster time to market.