Highly scalable performance for classic and generative on-device and edge AI solutions

Overview

The Cadence Neo NPUs offer energy-efficient, hardware-based AI engines that can be paired with any host processor to offload artificial intelligence and machine learning (AI/ML) processing. The Neo NPUs target a wide variety of applications, including sensor, audio, voice/speech, vision, radar, and more. Their comprehensive performance range makes them well suited both to ultra-power-sensitive applications such as IoT and hearables/wearables and to high-performance systems in AR/VR, automotive, and more.

The product architecture natively supports the processing required for many network topologies and operators, allowing for a complete or near-complete offload from the host processor. Depending on the application's needs, the host processor can be an application processor, a general-purpose MCU, or a DSP handling pre-/post-processing and associated signal processing, with the inferencing managed by the NPU.
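As an illustration of that host/NPU split, below is a minimal host-side sketch in C. Every name here (npu_load_model, npu_submit, npu_wait, dsp_preprocess, postprocess) is a hypothetical placeholder standing in for whatever driver interface a given integration exposes; it is not the actual Neo NPU or NeuroWeave API.

```c
/* Minimal host-side offload sketch. All functions declared extern here are
 * hypothetical placeholders, not the real Neo NPU driver interface. */
#include <stdint.h>
#include <stddef.h>

typedef struct npu_model npu_model_t;

/* Hypothetical entry points a host-side NPU driver might expose. */
extern npu_model_t *npu_load_model(const void *blob, size_t len);
extern int npu_submit(npu_model_t *m, const int8_t *in, int8_t *out);
extern int npu_wait(npu_model_t *m);

/* Signal processing stays on the host (application processor, MCU, or DSP). */
extern void dsp_preprocess(const int16_t *raw_frame, int8_t *tensor);
extern void postprocess(const int8_t *scores);

void run_inference(const void *model_blob, size_t blob_len,
                   const int16_t *raw_frame)
{
    static int8_t input[224 * 224 * 3]; /* illustrative tensor shapes */
    static int8_t output[1000];

    npu_model_t *model = npu_load_model(model_blob, blob_len);

    dsp_preprocess(raw_frame, input); /* pre-processing on the host/DSP  */
    npu_submit(model, input, output); /* inference offloaded to the NPU  */
    npu_wait(model);                  /* host is free until completion   */
    postprocess(output);              /* post-processing back on the host */
}
```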

Key Features

  • Flexible System Integration.
    • The Neo NPUs can be integrated with any host processor to offload the AI portions of the application.
  • Scalable Design and Configurability.
    • The Neo NPUs support up to 80 TOPS with a single core and are architected to enable multi-core solutions reaching 100s of TOPS.
  • Efficient in Mapping State-of-the-Art AI/ML Workloads.
    • Best-in-class inferences per second with low latency and high throughput, optimized to deliver high performance within a low-energy profile for classic and generative AI.
  • Industry-Leading Performance and Power Efficiency.
    • High inferences per second per unit area (IPS/mm²) and per unit power (IPS/W).
  • End-to-End Software Toolchain for All Markets and a Broad Range of Frameworks.
    • NeuroWeave SDK provides a common tool for compiling networks across IP, with flexibility for performance, accuracy, and run-time environments.

Benefits

  • Single-core performance of up to 80 TOPS (see the worked arithmetic after this list)
    • Configurable from 256 MACs per cycle to 32K MACs per cycle
    • Upward-scalable with multi-core topologies to 100s of TOPS
  • Efficient offload and execution of neural network processing from any application host processor
  • Built-in support for many network types, including CNNs, RNNs, transformers, and more
  • Built-in support for multiple data types, including int4, int8, int16, and fp16 (see the quantization sketch after this list)
  • Application targets varied across many domains (sensor, audio, vision, radar) and markets (IoT, hearables/wearables, AR/VR, automotive)
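As a sanity check on the headline throughput figures, counting each MAC as two operations (one multiply, one accumulate) gives

$$\text{peak TOPS} = \frac{\text{MACs/cycle} \times 2 \times f_{\text{clk}}}{10^{12}}$$

The clock frequency here is an illustrative assumption, not a published specification: at roughly 1.22 GHz, the top of the configurable range yields $32{,}768 \times 2 \times 1.22\times10^{9} \approx 80$ TOPS, and the bottom yields $256 \times 2 \times 1.22\times10^{9} \approx 0.6$ TOPS.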
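The integer data types above imply a quantization step when mapping fp32-trained models onto the NPU. Below is a minimal sketch of affine int8 quantization, the standard scheme such toolchains use; the scale and zero-point values are illustrative, not output from any Cadence tool, and the same idea extends to int4 and int16.

```c
/* Affine int8 quantization sketch: q = round(x / scale) + zero_point.
 * Scale/zero-point values are illustrative, not NeuroWeave output. */
#include <stdint.h>
#include <math.h>
#include <stdio.h>

static int8_t quantize(float x, float scale, int32_t zero_point)
{
    int32_t q = (int32_t)lroundf(x / scale) + zero_point;
    if (q < -128) q = -128; /* clamp to the int8 range */
    if (q >  127) q =  127;
    return (int8_t)q;
}

static float dequantize(int8_t q, float scale, int32_t zero_point)
{
    return scale * (float)(q - zero_point);
}

int main(void)
{
    const float   scale = 0.05f; /* maps roughly [-6.4, 6.35] onto int8 */
    const int32_t zp    = 0;

    float  x = 1.2345f;
    int8_t q = quantize(x, scale, zp);
    printf("fp32 %.4f -> int8 %d -> fp32 %.4f\n",
           x, q, dequantize(q, scale, zp));
    return 0;
}
```

Running this prints a quantize/dequantize round trip (1.2345 becomes 25, which dequantizes to 1.2500), making concrete the precision/footprint trade-off that the narrower data types offer.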

Block Diagram

[Neo NPU block diagram]

Deliverables

  • Free Software Evaluation
    • Try our Software Development Kit (SDK) free for 15 days and see how easy it is to use our Eclipse-based IDE.
  • Online Support
    • The Cadence Online Support (COS) system hosts our entire library of accessible materials for self-study and step-by-step instruction.
  • Xtensa Processor Generator (XPG)
    • The Xtensa Processor Generator (XPG) is the heart of our technology: the patented, cloud-based system that creates your correct-by-construction processor and all associated software, models, and more (login required).
  • Technical Forums
    • Join the community on our technical forums to discuss and refine your design ideas.

Technical Specifications
