NPU IP
66 IP from 19 vendors
-
General Purpose Neural Processing Unit (NPU)
- Hybrid von Neumann + 2D SIMD matrix architecture (modeled in the sketch after this entry)
- 64-bit instruction word, single instruction issue per clock
- 7-stage, in-order pipeline
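To make the 2D SIMD matrix idea concrete, here is a minimal scalar model of the operation such a unit can retire per instruction: an outer-product multiply-accumulate into a tile of accumulators. The 4x4 tile size, int8 operands, and function name are illustrative assumptions, not vendor specifics.

```c
#include <stdint.h>

#define ROWS 4
#define COLS 4

/* acc += a (column vector) x b (row vector): one outer-product MAC step,
 * the kind of 2D operation a SIMD matrix unit performs in a single issue.
 * Tile size and data types are assumptions for illustration. */
static void matrix_mac(int32_t acc[ROWS][COLS],
                       const int8_t a[ROWS], const int8_t b[COLS]) {
    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++)
            acc[r][c] += (int32_t)a[r] * (int32_t)b[c];
}
```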
-
NPU IP for Embedded AI
- Fully programmable to efficiently execute neural networks, feature extraction, signal processing, audio, and control code
- Scalable performance by design to meet a wide range of use cases, with MAC configurations of up to 64 int8 MACs per cycle (natively 128 4x8 MACs; see the worked throughput example below)
- Future-proof architecture that supports the most advanced ML data types and operators
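A quick worked example of what the 64 int8 MACs/cycle figure implies for peak throughput; the 1 GHz clock is an assumed, illustrative value, not vendor data.

```c
#include <stdio.h>

int main(void) {
    const double macs_per_cycle = 64.0;  /* int8 MACs/cycle, from the listing */
    const double ops_per_mac    = 2.0;   /* one multiply + one accumulate */
    const double clock_ghz      = 1.0;   /* assumed clock, not vendor data */
    /* 64 MACs/cycle x 2 ops x 1e9 cycles/s = 128 GOPS */
    printf("Peak int8 throughput: %.0f GOPS\n",
           macs_per_cycle * ops_per_mac * clock_ghz);
    return 0;
}
```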
-
NPU IP family for generative and classic AI with the highest power efficiency; scalable and future-proof
- Supports a wide range of activation and weight data types, from 32-bit floating point down to 2-bit binary neural networks (BNN)
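At the binary end of that range, BNN layers replace multiplies with XNOR and popcount, which is where much of the power efficiency comes from. A minimal sketch, assuming 32 weights/activations in {-1,+1} packed one per bit (bit = 1 encoding +1); the packing convention is an assumption for illustration.

```c
#include <stdint.h>

/* Dot product of two 32-element {-1,+1} vectors packed into words. */
static int bnn_dot32(uint32_t w, uint32_t x) {
    uint32_t xnor = ~(w ^ x);            /* bit i = 1 where signs match */
    int pop = __builtin_popcount(xnor);  /* number of matching positions */
    return 2 * pop - 32;                 /* matches give +1, mismatches -1 */
}
```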
-
Highly scalable inference NPU IP for next-gen AI applications
- Matrix multiplication: 4096 MACs/cycle (int8), 1024 MACs/cycle (int16)
- Vector processor: RISC-V with RVV 1.0
- Custom instructions for softmax and local storage access
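For reference, this is the scalar softmax loop that a dedicated instruction would accelerate; it is the standard numerically stable formulation, not the vendor's implementation.

```c
#include <math.h>

static void softmax(const float *x, float *y, int n) {
    float m = x[0];
    for (int i = 1; i < n; i++)     /* running max for numerical stability */
        if (x[i] > m) m = x[i];
    float sum = 0.0f;
    for (int i = 0; i < n; i++) {   /* exponentiate shifted inputs */
        y[i] = expf(x[i] - m);
        sum += y[i];
    }
    for (int i = 0; i < n; i++)     /* normalize to a probability distribution */
        y[i] /= sum;
}
```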
-
4-/8-bit mixed-precision NPU IP
- Easy customization across different core sizes and performance points
- NN Converter converts a network file into an internal network format and supports ONNX (PyTorch), TF-Lite, and CFG (Darknet)
-
Optional extension of NPX6 NPU tensor operations to include floating-point support with BF16 or BF16+FP16 (see the conversion sketch after this entry)
- Scalable real-time AI / neural processor IP with up to 3,500 TOPS performance
- Supports CNNs and transformers (including generative AI), recommender networks, RNNs/LSTMs, and more
- Industry leading power efficiency (up to 30 TOPS/W)
- One 1K MAC core or 1-24 cores of an enhanced 4K MAC/core convolution accelerator
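BF16 keeps FP32's sign and 8-bit exponent and truncates the mantissa to 7 bits, which is why it is a comparatively cheap extension to an integer tensor datapath. A generic round-to-nearest-even FP32-to-BF16 conversion sketch follows; this is the textbook scheme, not the NPX6 hardware, and NaN handling is omitted for brevity.

```c
#include <stdint.h>
#include <string.h>

static uint16_t fp32_to_bf16(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* type-pun safely via memcpy */
    /* Round to nearest, ties to even, on the 16 mantissa bits dropped. */
    uint32_t rounding = 0x7FFFu + ((bits >> 16) & 1u);
    /* Keep sign + exponent + top 7 mantissa bits. */
    return (uint16_t)((bits + rounding) >> 16);
}
```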
-
NPU
- Sustainable Innovation
- Scalable Performance
- Generative AI at the Edge
- System Level Solution
-
High-Performance NPU
- Low Power Consumption
- High Performance
- Flexibility and Configurability
- High-Precision Inference
-
ARC NPX Neural Processing Unit (NPU) IP supports the latest, most complex neural network models and addresses demands for real-time compute with ultra-low power consumption for AI applications
- ARC processor cores are optimized to deliver the best performance/power/area (PPA) efficiency in the industry for embedded SoCs. Designed from the start for power-sensitive embedded applications, ARC processors implement a Harvard architecture for higher performance through simultaneous instruction and data memory access, and a high-speed scalar pipeline for maximum power efficiency. The 32-bit RISC engine offers a mixed 16-bit/32-bit instruction set for greater code density in embedded systems.
- ARC's high degree of configurability and instruction set architecture (ISA) extensibility contribute to its best-in-class PPA efficiency. Designers can add or omit hardware features to optimize the core's PPA for their target application, with no wasted gates. ARC users can also add their own custom instructions and hardware accelerators to the core, as well as tightly coupled memories and peripherals, enabling dramatic improvements in performance and power efficiency at both the processor and system levels (a hypothetical intrinsic sketch follows this entry).
- Complete and proven commercial and open source tool chains, optimized for ARC processors, give SoC designers the development environment they need to efficiently develop ARC-based systems that meet all of their PPA targets.
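As a sketch of how a user-defined instruction typically surfaces to software, the kernel below calls an intrinsic when the extension is present and falls back to plain C otherwise. The name my_dot4_i8, the HAVE_DOT4_EXTENSION guard, and the packed int8 dot-product semantics are all hypothetical, not part of the ARC toolchain.

```c
#include <stdint.h>

#ifdef HAVE_DOT4_EXTENSION
/* Hypothetical: maps to a single custom dot-product instruction. */
int32_t my_dot4_i8(uint32_t a, uint32_t b);
#else
/* Portable fallback: dot product of 4 packed int8 lanes per operand. */
static int32_t my_dot4_i8(uint32_t a, uint32_t b) {
    int32_t acc = 0;
    for (int i = 0; i < 4; i++) {
        int8_t ai = (int8_t)(a >> (8 * i));  /* extract and sign-extend lane i */
        int8_t bi = (int8_t)(b >> (8 * i));
        acc += (int32_t)ai * bi;
    }
    return acc;
}
#endif
```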
-
Neural network processor designed for edge devices
- High energy efficiency
- Supports mainstream deep learning frameworks
- Low power consumption
- An integrated AI solution