The ARC® NPX Neural Processor IP family provides a high-performance, power- and area-efficient IP solution for a range of applications requiring AI-enabled SoCs. The ARC NPX6 NPU IP is designed for deep learning algorithm coverage including both computer vision tasks such as object detection, image quality improvement, and scene segmentation, and for broader AI applications, including generative AI.
The NPX6 NPU family offers multiple products to choose from to meet your specific application requirements. The architecture is based on individual cores that can scale from 1K MACs to 96K MACs for a single AI engine performance of over 250 TOPS and over 440 TOPS with sparsity. The NPX6 NPU IP includes hardware and software support for multi-NPU clusters of up to 8 NPUs achieving 3500 TOPS with sparsity. Advanced bandwidth features in hardware and software, and a memory hierarchy (including L1 memory in each core and a high-performance, low-latency interconnect to access a shared L2 memory) make scaling to a high MAC count possible. An optional tensor floating point unit is available for applications benefiting from BF16 or FP16 inside the neural network.
To speed application software development, the ARC NPX6 NPU Processor IP is supported by the MetaWare MX Development Toolkit, a comprehensive software programming environment that includes a neural network Software Development Kit (NN SDK) and support for virtual models. The NN SDK automatically converts neural networks trained using popular frameworks, like Pytorch, Tensorflow, or ONNX into optimized executable code for the NPX hardware.
The NPX6 NPU Processor IP can be used to create a range of products – from a few TOPS to 1000s of TOPS – that can be programmed with a single toolchain.