Performance Efficiency Leading AI Accelerator for Mobile and Edge Devices

Overview

The NeuroMosaic Processor (NMP) family removes barriers to deploying ML by pairing a general-purpose architecture with a simple programmer's model, supporting virtually any class of neural network architecture and use case.

Our key differentiator is the ability to execute multiple AI/ML models simultaneously, significantly expanding capability over existing approaches. This advantage comes from the co-developed NeuroMosAIc Studio software, which dynamically allocates hardware resources to match the target workload, yielding highly optimized, low-power execution. Designers may also select an optional on-device training acceleration extension that enables iterative learning after deployment. This capability reduces cloud dependence while improving accuracy, efficiency, customization, and personalization without costly model retraining and redeployment, thereby extending device lifecycles.
Key Features
- Performance: Up to 4 TOPS
- MACs (8×8): 512, 1K, 2K
- Data types: 1-bit, INT8, INT16
- Internal SRAM: Up to 4 MB
- Interfaces: 3× AXI
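As a rough sanity check on the headline figures, peak throughput can be estimated from the MAC count using the usual convention of 2 operations (multiply + accumulate) per MAC per cycle. The 1.0 GHz clock below is an assumed figure for illustration only; it is not a published specification.

```python
# Estimate peak throughput (TOPS) from MAC count and clock frequency.
# Convention: each MAC contributes 2 ops (multiply + add) per cycle.
# The 1.0 GHz clock is an assumption for illustration, not a spec.

def peak_tops(num_macs: int, clock_ghz: float, ops_per_mac: int = 2) -> float:
    """Peak tera-operations per second for a MAC array."""
    return num_macs * ops_per_mac * clock_ghz / 1000.0

# The three listed NMP configurations: 512, 1K (1024), and 2K (2048) MACs.
for macs in (512, 1024, 2048):
    print(f"{macs} MACs @ 1.0 GHz -> {peak_tops(macs, 1.0):.3f} TOPS")
```

At the assumed 1 GHz, the 2K-MAC configuration works out to about 4.1 TOPS, which lines up with the "Up to 4 TOPS" figure above.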
Benefits
- The NMP-550 is a significant leap forward for mid- to high-end mobile and edge computing systems requiring ultimate performance efficiency. AI acceleration extends to 6 TOPS while delivering an industry-leading 40 TOPS/W. Numerous architectural advances yield higher convolution throughput and 2× compute density while lowering total area by 25%. Support for the Mish and Swish activation functions further extends efficiency, and an upgraded RISC-V controller delivers 4× the initialization and post-processing performance of the NMP-500. Alternatively, designers may use an Arm® Cortex®-M or Cortex-A for additional flexibility and software extension.
- The patented, co-developed hardware and software architecture gives end users the flexibility to map multiple models onto the accelerator's resources to meet simultaneous, sequential, or event-based execution requirements.
- Three configuration options allow the designer to scale back compute and memory resources to further optimize for area and power in highly constrained devices.
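The three execution modes above can be pictured with a small scheduling sketch. Everything here, the class and function names and the allocation policies, is hypothetical illustration invented for this example; none of it comes from the actual NeuroMosAIc Studio software.

```python
# Hypothetical sketch: mapping multiple models onto a shared MAC array in
# three modes (simultaneous, sequential, event-based). All names and
# policies are invented for illustration; this is not a real SDK API.
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    macs_needed: int  # compute slice the model requests

def allocate_simultaneous(models, total_macs):
    """Partition the MAC array so all models run concurrently."""
    if sum(m.macs_needed for m in models) > total_macs:
        raise ValueError("combined demand exceeds the MAC array")
    return {m.name: m.macs_needed for m in models}

def allocate_sequential(models, total_macs):
    """Give each model the whole array, one after another."""
    return [(m.name, total_macs) for m in models]

def allocate_event_based(models, total_macs, event):
    """Run only the model registered for the triggering event."""
    chosen = {m.name: m for m in models}[event]
    return {chosen.name: min(chosen.macs_needed, total_macs)}

models = [Model("face_detect", 1024), Model("gaze_track", 512)]
print(allocate_simultaneous(models, 2048))
print(allocate_sequential(models, 2048))
print(allocate_event_based(models, 2048, "gaze_track"))
```

The point of the sketch is the trade-off: simultaneous mode partitions compute across models, sequential mode time-slices the full array, and event-based mode activates a model only when its trigger fires, which is how one runtime can serve several workloads on fixed silicon.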
Block Diagram
Applications
- Driver Monitoring and Fleet Management
- Image and Video Analytics
- Security and Surveillance
- Safety and Compliance Monitoring
Technical Specifications
Maturity: Production Proven
Availability: Publicly Licensable
Related IPs
- Performance Efficiency AI Accelerator for Mobile and Edge Devices
- Performance AI Accelerator for Edge Computing
- Edge AI Accelerator NNE 1.0