Performance Efficiency Leading AI Accelerator for Mobile and Edge Devices

Overview

The NeuroMosaic Processor (NMP) family lowers the barriers to deploying ML by pairing a general-purpose architecture with a simple programmer’s model, supporting virtually any class of neural network architecture and use case.

Our differentiation starts with the ability to execute multiple AI/ML models simultaneously, significantly expanding capability over existing approaches. This advantage comes from the co-developed NeuroMosAIc Studio software, which dynamically allocates hardware resources to match the target workload, resulting in highly optimized, low-power execution. The designer may also select the optional on-device training acceleration extension, which enables iterative learning post-deployment. This capability reduces dependence on the cloud while improving accuracy, efficiency, customization, and personalization without costly model retraining and redeployment, thereby extending device lifecycles.

Key Features

  • Performance: Up to 4 TOPS
  • MACs (8x8): 512, 1K, 2K
  • Data Types: 1-bit, INT8, INT16
  • Internal SRAM: Up to 4 MB
  • Interfaces: 3x AXI
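
As a rough sanity check on how the MAC options relate to the throughput figure, the sketch below applies the common convention that one multiply-accumulate counts as two operations. The clock frequency is not stated in this listing, so the 1 GHz value and the helper name `peak_tops` are assumptions for illustration only; 1K and 2K are read as 1024 and 2048.

```python
# Illustrative arithmetic only: the clock frequency below is an assumption,
# not a published NMP specification.
OPS_PER_MAC = 2  # common convention: one multiply plus one accumulate


def peak_tops(mac_count: int, clock_hz: float) -> float:
    """Return the theoretical peak throughput in TOPS for a MAC array."""
    return mac_count * OPS_PER_MAC * clock_hz / 1e12


ASSUMED_CLOCK_HZ = 1.0e9  # hypothetical 1 GHz clock

# MAC options from the feature list, reading 1K as 1024 and 2K as 2048.
PEAK_BY_CONFIG = {
    macs: peak_tops(macs, ASSUMED_CLOCK_HZ) for macs in (512, 1024, 2048)
}

for macs, tops in PEAK_BY_CONFIG.items():
    print(f"{macs:>5} MACs -> {tops:.3f} TOPS at the assumed clock")
```

Under these assumptions the 2K-MAC option comes out at roughly 4.1 TOPS, close to the "up to 4 TOPS" figure. Real throughput depends on the actual clock, data type, and utilization, none of which are given here.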

Benefits

  • The NMP-550 is a significant step forward for mid- to high-end mobile and edge computing systems that require high performance efficiency. AI acceleration extends to 6 TOPS while delivering an industry-leading 40 TOPS/W. Numerous architectural advances result in higher convolution throughput and 2x compute density while lowering total area by 25%. Added support for the Mish and Swish activation functions extends efficiency, and an upgraded RISC-V controller delivers 4x the initialization and post-processing performance of the NMP-500. Alternatively, designers may elect to use an Arm® Cortex®-M or Cortex-A controller for further flexibility and software extension.
  • The patented, co-developed hardware and software architecture gives end users the flexibility to map multiple models onto the accelerator resources to meet simultaneous, sequential, or event-based execution requirements.
  • Three configuration options allow the designer to scale back compute and memory resources to further optimize for area and power in highly constrained devices.
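
The efficiency figure quoted for the NMP-550 can be turned into an approximate power number by dividing throughput by TOPS/W. The sketch below does that arithmetic for the stated 6 TOPS and 40 TOPS/W. It assumes both figures describe the same operating point, which this listing does not confirm, and the function name `implied_power_watts` is hypothetical.

```python
def implied_power_watts(tops: float, tops_per_watt: float) -> float:
    """Return the power implied by a throughput and an efficiency figure."""
    if tops_per_watt <= 0:
        raise ValueError("tops_per_watt must be positive")
    return tops / tops_per_watt


# Figures quoted for the NMP-550 in the Benefits list above.
NMP550_TOPS = 6.0
NMP550_TOPS_PER_WATT = 40.0

# Assumption: both figures apply at the same operating point.
power_w = implied_power_watts(NMP550_TOPS, NMP550_TOPS_PER_WATT)
print(f"Implied power at peak throughput: {power_w * 1000:.0f} mW")
```

Read this way, 6 TOPS at 40 TOPS/W corresponds to about 0.15 W. Efficiency figures are often measured at a different voltage, clock, or workload than peak throughput, so treat the result as an order-of-magnitude estimate.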

Block Diagram

Performance Efficiency Leading AI Accelerator for Mobile and Edge Devices Block Diagram

Applications

  • Driver Monitoring and Fleet Management
  • Image and Video Analytics
  • Security and Surveillance
  • Safety and Compliance Monitoring

Technical Specifications

  • Maturity: Production Proven
  • Availability: Publicly Licensable