Efficient inference on IMG Series4 NNAs
Research into neural network architectures generally prioritises accuracy over efficiency. Some papers have investigated efficiency (Tan and Le 2020; Sandler et al. 2018), but usually with CPU- or GPU-based inference in mind rather than accelerator-based inference.
In this original work, Imagination’s AI Research team evaluates many well-known classification networks trained on ImageNet. We are interested not in accuracy or cost in their own right but in efficiency, which combines the two: we want networks that achieve high accuracy on our IMG Series4 NNAs at as low a cost as possible. We cover:
- identifying ImageNet classification network architectures that give the best accuracy/performance trade-offs on our Series4 NNAs.
- reducing cost dramatically using quantisation-aware training (QAT) and low-precision weights without affecting accuracy.
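One way to make "best accuracy/performance trade-off" concrete is a Pareto front: a network is worth keeping only if no other network is both cheaper and at least as accurate. The sketch below is an illustration under stated assumptions, not the method used in the article. The `Candidate` type, the `pareto_front` helper, the network names and all numbers are hypothetical placeholders, not measurements from Series4 NNAs.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str
    accuracy: float  # top-1 accuracy, higher is better
    cost: float      # any cost measure (e.g. time per inference), lower is better


def pareto_front(candidates):
    """Return the candidates that no cheaper-or-equal candidate beats on accuracy."""
    front = []
    best_accuracy = float("-inf")
    # Visit candidates from cheapest to most expensive; among equal costs,
    # visit the most accurate first so the others are filtered out.
    for c in sorted(candidates, key=lambda c: (c.cost, -c.accuracy)):
        # Keep a candidate only if it is strictly more accurate than
        # everything seen so far (exact duplicates are dropped too).
        if c.accuracy > best_accuracy:
            front.append(c)
            best_accuracy = c.accuracy
    return front


# Placeholder values for illustration only (not real results).
candidates = [
    Candidate("net_a", accuracy=0.70, cost=1.0),
    Candidate("net_b", accuracy=0.75, cost=2.0),
    Candidate("net_c", accuracy=0.72, cost=3.0),  # costlier and less accurate than net_b
    Candidate("net_d", accuracy=0.80, cost=4.0),
]

front = pareto_front(candidates)
print([c.name for c in front])  # ['net_a', 'net_b', 'net_d']
```

Here `net_c` drops out because `net_b` is both cheaper and more accurate. Any technique that lowers cost without lowering accuracy, such as the low-precision weights mentioned above, would move a network to the left on this plot and could bring it onto the front.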