Benefits of pruning and clustering a neural network before deploying on an Arm Ethos-U NPU
Pruning and clustering are optimization techniques:
- Pruning: setting a proportion of the weights to zero
- Clustering: grouping weights into a small number of clusters that share a common value
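To make the two techniques concrete, here is a minimal NumPy sketch of both ideas applied to a single weight matrix. This is an illustration only, not the training-aware flow a real deployment would use (in practice these optimizations are applied during or after training, for example with the TensorFlow Model Optimization Toolkit); the sparsity target and centroid count below are arbitrary example values.

```python
import numpy as np

# A small random matrix standing in for one layer's trained weights.
rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 4)).astype(np.float32)

# Pruning: zero out the smallest-magnitude weights
# (here, a 50% sparsity target).
sparsity = 0.5
threshold = np.quantile(np.abs(weights), sparsity)
pruned = np.where(np.abs(weights) < threshold, 0.0, weights)

# Clustering: snap each weight to the nearest of a few shared values
# (here, 4 evenly spaced centroids).
centroids = np.linspace(pruned.min(), pruned.max(), 4)
snapped = centroids[np.abs(pruned[..., None] - centroids).argmin(axis=-1)]
# Keep pruned zeros at zero so clustering does not undo the sparsity.
clustered = np.where(pruned == 0.0, 0.0, snapped)

print("zero weights after pruning:", np.count_nonzero(pruned == 0))
print("distinct values after clustering:", np.unique(clustered).size)
```

After these steps the matrix is half zeros and contains only a handful of distinct values, which is what makes it compress well and execute efficiently on hardware that can exploit sparsity and weight sharing.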
These techniques modify the weights of a machine learning model. In some cases, they enable:
- Significant speed-up of the inference execution
- Reduction of the memory footprint
- Reduction in the overall power consumption of the system
We assume that you can optimize your workload without loss of accuracy and that you target an Arm® Ethos-U NPU. You can therefore prune and cluster your neural network before compiling it with the Vela compiler and deploying it on Ethos-U hardware.