Benefit of pruning and clustering a neural network for before deploying on Arm Ethos-U NPU
Pruning and clustering are optimization techniques:
- Pruning: setting weights to zero
- Clustering: grouping weights together into clusters
These techniques modify the weights of a Machine Learning model. In some cases, they enable:
- Significant speed-up of the inference execution
- Reduction of the memory footprint
- Reduction in the overall power consumption of the system
We assume that you can optimize your workload without loss in accuracy and that you target an Arm® Ethos NPU. You can therefore prune and cluster your neural network before using the Vela compiler and deploying it on the Ethos-U hardware. See below for more information on optimizing your workload.
To read the full article, click here
Related Semiconductor IP
- SLVS Transceiver in TSMC 28nm
- 0.9V/2.5V I/O Library in TSMC 55nm
- 1.8V/3.3V Multi-Voltage GPIO in TSMC 28nm
- 1.8V/3.3V I/O Library with 5V ODIO & Analog in TSMC 16nm
- ESD Solutions for Multi-Gigabit SerDes in TSMC 28nm
Related Blogs
- Reviewing different Neural Network Models for Multi-Agent games on Arm using Unity
- Neural Network Model quantization on mobile
- Pace of Innovation for Custom Silicon on Arm Continues with AWS Graviton4
- Windows on Arm is Ready for Prime Time: Native Chrome Caps Momentum for the Future of Laptop Computing
Latest Blogs
- Half of the Compute Shipped to Top Hyperscalers in 2025 will be Arm-based
- Industry's First Verification IP for Display Port Automotive Extensions (DP AE)
- IMG DXT GPU: A Game-Changer for Gaming Smartphones
- Rivos and Canonical partner to deliver scalable RISC-V solutions in Data Centers and enable an enterprise-grade Ubuntu experience across Rivos platforms
- ReRAM-Powered Edge AI: A Game-Changer for Energy Efficiency, Cost, and Security