Benefits of pruning and clustering a neural network before deploying on Arm Ethos-U NPU
Pruning and clustering are optimization techniques:
- Pruning: setting selected weights to zero
- Clustering: grouping weights into clusters so that the weights in each cluster share a single value
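The two definitions above can be illustrated with a minimal NumPy sketch. It is not the article's method: the function names `prune_by_magnitude` and `cluster_weights` are hypothetical, pruning is assumed to be magnitude-based, and clustering is assumed to be a simple one-dimensional k-means with evenly spaced initial centroids. Production workflows typically apply these steps with a model optimization toolkit during fine-tuning rather than to raw arrays.

```python
import numpy as np


def prune_by_magnitude(weights, sparsity):
    """Return a copy with the smallest-magnitude `sparsity` fraction set to zero."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # Magnitude of the k-th smallest weight; everything at or below it is zeroed.
    threshold = np.partition(flat, k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned


def cluster_weights(weights, n_clusters, n_iters=20):
    """Replace each weight with the centroid of its cluster (1-D k-means)."""
    flat = weights.ravel()
    # Assumption: centroids start evenly spaced over the weight range.
    centroids = np.linspace(flat.min(), flat.max(), n_clusters)
    for _ in range(n_iters):
        # Assign every weight to its nearest centroid.
        assignment = np.argmin(np.abs(flat[:, None] - centroids[None, :]), axis=1)
        # Move each centroid to the mean of its members (empty clusters stay put).
        for c in range(n_clusters):
            members = flat[assignment == c]
            if members.size:
                centroids[c] = members.mean()
    assignment = np.argmin(np.abs(flat[:, None] - centroids[None, :]), axis=1)
    return centroids[assignment].reshape(weights.shape)


# Hypothetical weight matrix standing in for one layer of a model.
rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)

pruned = prune_by_magnitude(w, sparsity=0.5)
clustered = cluster_weights(w, n_clusters=16)

print("fraction of zero weights after pruning:", np.mean(pruned == 0))
print("distinct weight values after clustering:", np.unique(clustered).size)
```

After pruning, at least the requested fraction of weights is zero; after clustering, the tensor holds at most `n_clusters` distinct values.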
These techniques modify the weights of a machine learning model. In some cases, they enable:
- Significant speed-up of the inference execution
- Reduction of the memory footprint
- Reduction in the overall power consumption of the system
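One plausible reason for the memory benefit, stated here as an assumption rather than taken from the text above, is that weight tensors with many zeros or few distinct values compress better. The sketch below uses `zlib` purely as a general-purpose stand-in compressor; it is not the encoding used by Vela or the Ethos-U hardware. The uniformly random int8 weights and the evenly spaced 16-level "clustering" are also simplifications chosen for illustration.

```python
import zlib

import numpy as np

rng = np.random.default_rng(0)
# Hypothetical int8 weight tensor standing in for a quantized layer.
weights = rng.integers(-128, 128, size=100_000, dtype=np.int8)

# Work in int16 so that abs(-128) does not overflow.
wide = weights.astype(np.int16)

# Pruning: zero the weights whose magnitude is at or below the median.
magnitudes = np.abs(wide)
pruned = np.where(magnitudes <= np.median(magnitudes), 0, wide).astype(np.int8)

# Clustering (crude stand-in for k-means): snap each weight to the nearest
# of 16 evenly spaced levels, so only 16 distinct values remain.
levels = np.linspace(-128, 127, 16).round().astype(np.int16)
nearest = np.argmin(np.abs(wide[:, None] - levels[None, :]), axis=1)
clustered = levels[nearest].astype(np.int8)


def compressed_size(arr):
    """Size in bytes of the array after zlib compression at the highest level."""
    return len(zlib.compress(arr.tobytes(), 9))


for name, arr in [("original", weights), ("pruned", pruned), ("clustered", clustered)]:
    print(f"{name:>9}: {compressed_size(arr)} bytes compressed")
```

The absolute numbers depend on the compressor and the weight distribution; the point is only the ordering, with the pruned and clustered tensors compressing to fewer bytes than the original.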
We assume that you can optimize your workload without loss in accuracy and that you target an Arm® Ethos-U NPU. You can therefore prune and cluster your neural network before compiling it with the Vela compiler and deploying it on the Ethos-U hardware. The full article has more information on optimizing your workload.