Benefit of pruning and clustering a neural network before deploying on Arm Ethos-U NPU
Pruning and clustering are optimization techniques:
- Pruning: setting weights to zero
- Clustering: grouping weights together into clusters
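As a rough illustration of the two definitions above, the following minimal sketch applies both techniques to a flat list of weights. It is not the method used by any particular toolkit: the function names `prune_by_magnitude` and `cluster_weights` are hypothetical, pruning is assumed to be magnitude-based, and clustering is assumed to be a simple one-dimensional k-means with evenly spaced starting centroids.

```python
import random


def prune_by_magnitude(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(round(sparsity * len(weights)))
    # Indices of the k weights with the smallest absolute value.
    smallest = sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:k]
    pruned = list(weights)
    for i in smallest:
        pruned[i] = 0.0
    return pruned


def cluster_weights(weights, n_clusters, n_iter=20):
    """Replace each weight with the nearest of n_clusters centroids (n_clusters >= 2)."""
    lo, hi = min(weights), max(weights)
    # Assumption: centroids start evenly spaced over the weight range.
    centroids = [lo + (hi - lo) * c / (n_clusters - 1) for c in range(n_clusters)]

    def nearest(w):
        return min(range(n_clusters), key=lambda c: abs(w - centroids[c]))

    for _ in range(n_iter):
        # Assign every weight to its nearest centroid.
        groups = [[] for _ in range(n_clusters)]
        for w in weights:
            groups[nearest(w)].append(w)
        # Move each centroid to the mean of its members; empty clusters stay put.
        for c, members in enumerate(groups):
            if members:
                centroids[c] = sum(members) / len(members)

    return [centroids[nearest(w)] for w in weights]


random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(64)]  # toy stand-in for a layer

pruned = prune_by_magnitude(weights, sparsity=0.5)
clustered = cluster_weights(weights, n_clusters=16)

print("zeros after pruning:", pruned.count(0.0))
print("unique values after clustering:", len(set(clustered)))
```

After pruning, half of the toy weights are zero and the rest are unchanged. After clustering, the 64 weights share at most 16 distinct values. In a real workflow both steps are normally followed by fine-tuning to recover accuracy, which this sketch omits.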
These techniques modify the weights of a Machine Learning model. In some cases, they enable:
- Significant speed-up of the inference execution
- Reduction of the memory footprint
- Reduction in the overall power consumption of the system
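One plausible way to see where the memory benefit comes from is to compare how well weights compress before and after optimization. The sketch below assumes int8 weights that are stored in a losslessly compressed form. It uses `zlib` purely as a stand-in: it is not the encoding used by Vela or the Ethos-U, and the weight values, the 50% sparsity, and the level spacing are all made-up illustration parameters.

```python
import random
import zlib

random.seed(0)

# Hypothetical int8 weights drawn from a Gaussian and clamped to [-127, 127].
dense = [max(-127, min(127, round(random.gauss(0, 30)))) for _ in range(4096)]

# Pruning: zero the 50% of weights with the smallest magnitude.
order = sorted(range(len(dense)), key=lambda i: abs(dense[i]))
pruned = list(dense)
for i in order[: len(dense) // 2]:
    pruned[i] = 0

# Clustering: snap every weight to the nearest multiple of 17,
# which leaves at most 15 distinct values in the int8 range.
step = 17
clustered = [round(w / step) * step for w in dense]


def compressed_size(values):
    """Size in bytes of the zlib-compressed int8 values."""
    raw = bytes(v & 0xFF for v in values)  # two's-complement int8 encoding
    return len(zlib.compress(raw, 9))


for name, values in [("dense", dense), ("pruned", pruned), ("clustered", clustered)]:
    print(name, compressed_size(values))
```

The pruned and clustered versions compress to fewer bytes than the dense one, because many zeros and few distinct values are both easier to encode. Fewer bytes to store also means fewer bytes to fetch from memory, which is one way the speed and power benefits listed above could follow.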
We assume that you can optimize your workload without loss in accuracy and that you target an Arm® Ethos-U NPU. You can therefore prune and cluster your neural network before compiling it with the Vela compiler and deploying it on the Ethos-U hardware. See the full article for more information on optimizing your workload.