Benefit of pruning and clustering a neural network for before deploying on Arm Ethos-U NPU
Pruning and clustering are optimization techniques:
- Pruning: setting weights to zero
- Clustering: grouping weights together into clusters
These techniques modify the weights of a Machine Learning model. In some cases, they enable:
- Significant speed-up of the inference execution
- Reduction of the memory footprint
- Reduction in the overall power consumption of the system
We assume that you can optimize your workload without loss in accuracy and that you target an Arm® Ethos NPU. You can therefore prune and cluster your neural network before using the Vela compiler and deploying it on the Ethos-U hardware. See below for more information on optimizing your workload.
Related Semiconductor IP
- AES GCM IP Core
- High Speed Ethernet Quad 10G to 100G PCS
- High Speed Ethernet Gen-2 Quad 100G PCS IP
- High Speed Ethernet 4/2/1-Lane 100G PCS
- High Speed Ethernet 2/4/8-Lane 200G/400G PCS
Related Blogs
- Reviewing different Neural Network Models for Multi-Agent games on Arm using Unity
- How Google and Arm Collaborate on the Next Wave of Cloud Infrastructure
- Neural Network Model quantization on mobile
- Pace of Innovation for Custom Silicon on Arm Continues with AWS Graviton4
Latest Blogs
- Why Choose Hard IP for Embedded FPGA in Aerospace and Defense Applications
- Migrating the CPU IP Development from MIPS to RISC-V Instruction Set Architecture
- Quintauris: Accelerating RISC-V Innovation for next-gen Hardware
- Say Goodbye to Limits and Hello to Freedom of Scalability in the MIPS P8700
- Why is Hard IP a Better Solution for Embedded FPGA (eFPGA) Technology?