Efficiently Packing Neural Network AI Models for the Edge
Packing applications into constrained on-chip memory is a familiar problem in embedded design, and it is now equally important when compacting neural network AI models into constrained storage. In some ways this problem is even more challenging than for conventional software, because working memory in neural-network-based systems is all “inner loop”: any need to page out to DDR memory can kill performance. Equally bad, repetitive DDR accesses during inference will blow typical low-power budgets for edge devices. A larger on-chip memory is one way to resolve the problem, but that adds to product cost. The best option, where possible, is to pack the model as efficiently as you can into the available memory.
When compiling a neural network AI model to run on an edge device, there are well-known quantization techniques to reduce its size: converting floating-point data and weight values to fixed point, then shrinking further to INT8 or smaller values. Imagine if you could go further. In this article I want to introduce a couple of graph optimization techniques that will let you fit a wider range of quantized models into, say, a 2 MB L2 memory, where these would not have fit after quantization alone.
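To make the size savings concrete, here is a minimal sketch of the float-to-INT8 step described above, assuming simple per-tensor symmetric quantization with NumPy; real compiler toolchains use more sophisticated calibration, and the layer shape here is purely hypothetical.

```python
# Minimal sketch: per-tensor symmetric INT8 weight quantization (assumed
# scheme for illustration, not any specific toolchain's implementation).
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Map float32 weights to int8 with a single per-tensor scale."""
    scale = np.abs(weights).max() / 127.0   # symmetric range [-127, 127]
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights for inference-time use."""
    return q.astype(np.float32) * scale

# Hypothetical layer: a 512x512 float32 weight matrix (1 MB as float32).
w = np.random.randn(512, 512).astype(np.float32)
q, scale = quantize_int8(w)

print(f"float32 size: {w.nbytes / 1024:.0f} KB")  # 1024 KB
print(f"int8 size:    {q.nbytes / 1024:.0f} KB")  # 256 KB, a 4x reduction
```

Each weight shrinks from 4 bytes to 1, so the stored model drops to roughly a quarter of its float32 size; the rounding error per weight is bounded by half the scale factor. The graph optimizations the article goes on to describe are applied on top of this baseline.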
To read the full article, click here