Self-Compressing Neural Networks
The last decade of AI research has been characterised by the exploration of the potential of deep neural networks. The advances we have seen in recent years can be at least partly attributed to the increasing size of networks. Considerable effort has been put into creating larger and more complex architectures capable of increasingly impressive feats, from text generation with GPT-3 [1] to image generation with Imagen [2]. Moreover, the success of modern neural networks has led to their deployment in a wide variety of applications. Even as I'm writing this, a neural network is attempting to predict the next word I'm about to write, albeit not accurately enough to replace me anytime soon!
Performance optimisation, on the other hand, has received relatively little attention in the field, which is a significant obstacle to the wider deployment of neural networks. A likely reason for this is the ability to train large neural networks in data centres on thousands of GPUs or other hardware simultaneously. This contrasts with the field of computer graphics for example, where the constraint of having to run in real-time on a single computer created a strong incentive to optimise algorithms without sacrificing quality.
Research in neural network capacity suggests that network capacities needed to discover high-accuracy solutions are greater than capacities needed to represent these solutions. In their paper, The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks, Frankle and Carbin [3] found that only a small fraction of weights in a network are needed to represent a good solution, but directly training a reduced capacity network doesn’t lead to the same level of accuracy. Similarly, Hinton et al. [4] found that transferring “knowledge” from a high-accuracy network to a low-capacity one can produce a network with higher accuracy than training using the same loss function as the high-capacity network.
In this blog post, we ask whether it is possible to dynamically reduce a network’s parameters while training.
To read the full article, click here
Related Semiconductor IP
- eDP 2.0 Verification IP
- Gen#2 of 64-bit RISC-V core with out-of-order pipeline based complex
- LLM AI IP Core
- Post-Quantum Digital Signature IP Core
- Compact Embedded RISC-V Processor
Related Blogs
- Embedded Vision: The Road Ahead for Neural Networks and Five Likely Surprises
- Push-button generation of deep neural networks
- Hierarchical Neural Networks
- Deployable Artificial Neural Networks Will Change Everything
Latest Blogs
- Enhancing PCIe6.0 Performance: Flit Sequence Numbers and Selective NAK Explained
- Smarter ASICs and SoCs: Unlocking Real-World Connectivity with eFPGA and Data Converters
- RISC-V Takes First Step Toward International Standardization as ISO/IEC JTC1 Grants PAS Submitter Status
- Running Optimized PyTorch Models on Cadence DSPs with ExecuTorch
- PCIe 6.x: Synopsys IP Selected as First Gold System for Compliance Testing
