Towards Delta Aware Training: Efficient DNN Weight Storage for Resource-Constrained FPGAs
By David Peter Federl, Lukas Einhaus, Andreas Erbslöh, Gregor Schiele
University of Duisburg-Essen, Germany

Abstract
Deploying embedded deep neural networks on resource-constrained field programmable gate arrays (FPGAs) is challenging because memory and computational capacity are limited. We introduce a new compression technique that reduces the memory footprint by storing weights as deltas with a lower bit width and training the network to cope with the compressed deltas. Two delta schemes are investigated: consecutive deltas and deltas relative to a fixed reference value. We evaluate both on the FashionMNIST data set with a multi-layer perceptron. The results indicate that fixed-reference delta compression outperforms the consecutive variant, achieving a validation accuracy of approximately 78.6 % with 4-bit weight deltas, an accuracy loss of roughly 8.3 % compared to an 8-bit fixed-point network. Our specialized hardware accelerator, built around a delta-compressed multiply-and-accumulate (MAC) operator, compresses weights by nearly 50 % and achieves a maximum throughput of 7.992M MACs/s on an AMD Spartan-7 S15 FPGA.
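To make the two delta schemes concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes integer (fixed-point) weights, signed 4-bit deltas that saturate when a difference does not fit, and, for the consecutive scheme, an encoder that tracks the value the decoder will reconstruct. The function names, the reference value of 0, and the sample weights are all hypothetical.

```python
def clamp(value, bits):
    """Saturate value to the signed two's-complement range of `bits` bits."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def encode_fixed_reference(weights, reference, bits=4):
    """Store each weight as a saturated delta to one shared reference value."""
    return [clamp(w - reference, bits) for w in weights]


def decode_fixed_reference(deltas, reference):
    """Reconstruct weights; each one depends only on the reference."""
    return [reference + d for d in deltas]


def encode_consecutive(weights, bits=4):
    """Store the first weight in full, then saturated deltas to the previous
    reconstructed weight (assumption: the encoder mirrors the decoder)."""
    first = weights[0]
    deltas = []
    previous = first
    for w in weights[1:]:
        d = clamp(w - previous, bits)
        deltas.append(d)
        previous += d  # the value the decoder will hold at this position
    return first, deltas


def decode_consecutive(first, deltas):
    """Reconstruct weights by accumulating deltas from the first weight."""
    out = [first]
    for d in deltas:
        out.append(out[-1] + d)
    return out


# Hypothetical integer weights, as they might appear after 8-bit quantization.
weights = [3, 5, -2, 10, 4]

fixed = decode_fixed_reference(encode_fixed_reference(weights, reference=0), 0)
first, deltas = encode_consecutive(weights)
consecutive = decode_consecutive(first, deltas)

print(fixed)        # [3, 5, -2, 7, 4]
print(consecutive)  # [3, 5, -2, 5, 4]
```

In this toy input, the weight 10 cannot be represented exactly under either scheme: the fixed-reference variant saturates it to 7, while the consecutive variant reaches only 5 because the preceding weight was -2 and the largest positive 4-bit step is 7. Training the network with such saturation in the loop is what lets it adapt to the reduced delta range.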
Keywords: embedded deep neural networks, resource-constrained, field programmable gate array, hardware accelerator, fixed-point arithmetic, quantization aware training, weight compression