ITP-STDP: An Intrinsic-Timing Power-of-Two Learning Engine for On-Chip SNN Training
By Haihang Xia 1, Xinyu Zhao 1, Xuecheng Wang 1, John Goodenough 1, Charith Abhayaratne 1, Panagiotis A. Panagiotou 1, Chunyi Song 2,3,4, and Tiantai Deng 5
1 School of Electrical and Electronic Engineering, The University of Sheffield, U.K.
2 Donghai Laboratory, Zhoushan 316021, China
3 Engineering Research Center of Oceanic Sensing Technology and Equipment, Ministry of Education, Zhoushan 316021, China
4 State Key Laboratory of Ocean Sensing and Ocean College, Zhejiang University, Zhoushan 316021, China
5 Donghai Laboratory, Zhoushan 316021, China

Abstract
Spiking neural networks (SNNs) have the potential to emerge as the third generation of neural networks and have attracted increasing attention across a wide range of applications. However, the large number of synaptic connections in SNNs makes weight updates computationally intensive for on-chip learning algorithms during training, resulting in substantial hardware resource utilization and energy consumption. Among existing SNN learning algorithms, spike-timing-dependent plasticity (STDP) is one of the most extensively studied and widely adopted, and it serves as a fundamental learning component in SNNs. To address the hardware and energy overheads of SNN training, this paper presents intrinsic-timing power-of-two STDP (ITP-STDP) and a corresponding prototype hardware architecture for the learning engine. The proposed design is analyzed with a dedicated mean-field synaptic drift model and validated on SNNs of different scales and on different datasets. It is also implemented on both ASIC and FPGA platforms and compared with state-of-the-art approaches, including the original STDP and more complex STDP variants. The results demonstrate superior energy efficiency, higher operating speed, and substantially lower hardware resource utilization, because the proposed design eliminates most of the computational overhead of STDP through both algorithmic and hardware-level optimizations. On the FPGA platform, the proposed design improves energy efficiency by 4.5x to 219.8x over the compared designs. On the ASIC platform, it achieves a 4.8x to 22.01x speedup while occupying only 1.2% to 3.3% of the area required by prior works.
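The abstract does not state the ITP-STDP update rule, so the following is only a generic sketch of why a power-of-two timing window can be cheap in hardware, not the authors' method. It assumes a pair-based STDP rule whose window halves every `step` time units, integer spike-time differences, and power-of-two amplitudes. The names `stdp_reference`, `stdp_pow2`, `amp_log2`, and `step` are hypothetical.

```python
def stdp_reference(dt, amp_log2=6, step=4):
    """Floating-point reference window: amplitude * 2**(-|dt| / step).

    dt > 0 or dt == 0 (pre before post) potentiates; dt < 0 depresses.
    All parameters are illustrative assumptions, not values from the paper.
    """
    magnitude = (2 ** amp_log2) * 2.0 ** (-abs(dt) / step)
    return magnitude if dt >= 0 else -magnitude


def stdp_pow2(dt, amp_log2=6, step=4):
    """Shift-only approximation of the reference window.

    The exponent |dt| / step is floored to an integer, so the update
    magnitude is a power of two obtained with a right shift instead of
    an exponential and a multiplication.
    """
    shift = abs(dt) // step                 # integer number of halvings
    magnitude = (1 << amp_log2) >> shift    # 2**amp_log2 shifted right
    return magnitude if dt >= 0 else -magnitude


if __name__ == "__main__":
    # Compare the two windows for a few spike-time differences.
    for dt in (-8, -3, 0, 3, 4, 8, 100):
        print(dt, stdp_reference(dt), stdp_pow2(dt))
```

In this sketch the two functions agree whenever `dt` is a multiple of `step`; between multiples the shift version holds the previous power of two, which is the quantization error traded for removing the multiplier.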