Sensitivity-Aware Mixed-Precision Quantization for ReRAM-based Computing-in-Memory

By Guan-Cheng Chen¹, Chieh-Lin Tsai², Pei-Hsuan Tsai¹, Yuan-Hao Chang²
¹ National Cheng Kung University, Tainan, Taiwan
² National Taiwan University, Taipei, Taiwan

Abstract

Computing-in-Memory (CIM) systems, particularly those utilizing ReRAM and memristive technologies, offer a promising path toward energy-efficient neural network computation. However, conventional quantization and compression techniques often fail to fully optimize performance and efficiency in these architectures. In this work, we present a structured quantization method that combines sensitivity analysis with mixed-precision strategies to enhance weight storage and computational performance on ReRAM-based CIM systems. Our approach improves ReRAM crossbar utilization, significantly reducing power consumption, latency, and computational load while maintaining high accuracy. Experimental results show 86.33% accuracy at 70% compression, alongside a 40% reduction in power consumption, demonstrating the method's effectiveness for power-constrained applications.
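
The general idea of pairing sensitivity analysis with mixed precision can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not specify the sensitivity metric, the bit widths, or how weights are mapped to crossbars. The example below assumes, purely for illustration, that sensitivity is the mean squared error a layer incurs under low-bit uniform quantization, and that the most sensitive fraction of layers is kept at a higher bit width. All names (`quantize`, `sensitivity`, `assign_bits`, `compression_ratio`) and the toy layers are hypothetical.

```python
import random


def quantize(weights, bits):
    """Uniform symmetric quantization of a list of floats to `bits` bits."""
    levels = 2 ** (bits - 1) - 1                    # e.g. 7 positive levels for 4 bits
    max_abs = max(abs(w) for w in weights) or 1.0   # avoid a zero scale
    scale = max_abs / levels
    return [round(w / scale) * scale for w in weights]


def sensitivity(weights, bits):
    """Assumed proxy: mean squared error introduced by quantizing to `bits` bits."""
    quantized = quantize(weights, bits)
    return sum((w - q) ** 2 for w, q in zip(weights, quantized)) / len(weights)


def assign_bits(layers, low_bits=4, high_bits=8, high_fraction=0.3):
    """Keep the most sensitive layers at `high_bits`, the rest at `low_bits`."""
    ranked = sorted(layers, key=lambda name: sensitivity(layers[name], low_bits),
                    reverse=True)
    n_high = max(1, round(high_fraction * len(ranked)))
    return {name: (high_bits if i < n_high else low_bits)
            for i, name in enumerate(ranked)}


def compression_ratio(layers, bit_map, baseline_bits=32):
    """Fraction of storage saved relative to a `baseline_bits` representation."""
    total = sum(len(w) for w in layers.values())
    used = sum(len(layers[name]) * bit_map[name] for name in layers)
    return 1 - used / (total * baseline_bits)


# Toy "layers" with different weight spreads (hypothetical data, not from the paper).
rng = random.Random(0)
layers = {
    "conv1": [rng.gauss(0, 1.0) for _ in range(256)],
    "conv2": [rng.gauss(0, 0.1) for _ in range(1024)],
    "fc": [rng.gauss(0, 0.5) for _ in range(512)],
}
bit_map = assign_bits(layers)
print(bit_map)
print(f"storage saved vs. 32-bit: {compression_ratio(layers, bit_map):.1%}")
```

In this toy setup the layer with the widest weight distribution suffers the largest quantization error and is therefore kept at the higher precision. A real ReRAM-oriented flow would also have to account for cell resolution and crossbar mapping, which this sketch does not model.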

