CQ-CiM: Hardware-Aware Embedding Shaping for Robust CiM-Based Retrieval
By Xinzhao Li 1, Alptekin Vardar 3, Franz Müller 3, Navya Goli 2, Umamaheswara Tida 2, Kai Ni 4, X. Sharon Hu 4, Thomas Kämpfe 3, Ruiyang Qin 1
1 Villanova University,
2 NDSU,
3 Fraunhofer IPMS,
4 University of Notre Dame

Abstract
Deploying Retrieval-Augmented Generation (RAG) on edge devices is in high demand, but is hindered by the latency of massive data movement and computation on traditional architectures. Compute-in-Memory (CiM) architectures address this bottleneck by performing vector search directly within their crossbar structure. However, CiM's adoption for RAG is limited by a fundamental ``representation gap,'' as high-precision, high-dimension embeddings are incompatible with CiM's low-precision, low-dimension array constraints. This gap is compounded by the diversity of CiM implementations (e.g., SRAM, ReRAM, FeFET), each with unique designs (e.g., 2-bit cells, 512x512 arrays). Consequently, RAG data must be naively reshaped to fit each target implementation. Current data shaping methods handle dimension and precision disjointly, which degrades data fidelity. This not only negates the advantages of CiM for RAG but also confuses hardware designers, making it unclear if a failure is due to the circuit design or the degraded input data. As a result, CiM adoption remains limited. In this paper, we introduce CQ-CiM, a unified, hardware-aware data shaping framework that jointly learns Compression and Quantization to produce CiM-compatible low-bit embeddings for diverse CiM designs. To the best of our knowledge, this is the first work to shape data for comprehensive CiM usage on RAG.
To read the full article, click here
Related Semiconductor IP
- Band-Gap Voltage Reference with dual 2µA Current Source - X-FAB XT018
- 250nA-88μA Current Reference - X-FAB XT018-0.18μm BCD-on-SOI CMOS
- UCIe D2D Adapter & PHY Integrated IP
- Low Dropout (LDO) Regulator
- 16-Bit xSPI PSRAM PHY
Related Articles
- Leveraging FPGAs for Homomorphic Matrix-Vector Multiplication in Oblivious Message Retrieval
- Dataquest forecasts robust IP market growth
- Robust verification deserves an audit
- Robust designs cut vendor hype
Latest Articles
- SCENIC: Stream Computation-Enhanced SmartNIC
- Agentic AI-based Coverage Closure for Formal Verification
- Microarchitectural Co-Optimization for Sustained Throughput of RISC-V Multi-Lane Chaining Vector Processors
- RISC-V Functional Safety for Autonomous Automotive Systems: An Analytical Framework and Research Roadmap for ML-Assisted Certification
- Emulation-based System-on-Chip Security Verification: Challenges and Opportunities