Part 3: High-Bandwidth Accelerator Access to Memory: Enabling Optimized Data Transfers with RISC-V
This is the third in a series of blog posts about domain-specific accelerators (DSAs), which are becoming increasingly common in system-on-chip (SoC) designs. Part 1 addressed the challenges of data transfers between DSAs and the core complex, and showed how RISC-V offers a unique opportunity to optimize fine-grain communication between them and improve core-DSA interaction performance. Part 2 addressed the challenges of point-to-point ordering between cores and DSA memory, and showed how RISC-V offers a unique opportunity to optimize high-bandwidth communication between cores and DSAs. This third instalment focuses on the challenges of data transfers between DSAs and memories such as DDR, LPDDR or HBM, and explains how RISC-V-based SoCs can use an alternate approach to write the data directly to memory.
To recap, a DSA provides higher performance per watt by optimizing the specialized function it implements. Examples of DSAs include compression/decompression units, random number generators and network packet processors. A DSA is typically connected to the core complex using a standard IO interconnect, such as an AXI bus (Figure 1).