Using scheduled cache modeling to reduce memory latencies in multicore DSP designs
By Ofer Lent, Moshe Anschel, Erez Steinberg, Itay Peled and Amir Kleen (Freescale)
Embedded.com (10/13/09, 08:28:00 PM EDT)
Today's most advanced high-end DSP cores are fully cache-based, yet they maintain low latency when accessing the higher levels of the memory hierarchy (L2/L3). The performance of cache-based DSP systems depends heavily on the cache hit ratio and the miss penalty.
Hit ratio - the number of accesses that "hit" in the cache divided by the total number of accesses ("hit" count + "miss" count) - depends on the application's temporal and spatial locality. Miss penalty - the number of cycles that the core waits for a "miss" to be served - depends on the physical location of the data in the memory system at the time of the cache miss.
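The two definitions above can be combined into a rough estimate of how many cycles the core stalls per access. The following is a minimal sketch, not taken from the article: the function names and the sample numbers are hypothetical, and it assumes a single fixed miss penalty, whereas in a real system the penalty varies with where the data resides (L2, L3 or external memory).

```python
def hit_ratio(hits: int, misses: int) -> float:
    """Hit ratio = hits / (hits + misses), as defined in the text."""
    total = hits + misses
    if total == 0:
        raise ValueError("no accesses recorded")
    return hits / total


def avg_stall_cycles(hits: int, misses: int, miss_penalty_cycles: int) -> float:
    """Average cycles the core stalls per access.

    Assumption (illustrative only): every miss costs the same fixed
    number of cycles, so the average stall is miss ratio * miss penalty.
    """
    total = hits + misses
    if total == 0:
        raise ValueError("no accesses recorded")
    return misses * miss_penalty_cycles / total


if __name__ == "__main__":
    # Hypothetical counts: 950 hits, 50 misses, 40-cycle miss penalty.
    print(hit_ratio(950, 50))
    print(avg_stall_cycles(950, 50, 40))
```

Under this simplified model, either raising the hit ratio or lowering the miss penalty reduces the average stall, which is why both quantities matter for core utilization.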
Traditional systems rely on the Direct Memory Access (DMA) model, in which the DMA controller moves data to a memory closer to the core. This method is complicated and requires precise, restrictive scheduling to achieve coherency.
As an alternative, this article describes a new software model - and the hardware mechanisms that support it - used in the Freescale SC3850 StarCore DSP subsystem residing in the MSC8156 multi-core DSP. Called the scheduled cache model, it reduces the need for DMA programming and synchronization to achieve high core utilization.
The scheduled cache model relies on hardware mechanisms (some of which are controlled by software) to increase cache efficiency. Using these mechanisms can yield DMA-like performance while maintaining