Using scheduled cache modeling to reduce memory latencies in multicore DSP designs
By Ofer Lent, Moshe Anschel, Erez Steinberg, Itay Peled and Amir Kleen (Freescale)
Embedded.com, (10/13/09, 08:28:00 PM EDT)
The most advanced high-end DSP cores on the market today are fully cache-based, yet they maintain low latency when accessing the higher levels of the memory hierarchy (L2/L3). The performance of cache-based DSP systems depends heavily on the cache hit ratio and on the miss penalty.
Hit ratio - the number of accesses that "hit" in the cache divided by the total number of accesses ("hit" count + "miss" count) - depends on the application's temporal and spatial locality. Miss penalty - the number of cycles that the core waits for a "miss" to be served - depends on the physical location of the data in the memory system at the time of the cache miss.
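The two definitions above can be made concrete with a minimal sketch. The function names, the access counts and the penalty values below are illustrative assumptions, not figures from the SC3850 or MSC8156; the stall estimate also assumes that hits cost no stall cycles and that every miss costs the same fixed penalty, which a real memory system will not guarantee.

```python
def hit_ratio(hits: int, misses: int) -> float:
    """Hit ratio = hits / (hits + misses), as defined in the text."""
    total = hits + misses
    if total == 0:
        raise ValueError("no accesses recorded")
    return hits / total


def average_stall_cycles(hits: int, misses: int, miss_penalty_cycles: float) -> float:
    """Average cycles the core stalls per access.

    Simplifying assumption (not from the article): a hit stalls the core
    for zero cycles and every miss stalls it for the same penalty.
    """
    total = hits + misses
    if total == 0:
        raise ValueError("no accesses recorded")
    return (misses / total) * miss_penalty_cycles


if __name__ == "__main__":
    # Hypothetical counts: 950 hits and 50 misses out of 1000 accesses.
    hits, misses = 950, 50
    print("hit ratio:", hit_ratio(hits, misses))

    # Same hit ratio, two hypothetical miss penalties: data served from a
    # far memory versus data already sitting in a closer level.
    for penalty in (40, 10):
        stall = average_stall_cycles(hits, misses, penalty)
        print(f"penalty {penalty} cycles -> average stall per access: {stall:.3f}")
```

Under these assumptions, the same hit ratio gives a proportionally lower average stall when the miss is served from a closer memory level, which is why both the hit ratio and the miss penalty matter.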
Traditional systems rely on the Direct Memory Access (DMA) model in which the DMA controller is used to move data to a memory closer to the core. This method is complicated and requires precise, restrictive scheduling to achieve coherency.
As an alternative, this article describes a new software model - and the hardware mechanisms that support it - used in the Freescale SC3850 StarCore DSP subsystem residing in the MSC8156 multicore DSP. Called the scheduled cache model, it reduces the need for DMA programming and synchronization to achieve high core utilization.
The scheduled cache model relies on hardware mechanisms (some of which are controlled by software) to increase cache efficiency. Using these mechanisms can yield DMA-like performance while maintaining