Using scheduled cache modeling to reduce memory latencies in multicore DSP designs
By Ofer Lent, Moshe Anschel, Erez Steinberg, Itay Peled and Amir Kleen (Freescale)
Embedded.com (10/13/09, 08:28 PM EDT)
The most advanced high-end DSP cores on the market today are fully cache-based by design, while maintaining low latency when accessing the higher levels of the memory hierarchy (L2/L3). The performance of a cache-based DSP system is strongly affected by the cache hit ratio and by the miss penalty.
Hit ratio - the number of accesses that hit in the cache divided by the total number of accesses (hit count + miss count) - depends on the application's temporal and spatial locality. Miss penalty - the number of cycles the core waits for a miss to be served - depends on the physical location of the data in the memory system at the time of the cache miss.
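For illustration, the sketch below computes the two metrics from raw hit, miss, and stall-cycle counts. The counter values and the one-cycle hit time are placeholder assumptions, not figures measured on the MSC8156; on real hardware the counts would come from performance-monitor counters.

```c
#include <stdio.h>

/* Illustrative calculation of the two cache metrics defined above.
 * All values are placeholders standing in for performance-counter readings. */
int main(void)
{
    unsigned long long hits         = 9500000ULL;   /* accesses served by the cache         */
    unsigned long long misses       = 500000ULL;    /* accesses that missed the cache       */
    unsigned long long stall_cycles = 30000000ULL;  /* total cycles spent waiting on misses */

    double hit_ratio    = (double)hits / (double)(hits + misses);
    double miss_penalty = (double)stall_cycles / (double)misses;  /* average cycles per miss */

    /* Average cost of an access, assuming a 1-cycle hit time (an assumption). */
    double avg_access   = 1.0 + (1.0 - hit_ratio) * miss_penalty;

    printf("hit ratio    = %.3f\n", hit_ratio);
    printf("miss penalty = %.1f cycles\n", miss_penalty);
    printf("avg access   = %.2f cycles\n", avg_access);
    return 0;
}
```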
Traditional systems rely on the Direct Memory Access (DMA) model, in which a DMA controller moves data into a memory closer to the core. This approach is complicated and requires precise, restrictive scheduling to keep the data coherent, as the sketch below illustrates.
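The following sketch shows a typical double-buffered DMA flow and the explicit start/wait synchronization it demands. dma_start(), dma_wait(), and process_block() are hypothetical wrappers around a DMA controller driver, not an actual Freescale API.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_WORDS 1024

extern void dma_start(void *dst, const void *src, size_t bytes); /* assumed driver call        */
extern void dma_wait(void);                                      /* blocks until transfer done */
extern void process_block(const int32_t *data, size_t words);

void process_stream_dma(const int32_t *ext_mem, size_t blocks)
{
    static int32_t local[2][BLOCK_WORDS];   /* ping-pong buffers in near memory */

    /* Prime the pipeline: fetch block 0 before processing starts. */
    dma_start(local[0], ext_mem, sizeof(local[0]));

    for (size_t i = 0; i < blocks; i++) {
        dma_wait();                         /* block i must be resident before use */

        /* Schedule the next transfer so it overlaps with computation;
         * getting this scheduling wrong corrupts data or stalls the core. */
        if (i + 1 < blocks)
            dma_start(local[(i + 1) & 1],
                      ext_mem + (i + 1) * BLOCK_WORDS,
                      sizeof(local[0]));

        process_block(local[i & 1], BLOCK_WORDS);
    }
}
```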
As an alternative, this article describes a new software model - and the hardware mechanisms that support it - used in the Freescale SC3850 StarCore DSP subsystem residing in the MSC8156 multicore DSP. Called the scheduled cache model, it reduces the need for DMA programming and synchronization to achieve high core utilization.
The scheduled cache model relies on hardware mechanisms (some of which are controlled by software) to increase cache efficiency. Using these mechanisms can yield DMA-like performance while maintaining the programming simplicity of a cache-based model.
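As a contrast to the DMA flow above, the sketch below conveys the general idea of software-assisted prefetching into the cache: the application hints at upcoming data instead of programming transfers and waiting on them. prefetch_line() and the 64-byte line size are assumptions standing in for whatever prefetch mechanism the SC3850 exposes, not its documented interface.

```c
#include <stddef.h>
#include <stdint.h>

#define LINE_BYTES  64          /* assumed cache-line size */
#define BLOCK_WORDS 1024

extern void prefetch_line(const void *addr);   /* placeholder for a prefetch instruction/intrinsic */
extern void process_block(const int32_t *data, size_t words);

void process_stream_cached(const int32_t *data, size_t blocks)
{
    for (size_t i = 0; i < blocks; i++) {
        const int32_t *next = data + (i + 1) * BLOCK_WORDS;

        /* Hint the cache to start fetching block i+1 while block i is processed.
         * No completion check or coherency bookkeeping is needed: the cache
         * hardware keeps the data coherent and late hints simply become misses. */
        if (i + 1 < blocks)
            for (size_t b = 0; b < BLOCK_WORDS * sizeof(int32_t); b += LINE_BYTES)
                prefetch_line((const uint8_t *)next + b);

        process_block(data + i * BLOCK_WORDS, BLOCK_WORDS);
    }
}
```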