Providing memory system and compiler support for MPSoC designs: Compiler Support (Part 3)
Embedded.com (01/07/09, 01:41:00 AM EST)
An optimizing compiler that targets MPSoC environments should tackle a number of critical issues. Building on what was learned in Part 1 and Part 2, we first explain these issues and then study potential solutions. From the performance viewpoint, perhaps the two most important memory-related tasks to be performed in an MPSoC environment are optimizing parallelism and locality. Other important issues relate to power/energy consumption and memory space.
The problem with parallelism
Optimizing parallelism is obviously important, since parallelism is the main reason to employ multiple processors in a single unit. In fact, a parallelization strategy determines how memory is utilized by multiple on-chip processors and can be an important factor in achieving acceptable performance. However, maximum parallelism may not always be easy to achieve, for several reasons. For example, intrinsic data dependences in the code may not allow full utilization of all on-chip processors. Similarly, in some cases, interprocessor communication costs can become overwhelming as the number of processors increases.
Finally, the performance benefits of increased interprocessor parallelism may not be sufficient to justify the accompanying increase in power consumption. For all these reasons, it may be preferable to avoid increasing the number of processors arbitrarily. In addition, different parts of the same application may demand different numbers of processors, which makes the problem much harder.
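The trade-off described above can be made concrete with a toy cost model. The following is only a sketch under stated assumptions, not a model taken from this series: the `workload_model` structure, its fields, and the linear communication term are all hypothetical, chosen to show why the processor count that minimizes execution time need not be the one that minimizes the energy-delay product.

```c
#include <assert.h>

/* Hypothetical cost model; every field is an illustrative assumption. */
typedef struct {
    double serial_time;   /* time that cannot be parallelized (dependences) */
    double parallel_time; /* time that divides evenly across processors */
    double comm_time;     /* communication cost added per extra processor */
    double active_power;  /* power drawn by each active processor */
} workload_model;

/* Estimated execution time on p processors. */
double exec_time(const workload_model *w, int p)
{
    assert(p >= 1);
    return w->serial_time + w->parallel_time / p + w->comm_time * (p - 1);
}

/* Estimated energy: p active processors running for exec_time. */
double energy(const workload_model *w, int p)
{
    return p * w->active_power * exec_time(w, p);
}

/* Processor count in [1, max_p] that minimizes the energy-delay product. */
int best_processor_count(const workload_model *w, int max_p)
{
    int best = 1;
    double best_edp = energy(w, 1) * exec_time(w, 1);

    for (int p = 2; p <= max_p; p++) {
        double edp = energy(w, p) * exec_time(w, p);
        if (edp < best_edp) {
            best_edp = edp;
            best = p;
        }
    }
    return best;
}
```

With the assumed values `{10.0, 80.0, 2.0, 1.0}` and up to eight processors, estimated execution time keeps improving up to six processors, yet the energy-delay product is already lowest at three. A compiler could in principle evaluate such a model separately for each program region, which is one way to handle regions that demand different numbers of processors.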
Instruction and Data Locality
An equally important problem is ensuring locality of data and instruction accesses. Although achieving acceptable instruction cache performance is not very difficult (since instructions are read-only and exhibit good spatial locality), the same cannot be said for data locality.
This is because straightforward coding of many applications can lead to poor data cache utilization. In addition, in an MPSoC environment, interprocessor communication can lead to frequent cache line invalidations/updates (due to interprocessor data sharing), which in turn increases overall latency.
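As an illustration of how straightforward coding can hurt data cache utilization, consider summing a matrix stored in row-major order. This is a minimal sketch; the function names are hypothetical. Both functions compute the same result, but the first strides through memory by `cols` elements on every access, while the second visits consecutive addresses and so uses each fetched cache line in full. Converting the first form into the second is the classic loop interchange transformation.

```c
#include <stddef.h>

/* Column-by-column traversal of a row-major matrix:
 * consecutive accesses are cols elements apart in memory. */
long sum_by_columns(const int *a, size_t rows, size_t cols)
{
    long sum = 0;
    for (size_t j = 0; j < cols; j++)
        for (size_t i = 0; i < rows; i++)
            sum += a[i * cols + j];
    return sum;
}

/* Same computation after loop interchange:
 * consecutive accesses touch adjacent memory locations. */
long sum_by_rows(const int *a, size_t rows, size_t cols)
{
    long sum = 0;
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            sum += a[i * cols + j];
    return sum;
}
```

Interchange is legal here because the additions carry no dependence that the reordering would violate. Loop ordering of this kind improves locality on a single processor only; it does nothing about the invalidation traffic caused by interprocessor data sharing.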
This last issue becomes particularly problematic when false sharing occurs (i.e., multiple processors share a cache line but access different data within it). Therefore, an important task for the compiler is to minimize false sharing as much as possible.
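One common way to remove false sharing is to pad or align per-processor data so that each item occupies its own cache line. The sketch below assumes a 64-byte line and four processors; both constants, and all type and function names, are hypothetical and would have to match the actual target. It relies on the C11 `_Alignas` specifier.

```c
#include <stddef.h>

#define CACHE_LINE_SIZE 64 /* assumed line size; target-specific */
#define NUM_PROCS 4        /* assumed processor count */

/* Packed layout: neighboring array elements share a cache line. */
typedef struct {
    long count;
} packed_counter;

/* Padded layout: alignment rounds each element up to a full line. */
typedef struct {
    _Alignas(CACHE_LINE_SIZE) long count;
} padded_counter;

/* Returns 1 if two byte offsets (measured from a line-aligned base)
 * fall within the same cache line, 0 otherwise. */
int same_cache_line(size_t off_a, size_t off_b)
{
    return off_a / CACHE_LINE_SIZE == off_b / CACHE_LINE_SIZE;
}

/* One counter per processor; each processor writes only its own slot. */
static padded_counter counters[NUM_PROCS];

void count_event(int proc_id)
{
    counters[proc_id].count++;
}

long read_count(int proc_id)
{
    return counters[proc_id].count;
}
```

With `packed_counter`, the slots for processors 0 and 1 sit a few bytes apart, so a write by one processor invalidates the line holding the other's counter even though no data is actually shared. With `padded_counter`, the slots are `CACHE_LINE_SIZE` bytes apart and that interference disappears, at the price of extra memory space.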