Analyzing multithreaded applications - Identifying performance bottlenecks on multicore systems
Nandan Tripathi and Amrit Singh, Freescale Semiconductor
EETimes (4/7/2011 11:04 AM EDT)
Abstract
Several factors prevent applications from achieving the theoretical maximum utilization of multicore resources: the operating system (scheduling, synchronization, etc.), the application code (parallelization factor, data/function decomposition, etc.), and the scalability of the hardware architecture (cores, memory subsystem, interconnects, etc.).
We use multithreaded execution scenarios generated with EEMBC's MultiBench as stimulus and introduce a step-by-step methodology to analyze these scenarios and identify the bottlenecks. We then discuss the techniques used (kernel tracing, time and function profiling, etc.) and the tools used to deploy the methodology. The paper ends with a discussion of case studies, each representing a different bottleneck.