Analyzing multithreaded applications - Identifying performance bottlenecks on multicore systems
Nandan Tripathi and Amrit Singh, Freescale Semiconductor
EETimes (4/7/2011 11:04 AM EDT)
Abstract
Various aspects preventing applications from achieving theoretical maximum utilization of multicore resources include: operating system (scheduling, synchronization, etc.), application code (parallelization factor, data/function decomposition, etc.), and hardware architecture scalability (cores, memory subsystem, interconnects, etc.).
We use various multithreaded execution scenarios generated through EEMBC's Multibench as stimulus. We introduce a step by step methodology to analyze these scenarios and identify the bottlenecks. Techniques used for kernel tracing, time/function profiling, etc. and tools used to deploy the methodology are discussed next. The paper ends with discussion of various case studies representing different bottlenecks.
To read the full article, click here
Related Semiconductor IP
- Xtal Oscillator on TSMC CLN7FF
- Wide Range Programmable Integer PLL on UMC L65LL
- Wide Range Programmable Integer PLL on UMC L130EHS
- Wide Range Programmable Integer PLL on TSMC CLN90G-GT-LP
- Wide Range Programmable Integer PLL on TSMC CLN80GC
Related White Papers
- Multi-core multi-threaded SoCs pose debugging hurdles
- Achieving multicore performance in a single core SoC using a multi-threaded virtual multiprocessor: Part 1
- Achieving multicore performance in a single core SoC design using a multi-threaded virtual multiprocessor: Part 2
- Meeting Increasing Performance Requirements in Embedded Applications with Scalable Multicore Processors