Statistical Profile Extension: extracting value from SPE for SoC Telemetry
The Arm Statistical Profiling Extension (SPE) is an architectural feature designed for enhanced instruction execution profiling within Arm CPUs. This feature has been available since the introduction of the Neoverse N1 CPU platform in 2019, along with performance monitor units (PMUs) generally available in Arm CPUs. An important step in extracting value from capabilities like SPE and PMUs is the tooling, documentation, and examples to form a top-down solution for SoC telemetry. Six engineers at Arm recently published a detailed white paper on the use of SPE for performance analysis. Their approach and findings are summarized here. This blog post aims to introduce the concept of using SPE for performance analysis and root cause analysis, targeting software developers, performance analysts, and silicon engineers.
Arm SPE is a hardware-assisted CPU profiling mechanism that offers detailed profiling capabilities. It records key execution data, including program counters, data addresses, and PMU events. SPE enhances performance analysis for branches, memory access, and more, making it useful for software optimization. SPE data can be applied for precise sampling in source code hotspot detection, memory access analysis, and data sharing analysis using tools like the Linux perf tool. SPE sampling involves four stages: statistical selection of operations, recording key execution information, post-filtering of sample records, and storing records in memory. It enables efficient profiling and data extraction using monitoring tools. SPE uses a down counter to periodically select micro-operations for profiling. SPE sample records capture the execution lifecycle of an operation, starting at the CPU backend.
To read the full article, click here
Related Semiconductor IP
- 1.8V/3.3V I/O library with ODIO and 5V HPD in TSMC 16nm
- 1.8V/3.3V I/O Library with ODIO and 5V HPD in TSMC 12nm
- 1.8V to 5V GPIO, 1.8V to 5V Analog in TSMC 180nm BCD
- 1.8V/3.3V GPIO Library with HDMI, Aanlog & LVDS Cells in TSMC 22nm
- Specialed 20V Analog I/O in TSMC 55nm
Related Blogs
- SoC QoS gets help from machine learning
- The future of tooling from IP configuration to SoC verification
- 5 Strategies for Protecting Your Advanced SoC Designs from Security Breaches
- Alif Is Creating SoC Solutions for Machine Learning with Cadence and Arm
Latest Blogs
- Cadence Unveils the Industry’s First eUSB2V2 IP Solutions
- Half of the Compute Shipped to Top Hyperscalers in 2025 will be Arm-based
- Industry's First Verification IP for Display Port Automotive Extensions (DP AE)
- IMG DXT GPU: A Game-Changer for Gaming Smartphones
- Rivos and Canonical partner to deliver scalable RISC-V solutions in Data Centers and enable an enterprise-grade Ubuntu experience across Rivos platforms