Practical Applications of Data Abstraction Techniques for Embedded Systems Debug
Charles Janac, Arteris S.A. -- Paris, France
Abstract
Current embedded processor-based platforms, while enabling the rapid implementation of complex and wide ranging functionality on a single device, present new challenges in terms of overall design methodology. One of the most significant is the debug of functional operation, the implementation of which is often spread across multiple software and hardware components, and masked in protocol layers and signal transformations.
This paper presents practical applications that illustrate the issues hampering the debug of an embedded platform, along with methods to mitigate them. We discuss the debug of a “Network on Chip” (NoC) communication mechanism which, while offering potential performance, power and interconnect flexibility gains for the target system, hides much of the implementation detail. We also demonstrate the use of an FPGA-based rapid prototyping solution, which delivers verification performance gains by hiding or transforming circuit structure, possibly at the expense of signal visibility.
Introduction
Today’s modern embedded platform often consists of one or more control processors cooperating with multiple slave signal or other dedicated processors, a collection of interacting storage mechanisms, a number of peripheral components, and custom hardware blocks, all tied together with high- and low-speed communication paths. Operating on this collection of components is a range of software blocks that include firmware code segments for specific processors, application code to drive the whole system and provide user interfaces, middleware functions that enable specific activities, and a Real-Time Operating System (RTOS) to tie the whole thing together while providing intricate platform operational capability.
To debug even a basic embedded processor system, the data being manipulated must be inspected against both the state of the major hardware components and the flow of the software on the various processors. By way of example, let’s consider an audio signal processing platform utilizing many of the features noted above (Figure 1). Discovering and understanding operational detail and tracking potential problems across such a diverse and often beguiling range of hardware and software infrastructure requires inventive data abstraction approaches beyond traditional visualization methods.
Figure 1: Debugging an embedded processor platform with a NoC
Network-on-Chip
Our first application case study incorporates a NoC transport mechanism, which takes communication complexity a level further. While the NoC system eliminates design issues associated with a more conventional Northbridge / Southbridge bus system and provides power and performance gains, it also introduces network-style visibility issues.
A NoC-based system can take a number of forms. The one used in this example has ports that interface to common standard buses. Each port extracts the data from the bus transfer and adds information about its routing. The data and associated control words are structured into packets and then passed at high speed over the network to a destination port. The internal network structure consists of routers, synchronizers, various adapters, and QoS-aware arbiters that provide for very rapid data transfer.
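The ingress-port behavior described above can be sketched roughly as follows. This is only an illustration and not the packet format of any particular NoC: the `Packet` fields, the `ADDRESS_MAP` table, the port numbers and the `route` and `packetize` helpers are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Packet:
    """Hypothetical NoC packet: routing header plus extracted bus data."""
    dest_port: int            # routing information added by the ingress port
    src_port: int             # lets a response find its way back
    offset: int               # address relative to the destination's base
    payload: Tuple[int, ...]  # data words extracted from the bus transfer


# Hypothetical address map: (base address, size, destination port).
ADDRESS_MAP = [
    (0x0000_0000, 0x1000_0000, 0),  # e.g. a memory controller
    (0x4000_0000, 0x0010_0000, 1),  # e.g. an audio peripheral
]


def route(address: int) -> Tuple[int, int]:
    """Translate a bus address into (destination port, local offset)."""
    for base, size, port in ADDRESS_MAP:
        if base <= address < base + size:
            return port, address - base
    raise ValueError(f"address {address:#x} is not mapped to any port")


def packetize(src_port: int, address: int, data: Iterable[int]) -> Packet:
    """Model an ingress port: strip the bus protocol, attach routing."""
    dest_port, offset = route(address)
    return Packet(dest_port, src_port, offset, tuple(data))
```

Once a transfer has been packetized in this way, the system address no longer appears on the wires inside the network; only the routing header does, which is why a debug tool needs knowledge of the mapping to relate a packet back to the original bus access.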
The peripheral connections into the NoC often utilize bus standards, such as AXI, and data passing through these ports may be debugged as if they were on such a bus. Once the data is within the network, the addressing memory map no longer applies and network protocol takes over as the information is directed to its destination. An error in this process is very hard to locate and handle, and thus requires special techniques. Sometimes a port may be added to the NoC to allow for data flow around the network to be examined directly.
From a methodology standpoint, the NoC is first tested as a standalone component. The environment illustrated below (Figure 2) generates RTL code and testbenches to check connectivity and to verify basic performance levels (peak bandwidth and minimum latency).
Figure 2 – NoC Environment
It should be pointed out that given the caching schemes and other memory management methods in practical use today, it is often hard to track data as it is moved around a system. The smallest change in operation, even during simulation, can result in a significant alteration in memory allocation sequences.
Data is also masked by protocol layers used within the bus or other communication system. These need to be abstracted away for the underlying information to be accessed. This has traditionally been accomplished with the use of a “bus functional model” or BFM, a device that can read bus signals and recognize protocol information or coding methods, delivering a pure data stream in an abstract form that matches the level required by other components. It has become clear that leveraging the abstraction enabled by these BFMs or “transactors” within the debug environment saves a significant amount of time. This time savings derives primarily from the “automation of understanding” provided by transactors: the user is no longer required to manually reverse-engineer the operation of the device from the signal-level details, as the transactor does all the work (Figure 3). Intelligent transactors might also provide additional critical information that eases the analysis and debug process, such as details about the underlying data being processed, its original source and ultimate destination, and its relationship to other operations in progress. Such transactors can even include verification structures such as assertion sets or other analysis mechanisms that greatly aid bug detection and correction.
Figure 3 – Abstraction in a typical bus-based system
Utilizing the powerful abstraction mechanisms provided by transaction-level capabilities, engineers can more easily analyze and navigate complex functionality and trace activity that might otherwise be obscured by the detailed implementation of the NoC platform. State-of-the-art debug systems now include features that leverage transaction-level data: visualization in traditional waveform views that supports analysis of transactions alongside detailed signal-level activity, sorting and filtering capabilities that make it easy to isolate operations of interest, comparison capabilities that enable operation-level analysis across multiple verification runs, and specialized graphical displays that support performance analysis work (Figure 4).
Figure 4 – Transaction visualization and analysis in an advanced debug system
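To make the filtering and run-to-run comparison ideas concrete, the following minimal sketch treats each transaction as a record with a few attributes. The `Transaction` fields and the `select` and `first_divergence` helpers are hypothetical names chosen for this illustration; they do not describe the data model of any particular debug product.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Sequence


@dataclass(frozen=True)
class Transaction:
    """Hypothetical transaction record stored in a debug database."""
    start: int    # cycle in which the operation began
    end: int      # cycle in which it completed
    kind: str     # "read" or "write"
    source: str   # initiating component
    dest: str     # target component
    address: int

    @property
    def latency(self) -> int:
        return self.end - self.start


def select(txns: Iterable[Transaction], **criteria) -> List[Transaction]:
    """Keep transactions whose attributes match, sorted by start time."""
    hits = [t for t in txns
            if all(getattr(t, name) == want for name, want in criteria.items())]
    return sorted(hits, key=lambda t: t.start)


def first_divergence(run_a: Sequence[Transaction],
                     run_b: Sequence[Transaction]) -> Optional[int]:
    """Index of the first operation that differs between two runs.

    Timing is deliberately ignored so that the comparison is made at the
    operation level rather than cycle by cycle.
    """
    def operation(t: Transaction):
        return (t.kind, t.source, t.dest, t.address)

    for index, (a, b) in enumerate(zip(run_a, run_b)):
        if operation(a) != operation(b):
            return index
    if len(run_a) != len(run_b):
        return min(len(run_a), len(run_b))
    return None
```

A call such as `select(run, source="cpu", kind="read")` isolates one class of operation, and the `latency` of each result can then feed a performance view. `first_divergence` points at the first place where two verification runs stop performing the same sequence of operations, even when their timing differs.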
The knowledge required to automate the abstraction process can be provided in a number of ways. If transactors are already required in order to translate data between system components, then these transactors likely already encapsulate the required protocol details and can be augmented to write abstract functional information directly to a debug database. The port located on the NoC may also include a BFM of sorts, custom built to understand the protocol within the network and deliver cleaned data into the database. A promising area of emerging technology enables the post-verification extraction of transaction-level data from signal-level details, via either a library of transaction extractors (suitable for standard protocols) or a user-provided specification of the transactions (suitable for proprietary or customized protocols). Although no broadly accepted standard language exists for the latter, commercial products based on standard assertion languages are available today. Figure 5 shows how one might specify an AHB single-read transaction using SystemVerilog Assertion (SVA) syntax.
Figure 5 – AHB single-read transaction specification using SVA
The advantages of an SVA-based approach are many – use of a standard language that is gaining in popularity, re-use of the same description to verify assertions (i.e., in support of protocol checking), and the use of local variables to capture and associate information (attributes) with transactions, just to name a few.
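The extraction idea can also be illustrated outside SVA. The Python sketch below walks a list of per-cycle signal samples and recovers AHB single-read transactions. It is not the SVA description of Figure 5; it is an independent sketch that assumes each sample is a dictionary keyed by the AHB signal names `HTRANS`, `HBURST`, `HWRITE`, `HADDR`, `HREADY` and `HRDATA`, and the `SingleRead` record and `extract_single_reads` function are hypothetical names. The `pending` variable plays the role that a local variable plays in an assertion: it captures the address in the address phase and associates it with the data returned later.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple

NONSEQ = 0b10   # HTRANS encoding for the first (or only) transfer of a burst
SINGLE = 0b000  # HBURST encoding for a single transfer


@dataclass(frozen=True)
class SingleRead:
    start_cycle: int  # cycle in which the address phase was accepted
    end_cycle: int    # cycle in which the read data was returned
    address: int
    data: int


def extract_single_reads(samples: Iterable[Dict[str, int]]) -> List[SingleRead]:
    """Recover single-read transactions from per-cycle AHB signal samples."""
    reads: List[SingleRead] = []
    pending: Optional[Tuple[int, int]] = None  # (start cycle, captured address)

    for cycle, s in enumerate(samples):
        if not s["HREADY"]:
            continue  # wait state: the current phases are stretched

        # HREADY high ends the data phase of a previously accepted read.
        if pending is not None:
            start, address = pending
            reads.append(SingleRead(start, cycle, address, s["HRDATA"]))
            pending = None

        # The same cycle may also carry the address phase of a new read.
        if (s["HTRANS"] == NONSEQ and s["HBURST"] == SINGLE
                and not s["HWRITE"]):
            pending = (cycle, s["HADDR"])

    return reads
```

Because the address and data phases of AHB are pipelined, the function checks for a completing data phase and a newly accepted address phase in the same cycle. Writes and burst transfers are simply ignored here; a fuller extractor would describe them as additional transaction types.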
FPGA-based Prototype Verification
Two of the factors driving the move towards embedded platforms are the reusability aspects of the platform hardware, as well as the greater ease with which functionality may be updated or repaired post hardware production through the use of software to replace custom RTL code. While software may be reloaded into a device after fabrication, it is still very convenient, and sometimes essential in the case of ROM-based firmware, to verify software operation with the hardware before fabrication. However, a purely simulation-based verification methodology is slower by an order of magnitude or more than what is required to run the substantial code segments necessary to gain operational confidence.
Our second application focuses on the debug of a system running on an FPGA-based rapid prototyping system (Figure 6). In our example, system integrators work with the hardware team to construct an FPGA board representation of the design early on in the overall methodology, such that it will be ready on cue as the hardware design nears completion. As the other phases of design near completion, the system integrators set the prototype board to run using software modules as they become available. Debug now requires a more complex software / hardware approach in which both must be inspected simultaneously.
Figure 6 - An example of a system verification methodology
At the time of construction, various nodes within the circuit that are significant for the debug process are tapped and brought to the FPGA pins. Key nodes are collected and wired to a bus so that data can be streamed out into a memory subsystem and read later. One advantage of an FPGA system is that the logic of the FPGA can also be “rewired” to bring other signals to the bus, provided that enough pins and a collection mechanism are available.
This mechanism allows many of the signals that correspond to nodes from the original design to be observed. Often, however, only clocked registers in the design will have been preserved in the FPGA version, with many of the intermediate signals having been optimized away when the design was re-synthesized for the FPGA. Today, most designers either do without these values, which results in much guesswork and inefficient debug of prototype operation, or rebuild the FPGA logic to include them, which leads to multiple long iterations as the necessary signal data is gathered incrementally and the engineer discovers in any given debug session that critical data is still missing.
New technology is emerging that combines formal and simulation techniques to solve this problem, regenerating missing signal data in software from the captured register configuration and contents. Data for the minimal required set of signals is streamed from the hardware back into a software database. Two operations are then performed. The first correlates the data back to the register transfer level (RTL), which allows it to be overlaid and analyzed in the context of the original design description using standard debug environments. Next, a “data expansion engine” that operates on the logic inferred from the RTL description transparently calculates missing signal data on the fly as the user asks questions and performs operations in the target debug system. For the end users, it is as if they are debugging the results of a simulation in which all signal data has been captured, operating on the familiar RTL code and using their debug system of choice (Figure 7).
Figure 7 – Prototype debug with data correlation and expansion
This emerging “visibility enhancement” technology has analogous applications in the simulation space, where it makes simulation more efficient by enabling the capture of a relatively small amount of signal data while retaining full visibility into design operation.
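The principle behind data expansion can be shown with a toy model. The sketch below assumes that the captured data is a list of per-cycle register values and that the combinational logic inferred from the RTL is available as a table of functions. The `ExpansionEngine` class, the `logic` table format and the signal names are hypothetical and serve only to illustrate on-demand recomputation; they do not reflect how a commercial engine is built, and the formal analysis that selects the minimal set of signals to capture is not modeled.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical view of the logic inferred from RTL:
# signal name -> (function computing it, names of its input signals)
Logic = Dict[str, Tuple[Callable[..., int], Tuple[str, ...]]]


class ExpansionEngine:
    """Recompute uncaptured combinational signals from captured registers."""

    def __init__(self, logic: Logic, captured: List[Dict[str, int]]):
        self.logic = logic
        self.captured = captured  # one dict of register values per cycle
        self._cache: Dict[Tuple[str, int], int] = {}

    def value(self, signal: str, cycle: int) -> int:
        registers = self.captured[cycle]
        if signal in registers:
            return registers[signal]  # captured directly from the prototype
        key = (signal, cycle)
        if key not in self._cache:
            # Computed only when asked for, then remembered.
            function, inputs = self.logic[signal]
            operands = [self.value(name, cycle) for name in inputs]
            self._cache[key] = function(*operands)
        return self._cache[key]


# Example: only the registers a, b and sel were captured; the adder output
# and the multiplexer output were optimized away and are regenerated.
logic: Logic = {
    "sum": (lambda a, b: (a + b) & 0xFF, ("a", "b")),
    "mux_out": (lambda sel, s, a: s if sel else a, ("sel", "sum", "a")),
}
captured = [
    {"a": 3, "b": 4, "sel": 1},
    {"a": 250, "b": 10, "sel": 0},
]
engine = ExpansionEngine(logic, captured)
```

Asking for `engine.value("mux_out", 0)` causes `sum` to be computed first and then the multiplexer output; nothing is calculated for signals or cycles the user never inspects, which is what keeps the amount of captured data small.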
As discussed earlier, the complexity of a typical design in this class is such that simple signal-level visualization in conjunction with the RTL code is not sufficient to support efficient analysis and debug of design operation. The software designers and verification specialists using prototypes can also derive a great deal of value from a further leap to the transaction level. With the support of these emerging visibility enhancement technologies, the same transaction extraction capabilities described previously can be applied to the signal data after correlation and expansion to derive transaction-level abstraction information for use in the prototyping environment.
Conclusion
Effective debug is all about understanding design operation. Embedded platforms drive the debug process to a new level of complexity, given greater team diversification, significant functional complexity, and expanding methodologies. Fortunately, these two application case studies show that design understanding can be greatly enhanced through the abstraction and translation of data into a form that is recognizable by the appropriate engineering discipline and that can be matched against the original design.
About the Presenters
George Bakewell, Product Marketing Director, Novas Software, Inc.
Charles Janac, President and CEO, Arteris