Refactoring Hardware Algorithms to Functional Timed SystemC Models
Praveen Kondugari, Intel Mobile Communications
Bangalore, India
Abstract:
SystemC modelling is an emerging technology for SoC verification; the resulting simulation environments of the SoC are termed virtual platforms. SystemC is a high-level language, and platforms built with it run at higher simulation speeds than HDL models or emulators. To create a virtual platform, each of the peripherals, cores, buses, etc. in the SoC is modelled in SystemC and the models are then integrated. Many of these peripherals are hardware implementations of algorithms such as up-samplers, down-samplers, data descrambling, crypto encoding and decoding, etc. This paper presents a systematic approach to converting a hardware algorithm into a functional timed SystemC model, along with simulation speed improvement techniques that can be incorporated.
1. INTRODUCTION
Simulation methodologies can be broadly classified as Virtual Platforms, FPGA (Field-Programmable Gate Array), Emulators, RTL (Register Transfer Level) and Gate Level. Simulation is fastest in Virtual Platforms, as they are modelled at transaction level in high-level languages, and slowest at Gate Level. Most verification activities are carried out at levels from Virtual Platforms down to RTL. RTL verification is more time-consuming than verification on Virtual Platforms and Emulators. Compared to an Emulator, a Virtual Platform takes less time to bring up and its simulation speed is 10x faster.
Virtual Platforms are simulation environments of hardware created using SystemC. Many vendors provide SystemC tooling, such as OSCI SystemC [1], SCML (SystemC Modeling Library) [2], VAST [3], etc. Virtual Platforms are developed at different abstraction levels: loosely timed, approximately timed and cycle accurate. In loosely timed mode, the modelling focuses on the functionality of the cores and peripherals and does not carry actual timing information. This mode is preferred when the requirement is to develop and verify driver, application and verification software. Loosely timed mode is faster than the other two modes. Approximately timed mode is preferred for exploring alternatives to the platform architecture. This mode can also be used to partly carry out performance measurement. Cycle accurate mode is used to jointly verify the hardware and software in the system context. This mode is used to carry out performance measurement of architectures.
A typical Virtual Platform comprises processor core simulators connected to various models of buses, routers, adapters, memory controllers, data encryption and decryption models, encoders and decoders, scramblers and descramblers, samplers and other peripheral models. Each of these models is developed in SystemC. Most of these hardware peripherals are implementations of a respective algorithm, for example up-samplers, down-samplers, data descrambling, crypto encoding, crypto decoding, etc. This paper presents a systematic approach to converting a hardware algorithm available in a higher-level language like C or C++ into a functional timed SystemC model.
2. PROPOSED APPROACH
The following steps are carried out in transforming a hardware C/C++ algorithm into a functional timed SystemC model.
- Code Breakup
- Initialization
- Input Triggering
- Algorithm
- Output
- Convert to SystemC Model
- Peripheral Interface
- Call backs & Triggers
- Time/Delay Plug-in
- Interrupt Plug-in
- Configuration Parameters
- Logging and Tracing
2.1 CODE BREAKUP
Let us consider a typical algorithm structure as shown below
{
// variable initialization
...
// algorithm execution
...
// return output
}
As a first step, the available algorithm has to be separated into two phases: an initialization phase and an algorithm execution phase. Create a C++ class with three interface functions: one to set the required arguments, one to run the initialization code and one to run the algorithm on the input. Below is a sample snippet of the code.
class xyz_algo
{
public:
    void set_control_arguments(...)
    {
        // sets algorithm control variables
    }
    void initialization()
    {
        // executes the initialization code
    }
    output run_algorithm(input)
    {
        // executes the algorithm on input
        // returns output
    }
private:
    // control variables
    // local variables
};
The set_control_arguments() function is used to set the control parameters of the algorithm, such as key, data block size, mode, etc. The initialization() function is used to run the initialization part of the algorithm, i.e. initializing the required variables. The run_algorithm() function is used to run the algorithm either once on a block of data or repeatedly on single data items, and to provide the output.
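As an illustration of this three-function split, the following is a minimal sketch in plain C++. The algorithm itself (an additive scrambler driven by a 16-bit LFSR), the seed parameter and the member names are assumptions made only for this example; they are not taken from any particular peripheral.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical example algorithm: an additive scrambler that XORs each
// input byte with the low byte of a 16-bit Fibonacci LFSR.
class xyz_algo
{
public:
    // Sets the algorithm control variables (here only the LFSR seed).
    void set_control_arguments(std::uint16_t seed)
    {
        seed_ = seed;
    }

    // Executes the initialization code: reloads the working state.
    void initialization()
    {
        // An all-zero LFSR state never changes, so fall back to 1.
        lfsr_ = (seed_ != 0) ? seed_ : 1;
    }

    // Executes the algorithm on a block of input and returns the output.
    std::vector<std::uint8_t> run_algorithm(const std::vector<std::uint8_t>& input)
    {
        std::vector<std::uint8_t> output;
        output.reserve(input.size());
        for (std::uint8_t byte : input) {
            output.push_back(static_cast<std::uint8_t>(byte ^ (lfsr_ & 0xFF)));
            step();
        }
        return output;
    }

private:
    // Advances the LFSR by one step (taps at bits 0, 2, 3 and 5).
    void step()
    {
        const std::uint16_t bit =
            ((lfsr_ >> 0) ^ (lfsr_ >> 2) ^ (lfsr_ >> 3) ^ (lfsr_ >> 5)) & 1u;
        lfsr_ = static_cast<std::uint16_t>((lfsr_ >> 1) | (bit << 15));
    }

    std::uint16_t seed_ = 1; // control variable
    std::uint16_t lfsr_ = 1; // local working state
};
```

Because the scrambler is additive, calling initialization() again and running the scrambled block through run_algorithm() restores the original data, which gives a convenient self-check for the class before it is wrapped in a SystemC module.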
If the algorithm is expected to execute on a block of data and then provide the output, the run_algorithm() function can use the C/C++ algorithm as is. But if the algorithm is expected to update a status or raise an interrupt on some condition in the middle of execution, the algorithm needs to be tailored to operate on single data items instead of on a block of data. For example, consider a color up-sampling algorithm that operates on a frame of data. If the model is configured to raise an interrupt after operating on the 5th line of the frame, the run_algorithm() function should contain the algorithm for a single line of data instead of for the complete frame. This tailoring of the algorithm depends on the features supported by the peripheral. If the peripheral does not support any intermediate status or interrupt and provides a status or interrupt only at the end of the operation on the block of data, the algorithm can be re-used as is.
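The line-based tailoring described above could look roughly like the sketch below. The class name line_upsampler, the nearest-neighbour doubling and the irq_line control argument are hypothetical choices for illustration; a real color up-sampler would use the filter defined by the peripheral.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Line = std::vector<std::uint8_t>;

// Hypothetical line-based up-sampler: doubles the width of one line by
// repeating each sample, and counts the lines processed so far.
class line_upsampler
{
public:
    // Control argument: the 1-based line number after which the model
    // should flag an intermediate interrupt (0 disables the feature).
    void set_control_arguments(std::size_t irq_line) { irq_line_ = irq_line; }

    // Resets the per-frame state.
    void initialization()
    {
        lines_done_ = 0;
        irq_pending_ = false;
    }

    // Operates on a single line rather than on the complete frame, so
    // the caller regains control after every line.
    Line run_algorithm(const Line& input)
    {
        Line output;
        output.reserve(input.size() * 2);
        for (std::uint8_t sample : input) {
            output.push_back(sample);
            output.push_back(sample);
        }
        ++lines_done_;
        if (irq_line_ != 0 && lines_done_ == irq_line_) {
            irq_pending_ = true; // the model would raise its interrupt here
        }
        return output;
    }

    bool irq_pending() const { return irq_pending_; }
    std::size_t lines_done() const { return lines_done_; }

private:
    std::size_t irq_line_ = 0;   // control variable
    std::size_t lines_done_ = 0; // local variable
    bool irq_pending_ = false;   // local variable
};
```

The surrounding model then calls run_algorithm() once per line of the frame and checks irq_pending() after each call, which is what makes an interrupt after the 5th line possible.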
2.2 CONVERT TO SYSTEMC MODEL
To convert the created class into a SystemC model, the following steps are typically involved:
1. Include the header file systemc.h, which makes the constructs of the SystemC library available.
2. Derive the class publicly from sc_module:
class xyz_algo: public sc_module
3. Add SC_HAS_PROCESS(xyz_algo) as a public member.
4. Add a constructor with sc_module_name as an argument. Additional arguments can be provided if required.
xyz_algo(sc_module_name name);
For more details on these, the SystemC LRM [1] is a good reference.
2.3 PERIPHERAL INTERFACE
Refer to the hardware specification of the peripheral for interface details. Add the required clock, interrupt, input and output ports. Add TLM [4] initiator and target sockets wherever required. SystemC libraries such as SCML provide simplified constructs for modelling, for example for binding a TLM port to the registers of the peripheral and for attaching call-backs to register reads and writes, port reads and writes, etc. Model the registers as per the hardware specification.
2.4 CALL BACKS & TRIGGERS
Attach call-backs to the registers wherever required. Typically, call-backs are attached for:
- Reset register: call-back to reset the module
- Configuration registers: call-back to configure the control parameters of the algorithm
  - set_control_arguments() is called in this call-back
  - initialization() is called in this call-back
- Input data register: call-back to trigger the algorithm
  - run_algorithm() is called in this call-back
  - If the input comes from a separate port, this call-back has to be attached to writes on that port
- Output data register: call-back to provide the output data
  - Output data from run_algorithm() is sent from this call-back
  - If the output goes out on a separate port, a SystemC thread (SC_THREAD [1]) that is kept always active posts a transaction on this port when data from run_algorithm() is available
- Clear interrupt register: call-back to clear the interrupt port and the related status
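The dispatch idea behind these call-backs can be sketched in plain C++ as below. This is not the SCML API; register_bank, toy_peripheral, the register offsets and the XOR "algorithm" are all hypothetical and serve only to show how a register write ends up invoking the algorithm functions.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

// Hypothetical register bank: maps a register offset to a write call-back.
class register_bank
{
public:
    using write_callback = std::function<void(std::uint32_t)>;

    void attach_write_callback(std::uint32_t offset, write_callback cb)
    {
        callbacks_[offset] = std::move(cb);
    }

    // Called by the bus interface for every register write.
    // Returns false when no call-back is attached to the offset.
    bool write(std::uint32_t offset, std::uint32_t value)
    {
        auto it = callbacks_.find(offset);
        if (it == callbacks_.end()) {
            return false;
        }
        it->second(value);
        return true;
    }

private:
    std::map<std::uint32_t, write_callback> callbacks_;
};

// Hypothetical register offsets, chosen only for this sketch.
constexpr std::uint32_t REG_CONFIG = 0x00;
constexpr std::uint32_t REG_DATA_IN = 0x04;

// Toy peripheral: the "algorithm" XORs the input with a configured key.
struct toy_peripheral
{
    register_bank regs;
    std::uint32_t key = 0;
    std::uint32_t data_out = 0;

    toy_peripheral()
    {
        // Configuration register: sets the control parameter.
        regs.attach_write_callback(REG_CONFIG,
                                   [this](std::uint32_t v) { key = v; });
        // Input data register: triggers the algorithm.
        regs.attach_write_callback(REG_DATA_IN,
                                   [this](std::uint32_t v) { data_out = v ^ key; });
    }

    // The call-backs capture 'this', so copying would leave them dangling.
    toy_peripheral(const toy_peripheral&) = delete;
    toy_peripheral& operator=(const toy_peripheral&) = delete;
};
```

In a real model the two lambdas would call set_control_arguments()/initialization() and run_algorithm() respectively, and the register bank would be supplied by the modelling library in use.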
2.5 TIME/DELAY PLUG-IN
The delay for executing the algorithm can be consumed in the slave before it raises the algorithm-complete interrupt and updates the status. This can be achieved with a delayed event notification [1] issued on completion of the algorithm.
Example:
sc_event ev_algo_complete;
// Delayed Notification
ev_algo_complete.notify(100, SC_NS);
In this case the notification occurs 100 ns after the delayed notify (notify(100, SC_NS)) is called.
2.6 INTERRUPT PLUG-IN
A method (SC_METHOD [1]) can be made sensitive to the event notified in the above section. This method is triggered after the specified delay. In this method the status can be updated and the interrupt can be set high, depending on whether the interrupt is enabled or disabled.
Example:
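sc_out<bool> interrupt; // output port, since the model drives it
void algo_notify_status();
// Registration in the constructor
SC_METHOD(algo_notify_status);
sensitive << ev_algo_complete;
dont_initialize(); // do not run the method at simulation start
// Implementation
void xyz_algo::algo_notify_status()
{
    interrupt.write(true);
    status_reg = 1;
}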
In this example the algo_notify_status function (SC_METHOD) is sensitive to the event ev_algo_complete. If this event is notified with a delay at the end of the algorithm, as shown in the previous section, the function is triggered after 100 ns. In this function the model raises the interrupt to indicate that the algorithm is complete.
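To make the interplay of the delay and the interrupt concrete without depending on the SystemC library, the sketch below imitates a delayed notification with a tiny event queue. mini_scheduler and toy_status are hypothetical names; the code only mimics the observable behaviour (the handler runs 100 ns later and honours the interrupt enable) and does not reflect how the SystemC kernel is implemented.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical miniature event scheduler that imitates the idea of a
// delayed notification on a simulated time axis in nanoseconds.
class mini_scheduler
{
public:
    using action = std::function<void()>;

    // Schedules 'fn' to run 'delay_ns' after the current simulated time.
    void notify_after(std::uint64_t delay_ns, action fn)
    {
        queue_.push(entry{now_ns_ + delay_ns, next_id_++, std::move(fn)});
    }

    // Runs all events with a time stamp up to and including 'until_ns'.
    void run_until(std::uint64_t until_ns)
    {
        while (!queue_.empty() && queue_.top().time_ns <= until_ns) {
            entry e = queue_.top();
            queue_.pop();
            now_ns_ = e.time_ns;
            e.fn();
        }
        if (until_ns > now_ns_) {
            now_ns_ = until_ns;
        }
    }

    std::uint64_t now_ns() const { return now_ns_; }

private:
    struct entry
    {
        std::uint64_t time_ns;
        std::uint64_t id; // keeps events with equal time in insertion order
        action fn;
    };
    struct later
    {
        bool operator()(const entry& a, const entry& b) const
        {
            return (a.time_ns != b.time_ns) ? a.time_ns > b.time_ns : a.id > b.id;
        }
    };

    std::priority_queue<entry, std::vector<entry>, later> queue_;
    std::uint64_t now_ns_ = 0;
    std::uint64_t next_id_ = 0;
};

// Toy peripheral state touched by the completion handler.
struct toy_status
{
    bool irq_enable = false;
    bool interrupt = false;
    std::uint32_t status_reg = 0;
    std::uint64_t completed_at_ns = 0;
};
```

A possible usage: schedule a handler with notify_after(100, ...) at the end of the algorithm; the handler sets status_reg and raises interrupt only if irq_enable is set. Running the scheduler to 99 ns leaves both untouched, and running it to 100 ns triggers the handler.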
2.7 CONFIGURATION PARAMETERS
Certain algorithm parameters, like the “key polynomial” in crypto encryption, are not provided through registers and are hardware specific (hidden within the hardware). Such parameters should be taken as constructor arguments or run-time configurable parameters. Below is a snippet that takes the key parameter as a constructor argument:
// configuration parameter
xyz_algo(sc_module_name name, uint32 key);
2.8 LOGGING & TRACING
The model can log algorithm details such as configuration parameters, input data, output data, and changes to the status register and interrupt. This enables users to verify the model and the algorithm while running or debugging the simulation.
3. SIMULATION SPEED CONSIDERATIONS
The following need to be considered while developing the SystemC model so that the simulation speed is not affected.
- Reduce the number of context switches across different processes, i.e. reduce the number of SC_THREADs in the model.
- If polling is required, avoid polling on every clock, as this drops the simulation speed of the entire platform. Instead use a quantum and poll once every quantum. It is advisable to make this quantum a configurable parameter.
- Create and use look-up tables where applicable instead of calculating values from recursive functions.
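The look-up table point can be illustrated with a small sketch. A bit-by-bit CRC-8 update (polynomial 0x07) stands in here for the costly per-value computation; the choice of CRC-8 and the function names are assumptions for this example only. The table is built once from the slow routine and then replaces it in the hot path.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bit-by-bit CRC-8 update (polynomial x^8 + x^2 + x + 1, i.e. 0x07).
inline std::uint8_t crc8_bitwise(std::uint8_t crc, std::uint8_t data)
{
    crc = static_cast<std::uint8_t>(crc ^ data);
    for (int bit = 0; bit < 8; ++bit) {
        crc = (crc & 0x80) ? static_cast<std::uint8_t>((crc << 1) ^ 0x07)
                           : static_cast<std::uint8_t>(crc << 1);
    }
    return crc;
}

// Builds the 256-entry table once, using the bitwise routine.
inline std::array<std::uint8_t, 256> make_crc8_table()
{
    std::array<std::uint8_t, 256> table{};
    for (std::size_t i = 0; i < table.size(); ++i) {
        table[i] = crc8_bitwise(0, static_cast<std::uint8_t>(i));
    }
    return table;
}

// Table-driven update: one XOR and one look-up per byte.
inline std::uint8_t crc8_table(std::uint8_t crc, std::uint8_t data)
{
    static const std::array<std::uint8_t, 256> table = make_crc8_table();
    return table[static_cast<std::uint8_t>(crc ^ data)];
}
```

Both functions return the same value for every combination of running CRC and data byte, so the table version can be swapped in without changing the functional behaviour of the model.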
4. CONCLUSIONS
This paper presents a typical step-by-step approach to refactoring an existing hardware C/C++ algorithm into a functional timed SystemC model. Considerations for developing the SystemC model without affecting the simulation speed are listed. The paper also describes virtual platforms, the advantages of SystemC models, and the different abstraction levels for SystemC models.
5. REFERENCES
[1] IEEE Standard SystemC® Language Reference Manual, IEEE Std 1666™-2005
[2] SCML 2.0, SystemC Modeling Library, http://www.synopsys.com/cgi-bin/slcw/kits/reg.cgi
[3] http://www.synopsys.com/Community/Interoperability/SystemLevelCatalyst/Pages/MVaST.aspx
[4] OSCI TLM-2.0 Language Reference Manual, www.systemc.org
[5] Grotker, Thorsten and Liao, Stan and Martin, Grant and Swan, Stuart. System Design with SystemC. Kluwer Academic Publishers, 2002, pp 87-130.
[6] Ghenassia, F. Transaction Level Modeling with SystemC. Springer, Dordrecht, Netherlands, 2005.
[7] C. Genz and R. Drechsler, “System exploration of SystemC designs” in IEEE Annual Symposium on VLSI, 2006.
[8] Amit Garg, “Fast Virtual Prototyping for early software design and verification”, Design & Reuse, 2008
Keywords— SystemC Modelling, SoC, Code Refactoring, Hardware Algorithms, Virtual Platforms