Hardware Co-Verification using VMM HAL-SCEMI On ChipIT Platform
By Dilip Sutariya, Jigar Patel
Einfochips Ltd.
ABSTRACT
With an SoC for HDTV under verification, block-level verification of its components has so far been done satisfactorily using simulation. With a full chip-level environment, however, running simulation for even a few frames takes a huge amount of time. Locating deep issues requires running multiple frames, which can run into system limitations. Hardware acceleration allows verification engineers to speed up verification and achieve early validation as well. This article discusses the VIP architecture requirements for hardware-accelerated verification using on-board generation, error checking, data checking and coverage collection, taking an AHB Synthesizable VIP as an example.
1. Introduction
In the last decade, ASIC design has moved from million-gate to multi-million-gate SoCs, and verification requirements have grown with it. Verification has moved from directed, HDL-based verification to more sophisticated SystemVerilog (HVL) driven, automated test-bench-based verification. SoCs continue to get more complex, and so do their verification environments, leading to system limitations. At the same time, available time windows are shrinking.
To address both the time window and the system limitations, one needs to think beyond the traditional simulation-based approach and adopt a composite approach that provides the flexibility of simulation and the speed of hardware. This is 'co-verification', which involves both a simulator and hardware. The approach uses accelerator tools such as ChipIT or EVE, a simulator (VCS) and a hardware-software interface such as HAL or SCEMI.
Just as many vendors provide VIP solutions that run with a simulator, a VIP solution can also be ported to emulation: a Synthesizable VIP.
2. Synthesizable VIP Architecture Requirements
Targeting a VIP for emulation brings its own set of requirements and challenges. There are two main points to discuss here: partitioning of modules and hardware-software communication requirements.
2.1 Partitioning of Modules
The main objective of emulation is to achieve higher verification speed. This implicitly calls for partitioning the design based on where time is consumed and where flexibility is needed. Modules where flexibility is required should be designed in SystemVerilog and run on the simulator. Modules that consume time and where speed is a concern should be placed on the board to exploit the hardware platform, which can run millions of cycles per second.
Any basic VIP consists of generation, configuration, driving, monitoring, score boarding, coverage and error-logging facilities. Generation and configuration are areas where SystemVerilog's flexibility is the main advantage, so they should be designed for and run on the simulator. The driver and monitor are the real time consumers and are where hardware speed provides an advantage, so they should be placed on the board.
2.2 Communication Requirement
Because the design is divided between software and hardware, we need a communication medium that allows them to exchange information. SCEMI and HAL are two such mechanisms, and both allow communication in either direction. Both methods allow an untimed software model to be connected to cycle-accurate RTL. Many vendors support both.
2.2.1 VMM HAL
VMM's hardware abstraction layer (HAL) consists of a toolkit of classes and methods to implement emulation-friendly transactors. What does a VMM HAL transactor look like? Remember that the external interface or application programming interface (API) of the transactor is the same as the pure simulation version. Only the guts of the monitor or checker change to accommodate faster throughput and message passing for emulation.
The SystemVerilog part of the transactor mostly consists of forwarding messages to the emulator using the predefined objects of VMM HAL. Those objects are called VMM hardware interface objects (vmm_hw_if). Messages are transmitted with high speed and low latency to the hardware emulator. There, they are decoded by the back end of the transactor.
The transactor's back end must be written in synthesizable RTL Verilog code. Each SystemVerilog object has a corresponding Verilog macro module. Every time a stream is created in SystemVerilog, ports are added in the Verilog code to transport the actual data. The transport and translation between the messages in software and the ports in hardware are transparent and irrelevant to the designer implementing the transactor logic. What remains to be designed is the control logic or state machine. It processes the messages present on the message ports to convert them into activity on the design ports and internal signals. Typically, these are built using internally available IP.
VMM HAL also uses SCEMI (the macro-based method) to communicate with the hardware. VMM HAL supports running the same test bench in a simulation-only environment and with the board. VMM HAL supports access to ZeBu and ChipIT emulation boards.
2.2.2 What are the Benefits of Interfacing at the Transaction-Level
What to communicate must also be chosen carefully to reach the goal of higher speed, because each communication takes many clocks (actual time). More communication means more time spent communicating and less time for the hardware to do actual work. The aim is to communicate less and let the hardware run for more clocks per communication. This calls for a transaction-based model, in which information for many transactions can be provided to the hardware in a single communication. The hardware can then interpret this 'command' and perform a number of transactions or run for a number of clocks (frames, etc.).
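The trade-off described above can be sketched as a simple cost model: total run time is the number of messages times the link latency, plus the time the emulator spends executing. The sketch below is only illustrative. The function name, the 1 ms link latency, the 300 cycles per transaction and the 20 MHz clock are assumptions chosen for the example, not measured values from this work.

```python
import math


def total_time_s(num_transactions, transactions_per_message,
                 link_latency_s, cycles_per_transaction, emulator_hz):
    """Estimate run time as communication time plus hardware execution time.

    All parameters are hypothetical modelling inputs, not ChipIT figures.
    """
    # One software-to-hardware message may carry several transactions.
    messages = math.ceil(num_transactions / transactions_per_message)
    communication = messages * link_latency_s
    execution = num_transactions * cycles_per_transaction / emulator_hz
    return communication + execution


# Assumed numbers: 10k transactions, 1 ms per message, 300 cycles each, 20 MHz.
one_request_each = total_time_s(10_000, 1, 1e-3, 300, 20e6)
single_request = total_time_s(10_000, 10_000, 1e-3, 300, 20e6)
print(f"one request per transaction: {one_request_each:.3f} s")
print(f"single batched request:      {single_request:.3f} s")
```

Under these assumed numbers the hardware execution time is identical in both cases; only the communication term changes, which is why batching many transactions into one command pays off.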
The VMM HAL specifies the interaction between the testbench and the DUT at transaction-level, the Monitors and Drivers layers acting as smart adapters between transactions and signal-level protocols. Previous co-emulation approaches had relied on synchronizing the testbench and the DUT at the signal or event level, which proved to be very inefficient as the emulator and simulator had to run serially and synchronize at every simulation event.
The transactors that move data to and from the design need to be able to keep up with such high throughput. Instead of communicating with the design at the signal level, which is all that Verilog can do, transactors use SystemVerilog's Direct Programming Interface (DPI) to exchange higher-level messages.
By raising the level of abstraction to transactions instead of signals, the emulator is now able to run at full speed, which can exceed 20 MHz, without sacrificing accuracy and still be completely under the control of a VMM based testbench. The co-emulation link can stream data between the design and the testbench at up to 800 Mbit/s, enough for even the most data-intensive applications. The conversion between transactions and cycle-accurate signals is performed in hardware by the synthesized portions of the Drivers and Monitors layers.
To reduce communication requirements further, part of the coverage collection and score boarding can also be implemented on the board.
3. AHB Synthesizable VIP
This section provides a brief overview of the AHB Synthesizable VIP architecture.
As discussed, partitioning of the design and communication between the simulator and the board are the key areas. The AHB VIP uses transaction-level communication, in which SystemVerilog-based randomization generates configuration and transaction-level information. This information is passed over the HAL interface to the synthesizable parts placed on the board. Once the information reaches the on-board information decoder, the AHB Master/Slave state machine, protocol checker, data integrity checker, bus decoder and coverage collector work in synchronization to execute the received transaction request and return status to the simulator, which logs it.
To achieve higher throughput, we added a facility in which a single request lets the hardware run for multiple clock cycles. This is achieved with on-board generation: the SystemVerilog-based transaction generator provides parameters to the on-board generator, which can then generate random (LFSR-based), sequential or fixed data and perform a number of AHB transactions.
The on-board monitor monitors the bus and logs errors for each transaction that occurs on it. The simulator can read this information with a pre-defined request at the end of a transaction. This mechanism allows higher-speed monitoring and error logging. Apart from error logging, the on-board monitor helps collect coverage. Coverage is collected and stored in Boolean registers, and at the end of simulation the simulator reads this information to generate the coverage report.
To avoid transferring all data to the simulator for data integrity checking, we use a CRC to reduce the information sent back to the simulator for each transaction. The Master and Slave performing a transaction each generate a CRC over all transferred data. At the end of a transfer only the CRC needs to be checked, rather than the complete data. This avoids heavy data transfers between the simulator and the board.
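As a rough software model of the three data modes mentioned above, the sketch below generates random, sequential or fixed data from a handful of parameters. It is not the on-board RTL. The 16-bit width, the Galois LFSR with feedback mask 0xB400 (a commonly cited maximal-length configuration), the seed and the function names are all assumptions made for illustration.

```python
def lfsr16_step(state):
    """Advance a 16-bit Galois LFSR by one step (assumed mask 0xB400)."""
    lsb = state & 1
    state >>= 1
    if lsb:
        state ^= 0xB400
    return state


def generate_data(mode, count, seed=0xACE1, start=0, fixed=0):
    """Return `count` 16-bit data words for the chosen mode.

    mode: "random" (LFSR), "sequential" (incrementing) or "fixed".
    """
    if mode == "random":
        if seed & 0xFFFF == 0:
            raise ValueError("an LFSR seeded with zero never leaves zero")
        words, state = [], seed & 0xFFFF
        for _ in range(count):
            state = lfsr16_step(state)
            words.append(state)
        return words
    if mode == "sequential":
        # Wrap around at the data width, as a hardware counter would.
        return [(start + i) & 0xFFFF for i in range(count)]
    if mode == "fixed":
        return [fixed & 0xFFFF] * count
    raise ValueError(f"unknown mode: {mode}")


print([hex(w) for w in generate_data("random", 4)])
print(generate_data("sequential", 4, start=10))
print(generate_data("fixed", 4, fixed=0x55AA))
```

Because the sequence depends only on the mode and the seed, the simulator needs to send just these few parameters rather than the data itself, and the same parameters reproduce the same data in a simulation-only run.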
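The idea of storing coverage as Boolean register bits can be modelled in a few lines. In the sketch below, each bit of a register stands for one cover point, the hardware side only ever sets bits, and the software side decodes the register once at the end. Using the eight AHB burst types (HBURST encodings from AMBA 2.0) as cover points, and the function names, are assumptions for illustration; the actual cover points of the VIP are not listed in this article.

```python
# AHB HBURST encodings 0..7, used here as eight hypothetical cover points.
HBURST_BINS = ["SINGLE", "INCR", "WRAP4", "INCR4",
               "WRAP8", "INCR8", "WRAP16", "INCR16"]


def record_hit(register, hburst):
    """Model the on-board side: set the sticky bit for an observed burst type."""
    return register | (1 << hburst)


def coverage_report(register, bins=HBURST_BINS):
    """Model the simulator side: decode the register read back from the board."""
    hit = [name for i, name in enumerate(bins) if (register >> i) & 1]
    missed = [name for i, name in enumerate(bins) if not (register >> i) & 1]
    percent = 100.0 * len(hit) / len(bins)
    return hit, missed, percent


register = 0
for observed in [0, 1, 3, 3]:  # SINGLE, INCR, INCR4, INCR4 again
    register = record_hit(register, observed)

hit, missed, percent = coverage_report(register)
print(f"register = {register:#04x}, covered {percent}%")
print("missed:", missed)
```

A single register read at the end of the run replaces one message per sampled transaction, which keeps coverage collection off the communication path.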
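A minimal sketch of this CRC-based check follows. The article does not state which CRC polynomial or width the VIP uses, so CRC-32 from Python's standard `zlib` module is assumed here, along with 32-bit data beats; the function name is also hypothetical.

```python
import zlib


def running_crc(beats, width_bytes=4):
    """Accumulate a CRC-32 over a sequence of data beats (assumed 32-bit wide)."""
    crc = 0
    for beat in beats:
        # zlib.crc32 accepts the previous value, so the CRC is built beat by beat.
        crc = zlib.crc32(beat.to_bytes(width_bytes, "little"), crc)
    return crc


# 256 beats of 4 bytes = one 1 Kbyte transfer.
sent = list(range(256))
received_ok = list(sent)
received_bad = list(sent)
received_bad[100] ^= 0x1  # single-bit corruption on the bus

master_crc = running_crc(sent)
print("clean transfer matches:    ", master_crc == running_crc(received_ok))
print("corrupted transfer matches:", master_crc == running_crc(received_bad))
print("bytes returned to simulator: 4 instead of", 4 * len(sent))
```

With these assumptions, the simulator compares two 4-byte values per 1 Kbyte transfer instead of reading back the whole payload.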
3.1 AHB Synthesizable VIP Architecture
The figure below gives a top-level view of the AHB Synthesizable VIP. The VIP has been partitioned into three parts: AHB VIP (Simulation Part), VMM HAL Infrastructure and AHB Synthesizable BFM on the ChipIT board.
Figure 1 – Architecture for AHB Design
3.1.1 AHB VIP (Simulation Part)
The AHB VIP is a SystemVerilog-based, VMM 1.2 compliant VIP running on VCS 2009.06.
- Configuration: This is the AHB VIP configuration. It provides the information required to build the VIP environment: bus size, endianness, number of masters, number of slaves, whether the checker is enabled, whether the score board is enabled, etc.
- Stimulus Generator: This can be an atomic generator or a scenario generator. It generates AHB transactions using the factory or an extended class provided to it. The generator produces a constrained number of transactions and passes them to the Message Formatter.
- Message Formatter: This transactor handles communication with VMM HAL. It processes each transaction received from the Stimulus Generator, forms a frame from it and passes the frame to VMM HAL. It also acts as the receiver for frames coming from VMM HAL, forming a transaction from each received frame and passing it to the Error Logger/Monitor.
- Error Logger/Monitor: This component processes the responses received from the on-board monitors and logs messages. It receives transactions from the Message Formatter and analyzes them for any error reported by the on-board monitor. It logs all status bits.
- Coverage: This is the functional coverage for the VIP. Software collects coverage on both generation and monitoring, giving comprehensive coverage data for judging verification completeness. Monitored coverage is provided by the on-board monitor.
- Score Board: The score board forms the data integrity checker. Data sent to the DUT and data received from the DUT are collected from the Message Formatter. Score boarding is implemented using a call-back mechanism and is configurable to check either a checksum (CRC) or the complete data.
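To make the Message Formatter's role concrete, the sketch below packs a transaction into a byte frame and recovers it again. The frame layout is entirely hypothetical: the article does not define the frame format, so the header fields (address, write flag, HBURST, HSIZE, beat count) and their widths are assumptions chosen only to show the transaction-to-frame and frame-to-transaction steps.

```python
import struct

# Hypothetical header: 32-bit address, write flag, HBURST, HSIZE, 16-bit beat count.
HEADER = struct.Struct("<IBBBH")


def pack_frame(addr, write, hburst, hsize, data):
    """Form a frame from a transaction: fixed header followed by 32-bit data words."""
    header = HEADER.pack(addr, int(write), hburst, hsize, len(data))
    payload = struct.pack(f"<{len(data)}I", *data)
    return header + payload


def unpack_frame(frame):
    """Form a transaction (as a dict) from a received frame."""
    addr, write, hburst, hsize, beats = HEADER.unpack_from(frame, 0)
    data = list(struct.unpack_from(f"<{beats}I", frame, HEADER.size))
    return {"addr": addr, "write": bool(write), "hburst": hburst,
            "hsize": hsize, "data": data}


frame = pack_frame(0x4000_0000, True, 3, 2, [0x11, 0x22, 0x33, 0x44])
transaction = unpack_frame(frame)
print(len(frame), "bytes:", frame.hex())
print(transaction)
```

Whatever the real layout is, the same two functions exist in mirror form on the board, where the frame is decoded into AHB signals and responses are encapsulated back into frames.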
3.1.2 VMM HAL Infrastructure
The VMM HAL interface connects the VIP running on VCS to the actual host port, and a portion of it resides on ChipIT. The VMM HAL infrastructure consists of the following components: VMM HAL, PC I/F Driver, Physical Interface and the VMM HAL connection to ChipIT.
- VMM HAL: VMM HAL forms the interface between the VIP and the drivers. The SV test bench calls VMM HAL methods to initiate data transfer to ChipIT. The VIP calls different VMM HAL methods to send packets across the hardware interface to ChipIT and to receive packets from ChipIT.
- PC I/F Driver: The PC driver could be a USB, PCI, serial port or parallel port driver. The appropriate driver must be selected according to the hardware interface supported by ChipIT. The VMM HAL API calls this driver's APIs to transfer packets to and from ChipIT.
- Physical Interface: The physical H/W interface is the actual cable connection between the PC and ChipIT. This could be USB, RS-232, parallel port, etc.
- VMM HAL connection to ChipIT: The VMM HAL H/W Packet TX/RX Interface is the physical interface on ChipIT through which packets are received from or sent to the PC. Received packets are made available to the FPGA-based Message Ei_hal_inport, and packets provided by Message Ei_hal_outport are driven back to the PC.
3.1.3 AHB Synthesizable BFM (On ChipIT Board)
The AHB Synthesizable BFM is placed on ChipIT. There are two different BFMs, one for the AHB Master and one for the AHB Slave. Five components are required on ChipIT: Message Ei_hal_inport, Message Ei_hal_outport, Clock Control, AHB Master or Slave BFM and Protocol Error Monitor.
- Message Ei_hal_inport: This component receives the frame sent by the VIP, decodes it into AHB signals and sends the data to the AHB transactor. The block uses the dual Ready handshake protocol to receive data. The VMM HAL macro connected to this block has two handshake signals, which implement the dual Ready protocol, and a data bus that presents the message itself.
- Message Ei_hal_outport: This component receives data from the AHB BFM, encapsulates it into a frame and sends the frame to the VIP running on the host. The block uses the dual Ready handshake protocol to transmit data. The VMM HAL macro connected to this block has two handshake signals, which implement the dual Ready protocol, and a data bus that presents the message itself.
- Clock Control: This block performs clock gating. Based on requests from the VIP or the BFM, it controls the clock provided to the AHB bus and the DUT. The block is parameterized to specify a controlled clock of a given frequency, phase shift and duty cycle. It also supplies a controlled reset whose duration is a specified number of cycles of the cclock.
- AHB Synthesizable BFM: This can be a Master or a Slave BFM. The BFM receives a command/transaction from the Ei_hal_inport block and performs the requested data transaction with the DUT. Data received from the AHB bus is sent to the Ei_hal_outport block. The block supports data transfers up to 256 bits wide. It also contains an error injector, which is used to generate error messages for protocol violations.
- Protocol Error Monitoring: This block checks for any protocol violation on the bus and logs the information. It also forms the coverage collection module, recording the different cover points and their combinations as register bits. When requested by software, it provides violation and coverage information back to software.
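The dual Ready handshake used by the message ports can be illustrated with a small cycle-by-cycle model. The sketch assumes the usual ready/ready convention, in which a message moves on a clock cycle where both the sending and the receiving side assert ready; the signal names `tx_rdy` and `rx_rdy` and the function name are used here for illustration and should be checked against the VMM HAL documentation.

```python
def dual_ready_transfer(messages, tx_rdy, rx_rdy):
    """Return (cycle, message) pairs for each cycle in which a transfer occurs.

    messages: words waiting on the transmit side, in order.
    tx_rdy / rx_rdy: per-cycle ready flags of transmitter and receiver.
    """
    pending = list(messages)
    transferred = []
    for cycle, (tx, rx) in enumerate(zip(tx_rdy, rx_rdy)):
        # A message is consumed only when both sides are ready in the same cycle.
        if pending and tx and rx:
            transferred.append((cycle, pending.pop(0)))
    return transferred


# Cycle:      0  1  2  3
tx_ready = [1, 1, 0, 1]
rx_ready = [0, 1, 1, 1]
result = dual_ready_transfer([0xA, 0xB], tx_ready, rx_ready)
print(result)
```

In this example the first message waits in cycle 0 because the receiver is not ready, and the second waits in cycle 2 because the transmitter is not ready, so neither side can overrun the other.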
3.2 Verification using VIP
The figure below shows the verification environment using the Synthesizable VIP.
Figure 2 – Verification Environment for Multi-Master/Slave Setup
The setup above shows how multi-master/slave verification can be achieved using multiple instances of the VIP and the required interconnection with the DUT. It also shows how the DUT is interfaced with the simulator using a custom VMM HAL interface, which is straightforward to create.
The user needs to create a test bench in which multiple instances of the VIP BFM model are instantiated along with the DUT and interfaced with the simulation model. The user can run multiple simulations in a short span of time on the emulator (ChipIT) and locate issues. Once an error or violation has been located on the board, the user can run the same test in the simulation environment, which runs with the same configuration and generates the same data, so the issue can be debugged easily.
4. Results
The following table compares the results achieved with simulation-only and emulation-based verification. The same tests were run in both the simulation and emulation environments. Transaction information is given in the form 10k x 1kbytes, which means 10k transactions of 1k bytes each (AHB supports transfers of at most 1k bytes). On ChipIT we performed two types of test: one sent 10k separate transaction requests to perform 10k transactions, and the other sent a single request that performed 10k transactions of 1k bytes each.
Table 1 Simulation-Emulation Result
5. Conclusion
Using the above approach, verification productivity is increased. It is possible to reduce verification time and to find some corner cases in the design through hardware implementation. Analysis of the results also indicates that letting the hardware do more work and reducing communication between software and hardware yields better results, suggesting that data-heavy verification can be handled more efficiently using hardware acceleration.
6. References
[1] ChipIT Reference documents.
[2] VMM User Guide
[3] VMM HAL User Guide
[4] AMBA Standard Version 2.0
[5] SceMI Protocol Specification Version 2.0