BIST Verification at SoC level
By Abhinav Gaur, Amit Bathla, Gaurav Jain (NXP Semiconductors)
Introduction
BIST (built-in self-test) is a feature of integrated circuits that allows a chip to test its own operation without any external hardware. It is a must-have feature in safety-critical SoCs.
It mainly consists of MBIST (memory built-in self-test, for testing memories) and LBIST (logic built-in self-test, for testing logic).
Usually, SoCs offer three flavors of self-test:
- Startup (offline) self-test runs when the chip boots up, before the user application code starts.
- Online (Run-time) Self-test can be triggered at any point of time by the user application software.
- Shutdown Self-test runs at the end of application code.
With the increasing complexity of modern SoCs, the number of memory blocks and LBIST partitions is growing, which in turn makes verification quite challenging. This paper highlights the key points to keep in mind when deciding the verification strategy for self-test, and the road-blocks in executing this “ideal” verification plan.
Prerequisites to start Self-test verification
Before starting, the verification engineer should understand the following aspects of the SoC's BIST architecture:
- Number of memory blocks supported by the chip.
- Number of LBIST partitions supported by the chip.
- The safety compliance supported by the chip and its requirements.
- Algorithms supported by MBIST blocks.
- Stuck-at/at-speed coverage requirement for LBIST.
- Self-test time specification for startup, run-time and shutdown self-test.
- Power numbers/limitations of the device.
- Supported self-test clock configurations.
- Any special use-case or customer requirement related to self-test.
- Sequence of events that occur when running complete selftest on the SoC.
The “Ideal” verification plan for Self-test
The following key points help ensure that self-test functions correctly:
- Cover all MBIST, LBIST partitions individually – All MBIST blocks and LBIST partitions must be run individually. This should be the first step in selftest verification since it helps in ironing out the issues at block level. It is not advisable to directly jump to the complete selftest cases in the beginning of verification.
- Execute all MBIST partitions in parallel and in sequence, and similarly execute all LBIST partitions in parallel and in sequence – This checks that the SoC is capable of running all MBIST/LBIST blocks together.
- Achieve functional coverage by running parallel and sequential MBIST/LBIST cases with the different clock source options present in the SoC, such as PLLs, external oscillators, the internal RC oscillator, etc. For a PLL, cases should be run with the various reference clock options for the PLL.
- Different types of MBIST algorithms must be covered – there can be reduced algorithms as well as full MBIST algorithms. However, more focus can be given to the MBIST algorithm that will be provided to the customer.
- Various LBIST configurations (stuck-at, at-speed) should be covered with LBIST pattern counts for targeted coverage.
- Execute full self-test cases (MBIST + LBIST) – These blocks can be run in many possible sequences, which makes simulation run-time a challenge. It is therefore important to identify the ideal MBIST/LBIST sequence during the verification cycle itself, so that more focus can be given to the customer self-test sequence.
- Running MBIST before LBIST, running LBIST before MBIST – all combinations must be covered as per what the architecture offers.
- Running startup (offline) self-test followed by shutdown self-test. Running startup self-test again after online self-test. To extend this scenario, startup, online and shutdown self-test can be run one after the other with different self-test configurations to uncover hidden issues in the design.
- Health check of the SoC after completion of self-test – check self-test result registers, critical status registers, etc. – This is important because self-test must not affect any critical functionality of the SoC, such as clocks, resets, LVD status, error status flags, etc.
- The state of the external pads of the SoC during self-test must be as expected. Assertions can be coded for this and activated for all self-test cases.
- Clock and glitch monitors should be present to monitor all clock domains during self-test execution.
- Assertions should be present for known critical signals that must not toggle during self-test (LBIST) execution.
- Negative cases should be run – for example,
- What happens if self-test is running on PLL and we get PLL loss of lock in between?
- What happens when self-test is aborted with external reset?
- What happens when an LVD event, POR event or other reset event occurs during startup/shutdown self-test?
- What happens when a memory fault occurs while running MBIST? Also, what happens when LBIST is run after this?
- Measure self-test time and power number for each use-case configuration.
- The various clock dividers of the SoC must be programmed as per the desired frequency for self-test.
- If there is provision for more than one central controller for initiating/running self-test, it should be run from each of the controllers as per the requirements of the SoC.
- Any interrupt or reset events that are generated after self-test must be checked thoroughly in various combinations.
- At RTL level, random forcing should be done on LBIST partition outputs to catch issues caused by missing safe stating of critical signals during LBIST. This mimics the behavior of LBIST scan flops at netlist level, so that most of the issues are caught beforehand.
- All these cases mentioned above must be covered for startup, runtime and shutdown selftest.
- Any other requirement of the SoC during selftest execution needs to be verified.
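The combinations listed above multiply quickly. As a rough illustration, the Python sketch below enumerates a test matrix from four of the dimensions named in the list (self-test phase, block type, execution mode and clock source). The clock-source names and the `SelfTestCase` record are hypothetical placeholders chosen for this example; they are not taken from any particular SoC or tool flow.

```python
from dataclasses import dataclass
from itertools import product

# All values below are illustrative assumptions; a real SoC defines its own.
PHASES = ("startup", "online", "shutdown")
BLOCK_TYPES = ("MBIST", "LBIST")
MODES = ("individual", "sequential", "parallel")
CLOCK_SOURCES = ("pll", "external_osc", "internal_rc")


@dataclass(frozen=True)
class SelfTestCase:
    """One directed self-test scenario (hypothetical record)."""
    phase: str
    block_type: str
    mode: str
    clock_source: str

    @property
    def name(self) -> str:
        # Build a readable testcase name from the four dimensions.
        return (f"{self.phase}_{self.block_type.lower()}"
                f"_{self.mode}_{self.clock_source}")


def build_test_matrix():
    """Return the full cross product of the dimensions above."""
    return [SelfTestCase(*combo)
            for combo in product(PHASES, BLOCK_TYPES, MODES, CLOCK_SOURCES)]


if __name__ == "__main__":
    matrix = build_test_matrix()
    print(f"{len(matrix)} cases")
    for case in matrix[:3]:
        print(case.name)
```

Even this reduced set of four dimensions yields 54 cases before MBIST algorithms, LBIST configurations and negative scenarios are added, which is why the test suite has to be pruned with care.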
Road-blocks in execution of the ideal self-test verification plan and the way out
- There can be a lot of possible scenarios in self-test, as described above, and covering each one of them is a challenge in itself, limited by simulation run-time and the sign-off time available. Hence, it is advisable to focus more on customer use-cases in self-test verification. The selection of the customer use-case must be based on the following factors:
- (a) Self-test time – Ideally, the configuration should target the minimum self-test time that achieves the targeted coverage.
- (b) Current consumption – Running everything (MBIST + LBIST) in parallel is probably not a good idea, since the current consumption of the SoC becomes very high during self-test and can violate the spec. On the other hand, running all blocks sequentially leads to a long self-test time, which is also undesirable. The ideal self-test configuration must be a compromise between these two factors.
- (c) What to run and what not to run in startup, run-time and shutdown self-test – This depends on the customer requirement. For example, if the customer wants only 60% coverage during startup self-test, there is no sense in running the entire self-test, since it will unnecessarily increase the self-test time. So one must fine-tune the self-test configurations as per customer requirements.
- It is important to think from the SoC-level perspective: which events can occur while self-test is running, what the customer expects from running self-test, etc. Thinking from the use-case/customer perspective helps in identifying various scenarios and corner cases that are otherwise tough to anticipate.
- Another challenge is to verify the different permutations and combinations of MBIST and LBIST blocks along with the different configurations offered by the SoC. It is important to identify a limited number of combinations at verification level; out of these combinations, one sequence can be finalized as the customer sequence based on further analysis.
- LBIST safe stating issues are tough to find (especially at the RTL stage), and require thorough testing of the design. Smarter techniques must be employed to catch such issues before-hand, instead of solely relying on directed tests.
- Running the targeted pattern count of LBIST partitions in a single testcase will take a lot of time. Hence it is advisable to break it into groups.
- Since the actual scan flops are introduced at netlist level and not RTL level, running Gate level simulations (GLS) is a must for LBIST cases. This adds to the simulation time woes. Hence, the GLS test suite must be chosen wisely so that it can be run within the targeted sign-off time window.
- A similar challenge exists for MBIST: covering all combinations of the supported algorithms and the different clock configurations. Only the targeted MBIST algorithms should be run.
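The trade-off between self-test time and current consumption described in factors (a) and (b) can be explored with a small model before any simulation is run. The sketch below is one possible approach, not a method from this paper: it packs BIST blocks into stages with a simple first-fit-decreasing heuristic, where blocks within a stage run in parallel and stages run one after another. The `BistBlock` record, the block names and all time and current figures are made-up assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BistBlock:
    """A BIST block with an assumed run time and current draw."""
    name: str
    time_us: float
    current_ma: float


# Made-up example numbers; real values come from the SoC specification.
EXAMPLE_BLOCKS = [
    BistBlock("mbist0", 100, 30),
    BistBlock("mbist1", 80, 30),
    BistBlock("lbist0", 300, 50),
    BistBlock("lbist1", 250, 50),
]


def schedule(blocks, current_limit_ma):
    """Greedily pack blocks into stages; blocks in a stage run in parallel.

    First-fit decreasing on current: a simple heuristic, not an optimal solver.
    """
    stages = []  # each stage is a list of blocks that run together
    for block in sorted(blocks, key=lambda b: b.current_ma, reverse=True):
        if block.current_ma > current_limit_ma:
            raise ValueError(f"{block.name} alone exceeds the current limit")
        for stage in stages:
            used = sum(b.current_ma for b in stage)
            if used + block.current_ma <= current_limit_ma:
                stage.append(block)
                break
        else:
            # No existing stage has enough current headroom: open a new one.
            stages.append([block])
    return stages


def total_time_us(stages):
    """Stages run back to back; each lasts as long as its slowest block."""
    return sum(max(b.time_us for b in stage) for stage in stages)


if __name__ == "__main__":
    for limit in (160, 100, 50):
        stages = schedule(EXAMPLE_BLOCKS, limit)
        groups = [[b.name for b in stage] for stage in stages]
        print(limit, "mA:", total_time_us(stages), "us", groups)
```

With these made-up numbers, a 160 mA limit allows everything to run in one 300 us stage, a 100 mA limit needs two stages (400 us), and a 50 mA limit forces a fully sequential 730 us run. A model like this only helps shortlist candidate configurations; the chosen sequence still has to be verified in simulation.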
Conclusion
Self-test is an SoC-level feature and requires a deep understanding not only of the MBIST/LBIST architecture, but also of the events that can happen while self-test is running, and hence requires knowledge of the complete SoC. The verification planning for self-test therefore requires utmost caution, so that no possible case is missed. The verification engineer must try to anticipate the issues from a customer viewpoint to ensure thorough verification and effective sign-off of this category. The test suite must also be chosen wisely, since the number of cases can be huge. Within a limited time frame for verification sign-off, instead of running everything, the verification engineer must optimize the test suite so that it is manageable from a simulation run-time perspective.