Optimization of AI Performance of SoCs for AD/ADAS
In recent years, advances in artificial intelligence (AI) based on deep learning have made AI directly useful in a growing number of everyday situations, such as more accurate automatic translation and recommendations that match consumers' preferences. Automated driving (AD) and advanced driver assistance systems (ADAS) in automobiles are another example.
Because recent AI models, typified by deep neural networks (DNNs), require large-scale parallel computation, GPUs capable of general-purpose parallel operations are often used for development on PCs. SoCs for AD and ADAS, on the other hand, are increasingly equipped with dedicated circuits (hereinafter referred to as "accelerators") that perform DNN processing with low power consumption and high performance. However, it is generally not easy to confirm at an early stage of SoC development whether the on-chip accelerator can deliver sufficient performance for the DNN one wishes to use. Two indicators are commonly used for performance comparisons: the TOPS (Tera Operations Per Second) value, which represents the maximum arithmetic performance of the accelerator design, and the TOPS/W value, which is obtained by dividing the TOPS value by the power consumed during operation. But because accelerators are designed to perform specific processing at high speed (*1), a sufficient TOPS value does not guarantee sufficient delivered performance: the DNN may contain operations that the accelerator cannot process efficiently, or the data transfer bandwidth may be insufficient. In addition, an increase in accelerator power may push the power consumption of the overall SoC beyond the acceptable range.
(*1) Dedicated design: Although a general-purpose GPU can be used as an accelerator, hardware designed around specific processing can achieve higher performance with a smaller circuit size and lower power consumption. For example, the accelerators in Renesas' automotive SoCs R-Car V3H, R-Car V3M, and R-Car V4H have a structure suited to convolutional neural networks (CNNs), the class of DNNs that uses convolution operations for feature extraction.
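The gap between a peak TOPS figure and delivered performance can be illustrated with a simple roofline-style estimate. The following is a minimal sketch, not the method used in the article: the function names, the per-layer operation and byte counts, and the accelerator figures (10 TOPS peak, 50 GB/s bandwidth, 5 W) are all hypothetical values chosen for illustration. Each layer is assumed to take the longer of its compute time and its data transfer time.

```python
# Roofline-style sketch: why peak TOPS alone can overstate delivered performance.
# All numbers below are hypothetical and do not describe any real SoC.

def tops_per_watt(peak_tops, power_w):
    """TOPS/W: peak arithmetic performance divided by power during operation."""
    return peak_tops / power_w


def layer_time_s(ops, bytes_moved, peak_tops, bandwidth_gb_s, efficiency=1.0):
    """Estimated time for one layer.

    The layer is limited either by arithmetic (ops / usable peak) or by data
    transfer (bytes / bandwidth), whichever takes longer. `efficiency` (0..1]
    models operations the accelerator cannot execute at full rate.
    """
    compute_time = ops / (peak_tops * 1e12 * efficiency)
    transfer_time = bytes_moved / (bandwidth_gb_s * 1e9)
    return max(compute_time, transfer_time)


def effective_tops(layers, peak_tops, bandwidth_gb_s):
    """Delivered TOPS over a whole network: total ops / total estimated time."""
    total_ops = sum(layer["ops"] for layer in layers)
    total_time = sum(
        layer_time_s(
            layer["ops"],
            layer["bytes"],
            peak_tops,
            bandwidth_gb_s,
            layer.get("efficiency", 1.0),
        )
        for layer in layers
    )
    return total_ops / total_time / 1e12


if __name__ == "__main__":
    PEAK_TOPS = 10.0        # hypothetical peak arithmetic performance
    BANDWIDTH_GB_S = 50.0   # hypothetical memory bandwidth
    POWER_W = 5.0           # hypothetical power during operation

    # Hypothetical two-layer network: a compute-heavy convolution followed by
    # an element-wise layer that moves a lot of data for very few operations.
    layers = [
        {"name": "conv", "ops": 2e9, "bytes": 4e6},
        {"name": "elementwise", "ops": 1e7, "bytes": 2e7},
    ]

    print("peak TOPS/W:", tops_per_watt(PEAK_TOPS, POWER_W))
    for layer in layers:
        t = layer_time_s(layer["ops"], layer["bytes"], PEAK_TOPS, BANDWIDTH_GB_S)
        print(layer["name"], "estimated time [s]:", t)
    print("effective TOPS:", effective_tops(layers, PEAK_TOPS, BANDWIDTH_GB_S))
```

In this made-up case the element-wise layer is bound by data transfer rather than arithmetic, so the delivered throughput over the whole network comes out well below the nominal peak even though the convolution itself runs at full rate. A real evaluation would need measured or simulated per-layer figures for the target accelerator.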
As SoC development progresses, design changes made in response to insufficient performance or excessive power consumption generally become more difficult, and their impact on the SoC development schedule and cost grows. For this reason, when developing SoCs for automotive AI devices, it is very important to confirm at an early stage whether the planned accelerator can deliver sufficient performance for the DNN that the customer's product needs to run, and whether its power consumption is within an acceptable range.