Optimization of AI Performance of SoCs for AD/ADAS
In recent years, advances in artificial intelligence (AI) technology based on deep learning have made AI directly useful in a growing number of everyday situations, such as improving the accuracy of automatic translation and making recommendations that match consumers' preferences. One such application is automated driving (AD) and advanced driver assistance systems (ADAS) in automobiles.
Because recent AI models, typified by deep neural networks (DNNs), require large-scale parallel computation, GPUs capable of general-purpose parallel computation are often used for development on PCs. SoCs for AD and ADAS, on the other hand, are increasingly equipped with dedicated circuits (hereinafter referred to as "accelerators") that perform DNN processing with low power consumption and high performance. However, it is generally not easy to confirm at an early stage of SoC development whether the on-chip accelerator can deliver sufficient performance for the DNN that one wishes to use. TOPS (tera operations per second) values, which represent the maximum arithmetic performance of the accelerator design, and TOPS/W values, which are obtained by dividing the TOPS value by the power consumption during operation, are often used as indicators for performance comparisons. But because accelerators are designed to perform specific processing at high speed (*1), even an accelerator with a sufficient TOPS value may fall short in practice, either because the DNN contains operations that it cannot process efficiently or because data transfer bandwidth is insufficient. In addition, the power consumption of the overall SoC may exceed the acceptable range due to an increase in accelerator power.
(*1) Dedicated design: While it is possible to use a general-purpose GPU as an accelerator, hardware designed for specific processing can achieve higher processing performance with a smaller circuit size and lower power consumption. For example, the accelerators in Renesas' automotive SoCs, R-Car V3H, R-Car V3M, and R-Car V4H, have a structure suited to processing convolutional neural networks (CNNs), a class of DNNs that use convolution operations for feature extraction.
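To make the limits of TOPS and TOPS/W concrete, the following is a minimal sketch of a roofline-style estimate. This is a generic back-of-the-envelope technique, not the evaluation method described in this article. The `Accelerator` class, the function names, the `utilization` factor, and every number used with them are hypothetical and chosen only for illustration; they do not describe any R-Car device.

```python
from dataclasses import dataclass


@dataclass
class Accelerator:
    """Hypothetical accelerator description (all fields are illustrative)."""
    peak_tops: float       # peak arithmetic throughput, tera-operations per second
    power_watts: float     # power consumption during operation, watts
    bandwidth_gbs: float   # sustained data transfer bandwidth, gigabytes per second


def tops_per_watt(acc: Accelerator) -> float:
    """TOPS/W: peak TOPS divided by the power consumption during operation."""
    return acc.peak_tops / acc.power_watts


def effective_tops(acc: Accelerator, ops_per_byte: float,
                   utilization: float = 1.0) -> float:
    """Roofline-style bound on achievable throughput.

    ops_per_byte: operations the DNN performs per byte transferred (assumed).
    utilization:  fraction of peak compute the DNN's operations can actually
                  use on this accelerator (assumed, between 0 and 1).
    """
    compute_limit = acc.peak_tops * utilization
    # GB/s * ops/byte gives giga-ops/s; divide by 1000 to express in tera-ops/s.
    bandwidth_limit = acc.bandwidth_gbs * ops_per_byte / 1000.0
    return min(compute_limit, bandwidth_limit)


def frames_per_second(acc: Accelerator, giga_ops_per_frame: float,
                      ops_per_byte: float, utilization: float = 1.0) -> float:
    """Estimated frame rate for a DNN that needs giga_ops_per_frame per inference."""
    return effective_tops(acc, ops_per_byte, utilization) * 1000.0 / giga_ops_per_frame


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    acc = Accelerator(peak_tops=8.0, power_watts=4.0, bandwidth_gbs=20.0)
    print("TOPS/W:", tops_per_watt(acc))
    print("Effective TOPS:", effective_tops(acc, ops_per_byte=100.0, utilization=0.5))
    print("Estimated FPS:", frames_per_second(acc, giga_ops_per_frame=50.0,
                                              ops_per_byte=100.0, utilization=0.5))
```

With these assumed inputs, the bandwidth term is the smaller of the two limits, so the estimated throughput sits well below the peak TOPS figure even though the headline TOPS and TOPS/W values look adequate. A simple model like this cannot replace measurement or simulation, but it shows why the two headline indicators alone can be misleading.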
As SoC development progresses, design changes prompted by insufficient performance or excessive power consumption generally become more difficult, and their impact on the SoC development schedule and cost grows. For this reason, it is very important in the development of SoCs for automotive AI devices to confirm at an early stage whether the accelerator to be installed can deliver sufficient performance for the DNN that the customer intends to use in their product, and whether the power consumption is within an acceptable range.