The role of cache in AI processor design
By Frank Schirrmeister, Arteris
EDN (March 22, 2024)
Artificial intelligence (AI) is making its presence felt everywhere these days, from the data centers at the Internet’s core to sensors and handheld devices like smartphones at the Internet’s edge and every point in between, such as autonomous robots and vehicles. For the purposes of this article, we take the term AI to include machine learning and deep learning.
There are two main aspects to AI: training, which is predominantly performed in data centers, and inferencing, which may be performed anywhere from the cloud down to the humblest AI-equipped sensor.
AI is a greedy consumer of two things: computational processing power and data. In the case of processing power, OpenAI, the creator of ChatGPT, published the report “AI and Compute,” which shows that since 2012, the amount of compute used in large AI training runs has doubled every 3.4 months, with no indication of slowing down.
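To put the 3.4-month doubling period in perspective, the following minimal sketch converts a doubling period into a growth factor over a given time span. The function name `growth_factor` is hypothetical, and the 24-month period used for comparison is an assumed reference value, not a figure from this article.

```python
def growth_factor(months: float, doubling_months: float = 3.4) -> float:
    """Return the multiplicative growth over `months`,
    assuming a constant doubling period of `doubling_months`."""
    return 2.0 ** (months / doubling_months)


# Growth over one year at the 3.4-month doubling period cited above.
per_year = growth_factor(12)

# Assumed comparison: a quantity that doubles every 24 months.
per_year_slow = growth_factor(12, doubling_months=24)

print(f"3.4-month doubling: about {per_year:.1f}x per year")
print(f"24-month doubling:  about {per_year_slow:.2f}x per year")
```

Under these assumptions, a 3.4-month doubling period corresponds to roughly an order-of-magnitude increase in compute each year.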
With respect to memory, a large generative AI (GenAI) model like ChatGPT-4 may have more than a trillion parameters, all of which need to be readily accessible so that the model can handle numerous requests simultaneously. In addition, one needs to consider the vast amounts of data that need to be streamed and processed.
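A rough sense of what a trillion parameters means for memory can be had from a back-of-the-envelope calculation. The sketch below is illustrative only: the function name and the choice of numeric formats are assumptions, and it counts weight storage alone, ignoring activations, intermediate buffers, and any replication across devices.

```python
# Bytes needed to store one parameter in a few common numeric formats
# (assumed formats; the article does not say which one a given model uses).
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_footprint_gb(num_params: float, dtype: str) -> float:
    """Return the storage needed for the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# One trillion parameters, the order of magnitude mentioned above.
for dtype in BYTES_PER_PARAM:
    gb = weight_footprint_gb(1e12, dtype)
    print(f"{dtype}: {gb:,.0f} GB")
```

Even at one byte per parameter, the weights under these assumptions occupy about a terabyte, far more than any on-chip memory can hold, which is why data movement between off-chip memory and the processing elements matters so much.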