Open-Source Design of Heterogeneous SoCs for AI Acceleration: the PULP Platform Experience
By Francesco Conti, Angelo Garofalo, Davide Rossi, Giuseppe Tagliavini, and Luca Benini -- University of Bologna
The complexity of Artificial Intelligence (AI) algorithms is increasing at an exponential pace that pure technological scaling, especially with the slowing of Moore’s law, cannot keep up with. Epoch AI estimates that the number of parameters in AI models is currently (as of 2024) scaling at a rate of 2× per year; training floating-point operations (FLOPs) are scaling even faster, at 4.2× per year. On the other hand, the same institution estimates that compute performance from dedicated hardware scales at only 1.3× per year for 32-bit floating-point data, with similar rates for other data formats. This figure includes gains from both technology node advancements and architectural improvements.
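To see how quickly these trends diverge, consider a back-of-the-envelope calculation (our own illustration, assuming the Epoch AI growth rates above hold steady): the ratio between the compute that training workloads demand and the compute that a single device delivers compounds every year as

\[ \mathrm{gap}(n) \;=\; \left(\frac{4.2}{1.3}\right)^{n} \;\approx\; 3.2^{\,n}, \]

so after five years the shortfall reaches roughly \(3.2^{5} \approx 350\times\) – a gap far too large to close with technology scaling alone.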
This mismatch creates an extraordinary challenge for the designers of heterogeneous AI Systems-on-Chip (SoCs). On the one hand, accelerator designs must scale continuously to match the increasing complexity of AI workloads – and this is true not only for datacenter AI accelerators but also for edge AI devices, whose functionality is expected to become progressively more sophisticated. On the other hand, this scaling also needs to happen at a fast pace, which makes it imperative to design, verify, and tape out new complex heterogeneous SoCs with a much quicker turnaround time than in traditional design cycles – especially for fabless startups.
Thanks to its “automatic” cost-sharing principle, the open-source hardware model offers a promising avenue for streamlining and accelerating the development of new SoCs, in terms of both cost and time. The principle is simple: instead of allocating significant resources to integrating outsourced vendor IPs for the low-value, non-differentiating baseline parts of an SoC, designers can focus effort and funding primarily on developing their differentiating proprietary IPs, outsourcing only those technology-dependent IPs that are of critical importance (e.g., DRAM PHYs). Moreover, they can leverage available high-quality open-source IPs as a “starting point” for their designs, avoiding the need to fund development from scratch.
Since 2013, the academic PULP (Parallel Ultra-Low Power) Platform project has been one of the most active and successful initiatives in designing research IPs and releasing them as open source. Its portfolio now ranges from processor cores to networks-on-chip, peripherals, SoC templates, and full hardware accelerators. In this article, we focus on the PULP experience in designing heterogeneous AI acceleration SoCs – an endeavour encompassing SoC architecture definition; development, verification, and integration of acceleration IPs; front- and back-end VLSI design; testing; and the development of AI deployment software.