Why the Memory Subsystem is Critical in Inferencing Chips
By Geoff Tate, Flex Logix
EETimes (December 22, 2019)
Good inferencing chips can move data very quickly.
The number of new inferencing chip companies announced this past year is enough to make your head spin. With so many chips and a lack of quality benchmarks, the industry often overlooks one extremely critical piece: the memory subsystem. The truth is, you can't have a good inference chip without a good memory subsystem. So if an inferencing chip company talks only about TOPS and says little about SRAM, DRAM and the memory subsystem in general, it probably doesn't have a very good solution.
It’s All About Data Throughput
Good inferencing chips are architected to move data through them very quickly, which means they must both process that data very fast and move it in and out of memory very fast. If you look at models such as ResNet-50 and YOLOv3, you will see a striking difference not only on the computational side, but also in how each uses memory.
ResNet-50 takes about 2 billion multiply-accumulates (MACs) per image, but YOLOv3 takes over 200 billion MACs: a hundredfold increase. Part of this is because YOLOv3 has more weights (62 million versus approximately 23 million for ResNet-50). The bigger difference, however, is image size in the typical benchmark: ResNet-50 uses 224×224, a size almost no one actually uses, while YOLOv3 processes 2-megapixel images. Thus, the computational load is much greater for YOLOv3.
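To make those ratios concrete, here is a minimal back-of-the-envelope sketch in Python using the figures quoted above (the 2-megapixel count for YOLOv3 is an approximation, not a measured value):

```python
# Back-of-the-envelope comparison using the figures quoted above.
resnet50_macs = 2e9        # ~2 billion MACs per image (224x224)
yolov3_macs = 200e9        # >200 billion MACs per image (2 megapixels)

resnet50_weights = 23e6    # ~23 million weights
yolov3_weights = 62e6      # ~62 million weights

resnet50_pixels = 224 * 224   # ~50 thousand input pixels
yolov3_pixels = 2_000_000     # ~2 megapixels

print(f"MAC ratio:    {yolov3_macs / resnet50_macs:.0f}x")        # ~100x
print(f"Weight ratio: {yolov3_weights / resnet50_weights:.1f}x")  # ~2.7x
print(f"Pixel ratio:  {yolov3_pixels / resnet50_pixels:.0f}x")    # ~40x
```

The weight count alone explains only a roughly 2.7x gap; it is the roughly 40x larger input image that drives most of the hundredfold jump in MACs.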
Using the example above, we have two different workloads, and one takes 100 times more compute. The obvious question is: does this mean YOLOv3 runs 100 times slower? The only way to answer that is by looking at the memory subsystem, because that is what determines the actual throughput on any given chip.
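One way to see why the answer depends on memory is a simple roofline-style bound: per-image throughput is capped by whichever is smaller, the chip's raw MAC rate or its DRAM bandwidth divided by the bytes each image must move. The sketch below uses entirely hypothetical chip and per-image traffic numbers to illustrate the idea; it does not describe any particular product:

```python
def max_images_per_sec(peak_macs_per_s: float, macs_per_image: float,
                       dram_bw_bytes_per_s: float, dram_bytes_per_image: float) -> float:
    """Roofline-style upper bound: throughput is capped by the tighter of
    the compute limit and the memory-bandwidth limit."""
    compute_bound = peak_macs_per_s / macs_per_image
    memory_bound = dram_bw_bytes_per_s / dram_bytes_per_image
    return min(compute_bound, memory_bound)

# Hypothetical chip: 50 trillion MACs/s peak, 25 GB/s of DRAM bandwidth.
# Assumed (illustrative) DRAM traffic per image: ~300 MB for YOLOv3 if
# large intermediate activations spill off-chip, ~25 MB for ResNet-50.
yolov3 = max_images_per_sec(50e12, 200e9, 25e9, 300e6)   # memory-bound: ~83 img/s
resnet50 = max_images_per_sec(50e12, 2e9, 25e9, 25e6)    # memory-bound: ~1000 img/s
print(f"YOLOv3:    {yolov3:.0f} images/s")
print(f"ResNet-50: {resnet50:.0f} images/s")
```

With these made-up numbers the gap is only about 12x, not 100x: both workloads hit the memory-bandwidth wall before the MAC array saturates, which is exactly why the memory subsystem, not the TOPS rating, determines real throughput.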