Why the Memory Subsystem is Critical in Inferencing Chips
By Geoff Tate, Flex Logix
EETimes (December 22, 2019)
Good inferencing chips can move data very quickly.
The number of new inferencing chip companies announced this past year is enough to make your head spin. With so many chips and a lack of quality benchmarks, the industry often overlooks one extremely critical piece: the memory subsystem. The truth is, you can't have a good inference chip without a good memory subsystem. So if an inferencing chip company talks only about TOPS and says little about SRAM, DRAM and the memory subsystem in general, it probably doesn't have a very good solution.
It’s All About Data Throughput
Good inferencing chips are architected to move data through them very quickly, which means they must both process that data very fast and move it in and out of memory very fast. If you look at models such as ResNet-50 and YOLOv3, you will see a striking difference not only on the computational side, but also in how each uses memory.
ResNet-50 takes about 2 billion multiply-accumulates (MACs) per image, but YOLOv3 takes over 200 billion MACs: a hundredfold increase. Part of this is because YOLOv3 has more weights (62 million versus approximately 23 million for ResNet-50). The bigger difference, however, is image size in the typical benchmark: ResNet-50 uses 224×224, a size almost no one actually uses, while YOLOv3 processes 2-megapixel images. Thus, the computational load is much greater for YOLOv3.
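To make those ratios concrete, here is a minimal back-of-the-envelope sketch in Python using the figures quoted above (the 2-megapixel count for YOLOv3 is an approximation, not a measured value):

```python
# Back-of-the-envelope comparison using the figures quoted above.
resnet50_macs = 2e9        # ~2 billion MACs per image (224x224)
yolov3_macs = 200e9        # >200 billion MACs per image (2 megapixels)

resnet50_weights = 23e6    # ~23 million weights
yolov3_weights = 62e6      # ~62 million weights

resnet50_pixels = 224 * 224   # ~50 thousand input pixels
yolov3_pixels = 2_000_000     # ~2 megapixels

print(f"MAC ratio:    {yolov3_macs / resnet50_macs:.0f}x")        # ~100x
print(f"Weight ratio: {yolov3_weights / resnet50_weights:.1f}x")  # ~2.7x
print(f"Pixel ratio:  {yolov3_pixels / resnet50_pixels:.0f}x")    # ~40x
```

The weight count alone explains only a roughly 2.7x gap; it is the roughly 40x larger input image that drives most of the hundredfold jump in MACs.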
Using the example above, we have two different workloads, and one takes 100 times more compute. The obvious question is: does this mean YOLOv3 runs 100 times slower? The only way to answer that is by looking at the memory subsystem, because that is what determines the actual throughput on any given chip.
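One way to see why the answer depends on memory is a simple roofline-style bound: per-image throughput is capped by whichever is smaller, the chip's raw MAC rate or its DRAM bandwidth divided by the bytes each image must move. The sketch below uses entirely hypothetical chip and per-image traffic numbers to illustrate the idea; it does not describe any particular product:

```python
def max_images_per_sec(peak_macs_per_s: float, macs_per_image: float,
                       dram_bw_bytes_per_s: float, dram_bytes_per_image: float) -> float:
    """Roofline-style upper bound: throughput is capped by the tighter of
    the compute limit and the memory-bandwidth limit."""
    compute_bound = peak_macs_per_s / macs_per_image
    memory_bound = dram_bw_bytes_per_s / dram_bytes_per_image
    return min(compute_bound, memory_bound)

# Hypothetical chip: 50 trillion MACs/s peak, 25 GB/s of DRAM bandwidth.
# Assumed (illustrative) DRAM traffic per image: ~300 MB for YOLOv3 if
# large intermediate activations spill off-chip, ~25 MB for ResNet-50.
yolov3 = max_images_per_sec(50e12, 200e9, 25e9, 300e6)   # memory-bound: ~83 img/s
resnet50 = max_images_per_sec(50e12, 2e9, 25e9, 25e6)    # memory-bound: ~1000 img/s
print(f"YOLOv3:    {yolov3:.0f} images/s")
print(f"ResNet-50: {resnet50:.0f} images/s")
```

With these made-up numbers the gap is only about 12x, not 100x: both workloads hit the memory-bandwidth wall before the MAC array saturates, which is exactly why the memory subsystem, not the TOPS rating, determines real throughput.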