AI Edge Inference is Totally Different to Data Center
By Geoff Tate, Flex Logix
EETimes (July 23, 2020)
While inference accelerators started out primarily in the data center, they have quickly moved to edge inference with applications such as autonomous driving and medical imaging. Through this transition, customers are finding out, often the hard way, that the same accelerator that did so well processing images in the data center fails badly in edge inference. The reason for this is simple: one processes a pool of data while the other processes a stream.
Streaming throughput means processing at batch size = 1; pool processing means batch = many. In the data center, customers typically process pools of data, such as photos being tagged. The goal is to get through as many photos as possible with the fewest resources, the lowest power consumption, and the best latency.
Edge inference applications, on the other hand, must process a stream of data. These customers usually have a camera producing 30 frames per second, with each frame typically 2 megapixels. At 30 frames per second, the accelerator has roughly 33 milliseconds to process each image before the next one arrives. How an incoming image is handled then depends on what the application needs to do with it.
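The batch = 1 versus batch = many distinction can be made concrete with some latency-budget arithmetic. The sketch below assumes a hypothetical accelerator with a fixed per-batch setup cost and a fixed per-image cost; only the 30 fps and 2-megapixel camera figures come from the article, and the cost numbers are invented for illustration. The key point it demonstrates: batching raises throughput by amortizing setup cost, but in a stream the frames needed to fill a batch must first be collected, one per frame interval, so latency grows with batch size.

```python
# Latency-budget sketch: streaming (batch = 1) edge inference versus
# pooled (batch = N) data-center inference. The 30 fps camera rate is
# from the article; SETUP_MS and PER_IMAGE_MS are assumed numbers for
# a hypothetical accelerator.

FPS = 30                      # camera frame rate (from the article)
FRAME_INTERVAL_MS = 1000 / FPS  # ~33.3 ms between arriving frames

SETUP_MS = 5.0       # assumed fixed cost to launch one batch
PER_IMAGE_MS = 20.0  # assumed compute cost per image in the batch

def latency_ms(batch_size: int) -> float:
    """Time from a frame's arrival until its batch finishes.

    In a stream, a batch of N frames must first be assembled, which
    takes (N - 1) frame intervals before compute can even start.
    """
    collect = (batch_size - 1) * FRAME_INTERVAL_MS
    compute = SETUP_MS + PER_IMAGE_MS * batch_size
    return collect + compute

def throughput_ips(batch_size: int) -> float:
    """Images per second when frames are already pooled (no collection)."""
    return 1000 * batch_size / (SETUP_MS + PER_IMAGE_MS * batch_size)

for n in (1, 8):
    print(f"batch={n}: latency {latency_ms(n):.1f} ms, "
          f"throughput {throughput_ips(n):.1f} img/s")
```

With these assumed costs, batch = 1 finishes each frame in 25 ms, inside the 33 ms budget, while batch = 8 delivers higher pooled throughput but makes the first frame wait roughly 400 ms, which is why a data-center-optimized accelerator can look fast on a pool yet fail on a stream.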