Partitioning to optimize AI inference for multi-core platforms
By Rami Drucker, Ceva
EDN (January 8, 2024)
Not so long ago, artificial intelligence (AI) inference at the edge was a novelty easily supported by a single neural processing unit (NPU) IP accelerator embedded in the edge device. Expectations have accelerated rapidly since then. Now we want embedded AI inference to handle multiple cameras, complex scene segmentation, voice recognition with intelligent noise suppression, fusion between multiple sensors, and even very large and complex generative AI models.
Such applications can deliver acceptable throughput for edge products only when run on multi-core AI processors. NPU IP accelerators are already available to meet this need, scaling to eight or more cores and able to run multiple inference tasks in parallel. But how should you partition your product's expected AI inference workloads to take maximum advantage of all that horsepower?
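One natural starting point is task-level partitioning: mapping independent inference jobs, such as the camera, segmentation, and voice workloads mentioned above, onto separate cores so they run concurrently. The Python sketch below illustrates the idea with a simple round-robin dispatcher. It is a minimal illustration only; the core count, the task list, and the `run_inference` stub are hypothetical stand-ins for whatever core-affinity and scheduling APIs a real NPU runtime would expose, not a description of any vendor's actual mechanism.

```python
# Minimal sketch of task-level partitioning across parallel NPU cores.
# All names here (NUM_CORES, run_inference, workloads) are hypothetical;
# a real NPU SDK would provide its own core-binding and scheduling APIs.
# A thread pool with one worker per core stands in for parallel dispatch.

from concurrent.futures import ThreadPoolExecutor

NUM_CORES = 8  # e.g., an eight-core NPU IP accelerator

# Independent inference tasks of the kind the article lists:
# multiple camera streams, scene segmentation, voice denoising.
workloads = [
    {"model": "detector",  "input": "camera0"},
    {"model": "detector",  "input": "camera1"},
    {"model": "segmenter", "input": "camera0"},
    {"model": "denoiser",  "input": "mic0"},
]

def run_inference(core_id: int, task: dict) -> str:
    # Placeholder: a real runtime would bind the compiled model to the
    # given core (or core group) and execute it there.
    return f"core {core_id}: ran {task['model']} on {task['input']}"

def dispatch(tasks):
    # Round-robin assignment: task i goes to core i % NUM_CORES, so
    # independent tasks spread across all cores before any core is reused.
    with ThreadPoolExecutor(max_workers=NUM_CORES) as pool:
        futures = [
            pool.submit(run_inference, i % NUM_CORES, t)
            for i, t in enumerate(tasks)
        ]
        return [f.result() for f in futures]

if __name__ == "__main__":
    for line in dispatch(workloads):
        print(line)
```

Round-robin is only the simplest policy; in practice the partitioning decision also weighs per-task compute cost, latency deadlines, and shared-memory bandwidth, which is where the workload analysis discussed in the article comes in.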