Partitioning to optimize AI inference for multi-core platforms

By Rami Drucker, Ceva
EDN (January 8, 2024)

Not so long ago, artificial intelligence (AI) inference at the edge was a novelty easily supported by a single neural processing unit (NPU) IP accelerator embedded in the edge device. Expectations have accelerated rapidly since then. We now want embedded AI inference to handle multiple cameras, complex scene segmentation, voice recognition with intelligent noise suppression, fusion between multiple sensors, and, most recently, very large and complex generative AI models.

Such applications can deliver acceptable throughput for edge products only when run on multi-core AI processors. NPU IP accelerators are already available to meet this need, extending to eight or more parallel cores and able to handle multiple inference tasks in parallel. But how should you partition expected AI inference workloads for your product to take maximum advantage of all that horsepower?



