MING: An Automated CNN-to-Edge MLIR HLS framework

By Jiahong Bi, Lars Schütze and Jeronimo Castrillon
Technische Universität Dresden, Germany

Abstract

Driven by the increasing demand for low-latency, real-time processing, machine learning applications are steadily migrating toward edge computing platforms, where Field-Programmable Gate Arrays (FPGAs) are widely adopted for their energy efficiency compared to CPUs and GPUs. To generate high-performance and low-power FPGA designs, several frameworks built upon High-Level Synthesis (HLS) vendor tools have been proposed, among which Multi-Level Intermediate Representation (MLIR)-based frameworks are gaining significant traction due to their extensibility and ease of use. However, existing state-of-the-art frameworks often overlook the stringent resource constraints of edge devices. To address this limitation, we propose MING, an MLIR-based framework that abstracts and automates the HLS design process. Within this framework, we adopt a streaming architecture with carefully managed buffers, specifically designed to handle resource constraints while ensuring low latency. In comparison with recent frameworks, our approach achieves on average a 15x speedup for standard Convolutional Neural Network (CNN) kernels with up to four layers, and up to 200x for single-layer kernels. For kernels with larger input sizes, MING is capable of generating efficient designs that respect hardware resource constraints, which state-of-the-art frameworks struggle to meet.

Index Terms — Hardware Architectures, Compilers, High Level Synthesis, Quantized Neural Network, Edge Computing
