Our inference-accelerator IP enhances AI computation, delivering strong performance across a wide range of applications. Whether you need real-time object detection, natural language processing, image recognition, or other AI tasks, our accelerators enable faster, more efficient AI processing and make your applications more competitive.
The parameterizable AI Accelerator (GenCore) includes several SystemVerilog RTL kernels: DataFetcher, Conv+ReLU, MaxPooling, and DataWriter.
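As a rough illustration of how these kernels chain together, here is a minimal Python reference-model sketch (the actual deliverable is SystemVerilog RTL; the function names and single-channel simplification here are illustrative, not the product API):

```python
import numpy as np

def conv_relu(x, w):
    """Valid 2-D convolution followed by ReLU (single channel)."""
    kh, kw = w.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    y = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            y[i, j] = np.sum(x[i:i+kh, j:j+kw] * w)  # multiply-accumulate window
    return np.maximum(y, 0)  # ReLU

def max_pool(x, p=2):
    """Non-overlapping p x p max pooling."""
    oh, ow = x.shape[0] // p, x.shape[1] // p
    return x[:oh*p, :ow*p].reshape(oh, p, ow, p).max(axis=(1, 3))

def pipeline(image, weights):
    """DataFetcher -> Conv+ReLU -> MaxPooling -> DataWriter, modeled sequentially."""
    x = image                 # DataFetcher: stream the input feature map in
    x = conv_relu(x, weights)
    x = max_pool(x)
    return x                  # DataWriter: stream the result back out
```

In hardware these stages run concurrently as a streaming pipeline; the sequential model above only captures the data flow.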
AI Accelerator Specifically for CNNs
Overview
Key Features
- Specialized hardware with controlled throughput and hardware cost/resources, using parameterizable layers, configurable weights, and precision settings to support fixed-point operation.
- This hardware aims to accelerate inference, particularly for CNNs such as LeNet-5, VGG-16, VGG-19, AlexNet, and ResNet-50.
- Customers can also customize their own CNN models and adapt them to this hardware, specifying the number of layers, the weights, and the bit configuration; fixed-point precision is supported up to 16 bits.
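To make the fixed-point setting concrete, here is a minimal sketch of signed fixed-point quantization with saturation, assuming a Qm.n split of the 16 bits (the parameter names and the specific rounding mode are illustrative assumptions, not the product's documented interface):

```python
def quantize(x, total_bits=16, frac_bits=8):
    """Round a real value to a signed fixed-point grid (Qm.n) and saturate."""
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))       # most negative representable code
    hi = (1 << (total_bits - 1)) - 1    # most positive representable code
    q = int(round(x * scale))
    q = max(lo, min(hi, q))             # saturate on overflow
    return q / scale
```

For example, with 16 total bits and 8 fraction bits, values are rounded to multiples of 1/256 and clipped to the range [-128, 128).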
Benefits
- The deep pipeline of cascaded kernels streamlines basic CNN operations and eliminates the need to store inter-layer data externally. This reduces memory-bandwidth demands, which is crucial for embedded FPGAs and ASICs.
- A single hardware kernel serves both convolution and fully connected layers, improving overall resource efficiency.
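The reason one kernel can serve both layer types is that a fully connected layer is mathematically a convolution with a 1x1 spatial extent. A minimal Python sketch of this equivalence (illustrative only; the function names are not the product API):

```python
import numpy as np

def fc_as_conv(x_vec, w_mat):
    """Compute y = W @ x by reshaping the FC layer into a 1x1 convolution,
    so the same multiply-accumulate kernel can serve both layer types."""
    # Treat the input vector as a 1x1 feature map with len(x_vec) channels
    x = x_vec.reshape(-1, 1, 1)
    # Each output neuron becomes one 1x1 filter over all input channels
    w = w_mat.reshape(w_mat.shape[0], w_mat.shape[1], 1, 1)
    # Same MAC pattern as convolution: sum over channels and the 1x1 window
    return np.einsum('oihw,ihw->o', w, x)
```

Reusing one MAC datapath this way is what lets the design avoid a separate matrix-multiply engine for the classifier layers.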
Block Diagram
Applications
- CNN
- Support for Transformer-based LLM models is in progress and can be tailored to customer needs
Deliverables
- RTL Files
- MATLAB Models
- Guidance and Support for Customizing CNN Models