With support for the latest generative AI models as well as traditional RNN, CNN, and LSTM networks, the Origin™ E6 NPUs scale from 16 to 32 TOPS, delivering the optimum balance of performance, efficiency, and features for demanding edge inference applications.
The Origin E6 is a versatile NPU customized to match the needs of next-generation smartphones, automobiles, AR/VR, and consumer devices. With support for video, audio, and text-based AI networks—including standard, custom, and proprietary networks—the E6 is an ideal hardware/software co-designed platform for chip architects and AI developers. It offers broad native support for current and emerging AI models, and achieves ultra-efficient workload scheduling and memory management, with up to 90% processor utilization—avoiding dark silicon waste.
The Origin E6 neural engine uses Expedera’s unique packet-based architecture, which is far more efficient than common layer-based architectures. It enables parallel execution across multiple layers, achieving better resource utilization and deterministic performance. It also eliminates the need for hardware-specific optimizations, allowing customers to run their trained neural networks unchanged without loss of model accuracy. This approach greatly increases performance while lowering power, area, and latency.
Specifications

| Feature | Description |
| --- | --- |
| Compute Capacity | 8K to 16K INT8 MACs |
| Multi-tasking | Runs up to 8 simultaneous jobs |
| Power Efficiency | 18 TOPS/W effective; no pruning, sparsity, or compression required (though supported) |
| Example Networks Supported | HitNet, Denoise, ResNext, ResNet50 V1.5, ResNet50 V2, Inception V3, RNN-T, MobileNet SSD, MobileNet V1, UNET, BERT, EfficientNet, FSR CNN, CPN, CenterNet, YOLO V3, YOLO v5l, ShuffleNet2, Swin, SSD-ResNet34, DETR, others |
| Example Performance | MobileNet V1 (512 × 512): 3629 IPS, 2696 IPS/W (N7 process, 1 GHz, no sparsity/pruning/compression applied) |
| Layer Support | Standard NN functions, including Conv, Deconv, FC, Activations, Reshape, Concat, Elementwise, Pooling, Softmax, others. Programmable general FP functions, including Sigmoid, Tanh, Sine, Cosine, Exp, others; custom operators supported |
| Data Types | INT4/INT8/INT10/INT12/INT16 activations/weights; FP16/BFloat16 activations/weights |
| Quantization | Channel-wise quantization (TFLite specification); software toolchain supports Expedera, customer-supplied, or third-party quantization |
| Latency | Deterministic performance guarantees, no back pressure |
| Frameworks | TensorFlow, TFLite, ONNX, others supported |
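The channel-wise quantization named in the table follows the TFLite specification, in which each output channel of a weight tensor gets its own scale (symmetric INT8, zero-point fixed at 0). A minimal NumPy sketch of that scheme follows; the function name, shapes, and values are illustrative only and are not Expedera toolchain APIs:

```python
import numpy as np

def quantize_per_channel(weights: np.ndarray, axis: int = 0):
    """Symmetric per-channel INT8 quantization (TFLite-style):
    one scale per output channel, zero-point 0."""
    # Move the channel axis first, then flatten the remaining dims per channel.
    w = np.moveaxis(weights, axis, 0).reshape(weights.shape[axis], -1)
    # Each channel's max magnitude maps onto the symmetric INT8 range [-127, 127].
    scales = np.abs(w).max(axis=1) / 127.0
    scales = np.where(scales == 0, 1.0, scales)  # guard all-zero channels
    q = np.clip(np.round(w / scales[:, None]), -127, 127).astype(np.int8)
    return q, scales

# Illustrative conv weights: 4 output channels, 3x3 kernels, 8 input channels.
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 3, 3, 8)).astype(np.float32)
q, scales = quantize_per_channel(w, axis=0)

# Dequantize and check the rounding error stays within half a quantization step.
dequant = q.astype(np.float32) * scales[:, None]
max_err = np.abs(dequant - w.reshape(4, -1)).max()
```

Because each channel is scaled independently, a channel with small weights keeps fine resolution even when another channel has large outliers, which is why per-channel schemes generally preserve accuracy better than a single per-tensor scale.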