Overview
The Xilinx® Deep Learning Processor Unit (DPU) is a programmable engine dedicated to convolutional neural networks. The unit contains a register configuration module, a data controller module, and a convolution computing module. The DPU has a specialized instruction set, which enables it to work efficiently with many convolutional neural networks. Networks deployed on the DPU include VGG, ResNet, GoogLeNet, YOLO, SSD, MobileNet, FPN, and others.
The DPU IP can be integrated as a block in the programmable logic (PL) of selected Zynq®-7000 SoC and Zynq UltraScale+™ MPSoC devices, with direct connections to the processing system (PS). To use the DPU, you must place the instructions and input image data at specific memory addresses that the DPU can access. DPU operation also requires the application processing unit (APU) to service interrupts in order to coordinate data transfer.