Top-down DSP design for FPGAs
By Mike Fingeroff, Dan Gardner, and Matt Hogan, Mentor Graphics
September 05, 2007 -- pldesignline.com
Digital filtering of non-real-time signals has been performed for decades. The increased performance in today's silicon allows for these calculations to be accomplished in real-time, if the right hardware and algorithms are used. Much like the specialization that occurred when digital signal processing (DSP) was initially introduced, FPGAs provide a convenient platform for today's signal processing algorithms.
Over the last few years, FGPA vendors have included dedicated multiplication / addition / subtraction hardware into their devices. No longer must a multiplier be built out of the fabric logic of an FPGA. Dedicated blocks containing mathematical functions, feedback paths, and pipeline stages allow logic designers to easily parallelize the computational-intensive portions of their design within these resources.
The most commonly used functions are the ability to multiply and accumulate (MAC), and multiply and add (MULT-ADD). The internal routing within the dedicated DSP hardware allows these operations to run at extremely high clock frequencies leading to high-performance designs.
It is the capability of massively parallelizing these functions and cascading them inside the FPGA that provides the most benefit. Not only can simple filters be constructed, but also complex algorithms for video compression and cryptography. Often, the hardest part of the design is the mapping of the algorithm to the different FPGA resources, controlling clock latencies and other hardware aspects to meet the timing of the design.
Many of these algorithms are readily expressed in high-level languages, such as C and C++, requiring the designer to translate this high-level algorithm to RTL for implementation in an FPGA. This article will discuss a top-down methodology that allows engineers to describe DSP algorithms in C, automatically generate RTL, and then synthesize to an FPGA implementation.
September 05, 2007 -- pldesignline.com
Digital filtering of non-real-time signals has been performed for decades. The increased performance in today's silicon allows for these calculations to be accomplished in real-time, if the right hardware and algorithms are used. Much like the specialization that occurred when digital signal processing (DSP) was initially introduced, FPGAs provide a convenient platform for today's signal processing algorithms.
Over the last few years, FGPA vendors have included dedicated multiplication / addition / subtraction hardware into their devices. No longer must a multiplier be built out of the fabric logic of an FPGA. Dedicated blocks containing mathematical functions, feedback paths, and pipeline stages allow logic designers to easily parallelize the computational-intensive portions of their design within these resources.
The most commonly used functions are the ability to multiply and accumulate (MAC), and multiply and add (MULT-ADD). The internal routing within the dedicated DSP hardware allows these operations to run at extremely high clock frequencies leading to high-performance designs.
It is the capability of massively parallelizing these functions and cascading them inside the FPGA that provides the most benefit. Not only can simple filters be constructed, but also complex algorithms for video compression and cryptography. Often, the hardest part of the design is the mapping of the algorithm to the different FPGA resources, controlling clock latencies and other hardware aspects to meet the timing of the design.
Many of these algorithms are readily expressed in high-level languages, such as C and C++, requiring the designer to translate this high-level algorithm to RTL for implementation in an FPGA. This article will discuss a top-down methodology that allows engineers to describe DSP algorithms in C, automatically generate RTL, and then synthesize to an FPGA implementation.
To read the full article, click here
Related Semiconductor IP
- 8MHz / 40MHz Pierce Oscillator - X-FAB XT018-0.18µm
- UCIe RX Interface
- Very Low Latency BCH Codec
- 5G-NTN Modem IP for Satellite User Terminals
- 400G UDP/IP Hardware Protocol Stack
Related Articles
- Top Down SoC Floor planning with ReUse
- Implementing floating-point DSP on FPGAs
- How to Design SmartNICs Using FPGAs to Increase Server Compute Capacity
- Why Transceiver-Rich FPGAs Are Suitable for Vehicle Infotainment System Designs
Latest Articles
- SNAP-V: A RISC-V SoC with Configurable Neuromorphic Acceleration for Small-Scale Spiking Neural Networks
- An FPGA Implementation of Displacement Vector Search for Intra Pattern Copy in JPEG XS
- A Persistent-State Dataflow Accelerator for Memory-Bound Linear Attention Decode on FPGA
- VMXDOTP: A RISC-V Vector ISA Extension for Efficient Microscaling (MX) Format Acceleration
- PDF: PUF-based DNN Fingerprinting for Knowledge Distillation Traceability