Top-down DSP design for FPGAs
By Mike Fingeroff, Dan Gardner, and Matt Hogan, Mentor Graphics
September 05, 2007 -- pldesignline.com
Digital filtering of non-real-time signals has been performed for decades. The increased performance of today's silicon allows these calculations to be carried out in real time, provided the right hardware and algorithms are used. Much like the specialization that occurred when digital signal processing (DSP) was first introduced, FPGAs provide a convenient platform for today's signal processing algorithms.
Over the last few years, FPGA vendors have built dedicated multiplication, addition, and subtraction hardware into their devices. No longer must a multiplier be built out of the fabric logic of an FPGA. Dedicated blocks containing mathematical functions, feedback paths, and pipeline stages allow logic designers to easily parallelize the computationally intensive portions of their design within these resources.
The most commonly used functions are multiply-and-accumulate (MAC) and multiply-and-add (MULT-ADD). The internal routing within the dedicated DSP hardware allows these operations to run at extremely high clock frequencies, leading to high-performance designs.
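To make the two operations concrete, the following is a minimal C sketch of what a MAC and a MULT-ADD compute. The function names `mac` and `mult_add` and the 16-bit operand width are assumptions chosen for illustration; they do not come from the article or from any vendor's DSP block specification.

```c
#include <stddef.h>
#include <stdint.h>

/* Multiply-accumulate (MAC): each product is added to a running sum
 * that feeds back into the next iteration.
 * Hypothetical helper; 16-bit operands are an assumption. */
int64_t mac(const int16_t *a, const int16_t *b, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; ++i) {
        /* Widen before multiplying so the product cannot overflow. */
        acc += (int64_t)a[i] * (int64_t)b[i];
    }
    return acc;
}

/* Multiply-add (MULT-ADD): one product plus an independent addend,
 * with no feedback path. Hypothetical helper. */
int64_t mult_add(int16_t a, int16_t b, int32_t c)
{
    return (int64_t)a * (int64_t)b + (int64_t)c;
}
```

The practical difference is the feedback path: the MAC carries its accumulator from one iteration to the next, while the MULT-ADD takes its addend from elsewhere, which is what allows several such stages to be cascaded.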
It is the capability of massively parallelizing these functions and cascading them inside the FPGA that provides the most benefit. Not only can simple filters be constructed, but also complex algorithms for video compression and cryptography. Often, the hardest part of the design is the mapping of the algorithm to the different FPGA resources, controlling clock latencies and other hardware aspects to meet the timing of the design.
Many of these algorithms are readily expressed in high-level languages, such as C and C++, requiring the designer to translate this high-level algorithm to RTL for implementation in an FPGA. This article will discuss a top-down methodology that allows engineers to describe DSP algorithms in C, automatically generate RTL, and then synthesize to an FPGA implementation.
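As a rough illustration of the kind of C description such a flow might start from, here is a small FIR filter written as a delay line followed by a MAC loop. This is a sketch under stated assumptions, not the authors' code: the names `fir_state` and `fir_step`, the tap count `NUM_TAPS`, and the 16-bit data width are all hypothetical, and no tool-specific pragmas or data types are used.

```c
#include <stdint.h>

#define NUM_TAPS 4  /* hypothetical filter length */

/* Delay line holding the most recent NUM_TAPS input samples. */
typedef struct {
    int16_t delay[NUM_TAPS];
} fir_state;

/* Process one input sample and return one output sample.
 * The fixed loop bounds make the structure easy to unroll into
 * parallel multipliers, or to fold onto a single MAC unit. */
int64_t fir_step(fir_state *s, const int16_t coeffs[NUM_TAPS], int16_t x)
{
    int64_t acc = 0;

    /* Shift the delay line by one position and insert the new sample. */
    for (int i = NUM_TAPS - 1; i > 0; --i) {
        s->delay[i] = s->delay[i - 1];
    }
    s->delay[0] = x;

    /* Multiply-accumulate across all taps. */
    for (int i = 0; i < NUM_TAPS; ++i) {
        acc += (int64_t)s->delay[i] * (int64_t)coeffs[i];
    }
    return acc;
}
```

Feeding a unit impulse into a zero-initialized `fir_state` returns the coefficients one per call, which is a quick sanity check on the C model before any hardware is generated.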