How to build ultra-fast floating-point FFTs in FPGAs
By Ray Andraka, Andraka Consulting Group
April 30, 2007 -- dspdesignline.com
Engineers targeting DSP to FPGAs have traditionally used fixed-point arithmetic, mainly because of the high cost associated with implementing floating-point arithmetic. That cost comes in the form of increased circuit complexity and often degraded maximum clock performance. Certain applications demand the dynamic range offered by floating-point hardware but require speeds and circuit sizes usually associated with fixed-point hardware. The fast Fourier transform (FFT) is one DSP building block that frequently requires floating-point dynamic range and high speed.
A textbook construction of a pipelined floating-point FFT engine capable of continuous input entails dozens of floating-point adders and multipliers. The complexity of these circuits quickly exceeds the resources available on a single FPGA. We fit the FFT design into a single FPGA without sacrificing speed or floating-point performance by using an alternative FFT algorithm and a hybrid of fixed- and floating-point hardware.
The resulting design has IEEE single-precision floating-point inputs and outputs that match the precision obtained with more conventional designs, yet is capable of as much as 1.2 gigasamples-per-second continuous data throughput and fits into one Xilinx Virtex-4 XC4VSX55 FPGA. The design performs 32-, 64-, 128-, 256-, 512-, 1,024-, or 2,048-point complex input Fourier transform, with size selected on the fly.
April 30, 2007 -- dspdesignline.com
Engineers targeting DSP to FPGAs have traditionally used fixed-point arithmetic, mainly because of the high cost associated with implementing floating-point arithmetic. That cost comes in the form of increased circuit complexity and often degraded maximum clock performance. Certain applications demand the dynamic range offered by floating-point hardware but require speeds and circuit sizes usually associated with fixed-point hardware. The fast Fourier transform (FFT) is one DSP building block that frequently requires floating-point dynamic range and high speed.
A textbook construction of a pipelined floating-point FFT engine capable of continuous input entails dozens of floating-point adders and multipliers. The complexity of these circuits quickly exceeds the resources available on a single FPGA. We fit the FFT design into a single FPGA without sacrificing speed or floating-point performance by using an alternative FFT algorithm and a hybrid of fixed- and floating-point hardware.
The resulting design has IEEE single-precision floating-point inputs and outputs that match the precision obtained with more conventional designs, yet is capable of as much as 1.2 gigasamples-per-second continuous data throughput and fits into one Xilinx Virtex-4 XC4VSX55 FPGA. The design performs 32-, 64-, 128-, 256-, 512-, 1,024-, or 2,048-point complex input Fourier transform, with size selected on the fly.
To read the full article, click here
Related Semiconductor IP
- 12-bit, 400 MSPS SAR ADC - TSMC 12nm FFC
- 10-bit Pipeline ADC - Tower 180 nm
- NoC Verification IP
- Simulation VIP for Ethernet UEC
- Automotive Grade PLLs, Oscillators, SerDes PMAs, LVDS/CML IP
Related Articles
- How to build a fast, custom FFT from C
- How to build a better DC/DC regulator using FPGAs
- How to Design SmartNICs Using FPGAs to Increase Server Compute Capacity
- How to achieve better IoT security in Wi-Fi modules
Latest Articles
- Analog Foundation Models
- Modeling and Optimizing Performance Bottlenecks for Neuromorphic Accelerators
- RISC-V Based TinyML Accelerator for Depthwise Separable Convolutions in Edge AI
- Exclude Smart in Functional Coverage
- A 0.32 mm² 100 Mb/s 223 mW ASIC in 22FDX for Joint Jammer Mitigation, Channel Estimation, and SIMO Data Detection