How to build ultra-fast floating-point FFTs in FPGAs
By Ray Andraka, Andraka Consulting Group
April 30, 2007 -- dspdesignline.com
Engineers targeting DSP to FPGAs have traditionally used fixed-point arithmetic, mainly because of the high cost associated with implementing floating-point arithmetic. That cost comes in the form of increased circuit complexity and often degraded maximum clock performance. Certain applications demand the dynamic range offered by floating-point hardware but require speeds and circuit sizes usually associated with fixed-point hardware. The fast Fourier transform (FFT) is one DSP building block that frequently requires floating-point dynamic range and high speed.
A textbook construction of a pipelined floating-point FFT engine capable of continuous input entails dozens of floating-point adders and multipliers. The complexity of these circuits quickly exceeds the resources available on a single FPGA. We fit the FFT design into a single FPGA without sacrificing speed or floating-point performance by using an alternative FFT algorithm and a hybrid of fixed- and floating-point hardware.
The resulting design has IEEE single-precision floating-point inputs and outputs that match the precision obtained with more conventional designs, yet it sustains up to 1.2 gigasamples per second of continuous data throughput and fits into one Xilinx Virtex-4 XC4VSX55 FPGA. The design performs 32-, 64-, 128-, 256-, 512-, 1,024-, or 2,048-point complex-input Fourier transforms, with the size selected on the fly.
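This summary does not say how the hybrid of fixed- and floating-point hardware is organized. One common way to combine the two, shown here purely as an illustrative assumption and not as the author's method, is block floating point: a block of floating-point samples is rescaled to integers that share a single exponent, the arithmetic is done in fixed point, and the result is converted back to floating point. The function names and the 24-bit word width below are hypothetical choices for the sketch.

```python
import math

def to_block_fixed(samples, mantissa_bits=24):
    """Scale floats to signed integers that share one block exponent.

    Hypothetical helper: mantissa_bits is the assumed signed word width.
    """
    peak = max(abs(s) for s in samples)
    if peak == 0.0:
        return [0] * len(samples), 0
    # frexp returns (m, e) with peak = m * 2**e and 0.5 <= m < 1
    _, block_exp = math.frexp(peak)
    shift = mantissa_bits - 1 - block_exp
    limit = (1 << (mantissa_bits - 1)) - 1
    ints = []
    for s in samples:
        q = int(round(math.ldexp(s, shift)))
        # Saturate in case rounding pushes the peak past the word range
        ints.append(max(-limit - 1, min(limit, q)))
    return ints, block_exp

def from_block_fixed(ints, block_exp, mantissa_bits=24):
    """Convert shared-exponent integers back to floats."""
    shift = mantissa_bits - 1 - block_exp
    return [math.ldexp(float(i), -shift) for i in ints]

if __name__ == "__main__":
    data = [1.0, -0.5, 0.25, 3.0e-3]
    ints, exp = to_block_fixed(data)
    print(ints, exp)
    print(from_block_fixed(ints, exp))
```

In such a scheme the small samples in a block lose precision relative to the block peak, which is the usual trade-off against the much cheaper fixed-point adders and multipliers.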