Ultra High Speed FFT/IFFT processor

Overview

The Noesis Technologies ntFFT_UHS IP implements a customized, programmable fixed-point FFT/IFFT transform processor based on a Decimation in Frequency (DIF) architecture. It supports low-latency streaming operation and processes a highly parallel set of complex samples per clock cycle, with input and output in natural order. The 2’s complement fixed-point precision of the input, internal and output data is fully configurable before IP synthesis.
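As an illustration of the DIF principle named above, the following is a minimal software sketch of the textbook radix-2 DIF FFT/IFFT with natural-order output. It is not the ntFFT_UHS architecture: it runs serially in floating point, whereas the IP is a parallel, pipelined fixed-point design. The function names `fft_dif_radix2` and `bit_reverse` are hypothetical and chosen only for this example.

```python
import cmath


def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def fft_dif_radix2(samples, inverse=False):
    """Radix-2 decimation-in-frequency FFT/IFFT, natural order in and out.

    Floating-point reference sketch; the length must be a power of two.
    """
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    data = [complex(v) for v in samples]
    sign = 1.0 if inverse else -1.0

    half = n // 2
    while half >= 1:
        span = 2 * half
        for start in range(0, n, span):
            for k in range(half):
                # Twiddle factor applied AFTER the subtraction (DIF butterfly).
                w = cmath.exp(sign * 2j * cmath.pi * k / span)
                a = data[start + k]
                b = data[start + k + half]
                data[start + k] = a + b
                data[start + k + half] = (a - b) * w
        half //= 2

    # The in-place DIF result is in bit-reversed order; reorder to natural order.
    bits = n.bit_length() - 1
    out = [data[bit_reverse(i, bits)] for i in range(n)]
    if inverse:
        out = [v / n for v in out]
    return out


if __name__ == "__main__":
    x = [1, 2, 3, 4, 0, 0, 0, 0]
    spectrum = fft_dif_radix2(x)
    restored = fft_dif_radix2(spectrum, inverse=True)
    print([round(v.real, 6) for v in restored])
```

The final reordering step is what yields natural-order output in this sketch; a hardware implementation would typically achieve the same effect with permutation buffers rather than a software index lookup.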
A Radix-2, Radix-4 or Mixed-Radix design with parallel butterfly deployment may be selected, depending on the transform sizes to be implemented. Each stage has its own permutation network buffer, optionally implemented with either Register File or BRAM primitives. The permutation of each buffer stage is necessarily custom-made, since it depends on the configured number of parallel samples per clock cycle and on the supported FFT transform sizes. The fixed-point precision of the twiddle factors is selected via parameterization; the values are precalculated and stored in small distributed LUTs next to the respective butterflies, using a scalable design methodology. An optional Circular Shift buffer can be instantiated for applications that need to correct a detected Carrier Frequency Offset in the frequency domain (FFT); the range of supported circular shift corrections depends on both the FFT transform size and the number of parallel samples per clock cycle. Additional Overlap-Save (OLS) method wrappers may be provided to support real-time, high-bandwidth filtering applications.
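The frequency-domain Carrier Frequency Offset correction mentioned above can be pictured as a rotation of the FFT output bins. The sketch below assumes an offset equal to a whole number of bins and a particular sign convention; both are assumptions made for illustration, and the function name `circular_shift_bins` is hypothetical. It does not model the IP's shift range or its dependence on the number of parallel samples per clock cycle.

```python
def circular_shift_bins(bins, shift):
    """Rotate FFT bins so that out[(k + shift) % n] == bins[k].

    Assumes an integer-bin offset; fractional offsets are not handled here.
    """
    n = len(bins)
    if n == 0:
        return []
    s = shift % n
    return list(bins[n - s:]) + list(bins[:n - s])


if __name__ == "__main__":
    # A tone expected in bin 3 is observed in bin 5 (offset of +2 bins).
    observed = [0, 0, 0, 0, 0, 1, 0, 0]
    corrected = circular_shift_bins(observed, -2)
    print(corrected.index(1))
```

Because the shift is taken modulo the transform size, positive and negative corrections are handled by the same rotation.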

Key Features

  • High pre-synthesis design configurability through detailed generic/parameter values, in order to meet the desired implementation trade-offs.
  • Matlab 100% bit-true reference model with performance metrics and quality markers.
  • Natural order input and output.
  • Programmable FFT/IFFT transform selection and active transform size.
  • Radix-2, Radix-4 or Mixed-Radix data-path units deployed as needed. Radix selection also provides a trade-off in the number of implemented multipliers.
  • Uniform signal energy through scaling in every FFT/IFFT processing stage. For Radix-2 stages, optional sqrt(2) scaling is available, implemented either with multipliers or with an add/sub approximation as a trade-off.
  • Optional programmable Cyclic Shifter for Carrier Frequency Offset (CFO) correction.
  • Optional Overlap-Save wrapper logic for both FFT and IFFT operation.
  • Example architecture #1: 256-parallel programmable 1024/256 FFT low latency, pipelined streaming implementation [I/O FI S7.6, INT. FI S11.8].
  • Example architecture #1 is synthesized and proven in a 5nm process node, achieving a ~960 MHz clock with a throughput rate of ~245 IQ Gsps (complex Giga-samples per second) for ntFFT.
  • Example architecture #2: 32-parallel programmable 256/128/64 FFT low latency, pipelined streaming implementation [I/O FI S7.6, INT. FI S11.8].
  • Synchronous clock design.
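The throughput figure quoted for example architecture #1 follows from the parallelism and the clock rate. The short check below assumes that one full set of parallel complex samples is accepted on every clock cycle; the helper name `throughput_gsps` is hypothetical.

```python
def throughput_gsps(parallel_samples, clock_mhz):
    """Complex giga-samples per second, assuming one parallel set per clock."""
    return parallel_samples * clock_mhz / 1000.0


if __name__ == "__main__":
    # Example architecture #1: 256 parallel samples at roughly 960 MHz.
    rate = throughput_gsps(256, 960.0)
    print(f"{rate:.2f} complex Gsps")  # prints "245.76 complex Gsps"
```

The result, about 245.76 complex Gsps, is consistent with the ~245 IQ Gsps stated in the feature list.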

Applications

  • Communication Systems.
  • Spectrum Analysis.
  • OFDM modems.
  • Image processing.
  • Streaming Filtering applications.
  • Defense Receivers and Signal Monitoring.
  • Medical and Scientific Instruments.

Deliverables

  • Fully commented synthesizable VHDL / SystemVerilog source code or FPGA netlist.
  • VHDL / SystemVerilog test bench and example configuration files.
  • Matlab fixed point model.
  • Comprehensive technical documentation.
  • Technical support.

Technical Specifications

Foundry, Node: TSMC 28nm
Maturity: Silicon Proven
Availability: Now