How to get more performance in 65 nm FPGA designs
By Adrian Cosoroaba and Frederic Rivoallon, Xilinx
November 07, 2006
This "How To" explores how FPGA designers can benefit from the latest FPGA building blocks in their quest for higher system-level performance.
Getting the most performance out of today's FPGA designs grows ever more challenging as system complexity increases and functional requirements become more demanding. Maximizing system performance in an FPGA system design requires a balanced mix of performance-efficient components: logic fabric, on-chip memory, DSP blocks, and I/O bandwidth. This article explores how FPGA designers can benefit from the latest FPGA building blocks in their quest for higher system-level performance. We will explore key features of the new 65 nm fabric architecture, with examples that quantify the anticipated performance improvements for logic and arithmetic functions.
Hard IP blocks are essential in sustaining a desired performance level that may otherwise be limited by potential bottlenecks outside of the fabric, such as on-chip memory buffers, DSP blocks, or I/Os. An analysis of various design benchmarks is provided to better understand the impact of new product and technology innovations and to better quantify expectations.
Extracting maximum performance from an FPGA design also depends heavily on the ability of software tools to optimally map the RTL code onto the FPGA technology cells. Squeezing more MHz or Mbps from the design implementation requires tuning the design with the latest software options. This article provides actual examples of how the latest physical synthesis options can make a difference in meeting timing requirements.