Understanding Peak Floating-Point Performance Calculations
Michael Parker, Altera
EETimes (10/20/2014 02:03 PM EDT)
DSPs, GPUs, and FPGAs serve as accelerators for many CPUs, providing both performance and power efficiency benefits. Given the variety of computing architectures available, designers need a uniform method to compare performance and power efficiency. The accepted method is to measure floating-point operations per second (FLOPS), where a floating-point operation (FLOP) is defined as either an addition or a multiplication of single-precision (32-bit) or double-precision (64-bit) numbers in conformance with the IEEE 754 standard. All higher-order functions, such as division, square root, and trigonometric operators, can be constructed using adders and multipliers. Because these operators, as well as other common functions such as fast Fourier transforms (FFTs) and matrix operators, require both adders and multipliers, there is commonly a 1:1 ratio of adders to multipliers in all these architectures.
Let's look at how we go about comparing the performance of the DSP, GPU, and FPGA architectures based on their peak FLOPS rating. The peak FLOPS rating is determined by multiplying the total number of adders and multipliers by the maximum operating frequency. This represents the theoretical limit for computations, which can never be achieved in practice, since it is generally not possible to implement useful algorithms that keep all the computational units occupied all the time. It does, however, provide a useful comparison metric.
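The peak FLOPS rule described above can be written out as a short calculation. The following is a minimal sketch in Python. The function name `peak_flops` and the device figures (1,000 adders, 1,000 multipliers, 500 MHz) are hypothetical values chosen for illustration and do not describe any real DSP, GPU, or FPGA.

```python
def peak_flops(num_adders: int, num_multipliers: int, max_freq_hz: float) -> float:
    """Return peak FLOPS: (adders + multipliers) * maximum operating frequency.

    Assumes every adder and multiplier completes one operation per clock
    cycle, which is the theoretical upper bound, not a sustained rate.
    """
    return (num_adders + num_multipliers) * max_freq_hz


# Hypothetical device with a 1:1 ratio of adders to multipliers at 500 MHz.
rating = peak_flops(num_adders=1000, num_multipliers=1000, max_freq_hz=500e6)
print(f"{rating / 1e9:.0f} GFLOPS")  # prints "1000 GFLOPS"
```

Because the result assumes every unit is busy on every cycle, it is best read as a ceiling for comparing architectures rather than as a prediction of delivered performance.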