Medical imaging process accelerated in FPGA 82X faster than software
Zhongho Chen, Alvin W.Y. Su, Ming-Ting Sun, and Scott Hauck
EETimes (6/21/2011 3:42 PM EDT)
Medical imaging tasks can require high-performance signal processing to convert sensor data into imagery that helps with medical diagnostics. FPGAs are a compelling platform for these systems, since they can perform heavily pipelined operations customized to the exact needs of a given computation. In previous work we benchmarked a CT scanner back-projection algorithm. In this article we use an FPGA platform and a high-level synthesis tool called Impulse C to speed up statistical line of reaction (LOR) estimation for a high-resolution Positron Emission Tomography (PET) scanner. The estimation algorithm provides a significant improvement over conventional methods, but its execution time is too long to be practical for clinical applications. Impulse C allows us to rapidly map a C program onto a platform with a host processor and an FPGA coprocessor. We describe several successful optimization methods for the algorithm using Impulse C. The results show that the FPGA implementation achieves an 82x speedup over the optimized software.