Medical imaging process accelerated in FPGA 82X faster than software
Zhongho Chen, Alvin W.Y. Su, Ming-Ting Sun, and Scott Hauck
EETimes (6/21/2011 3:42 PM EDT)
Medical imaging tasks often require high-performance signal processing to convert sensor data into images that support medical diagnosis. FPGAs are a compelling platform for these systems because they can perform heavily pipelined operations customized to the exact needs of a given computation. In previous work, we benchmarked a CT scanner back-projection algorithm. In this article, we use an FPGA platform and a high-level synthesis tool called Impulse C to speed up statistical line of response (LOR) estimation for a high-resolution Positron Emission Tomography (PET) scanner. The estimation algorithm provides a significant improvement over conventional methods, but its execution time is too long to be practical for clinical applications. Impulse C allows us to rapidly map a C program onto a platform with a host processor and an FPGA coprocessor. We describe several successful optimization methods for the algorithm using Impulse C. The results show that the FPGA implementation achieves an 82x speedup over the optimized software.