Hit performance goals with configurable processors
By Steven Leibson, Tensilica
October 17, 2006 -- dspdesignline.com
With mainstream DSPs, code must be hand-tuned in assembly language to meet performance goals. A more productive approach is to tailor the processor to the algorithm. This article explains why, using FFT, Viterbi, and MPEG-4 examples.
For more than 30 years, the fixed-ISA (instruction set architecture) microprocessor has largely defined electronic system design. Similarly, single-chip DSPs have dominated DSP system design since their introduction 20 years ago. Fixed-ISA processors have addressed application performance problems in two ways. The first is to increase the clock rate, mirroring the trend in PC processors. If power dissipation is not important, rapidly escalating clock rates can go a long way toward force-fitting a particular processor to an application. The alternative is to provide more computational resources that operate in parallel, so that the processor performs more work per clock cycle.
Designers of fixed-ISA microprocessors and DSPs attempt to develop architectures that execute a wide range of algorithms well but are not tailored to any specific application. This design approach can reduce clock-rate requirements, but it doesn't match the clock-rate or gate-usage efficiencies of a tailored solution. The design of complex portable and battery-powered processor-based systems increasingly calls for a different approach, one that matches the processing resources more closely to the application tasks so that very high clock rates (and correspondingly high power and heat dissipation) are not required. FPGA, ASIC, and SOC technologies provide an ideal medium for tailoring processors to specific applications, using newly available configurable and extensible processor technology.