When Your Embedded Processor Runs Out of Steam, Try Parallelism
Ron Wilson, Altera
March 24, 2015
The scenario is becoming increasingly familiar. You have a working embedded design, perhaps backed by years of deployment with customers and hundreds of thousands of lines of debugged code. Along comes marketing with a new set of performance specifications, or R&D with a new computer-crushing algorithm. Your existing CPU family just can’t handle it.
At this point your options can look uniformly dismal. Perhaps you can move to a higher-performance CPU family without totally losing instruction-set compatibility. But there will almost certainly be enough differences to require a new operating-system (OS) version and re-verification of the code. And the new CPU will have new hardware implications, from increased power consumption to different DRAM interfaces.
Of course you can also move to a faster CPU with a different instruction set. But a new tool chain, from compiler to debugger, plus the task of finding all those hidden instruction-set dependencies in your legacy code, can make this move genuinely frightening. And changing SoC vendors will have system-level hardware implications too.
Or you could try a different approach: you could identify the performance hot spots in your code (or in that new algorithm) and break them into multiple threads that can be executed in parallel. Then you could execute this new code on a multicore CPU cluster. Unless you are currently using something quite strange, there is a good chance that there is a chip out there that has multiple instances of the CPU core you are using now. Or, if there is inherent parallelism in your data, you could rewrite those threads to run on a graphics processor (GPU), a multicore digital signal processing (DSP) chip, or a purpose-built hardware accelerator in an FPGA or ASIC. All these choices require new code, but only for the specific segments you are accelerating. The vast majority of that legacy code can remain unexamined.
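As a rough illustration of breaking a hot spot into threads, here is a minimal sketch in C using POSIX threads. It assumes the hot spot is a simple reduction over a sample buffer (a sum of squares). The names `sum_squares_parallel`, `slice_t`, and `MAX_WORKERS` are hypothetical and chosen only for this example; a real design would pick the worker count and partitioning to match the target's cores and memory system.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_WORKERS 8  /* hypothetical upper bound on worker threads */

/* One contiguous slice of the input, plus room for that slice's result. */
typedef struct {
    const int32_t *samples;
    size_t begin;     /* first index in the slice (inclusive) */
    size_t end;       /* last index in the slice (exclusive)  */
    int64_t partial;  /* written by the worker, read after join */
} slice_t;

/* Worker body: the original serial loop, restricted to one slice. */
static void *sum_squares_slice(void *arg)
{
    slice_t *s = arg;
    int64_t acc = 0;
    for (size_t i = s->begin; i < s->end; i++)
        acc += (int64_t)s->samples[i] * s->samples[i];
    s->partial = acc;
    return NULL;
}

/* Split the buffer into `workers` slices, run one thread per slice,
 * then combine the partial sums. Each thread writes only its own
 * slice_t, so no locking is needed. */
int64_t sum_squares_parallel(const int32_t *samples, size_t n, unsigned workers)
{
    assert(workers >= 1 && workers <= MAX_WORKERS);

    pthread_t tid[MAX_WORKERS];
    slice_t slice[MAX_WORKERS];
    int started[MAX_WORKERS];
    size_t chunk = (n + workers - 1) / workers;  /* ceiling division */

    for (unsigned w = 0; w < workers; w++) {
        size_t b = (size_t)w * chunk;
        if (b > n) b = n;
        size_t e = b + chunk;
        if (e > n) e = n;
        slice[w] = (slice_t){ samples, b, e, 0 };
        started[w] = pthread_create(&tid[w], NULL,
                                    sum_squares_slice, &slice[w]) == 0;
        if (!started[w])
            sum_squares_slice(&slice[w]);  /* fall back to running inline */
    }

    int64_t total = 0;
    for (unsigned w = 0; w < workers; w++) {
        if (started[w])
            pthread_join(tid[w], NULL);
        total += slice[w].partial;
    }
    return total;
}
```

A caller would replace the original serial loop with something like `sum_squares_parallel(buffer, count, 4)` and leave the rest of the legacy code alone. Giving each thread its own contiguous slice and its own result slot is one simple way to avoid shared mutable state; it is not the only possible partitioning.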