Tutorial: Programming High-Performance DSPs, Part 1
This first of a three-part series explains the features of high-performance DSPs, with a focus on VLIW pipelines and multi-level memory architectures. It shows how to write code for these advanced architectures. It also introduces Direct Memory Access (DMA), and explains how to use it.
By Rob Oshana, Texas Instruments
November 27, 2006 -- dspdesignline.com
INTRODUCTION
Many of today's digital signal processing (DSP) applications are subject to real-time constraints. And it seems many applications eventually grow to a point where they are stressing the available CPU and memory resources. Many of these applications seem like trying to fit ten pounds of algorithms into a five pound sack. Understanding the architecture of the DSP, as well as the compiler can speed up applications, sometimes by an order of magnitude. This article will summarize some of the techniques used in practice to gain orders of magnitude speed increases from high performance DSPs.
Make the common case fast
The fundamental rule in computer design as well as programming real time systems is "make the common case fast, and favor the frequent case." This is really just Amdahl's Law that says the performance improvement to be gained using some faster mode of execution is limited by how often you use that faster mode of execution. So don't spend time trying to optimize a piece of code that will hardly ever run. You won't get much out of it, no matter how innovative you are. Instead, if you can eliminate just one cycle from a loop that executes thousands of times, you will see a bigger impact on the bottom line.
By Rob Oshana, Texas Instruments
November 27, 2006 -- dspdesignline.com
INTRODUCTION
Many of today's digital signal processing (DSP) applications are subject to real-time constraints. And it seems many applications eventually grow to a point where they are stressing the available CPU and memory resources. Many of these applications seem like trying to fit ten pounds of algorithms into a five pound sack. Understanding the architecture of the DSP, as well as the compiler can speed up applications, sometimes by an order of magnitude. This article will summarize some of the techniques used in practice to gain orders of magnitude speed increases from high performance DSPs.
Make the common case fast
The fundamental rule in computer design as well as programming real time systems is "make the common case fast, and favor the frequent case." This is really just Amdahl's Law that says the performance improvement to be gained using some faster mode of execution is limited by how often you use that faster mode of execution. So don't spend time trying to optimize a piece of code that will hardly ever run. You won't get much out of it, no matter how innovative you are. Instead, if you can eliminate just one cycle from a loop that executes thousands of times, you will see a bigger impact on the bottom line.
To read the full article, click here
Related Semiconductor IP
- HBM4 PHY IP
- Ultra-Low-Power LPDDR3/LPDDR2/DDR3L Combo Subsystem
- MIPI D-PHY and FPD-Link (LVDS) Combinational Transmitter for TSMC 22nm ULP
- HBM4 Controller IP
- IPSEC AES-256-GCM (Standalone IPsec)
Related Articles
- Optimizing High Performance CPUs, GPUs and DSPs? Use logic and memory IP - Part II
- Performance optimization using smart memory controllers, Part 1
- Understanding layers in the JESD204B specification: A high speed ADC perspective, Part 1
- Optimizing high-performance CPUs, GPUs and DSPs? Use logic and memory IP - Part I
Latest Articles
- ElfCore: A 28nm Neural Processor Enabling Dynamic Structured Sparse Training and Online Self-Supervised Learning with Activity-Dependent Weight Update
- A 14ns-Latency 9Gb/s 0.44mm² 62pJ/b Short-Blocklength LDPC Decoder ASIC in 22FDX
- Pipeline Stage Resolved Timing Characterization of FPGA and ASIC Implementations of a RISC V Processor
- Lyra: A Hardware-Accelerated RISC-V Verification Framework with Generative Model-Based Processor Fuzzing
- Leveraging FPGAs for Homomorphic Matrix-Vector Multiplication in Oblivious Message Retrieval