Tutorial: Programming High-Performance DSPs, Part 1
This first of a three-part series explains the features of high-performance DSPs, with a focus on VLIW pipelines and multi-level memory architectures. It shows how to write code for these advanced architectures. It also introduces Direct Memory Access (DMA), and explains how to use it.
By Rob Oshana, Texas Instruments
November 27, 2006 -- dspdesignline.com
INTRODUCTION
Many of today's digital signal processing (DSP) applications are subject to real-time constraints. And it seems many applications eventually grow to a point where they are stressing the available CPU and memory resources. Many of these applications seem like trying to fit ten pounds of algorithms into a five pound sack. Understanding the architecture of the DSP, as well as the compiler can speed up applications, sometimes by an order of magnitude. This article will summarize some of the techniques used in practice to gain orders of magnitude speed increases from high performance DSPs.
Make the common case fast
The fundamental rule in computer design as well as programming real time systems is "make the common case fast, and favor the frequent case." This is really just Amdahl's Law that says the performance improvement to be gained using some faster mode of execution is limited by how often you use that faster mode of execution. So don't spend time trying to optimize a piece of code that will hardly ever run. You won't get much out of it, no matter how innovative you are. Instead, if you can eliminate just one cycle from a loop that executes thousands of times, you will see a bigger impact on the bottom line.
By Rob Oshana, Texas Instruments
November 27, 2006 -- dspdesignline.com
INTRODUCTION
Many of today's digital signal processing (DSP) applications are subject to real-time constraints. And it seems many applications eventually grow to a point where they are stressing the available CPU and memory resources. Many of these applications seem like trying to fit ten pounds of algorithms into a five pound sack. Understanding the architecture of the DSP, as well as the compiler can speed up applications, sometimes by an order of magnitude. This article will summarize some of the techniques used in practice to gain orders of magnitude speed increases from high performance DSPs.
Make the common case fast
The fundamental rule in computer design as well as programming real time systems is "make the common case fast, and favor the frequent case." This is really just Amdahl's Law that says the performance improvement to be gained using some faster mode of execution is limited by how often you use that faster mode of execution. So don't spend time trying to optimize a piece of code that will hardly ever run. You won't get much out of it, no matter how innovative you are. Instead, if you can eliminate just one cycle from a loop that executes thousands of times, you will see a bigger impact on the bottom line.
To read the full article, click here
Related Semiconductor IP
- JESD204E Controller IP
- eUSB2V2.0 Controller + PHY IP
- I/O Library with LVDS in SkyWater 90nm
- 50G PON LDPC Encoder/Decoder
- UALink Controller
Related Articles
- Optimizing High Performance CPUs, GPUs and DSPs? Use logic and memory IP - Part II
- Performance optimization using smart memory controllers, Part 1
- Understanding layers in the JESD204B specification: A high speed ADC perspective, Part 1
- Optimizing high-performance CPUs, GPUs and DSPs? Use logic and memory IP - Part I
Latest Articles
- System-Level Isolation for Mixed-Criticality RISC-V SoCs: A "World" Reality Check
- CVA6-CFI: A First Glance at RISC-V Control-Flow Integrity Extensions
- Crypto-RV: High-Efficiency FPGA-Based RISC-V Cryptographic Co-Processor for IoT Security
- In-Pipeline Integration of Digital In-Memory-Computing into RISC-V Vector Architecture to Accelerate Deep Learning
- QMC: Efficient SLM Edge Inference via Outlier-Aware Quantization and Emergent Memories Co-Design