Architecture-oriented C optimization, part 2: Memory and more

Here's how to optimize C to account for memory alignment, cache features, endianness, and application specific instructions.

By Eran Belaish, CEVA
dspdesignline.com (September 03, 2008)

Memory related guidelines

Alignment considerations

Architectures may allow or disallow unaligned memory access. While no special guidelines are required when unaligned memory access is allowed, if disallowed, the programmer must be careful. Ignoring alignment considerations causes severe performance issues and even malfunctions. To avoid malfunctions, all memory accesses need to be executed with the proper alignment. To improve performance, the compiler needs to be aware of the alignment of pointers and arrays in the program. Optimizing compilers normally track pointer arithmetic to identify alignment at each stage of the code in order to apply SIMD (Single Instruction Multiple Data) memory accesses and maintain correctness. In some cases the compiler can tell that a pointer alignment allows memory access optimization (for example, when a pointer to a 16-bit variable is aligned to 32 bits) and then SIMD memory operations are emitted. In other cases, the pointers are not aligned. Then the only option is to make them aligned by copying them to aligned buffers or by using the linker.

In most cases, the compiler simply cannot tell the alignment. It therefore assumes the worst case scenario and avoids memory access optimization as a consequence. To overcome this lack of information, advanced compilers offer a user interface for specifying the alignment of a given pointer. The compiler then uses this information when considering memory access optimization for the pointer. For loops with excessive memory accesses (such as copy loops), this feature allows two and even four times acceleration.

To read the full article, click here

Architecture-oriented C optimization, part 2: Memory and more

Related Semiconductor IP

Related Articles

Latest Articles

Related Articles

Architecture Oriented C Optimizations

A Multi-Objective Optimization Model for Energy and Performance Aware Synthesis of NoC Architecture

Architecture-oriented C optimization, part 1: DSP features

BBOPlace-Bench: Benchmarking Black-Box Optimization for Chip Placement

RISC-V Functional Safety for Autonomous Automotive Systems: An Analytical Framework and Research Roadmap for ML-Assisted Certification

Emulation-based System-on-Chip Security Verification: Challenges and Opportunities

A 129FPS Full HD Real-Time Accelerator for 3D Gaussian Splatting

SkipOPU: An FPGA-based Overlay Processor for Large Language Models with Dynamically Allocated Computation

TensorPool: A 3D-Stacked 8.4TFLOPS/4.3W Many-Core Domain-Specific Processor for AI-Native Radio Access Networks

Architecture-oriented C optimization, part 2: Memory and more

Subscribe to the Semi IP Hub Newsletter

Related Semiconductor IP

Related Articles

Latest Articles