Scalable UHD JPEG Decoder – Ultra-High Throughput, 8/10/12-bit per component

Overview

The UHT-JPEG-D core from Alma Technologies is a very high-performance 8-bit Baseline and 12-bit Extended JPEG decoder, designed to sustain the massive pixel rates of 4K and 8K UHD resolutions and high-frame-rate video applications in highly cost-effective FPGA and ASIC implementations.

The UHT-JPEG-D accepts the standalone, standard-compliant JPEG byte stream generated by the Alma Technologies UHT-JPEG-E encoder, or another compatible JPEG byte stream, and outputs the decoded data in interleaved raster scan format. It supports 4:4:4, 4:2:2, 4:2:0 and 4:0:0 (grayscale) image or video streams at sample depths of 8, 10 or 12 bits per component. The UHT-JPEG-D can be implemented using only on-chip memory resources, and the use of off-chip memory is also natively supported. Designed with a user-configurable architecture, the decoder scales to offer a sustained decoding throughput of 1 to 32 samples per clock cycle.
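
As a rough sizing aid, the sketch below estimates how many samples per clock cycle a given video format would need, so that the result can be compared against the 1 to 32 range stated above. The clock frequency, the function name and the decision to ignore blanking intervals are assumptions made for illustration; they are not taken from the core's documentation.

```python
import math

# Average samples per pixel for each chroma sampling format.
SAMPLES_PER_PIXEL = {"4:4:4": 3.0, "4:2:2": 2.0, "4:2:0": 1.5, "4:0:0": 1.0}
MAX_SAMPLES_PER_CLOCK = 32  # upper bound stated for the core


def required_samples_per_clock(width, height, fps, sampling, clock_hz):
    """Smallest whole number of samples per clock that sustains the format.

    Blanking intervals and interface stalls are ignored (assumption).
    """
    samples_per_second = width * height * fps * SAMPLES_PER_PIXEL[sampling]
    return math.ceil(samples_per_second / clock_hz)


if __name__ == "__main__":
    clock_hz = 300e6  # hypothetical clock frequency, not from the source
    for name, (w, h) in {"4K": (3840, 2160), "8K": (7680, 4320)}.items():
        n = required_samples_per_clock(w, h, 60, "4:2:2", clock_hz)
        verdict = "within" if n <= MAX_SAMPLES_PER_CLOCK else "beyond"
        print(f"{name} 60 fps 4:2:2: about {n} samples/clock ({verdict} 1..32)")
```

The achievable clock frequency depends on the target technology, so the same video format may call for a different throughput configuration on a low-end FPGA than on an ASIC.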

Using multiple internal processing engines, the UHT-JPEG-D core delivers the required performance through its scalable parallel architecture. The input JPEG stream is split internally into chunks, and each chunk is assigned to one of the internal decoding units. This is completely transparent to the system using the IP and hides all the parallelization complexity from the rest of the SoC. The number of internal decoding units is configurable before synthesis, so it can be adapted to the speed of the implementation technology, and non-critical resources are shared among the engines.

The UHT-JPEG-D uses a single compressed data input interface and produces raster scan interleaved decoded image or video data through a single output interface that carries multiple pixels per transfer. Its operation is completely standalone and requires no host CPU or GPU.

The UHT-JPEG-D core is designed with simple, fully controllable, FIFO-like streaming input and output interfaces. Carefully designed and rigorously verified, the UHT-JPEG-D is a reliable IP that is easy to use and integrate.
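
To illustrate what a FIFO-like streaming interface with back-pressure means in practice, here is a minimal cycle-by-cycle model of a generic valid/ready handshake of the kind used by AXI4-Stream. It is a behavioral sketch only: the function name and the signal naming are assumptions, and the core's actual port names and timing are described in its own documentation, not here.

```python
from collections import deque


def stream_transfer(words, ready_pattern):
    """Simulate a valid/ready link one clock cycle at a time.

    words: data the source wants to send, in order.
    ready_pattern: iterable of booleans, the sink's ready signal per cycle.
    Returns (received_words, cycles_used).
    """
    pending = deque(words)
    received = []
    cycles = 0
    for ready in ready_pattern:
        if not pending:
            break  # nothing left to send
        cycles += 1
        valid = True  # the source asserts valid while it holds data
        if valid and ready:
            # A word moves only in a cycle where valid and ready are both high.
            received.append(pending.popleft())
    return received, cycles


if __name__ == "__main__":
    # The sink stalls in the second cycle, so three words take four cycles.
    print(stream_transfer([0x11, 0x22, 0x33], [True, False, True, True, True]))
```

No data is lost or duplicated when the sink deasserts ready; the source simply holds the current word until the transfer is accepted.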

Key Features

  • High-Performance, Compliant and Standalone Operation
    • Ultra high throughput in low-end silicon using scalable and transparent parallel processing
    • ITU-T T.81 compliance
    • 4:4:4, 4:2:2, 4:2:0 and 4:0:0 (grayscale) image or video data decoding
    • 8-, 10- and 12-bit per component sample depth
    • Single JPEG byte stream input and single raster scan interleaved output carrying multiple pixels per transfer
    • Motion JPEG payload decoding
    • CPU-less, complete and standalone operation
  • Advanced Implementation
    • Up to 32 samples per clock cycle decoding
    • Algorithmic decoding latency of approximately 32 scan lines for 4:2:0 and 16 scan lines for all other sampling formats
    • Configurable full on-chip or mixed on/off-chip memories implementation
    • Flexible optional off-chip memory interface
      • Independent of external memory type
      • Tolerant to latencies
      • Allows for shared memory access
      • Can optionally operate on an independent clock domain
    • Avalon-ST and AXI4-Stream compliant streaming data I/O
  • Trouble-Free Technology Map and Implementation
    • Fully portable, self-contained RTL source code
    • Strictly positive edge triggered design
    • D-type only Flip-Flops
    • Safe CDC transfers when using more than one clock domain
    • No special timing constraints required
      • No false or multi-cycle paths within the same clock domain
      • No CDC transfers that need to be constrained (all CDC paths can be excluded)
  • Limitations Applying When Using a Third-Party JPEG Encoder
    • Single decoding engine operation may be needed
    • Single scan interleaved JPEG format only
    • The image dimensions declared in the SOF marker must be an integer multiple of 8
    • No DNL marker
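
The third-party encoder limitations listed above can be screened for before a stream is sent to the decoder. The following sketch walks the JPEG marker segments and reports a single non-interleaved or multi-scan layout, SOF dimensions that are not multiples of 8, and the presence of a DNL marker. The function names are hypothetical, the check is an illustration rather than a vendor tool, and `make_marker_only_stream` builds a marker skeleton for demonstration only, not a decodable image.

```python
import struct

SOI, EOI, SOS, DNL = 0xD8, 0xD9, 0xDA, 0xDC
# All SOFn markers: 0xC0..0xCF except DHT (0xC4), JPG (0xC8) and DAC (0xCC).
SOF_ALL = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}


def check_third_party_stream(data: bytes) -> list:
    """Return a list of problems found; an empty list means none detected."""
    if data[:2] != bytes([0xFF, SOI]):
        return ["stream does not start with SOI"]
    problems, scans, frame_components = [], 0, None
    pos = 2
    while pos + 1 < len(data):
        if data[pos] != 0xFF:          # entropy-coded byte, keep scanning
            pos += 1
            continue
        marker = data[pos + 1]
        if marker == 0xFF:             # fill byte before a marker
            pos += 1
            continue
        # Stuffed zero, TEM, SOI and RSTn carry no length field.
        if marker in (0x00, 0x01, SOI) or 0xD0 <= marker <= 0xD7:
            pos += 2
            continue
        if marker == EOI:
            break
        if pos + 4 > len(data):
            problems.append("truncated marker segment")
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker in SOF_ALL:
            if len(payload) < 6:
                problems.append("truncated SOF segment")
                break
            # SOF layout: precision, height (Y), width (X), component count.
            _p, height, width, frame_components = struct.unpack(">BHHB", payload[:6])
            if width % 8 or height % 8:
                problems.append(f"SOF dimensions {width}x{height} are not multiples of 8")
        elif marker == SOS:
            scans += 1
            # A fully interleaved scan carries every frame component.
            if frame_components is not None and payload[:1] != bytes([frame_components]):
                problems.append("scan is not fully interleaved")
        elif marker == DNL:
            problems.append("DNL marker present")
        pos += 2 + length
    if scans != 1:
        problems.append(f"expected exactly 1 scan, found {scans}")
    return problems


def _segment(marker: int, payload: bytes) -> bytes:
    return bytes([0xFF, marker]) + struct.pack(">H", len(payload) + 2) + payload


def make_marker_only_stream(width: int, height: int, with_dnl: bool = False) -> bytes:
    """Build a three-component marker skeleton (demo helper, not decodable)."""
    sof = struct.pack(">BHHB", 8, height, width, 3) + bytes([1, 0x22, 0, 2, 0x11, 1, 3, 0x11, 1])
    sos = bytes([3, 1, 0x00, 2, 0x11, 3, 0x11, 0, 63, 0])
    body = bytes([0xFF, SOI]) + _segment(0xC0, sof) + _segment(SOS, sos) + b"\x12\x34"
    if with_dnl:
        body += _segment(DNL, struct.pack(">H", height))
    return body + bytes([0xFF, EOI])


if __name__ == "__main__":
    print(check_third_party_stream(make_marker_only_stream(1920, 1080)))
    print(check_third_party_stream(make_marker_only_stream(1920, 1084, with_dnl=True)))
```

An empty result does not guarantee that a stream is accepted by the core, since the first limitation (single decoding engine operation) concerns the decoder configuration rather than anything visible in the markers.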

Block Diagram

[Figure: UHT-JPEG-D block diagram]

Deliverables

  • Clear text VHDL source for ASIC designs, or pre-synthesized & verified Netlist for Altera, Lattice, Microsemi and Xilinx FPGA and SoC devices
  • Release Notes, Design Specification and Integration Manual documents
  • Bit Accurate Model (BAM) and test vector generation binaries, including sample scripts
  • Self checking testbench environment, including sample BAM generated test cases
  • Simulation and sample Synthesis (for ASICs) or Place & Route (for FPGAs) scripts

Technical Specifications

Availability
NOW