Anatomy of a hardware video codec
Tensilica, Inc.
If bandwidth and storage were infinite and free, there would be no need for video compression. But bandwidth is always limited, and although the per-byte price of storage keeps falling, it is never free, so video compression remains necessary. Since its introduction in the early 1990s, video compression has grown increasingly important in the design of electronic products because it helps deliver high-quality video over limited transmission bandwidth and storage capacity.
Video compression and decompression each consist of many steps. Video encoding starts with a series of still images (frames) captured at a fixed frame rate (usually 15, 25, or 30 frames/sec). Video cameras capture images using CCD or CMOS sensors that sense red/green/blue (RGB) light intensity, but these RGB images do not directly correspond to the way the human eye works.
The human eye uses rods to sense light intensity (luma) and cones to sense color (chroma). Because it contains many more rods than cones, the eye is much more sensitive to luma than to chroma. Consequently, most video-compression systems first transform RGB pictures into luma and chroma (YUV) images. Then they downsample the chroma portion, which reduces the number of bits in the video stream even before digital compression occurs. Thus most digital video compression schemes take a series of YUV images and produce compressed data, while video decompression schemes expand a compressed video stream into a series of YUV still images.
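The color transform and chroma downsampling described above can be sketched in a few lines of Python. This is a minimal illustration, not a production converter: it assumes the full-range BT.601 weights familiar from JPEG, 2x2 averaging for the downsampling step (the layout commonly called 4:2:0), and frames with even width and height. The function names `rgb_to_yuv`, `subsample_420`, and `convert_frame` are hypothetical.

```python
def rgb_to_yuv(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, U, V).

    Assumes full-range BT.601 weights; other standards use other weights.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = (b - y) / 1.772 + 128.0  # blue-difference chroma, centered on 128
    v = (r - y) / 1.402 + 128.0  # red-difference chroma, centered on 128
    return y, u, v


def subsample_420(plane):
    """Average each 2x2 neighborhood of a chroma plane (even dimensions assumed)."""
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1]) / 4.0
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


def convert_frame(frame):
    """Turn rows of (r, g, b) tuples into a full Y plane and subsampled U, V planes."""
    y_plane, u_plane, v_plane = [], [], []
    for row in frame:
        converted = [rgb_to_yuv(r, g, b) for (r, g, b) in row]
        y_plane.append([p[0] for p in converted])
        u_plane.append([p[1] for p in converted])
        v_plane.append([p[2] for p in converted])
    return y_plane, subsample_420(u_plane), subsample_420(v_plane)
```

For a 4x4 frame, the 48 RGB samples become 16 luma samples plus 4 samples in each chroma plane, 24 in total, so the sample count is halved before any further compression is applied.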
Because video streams start with a series of still images, video compression schemes can use many of the compression techniques developed to compress still images. Many of these techniques are "lossy" (as opposed to "lossless"). Lossy compression techniques identify and discard portions of an image that cannot be perceived or are nearly invisible. One of the most successful lossy compression schemes used for still images has been the ubiquitous JPEG standard. Video streams are even better suited to lossy compression than still images because any image imperfections that result from the compression appear fleetingly and are often gone in a fraction of a second. The eye is much more forgiving with moving pictures than with still images.
Most video-compression algorithms slice pictures into small pixel blocks and then transform these blocks from the spatial domain into a series of coefficients in the frequency domain. The most common transform for this conversion has been the discrete cosine transform (DCT), first widely used for JPEG image compression. Most video-compression schemes prior to the H.264/AVC digital-video standard apply the DCT to 8x8-pixel blocks. H.264/AVC instead uses an integer approximation of the DCT on 4x4-pixel blocks (with an 8x8 option in its High profiles) and applies a Hadamard transform to the DC coefficients of chroma blocks and of intra-coded 16x16 macroblocks.
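To make the spatial-to-frequency conversion concrete, here is a minimal sketch of the 8x8 two-dimensional DCT in its direct textbook form. It assumes the orthonormal DCT-II scaling and plain floating-point arithmetic; hardware codecs use fast, fixed-point factorizations instead, and the name `dct_2d` is hypothetical.

```python
import math

N = 8  # block size used by JPEG and by most codecs that preceded H.264/AVC


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block.

    Direct O(N^4) evaluation, written for clarity rather than speed.
    """
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency index
        for v in range(N):      # horizontal frequency index
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (
                        block[x][y]
                        * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                        * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                    )
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs
```

A perfectly flat block illustrates why the transform helps: if all 64 pixels equal 100, the only nonzero output is the DC coefficient `coeffs[0][0]`, which equals 800 under this scaling, and the remaining 63 coefficients are zero up to rounding error.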
Because the eye is more sensitive to lower frequencies, the pixel blocks' frequency-domain representations can be passed through a low-pass filter, which reduces the number of bits needed to represent that block. In addition, video-compression algorithms can represent low-frequency coefficients with more precision using more bits while using fewer bits to represent the high-frequency coefficients.
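In the coefficient domain, the low-pass step amounts to discarding the coefficients that lie beyond some frequency cutoff. The sketch below is one simple way to express that, assuming a diagonal cutoff on the sum of the two frequency indices; the helper names `low_pass` and `count_nonzero` and the cutoff rule are illustrative choices, not taken from any particular standard.

```python
def low_pass(coeffs, cutoff):
    """Zero every coefficient whose index sum u + v exceeds the cutoff."""
    n = len(coeffs)
    return [
        [coeffs[u][v] if u + v <= cutoff else 0.0 for v in range(n)]
        for u in range(n)
    ]


def count_nonzero(coeffs):
    """Count the coefficients that would still need to be transmitted."""
    return sum(1 for row in coeffs for c in row if c != 0)
```

With an 8x8 block and a cutoff of 3, only the 10 coefficients in the top-left corner survive, so at most 10 of the 64 values remain to be coded.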
Coding the frequency-domain coefficients takes two steps. First, the coefficients are quantized to discrete levels using perceptual weighting to limit the number of bits needed for the coefficients. Quantized coefficients are then coded using a lossless variable-length-coding (VLC) technique that codes frequently occurring numbers with fewer bits, which again reduces the size of the video bitstream.
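The two steps can be sketched as follows. The quantizer assumes a step size that grows linearly with frequency as a stand-in for a real perceptual weighting matrix; `base_step` and `slope` are made-up parameters. The variable-length code shown is the signed Exp-Golomb code, chosen here only because it is compact to write down; many codecs use standard-defined Huffman-style tables, usually combined with zigzag scanning and run-length coding, which this sketch omits.

```python
def quantize(coeffs, base_step=8.0, slope=2.0):
    """Divide each coefficient by a frequency-dependent step, then round.

    The linear step rule is an illustrative assumption, not a standard matrix.
    """
    n = len(coeffs)
    return [
        [int(round(coeffs[u][v] / (base_step + slope * (u + v))))
         for v in range(n)]
        for u in range(n)
    ]


def signed_to_unsigned(k):
    """Map 0, 1, -1, 2, -2, ... to 0, 1, 2, 3, 4, ..."""
    return 2 * k - 1 if k > 0 else -2 * k


def exp_golomb(value):
    """Unsigned Exp-Golomb code of a non-negative integer, as a bit string."""
    bits = bin(value + 1)[2:]               # binary form of value + 1
    return "0" * (len(bits) - 1) + bits     # prefix of len(bits) - 1 zeros


def encode(levels):
    """Concatenate the signed Exp-Golomb codes of a list of quantized levels."""
    return "".join(exp_golomb(signed_to_unsigned(k)) for k in levels)
```

Small magnitudes, which dominate after quantization, get the shortest codes: a level of 0 costs one bit (`1`), levels of 1 and -1 cost three bits each (`010` and `011`), and a level of 2 costs five bits (`00100`). Encoding the levels `[0, 0, 1, -1]` therefore yields the 8-bit string `11010011`.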