The H264-HP-E core from Alma Technologies is an advanced hardware encoder for the ITU-T H.264 High profiles. It supports real-time encoding of 4:2:0 and 4:2:2 video streams at 8-, 10-, or 12-bit per-component sample depths. The core is also available in All-Intra (H264-HPI-E) and Light Motion Estimation engine (H264-HP-E-LME) encoding configurations, and supports real-time encoding of video streams up to Level 5.2. Constrained Baseline and Main profile encoding is also supported.
The encoder accepts uncompressed video data in planar, interleaved, or macroblock scan format. It outputs a standalone, standard-compliant Annex B NAL byte stream. No post-processing of the output stream is required other than, for example, storing, muxing, or transmitting it. The output NAL byte stream can be decoded as is by any ITU-T H.264 compliant decoder that satisfies the Level requirements of the stream and conforms to the corresponding ITU-T H.264 profile.
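As an illustration of what "Annex B NAL byte stream" means for downstream software, the following minimal sketch splits such a stream into NAL units by scanning for the `0x000001` start code prefix and reads the one-byte NAL header. The function names and the sample bytes are hypothetical and only for illustration; they are not part of the core's deliverables.

```python
# Minimal sketch: split an H.264 Annex B byte stream into NAL units.
# Function names and the sample stream are illustrative assumptions.

NAL_TYPE_NAMES = {
    1: "non-IDR slice",
    5: "IDR slice",
    6: "SEI",
    7: "SPS",
    8: "PPS",
    9: "access unit delimiter",
}

def split_annexb(stream: bytes) -> list:
    """Return the NAL units found in an Annex B stream, without start codes."""
    # Locate every 3-byte start code prefix 0x000001. A 4-byte start code
    # (0x00000001) is matched too, through its last three bytes.
    positions = []
    i = stream.find(b"\x00\x00\x01")
    while i >= 0:
        positions.append(i)
        i = stream.find(b"\x00\x00\x01", i + 3)

    units = []
    for n, start in enumerate(positions):
        begin = start + 3
        end = positions[n + 1] if n + 1 < len(positions) else len(stream)
        # A NAL unit never ends in 0x00, so trailing zeros belong to the
        # next start code (or are trailing padding) and can be dropped.
        nal = stream[begin:end].rstrip(b"\x00")
        if nal:
            units.append(nal)
    return units

def nal_header(nal: bytes) -> tuple:
    """Decode the first NAL byte into (nal_ref_idc, nal_unit_type)."""
    return (nal[0] >> 5) & 0x3, nal[0] & 0x1F

# Hand-made sample with three NAL units (payload bytes are arbitrary).
sample = (
    b"\x00\x00\x00\x01\x67\x64\x00\x33"
    b"\x00\x00\x00\x01\x68\xee"
    b"\x00\x00\x01\x65\x88\x80"
)
for unit in split_annexb(sample):
    ref_idc, unit_type = nal_header(unit)
    print(unit_type, NAL_TYPE_NAMES.get(unit_type, "other"), len(unit))
```

A real consumer would normally hand the stream to a decoder or muxer unchanged; the sketch only shows that the units are self-delimiting, so no side information from the encoder is needed.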
The H264-HP-E requires minimal host intervention, as it only needs to be programmed once per video sequence. Once programmed, it can encode an arbitrary number of video frames without any CPU or other support from the host system.
The H264-HP-E core implements a simple and flexible request-based external memory interface with independent read and write data paths. This makes the H264-HP-E independent of memory type, so it can operate with, for example, SRAM, SDRAM, DDR, DDR2, and DDR3 memory. The encoder is designed to tolerate the memory delays and latencies that may be present in shared-memory system architectures.
The H264-HP-E core is designed with simple, fully controllable, FIFO-like streaming input and output interfaces. Carefully designed, rigorously verified, and silicon-proven, the H264-HP-E is a reliable IP core that is easy to use and integrate.
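The flow control behind a FIFO-like streaming interface can be sketched as a ready/valid handshake, the convention used by AXI4-Stream and by Avalon-ST with ready latency 0: a word is transferred in a clock cycle only when the source asserts valid and the sink asserts ready in that same cycle. The cycle-based model below is a sketch under that assumption. The names `valid` and `ready` are the generic terms, not the core's actual port names, which are not given here.

```python
# Cycle-based sketch of a ready/valid streaming handshake.
# Signal and function names are generic assumptions, not the core's ports.

def transfer_cycles(words, ready_pattern):
    """Return (cycle, word) pairs for a source that always has data.

    ready_pattern is repeated cyclically and gives the sink's ready
    signal for each clock cycle.
    """
    if not any(ready_pattern):
        raise ValueError("sink never asserts ready; no transfer can occur")

    transfers = []
    index = 0
    cycle = 0
    while index < len(words):
        valid = True  # the source holds its word until it is accepted
        ready = ready_pattern[cycle % len(ready_pattern)]
        if valid and ready:
            # Both sides agree in this cycle, so the word moves.
            transfers.append((cycle, words[index]))
            index += 1
        cycle += 1
    return transfers

# A sink that is ready every other cycle halves the transfer rate.
print(transfer_cycles([10, 20, 30], [True, False]))
```

Because the source simply holds its data while ready is low, back-pressure from a slow consumer stalls the stream without losing words, which is the behavior a FIFO-like interface is expected to provide.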
H.264 High Profiles Encoder - High 10, High 4:2:2 and High 4:4:4 (12 bit 4:2:2 or 4:2:0) Profiles
Overview
Key Features
- Standard Compliant and Standalone Operation
- Full compliance with the ITU-T H.264 specification
- High 10, High 10 intra, High 4:2:2, High 4:2:2 intra, High 4:4:4 (12 bit 4:2:2 or 4:2:0), and High 4:4:4 intra (12 bit 4:2:2 or 4:2:0) profiles encoding
- Multi-format 4:2:0 and 4:2:2 YCbCr digital video input
- 8-, 10- and 12-bit per component sample depth encoding
- ITU-T H.264 Annex B compliant NAL byte stream output
- Profile Level up to 5.2
- Standalone operation with no host CPU assistance
- Advanced H.264 Implementation
- Algorithmic encoding latency of 16 video lines
- True H.264 compression efficiency and perceptually optimized Image Quality
- Advanced Motion Estimation
- Full search
- Variable block size
- Full, half and quarter pixel
- Up to 4 motion vectors per macroblock
- Advanced Intra prediction
- All 4 Intra 16x16 prediction modes
- All 4 Intra Chroma prediction modes
- All 9 Intra 4x4 prediction modes
- Intra in P (all prediction modes are always examined)
- High throughput implementation: Sustained 2.5 (4:2:0) or 2.75 (4:2:2) clock cycles per pixel worst case processing rate
- CABAC or CAVLC entropy coding
- CQP - VBR encoding mode
- CBR encoding mode
- Fully customizable through runtime encoding settings
- HRD CPB compliant CBR NAL output
- Intra Refresh encoding mode available for sub-frame contribution to the end-to-end latency
- On-the-fly bitrate changes supported
- Error resilient encoding options
- Multiple slices per frame encoding option
- Motion vectors can be optionally constrained within slice boundaries
- Deblocking filter can be optionally constrained within slice boundaries
- Smooth System Integration
- Full abstraction of the internal implementation details and the H.264 complexity from the top level I/O and its operation
- Simple, microcontroller-like programming interface
- High-speed, flow controllable, streaming I/O data interfaces
- Simple and FIFO-like
- Avalon-ST compliant (ready latency 0)
- AXI4-Stream compliant
- Low requirements in external memory bandwidth
- Flexible external memory interface
- Independent of external memory type
- Tolerant of latencies
- Allows for shared memory access
- Can optionally operate in an independent clock domain
- Trouble-Free Technology Map and Implementation
- Fully portable, self-contained RTL source code
- Strictly positive edge triggered design
- D-type only Flip-Flops
- Safe CDC transfers when using more than one clock domain
- No special timing constraints required
- No false or multi-cycle paths within the same clock domain
- No CDC transfers that need to be constrained (all CDC paths can be excluded)
Deliverables
- Clear text VHDL or Verilog RTL source for ASIC designs, or pre-synthesized & verified Netlist for Altera, Lattice, Microsemi and Xilinx FPGA and SoC devices
- Release Notes, Design Specification and Integration Manual documents
- Bit Accurate Model (BAM) and test vector generation binaries, including sample scripts
- Self checking testbench environment, including sample BAM generated test cases
- Simulation and sample Synthesis (for ASICs) or Place & Route (for FPGAs) scripts
Technical Specifications
Maturity
Silicon Proven
Availability
NOW
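The worst-case processing rate quoted under Key Features (2.5 clock cycles per pixel for 4:2:0, 2.75 for 4:2:2) and the 16-line algorithmic latency can be turned into rough integration estimates. The sketch below assumes that "pixel" means one picture position in the active frame, that blanking is ignored, and that a video line lasts the frame period divided by the number of active lines. The function names are hypothetical, and the results are back-of-the-envelope estimates, not vendor-guaranteed figures.

```python
# Rough sizing sketch based on the worst-case cycles-per-pixel figures.
# Assumptions: active pixels only, no blanking, hypothetical function names.

CYCLES_PER_PIXEL = {"4:2:0": 2.5, "4:2:2": 2.75}

def min_clock_hz(width, height, fps, chroma="4:2:0"):
    """Estimate the clock needed to sustain the worst-case rate."""
    return width * height * fps * CYCLES_PER_PIXEL[chroma]

def algorithmic_latency_s(height, fps, lines=16):
    """Estimate the duration of `lines` video lines of an active frame."""
    return lines / (height * fps)

# 1080p60 examples.
print(min_clock_hz(1920, 1080, 60, "4:2:0") / 1e6, "MHz")  # 311.04 MHz
print(min_clock_hz(1920, 1080, 60, "4:2:2") / 1e6, "MHz")
print(algorithmic_latency_s(1080, 60) * 1e6, "microseconds")
```

Under these assumptions, 1080p60 in 4:2:0 needs roughly 311 MHz in the worst case, and 16 lines correspond to about a quarter of a millisecond. Actual clock requirements depend on the target technology and configuration.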