Tensilica Introduces Four Video Processor Engines Including Main Profile H.264 Support

New Drop-In Diamond Standard Processors for H.264, VC-1/WMV9, MPEG-4 and MPEG-2 Video for SOC Design

SANTA CLARA, CA – December 4, 2006 – Tensilica, Inc. today introduced four new Diamond Standard VDO (ViDeO) processor engines customized for multi-standard, multi-resolution video in System-on-Chip (SOC) designs. Targeted at mobile handsets and personal media players (PMPs), these video subsystems are fully programmable to support all popular VGA and standard definition (SD, also known as D1) video codecs with resolutions up to 720x480 (NTSC) and 720x576 (PAL), including H.264 Main Profile, VC-1 Main Profile, MPEG-4 Advanced Simple Profile (ASP), and MPEG-2 Main Profile, each of which is available from Tensilica. Lower resolutions such as QCIF, QVGA, CIF and VGA are also supported.

The Diamond Standard VDO engines host all the key video processing functions in software on the cores – including the network abstraction layer, picture layer, slice layer, bit-stream parsing and entropy decoding and encoding. This includes the computationally demanding CABAC (Context-Adaptive Binary Arithmetic Coding) decoding in the H.264 Main profile decoder, which most other solutions omit, implement in a separate and complex non-programmable hardware block, or run on a general-purpose CPU at a cost of more than 700 MHz of workload, significantly increasing power consumption. By implementing CABAC in instruction set extensions, Tensilica was able to create a low-MHz, power-efficient version of CABAC in less than half the area of a typical CABAC hardware block.

The Diamond VDO family offers both Baseline and Main profile solutions – Main profile offers superior data compression and video quality and is the preferred coding scheme at resolutions of D1 and higher for advanced handset and PMP applications. Most other video solutions for SOC design only implement Baseline profile video.

Four different Diamond Standard VDO engines are being introduced to meet the varying needs of the market.

  • Diamond 381VDO – Provides decode only for the Baseline and Simple profiles, making this ideal for mainstream mobile phones, PMPs, and other mobile entertainment devices. This product provides:
    o H.264 decode – Baseline Profile @ D1, 5 Mbps, 30 fps
    o MPEG-4 decode – Simple Profile @ D1, 6 Mbps, 30 fps
    o VC-1/WMV9 decode – Simple Profile @ D1, 6 Mbps, 30 fps
    o MPEG-2 decode – Main Profile @ D1, 6 Mbps, 30 fps
  • Diamond 383VDO – Provides decode and encode for the Baseline and Simple profiles. This product works with all of the decoders used by the Diamond 381VDO plus MPEG-4 encode – Simple Profile @ D1, 6 Mbps, 30 fps.
  • Diamond 385VDO – Provides decode only for the Main and ASP profiles, making this ideal for advanced handsets and PMPs. The Diamond Standard 385VDO supports:
    o H.264 decode – Main Profile @ D1, 5 Mbps, 30 fps
    o MPEG-4 decode – Advanced Simple Profile @ D1, 6 Mbps, 30 fps
    o VC-1/WMV9 decode – Main Profile @ D1, 6 Mbps, 30 fps
    o MPEG-2 decode – Main Profile @ D1, 8 Mbps, 30 fps
  • Diamond 388VDO – Provides decode and encode for the Main profiles. This adds MPEG-4 encode – Advanced Simple Profile @ D1, 6 Mbps, 30 fps – to the decoders available for the Diamond 385VDO.
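
The four configurations above can be summarized as a small lookup table. The following Python sketch is illustrative only: the dictionary layout and the `engines_supporting` helper are hypothetical names introduced here, not part of any Tensilica deliverable, and the entries simply restate the profiles and bit rates listed above.

```python
# Illustrative summary of the lineup described above.
# The structure and all names are hypothetical; the values restate the list.
# Each entry maps (codec, direction) -> (profile, max Mbps at D1, 30 fps).

BASELINE_DECODERS = {
    ("H.264", "decode"): ("Baseline", 5),
    ("MPEG-4", "decode"): ("Simple", 6),
    ("VC-1/WMV9", "decode"): ("Simple", 6),
    ("MPEG-2", "decode"): ("Main", 6),
}

MAIN_DECODERS = {
    ("H.264", "decode"): ("Main", 5),
    ("MPEG-4", "decode"): ("Advanced Simple", 6),
    ("VC-1/WMV9", "decode"): ("Main", 6),
    ("MPEG-2", "decode"): ("Main", 8),
}

ENGINES = {
    "381VDO": dict(BASELINE_DECODERS),
    "383VDO": {**BASELINE_DECODERS, ("MPEG-4", "encode"): ("Simple", 6)},
    "385VDO": dict(MAIN_DECODERS),
    "388VDO": {**MAIN_DECODERS, ("MPEG-4", "encode"): ("Advanced Simple", 6)},
}


def engines_supporting(codec, direction, profile):
    """Return the engines whose listed profile for this codec matches exactly."""
    return [
        name
        for name, caps in ENGINES.items()
        if caps.get((codec, direction), (None, None))[0] == profile
    ]


print(engines_supporting("H.264", "decode", "Main"))
print(engines_supporting("MPEG-4", "encode", "Simple"))
```

The match is deliberately exact on the profile name as listed; the sketch makes no claim about whether an engine listed for one profile also handles streams of another.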

“We’re ready to bring these fully tested drop-in solutions to semiconductor manufacturers and system OEMs who want to develop new products with high-quality video,” stated Chris Rowen, Tensilica’s president and CEO. “We expect this to be as successful as our HiFi Audio Engine, which has been designed into dozens of mobile handset devices including the new lineup of Motorola KRZR and RIZR phones.”

Architecture Leverages Xtensa Processor Technology

To build the new Diamond Standard VDO family, Tensilica used its Xtensa configurable and extensible processor technology to create a dual-processor subsystem block, complete with an integrated DMA engine, that delivers full D1 Main profile decoding and ASP encoding at extremely low clock rates (needing only 172 MHz for full H.264 Main profile decode, and only 156 MHz for MPEG-4 ASP decode).
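
To put the quoted clock rates in context, a rough per-macroblock cycle budget can be worked out. The sketch below assumes 16x16-pixel macroblocks (the unit used by H.264 and MPEG-4), a 720x480 frame at 30 fps, and treats the quoted MHz figure as a single clock budget. The function name is hypothetical, and the result is an average budget, not a measured figure.

```python
def cycles_per_macroblock(clock_mhz, width=720, height=480, fps=30, mb_size=16):
    """Average clock cycles available per macroblock at a given clock rate."""
    # Ceiling division so partial macroblocks at the frame edge are counted.
    mbs_wide = -(-width // mb_size)
    mbs_high = -(-height // mb_size)
    mbs_per_second = mbs_wide * mbs_high * fps
    return clock_mhz * 1_000_000 / mbs_per_second


h264_budget = cycles_per_macroblock(172)   # H.264 Main profile decode
mpeg4_budget = cycles_per_macroblock(156)  # MPEG-4 ASP decode

print(f"H.264 Main @ 172 MHz: about {h264_budget:.0f} cycles per macroblock")
print(f"MPEG-4 ASP @ 156 MHz: about {mpeg4_budget:.0f} cycles per macroblock")
```

Under these assumptions a D1 stream carries 45 x 30 = 1,350 macroblocks per frame, or 40,500 per second, so each quoted clock rate leaves a budget of roughly four thousand cycles per macroblock.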

The Diamond VDO dual-core architecture includes one Xtensa processor configured as a Stream Processor and another as a Pixel Processor. The Stream Processor instruction set is optimized for serial processing of video data (entropy decoding, motion-vector prediction, etc.). The Stream Processor requires 32K bytes of local data memory and 40K bytes of local instruction memory. The instruction width is optimized to 32 bits. The Pixel Processor instruction set is optimized for parallel processing of pixel data using SIMD (single instruction, multiple data) techniques. The Pixel Processor requires 40K bytes of local data memory and only 24K bytes of local instruction memory. Inter-processor communication is via a 128-bit interface, and the external video engine interface is through two 32-bit buses.
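
The local memory figures above add up as follows. This is a simple tally in Python; the dictionary and variable names are hypothetical, and only the four sizes come from the text.

```python
# Local memory sizes in K bytes, as stated for the two cores.
LOCAL_MEMORY_KB = {
    "stream": {"data": 32, "instruction": 40},
    "pixel": {"data": 40, "instruction": 24},
}

# Total local memory attached to each core.
per_core_kb = {core: sum(mem.values()) for core, mem in LOCAL_MEMORY_KB.items()}

# Total local memory across the dual-core subsystem.
total_kb = sum(per_core_kb.values())

for core, kb in per_core_kb.items():
    print(f"{core} processor: {kb}K bytes")
print(f"subsystem total: {total_kb}K bytes")
```

The tally gives 72K bytes for the Stream Processor, 64K bytes for the Pixel Processor, and 136K bytes of local memory overall.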

Tensilica defined over 400 video-specific instructions in the Diamond VDO series to significantly boost performance compared to general-purpose DSPs or general-purpose 32-bit microprocessors. These instructions are optimized for the most performance-intensive algorithms used in video processing, including: CABAC, which achieves higher compression in H.264 Main profile video; CAVLC (Context-Adaptive Variable-Length Coding), which is a lower-complexity compression algorithm used in the H.264 Baseline and Main profiles; deblocking, which reduces the appearance of block-like artifacts that appear in highly compressed video streams; transforms, which perform spatial compression, analogous to JPEG; and motion compensation and motion estimation, algorithms used to achieve high image quality at lower bit rates.

Tensilica Supplies Full Software Suite Including Decoders and Encoders

Tensilica has developed encoders and decoders for the new Diamond VDO engines, so these are complete solutions with the hardware and software available directly from Tensilica. SOC designers do not need to rely on third-party application providers. Tensilica also provides a complete matching software development toolchain, including an advanced integrated development environment based on the Eclipse framework, a world-class compiler, a cycle-accurate SystemC-compatible instruction set simulator, and the full industry-standard GNU toolchain. In addition, Tensilica’s wide partner network provides operating systems, debug probes, ICE solutions, and other support needed to help get Tensilica’s processors designed in quickly.

The Flexibility of Processor-Based Video Decoding

These new Diamond VDO engines compare quite favorably to the traditional approach of using pure hardware-based video accelerators in tandem with conventional CPUs. First, the Diamond VDO cores offload the full video decode task – including all bit-stream parsing – from the system host CPU. Conventional hardware accelerators only offload the pixel processing functions like motion estimation, and leave a large compute burden (often more than 100 MHz of continuous host CPU overhead) on the system controller.

Second, conventional solutions consisting of a CPU plus a hardware accelerator burn a huge amount of wasted power in the system bus when shuffling data to and from the CPU and accelerator – power that is often conveniently not counted by other IP vendors that boast that their HW accelerator block itself burns only a small amount of power.

Third, when the Diamond VDO engines are not being used to perform video tasks, they are a ready resource of over 500 Dhrystone MIPS of general-purpose CPU power available to perform other system tasks – whereas a dedicated video HW block can never be reused.

Fourth, the Diamond VDO engines are programmable and, therefore, can host future video standards that emerge in the coming years.

And finally, the Diamond VDO engines deliver all these benefits in a compact footprint, consuming as little as 8 mm² (including processor logic and attached local memories) in 130nm silicon processes.

Low Area, Power Solutions for SOC Design

The Diamond Standard VDO family is optimized for mobile applications and requires a smaller area and consumes less power than competing solutions. Active power is further minimized through fine-grained clock gating, a feature of the Xtensa processor architecture, and through power management instructions that let software throttle power under varying video workloads. Additional power efficiency comes from the DMA engine and its interface to the Stream and Pixel Processors, which minimize the external memory bandwidth requirements.

In terms of area efficiency, for example, the Diamond 383VDO consumes only 10 mm², including memories. The full-featured Diamond 388VDO delivers full Main profile H.264 support for decode and MPEG-4 ASP encode at D1 resolution yet consumes only 12 mm², including memories, and runs at 200 MHz in TSMC 0.13G process technology. The Diamond 388VDO, while running the “Foreman” video stream at 200 MHz, consumes 52.8 mW plus 20.9 mW for the memory, for a total power consumption of 73.7 mW, based on TSMC 0.13G pre-layout with wireload estimates.
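
The power figures above can be checked and normalized with a few lines of arithmetic. The Python sketch below uses only the three quoted numbers; the variable names are hypothetical, and the mW/MHz and memory-share values are derived here rather than stated by Tensilica.

```python
# Quoted figures for the Diamond 388VDO running "Foreman" at 200 MHz.
CORE_MW = 52.8      # processor logic
MEMORY_MW = 20.9    # local memories
CLOCK_MHZ = 200

total_mw = CORE_MW + MEMORY_MW          # should match the quoted 73.7 mW
mw_per_mhz = total_mw / CLOCK_MHZ       # derived: power normalized to clock rate
memory_share = MEMORY_MW / total_mw     # derived: fraction of power in memories

print(f"total power:  {total_mw:.1f} mW")
print(f"normalized:   {mw_per_mhz:.4f} mW/MHz")
print(f"memory share: {memory_share:.1%}")
```

The sum reproduces the quoted 73.7 mW total. The normalized figure is only meaningful for this workload and these pre-layout estimates.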

Developed with Ittiam Systems

The Diamond Standard 38xVDO engine hardware and software is a joint development of Tensilica and Ittiam Systems (www.ittiam.com). Ittiam has a long and well-established track record as a leader in providing digital media processing and communication Intellectual Property (IP) solutions. “We have leveraged our extensive video processing experience to enhance the performance and range of Diamond Standard VDO engines. We look forward to this partnership with Tensilica to enable system-on-chip customers for next generation video,” commented Srini Rajam, chairman and chief executive officer, Ittiam Systems.

Pricing and Availability

All of the Diamond Standard VDO processor subsystems are delivered as Verilog RTL in a single block. Software deliverables include an XTMP (XTensa Modeling Protocol) C-level model, API source code, and a full set of software tools. The Diamond Standard VDO family pricing starts at $300,000 for a single-use license for the Diamond 381VDO. Decoders and encoders are priced separately starting at $66,000 for the MPEG-2 decoder. The Diamond 388VDO hardware and all software will be available by the end of March 2007. These are available directly through Tensilica and its distribution partners, including Fujitsu Microelectronics America, Global Unichip Corp., and NEC Electronics America, Inc.

About Tensilica

Tensilica offers the broadest line of controller, CPU and specialty DSP processors on the market today, in both an off-the-shelf format via the Diamond Standard Series cores and with full designer configurability with the Xtensa processor family. Tensilica’s low-power, benchmark-proven processors have been designed into high-volume products at industry leaders in the digital consumer, networking and telecommunications markets. All Tensilica processor cores are complete with a matching software development tool environment, portfolio of system simulation models, and hardware implementation tool support. For more information on Tensilica’s patented approach to the creation of application-specific building blocks for SOC design, visit www.tensilica.com.
