Real-Time HD
Harmonic Inc.
Real Time, Single Chip High Definition Video Encoder!
December 22, 2004
Real-Time HD CONFIDENTIAL

Agenda
- TVP2000 Processor overview
- Architecture highlights
- Performance Benchmarks
- Software tools
- Status
- Roadmap

Telairity’s Market
Leading supplier of video chips for the Broadcast, Professional and Digital Imaging markets

EagleEye - Video Encoder Module
Block diagram labels (DIMM MODULE): TVP2000 Video Processor; Voltage Regulator; DRAM 512MB DDR2; SPI Clk – 67.5 MHz – 135 Mbps Serial; 20 Bit YCbCr; Interrupt; +5 Volts; Video In; Compressed Video Out; Reset; 20 Bit YCbCr Reconstructed Video Out

TVP2000 - Video Processor
Block diagram labels: Video Controller; 512 MB DRAM (8-DDR2); Processor P0 TVP400; Processor P1 TVP400; Processor P2 TVP400; Processor P3 TVP400; Processor P4 TVP400; DMA & SDRAM Controller; 128 bit; Bit Packing Unit

TVP400 – Vector DSP Core
Block diagram labels: I/O Interface; Vector Registers; 128KB VECTOR SRAM; Scalar Registers; Scalar Unit; 8 KB Scratch; 32KB I Cache; 4 KB D Cache; DMA Controller; PIO Controller; Vector Units
Bus and bandwidth labels: bit; 8GB/s; DRAM; 8GB/s; 48GB/s; 12GB/s; 24GB/s; 8GB/s

H.264 Partitioning & Performance Budget

Sub-sample                           2%
Motion estimation                   40%
Transform & Quantization             5%
Transform size & rate control        6%
Reorder                              2%
Entropy coding                      20%
Inverse quantization & transform     5%
De-blocking filter                   4%
Up-sample                            2%
System control                       4%
Total                               90%

TVP Performance Benchmark
- Motion estimation ~50% of problem
- Typically implemented in a programmable machine
- Hardwired approaches are not necessarily applicable
- N-Step Search algorithm was chosen:
  - Exposes the need for a “Sum of Absolute Differences” compound instruction
  - Exposes the cache memory line splitting problem
  - Exposes the cache memory line replacement efficiency
  - Exposes the inherent parallelism available in the algorithm
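
To make the bullets above concrete, here is a minimal Python sketch of the "Sum of Absolute Differences" cost and of the line-splitting condition. It is an illustration only, not the TVP instruction or Telairity's code: the function names are hypothetical, pixels are assumed to be one byte each, and the 64-byte cache line is an assumed value that the slides do not state.

```python
def row_sad(cur_row, ref_row):
    """Sum of absolute differences over one row of pixels."""
    return sum(abs(c - r) for c, r in zip(cur_row, ref_row))


def block_sad(cur_block, ref_block):
    """SAD of two equally sized blocks.

    Each row contributes an independent partial sum, which is the
    parallelism a vector unit can exploit.
    """
    return sum(row_sad(c, r) for c, r in zip(cur_block, ref_block))


def row_straddles_cache_line(start_byte, row_bytes, line_bytes=64):
    """True if a row of row_bytes bytes starting at start_byte crosses a
    cache line boundary (line_bytes=64 is an assumption, not a TVP figure)."""
    return (start_byte % line_bytes) + row_bytes > line_bytes
```

For example, `block_sad([[1, 2], [3, 4]], [[2, 2], [0, 8]])` evaluates to 8. A 16-pixel row that starts at byte offset 60 straddles two 64-byte lines, while one that starts at offset 48 does not. Because a candidate block can start at any horizontal offset in the reference frame, such splits are hard to avoid during a search.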

TVP Entropy Coding CABAC
Cycle count for Binarization of Arithmetic Encoding
- 8 – 4*4 Transform blocks, 9 non zero coefficients
- Benchmark done on:
  - TVP2000 1GHz, Apogee C compiler only and with vector intrinsics
  - AMD Opteron 2.4GHz, GCC -O2 compiler
Results: TVP GHz C only 201 GHz w/ Vector intrinsics AMD 2.4GHz
TVP2000 chip is ~49 times more powerful than AMD Opteron chip for Binarization of CABAC encoding
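
For readers unfamiliar with the binarization step being benchmarked, the sketch below shows one common case: the UEG0 binarization that H.264 uses for `coeff_abs_level_minus1` (a truncated unary prefix with cutoff 14, followed by a 0th-order Exp-Golomb suffix). The slides do not say which syntax elements were measured or how the TVP code is written, so treat this as an assumed, illustrative example; the Python function names are hypothetical.

```python
def truncated_unary(value, c_max):
    """Truncated unary: `value` ones, then a terminating zero
    unless value has reached c_max."""
    bins = [1] * value
    if value < c_max:
        bins.append(0)
    return bins


def exp_golomb_suffix(value, k):
    """k-th order Exp-Golomb suffix in the form used by UEGk."""
    bins = []
    while value >= (1 << k):
        bins.append(1)
        value -= 1 << k
        k += 1
    bins.append(0)
    # Emit the remaining k bits of value, most significant first.
    for shift in range(k - 1, -1, -1):
        bins.append((value >> shift) & 1)
    return bins


def binarize_coeff_abs_level_minus1(value, cutoff=14, k=0):
    """UEG0 bin string for a non-negative coefficient level (minus 1)."""
    prefix = truncated_unary(min(value, cutoff), cutoff)
    if value < cutoff:
        return prefix
    return prefix + exp_golomb_suffix(value - cutoff, k)
```

For small levels the bin string is just a short unary code, for example `binarize_coeff_abs_level_minus1(3)` gives `[1, 1, 1, 0]`. The data-dependent loops and bit-by-bit output are what make this step awkward for a general-purpose compiler.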

Scalable Encoders
Chart labels: Broadcast Applications; Video Quality; 4:2:2, 10b; TVP2000; 4:2:2, 8b; 4:2:0, 8b

N-Step-Search Algorithm
This algorithm is most widely known in its three-step form, the three-step search (TSS).
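
A minimal Python sketch of the textbook three-step search follows, assuming frames stored as lists of rows, an 8x8 block, and step sizes 4, 2, 1 (a search range of ±7 pixels). The names `block_sad` and `three_step_search` and all parameters are illustrative choices, not taken from the TVP software.

```python
def block_sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of cur at (bx, by) and the block of
    ref displaced by (dx, dy)."""
    total = 0
    for j in range(n):
        for i in range(n):
            total += abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
    return total


def three_step_search(cur, ref, bx, by, n=8, step=4):
    """Return (dx, dy, cost) for the block of cur at (bx, by).

    At each step the 8 neighbours at distance `step` around the current
    best position are tested, then the step is halved.
    """
    height, width = len(ref), len(ref[0])
    best_dx, best_dy = 0, 0
    best_cost = block_sad(cur, ref, bx, by, 0, 0, n)
    while step >= 1:
        centre_dx, centre_dy = best_dx, best_dy
        for oy in (-step, 0, step):
            for ox in (-step, 0, step):
                if ox == 0 and oy == 0:
                    continue  # centre was already evaluated
                dx, dy = centre_dx + ox, centre_dy + oy
                # Skip candidates that fall outside the reference frame.
                if bx + dx < 0 or by + dy < 0:
                    continue
                if bx + dx + n > width or by + dy + n > height:
                    continue
                cost = block_sad(cur, ref, bx, by, dx, dy, n)
                if cost < best_cost:
                    best_cost, best_dx, best_dy = cost, dx, dy
        step //= 2
    return best_dx, best_dy, best_cost
```

With these defaults the search evaluates at most 25 positions instead of the 225 of an exhaustive ±7 search. The trade-off is that the search is greedy: it follows the best candidate at each step and can settle on a local minimum rather than the true best match.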