Low-Power Wireless Video System Advisor: Professor Alex Doboli Students: Christian Austin Artur Kasperek Edward Safo.

Objective Establish a low-power wireless client/server streaming video system. Use a multimedia standard amenable to wireless networks. Apply hardware/software co-design techniques to reduce the power used by the system’s clients.

Hardware/Software Co-Design A design methodology that partitions a computer system’s functionality between hardware and software in order to improve a chosen property of the system. In this design, partitioning targets low power consumption. This is achieved by moving the functionality of high-power sections of code to specialized hardware.

Project Flow Decide on a multimedia standard. Software. Hardware. Functional testing and hardware power analysis. Design software from scratch. Find and analyze existing software. Isolate high power sections of software for a hardware port. Determine a hardware architecture. Hardware tuning for lower power consumption.

Multimedia Standard MPEG-4 was a good match for the system’s requirements. What is MPEG-4? Object-based video compression and decoding standard. New object-based compression technique compresses objects rather than frames. Objects are distinct entities in a scene; information can be associated with each one. Builds on previous MPEG and H.263 standards.

MPEG-4 Framework

Why Use MPEG-4? Non-proprietary standard. High compression makes streaming over low-bandwidth networks (e.g. wireless) practical. Adjustable resolution coding allows a trade-off between video continuity and quality. High bit-rate yields better quality video at the expense of lost frames… Robust error resilience over noisy channels. Emerging standard. Superset of previous MPEG standards.

Object-Based Compression Video scenes are defined as a composition of objects in space at an instant in time. Object color is defined by pixel chrominance and luminance values; shape is defined by an alpha mask. An object together with its bounding rectangle is called a Video Object Plane (VOP). Each object is compressed separately, which is the main reason for improved compression. The block-based encoding scheme is extended to handle arbitrarily shaped objects.

Compression Illustration Transparent Macroblocks. Carry no information. Boundary Macroblocks. Compressed using block based scheme after padding. Opaque Macroblocks. Compressed as is using block based scheme.
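The three macroblock classes above can be illustrated with a small sketch. It assumes a binary alpha mask (0 outside the object, nonzero inside), 16x16 macroblocks, and mask dimensions that are multiples of 16; the function name `classify_macroblocks` and the label strings are hypothetical and are not taken from the MPEG-4 reference software.

```python
from typing import List

MB = 16  # assumed macroblock size in pixels (16x16 luminance samples)


def classify_macroblocks(alpha: List[List[int]]) -> List[List[str]]:
    """Label each 16x16 macroblock of a binary alpha mask.

    alpha[y][x] is 0 outside the object and nonzero inside it.
    Height and width are assumed to be multiples of 16.
    """
    rows, cols = len(alpha) // MB, len(alpha[0]) // MB
    labels = []
    for by in range(rows):
        row_labels = []
        for bx in range(cols):
            # Count the pixels of this macroblock that belong to the object.
            inside = sum(
                1
                for y in range(by * MB, (by + 1) * MB)
                for x in range(bx * MB, (bx + 1) * MB)
                if alpha[y][x]
            )
            if inside == 0:
                row_labels.append("transparent")  # carries no information
            elif inside == MB * MB:
                row_labels.append("opaque")       # coded as is
            else:
                row_labels.append("boundary")     # padded, then coded
        labels.append(row_labels)
    return labels


# Toy 32x48 mask: the object covers columns 16..39 in every row.
mask = [[1 if 16 <= x < 40 else 0 for x in range(48)] for _ in range(32)]
labels = classify_macroblocks(mask)
```

In this toy mask each row of macroblocks contains one block of each class, so only the middle and right-hand blocks would reach the block-based coder.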

Software Decisions Used open-source MPEG-4 client and server software. Darwin Streaming Server by Apple. MPEG4IP, an open-source project at SourceForge. Why open source? Implementation of a video server was not an objective. Design of software from scratch was not practical given the time constraints.

Locating Power-Intensive Code Hardware power measurement. Accurate measurement requires expensive hardware. Power measurement using software. Instruction-level power estimation. SimplePower, developed at Penn State. Software profiling. Gives no direct power measurements, so the search for high-power sections of code begins in the computationally intensive areas reported by a profiler such as gprof or Visual Studio.
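The profiling step can be sketched as follows. This is only an illustration of the idea: it uses Python's built-in `cProfile` as a stand-in for gprof or Visual Studio, and the toy functions `parse_header`, `idct_block`, and `decode_frame` are hypothetical placeholders, not code from MPEG4IP.

```python
import cProfile
import math
import pstats

# Cosine table computed once, outside the profiled region.
COS = [[math.cos((2 * x + 1) * u * math.pi / 16) for u in range(8)]
       for x in range(8)]


def parse_header(n):
    """Hypothetical stand-in for light bitstream parsing."""
    return n & 0xFF


def idct_block(F):
    """Hypothetical hot spot: unnormalized direct 8x8 transform loop."""
    out = [[0.0] * 8 for _ in range(8)]
    for x in range(8):
        for y in range(8):
            acc = 0.0
            for u in range(8):
                for v in range(8):
                    acc += F[u][v] * COS[x][u] * COS[y][v]
            out[x][y] = acc
    return out


def decode_frame(num_blocks):
    """Hypothetical decoder loop: a little parsing, then one IDCT per block."""
    block = [[1.0] * 8 for _ in range(8)]
    for i in range(num_blocks):
        parse_header(i)
        idct_block(block)


profiler = cProfile.Profile()
profiler.enable()
decode_frame(30)
profiler.disable()

stats = pstats.Stats(profiler)
# stats.stats maps (file, line, function name) to
# (primitive calls, total calls, total time, cumulative time, callers).
ranked = sorted(stats.stats.items(), key=lambda kv: kv[1][2], reverse=True)
hottest = ranked[0][0][2]  # name of the function with the largest own time
```

The function at the top of the ranking is only a candidate for a hardware port; as the slide notes, time spent is a proxy for power, not a measurement of it.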

The Inverse Discrete Cosine Transform (IDCT) Highly utilized code. Used each time a macroblock is decoded. Computationally Intensive. Inherent nested loop structure. High frequency of memory accesses. Results in elevated power consumption.
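The nested loop structure mentioned above is visible in a direct implementation of the 8x8 inverse DCT. The sketch below uses the standard textbook definition with floating-point arithmetic; it is not the MPEG4IP routine, and the function name `idct_2d_direct` is hypothetical.

```python
import math

N = 8  # block size


def idct_2d_direct(F):
    """Direct 8x8 inverse DCT; F[u][v] holds the frequency coefficients."""

    def c(k):
        # Normalization factor for the DC term.
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for x in range(N):
        for y in range(N):
            acc = 0.0
            # Every output sample sums over all 64 input coefficients.
            for u in range(N):
                for v in range(N):
                    acc += (c(u) * c(v) * F[u][v]
                            * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                            * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[x][y] = acc / 4
    return out


# A block with only a DC coefficient decodes to a flat block of about F[0][0] / 8.
F = [[0.0] * N for _ in range(N)]
F[0][0] = 80.0
pixels = idct_2d_direct(F)
```

Written this way, each 8x8 block costs 64 x 64 = 4096 inner-loop iterations, each of which reads a coefficient from memory, which is consistent with the slide's point about computation and memory traffic.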

IDCT in an MPEG-4 Decoder An MPEG-4 decoder consists of more than the IDCT

Hardware Requirements An economical FPGA with a large equivalent gate count. A fast interface to the FPGA. The hardware will implement a time-critical function of an MPEG-4 decoder. Peripheral memory, which the FPGA can use as a buffer for IDCT blocks.

Spartan-II 200 PCI Board 200,000-gate-equivalent Xilinx Spartan-II FPGA. 32-bit PCI interface. 8 MB on-board memory. JTAG interface. ISP PROM.

PCI Core PCI was the best solution for a high transfer rate interface. Need to interface the IDCT design to the PCI bus. Xilinx LogiCORE provides a PCI front end for the IDCT design. Abstracts the details of the PCI specification away from the IDCT design.

Hardware Implementation IDCT hardware design considerations. Low power is primary concern, but design size and speed are also important. Procedure. Design an IDCT architecture in terms of a functional unit block diagram. Code the design in VHDL. Write a driver with an API that maps to the hardware’s functions. Synthesize and place and route the design.

IDCT Architecture Decodes an 8X8 block of DCT coefficients. Uses onboard memory as a buffer for fetching and storing inputs. Less CPU intervention. Performs two 1-D IDCTs. First half of the data path performs a 1-D IDCT on each row vector of the 8X8 input matrix. Row results are stored in an 8X8 matrix, transposed, and used as inputs to the second half of the data path. Second half of the data path performs another 1-D IDCT on each of the column vectors of its 8X8 input matrix, completing the 2-D IDCT of the block.
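The two-pass data path described above can be modeled in software as a row pass, a transpose, and a column pass. The sketch below is a floating-point behavioral model only, not the VHDL design; the names `idct_1d`, `transpose`, and `idct_2d_row_column` are hypothetical.

```python
import math

N = 8  # block size


def idct_1d(vec):
    """8-point 1-D inverse DCT with the usual 1/2 scaling."""
    out = []
    for x in range(N):
        acc = 0.0
        for u in range(N):
            cu = 1 / math.sqrt(2) if u == 0 else 1.0
            acc += cu * vec[u] * math.cos((2 * x + 1) * u * math.pi / (2 * N))
        out.append(acc / 2)
    return out


def transpose(m):
    """Swap rows and columns of a square matrix."""
    return [list(col) for col in zip(*m)]


def idct_2d_row_column(F):
    """2-D IDCT as two 1-D passes separated by a transpose."""
    rows_done = [idct_1d(row) for row in F]         # first half of the data path
    cols_in = transpose(rows_done)                  # transpose buffer
    cols_done = [idct_1d(col) for col in cols_in]   # second half of the data path
    return transpose(cols_done)                     # restore [x][y] orientation


# A DC-only block decodes to a flat block of about F[0][0] / 8.
F = [[0.0] * N for _ in range(N)]
F[0][0] = 80.0
pixels = idct_2d_row_column(F)
```

The separable form needs sixteen 8-point transforms of 64 multiply-accumulates each, 1024 in total, compared with 4096 summation terms for a direct 2-D evaluation.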

Architecture Block Diagram

Architecture Features Pipelined design for increased throughput and power reduction. Exploits symmetry of the IDCT coefficient matrix. Breaks the 8X8 matrix operation into two 4X4 matrix operations and butterfly operations. Parallel multiply and addition operations perform the two 4X4 matrix multiplications in parallel. Speeds up the IDCT’s repetitive matrix operations.
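The slides do not give the exact factorization, so the sketch below assumes the standard even/odd decomposition of the 8-point IDCT: the even-indexed and odd-indexed coefficients each pass through a 4X4 matrix, and a butterfly (sum and difference) stage combines the two results. All names here are hypothetical.

```python
import math


def _coef(n, k):
    """Entry (n, k) of the 8-point IDCT matrix, including the 1/2 scaling."""
    ck = 1 / math.sqrt(2) if k == 0 else 1.0
    return 0.5 * ck * math.cos((2 * n + 1) * k * math.pi / 16)


# Two 4x4 matrices: one for even-indexed inputs, one for odd-indexed inputs.
EVEN = [[_coef(n, 2 * j) for j in range(4)] for n in range(4)]
ODD = [[_coef(n, 2 * j + 1) for j in range(4)] for n in range(4)]


def matvec4(m, v):
    """4x4 matrix times 4-vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]


def idct_1d_butterfly(X):
    """8-point IDCT from two 4x4 products plus a butterfly stage."""
    e = matvec4(EVEN, X[0::2])  # contribution of X[0], X[2], X[4], X[6]
    o = matvec4(ODD, X[1::2])   # contribution of X[1], X[3], X[5], X[7]
    top = [e[n] + o[n] for n in range(4)]     # outputs 0..3
    bottom = [e[n] - o[n] for n in range(4)]  # outputs 7, 6, 5, 4
    return top + bottom[::-1]


def idct_1d_reference(X):
    """Full 8x8 matrix-vector product, for comparison."""
    return [sum(_coef(n, k) * X[k] for k in range(8)) for n in range(8)]


X = [10.0, -3.0, 4.0, 0.0, 1.0, 2.0, -5.0, 7.0]
fast = idct_1d_butterfly(X)
slow = idct_1d_reference(X)
```

The decomposition works because the cosine term for output 7-n equals the term for output n multiplied by (-1)^k, so the two 4X4 products use 32 multiplications where the full matrix uses 64, and they are independent of each other, which is what allows them to run in parallel.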

Power Reduction Clock isolation. Add logic to isolate sections of the design from the clock when they are not in use. Glitch reduction. Balance the number of synthesized logic levels. Duplicate resources instead of sharing them. Increase the number of pipeline registers.
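As a rough guide to why these techniques help, the standard first-order model of dynamic power in CMOS logic can be written as below. The symbols are not from the slides and are introduced here as assumptions: \alpha is the average switching activity, C_L the switched capacitance, V_{DD} the supply voltage, and f the clock frequency.

\[
P_{\text{dyn}} = \alpha \, C_L \, V_{DD}^{2} \, f
\]

Under this model, clock isolation lowers \alpha for logic that is idle, and glitch reduction removes spurious transitions that would otherwise raise \alpha without doing useful work.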

Goals and Applications Demonstrate that a low-power wireless video system is practical. Design for a power-constrained, low-bandwidth PDA. Applications: Interactive shopping. Request video of product information while shopping. Multimedia preview. Preview a movie before buying or renting; watch a music video while previewing a new album.

Any Questions?