Case Study: Implementing the MPEG-4 AS Profile on a Multi-core System on Chip Architecture R92921054 楊峰偉 R92942035 張哲瑜 R92942081 陳 宸.

Outline
Introduction
Comparison of MPEG-4 SP & ASP
HiBRID-SoC Multi-core Architecture
  – HiPAR-DSP
  – Macroblock processor
  – Stream processor
Conclusion

Introduction
Dedicated Architecture
  – Single-purpose
  – Lower flexibility, higher performance
Hybrid Architecture
  – Programmable CPU + dedicated hardware accelerator
  – Higher flexibility, lower performance
FPGA or DSP
  – Fully programmable
  – Slow and only for evaluation

Comparison of MPEG-4 SP & ASP

HiBRID-SoC Multi-core Architecture

HiPAR-DSP
16 parallel data paths steered by a single RISC controller in SIMD style.
Each data path consists of three VLIW-controlled arithmetic units: a 16-bit MAC, a 32-bit ALU, and a shift & round unit.
External connections can be provided via a modular DMA controller.
A GNU-based C/C++ compiler is available.
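The SIMD idea on this slide can be illustrated with a small software model: one instruction, issued once, performs the same multiply-accumulate in all 16 data paths. This is only a sketch. The helper names (`wrap`, `simd_mac`) are hypothetical, and the 32-bit accumulator width is an assumption that is not stated on the slide.

```python
LANES = 16  # number of parallel data paths named on the slide


def wrap(value, bits):
    """Wrap an integer to a signed two's-complement value of the given width."""
    mask = (1 << bits) - 1
    value &= mask
    return value - (1 << bits) if value >> (bits - 1) else value


def simd_mac(acc, a, b):
    """One SIMD step: every lane computes acc + a * b under the same instruction.

    Operands are treated as 16-bit signed values. The accumulator is
    assumed to be 32 bits wide (an assumption made for this sketch).
    """
    assert len(acc) == len(a) == len(b) == LANES
    return [
        wrap(acc[i] + wrap(a[i], 16) * wrap(b[i], 16), 32)
        for i in range(LANES)
    ]


acc = [0] * LANES
a = list(range(LANES))   # lane i holds the value i
b = [2] * LANES          # the same coefficient in every lane

acc = simd_mac(acc, a, b)  # first step: acc[i] = 2 * i
acc = simd_mac(acc, a, b)  # second step: acc[i] = 4 * i
print(acc)
```

The controller issues `simd_mac` once per step, and each lane works only on its own slice of the data.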

Architecture: HiPAR-DSP

Architecture: Matrix Memory

Macroblock processor
Shift with round to 0 / ∞, unsigned and signed
  – Transform, filter (QMC, deblocking)
Average value with rounding control
  – Sub-pel motion compensation
Addition of absolute value
Controlled Addition/Subtraction
  – Dequantization
Permute instruction
  – Motion compensation, deblocking
Branch on vector status registers
  – Deblocking mode selection
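Two of the operations listed for the macroblock processor can be sketched in scalar form. This sketch reads "round to 0 / ∞" as truncation toward zero versus round-to-nearest with ties away from zero, which is an interpretation and not something the slide states. The function names are hypothetical. The average follows the usual half-pel form `(a + b + 1 - rounding_control) >> 1`.

```python
def shift_round(value, shift, mode="inf"):
    """Right shift of a signed integer with selectable rounding.

    mode="zero": truncate toward zero.
    mode="inf":  round to nearest, ties away from zero.
    Both modes are an assumed reading of the slide's "round to 0 / inf".
    """
    if shift == 0:
        return value
    sign = -1 if value < 0 else 1
    magnitude = abs(value)
    if mode == "inf":
        magnitude += 1 << (shift - 1)  # add half of the divisor before shifting
    return sign * (magnitude >> shift)


def average(a, b, rounding_control=0):
    """Average of two pixel values for sub-pel interpolation.

    rounding_control=0 rounds half-integer results up,
    rounding_control=1 rounds them down.
    """
    return (a + b + 1 - rounding_control) >> 1


print(shift_round(-7, 2, "zero"))   # -7 / 4 truncated toward zero
print(shift_round(6, 2, "inf"))     # 6 / 4 rounded to nearest
print(average(1, 2, 0), average(1, 2, 1))
```

Alternating the rounding control between frames is a common way to keep interpolation errors from accumulating in one direction.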

Macroblock processor

Stream processor
Application
  – Audio/Video stream generation and separation
Characteristics in MPEG encoding
  – Multiplexing of different parts of bitstream
  – Run-Length coding of DCT coefficients
  – Variable length coding of coded DCT coefficients (using Huffman table)
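The run-length step named above can be sketched as follows. The sketch assumes the coefficients have already been quantized and scanned into a one-dimensional list, and it emits (last, run, level) events in the style used by MPEG-4 texture coding. The function names are hypothetical, and the variable length (Huffman) table lookup that would follow is left out.

```python
def run_length_encode(coeffs):
    """Convert scanned coefficients into (last, run, level) events.

    run   = number of zeros preceding a nonzero coefficient
    level = value of that nonzero coefficient
    last  = 1 on the final nonzero coefficient of the block, else 0
    Trailing zeros are implied by the last flag and are not coded.
    """
    events = []
    run = 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            events.append([0, run, c])
            run = 0
    if events:
        events[-1][0] = 1  # mark the final nonzero coefficient
    return [tuple(e) for e in events]


def run_length_decode(events, length):
    """Rebuild the coefficient list; length is the block size (e.g. 64)."""
    coeffs = []
    for _last, run, level in events:
        coeffs.extend([0] * run)
        coeffs.append(level)
    coeffs.extend([0] * (length - len(coeffs)))  # restore trailing zeros
    return coeffs


block = [5, 0, 0, -2, 1, 0, 0, 0]  # toy 8-entry block, not a real 8x8 block
events = run_length_encode(block)
print(events)
print(run_length_decode(events, len(block)) == block)
```

Each event would then be mapped to a codeword by the variable length coder, with short codewords assigned to the most frequent events.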

Software development environment
Optimizing assemblers are available
Data parallelism via SIMD or subword parallelism
Instruction parallelism via VLIW
Special instructions optimized for video and image processing algorithms
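Subword parallelism can be illustrated with a small software model in which four 8-bit values share one 32-bit word and are added with a single pass of word-wide operations. This is a generic sketch of the technique, not the instruction set of any core in the presentation. The helper names are hypothetical.

```python
def pack(lanes):
    """Pack four 8-bit values into one 32-bit word, lane 0 in the low byte."""
    assert len(lanes) == 4 and all(0 <= v <= 255 for v in lanes)
    return lanes[0] | (lanes[1] << 8) | (lanes[2] << 16) | (lanes[3] << 24)


def unpack(word):
    """Split a 32-bit word back into its four 8-bit lanes."""
    return [(word >> (8 * i)) & 0xFF for i in range(4)]


def swar_add8(x, y):
    """Add four packed 8-bit lanes modulo 256 with no carry between lanes.

    The low 7 bits of every lane are added in one word-wide addition.
    The top bit of each lane is then fixed up with XOR, so a carry out
    of one lane never reaches its neighbour.
    """
    low = (x & 0x7F7F7F7F) + (y & 0x7F7F7F7F)
    return (low ^ ((x ^ y) & 0x80808080)) & 0xFFFFFFFF


x = pack([250, 1, 128, 255])
y = pack([10, 2, 128, 1])
print(unpack(swar_add8(x, y)))  # each lane wraps independently at 256
```

A hardware subword unit does the same thing by cutting the carry chain at lane boundaries, which is why one adder can serve several pixels per cycle.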

Simulation Result

Implementation Result

PSoC Architecture

Comparison Between PSoC and HiBRID
PSoC (Programmable SoC)
  – Array of Analog Blocks
  – Array of Digital Blocks
  – General Purpose Architecture
HiBRID
  – More suitable for Multimedia Applications
  – Three cores for different classes of functionality

Conclusion
Use the appropriate DSP architecture for each application.
Multiple codecs can be efficiently implemented on a single platform.
Hybrid SoC architecture is the optimal solution for various kinds of video and image applications.

Reference
“The M-PIRE MPEG-4 codec DSP and its macroblock engine,” in Proc IEEE Int. Symp. Circuits Syst., pp. II
“HIPAR-DSP 16, A Scalable Highly Parallel DSP Core For System On A Chip Video And Image Processing Applications,” in Proc IEEE Int. Conf. Acoust. Speech Signal Processing, May
ARM Ltd. (1999, May). AMBA Specification Rev [Online]. Available:
“Instruction set extensions for MPEG-4 video,” J. VLSI Signal Processing Syst., vol. 23, pp., Oct
“VLSI Architecture for MPEG-4,” Peter Pirsch, Mladen Berekovic, Hans-Joachim Stolberg, and Jörn Jachalsky.
“Open multimedia application platform: enabling multimedia applications in third generation wireless terminals through a combined RISC/DSP architecture,” Jamil Chaoui, Ken Cyr, Sebastien de Gregorio, Jean-Pierre Giacalone, Jennifer Webb, Yves Masse.
“HIBRID-SOC: A multi-core architecture for image and video applications,” M. Berekovic, S. Flügel, H.-J. Stolberg, L. Friebe, S. Moch, M. B. Kulaczewski, P. Pirsch.