Parallelizing Video Transcoding With Load Balancing On Cloud Computing. Song Lin, Xinfeng Zhang, Qin Y, Siwei Ma. Circuits and Systems, 2013 IEEE.

Presentation transcript:

Outline: Introduction; Related work; Problem formulation and system architecture; Proposed method; Experiment results; Conclusion

Introduction #1: Parallel programming. Shared memory (Pthreads): data dependency. Message passing (MPI): time delay.

Introduction #2: Issues. Data dependency; cost of data passing; load balance.

Introduction #3: Cloud computing. Data segmentation; computing capacity.

Introduction #4: GOP-based encoding. Groups of pictures (GOPs) are independent of one another.

Introduction #5: Parallelizing GOP-based encoding. [Figure: GOPs distributed across Thread1, Thread2, and Thread3]

Related work #1: FCFS (first come, first served) [2]. Easy to implement, but the load balancing problem still exists.
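The load imbalance under FCFS can be illustrated with a minimal Python sketch. It assumes one common reading of FCFS (each task, in arrival order, goes to whichever core becomes free first) and assumes a task of complexity C runs for C / P on a core of capacity P. The function name and the data are hypothetical and are not taken from [2].

```python
import heapq

def fcfs_schedule(complexities, capacities):
    """Assign tasks, in arrival order, to whichever core becomes free first.

    complexities: task complexities in arrival order (hypothetical units)
    capacities:   computing capacity of each core
    Returns (assignment, finish_times).
    """
    # Min-heap of (time the core becomes free, core index); all cores start free.
    free_at = [(0.0, k) for k in range(len(capacities))]
    heapq.heapify(free_at)
    assignment = []
    finish = [0.0] * len(capacities)
    for c in complexities:
        t, k = heapq.heappop(free_at)
        t += c / capacities[k]  # assumed execution time: complexity / capacity
        finish[k] = t
        assignment.append(k)
        heapq.heappush(free_at, (t, k))
    return assignment, finish
```

With complexities [1, 1, 10] on two equal cores, the large task lands on a core that has already done work, so one core finishes at time 11 while the other idles after time 1.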

Related work #3: MCT (minimal completion time) [6]. Map-Reduce-based.
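As a rough illustration of the MCT idea, the sketch below assigns each task to the core on which it would complete earliest. It is a generic greedy sketch, not the Map-Reduce implementation of [6]; the function name and the execution-time model (complexity divided by capacity) are assumptions.

```python
def mct_schedule(complexities, capacities):
    """Greedy minimal-completion-time assignment (illustrative sketch).

    Each task goes to the core where it would finish earliest, given the
    work already queued on that core.
    """
    finish = [0.0] * len(capacities)
    assignment = []
    for c in complexities:
        # Completion time of this task on core i if it were assigned there.
        k = min(range(len(capacities)),
                key=lambda i: finish[i] + c / capacities[i])
        finish[k] += c / capacities[k]
        assignment.append(k)
    return assignment, finish
```

For example, complexities [4, 3, 2] on cores of capacity [2, 1] yield the assignment [0, 1, 0], with both cores finishing at time 3.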

Problem formulation and system architecture #1: Load balancing problem on cloud computing. The formulation considers executing time, delay time, and data passing; C is complexity and P is computing capacity.

Problem formulation and system architecture #2: The overall completion time of set S_k. Goal.
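The slide's equations did not survive in this transcript. One plausible form, offered only as an assumption, takes the executing time of task j on core k to be C_j / P_k, lumps delay and data passing into a hypothetical term d_{jk}, and states the goal as minimizing the latest finish time over all cores:

```latex
T_k = \sum_{j \in S_k} \left( \frac{C_j}{P_k} + d_{jk} \right),
\qquad
\text{goal:}\quad \min_{S_1,\dots,S_m} \; \max_{1 \le k \le m} T_k
```

Here S_k is the set of tasks assigned to core k and m is the number of cores; the authors' exact notation may differ.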

Problem formulation and system architecture #3: Optimal solution, where n is the number of tasks and m is the number of cores.
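To make the role of n and m concrete, the following sketch finds an optimal assignment by exhaustive search over all m^n task-to-core mappings. This is only a way to see why the optimum is impractical beyond tiny instances, not the authors' method; the function name and the execution-time model (complexity divided by capacity) are assumptions.

```python
from itertools import product

def optimal_makespan(complexities, capacities):
    """Exhaustively search all m**n assignments for the smallest makespan."""
    n, m = len(complexities), len(capacities)
    best_time, best_assign = float("inf"), None
    for assign in product(range(m), repeat=n):  # m**n candidate assignments
        finish = [0.0] * m
        for j, k in enumerate(assign):
            finish[k] += complexities[j] / capacities[k]
        makespan = max(finish)  # the slowest core determines completion
        if makespan < best_time:
            best_time, best_assign = makespan, assign
    return best_time, best_assign
```

With 3 tasks and 2 cores there are only 8 candidates, but 20 tasks on 8 cores already give 8^20 of them.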

Problem formulation and system architecture #4: Flow chart of the proposed method.

Problem formulation and system architecture #5: For video coding, if the input sequence contains instantaneous decoder refresh (IDR) frames, the video coding task can be divided into sub-tasks at those frames [7].
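A minimal sketch of this splitting step is shown below. It assumes the frame types are already known as a list of strings, with "IDR" marking refresh frames; the function name and the representation are hypothetical, and a real transcoder would read frame types from the bitstream.

```python
def split_at_idr(frame_types):
    """Split a frame sequence into sub-tasks, each starting at an IDR frame.

    Returns a list of segments, each a list of frame indices. Leading frames
    before the first IDR (if any) form their own segment.
    """
    segments = []
    for i, ftype in enumerate(frame_types):
        if ftype == "IDR" or not segments:
            segments.append([])  # start a new sub-task
        segments[-1].append(i)
    return segments
```

For the sequence IDR, P, B, IDR, P, IDR this gives three sub-tasks covering frames [0, 1, 2], [3, 4], and [5].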

Problem formulation and system architecture #6: For complexity estimation of video transcoding tasks, the existing algorithms [8] [9] can be utilized.

Proposed method #1: The framework includes three modules: task pre-analysis, adaptive threshold segmentation, and minimal finish time.

Proposed method #2: The threshold of segmentation.

Proposed method #3

Proposed method #4: The optimal finish time. The finish time.

Proposed method #5: Assign all the tasks sequentially in descending order of complexity. For each unassigned task j, the cores are examined in descending order of computing capacity according to the following criterion: assuming task j is assigned to core k, if T_k ≤ T_thr, the assignment is accepted. Otherwise, the next core is examined.

Proposed method #6: If all the cores have been traversed and every resulting finish time exceeds T_thr, task j is assigned by the MCT algorithm, and T_thr is updated to T_k, the new finish time of the receiving core.
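The assignment procedure on the two slides above can be sketched as follows. Several details are assumptions rather than the authors' definitions: the initial threshold is taken as total complexity divided by total capacity (an idealized balanced finish time), execution time is modeled as complexity divided by capacity, launching delay and data passing are ignored, and the function name is hypothetical.

```python
def threshold_schedule(complexities, capacities):
    """Threshold-based assignment with MCT fallback (illustrative sketch)."""
    finish = [0.0] * len(capacities)
    assignment = {}
    # Assumed initial threshold: perfectly balanced finish time.
    t_thr = sum(complexities) / sum(capacities)
    # Tasks in descending complexity, cores in descending capacity.
    tasks = sorted(range(len(complexities)), key=lambda j: -complexities[j])
    cores = sorted(range(len(capacities)), key=lambda k: -capacities[k])
    for j in tasks:
        for k in cores:
            t_k = finish[k] + complexities[j] / capacities[k]
            if t_k <= t_thr:  # criterion met: accept this core
                break
        else:
            # No core met the threshold: fall back to MCT and raise T_thr.
            k = min(cores,
                    key=lambda i: finish[i] + complexities[j] / capacities[i])
            t_k = finish[k] + complexities[j] / capacities[k]
            t_thr = t_k
        finish[k] = t_k
        assignment[j] = k
    return assignment, finish
```

With complexities [4, 3, 2] and capacities [2, 1], every task fits under the initial threshold of 3 and both cores finish at time 3. With three tasks of complexity 5 on two unit-capacity cores, the third task triggers the MCT fallback and the threshold rises to 10.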

Proposed method #7

Experiment results #1

Experiment results #2

Experiment results #3

Conclusion: The load balancing problem is NP-hard. The proposed algorithm is strongly robust to task launching delay.