CloudStream: delivering high-quality streaming videos through a cloud-based SVC proxy. Zixia Huang, Chao Mei, Li Erran Li, Thomas Woo


CloudStream: delivering high-quality streaming videos through a cloud-based SVC proxy. Zixia Huang, Chao Mei, Li Erran Li, Thomas Woo. This paper was presented as part of the Mini-Conference at IEEE INFOCOM 2011. Speaker: Lin-you Wu

Outline
I. INTRODUCTION
II. METRICS AFFECTING STREAMING QUALITY
III. CLOUD-BASED VIDEO TRANSCODING FRAMEWORK
IV. PERFORMANCE EVALUATION

I. INTRODUCTION

Existing media providers such as YouTube and Hulu deliver videos by turning delivery into a progressive download. This can result in frequent video freezes under varying network dynamics.

Users are demanding uninterrupted delivery of increasingly higher-quality videos over the Internet, on both wireline and wireless networks. Internet media providers (like YouTube or Hulu) have taken the easy way out and changed the problem to that of a progressive download via a content distribution network.

In such a framework, providers use a non-adaptive codec, and delivery variability is ultimately handled by freezing playback, which significantly degrades the user experience. In this paper, we propose and study the development of an H.264/SVC (Scalable Video Coding) based video proxy situated between the users and media servers.

This proxy can adapt to changing network conditions using scalable layers at different data rates. Its two major functions are: (1) video transcoding from original formats to SVC; (2) video streaming to different users under Internet dynamics.

Because of codec incompatibilities, a video proxy has to decode an original video into an intermediate format and re-encode it to SVC. While the video decoding overhead is negligible, the encoding process is so complex that transcoding is relatively slow even on a modern multicore processor.

Specifically, our proxy solution partitions a video into clips and maps them to different compute nodes (instances) configured with one or multiple CPUs in order to achieve encoding parallelization.

In general, video clips with the same duration can demand different computation time (Fig. 1) because of the video content heterogeneity.

In performing encoding parallelization in the cloud, there are three main issues to consider. First, multiple video clips can be mapped to compute nodes at different times (Map time) because of the availability of cloud computation resources and the heterogeneity in the computation overhead of previous clips (Fig. 2).

Second, the default first-task first-serve scheme in the cloud can introduce unbalanced computation load on different nodes. Third, the transcoding component should not speed up video encoding at the expense of degrading the encoded video quality.

Our contributions. We present CloudStream: a cloud-based SVC proxy that delivers high-quality Internet streaming videos. We focus on its transcoding component design in this paper.

We design a multi-level parallelization framework under the computation resource constraints with two mapping options (Hallsh-based Mapping and Lateness-first Mapping) to increase the transcoding speed and reduce the transcoding jitters without losing the encoded video quality.

II. METRICS AFFECTING STREAMING QUALITY

A. Attributes of Streaming Quality

Video freezes are caused by the unavailability of new video data at its scheduled playback time, due to the combined contribution of transcoding and streaming jitters.

For the i-th video clip, we use $c_i$ to denote the Reduce time (the actual completion time). Each video clip has an encoding time $p_i$. To enable real-time transcoding, the expected encoding completion time (expected Reduce time) $d_i$ of the i-th clip is defined as $d_i = p_e + (i-1) \times \Delta T$, where $p_e$ is the expected encoding time of a video clip.
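The deadline formula can be made concrete with a small sketch. The function names and the sample numbers below are hypothetical, and treating ΔT as a fixed per-clip interval is an assumption, since the slides do not define it further. Lateness is taken as $c_i - d_i$, the quantity later compared against τ.

```python
from typing import List


def expected_deadlines(p_e: float, delta_t: float, count: int) -> List[float]:
    """Return d_i = p_e + (i - 1) * delta_t for clips i = 1..count."""
    return [p_e + (i - 1) * delta_t for i in range(1, count + 1)]


def lateness(completions: List[float], deadlines: List[float]) -> List[float]:
    """Return c_i - d_i per clip; a positive value means the clip finished late."""
    return [c - d for c, d in zip(completions, deadlines)]


if __name__ == "__main__":
    # Hypothetical values: p_e = 2.0 s, delta_t = 0.5 s, three clips.
    d = expected_deadlines(2.0, 0.5, 3)
    print(d)
    print(lateness([1.5, 3.0, 3.0], d))
```

A clip with positive lateness is one whose data may not be ready by the time it is needed, which is the transcoding contribution to a video freeze.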

B. Metrics Characterizing Video Content

The characteristics of the video content can affect the transcoding speed, which in turn determines the streaming quality. We use two cost-effective metrics in ITU-T P.910 to describe the video content heterogeneity:

The temporal motion metric TM and the spatial detail metric SD. TM can be captured by the differences of pixels at the same spatial location of two pictures, while SD is computed from the differences of all spatially neighboring pixels within a picture.
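As a rough illustration of the two metrics, the sketch below computes a temporal and a spatial measure from luma frames stored as nested lists. It is a simplified stand-in, not the P.910 procedure: P.910 defines its spatial and temporal information measures with a Sobel filter and standard deviations, whereas this sketch uses mean absolute differences. The function names are hypothetical.

```python
from typing import List

Frame = List[List[int]]  # luma samples, one inner list per row


def temporal_motion(prev: Frame, curr: Frame) -> float:
    """Mean absolute difference of co-located pixels in two pictures."""
    diffs = [abs(a - b)
             for row_prev, row_curr in zip(prev, curr)
             for a, b in zip(row_prev, row_curr)]
    return sum(diffs) / len(diffs)


def spatial_detail(frame: Frame) -> float:
    """Mean absolute difference between horizontal and vertical neighbors."""
    diffs = []
    for y, row in enumerate(frame):
        for x, value in enumerate(row):
            if x + 1 < len(row):          # right neighbor
                diffs.append(abs(value - row[x + 1]))
            if y + 1 < len(frame):        # lower neighbor
                diffs.append(abs(value - frame[y + 1][x]))
    return sum(diffs) / len(diffs) if diffs else 0.0


if __name__ == "__main__":
    previous = [[0, 0], [0, 0]]
    current = [[1, 1], [1, 3]]
    print(temporal_motion(previous, current))
    print(spatial_detail(current))
```

Higher values of either measure suggest content that is likely to take longer to encode, which is why clips of equal duration can need different computation time.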

III. CLOUD-BASED VIDEO TRANSCODING FRAMEWORK

A. Multi-level Encoding Parallelization

We propose our parallelization framework. Previous studies have parallelized SVC encoding on a single multi-core computer, with and without GPU support, but none of them provides real-time support.

SVC coding structures. SVC can divide a video into non-overlapping, coding-independent groups of pictures (GOPs). Each picture may include several layers with different spatial resolutions and encoding qualities.

Each layer in a picture can be divided into one or more coding-independent slices. Each slice includes a sequence of non-overlapping 16x16 macro-blocks (MBs) for luma components and 8x8 MBs for chroma components.

Inter-node and intra-node parallelism. To exploit the compute nodes returned by the cloud to their full potential, we need to parallelize the encoding scheme across different compute nodes (inter-node parallelism), which do not share memory space.

Choice of our parallelization scheme. Based on the discussion above, we propose our multi-level encoding parallelization scheme. The idea of our scheme is shown in Fig. 3. Since the coding-independent GOPs have the largest work granularity, encoding at the GOP level is an ideal candidate for parallelization across different compute nodes.

B. Intra-node Parallelism

Because a cloud node will not return part of the encoded SVC data until the whole GOP is encoded, inter-node parallelism alone is not sufficient to shorten the access time.

We keep $T_{GOP}(M)$ (using M parallel slices) below the upper bound $T_{th}$. Each GOP includes $S_G$ pictures (each with L spatial and quality layers in the JSVM configuration), i.e., $T_{pic}(M) \times S_G < T_{th}$, where $T_{pic}(M) = \sum_{i=1}^{L} T_{pic,i}(M)$. $T_{pic,i}(M)$ can be approximated by: $T_{pic,i}(M) = T_{slice,i}(M) \times N_{slice,i} / M + \Delta T_i$

Hence given $S_G$ and $T_{MB,i}(M) = T_{MB,i}(1)$ (because we do not parallelize across or within MBs in a slice), the minimum M that can satisfy $T_{th}$ is:
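The closed-form expression is not reproduced in this transcript. As a minimal sketch of the same constraint, the code below searches for the smallest M with $S_G \times \sum_i (T_{slice,i} \times N_{slice,i}/M + \Delta T_i) < T_{th}$. It assumes the per-slice time does not depend on M, in line with the statement that MB-level work is not parallelized. The function names, the search cap `m_max`, and the sample numbers are hypothetical.

```python
from typing import List, Optional


def picture_time(m: int, slice_times: List[float],
                 slice_counts: List[int], overheads: List[float]) -> float:
    """T_pic(M): sum over layers of T_slice,i * N_slice,i / M + delta T_i."""
    return sum(t * n / m + dt
               for t, n, dt in zip(slice_times, slice_counts, overheads))


def min_parallel_slices(s_g: int, t_th: float, slice_times: List[float],
                        slice_counts: List[int], overheads: List[float],
                        m_max: int = 64) -> Optional[int]:
    """Smallest M in 1..m_max with T_pic(M) * S_G < T_th, else None."""
    for m in range(1, m_max + 1):
        if picture_time(m, slice_times, slice_counts, overheads) * s_g < t_th:
            return m
    return None  # the bound cannot be met within m_max parallel slices


if __name__ == "__main__":
    # Hypothetical two-layer configuration, 8 pictures per GOP, 1 s bound.
    print(min_parallel_slices(8, 1.0, [0.01, 0.02], [4, 8], [0.005, 0.005]))
```

The search returns `None` when the per-layer overhead terms alone already exceed the bound, since adding more parallel slices cannot reduce them.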

C. Inter-node Parallelism

While inter-node parallelism is adopted to achieve real-time transcoding, variations in GOP encoding time can introduce transcoding jitters.

Our goal is to minimize both the transcoding jitters and the number of compute nodes (denoted as N) used for inter-node parallelism. We consider two solutions: Hallsh-based Mapping (HM) and Lateness-first Mapping (LFM).

Solution I: Hallsh-based Mapping (HM)

We first fix an upper bound τ on the transcoding jitter and then compute the minimal N that satisfies it. In this paper, we present a lightweight algorithm called minMS2approx that uses an existing identical-machine scheduling algorithm as a black box. We refer to that scheduling algorithm as Hallsh.

1. We pick $\epsilon = \min_i\{(d_i - p_i)/\tau\}$.
2. We run the Hallsh algorithm, increasing the number of machines k until the maximum lateness $\max_i\{l_i^{(k)}\}$ among all Q jobs using k machines satisfies $\max_i\{l_i^{(k)}\} < (1 + \epsilon) \times \tau$. We set the machine number at this point to be K, i.e., $\max_i\{l_i^{(K)}\} < (1 + \epsilon) \times \tau$ and $\max_i\{l_i^{(K-1)}\} \geq (1 + \epsilon) \times \tau$. We flag the machines 1, 2, ..., K as used.

3. The Hallsh algorithm returns the scheduling results of all jobs given K. For a job with $c_i - d_i > \tau$ on a particular machine j, we move it, along with all later jobs on the same machine j, to a new machine K + j. We flag machine K + j as used. We then compute the new completion time $c_i$ for all jobs on machine K + j.

4. N is computed as the number of used machines. We claim that the algorithm is a 2-approximation algorithm. Lemma 3.2: The optimal number of machines $N_{opt}$ for the minMS problem is at least K.

Proof. K is the first value that satisfies $\max_i\{l_i^{(K)}\} < (1 + \epsilon) \times \tau$, which gives $N_{opt} > K - 1$. Because $N_{opt}$ is an integer, $N_{opt} \geq K$.
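Steps 1 to 4 can be sketched as follows. The slides treat Hallsh as a black box, so this sketch swaps in a simple earliest-deadline-first list scheduler (`edf_schedule`) in its place; that substitution, the assumption that all jobs are available at time 0, and every function name here are hypothetical and not taken from the paper.

```python
from typing import Dict, List, Tuple

Job = Tuple[float, float]  # (encoding time p_i, deadline d_i)


def edf_schedule(jobs: List[Job], k: int) -> List[List[int]]:
    """Stand-in for Hallsh: EDF order, each job goes to the earliest-free machine."""
    free = [0.0] * k
    plan: List[List[int]] = [[] for _ in range(k)]
    for i in sorted(range(len(jobs)), key=lambda idx: jobs[idx][1]):
        m = min(range(k), key=lambda j: free[j])
        plan[m].append(i)
        free[m] += jobs[i][0]
    return plan


def max_lateness(jobs: List[Job], plan: List[List[int]]) -> float:
    """Largest c_i - d_i over all scheduled jobs."""
    late: Dict[int, float] = {}
    for machine in plan:
        t = 0.0
        for i in machine:
            t += jobs[i][0]
            late[i] = t - jobs[i][1]
    return max(late.values())


def min_ms_2approx(jobs: List[Job], tau: float) -> Tuple[int, List[List[int]]]:
    eps = min((d - p) / tau for p, d in jobs)            # step 1
    for k in range(1, len(jobs) + 1):                    # step 2: find K
        plan = edf_schedule(jobs, k)
        if max_lateness(jobs, plan) < (1 + eps) * tau:
            break
    else:
        raise ValueError("no machine count up to Q meets the bound")
    extra: List[List[int]] = []                          # step 3
    for machine in plan:
        t = 0.0
        for pos, i in enumerate(machine):
            t += jobs[i][0]
            if t - jobs[i][1] > tau:
                extra.append(machine[pos:])              # late job and all later ones
                del machine[pos:]
                break
    used = plan + extra
    return len(used), used                               # step 4: N = used machines


if __name__ == "__main__":
    print(min_ms_2approx([(2.0, 3.0)] * 4, tau=0.5))
```

Because at most one new machine is opened per original machine, the sketch never uses more than 2K machines, which matches the shape of the 2-approximation claim. Jobs moved to a new machine restart from time 0 here, which is a simplification.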

Solution II: Lateness-first Mapping (LFM)

We first compute the minimal N based on the deadline of each job and then design a job scheduling scheme to minimize τ given N compute nodes. 1) Deciding the minimum N: To allow real-time video transcoding, we should satisfy: $T_{pic}(M) \times R < S_G \times N$

IV. PERFORMANCE EVALUATION

We set the input as P video GOPs, each with 8 pictures. Each picture contains 4 temporal layers, 2 spatial layers (856x480 and 428x240) and 1 quality layer. According to our parallelization scheme, each GOP is mapped to an available compute node in the cloud for encoding.

To evaluate the performance of inter-node parallelism, we set $p_e$ (defined in Section II.A) to the maximum estimated GOP processing time. We then compute the corresponding $d_i$, which is used in both HM and LFM.