1
ShadowStream: Performance Evaluation as a Capability in Production Internet Live Streaming Networks
CS598 Advanced Multimedia Systems Fall 2017 Prof. Klara Nahrstedt
2
Acknowledgments This deck of slides was created for educational purposes; several slides are borrowed, in part or in their entirety, from the authors' SIGCOMM 2012 conference slides.
3
Today's Paper ShadowStream: Performance Evaluation as a Capability in Production Internet Live Streaming Networks. Keywords: performance, layered design, live streaming.
4
Authors Yale University, Google, PPLive. Published at SIGCOMM 2012, Helsinki, Finland.
5
Motivation Live streaming is a major Internet application today
Evaluation of live streaming: lab/testbed experiments, simulation, and modeling trade off scalability against realism; live testing in the production network offers both.
6
Challenges Protection: real viewers' QoE; failures must be masked from real viewers. Orchestration: orchestrating desired experimental scenarios (e.g., a flash crowd) without disturbing QoE.
7
Modern Live Streaming Complex hybrid systems: a peer-to-peer network plus a content delivery network (CDN). The P2P side is BitTorrent-like: a tracker connects peers watching the same channel into an overlay network topology; the basic unit of exchange is a piece.
8
Modern Live Streaming Modules: P2P topology management, CDN management, buffer and playpoint management, rate allocation, download/upload scheduling, viewer interfaces, shared-bottleneck management, flash-crowd admission control, network friendliness.
9
Metrics Piece missing ratio: fraction of pieces not received by their playback deadline. Channel supply ratio: total bandwidth capacity (CDN + P2P) divided by total streaming bandwidth demand.
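A compact restatement of the two metrics (the notation is ours, following the definitions above):

```latex
% Metric definitions restated from the slide above; the notation is ours.
\[
\text{piece missing ratio} =
  \frac{\#\{\text{pieces not received by their playback deadline}\}}
       {\#\{\text{pieces scheduled for playback}\}}
\]
\[
\text{channel supply ratio} =
  \frac{\text{total bandwidth capacity (CDN + P2P)}}
       {\text{total streaming bandwidth demand}}
\]
```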
10
Misleading Results: Small Scale
Emulab: 60 clients vs. 600 clients. Supply ratio: 1.67 at the small scale vs. 1.29 at the large scale. A content bottleneck emerges at the larger scale!
11
Misleading Results: Small Scale
With a connection limit (bounded by peer upload bandwidth), the CDN server's neighbor connections are exhausted by the clients that join earlier. Both experiments illustrate that observations at a smaller scale do not translate to a larger scale.
12
Misleading Results: Missing Realistic Features
Network diversity: network connectivity, amount of network resources, network protocol implementations, router policies, background traffic.
13
Misleading Results: Missing Realistic Features
LAN-like network vs. ADSL-like network: hidden buffers matter, since ADSL has larger buffers but limited upload bandwidth.
14
Testing in an environment that differs from the real setting can give different results.
15
System Architecture ShadowStream makes two extensions to a traditional live streaming network: it introduces a lightweight experiment orchestrator, which extends the functions of a typical tracker to coordinate experiments, and it introduces new capabilities in traditional streaming clients to support live testing. ShadowStream complements analysis, simulation, and testbeds to provide a more complete experimentation framework.
16
Streaming Machine A streaming machine is a self-complete set of algorithms to download and upload pieces; a client can run multiple streaming machines, including the experiment machine (E). The play buffer is the key data structure: it keeps track of pieces that are already downloaded as well as pieces that still need to be downloaded from neighbors, bounded by the source point and the playpoint. In the slide's figure, the shaded pieces are already downloaded; piece 91 still needs to be downloaded, and if the client cannot download it in time, piece 91 is said to be missed.
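A minimal sketch of such a play buffer under the description above; the class and method names (PlayBuffer, markDownloaded, advancePlaypoint) are illustrative, not ShadowStream's API.

```cpp
#include <cstdint>
#include <unordered_set>

class PlayBuffer {
public:
    PlayBuffer(uint64_t sourcePoint, uint64_t playPoint)
        : sourcePoint_(sourcePoint), playPoint_(playPoint) {}

    // Record a piece downloaded from a neighbor or the CDN.
    void markDownloaded(uint64_t piece) { downloaded_.insert(piece); }

    // Advance the playpoint by one piece; returns true if the piece at the
    // old playpoint was still missing at its playback deadline.
    bool advancePlaypoint() {
        bool missed = downloaded_.count(playPoint_) == 0;
        ++playPoint_;
        return missed;
    }

    // New pieces appear at the source point as the live stream advances.
    void advanceSourcePoint() { ++sourcePoint_; }

    uint64_t playPoint() const { return playPoint_; }
    uint64_t sourcePoint() const { return sourcePoint_; }

private:
    uint64_t sourcePoint_;                    // newest piece produced by the source
    uint64_t playPoint_;                      // next piece handed to the media player
    std::unordered_set<uint64_t> downloaded_; // pieces already in the buffer
};
```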
17
Analogy to real-time systems
Each piece ~ a task; its deadline ~ the playpoint. A sequence of tasks arrives, each needs to be computed (downloaded), and the results are visible to the user. A piece missing its deadline (playpoint) ~ a failure.
18
R+E to Mask Failures Another streaming machine, repair (R), protects the viewer's QoE. A failure of E should not be visible to the end user, so ShadowStream uses the concept of a virtual playpoint.
19
R+E to Mask Failures The virtual playpoint introduces a slight delay to hide failures from real viewers. One choice is R = rCDN: repair using dedicated CDN resources, which can become a bottleneck.
20
R = production Use the production streaming engine as the repair machine: it has fine-tuned algorithms (hybrid architecture), a larger resource pool, more scalable protection, and it already serves clients before the experiment starts. Using production leads to a much more unified approach for handling both protection and orchestration.
21
Problem of R = production
Systematic bias: the experiment and production compete for shared resources, and because protecting QoE gives production higher priority, the experiment's performance is underestimated.
22
PCE R = P + C, where C is a CDN (rCDN) with bounded resources δ and P is production.
23
PCE rCDN as a filter: a small filter is inserted between experiment and production to handle pieces missed by the experiment. Another view is that it "lowers" the experiment's piece-missing-ratio curve, as seen by production, down by δ.
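One way to write the filter effect just described, where m_E(t) is the experiment's piece missing ratio at time t and δ is the bounded repair capacity of rCDN; this notation is ours, not the paper's:

```latex
% Piece missing ratio that production sees after the rCDN filter (our notation).
\[
m_{\text{seen by }P}(t) \;=\; \max\bigl\{\, m_E(t) - \delta,\; 0 \,\bigr\}
\]
```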
24
Implementation (Client)
Modular design for streaming machines: each machine should be agnostic of the modes (production or experiment) of the other machines. Key observation: use a sliding window to partition downloading tasks.
25
Recap The objective of PCE is to assign a task first to the experiment.
If the experiment fails, reassign it to rCDN; if rCDN fails, reassign it to production. To accomplish this, ShadowStream introduces a simple streaming hypervisor to implement PCE.
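A minimal sketch of this E-then-C-then-P cascade; the struct and callback names are illustrative, not ShadowStream's actual interface.

```cpp
#include <cstdint>
#include <functional>

// Each streaming machine exposes "try to obtain this piece by its deadline".
using FetchFn = std::function<bool(uint64_t piece)>;

struct PceCascade {
    FetchFn experiment;   // E: the machine under test
    FetchFn rcdn;         // C: CDN repair with bounded capacity (delta)
    FetchFn production;   // P: the production streaming engine

    // Returns true if the piece reaches the real viewer in time.
    bool fetch(uint64_t piece) const {
        if (experiment(piece)) return true;   // E got it
        if (rcdn(piece))       return true;   // E failed, C repaired it
        return production(piece);             // last resort: P repairs it
    }
};
```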
26
Streaming hypervisor Task window management: sets up the sliding window.
Data distribution control: copies data among streaming machines. Network resource control: bandwidth scheduling among streaming machines. Experiment transition.
27
Streaming hypervisor
28
Task Window Management
Informs a streaming machine which pieces it should download. The length of a machine's task window is set to its maximum lag when it runs alone. Note that the source point differs per machine: for E it is the real source point, for C it is (source point - e_lag), and for P it is (source point - e_lag - c_lag).
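A sketch of how each machine's task window could be positioned, following the offsets stated above (E at the real source point, C lagging by e_lag, P lagging by a further c_lag); the names and the assumption that sourcePoint >= eLag + cLag are ours.

```cpp
#include <cstdint>

struct TaskWindow {
    uint64_t newest;   // source point seen by this machine
    uint64_t length;   // window length = machine's maximum lag when run alone
};

struct WindowLayout {
    TaskWindow e, c, p;
};

// Assumes sourcePoint >= eLag + cLag so the subtractions do not underflow.
WindowLayout layoutWindows(uint64_t sourcePoint,
                           uint64_t eLag, uint64_t cLag, uint64_t pLag) {
    return {
        { sourcePoint,               eLag },  // experiment E
        { sourcePoint - eLag,        cLag },  // rCDN repair C
        { sourcePoint - eLag - cLag, pLag },  // production P
    };
}
```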
29
Data Distribution Control
Data store: a shared data store with a per-machine pointer. Data distribution, as the name suggests, distributes data among the streaming machines. The propagation of data must be unidirectional: E should not upload what it could not download itself (recall that this is a BitTorrent-like design). From an implementation point of view, all streaming machines share one data store; when an earlier machine downloads a piece, the pointers of the later machines are also updated, but not vice versa.
30
Data Distribution Control
APIs: writePiece(), checkPiece(), deliverPiece(). With deliverPiece(), a streaming machine notifies the hypervisor that a piece is ready to deliver to the media player, but the hypervisor decides when the delivery actually happens. A sketch of this interface follows below.
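A minimal sketch of the unidirectional shared data store described above; the API names follow the slide, but the signatures and data structures are illustrative assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

class SharedDataStore {
public:
    explicit SharedDataStore(std::size_t numMachines) : owned_(numMachines) {}

    // Machine `m` (0 = E, 1 = C, 2 = P) writes a piece it downloaded; machines
    // later in the chain also see it, but earlier ones do not (unidirectional).
    void writePiece(std::size_t m, uint64_t piece) {
        for (std::size_t i = m; i < owned_.size(); ++i) owned_[i].insert(piece);
    }

    // May machine `m` upload / account for this piece?
    bool checkPiece(std::size_t m, uint64_t piece) const {
        return owned_[m].count(piece) != 0;
    }

    // A machine flags a piece as ready for the media player; the hypervisor
    // decides when delivery actually happens.
    void deliverPiece(uint64_t piece) { ready_.insert(piece); }
    bool readyForPlayer(uint64_t piece) const { return ready_.count(piece) != 0; }

private:
    std::vector<std::set<uint64_t>> owned_;  // per-machine view of the store
    std::set<uint64_t> ready_;               // pieces flagged for delivery
};
```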
31
Network resource control
Production has higher priority. LEDBAT is used to perform bandwidth estimation and avoid congestion in hidden network buffers.
32
Network resource control
Bandwidth allocation among streaming machines: no streaming machine may exceed the available network bandwidth and create hidden congestion. APIs: sendMessage(), recvMessage().
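A sketch of priority bandwidth allocation among the streaming machines, assuming the hypervisor has a capacity estimate (e.g., from a LEDBAT-style estimator): production is served first, and the total granted never exceeds the estimate, so no machine builds hidden queues. The function and struct names are assumptions of this sketch.

```cpp
#include <algorithm>
#include <vector>

struct MachineDemand {
    const char* name;    // e.g., "production", "rCDN", "experiment"
    double demandKbps;   // what the machine asked to send
};

// Grant bandwidth in priority order (production first); the total granted
// never exceeds the estimated capacity, so no machine can create hidden congestion.
std::vector<double> allocate(double capacityKbps,
                             const std::vector<MachineDemand>& byPriority) {
    std::vector<double> granted;
    double remaining = capacityKbps;
    for (const auto& m : byPriority) {
        const double g = std::min(m.demandKbps, remaining);
        granted.push_back(g);
        remaining -= g;
    }
    return granted;
}
```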
33
Anything left out? We have seen how pieces are partitioned, downloaded, and distributed, and how network resources are managed in the hypervisor. But how do we start and stop the experiment streaming machine? How do we calculate the testing times? This is where experiment orchestration comes in.
34
Experiment Orchestration
Triggering, arrival, experiment transition, departure.
35
Specification and Triggering
A testing behavior pattern defines one or more classes of clients (e.g., cable, DSL, software version); a class is defined by its properties. Each class has an arrival rate function over the testing interval, a duration function L = f(arrival time, video quality), and a triggering condition. The duration of a class depends on the video quality and arrival times (related to Prof. Dah Ming Chiu's work on video popularity dynamics, as noted in class). Exp(t) is the expected number of clients active at time t; it is an upper bound, since some clients may leave. The δ term comes from an autoregressive integrated moving average (ARIMA) model. The orchestrator waits for the network channel to evolve until testing can be triggered at time t_start.
36
Arrival Independent per-client arrivals achieve the global arrival pattern, all in real time. The network-wide common parameters t_start, t_exp, and λ(t) are included in the keep-alive messages sent to all clients. The orchestrator would otherwise need to tell each client its start and stop times in real time, but sending commands at exactly the right moments does not scale (too many sessions would have to be started); it is easier to push the decision to the clients. ShadowStream therefore introduces distributed orchestration, which decouples scenario parameters from execution.
37
Arrival The orchestrator does not compute individual arrival/departure times for each client; each client independently draws its own arrival time a_(e,i), justified by the Cox-Lewis theorem.
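A minimal sketch of how a client could pick its own arrival offset from the broadcast parameters via rejection sampling against λ(t); the function name, the lambdaMax bound, and the RNG handling are assumptions of this sketch, not the paper's algorithm.

```cpp
#include <functional>
#include <random>

// Draw one arrival offset in [0, tExp) with density proportional to lambda(t),
// via rejection sampling. lambdaMax must upper-bound lambda on [0, tExp).
double sampleArrivalOffset(double tExp,
                           const std::function<double(double)>& lambda,
                           double lambdaMax,
                           std::mt19937& rng) {
    std::uniform_real_distribution<double> timeDist(0.0, tExp);
    std::uniform_real_distribution<double> acceptDist(0.0, lambdaMax);
    for (;;) {
        const double t = timeDist(rng);       // candidate offset after t_start
        if (acceptDist(rng) <= lambda(t))     // accept w.p. lambda(t)/lambdaMax
            return t;                         // client joins at t_start + t
    }
}
```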
38
Experiment Transition
Suppose the current time is t0 and the client will join the experiment at a_(e,i); the client is prepared during [t0, a_(e,i)]. Connectivity transition: production neighbors that are not in the test stay with production, and production rejoins afterward.
39
Experiment Transition
Play buffer state transition and legacy removal. R_pre is the dedicated CDN capacity for handling flash crowds. All downloads should complete within the interval [a_(e,i) - e_lag, a_(e,i)].
40
Departure Early departure: the client's state snapshot is captured using a disconnection message. Substitution: the arrival process is applied again. The experiment departure pattern can only be equal to or more frequent than the real viewers' departure pattern (is that acceptable?).
41
EVALUATIONS
42
Software Implementation (C++)
Compositional runtime: modular design, including a scheduler, dynamic loading of blocks, etc. (3,400 lines of code). Pre-packaged blocks: HTTP integration, UDP sockets, and debugging (500 lines of code). Live streaming machine: 4,200 lines of code. The ShadowStream system is fully implemented; these are the client-side statistics for the streaming machine and hypervisor, around 10,000 lines of code in total.
43
Experimental Opportunities
Live testing provides evaluation scales that are not possible in any existing testbed. Real traces from two live streaming testing channels: the green channel can accommodate a test of 100,000 clients for at least 60 minutes, while the red channel can host a test with a smaller number of clients but a longer duration.
44
Protection and Accuracy
Piece missing ratio:
  Configuration                  Virtual playpoint   Real playpoint
  Buggy                          8.73%               N/A
  R = rCDN                       8.72%               0%
  R = rCDN w/ bottleneck (4%)    8.81%               5.42%
We evaluate protection and accuracy by running fully implemented ShadowStream clients on Emulab; the benefit is that an experiment can be repeated multiple times in exactly the same setting. A dedicated experiment machine is customized for the evaluation by injecting a bug. First we evaluate pure CDN repair. The buggy machine running alone would have a piece missing ratio of 8.73%. Without a network bottleneck, dedicated CDN repair gives relatively accurate results and the viewer is not affected. When a bottleneck exists, the missing pieces cannot be fully repaired: over 5% of pieces are missed as observed by real viewers. Next we evaluate the PCE design.
45
Protection and Accuracy
Piece missing ratio:
  Configuration               Virtual playpoint   Real playpoint
  PCE w/ bottleneck           9.13%               0.15%
  PCE w/ higher bottleneck    8.85%               0%
When there is a bottleneck, the PCE-measured result is somewhat misleading, but the production machines repair the missing pieces and the impact on users is minimized. When the bottleneck capacity is increased, all missing pieces can be repaired by the bounded CDN; we get a relatively accurate result and user QoE is guaranteed.
46
Orchestration: Distributed Activation
Next we evaluate orchestration control. This is the result of a trace-driven simulation with over 300,000 simulated arrivals. From the figure, we observe that distributed arrivals closely match the target arrival rate function.
47
Utility on Top: Deterministic Replay
Logged inputs: events, messages, and random seeds; all non-deterministic inputs are controlled, and the per-client log size is practical. We build a replay capability on top of ShadowStream so that real tests in the production streaming network can be played back step by step offline for detailed analysis. A streaming machine is typically an event-driven system and is not computationally intensive, so we implement each streaming machine in a single thread and control all non-deterministic inputs. Observed log sizes:
  100 clients; 650 seconds      223 KB
  300 clients; 1,800 seconds    714 KB
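A minimal sketch of the kind of state such a replay log would need, following the description above (the random seed plus every external event or message in arrival order); the struct layout is an assumption, not ShadowStream's log format.

```cpp
#include <cstdint>
#include <string>
#include <vector>

struct LoggedEvent {
    uint64_t timestampMs;   // when the input was handed to the machine
    std::string payload;    // serialized network message or timer event
};

struct ReplayLog {
    uint64_t randomSeed;              // seeds all pseudo-random choices
    std::vector<LoggedEvent> events;  // every non-deterministic input, in order
};
```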
48
Review/Critique Strengths: motivation and background are very clearly developed; performance evaluation as an intrinsic capability in production networks; scalable (PCE) protection of QoE despite large-scale experiment failures; transparent orchestration for flexible testing; illustrations such as graphs, timelines, and tables for the API increase readability; large-scale deployment and evaluation. Critique: the paper assumes a production streaming engine P exists; is this always valid? The systematic bias when using production could have been more cleanly explained in the paper.
49
Questions/Discussion