How does video quality impact user engagement?

Presentation transcript:

How does video quality impact user engagement? Vyas Sekar, Ion Stoica, Hui Zhang. Acknowledgment: Ramesh Sitaraman (Akamai, UMass)

Attention Economics. Overabundance of information implies a scarcity of user attention! Onus on content publishers to increase engagement. Why should we care about engagement in general? We can go back to Herb Simon's theory of attention economics. The diminishing cost of content creation and dissemination is increasing the onus on content providers to make sure users are engaged; otherwise, users' attention span is pretty short.

Understanding viewer behavior holds the keys to video monetization. Abandonment, Engagement, Repeat Viewers. VIDEO MONETIZATION: Subscriber Base, Loyalty, Ad opportunities. Are You Ready? Video providers have a subscription-based, ad-based, or pay-per-view model.

What impacts user behavior? Content/Personal preference. The natural question is what factors impact engagement. The obvious answer from a psychological point of view is the user's personal taste and preferences and the value of the content itself: some movies are obviously boring, others might be more engaging. For instance, one of the largest live broadcasts was the moon landing, even though the video was pretty fuzzy, showing the value of content. A. Finamore et al., YouTube Everywhere: Impact of Device and Infrastructure Synergies on User Experience, IMC 2011.

Does Quality Impact Engagement? How? Buffering . . . . Our focus in this section is on a slightly different question: content is obviously important, but that's not something we can objectively predict, at least not yet. Our focus is on what we as a net/sys community can help with: how does quality impact engagement? What are the critical metrics? How much does optimizing a metric help?

Traditional Video Quality Assessment Objective Score (e.g., Peak Signal to Noise Ratio) Subjective Scores (e.g., Mean Opinion Score) S.R. Gulliver and G. Ghinea. Defining user perception of distributed multimedia quality. ACM TOMCCAP 2006. W. Wu et al. Quality of experience in distributed interactive multimedia environments: toward a theoretical framework. In ACM Multimedia 2009

Internet video quality Subjective Scores MOS Engagement measures (e.g., Fraction of video viewed) VISION – PAUSE TAKEAWAY Objective Scores PSNR Join Time, Avg. bitrate, …

Key Quality Metrics: JoinFailures (JF), BufferingRatio (BR), JoinTime (JT), RateOfBuffering (RB), AvgBitrate (AB), RenderingQuality (RQ). To understand the quality metrics let us look at the life of a video player as it goes through ..

Engagement Metrics. View-level: Play time. Viewer-level: Total play time, Total number of views. Not covered: “heat maps”, “ad views”, “clicks”.

Challenges and Opportunities with “BigData”. Measurement: Video Streaming Content Providers. Globally-deployed plugins that run inside the media player. Visibility into viewer actions and performance metrics from millions of actual end-users.

Natural Questions: Which metrics matter most? Are metrics independent? How do we quantify the impact? Is there a causal connection? What kinds of questions do we want to ask here, and what are the right kinds of data/statistical tools we need to use? Dobrian et al., Understanding the Impact of Quality on User Engagement, SIGCOMM 2011. S. Krishnan and R. Sitaraman, Video Stream Quality Impacts Viewer Behavior: Inferring Causality Using Quasi-Experimental Design, IMC 2012.

Questions → Analysis Techniques. Which metrics matter most? → (Binned) Kendall correlation. Are metrics independent? → Information gain. How do we quantify the impact? → Regression. Is there a causal connection? → QED (quasi-experimental design). What kinds of questions do we want to ask here, and what are the right kinds of data/statistical tools we need to use?

“Binned” rank correlation. Traditional correlation: Pearson. Assumes linear relationship + Gaussian noise. Use rank correlation to avoid this: Kendall (ideal) but expensive; Spearman pretty good in practice. Use binning to avoid impact of “samplers”. Add a quick definition plus explanation .. why Kendall, why not Pearson, etc.
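
A minimal sketch of the binned rank correlation idea, assuming that "binning" means averaging engagement within each quality bin and then rank-correlating the bin index with the per-bin average. The function names, the bin width, and the session data are hypothetical and chosen only for illustration; the pairwise Kendall computation is O(n^2), which is why it is called expensive above.

```python
from collections import defaultdict
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall tau-a: (concordant - discordant pairs) / total pairs."""
    pairs = list(zip(xs, ys))
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two observations")
    score = 0
    for (x1, y1), (x2, y2) in combinations(pairs, 2):
        s = (x1 - x2) * (y1 - y2)
        score += (s > 0) - (s < 0)  # +1 concordant, -1 discordant, 0 tie
    return score / (n * (n - 1) / 2)


def binned_kendall(quality, engagement, bin_width):
    """Average engagement per quality bin, then rank-correlate
    the bin index against the per-bin average."""
    bins = defaultdict(list)
    for q, e in zip(quality, engagement):
        bins[int(q // bin_width)].append(e)
    keys = sorted(bins)
    means = [sum(bins[k]) / len(bins[k]) for k in keys]
    return kendall_tau(keys, means)


# Hypothetical sessions: buffering ratio (%) and play time (minutes).
# Every second session is a "sampler" that quits early regardless of quality.
buffering_ratio = [0.2, 0.8, 1.1, 1.9, 2.3, 2.7, 3.2, 3.6]
play_time = [30, 10, 25, 5, 12, 4, 6, 2]

print(kendall_tau(buffering_ratio, play_time))          # about -0.64
print(binned_kendall(buffering_ratio, play_time, 1.0))  # -1.0
```

In this toy data the per-session correlation is diluted by the early quitters, while the per-bin averages fall monotonically with buffering ratio.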

LVoD: BufferingRatio matters most Join time is pretty weak at this level

Questions → Analysis Techniques. Which metrics matter most? → (Binned) Kendall correlation. Are metrics independent? → Information gain. How do we quantify the impact? → Regression. Is there a causal connection? → QED.

Correlation alone is insufficient Correlation can miss such interesting phenomena

Information gain background
Entropy of a random variable:
  “low” entropy:  P(A) = 0.7, P(B) = 0.1, P(C) = 0.1, P(D) = 0.1
  “high” entropy: P(A) = 0.25, P(B) = 0.25, P(C) = 0.25, P(D) = 0.25
Conditional Entropy (“high” and “low” examples), given as (X, Y) pairs:
  (A, L), (B, M), (B, N)
  (A, L), (A, M), (B, N), (B, O)
Information Gain
Nice reference: http://www.autonlab.org/tutorials/
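
A small sketch of the three quantities on this slide, using the standard definitions: H(X) = -sum p log2 p, H(Y|X) = sum over x of P(x) H(Y | X = x), and information gain IG = H(Y) - H(Y|X). The helper names are hypothetical, and the inputs are toy samples: a skewed and a uniform four-way distribution, plus two small (X, Y) tables.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Empirical entropy (bits) of a list of discrete observations."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def conditional_entropy(ys, xs):
    """H(Y | X): entropy of Y within each X group, weighted by P(X = x)."""
    n = len(xs)
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(x, []).append(y)
    return sum(len(g) / n * entropy(g) for g in groups.values())


def information_gain(ys, xs):
    """How many bits of uncertainty about Y are removed by knowing X."""
    return entropy(ys) - conditional_entropy(ys, xs)


# Skewed distribution (0.7, 0.1, 0.1, 0.1) vs. uniform over four values.
print(entropy(list("AAAAAAABCD")))  # about 1.357 bits (low)
print(entropy(list("ABCD")))        # 2.0 bits (high)

# (X, Y) tables: X = A,B,B with Y = L,M,N and X = A,A,B,B with Y = L,M,N,O.
print(conditional_entropy(list("LMN"), list("ABB")))  # about 0.667 bits
print(information_gain(list("LMNO"), list("AABB")))   # 1.0 bit
```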

Why is information gain useful? Makes no assumption about “nature” of relationship (e.g., monotone, inc/dec) Just exposes that there is some relation Commonly used in feature selection Very useful to uncover hidden relationships between variables!
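
To illustrate the "no assumption about the nature of the relationship" point, here is a hedged toy example: engagement peaks at a middle bitrate level, so a rank correlation sees nothing while information gain shows that bitrate fully determines engagement. The data and function names are made up for illustration, and continuous metrics would need to be discretized first.

```python
from collections import Counter
from itertools import combinations
from math import log2


def entropy(values):
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def information_gain(ys, xs):
    """H(Y) - H(Y | X) for discrete observations."""
    n = len(xs)
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(x, []).append(y)
    h_cond = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(ys) - h_cond


def kendall_tau(xs, ys):
    """Kendall tau-a over all pairs; tied pairs contribute 0."""
    pairs = list(zip(xs, ys))
    n = len(pairs)
    score = 0
    for (x1, y1), (x2, y2) in combinations(pairs, 2):
        s = (x1 - x2) * (y1 - y2)
        score += (s > 0) - (s < 0)
    return score / (n * (n - 1) / 2)


# Hypothetical non-monotonic relationship: the middle level engages most.
bitrate_level = [1, 1, 2, 2, 3, 3]
play_minutes = [5, 5, 20, 20, 5, 5]

print(kendall_tau(bitrate_level, play_minutes))       # 0.0
print(information_gain(play_minutes, bitrate_level))  # about 0.918 = H(Y)
```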

LVoD: Combination of two metrics BR, RQ combination doesn’t add value

Questions → Analysis Techniques. Which metrics matter most? → (Binned) Kendall correlation. Are metrics independent? → Information gain. How do we quantify the impact? → Regression. Is there a causal connection? → QED.

Why naïve regression will not work Not all relationships are “linear” E.g., average bitrate vs engagement? Use only after confirming roughly linear relationship

Quantitative Impact 1% increase in buffering reduces engagement by 3 mins
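
A sketch of how such a slope could be estimated with ordinary least squares once the relationship looks roughly linear. The data below is synthetic and was constructed to give a slope near -3 minutes per percentage point of buffering; it is not the measured data, and the helper names are hypothetical. The R^2 value is one simple way to check the "roughly linear" precondition.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys, slope, intercept):
    """Fraction of variance in y explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Synthetic per-bin averages: buffering ratio (%) vs. play time (minutes).
buffering_pct = [0, 1, 2, 3, 4, 5]
play_minutes = [40.5, 36.5, 34.5, 30.5, 28.5, 24.5]

slope, intercept = fit_line(buffering_pct, play_minutes)
print(slope)                                                   # about -3.09
print(r_squared(buffering_pct, play_minutes, slope, intercept))  # about 0.99
```

A low R^2 on real data (for example, for average bitrate) would be a signal not to report a single linear slope.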

Viewer-level Join time is critical for user retention

Questions → Analysis Techniques. Which metrics matter most? → (Binned) Kendall correlation. Are metrics independent? → Information gain. How do we quantify the impact? → Regression. Is there a causal connection? → QED.

Randomized Experiments Idea: Equalize the impact of confounding variables using randomness. (R.A. Fisher 1937) Randomly assign individuals to receive “treatment” A. Compare outcome B for treated set versus the “untreated” control group. Treatment = Degradation in Video Performance Hard to do: Operationally Cost Effectively Legally Ethically

Idea: Quasi Experiments. Idea: Isolate the impact of video performance by equalizing confounding factors such as content, geography, and connectivity. Treated (poor video perf) vs. Control or Untreated (good video perf): randomly pair up viewers with the same values for the confounding factors. Outcome for the hypothesis Performance → Behavior: +1 supports the hypothesis, -1 rejects the hypothesis, 0 neither. Statistically highly significant results: 100,000+ randomly matched pairs. Talk about adapting the technique from the social and medical sciences. No control over who gets treatment. Examples: 1854: John Snow, water contaminants → cholera (natural experiment). 1992: Krueger, schooling → salary; every year of schooling is 12-18% extra (298 twins). Also Campbell & Stanley 1963. Must know which are the confounding variables. Contrast with user studies or surveys that only have 100s or 1000s of participants. Also say this technique is of independent interest, applicable to other areas of network measurement.

Quasi-Experiment for Viewer Engagement. Treated (video froze for ≥ 1% of duration) vs. Control or Untreated (no freezes): same geography, connection type, same point in time within same video. Hypothesis: More Rebuffers → Smaller Play time. Outcome: for each pair, outcome = playtime(untreated) - playtime(treated). S. Krishnan and R. Sitaraman, Video Stream Quality Impacts Viewer Behavior: Inferring Causality Using Quasi-Experimental Design, IMC 2012.

Results of Quasi-Experiment
Normalized Rebuffer Delay (γ%)    Net Outcome
1                                 5.0%
2                                 5.5%
3                                 5.7%
4                                 6.7%
5                                 6.3%
6                                 7.4%
7                                 7.5%
The findings from earlier are not just incidental; there does seem to be a causal effect in place. A viewer experiencing rebuffering for 1% of the video duration watched 5% less of the video compared to an identical viewer who experienced no rebuffering.
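
A rough sketch of the matched-pair procedure, under stated assumptions: sessions are grouped by their confounder values, treated and untreated sessions are randomly paired within each group, and each pair is scored +1, -1, or 0 by the sign of playtime(untreated) - playtime(treated). Averaging those scores is one plausible reading of "net outcome"; the exact definition in the cited paper may differ. The field names, the threshold parameter `gamma_pct`, and the sessions are all hypothetical.

```python
import random
from collections import defaultdict


def net_outcome(sessions, gamma_pct=1.0, seed=0):
    """Return (mean pair score, number of matched pairs)."""
    rng = random.Random(seed)
    strata = defaultdict(lambda: {"treated": [], "control": []})
    for s in sessions:
        key = (s["geo"], s["conn"], s["video"])  # confounders to equalize
        if s["rebuffer_pct"] >= gamma_pct:
            strata[key]["treated"].append(s["play_time"])
        elif s["rebuffer_pct"] == 0:
            strata[key]["control"].append(s["play_time"])
        # Sessions with 0 < rebuffering < gamma_pct are in neither group.
    scores = []
    for arms in strata.values():
        treated, control = arms["treated"][:], arms["control"][:]
        rng.shuffle(treated)
        rng.shuffle(control)
        for t, c in zip(treated, control):  # random one-to-one pairing
            diff = c - t
            scores.append((diff > 0) - (diff < 0))
    if not scores:
        raise ValueError("no matched pairs")
    return sum(scores) / len(scores), len(scores)


def session(geo, conn, video, rebuffer_pct, play_time):
    return {"geo": geo, "conn": conn, "video": video,
            "rebuffer_pct": rebuffer_pct, "play_time": play_time}


sessions = [
    session("US", "cable", "v1", 0.0, 30), session("US", "cable", "v1", 2.0, 20),
    session("US", "cable", "v1", 0.0, 25), session("US", "cable", "v1", 1.5, 22),
    session("EU", "mobile", "v2", 0.0, 10), session("EU", "mobile", "v2", 3.0, 12),
    session("EU", "mobile", "v2", 0.5, 11),  # below threshold, left unmatched
]

score, pairs = net_outcome(sessions)
print(pairs)  # 3
print(score)  # about 0.33: two pairs support the hypothesis, one rejects it
```

With real data a significance test (such as a sign test over the pair scores) would normally accompany the net outcome.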

Are we done? Unified? Quantitative? Predictive? Subjective Scores: MOS. Engagement (e.g., Fraction of video viewed). Objective Scores: PSNR; Join Time, Avg. bitrate, ... A. Balachandran et al., A Quest for an Internet Video QoE Metric, HotNets 2012.

Challenge: Capture complex relationships. Engagement vs. average bitrate: non-monotonic. Engagement vs. rate of switching: threshold effect. [Figure axes: Engagement vs. Quality Metric.] The measurement study by Dobrian et al. in SIGCOMM 2011 shows many of these relationships.

Challenge: Capture interdependencies. Join Time, Avg. bitrate, Rate of switching, Rate of buffering, Buffering Ratio. There might be several other dependencies.

Challenge: Confounding factors Devices User Interest Connectivity

Some lessons…

Importance of systems context RQ is negative, but effect of player optimizations!

Need for multiple lenses Correlation alone can miss such interesting phenomena

Watch out for confounding factors. Lots of them, due to user behaviors and due to delivery system artifacts! Need systematic frameworks for identifying them (e.g., QoE, learning techniques) and for incorporating their impacts (e.g., refined machine learning model).

Useful references. Check out http://www.cs.cmu.edu/~internet-video for an updated bibliography.