In five years, YouTube has completely reshaped the Internet, media, and political landscapes.

Presentation transcript:

In five years, YouTube has completely reshaped the Internet, media, and political landscapes.
– February 14: Chad Hurley, Steve Chen, and Jawed Karim begin work on a "Flickr or HotorNot for video." They register youtube.com the next day.
– April 23: "Me at the Zoo," 19 seconds of Karim in front of the elephants at the San Diego Zoo, is the first video posted to the site.
– December 15: YouTube officially debuts.
– October 9: Google buys YouTube for $1.65 billion.
– October 12: YouTube passes 1 billion video views a day but remains unprofitable.

Autos & Vehicles Comedy Education Entertainment Film & Animation Gaming Howto & Style Music News & Politics Nonprofits & Activism People & Blogs Pets & Animals Science & Technology Sports Travel & Events

What we analyzed:
– 30 random videos from each of the 15 YouTube video categories (every third video)
– Length of video
– Number of views
– Number of comments

How we collected data:
– Split up the categories among the three of us, 5 categories each
– Recorded the above-mentioned variables for every third video in the assigned category

r=0.0158

Negative, linear, very weak. Scattered residual plot. Correlation (r) = ; coefficient of determination (r²) = . % of the variation in the number of views is explained by the length of the video. Overall, for our population of YouTube viewers, as the length of the video increases, the number of views does slightly decrease; however, our data are not sufficient to show a meaningful relationship between the two variables. Our r² was so small that we could not determine any true relationship between the two.
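The correlation and r² quoted on this slide can be computed as in the minimal sketch below. The `lengths` and `views` lists are hypothetical placeholder values, not the project's data, and `pearson_r` is an illustrative helper name.

```python
import math

def pearson_r(xs, ys):
    """Return the sample Pearson correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values (assumption): video length in seconds and view counts.
lengths = [30, 95, 180, 240, 600]
views = [1200, 54000, 800, 15000, 2300]

r = pearson_r(lengths, views)
r_squared = r ** 2  # share of the variation in views explained by length
print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
```

The sign of r gives the direction of the association, and r² is the figure to quote when saying what percentage of the variation in views is explained by length.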

Assumptions and checks:
– 2 independent SRS: Check
– True relationship is linear: Assumed

Ho: β = 0; Ha: β < 0. We fail to reject Ho because the P-value is > α = 0.05. We do not have sufficient evidence that the slope of the population regression line of length of video versus number of views is less than 0. Thus, we cannot conclude that the number of views decreases as the length of the video increases.
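A test of the slope like this one rests on the statistic t = b / SE(b) with n − 2 degrees of freedom, where b is the least-squares slope. The sketch below shows one way to compute it. The data are hypothetical and `slope_t_statistic` is an illustrative name. The P-value would then come from a t table or statistical software.

```python
import math

def slope_t_statistic(xs, ys):
    """Return (slope, standard error of slope, t statistic) for y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # least-squares slope
    a = mean_y - b * mean_x            # least-squares intercept
    # Residual standard deviation uses n - 2 degrees of freedom.
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))
    se_b = s / math.sqrt(sxx)
    return b, se_b, b / se_b

# Hypothetical data (assumption), chosen only to exercise the function.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b, se_b, t = slope_t_statistic(xs, ys)
print(f"b = {b:.3f}, SE(b) = {se_b:.3f}, t = {t:.3f}, df = {len(xs) - 2}")
```

For a one-sided alternative such as β < 0, only a sufficiently negative t gives a small P-value.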

r=

Positive, linear, moderately strong. Scattered residual plot. Correlation (r) = ; coefficient of determination (r²) = . % of the variation in the number of comments is explained by the number of views. Overall, for our population of YouTube viewers, as the number of views increases, the number of comments also increases. Thus, as more people view a video, the number of comments tends to go up.

Assumptions and checks:
– 2 independent SRS: Check
– True relationship is linear: Assumed

Ho: β = 0; Ha: β > 0. We reject Ho because the P-value is < α = 0.05. We have sufficient evidence that the slope of the population regression line of number of views versus number of comments is greater than 0. Thus, as the number of views increases, the number of comments also increases.

Category: Means: Autos and Vehicles Comedy Education Entertainment Film and Animation Gaming Howto and Style Music News and Politics Nonprofits and Activism People and Blogs Pets and Animals Science and Technology Sports Travel and Events

Music had by far the highest average number of views of all the categories, at 49,350,400. Sports, Entertainment, and Gaming had the next highest averages, in that order, although none of them came close to Music. This suggests that music videos draw the most views on YouTube. Travel and Events, Education, and Nonprofits and Activism had the lowest average numbers of views, in that order. This is most likely because people use YouTube mainly for entertainment.

Assumptions and checks:
– SRS: Check
– Normal population or n ≥ 30: 30 = 30

We are 95% confident that the true mean number of views for each category lies in the following interval:
– Autos and Vehicles: ( , )
– Comedy: ( , )
– Education: ( , )
– Entertainment: ( , )
– Film and Animation: (5052.9, )
– Gaming: (51236, )
– Howto and Style: ( , )
– Music: ( × 10^7, × 10^7)
– News and Politics: ( , 44344)
– Nonprofits and Activism: ( , )
– People and Blogs: ( , )
– Pets and Animals: ( , )
– Science and Technology: (3529.7, )
– Sports: ( , )
– Travel and Events: ( , )

Music had the largest difference between the max and min of the interval at 68,225,100, which is why we removed it from the rest of the data in order to get a better view of the other categories. Travel and Events had the smallest difference between the max and min of the interval at 2, We think that Music had the widest interval because its standard deviation was so large, at 50,547,300, due to several outliers with views in the hundreds of millions. Sports and then Entertainment had the next widest intervals. We think these intervals were wide for the same reason as Music: a few data points were outliers. After Travel and Events, the narrowest intervals belonged to Science and Technology, Nonprofits and Activism, and Education.
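A one-sample t-interval for a category mean can be sketched from summary statistics alone, as below. The mean and standard deviation are the Music figures quoted in this transcript. The critical value 2.045 is the standard table value for 95% confidence with 29 degrees of freedom, and the function name `t_interval` is illustrative.

```python
import math

# t* for 95% confidence with df = n - 1 = 29, taken from a standard t table.
T_STAR_95_DF29 = 2.045

def t_interval(mean, sd, n, t_star):
    """Return (low, high) for mean +/- t_star * sd / sqrt(n)."""
    margin = t_star * sd / math.sqrt(n)
    return mean - margin, mean + margin

# Music: mean and standard deviation as quoted in the transcript, n = 30.
low, high = t_interval(49_350_400, 50_547_300, 30, T_STAR_95_DF29)
print(f"({low:,.0f}, {high:,.0f})")
```

With these inputs the upper limit comes out near 68.2 million, so the 68,225,100 figure quoted above may be the upper end of the Music interval rather than its width. Because the standard deviation is scaled by 1/√n, a few videos with hundreds of millions of views are enough to widen the interval this much.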

For the population of YouTube viewers, as the length of the video increases, the number of views does slightly decrease; however, our data are not sufficient to show a meaningful relationship between the two variables. Thus, we found no evidence that the length of a video deters the average viewer. For the population of YouTube viewers, as the number of views increases, the number of comments also increases. Thus, the more people view a video, the more comments it is likely to receive. The extreme variation in average number of views is most likely due to the fact that people use YouTube for entertainment purposes. YouTube is a social site, so popularity and recognition will affect whether a person watches a video or not.

Even if you do not watch a video all the way through, it registers as a view.
There are many videos on YouTube, all dispersed among the given categories. Because each video must fit in one category, some videos are only loosely related to their category's topic.
– Some categories leave less room for variation, such as News and Politics, while others, like Comedy, can encompass many different types of video.
Because the number of videos varies by category, categories with fewer videos gave us less variation in the videos we chose.
Some videos do not have much appeal, whether because of their subject matter or because better versions exist, so they are not going to get views no matter how long they have been posted.
Some videos had comments disabled, which we counted as zero comments. Perhaps they would have received comments if comments had been allowed.
There are some strong outliers, like the hour-and-a-half-long video.

Overall, the length of the video did not affect the number of views. This surprised us because we expected that people would not want to wait to watch a longer video. The margin of error for the mean number of views for Music was surprisingly large. We attributed this to Justin Bieber because his three videos included in the data had enormous numbers of views. We were not surprised that the less “entertaining” categories had fewer views because YouTube is a social site, so popularity and recognition will affect whether a person watches a video or not.