
1 YouTube Traffic Characterization: A View From the Edge
Phillipa Gill¹, Martin Arlitt²¹, Zongpeng Li¹, Anirban Mahanti³
¹ Dept. of Computer Science, University of Calgary, Canada
² Enterprise Systems & Software Lab, HP Labs, USA
³ Dept. of Computer Science and Engineering, IIT Delhi, India

2 Introduction
- The way people use the Web is changing.
- Creation and sharing of media: fast, easy, cheap!
- Volume of data associated with extremely popular online media.

3 What is Web 2.0?
- User generated content
  - Text: Wordpress, Blogspot
  - Photos: Flickr, Facebook
  - Video: YouTube, MySpace
- Social Networking: Facebook, MySpace
- Tagging: Flickr, YouTube

4 YouTube: Facts and Figures
- Founded in February 2005
- Enabled users to easily share movies by converting them to Flash
- Largest video sharing Website on the Internet [Alexa2007]
- Sold to Google for $1.65 billion in November 2006

5 How YouTube Works (1/2)
GET: /watch?v=wQVEPFzkhaM
OK (text/html)
GET: /vi/fNaYQ4kM4FE/2.jpg
OK (img/jpeg)

6 How YouTube Works (2/2)
GET: swfobject.js
OK (application/x-javascript)
GET: /p.swf
OK (application/shockwave-flash)
GET: /get_video?video_id=wQVEPFzkhaM
OK (video/flv)

7 Our Contributions
- Efficient measurement framework
- One of the first extensive characterizations of Web 2.0 traffic
  - File properties
  - File access patterns
  - Transfer properties
- Implications for network and content providers

8 Outline
- Introduction & Background
- Contributions
- Methodology
- Results
- Implications
- Conclusions

9 Our View Points
- Edge (University Campus)
  - 28,000 students
  - 5,300 faculty & staff
  - /16 address space
  - 300 Mb/s full-duplex network link
- Global
  - Most popular videos

10 Campus Data Collection
Goals:
- Collect data on all campus YouTube usage
- Gather data for an extended period of time
- Protect user privacy
Challenges:
- YouTube’s popularity
- Monitor limitations
- Volume of campus Internet usage

11 Our Methodology
- Identify servers providing YouTube content
- Use bro to summarize each HTTP transaction in real time
- Restart bro daily and compress the daily log
- Map visitor identifier to a unique ID
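The slides do not say how the visitor identifier was mapped to a unique ID. One common way to do this kind of mapping is a keyed hash, sketched below as an illustration only. The function name `pseudonymize`, the key handling, and the 16-character truncation are assumptions, not the authors' method.

```python
import hashlib
import hmac
import secrets

# Hypothetical per-deployment secret (an assumption for this sketch).
# It is generated once and must never be written to the logs.
SECRET_KEY = secrets.token_bytes(32)


def pseudonymize(visitor_id: str, key: bytes = SECRET_KEY) -> str:
    """Map a visitor identifier to a stable ID that cannot be reversed
    without the key. The same input always yields the same ID, so
    per-visitor statistics still work on the anonymized log."""
    digest = hmac.new(key, visitor_id.encode("utf-8"), hashlib.sha256)
    # Truncated for readability; keep the full digest if collisions matter.
    return digest.hexdigest()[:16]


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only address, used here as sample input.
    print(pseudonymize("192.0.2.10"))
```

A keyed hash keeps the mapping consistent for the whole trace while preventing anyone without the key from rebuilding the mapping by hashing every address in the campus range.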

12 Categories of Transactions
- Complete – the entire transaction was parsed successfully
- Interrupted – TCP connection was reset
- Gap – monitor missed a packet
- Failure – transaction could not be parsed

13 Categories of Transactions (2)
Status         % of Total   % of Video
Complete          90.82        24.66
Interrupted        1.88        24.25
Gap                1.56        51.09
Failure            5.75          -

14 Our Traces
Start Date:                Jan. 14, 2007
End Date:                  Apr. 8, 2007
Total Valid Transactions:  23,250,438
Total Bytes:               6.54 TB
Total Video Requests:      625,593
Total Video Bytes:         6.45 TB
Unique Video Requests:     323,677
Unique Video Bytes:        3.26 TB
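As a quick check on these totals, the short calculation below derives the share of video requests and video bytes that were repeats, plus a rough average size per video request. It reads "Unique Video Requests" as the number of distinct videos requested and uses decimal units (1 TB = 1,000,000 MB); both are assumptions.

```python
# Figures copied from the trace summary slide.
total_video_requests = 625_593
unique_video_requests = 323_677
total_video_tb = 6.45
unique_video_tb = 3.26

# Share of requests and bytes that went to already-seen videos.
repeat_request_share = 1 - unique_video_requests / total_video_requests
repeat_byte_share = 1 - unique_video_tb / total_video_tb

# Rough mean bytes per video request, assuming decimal units.
avg_mb_per_request = total_video_tb * 1_000_000 / total_video_requests

if __name__ == "__main__":
    print(f"repeat requests: {repeat_request_share:.1%}")
    print(f"repeat bytes:    {repeat_byte_share:.1%}")
    print(f"avg MB/request:  {avg_mb_per_request:.1f}")
```

Both shares come out a little under one half, and the average works out to roughly 10 MB per video request.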

15 HTTP Response Codes
Code                    % of Responses   % of Bytes
200 (OK)                    75.80           89.78
206 (Partial Content)        1.29           10.22
302 (Found)                  0.05            0.00
303 (See Other)              5.33            0.00
304 (Not Modified)          17.34            0.00
4xx (Client Error)           0.19            0.00
5xx (Server Error)           0.01            0.00

16 Global Data Collection
- Crawling all videos is infeasible
- Focus on top 100 most popular videos
- Four time frames: daily, weekly, monthly and all time
- Two-step data collection:
  - Retrieve pages of most popular videos
  - Use YouTube API to get details on these videos

17 Outline
- Introduction & Background
- Contributions
- Methodology
- Results
- Implications
- Conclusions

18 Results
- Campus Usage Patterns
- File Properties
- File Access Patterns
- Transfer Properties

19 Campus Usage Patterns
[Figure annotation: Reading Break]

20 Results
- Campus Usage Patterns
- File Properties
- File Access Patterns
- Transfer Properties

21 Unique File Sizes
- Video data is significantly larger than the other content types

22 Time Since Modification
- Videos and images are rarely modified
- Text and application data are modified more frequently

23 Video Durations
- Spike around 3 minutes, likely music videos
- Campus videos are relatively short: μ = 3.3 min

24 Summary of File Properties
- Video content is much larger than other content types
- Image and video content is more static than application and text content
- Video durations are relatively short
- Videos viewed on campus tend to be more than 1 month old

25 Results
- Campus Usage Patterns
- File Properties
- File Access Patterns
- Transfer Properties

26 Relative Popularity of Videos
- Video popularity follows a weak Zipf distribution (β = 0.56)
- Possibly due to edge network point of view
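The slides do not state how β was estimated. A common first approximation is a least-squares line through the rank/frequency points on log-log axes, where the negated slope is the Zipf exponent. The sketch below illustrates that approach only; the function name `zipf_exponent` and the input format (a flat list of requested video IDs) are assumptions.

```python
import math
from collections import Counter


def zipf_exponent(request_log):
    """Estimate the Zipf exponent from a list of requested video IDs.

    Fits log(frequency) = c - beta * log(rank) by ordinary least
    squares and returns beta. This is a crude estimator, used here
    only to illustrate what a value such as 0.56 measures.
    """
    counts = sorted(Counter(request_log).values(), reverse=True)
    if len(counts) < 2:
        raise ValueError("need at least two distinct videos")
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


if __name__ == "__main__":
    # Synthetic log: the video at rank r is requested about 1000 * r**-0.56 times.
    log = []
    for rank in range(1, 51):
        log.extend([f"video{rank}"] * round(1000 * rank ** -0.56))
    print(f"estimated beta: {zipf_exponent(log):.2f}")
```

A smaller β means a flatter popularity curve, so the most popular videos account for a smaller share of requests than they would under a steeper Zipf distribution.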

27 Commonality of Videos
- ~10% commonality between consecutive days during the week
- ~5% commonality between consecutive days on the weekend
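The slide does not define commonality precisely. The sketch below uses one plausible reading: the fraction of distinct videos requested on a given day that were also requested on the previous day. Both that definition and the function name `commonality` are assumptions.

```python
def commonality(previous_day, current_day):
    """Fraction of the current day's distinct videos that also
    appeared on the previous day (assumed definition)."""
    prev_ids = set(previous_day)
    cur_ids = set(current_day)
    if not cur_ids:
        return 0.0
    return len(prev_ids & cur_ids) / len(cur_ids)


if __name__ == "__main__":
    monday = ["a", "b", "c", "a"]
    tuesday = ["b", "d"]
    print(commonality(monday, tuesday))
```

Under this definition, a value near 10% means roughly one in ten of the videos seen on a given day had also been seen the day before.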

28 Summary of File Referencing
- Zipf distribution is weak when observed from the edge of the network
- There is some overlap between videos viewed on consecutive days
- Significant amount of content viewed on campus is non-unique

29 Results
- Campus Usage Patterns
- File Properties
- File Access Patterns
- Transfer Properties

30 Transfer Sizes
- Flash player (p.swf, player2.swf)
- JavaScript files

31 Transfer Durations
- Video transfers have significantly longer durations than other content types

32 Summary of Transfer Properties
- JavaScript and Flash objects have an impact on the size of files transferred
- Video transfers have significantly larger sizes and durations

33 Outline
- Introduction & Background
- Contributions
- Methodology
- Results
- Implications
- Conclusions

34 Implications for Network Providers
- Web 2.0 poses challenges to caching
  - Larger multimedia files
  - More diversity in content
- Meta data may be used to improve caching efficiency

35 Implications for Content Providers
- Multimedia content is large!
  - 65,000 videos/day x 10 MB/video = 19.5 TB/month
- Long tail effect -> much of the content will be unpopular
  - Cheap storage solutions
- Longer transfer durations for video files
  - More CPU cycles required for transfers
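The storage estimate on this slide can be reproduced with a few lines of arithmetic. The 30-day month and decimal units (1 TB = 1,000,000 MB) are assumptions chosen because they match the stated 19.5 TB/month.

```python
# Inputs as given on the slide.
videos_per_day = 65_000
mb_per_video = 10

# Assumptions: 30-day month, decimal units (1 TB = 1,000,000 MB).
days_per_month = 30
mb_per_tb = 1_000_000

mb_per_day = videos_per_day * mb_per_video   # 650,000 MB = 650 GB per day
tb_per_month = mb_per_day * days_per_month / mb_per_tb

if __name__ == "__main__":
    print(f"{tb_per_month} TB/month")
```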

36 Conclusions
- Multimedia content has much larger transfer sizes and durations than other content types
- From the edge of the network, video popularity follows a weak Zipf distribution
- Web 2.0 facilitates diversity in content, which poses challenges to caching
- New approaches are needed to efficiently handle the resource demands of Web 2.0 sites

37 Questions?
Contact: psessini@ucalgary.ca

38 Ignore the slides after this one

39 Download-to-Bitrate Ratio

40 Time of Day and Day of Week Traffic Patterns

41 Video Ages
- 73% of campus videos are older than 1 month
- 5% of campus videos are older than 1 year

42 Absolute Growth in Working Set
- Half the video content transferred is non-unique

43 What is different about Web 2.0?
Web 1.0:

44 What is different about Web 2.0?
Web 2.0:

