1 Why is P2P the Most Effective Way to Deliver Internet Media Content Xiaodong Zhang Ohio State University In collaborations with Lei Guo, Yahoo! Songqing.

Slides:



Advertisements
Similar presentations
A Measurement Study of Peer-to-Peer File Sharing Systems Presented by Cristina Abad.
Advertisements

A Systematic Study of the Mobile App Ecosystem Thanasis Petsas, Antonis Papadogiannakis, Evangelos P. Markatos Michalis PolychronakisThomas Karagiannis.
Rarest First and Choke Algorithms are Enough Arnaud LEGOUT INRIA, Sophia Antipolis France G. Urvoy-Keller and P. Michiardi Institut Eurecom France.
ICNP’07, Beijing, China1 PSM-throttling: Minimizing Energy Consumption for Bulk Data Communications in WLANs Enhua Tan 1, Lei Guo 1, Songqing Chen 2, Xiaodong.
Clayton Sullivan PEER-TO-PEER NETWORKS. INTRODUCTION What is a Peer-To-Peer Network A Peer Application Overlay Network Network Architecture and System.
Amir Rasti Reza Rejaie Dept. of Computer Science University of Oregon.
ICDCS’07, Toronto, Canada1 SCAP: Smart Caching in Wireless Access Points to Improve P2P Streaming Enhua Tan 1, Lei Guo 1, Songqing Chen 2, Xiaodong Zhang.
Measurement, Modeling, and Analysis of a Peer-2-Peer File-Sharing Workload Presented For Cs294-4 Fall 2003 By Jon Hess.
Measurement, Modeling and Analysis of a Peer-to-Peer File-Sharing Workload Krishna Gummadi, Richard Dunn, Stefan Saroiu Steve Gribble, Hank Levy, John.
1 10 Web Workload Characterization Web Protocols and Practice.
GlobeTraff A traffic workload generator for the performance evaluation of ICN architectures K.V. Katsaros, G. Xylomenos, G.C. Polyzos A.U.E.B. (presented.
An Empirical Study of Real Audio Traffic A. Mena and J. Heidemann USC/Information Sciences Institute In Proceedings of IEEE Infocom Tel-Aviv, Israel March.
1 School of Computing Science Simon Fraser University, Canada Modeling and Caching of P2P Traffic Mohamed Hefeeda Osama Saleh ICNP’06 15 November 2006.
Fresh Analysis of Streaming Media Stored on the Web Rabin Karki M.S. Thesis Presentation Advisor: Mark Claypool Reader: Emmanuel Agu 10 Jan, 2011.
One-Click Hosting Services: A File-Sharing Hideout Demetris Antoniades Evangelos P. Markatos ICS-FORTH Heraklion,
A survey of BitTorrent study Jian Liang EL933 Prof. Yong Liu.
Cis e-commerce -- lecture #6: Content Distribution Networks and P2P (based on notes from Dr Peter McBurney © )
An Analysis of Internet Content Delivery Systems Stefan Saroiu, Krishna P. Gommadi, Richard J. Dunn, Steven D. Gribble, and Henry M. Levy Proceedings of.
Measurement, Modeling, and Analysis of a Peer-to-Peer File sharing Workload Krishna P. Gummadi, Richard J. Dunn, Stefan Saroiu, Steven D. Gribble, Henry.
Delving into Internet Streaming Media Delivery: Written by: Lei Guo, Enhua Tan, Songqing Chen, Zhen Xiao, Oliver Spatscheck, Xiaodong Zhang Presented by:
1 Simultaneous Distribution Control and Privacy Protection for Proxy based Media Distribution George Mason University Songqing Chen (George Mason University)
Exploiting Content Localities for Efficient Search in P2P Systems Lei Guo 1 Song Jiang 2 Li Xiao 3 and Xiaodong Zhang 1 1 College of William and Mary,
Internet Cache Pollution Attacks and Countermeasures Yan Gao, Leiwen Deng, Aleksandar Kuzmanovic, and Yan Chen Electrical Engineering and Computer Science.
An Overlay Multicast Infrastructure for Live/Stored Video Streaming Visual Communication Laboratory Department of Computer Science National Tsing Hua University.
Performance Evaluation of Peer-to-Peer Video Streaming Systems Wilson, W.F. Poon The Chinese University of Hong Kong.
Understanding Churn in Peer-to-Peer Networks Daniel Stutzbach – University of Oregon Reza Rejaie – University of Oregon Internet Measurement Conference.
1 Characterizing Files in the Modern Gnutella Network: A Measurement Study Shanyu Zhao, Daniel Stutzbach, Reza Rejaie University of Oregon SPIE Multimedia.
Web Caching Robert Grimm New York University. Before We Get Started  Illustrating Results  Type Theory 101.
A Hybrid Caching Strategy for Streaming Media Files Jussara M. Almeida Derek L. Eager Mary K. Vernon University of Wisconsin-Madison University of Saskatchewan.
Caching And Prefetching For Web Content Distribution Presented By:- Harpreet Singh Sidong Zeng ECE Fall 2007.
1 YouTube Traffic Characterization: A View From the Edge Phillipa Gill¹, Martin Arlitt²¹, Zongpeng Li¹, Anirban Mahanti³ ¹ Dept. of Computer Science, University.
Internet Streaming Media Delivery:
CS Spring 2012 CS 414 – Multimedia Systems Design Lecture 34 – Media Server (Part 3) Klara Nahrstedt Spring 2012.
Web Caching and Content Delivery. Caching for a Better Web Performance is a major concern in the Web Proxy caching is the most widely used method to improve.
Can Internet Video-on-Demand Be Profitable? SIGCOMM 2007 Cheng Huang (Microsoft Research), Jin Li (Microsoft Research), Keith W. Ross (Polytechnic University)
1 Analyzing Patterns of User Content Generation in Online Social Networks Lei Guo, Yahoo! Enhua Tan, Ohio State University Songqing Chen, George Mason.
IBM Research, Network Server Systems Software © 2006 IBM Corporation Characterizing Instant Messaging Traffic in an Enterprise Network Lei Guo, the Ohio.
Add image. 3 “ Content is NOT king ” today 3 40 analog cable digital cable Internet 100 infinite broadcast Time Number of TV channels.
1 The Stretched Exponential Distribution of Internet Media Access Patterns Lei Yahoo! Inc. Enhua Ohio State University Songqing George.
P2P Architecture Case Study: Gnutella Network
1 One-Click Hosting Services: A File-Sharing Hideout Demetris Antoniades Evangelos P. Markatos ICS-FORTH Heraklion,
Jesse E. Simsarian and Marcus Duelk Bell Laboratories, Alcatel-Lucent, Holmdel, NJ, 15th IEEE Workshop on Local and Metropolitan.
1 Measurements, Analysis, and Modeling of BitTorrent-like Systems Lei Guo 1, Songqing Chen 2, Zhen Xiao 3, Enhua Tan 1, Xiaoning Ding 1, and Xiaodong Zhang.
P.1Service Control Technologies for Peer-to-peer Traffic in Next Generation Networks Part2: An Approach of Passive Peer based Caching to Mitigate P2P Inter-domain.
Infrastructure for Better Quality Internet Access & Web Publishing without Increasing Bandwidth Prof. Chi Chi Hung School of Computing, National University.
Web Cache Replacement Policies: Properties, Limitations and Implications Fabrício Benevenuto, Fernando Duarte, Virgílio Almeida, Jussara Almeida Computer.
1 Towards Cinematic Internet Video-on-Demand Bin Cheng, Lex Stein, Hai Jin and Zheng Zhang HUST and MSRA Huazhong University of Science & Technology Microsoft.
Distributing Layered Encoded Video through Caches Authors: Jussi Kangasharju Felix HartantoMartin Reisslein Keith W. Ross Proceedings of IEEE Infocom 2001,
1 CS 425 Distributed Systems Fall 2011 Slides by Indranil Gupta Measurement Studies All Slides © IG Acknowledgments: Jay Patel.
Web Caching and Content Distribution: A View From the Interior Syam Gadde Jeff Chase Duke University Michael Rabinovich AT&T Labs - Research.
HUAWEI TECHNOLOGIES CO., LTD. Page 1 Survey of P2P Streaming HUAWEI TECHNOLOGIES CO., LTD. Ning Zong, Johnson Jiang.
1 Insights into Access Patterns of Internet Media Systems Lei Guo.
An IP Address Based Caching Scheme for Peer-to-Peer Networks Ronaldo Alves Ferreira Joint work with Ananth Grama and Suresh Jagannathan Department of Computer.
Characterizing User Access To Videos On The World Wide Web MMCN 2000 Brian Smith Department of Computer Science Cornell University Ithaca, NY Peter Parnes.
Efficient P2P Search by Exploiting Localities in Peer Community and Individual Peers A DISC’04 paper Lei Guo 1 Song Jiang 2 Li Xiao 3 and Xiaodong Zhang.
PROP: A Scalable and Reliable P2P Assisted Proxy Streaming System Computer Science Department College of William and Mary Lei Guo, Songqing Chen, and Xiaodong.
Practical LFU implementation for Web Caching George KarakostasTelcordia Dimitrios N. Serpanos University of Patras.
1 Measurements, Analysis, and Modeling of BitTorrent-like Systems Lei Guo, Ph.D. Candidate Park Graduate Research Award Presentation.
NTMS 2012 GlobeTraff: a traffic workload generator for the performance evaluation of future Internet architectures K.V. Katsaros, G. Xylomenos, G.C. Polyzos.
MiddleMan: A Video Caching Proxy Server NOSSDAV 2000 Brian Smith Department of Computer Science Cornell University Ithaca, NY Soam Acharya Inktomi Corporation.
INTERNET TECHNOLOGIES Week 10 Peer to Peer Paradigm 1.
Performance Limitations of ADSL Users: A Case Study Matti Siekkinen, University of Oslo Denis Collange, France Télécom R&D Guillaume Urvoy-Keller, Ernst.
An Analysis of Internet Content Delivery Systems 19 rd November, 2007 Youngsub CSE, SNU.
#16 Application Measurement Presentation by Bobin John.
1 Internet Traffic Measurement and Modeling Carey Williamson Department of Computer Science University of Calgary.
On the scale and performance of cooperative Web proxy caching 2/3/06.
Modeling and Caching of P2P Traffic Osama Saleh Thesis Defense and Seminar 21 November 2006.
Does Internet media traffic really follow the Zipf-like distribution? Lei Guo 1, Enhua Tan 1, Songqing Chen 2, Zhen Xiao 3, and Xiaodong Zhang 1 1 Ohio.
Measurements, Analysis, and Modeling of BitTorrent-like Systems
Xiaodong Zhang Ohio State University
Presentation transcript:

1 Why is P2P the Most Effective Way to Deliver Internet Media Content Xiaodong Zhang Ohio State University In collaborations with Lei Guo, Yahoo! Songqing Chen, George Mason Enhua Tan, Ohio State Zhen Xiao, IBM T. J. Watson Research

2 Media contents on the Internet Video traffic is doubling every 3 to 4 months No. 3 1.Yahoo 2.Google 3.YouTube Video applications are mainstream

3 Different media delivery approaches ♫ ♫ P2P exchange P2P swarming Content Delivery Network Downloading Streaming Complete fetch User controlled access Pseudo Streaming Progressive download & play Overlay multicast HTTP RTSP HTTP

4 The Power of measurements and modeling Media delivery on the Internet –Internet is an open, complex system –Media traffic is user-behavior driven Challenges –Lack of QoS support –Lack of Internet management and control for media flow –Thousands of concurrent streams from diverse clients Measurements and modeling are critical for –Evaluating system performance under the Internet environment –Understanding user access patterns in media systems –Providing guidance to media system design and management

5 Zipf distribution is believed the general model of Internet traffic patterns Zipf distribution (power law) –Characterizes the property of scale invariance –Heavy tailed, scale free rule –Income distribution: 80% of social wealth owned by 20% people (Pareto law) –Web traffic: 80% Web requests access 20% pages (Breslau, INFOCOM’99) System implications –Objectively caching the working set in proxy –Significantly reduce network traffic log i log y slope: -  i : rank of objects y i : number of references  : 0.6~0.8 i y heavy tail

6 Does Internet media traffic follow Zipf’s law? Chesire, USITS’01: Zipf-like Cherkasova, NOSSDAV’02: non-Zipf Acharya, MMCN’00: non-Zipf Yu, EUROSYS’06: Zipf-like Web media systemsVoD media systems Live streaming and IPTV systems Veloso, IMW’02: Zipf-like Sripanidkulchai, IMC’04: non - Zipf P2P media systems Gummadi, SOSP’03: non-Zipf Iamnitchi, INFOCOM’04: Zipf-like

7 Inconsistent media access pattern models Still based on the Zipf model –Zipf with exponential cutoff –Zipf-Mandelbrot distribution –Generalized Zipf-like distribution –Two-mode Zipf distribution –Fetch-at-most-once effect –Parabolic fractal distribution –…–… All case studies –Based on one or two workloads –Different from or even conflict with each other An insightful understanding is essential to –Content delivery system design –Internet resource provisioning –Performance optimization heuristic assumptions

8 Challenges of addressing the issues Existing studies cannot identify a general media access pattern –Limited number of workloads –Constrained scope of media traffic –Biased measurements and noises in the data set Model should be accurate, simple, and meaningful –Characterize the unique properties –Have clear physical meanings –Observable and verifiable predictions –Impacts on system designs Model validation methodology –Goodness-of-fit test –Reexamination of previous observations –Reappraisal of other models

9 Research Objectives Discover a general distribution model of media access patterns –Comprehensive measurements and experiments –Rigorous mathematical analysis and modeling –Insights into media system designs Performance analysis of BitTorrent systems for media delivery –Identifying the weakness of BitTorrent –Modeling potential of collaboration among different torrents –System facility and incentive mechanism for multi-torrent collaboration Designs and implementations of streaming media systems –Reliable and scalable peer-to-peer media systems –Power efficient wireless media systems –High performance Internet streaming through WLANs

10 Outline Motivation and objectives Stretched exponential model of Internet media traffic Dynamics of access patterns in media systems Caching implications Concluding remarks

11 Workload summary 16 workloads in different media systems –Web, VoD, P2P, and live streaming –Both client side and server side Different delivery techniques –Downloading, streaming, pseudo streaming –Overlay multicast, P2P exchange, P2P swarming Data set characteristics –Workload duration: 5 days - two years –Number of users: –Number of requests: –Number of objects: nearly all workloads available on the Internet all major delivery techniques data sets of different scales

12 Stretched exponential distribution Media reference rank follows stretched exponential distribution (passed Chi-square test) c : stretch factor log i ycyc b slope: -a i : rank of media objects ( N objects) y : number of references Probability distribution: Weibull Rank distribution: fat head and thin tail in log-log scale straight line in log x - y c scale log i log y fat head thin tail c : stretch factor

13 Evidences: Web media systems (server logs) ST-SVR-01 (15 MB) *HPC-98 (14 MB) *HPLabs-99 (120 MB) HPC-98: enterprise streaming media server logs of HP corporation (29 months) HPLabs: logs of video streaming server for employees in HP Labs (21 months) ST-SVR-01: an enterprise streaming media server log workload like HPC-98 (4 months) data in stretched exponential scale data in log-log scale R 2 : coefficient of determination (1 means a perfect fit) x : rank of media object, y : number of references to the object. Title: workload name (median file size) log scale in x axis powered scale y c log scale fat headthin tail c = 0.22 R 2 ~ 1

14 Evidences: Web media systems (req packets) ST-CLT-05 (4.5 MB) PS-CLT-04 (1.5 MB) ST-CLT-04 (2 MB) All collected from a large cable network hosted by a well-known ISP PS-CLT-04: first IP packets of HTTP requests for media objects (downloading and pseudo streaming), 9 days ST-CLT-04: RTSP/MMS streaming requests (on-demand media), 9 days ST-CLT-05: RTSP/MMS streaming requests (on-demand media), 11 days powered scale y c log scale fat headthin tail log scale in x axis

15 Evidences: VoD media systems mMoD-98: logs of a multicast Media-on-Demand video server, 194 days CTVoD-04: streaming serer logs of a large VoD system by China telecom, 219 days, reported as Zipf in EUROSYS’06 IFILM-06: number of web page clicks to video clips in IFILM site, 16 weeks (one week for the figure) YouTube-06: cumulative number of requests to YouTube video clips, by crawling on web pages publishing the data *mMoD-98 (125 MB) *CTVoD-04 (300 MB) IFILM-06 (2.25 MB) YouTube-06 (3.4 MB) powered scale y c log scale fat headthin tail log scale in x axis

16 Evidences: P2P media systems BT-03 (636 MB) *KaZaa-02 (300 MB) *KaZaa-03 (5 MB) KaZaa-02: large video file (> 100 MB. Files smaller than 100 MB are intensively removed) transferring in KaZaa network, collected in a campus network, 203 days. KaZaa-03: music files, movie clips, and movie files downloading in KaZaa network, 5 days, reported as Zipf in INFOCOM’04. BT-03: 48 days BitTorrent file downloading (large video and DVD images) recorded by two tracker sites

17 Evidences: Live streaming and other systems IMDB-06 Akamai-03 Movie-02 Akamai-03: server logs of live streaming media collected from akamai CDN, 3 months, reported as two-mode Zipf in IMC’04 Movie-02: US movie box office ticket sales of year IMDB-06: cumulative number of votes for top 250 movies in Internet Movie Database web site

18 Why Zipf observed before? Media traffic is driven by user requests Intermediate systems may affect traffic pattern –Effect of extraneous traffic –Filtering effect due to caching Biased measurements may cause Zipf observation cache proxy ad server media server

19 Extraneous media traffic meta file link web server streaming media server ads server ads clip flag clip video program ad and flag video are pushed to clients mandatorily ads clip flag clip video prog 1 flag clip video prog 2 ads clip flag clip video prog 1 flag clip video prog 2

20 Do not represent user access patterns –High request rate (high popularity) –High total number of requests Not necessary Zipf with extraneous traffic –Extraneous traffic changes –Always SE without extraneous traffic Small object sizes, small traffic volume Effects of extraneous traffic on reference rank distributions Reference rates prog ads flag : 2 objects2005: merged into 1 object Non-Zipf Zipf with extraneous traffic SE 2004 SE 2005 without extraneous traffic

21 Caching effect Web workload: caching can cause a “flattened head” in log-log scale Stretched exponential is not caused by caching effect Local replay events can be traced by WM/RM streaming media protocols –Before replay: cache validation –After replay: send feed back –Recorded in server logs –Captured in our network measurement log i log y Zipf Filtered by Web cache log i log y Stretched exponential SET_PARAMETER Logplaystats DESCRIBE Server logs packet sniffer

22 Fetch-at-most-once effect SOSP’03: “flattened head” of P2P access pattern –Media access pattern is Zipf-like –Users fetch a file at most once Unlimited cache for all users Contradict with streaming media measurements –SE access pattern, without caching Small streaming media objects –Users fetch an object multiple times Large streaming media objects –User may fetch and view only once Conclusion –No relation with “fetch-at-most-once” CTVoD-04KaZaa-02 Streaming media File size 300 MB, c = 0.4 KaZaa network File size 300 MB, c = 0.45 ST-CLT-04ST-CLT-05 Multiple fetches by the same user Rank of object-user pair (log) # of requests (log) SOSP’03

23 Why media access pattern is not Zipf “Rich-get-richer” phenomenon –Pareto, power law, … –The structure of WWW Web accesses are Zipf –Popular pages can attract more users –Pages update to keep popular –Yahoo ranks No.1 more than six years –Zipf-like for long duration Media accesses are different –Popularity decreases with time exponentially –Media objects are immutable –Rich-get-richer not present –Non-Zipf in long duration Number of distinct weekly top N popular objects in 16 weeks Top 1 Web object never changes Top 1 video object changes every week CCDF of req (log) raw data linear fit Time after object birth (day) BitTorrent media file

24 Outline Motivation and objectives Stretched exponential model of Internet media traffic Dynamics of access patterns in media systems Caching implications Concluding remarks

25 Dynamics of Access Patterns in Media Systems Media reference rank distribution in log-log scale –Different systems have different access patterns –The distribution changes over time in a system (NOSSDAV’02) All follow stretched exponential distribution –Stretch factor c –Minus of slope a Physical meanings –Media file sizes –Aging effects of media objects –Deviation from the Zipf model log i ycyc b slope: -a c : stretch factor

26 Stretched factors of different systems

27 Stretched factor and media file sizes Other factors besides file size –Different encoding rates and compression ratios –Video and audio are different –Different content type: entertainment, educational, business EDU BIZ file size vs. stretch factor c 0 – 5 MB: c <= – 100 MB: 0.2 ~ 0.3 > 100 MB: c >= 0.3 c increases with file size

28 Conservations in dynamic media systems Media requests over time –Constant media request rate req –Constant object birth rate obj Objects created in [0, t) Objects created in (- , 0] Number of accessed objects: system upgrade slope = req slope = obj

29 Stretched exponential parameters In a media system –Constant request rate –Constant object birth rate –Constant median file size Stretch factor c is a time invariant constant Parameter a increases with time log i ycyc b slope: -a

30 Evolution of media reference rank distribution IFILM-06BT-03 ST-SVR-01 system upgrade Slow down Reference rank distribution Parameter a a increases with time Caused by objects created before workload collection a converge to a constant

31 Deviation from the Zipf model a increases with c ( c < 2) a increases with req / obj a increases with t Big media files have large deviation Deviation increases with time

32 Example: YouTube Video Measurements in IMC’07 YouTube video downloads in a campus network Total number of downloads of each video clip, for all YouTube up time Large a, large deviation Small a, small deviation Old object dominant No old objects Campus users: small request rate req Global users: large request rate req

33 Outline Motivation and objectives Stretched exponential model of Internet media traffic Dynamics of access patterns in media systems Caching implications Conclusion

34 Caching analysis methodology Analyze caching with reference rank distribution –Requests are independent –Objects occupy unit storage volume Optimal hit ratio –Unlimited cache –N objects, cache size is  N Zipf-like distribution Stretched exponential

35 Modeling caching performance Media caching is far less efficient than Web caching Parameter selection Zipf: typical Web workload (  =0.8) SE: typical streaming workload (c = 0.2, a = 0.25, same as ST-CLT-05) Asymptotic analysis for small cache size k (k << N) Zipf SE Web media

36 Potential of long term media caching Short term –Requests dominated by old objects N ’ (t) >> obj t Long term –Requests dominated by new objects obj t >> N ’ (t) Optimal hit ratio of caching 10% objects –PS-CLT-04: 0.52 for 9 days, 0.85 maximal –ST-CLT-04: 0.48 for 9 days, 0.84 maximal –ST-CLT-05: 0.54 for 11 days, 0.85 maximal Request correlation can be further exploited –Object popularity decreases with time Great improvement when obj t >> N ’ (t) CDF of object agesCDF of requests old objects new objects

37 Long time to reach optimal Media objects have long lifespan –Most requested objects are created long time ago –Most requests are for objects created long time ago To achieve maximal concentration –Very long time (months to years) –Huge amount of storage –Only peer-to-peer systems provide such a huge space with a long time 200 days150 days 50%

38 Summary Media access patterns do not fit Zipf model We give reasons why previous results were confusing Media access patterns are stretched exponential Our findings imply that –Client-server based proxy systems are not effective to deliver media contents –P2P systems are most suitable for this purpose We provide an analytical basis for the effectiveness of a P2P media content delivery infrastructure

39 Centralized Internet accesses follows zipf Decentralized Internet accesses (in an organized way, such as P2P) follow SE Other P2P-like accesses follow SE –Social networks –Instant messages, QQ, VoIP, … –Dictionary-type searches: Wikipedia, Yahoo answers SE distributions also exist in business and sciences. Stretched Exponential Distribution: Decentralized Content Delivery in Internet

40 References  The stretched exponential distribution, PODC ’ 08  PSM-throttling, streaming in WLAN with low power, ICNP’07  SCAP, wireless AP caching for streaming, ICDCS ’ 07.  Quality and resource utilization of Internet streaming, IMC ’ 06  Internet streaming workload analysis, WWW’05  Measuring and modeling BitTorrent, IMC’05  Sproxy, caching for streaming, INFOCOM’04

41 Caching effect on SE distribution Pages can be cached by web browser and proxy Page reload events can be accumulated over time –Trivial in one week –Increase with time gradually Number of affected objects is small –Movie replay events are not common IFILM-06 (1-16 weeks) Page clicks of movie trailers, published in IFILM Web site

42 After workload collection Before workload collection Client requests with time PS-CLT-04 (9 days) ST-CLT-04 (9 days)ST-CLT-05 (11 days) Cumulative number of requested objects born before and after workload collection In short duration, media reference rank distribution is stationary Both are linear N’(t)  t constant

43 Segment-based streaming media caching Streaming media are often partially accessed –Segment caching is efficient “Ideal” segment reference rank distribution –M segments per object, no partial access –Two mode SE distribution M

44 Segment-based streaming media caching Segmentation: 5 seconds of media data Segments rank distribution: two-mode SE –Same stretch factor c –Smaller a than object rank distribution Less temporal locality ST-SVR-01ST-CLT-04ST-CLT-05

45 Segment caching performance Segment hit ratio ≈ byte hit ratio –Much lower than object hit ratio LRU is less efficient for media caching –Less request concentration –Larger working set Segment LFU is even worse –Sequential order in an object not captured ST-CLT-04ST-CLT-05LRU efficiency Conventional replacement policies are not efficient  (i): minimum cache size to hold object i

46 file A file D Performance study of BitTorrent CCDF of peer arrival raw data linear fit Time after torrent birth (day) trace model time (hour) # of seeds constant arrival model C B D A Single torrent model: R’ fail = 0.1 R fail Multi-torrent model: R’ fail = R -6 fail

47 P2P-assisted proxy caching system Internet Media Server Media Proxy P2P Overlay Intranet DHT Proxy: reliability Peers: scalability