1 Peer-to-Peer Media Streaming Mohamed M. Hefeeda Advisor: Prof. Bharat Bhargava March 12, 2003.

Slides:



Advertisements
Similar presentations
Dynamic Replica Placement for Scalable Content Delivery Yan Chen, Randy H. Katz, John D. Kubiatowicz {yanchen, randy, EECS Department.
Advertisements

P2P data retrieval DHT (Distributed Hash Tables) Partially based on Hellerstein’s presentation at VLDB2004.
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan MIT and Berkeley presented by Daniel Figueiredo Chord: A Scalable Peer-to-peer.
Pastry Peter Druschel, Rice University Antony Rowstron, Microsoft Research UK Some slides are borrowed from the original presentation by the authors.
Peer-to-Peer Systems Chapter 25. What is Peer-to-Peer (P2P)? Napster? Gnutella? Most people think of P2P as music sharing.
Clayton Sullivan PEER-TO-PEER NETWORKS. INTRODUCTION What is a Peer-To-Peer Network A Peer Application Overlay Network Network Architecture and System.
On Large-Scale Peer-to-Peer Streaming Systems with Network Coding Chen Feng, Baochun Li Dept. of Electrical and Computer Engineering University of Toronto.
Search and Replication in Unstructured Peer-to-Peer Networks Pei Cao, Christine Lv., Edith Cohen, Kai Li and Scott Shenker ICS 2002.
Denial-of-Service Resilience in Peer-to-Peer Systems D. Dumitriu, E. Knightly, A. Kuzmanovic, I. Stoica and W. Zwaenepoel Presenter: Yan Gao.
Resilient Peer-to-Peer Streaming Paper by: Venkata N. Padmanabhan Helen J. Wang Philip A. Chou Discussion Leader: Manfred Georg Presented by: Christoph.
PROMISE: Peer-to-Peer Media Streaming Using CollectCast Mohamed Hafeeda, Ahsan Habib et al. Presented By: Abhishek Gupta.
1 On-Demand Media Streaming Over the Internet Mohamed M. Hefeeda Advisor: Prof. Bharat Bhargava October 16, 2002.
1 A Hybrid Architecture for Cost-Effective On-Demand Media Streaming Mohamed Hefeeda & Bharat Bhargava CS Dept, Purdue University Support: NSF, CERIAS.
11/4/2003ACM Multimedia 2003, Berkeley, CA1 A Framework for Cost-Effective Peer- to-Peer Content Distribution Mohamed Hefeeda Ph.D. Candidate Advisor:
Peer-to-Peer Networks as a Distribution and Publishing Model Jorn De Boever (june 14, 2007)
Mohamed Hefeeda 1 School of Computing Science Simon Fraser University, Canada ISP-Friendly Peer Matching without ISP Collaboration Mohamed Hefeeda (Joint.
Web Caching Schemes1 A Survey of Web Caching Schemes for the Internet Jia Wang.
ZIGZAG A Peer-to-Peer Architecture for Media Streaming By Duc A. Tran, Kien A. Hua and Tai T. Do Appear on “Journal On Selected Areas in Communications,
FRIENDS: File Retrieval In a dEcentralized Network Distribution System Steven Huang, Kevin Li Computer Science and Engineering University of California,
Peer-to-Peer Based Multimedia Distribution Service Zhe Xiang, Qian Zhang, Wenwu Zhu, Zhensheng Zhang IEEE Transactions on Multimedia, Vol. 6, No. 2, April.
A Trust Based Assess Control Framework for P2P File-Sharing System Speaker : Jia-Hui Huang Adviser : Kai-Wei Ke Date : 2004 / 3 / 15.
Efficient Content Location Using Interest-based Locality in Peer-to-Peer Systems Presented by: Lin Wing Kai.
Exploiting Content Localities for Efficient Search in P2P Systems Lei Guo 1 Song Jiang 2 Li Xiao 3 and Xiaodong Zhang 1 1 College of William and Mary,
CS 268: Project Suggestions Ion Stoica February 6, 2003.
1March -05 Jiangchuan Liu with Xinyan Zhang, Bo Li, and T.S.P.Yum Infocom 2005 CoolStreaming/DONet: A Data-Driven Overlay Network for Peer-to-Peer Live.
1 An Overlay Scheme for Streaming Media Distribution Using Minimum Spanning Tree Properties Journal of Internet Technology Volume 5(2004) No.4 Reporter.
Chord-over-Chord Overlay Sudhindra Rao Ph.D Qualifier Exam Department of ECECS.
CS218 – Final Project A “Small-Scale” Application- Level Multicast Tree Protocol Jason Lee, Lih Chen & Prabash Nanayakkara Tutor: Li Lao.
Topics in Reliable Distributed Systems Fall Dr. Idit Keidar.
A Framework for Cost-Effective Peer-to- Peer Content Distribution Mohamed Hefeeda and Bharat Bhargava Department of Computer Sciences Purdue University.
Peer-to-peer Multimedia Streaming and Caching Service by Won J. Jeon and Klara Nahrstedt University of Illinois at Urbana-Champaign, Urbana, USA.
Loopback: Exploiting Collaborative Caches for Large-Scale Streaming Ewa Kusmierek, Yingfei Dong, Member, IEEE, and David H. C. Du, Fellow, IEEE.
On-Demand Media Streaming Over the Internet Mohamed M. Hefeeda, Bharat K. Bhargava Presented by Sam Distributed Computing Systems, FTDCS Proceedings.
Tradeoffs in CDN Designs for Throughput Oriented Traffic Minlan Yu University of Southern California 1 Joint work with Wenjie Jiang, Haoyuan Li, and Ion.
Middleware for P2P architecture Jikai Yin, Shuai Zhang, Ziwen Zhang.
P2P File Sharing Systems
Peer-to-Peer Overlay Networks. Outline Overview of P2P overlay networks Applications of overlay networks Classification of overlay networks – Structured.
Distributed Systems Concepts and Design Chapter 10: Peer-to-Peer Systems Bruce Hammer, Steve Wallis, Raymond Ho.
“Intra-Network Routing Scheme using Mobile Agents” by Ajay L. Thakur.
Peer to Peer Research survey TingYang Chang. Intro. Of P2P Computers of the system was known as peers which sharing data files with each other. Build.
1 School of Computing Science Simon Fraser University CMPT 880: Internet Architectures and Protocols Introduction to Peer-to-Peer Systems Instructor: Dr.
Overcast: Reliable Multicasting with an Overlay Network CS294 Paul Burstein 9/15/2003.
Distributing Layered Encoded Video through Caches Authors: Jussi Kangasharju Felix HartantoMartin Reisslein Keith W. Ross Proceedings of IEEE Infocom 2001,
Resilient Peer-to-Peer Streaming Presented by: Yun Teng.
An Efficient Approach for Content Delivery in Overlay Networks Mohammad Malli Chadi Barakat, Walid Dabbous Planete Project To appear in proceedings of.
TOMA: A Viable Solution for Large- Scale Multicast Service Support Li Lao, Jun-Hong Cui, and Mario Gerla UCLA and University of Connecticut Networking.
Kiew-Hong Chua a.k.a Francis Computer Network Presentation 12/5/00.
Load-Balancing Routing in Multichannel Hybrid Wireless Networks With Single Network Interface So, J.; Vaidya, N. H.; Vehicular Technology, IEEE Transactions.
An IP Address Based Caching Scheme for Peer-to-Peer Networks Ronaldo Alves Ferreira Joint work with Ananth Grama and Suresh Jagannathan Department of Computer.
1 Peer-to-Peer Technologies Seminar by: Kunal Goswami (05IT6006) School of Information Technology Guided by: Prof. C.R.Mandal, School of Information Technology.
Peer to Peer A Survey and comparison of peer-to-peer overlay network schemes And so on… Chulhyun Park
Efficient P2P Search by Exploiting Localities in Peer Community and Individual Peers A DISC’04 paper Lei Guo 1 Song Jiang 2 Li Xiao 3 and Xiaodong Zhang.
A Utility-based Approach to Scheduling Multimedia Streams in P2P Systems Fang Chen Computer Science Dept. University of California, Riverside
PROP: A Scalable and Reliable P2P Assisted Proxy Streaming System Computer Science Department College of William and Mary Lei Guo, Songqing Chen, and Xiaodong.
Peer-to-Peer Media Streaming ZIGZAG - Ye Lin PROMISE – Chanjun Yang SASABE - Kung-En Lin.
CoopNet: Cooperative Networking
Algorithms and Techniques in Structured Scalable Peer-to-Peer Networks
Peer-to-Peer Systems: An Overview Hongyu Li. Outline  Introduction  Characteristics of P2P  Algorithms  P2P Applications  Conclusion.
LOOKING UP DATA IN P2P SYSTEMS Hari Balakrishnan M. Frans Kaashoek David Karger Robert Morris Ion Stoica MIT LCS.
INTERNET TECHNOLOGIES Week 10 Peer to Peer Paradigm 1.
P2P Search COP6731 Advanced Database Systems. P2P Computing  Powerful personal computer Share computing resources P2P Computing  Advantages: Shared.
P2P Search COP P2P Search Techniques Centralized P2P systems  e.g. Napster, Decentralized & unstructured P2P systems  e.g. Gnutella.
09/13/04 CDA 6506 Network Architecture and Client/Server Computing Peer-to-Peer Computing and Content Distribution Networks by Zornitza Genova Prodanoff.
Plethora: A Locality Enhancing Peer-to-Peer Network Ronaldo Alves Ferreira Advisor: Ananth Grama Co-advisor: Suresh Jagannathan Department of Computer.
Cost-Effective Video Streaming Techniques Kien A. Hua School of EE & Computer Science University of Central Florida Orlando, FL U.S.A.
Mohammad Malli Chadi Barakat, Walid Dabbous Alcatel meeting
Early Measurements of a Cluster-based Architecture for P2P Systems
EE 122: Peer-to-Peer (P2P) Networks
Overlay Networking Overview.
EE 122: Lecture 22 (Overlay Networks)
Presentation transcript:

1 Peer-to-Peer Media Streaming Mohamed M. Hefeeda Advisor: Prof. Bharat Bhargava March 12, 2003

2 Outline  Brief introduction to P2P  Scope/Objective  Current media streaming approaches  Proposed approach: P2P framework -Definitions, P2P model -Advantages and challenges  Architectures (realization of the model) -Hybrid Searching and dispersion algorithms -Pure P2P (in progress)  Evaluation -P2P model -Dispersion algorithm  Conclusions and future work

3 P2P Systems: Basic Definitions  Peers cooperate to achieve desired functions -Cooperate: share resources (CPU, storage, bandwidth), participate in the protocols (routing, replication, …) -Functions: file-sharing, distributed computing, communications, …  Examples -Gnutella, Napster, Freenet, OceanStore, CFS, CoopNet, SpreadIt, …  Well, aren’t they just distributed systems? -P2P == distributed systems?

4 P2P vs. Distributed Systems  P2P = distributed systems++; -Ad-hoc nature -Peers are not servers [Saroui et al., MMCN’02 ] Limited capacity and reliability -Much more dynamism -Scalability is a more serious issue (millions of nodes) -Peers are self-interested (selfish!) entities 70% of Gnutella users share nothing [Adar and Huberman ’00] -All kind of Security concerns Privacy, anonymity, malicious peers, … you name it!

5 P2P Systems: Rough Classification [Lv et al., ICS’02], [Yang et al., ICDCS’02]  Structured (or tightly controlled, DHT) +Files are rigidly assigned to specific nodes +Efficient search & guarantee of finding –Lack of partial name and keyword queries Ex.: Chord [Stoica et al., SIGCOMM’01], CAN [Ratnasamy et al., SIGCOMM’01], Pastry [Rowstron and Druschel, Middleware’01]  Unstructured (or loosely controlled) +Files can be anywhere +Support of partial name and keyword queries –Inefficient search (some heuristics exist) & no guarantee of finding Ex.: Gnutella  Hybrid (P2P + centralized), super peers notion) -Napster, KazaA

6 Scope/Objective  A media streaming service (video on demand) that: -Provides good quality -To a large number of clients -In a cost-effective manner  Main focus is on media distribution (or communication aspects)  Media storage and encoding/decoding techniques are orthogonal to our work.

7  Terminologies -Content provider -Clients -Third party (delivery)  Two broad categories -Direct approach Content provider  clients -Third-party approach Content provider  delivery network  clients Classification of the Current Streaming Approaches

8 Direct Approach  Content provider deploys and manages a powerful server or a set of servers/caches

9 Direct Approach (cont’d)  Problems -Limited scalability -Reliability concerns -High deployment cost $$$…..$  Note: -A server with T3 link (~45 Mb/s) supports up to 45 concurrent users at 1Mb/s!

10 Third-Party Approach  Third-party or Content Delivery Network (CDN) -Deploy thousands of servers at the “edge” of the Internet; mainly at POPs of major ISPs (AT&T, Sprint, …) (Akamai deploys 10,000+ servers) [Akamai white paper] -“Edge” of the Internet  Contents close to clients Better performance and less load on the backbone -Proprietary protocols to Distribute contents over servers (caches) Monitor traffic situation in the Internet Direct clients to “most” suitable cache

11 Third-Party Approach (cont’d)

12 Third-Party Approach (cont’d)  Pros -Good performance (short delay, more reliability, …) -Suitable for web pages with moderate-size objects (images, video clips, documents, etc.)  Cons -Co$t: CDN charges for every megabyte served!  -Not suitable for VoD service; movies are quite large (~Gbytes)  Note: [Raczkowski’02, white paper] -Cost ranges from 0.25 to 2 cents/MByte, depending on bandwidth consumed per month -For a one-hour movie streamed to 1,000 clients, content provider pays $264 to CDN (at 0.5 cents/MByte)!

13 Potential Solution: P2P Model  Idea -Clients (peers) share some of their spare resources (BW, storage) with each other -Result: combine enormous amount of resources into one pool  significantly amplifies system capacity -Why should peers cooperate? [Saroui et al., MMCN’02 ] They get benefits too! Incentives: e.g., lower rates [Cost-profit analysis, Hefeeda et al., TR’02]

14 P2P Model Proposed P2P model Peers Seeding peers Stream Media files Entities

15 P2P Model: Entities  Peers -Supplying peers Currently caching and willing to provide some segments Level of cooperation; every peer P x specifies:  G x (Bytes),  R x (Kb/s),  C x (Concurrent connections) -Requesting peers  Seeding peers -One (or a subset) of the peers seeds the new media into the system -Seed  stream to a few other peers for a limited duration

16 P2P Model: Entities (cont'd)  Stream -Time-ordered sequence of packets  Media file -Recorded at R Kb/s (CBR) -Composed of N equal-length segments -A segment is the minimum unit to be cached by a peer -A segment can be obtained from several peers at the same time (different piece from each)

17 P2P Model: Advantages  Cost effectiveness -For both supplier and clients -Initial results in [Hefeeda et al., TR’02] -On-going work in cooperation with Professor Philipp Afeche (Kellogg School of Management, Northwestern University) to: Develop more formal economic models Design incentive schemes Design pricing schemes  Ease of deployment -No need to change the network (routers) -A piece of software on the client’s machine

18 P2P Model: Advantages (cont'd)  Robustness -High degree of redundancy -Reduce (gradually eliminate) the role of the seeding server  Support for large number of clients -Capacity More peers join  more resources  larger capacity -Network Save downstream bandwidth; get the request from a nearby peer Contents are even closer to the clients (within the same domain!)

19 P2P Model: Challenges  Searching -Find peers who have the requested file  Dispersion -Efficiently disseminate the media files into the system  Maintaining comparable quality -Given a dynamic set of candidate senders, design a Distributed Streaming protocol that ensures the full quality of play back at the receiver  Robustness -Handle node failures and network fluctuations  Security -Malicious peers, free riders, …

20 Realization of the P2P Model  Two architectures to realize the abstract model  Hybrid [Hefeeda et al., FTDCS’03; submitted to J. Com. Net.] -P2P streaming + index-assisted searching/dispersion  Pure P2P -Peers form an overlay layer over the physical network -Built on top of a P2P substrate such as Pastry [Rowstron and Druschel, Middleware 2001] -On-going work

21 Hybrid Architecture  Streaming is P2P; searching and dispersion are server-assisted  Index server facilitates the searching process and reduces the overhead associated with it  Suitable for a commercial service -Need server to charge/account anyway, and -Faster to deploy  Seeding servers may maintain the index as well (especially, if commercial)

22 Hybrid Architecture: Searching  Requesting peer, P x -Send a request to the index server:  Index server -Find peers who have segments of fileID AND close to P x - close in terms of network hops  Traffic traverses fewer hops, thus Reduced load on the backbone Less susceptible to congestion Short and less variable delays (smaller delay jitter)  Clustering idea [Krishnamurthy et al., SIGCOMM’00]

23 Hybrid Architecture: Peers Clustering  A cluster is: -A logical grouping of clients that are topologically close and likely to be within the same network domain  Clustering Technique -Get routing tables from core BGP routers -Clients with IP’s having the same longest prefix with one of the entries are assigned the same cluster ID -Example: Domains: /16 (purdue), /16 (cmu) Peers: , , , ,

24 Hybrid Architecture: Dispersion  Objective -Store enough copies of the media file in each cluster to serve all expected requests from that cluster -We assume that peers get monetary incentives from the provider to store and stream to other peers  Questions -Should a peer cache? And if so, -Which segments?  Illustration (media file with 2 segments) -Caching 90 copies of segment 1 and only 10 copies of segment 2  10 effective copies -Caching 50 copies of segment 1 and 50 copies of segment 2  50 effective copies

25 Hybrid Architecture: Dispersion (cont'd)  Dispersion Algorithm (basic idea): -/* Upon getting a request from P y to cache N y segments */ -C  getCluster (P y ) -Compute available (A) and required (D) capacities in cluster C -If A < D P y caches N y segments in a cluster-wide round robin fashion (CWRR) –All values are smoothed averages –Average available capacity in C: –CWRR Example: (10-segment file) P 1 caches 4 segments: 1,2,3,4 P 2 then caches 7 segments: 5,6,7,8,9,10,1

26 Hybrid Architecture: Client Protocol  Building blocks of the protocol to be run by a requesting peer  Three phases -Availability check -Streaming -Caching

27 Hybrid Architecture: Client Protocol (cont’d)  Phase I: Availability check (who has what) -Search for peers that have segments of the requested file P j -Arrange the collected data into a 2-D table, row j contains all peers P j willing to provide segment j -Sort every row based on network proximity -Verify availability of all the N segments with the full rate R:

28 Hybrid Architecture: Client Protocol (cont'd)  Phase II: Streaming t j = t j-1 + δ /* δ: time to stream a segment */ For j = 1 to N do At time t j, get segment s j as follows: P jConnect to every peer P x in P j (in parallel) and Download from byte b x-1 to b x -1 Note: b x = |s j | R x /R Example: P 1, P 2, and P 3 serving different pieces of the same segment to P 4 with different rates

29 Hybrid Architecture: Client Protocol (cont'd)  Phase III: Caching -Store some segments -Determined by the dispersion algorithm, and -Peer’s level of cooperation

30 Evaluation Through Simulation  Performance of the hybrid architecture -Under several client arrival patterns (constant rate, flash crowd, Poisson) and different levels of peer cooperation -Performance measures Overall system capacity, Average waiting time, Average number of served (rejected) requests, and Load/Role on the seeding server  Performance of the dispersion algorithm -Compare against random dispersion algorithm

31 Simulation: Topology –Large (more than 13,000 nodes) –Hierarchical (Internet-like) –Used GT-ITM and ns-2

32 Hybrid Architecture Evaluation  Topology details -20 transit domains, 200 stub domains, 2,100 routers, and a total of 11,052 end hosts  Scenario -A seeding server with limited capacity (up to 15 clients) introduces a movie -Clients request the movie according to the simulated arrival pattern -Client protocol is applied  Fixed parameters -Media file of 20 min duration, divided into 20 one-min segments, and recorded at 100 Kb/s (CBR)

33 Hybrid Architecture Evaluation (cont'd) Constant rate arrivals: waiting time ─ Average waiting time decreases as the time passes It decreases faster with higher caching percentages

34 Hybrid Architecture: Evaluation (cont'd) Constant rate arrivals: service rate ─ Capacity is rapidly amplified All requests are satisfied after 250 minutes with 50% caching ─ Q: Given a target arrival rate, what is the appropriate caching%? When is the steady state? Ex.: 2 req/min  30% sufficient, steady state within 5 hours

35 Hybrid Architecture: Evaluation (cont'd) Constant rate arrivals: rejection rate ─ Rejection rate is decreasing with time No rejections after 250 minutes with 50% caching ─ Longer warm up period is needed for smaller caching percentages

36 Hybrid Architecture: Evaluation (cont'd) Constant rate arrivals: load on the seeding server ─ The role of the seeding server is diminishing For 50%: After 5 hours, we have 100 concurrent clients (6.7 times original capacity) and none of them is served by the seeding server

37 Hybrid Architecture: Evaluation (cont'd) Flash crowd arrivals: waiting time ─ Flash crowd arrivals  surge increase in client arrivals ─ Waiting time is zero even during the peak (with 50% caching)

38 Hybrid Architecture: Evaluation (cont'd) Flash crowd arrivals: service rate ─ All clients are served with 50% caching ─ Smaller caching percentages need longer warm up periods to fully handle the crowd

39 Hybrid Architecture: Evaluation (cont'd) Flash crowd arrivals: rejection rate ─ No clients turned away with 50% caching

40 Hybrid Architecture: Evaluation (cont'd) Flash crowd arrivals: load on the seeding server ─ The role of the seeding server is still just seeding During the peak, we have 400 concurrent clients (26.7 times original capacity) and none of them is served by the seeding server (50% caching)

41 Dispersion Algorithm: Evaluation  Topology details -100 transit domains, 400 stub domains, 2,400 routes, and a total of 12,021 end hosts Distribute clients over a wider range  more stress on the dispersion algorithm  Compare against a random dispersion algorithm -No other dispersion algorithms fit our model  Comparison criterion -Average number of network hops traversed by the stream  Vary the caching percentage from 5% to 90% -Smaller cache %  more stress on the algorithm

42 Dispersion Algorithm: Evaluation (cont'd) ─ Avg. number of hops: 8.05 hops (random), 6.82 hops (ours)  15.3% savings ─ For a domain with a 6-hop diameter: Random: 23% of the traffic was kept inside the domain Cluster-based: 44% of the traffic was kept inside the domain 5% caching

43 Dispersion Algorithm: Evaluation (cont'd) ─ As the caching percentage increases, the difference decreases; peers cache most of the segments, hence no room for enhancement by the dispersion algorithm 10% caching

44 Conclusions  Presented a new model for on-demand media streaming  Proposed two architectures to realize the model -Hybrid and Pure P2P  Presented dispersion and searching algorithms  Through large-scale simulation, we showed that -Our model successfully supports large number of clients Arriving to the system with various distributions, including flash crowds -Our dispersion algorithm pushes the contents close to the clients (within the same domain)  Reduces number of hops traversed by the stream and the load on the network

45 Future Work  Work out the details of the overlay approach  Address the reliability and security challenges  Develop a detailed cost-profit model for the P2P architecture to show its cost effectiveness compared to the conventional approaches  Implement a system prototype and study other performance metrics, e.g., delay, delay jitter, and loss rate  Enhance the proposed algorithms and formally analyze them

46

47 P2P: File-sharing vs. Streaming  File-sharing -Download the entire file first, then use it -Small files (few Mbytes)  short download time -A file is stored by one peer  one connection -No timing constraints  Streaming -Consume (playback) as you download -Large files (few Gbytes)  long download time -A file is stored by multiple peers  several connections -Timing is crucial

48 Current Streaming Approaches (cont'd)  P2P approaches -SpreadIt [Deshpande et al., Stanford TR’01] Live media  Build application-level multicast distribution tree over peers -CoopNet [Padmanabhan et al., NOSSDAV’02 and IPTPS’02] Live media  Builds application-level multicast distribution tree over peers On-demand  Server redirects clients to other peers  Assumes a peer can (or is willing to) support the full rate  CoopNet does not address the issue of quickly disseminating the media file

49 Current Streaming Approaches (cont'd)  Distributed caches [e.g., Chen and Tobagi, ToN’01 ] -Deploy caches all over the place -Yes, increases the scalability Shifts the bottleneck from the server to caches! -But, it also multiplies cost -What to cache? And where to put caches?  Multicast -Mainly for live media broadcast -Application level [Narada, NICE, Scattercast, … ] Efficient? -IP level [e.g., Dutta and Schulzrine, ICC’01] Widely deployed?