Published by Solomon Dennis. Modified over 9 years ago.
1
Content Distribution
March 6, 2012
2: Application Layer
2
Contents
- P2P architecture and benefits
- P2P content distribution
- Content distribution network (CDN)
3
Pure P2P architecture
- no always-on server
- arbitrary end systems directly communicate
- peers are intermittently connected and change IP addresses
- Three topics:
  - File distribution
  - Searching for information
  - Case Study: Skype
[Figure: peer-peer communication]
4
File Distribution: Server-Client vs P2P
Question: How much time to distribute file from one server to N peers?
[Figure: server and peers 1..N attached to a network with abundant bandwidth; file of size F]
- $u_s$: server upload bandwidth
- $u_i$: peer i upload bandwidth
- $d_i$: peer i download bandwidth
5
File distribution time: server-client
- server sequentially sends N copies: $NF/u_s$ time
- client i takes $F/d_i$ time to download
Time to distribute F to N clients using client/server approach:
$d_{cs} = \max\{\, NF/u_s,\ F/\min_i d_i \,\}$
This increases linearly w.r.t. N (for large N).
6
File distribution time: P2P
- server must send one copy: $F/u_s$ time
- client i takes $F/d_i$ time to download
- NF bits must be downloaded (aggregate)
- fastest possible upload rate: $u_s + \sum_i u_i$
$d_{P2P} = \max\{\, F/u_s,\ F/\min_i d_i,\ NF/(u_s + \sum_i u_i) \,\}$
7
Server-client vs. P2P: example
Client upload rate = $u$, $F/u$ = 1 hour, $u_s = 10u$, $d_{min} \ge u_s$
Client-server ~ $NF/u_s$ vs. P2P ~ $NF/(u_s + \sum_i u_i)$
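The two bounds can be evaluated numerically for the example parameters. The sketch below is illustrative only: the function names are hypothetical, rates are normalized so that u = 1 and F = 1 (hence F/u = 1 hour), and it assumes every peer uploads at rate u and downloads at exactly d_min = u_s.

```python
def client_server_time(F, u_s, downloads):
    """Lower bound d_cs = max{ N*F/u_s, F/min(d_i) }."""
    N = len(downloads)
    return max(N * F / u_s, F / min(downloads))


def p2p_time(F, u_s, uploads, downloads):
    """Lower bound d_P2P = max{ F/u_s, F/min(d_i), N*F/(u_s + sum(u_i)) }."""
    N = len(downloads)
    return max(F / u_s, F / min(downloads), N * F / (u_s + sum(uploads)))


if __name__ == "__main__":
    u = 1.0          # client upload rate (normalized)
    F = 1.0          # file size, chosen so that F/u = 1 hour
    u_s = 10 * u     # server upload rate
    d_min = u_s      # assumption: slowest download rate equals u_s
    for N in (5, 10, 20, 30):
        cs = client_server_time(F, u_s, [d_min] * N)
        p2p = p2p_time(F, u_s, [u] * N, [d_min] * N)
        print(f"N={N:2d}  client-server={cs:.2f} h  P2P={p2p:.2f} h")
```

Under these assumptions, N = 30 gives 3.00 hours for client-server and 0.75 hours for P2P: the client-server bound grows linearly in N, while the P2P bound stays below F/u = 1 hour.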
8
Contents
- P2P architecture and benefits
- P2P content distribution
- Content distribution network (CDN)
9
P2P content distribution issues
- Issues
  - Group management and data search
  - Reliable and efficient file exchange
  - Security/privacy/anonymity/trust
- Approaches for group management and data search (i.e., who has what?)
  - Centralized (e.g., BitTorrent tracker)
  - Unstructured (e.g., Gnutella)
  - Structured (Distributed Hash Tables [DHT])
10
Centralized model (Napster)
original “Napster” design
1) when peer connects, it informs central server:
   - IP address
   - content
2) Alice queries for “Hey Jude”; server notifies that Bob has the file
3) Alice requests file from Bob
[Figure: centralized directory server and peers, including Alice and Bob. Q: “Hey Jude” A: Bob has it]
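The three steps can be sketched with an in-memory directory. This is a toy model rather than the Napster protocol: the class name, method names, and IP address are hypothetical, and the actual file transfer in step 3 is left out because it happens directly between peers.

```python
from collections import defaultdict


class Directory:
    """Toy centralized directory: maps a content title to the peers holding it."""

    def __init__(self):
        self._index = defaultdict(set)

    def register(self, peer_ip, titles):
        # Step 1: a connecting peer reports its IP address and its content.
        for title in titles:
            self._index[title].add(peer_ip)

    def unregister(self, peer_ip):
        # Peers are intermittently connected, so entries must be removable.
        for holders in self._index.values():
            holders.discard(peer_ip)

    def query(self, title):
        # Step 2: the server answers "who has it?" with a list of peer IPs.
        return sorted(self._index.get(title, ()))


if __name__ == "__main__":
    directory = Directory()
    directory.register("10.0.0.2", ["Hey Jude"])   # Bob (hypothetical address)
    print(directory.query("Hey Jude"))             # Alice then contacts Bob directly
```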
11
Centralized model
[Figure: peers Bob, Alice, Jane, Judy]
file transfer is decentralized, but locating content is highly centralized
12
Centralized model
- Benefits:
  - Low per-node state
  - Limited bandwidth usage
  - Short search time
  - High success rate
  - Fault tolerant
- Drawbacks:
  - Single point of failure
  - Limited scale
  - Possibly unbalanced load
- copyright infringement (?)
[Figure: peers Bob, Alice, Jane, Judy]
13
File distribution: BitTorrent
- P2P file distribution
- tracker: tracks peers participating in torrent
- torrent: group of peers exchanging chunks of a file
[Figure: a peer obtains a list of peers, then trades chunks]
14
BitTorrent (1)
- file divided into 256KB chunks.
- peer joining torrent:
  - has no chunks, but will accumulate them over time
  - registers with tracker to get list of peers, connects to subset of peers (“neighbors”)
- while downloading, peer uploads chunks to other peers.
- peers may come and go
- once peer has entire file, it may (selfishly) leave or (altruistically) remain
15
BitTorrent (2)
Pulling Chunks
- at any given time, different peers have different subsets of file chunks
- periodically, a peer (Alice) asks each neighbor for a list of chunks that it has.
- Alice sends requests for her missing chunks
  - rarest first
Sending Chunks: tit-for-tat
- Alice sends chunks to four neighbors currently sending her chunks at the highest rate
  - re-evaluate top 4 every 10 secs
- every 30 secs: randomly select another peer, start sending it chunks (“optimistically unchoke”)
  - newly chosen peer may join top 4
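Rarest-first ordering can be sketched as a count over the chunk lists reported by neighbors. The function name, peer names, and chunk numbers below are hypothetical, and ties are broken by chunk index purely to make the result deterministic; real clients typically randomize among equally rare chunks.

```python
from collections import Counter


def rarest_first(have, neighbor_chunks):
    """Order the chunks we are missing by how few neighbors hold them.

    have:            set of chunk indices this peer already holds
    neighbor_chunks: dict mapping neighbor name -> set of chunk indices it holds
    """
    counts = Counter()
    for chunks in neighbor_chunks.values():
        counts.update(chunks)
    missing = [c for c in counts if c not in have]
    # Fewest holders first; chunk index as a deterministic tie-break.
    return sorted(missing, key=lambda c: (counts[c], c))


if __name__ == "__main__":
    neighbors = {"bob": {0, 1, 2}, "carol": {1, 3}, "dave": {1, 2}}
    print(rarest_first({0}, neighbors))  # chunk 3 has one holder, so it is requested first
```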
16
BitTorrent: Tit-for-tat
(1) Alice “optimistically unchokes” Bob
(2) Alice becomes one of Bob’s top-four providers; Bob reciprocates
(3) Bob becomes one of Alice’s top-four providers
With higher upload rate, can find better trading partners & get file faster!
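One round of the unchoking decision can be sketched as follows. This is a simplified model, not the BitTorrent wire protocol: the function name and rate values are hypothetical, and the timing (top four re-evaluated every 10 seconds, optimistic unchoke every 30 seconds) is left to the caller.

```python
import random


def choose_unchoked(rates, top_n=4, rng=random):
    """Pick the peers to upload to in one tit-for-tat round.

    rates: dict mapping neighbor name -> rate at which it is currently sending us chunks
    Returns (top, optimistic): the top_n fastest senders, plus one randomly
    chosen other neighbor (or None if there is no other neighbor).
    """
    ranked = sorted(rates, key=lambda p: (-rates[p], p))
    top = ranked[:top_n]
    rest = ranked[top_n:]
    optimistic = rng.choice(rest) if rest else None
    return top, optimistic


if __name__ == "__main__":
    rates = {"bob": 5, "carol": 50, "dave": 40, "erin": 30, "frank": 20, "gina": 10}
    top, optimistic = choose_unchoked(rates)
    print("top four:", top, "optimistic unchoke:", optimistic)
```

If the optimistically unchoked peer reciprocates at a high enough rate, it appears in `top` in a later round, which is the mechanism steps (1) to (3) describe.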
17
P2P Case study: Skype
- inherently P2P: pairs of users communicate.
- proprietary application-layer protocol (inferred via reverse engineering)
- hierarchical overlay with super nodes (SNs)
- index maps usernames to IP addresses; distributed over SNs
[Figure: Skype clients (SC), supernodes (SN), Skype login server]
18
Peers as relays
- Problem when both Alice and Bob are behind “NATs”.
  - NAT prevents an outside peer from initiating a call to an inside peer
- Solution:
  - Using Alice’s and Bob’s SNs, a relay is chosen
  - Each peer initiates a session with the relay.
  - Peers can now communicate through NATs via the relay
19
Contents
- P2P architecture and benefits
- P2P content distribution
- Content distribution network (CDN)
20
Why Content Networks?
- More hops between client and Web server → more congestion!
- Same data flowing repeatedly over links between clients and Web server
[Figure: server S and clients C1, C2, C3, C4 connected through IP routers]
Slides from http://www.cis.udel.edu/~iyengar/courses/Overlays.ppt
21
Why Content Networks?
- Origin server is bottleneck as number of users grows
- Flash Crowds (for instance, Sept. 11)
- The Content Distribution Problem: Arrange a rendezvous between a content source at the origin server (www.cnn.com) and a content sink (us, as users)
Slides from http://www.cis.udel.edu/~iyengar/courses/Overlays.ppt
22
Example: Web Server Farm
- Simple solution to the content distribution problem: deploy a large group of servers
- Arbitrate client requests to servers using an “intelligent” L4-L7 switch
- Pretty widely used today
[Figure: requests from grad.umd.edu and ren.cis.udel.edu pass through an L4-L7 switch to www.cnn.com (Copy 1), (Copy 2), and (Copy 3)]
23
Example: Caching Proxy
- Mainly motivated by ISP business interests: reducing the bandwidth the ISP consumes from the Internet
- Reduced network traffic
- Reduced user-perceived latency
[Figure: clients ren.cis.udel.edu and merlot.cis.udel.edu, interceptors, and proxy inside the ISP; TCP port 80 traffic and other traffic; www.cnn.com on the Internet]
24
But on Sept. 11, 2001
[Figure: Web server www.cnn.com with new content (WTC news); user mslab.kaist.ac.kr and 1,000,000 other hosts sending requests; caching proxy in the ISP holding old content; congestion / bottleneck]
25
Problems with discussed approaches: Server farms and Caching proxies
- Server farms do nothing about problems due to network congestion
- Caching proxies serve only their clients, not all users on the Internet
- Content providers (say, Web servers) cannot rely on existence and correct implementation of caching proxies
- Accounting issues with caching proxies. For instance, www.cnn.com needs to know the number of hits to the webpage for advertisements displayed on the webpage
26
Again on Sept. 11, 2001 with CDN
[Figure: Web server www.cnn.com with new content (WTC news); distribution infrastructure feeding surrogates in FL, IL, DE, NY, MA, MI, CA, WA; user mslab.kaist.ac.kr and 1,000,000 other users request and receive new content]
27
Web replication - CDNs
- Overlay network to distribute content from origin servers to users
- Avoids large amount of same data repeatedly traversing potentially congested links on the Internet
- Reduces Web server load
- Reduces user-perceived latency
- Tries to route around congested networks
28
CDN vs. Caching Proxies
- Caches are used by ISPs to reduce bandwidth consumption; CDNs are used by content providers to improve quality of service to end users
- Caches are reactive; CDNs are proactive
- Caching proxies cater to their users (web clients) and not to content providers (web servers); CDNs cater to the content providers (web servers) and clients
- CDNs give control over the content to the content providers; caching proxies do not
29
CDN Architecture
[Figure: origin server, client, and CDN consisting of surrogates, request routing infrastructure, and distribution & accounting infrastructure]
30
CDN Components
- Distribution Infrastructure: Moving or replicating content from content source (origin server, content provider) to surrogates
- Request Routing Infrastructure: Steering or directing content request from a client to a suitable surrogate
- Content Delivery Infrastructure: Delivering content to clients from surrogates
- Accounting Infrastructure: Logging and reporting of distribution and delivery activities
31
Server Interaction with CDN
1. Distribution infrastructure: origin server pushes new content to CDN, OR CDN pulls content from origin server
2. Accounting infrastructure: origin server requests logs and other accounting info from CDN, OR CDN provides logs and other accounting info to origin server
[Figure: origin server www.cnn.com and CDN]
32
Client Interaction with CDN
1. Client to request routing infrastructure: Hi! I need www.cnn.com/sept11
2. Request routing infrastructure to client: Go to surrogate newyork.cnn.akamai.com
3. Client to surrogate: Hi! I need content /sept11
Q: How did the CDN choose the New York surrogate over the California surrogate?
[Figure: client, request routing infrastructure, and CDN with Surrogate (NY) newyork.cnn.akamai.com and Surrogate (CA) california.cnn.akamai.com]
33
Request Routing Techniques
- Request routing techniques use a set of metrics to direct users to “best” surrogate
- Proprietary, but underlying techniques known:
  - DNS based request routing
  - Content modification (URL rewriting)
  - Anycast based (how common is anycast?)
  - URL based request routing
  - Transport layer request routing
  - Combination of multiple mechanisms
34
DNS based Request-Routing
- Common due to the ubiquity of DNS as a directory service
- Specialized DNS server inserted in the DNS resolution process
- DNS server is capable of returning a different set of A, NS or CNAME records based on policies/metrics
35
DNS based Request-Routing
[Figure: client test.nyu.edu (128.4.30.15) sends a DNS query for www.cnn.com to its local DNS server dns.nyu.edu (128.4.4.12); the query reaches the Akamai DNS, which returns the DNS response A 145.155.10.15; the client then opens a session with that surrogate. The Akamai CDN contains surrogates 145.155.10.15 and 58.15.100.152, labelled newyork.cnn.akamai.com and california.cnn.akamai.com]
Q: How does the Akamai DNS know which surrogate is closest?
36
DNS based Request-Routing
[Figure: client test.nyu.edu (128.4.30.15) and local DNS server dns.nyu.edu (128.4.4.12) send a DNS query to the Akamai DNS; surrogates in the Akamai CDN measure to the client DNS and report the measurement results to the Akamai DNS]
37
DNS based Request-Routing
[Figure: client 76.43.35.53 with client DNS 76.43.32.4; Akamai CDN with Akamai DNS and surrogates 145.155.10.15 and 58.15.100.152]
Measurement results for requesting DNS 76.43.32.4:
- Available bandwidth = 10 kbps, RTT = 10 ms
- Available bandwidth = 5 kbps, RTT = 100 ms
Selection: requesting DNS 76.43.32.4 → surrogate 145.155.10.15
Response: www.cnn.com A 145.155.10.15, TTL = 10s
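The selection step can be sketched as a lookup over per-surrogate measurements. Real CDN policies are proprietary, so everything here is an assumption: the "lowest RTT, then highest bandwidth" rule, the pairing of each measurement with a surrogate address, and all function and class names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    surrogate_ip: str
    rtt_ms: float
    bandwidth_kbps: float


def pick_surrogate(measurements):
    """Assumed policy: lowest RTT wins; higher available bandwidth breaks ties."""
    best = min(measurements, key=lambda m: (m.rtt_ms, -m.bandwidth_kbps))
    return best.surrogate_ip


def dns_answer(name, measurements, ttl_s=10):
    """Build the A record the CDN's DNS server would return for this requesting DNS."""
    return {
        "name": name,
        "type": "A",
        "address": pick_surrogate(measurements),
        "ttl": ttl_s,  # short TTL so the choice can be revisited soon
    }


if __name__ == "__main__":
    # Measurements toward requesting DNS 76.43.32.4; which surrogate produced
    # which measurement is an assumption made for this example.
    results = [
        Measurement("145.155.10.15", rtt_ms=10, bandwidth_kbps=10),
        Measurement("58.15.100.152", rtt_ms=100, bandwidth_kbps=5),
    ]
    print(dns_answer("www.cnn.com", results))
```

Note that the measurements are taken toward the requesting DNS server, not toward the client itself, so the answer is only as good as the assumption that the two are near each other.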
38
DNS based Request Routing: Discussion
- Originator Problem: Client may be far removed from client DNS
- Client DNS Masking Problem: Virtually all DNS servers, except for root DNS servers, honor requests for recursion
  - Q: Which DNS server resolves a request for test.nyu.edu?
  - Q: Which DNS server performs the last recursion of the DNS request?
- Hidden Load Factor: A DNS resolution may result in drastically different load on the selected surrogate, which is an issue when load balancing requests and predicting load on surrogates
39
Summary
- P2P architecture and its benefits
- P2P content distribution
  - BitTorrent, Skype
- Content distribution network (CDN)
  - DNS-based request routing
40
Distributed Hash Table (DHT)
- DHT = distributed P2P database
- Database has (key, value) pairs;
  - key: ss number; value: human name
  - key: content type; value: IP address
- Peers query DB with key
  - DB returns values that match the key
- Peers can also insert (key, value) pairs
41
DHT Identifiers
- Assign integer identifier to each peer in range $[0, 2^n - 1]$.
  - Each identifier can be represented by n bits.
- Require each key to be an integer in same range.
- To get integer keys, hash original key.
  - e.g., key = h(“Led Zeppelin IV”)
  - This is why they call it a distributed “hash” table
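Mapping an original key into the n-bit identifier range can be sketched with a standard hash. The slides do not say which hash function h is, so the use of SHA-1 here is an assumption, as is the function name `key_id`.

```python
import hashlib


def key_id(original_key, n_bits):
    """Hash a string key to an integer in [0, 2**n_bits - 1].

    SHA-1 is an assumed choice of h; any well-mixed hash would serve.
    """
    digest = hashlib.sha1(original_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** n_bits)


if __name__ == "__main__":
    print(key_id("Led Zeppelin IV", n_bits=4))  # some integer between 0 and 15
```

Because every peer applies the same hash, any peer can compute the identifier of a key locally before asking the overlay who is responsible for it.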
42
How to assign keys to peers?
- Central issue:
  - Assigning (key, value) pairs to peers.
- Rule: assign key to the peer that has the closest ID.
- Convention in lecture: closest is the immediate successor of the key.
- Ex: n = 4; peers: 1, 3, 4, 5, 8, 10, 12, 14;
  - key = 13, then successor peer = 14
  - key = 15, then successor peer = 1
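The successor rule can be checked against the example with a few lines of code. The function name is hypothetical, and treating a key that equals a peer ID as belonging to that peer is an assumption the slide does not spell out.

```python
import bisect


def successor(peers, key):
    """Return the peer responsible for key: the first peer ID >= key,
    wrapping around to the smallest ID when key is past the largest peer."""
    ring = sorted(peers)
    i = bisect.bisect_left(ring, key)
    return ring[i % len(ring)]


if __name__ == "__main__":
    peers = [1, 3, 4, 5, 8, 10, 12, 14]
    print(successor(peers, 13))  # 14
    print(successor(peers, 15))  # 1 (wraps around the identifier space)
```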
43
Chord (a circular DHT) (1)
- Each peer only aware of immediate successor and predecessor.
- “Overlay network”
[Figure: ring of peers 1, 3, 4, 5, 8, 10, 12, 15]
44
Chord (a circular DHT) (2)
[Figure: ring of peers 0001, 0011, 0100, 0101, 1000, 1010, 1100, 1111; the query “Who’s resp for key 1110?” is passed around the ring until a peer answers “I am”]
- O(N) messages on avg to resolve query, when there are N peers
- Define closest as closest successor
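Successor-only lookup can be simulated by walking the ring and counting forwarded messages. This sketch assumes global knowledge of the ring only to drive the simulation; the starting peer (3) is an assumption, and the reply back to the requester is not counted.

```python
def between(key, lo, hi):
    """True if key lies in the ring interval (lo, hi], allowing wraparound."""
    if lo < hi:
        return lo < key <= hi
    return key > lo or key <= hi


def ring_lookup(peers, start, key):
    """Forward the query from successor to successor until it reaches the
    peer responsible for key. Returns (responsible_peer, messages_forwarded)."""
    ring = sorted(peers)
    idx = ring.index(start)
    messages = 0
    while True:
        peer = ring[idx]
        predecessor = ring[idx - 1]       # index -1 wraps to the largest ID
        if between(key, predecessor, peer):
            return peer, messages
        idx = (idx + 1) % len(ring)       # hand the query to the immediate successor
        messages += 1


if __name__ == "__main__":
    peers = [1, 3, 4, 5, 8, 10, 12, 15]
    print(ring_lookup(peers, start=3, key=0b1110))  # (15, 6)
```

With N peers the query may have to visit a constant fraction of the ring, which is where the O(N) average comes from.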
45
Chord (a circular DHT) with Shortcuts
- Each peer keeps track of IP addresses of predecessor, successor, and shortcuts.
- Reduced from 6 to 2 messages.
- Possible to design shortcuts so O(log N) neighbors, O(log N) messages in query
[Figure: ring of peers 1, 3, 4, 5, 8, 10, 12, 15 with shortcut links; query “Who’s resp for key 1110?”]
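One way to obtain O(log N) shortcuts is a Chord-style finger table, where peer p keeps a shortcut to the successor of p + 2^i for each i below the identifier width. The slide does not say which shortcuts its figure uses, so the finger-table construction, the starting peer (3), and the function names below are all assumptions.

```python
import bisect


def between(key, lo, hi):
    """True if key lies in the ring interval (lo, hi], allowing wraparound."""
    if lo < hi:
        return lo < key <= hi
    return key > lo or key <= hi


def finger_lookup(peers, start, key, n_bits):
    """Route a query using Chord-style fingers (assumed shortcut design).
    Returns (responsible_peer, messages_forwarded)."""
    ring = sorted(peers)
    size = 2 ** n_bits

    def successor_of(x):
        i = bisect.bisect_left(ring, x % size)
        return ring[i % len(ring)]

    def predecessor_of(p):
        return ring[ring.index(p) - 1]

    peer, messages = start, 0
    while not between(key, predecessor_of(peer), peer):
        fingers = [successor_of(peer + 2 ** i) for i in range(n_bits)]
        # Shortcuts that do not overshoot the key; take the farthest one.
        usable = [f for f in fingers if between(f, peer, key)]
        if usable:
            peer = max(usable, key=lambda f: (f - peer) % size)
        else:
            peer = fingers[0]  # immediate successor, which must be responsible
        messages += 1
    return peer, messages


if __name__ == "__main__":
    peers = [1, 3, 4, 5, 8, 10, 12, 15]
    print(finger_lookup(peers, start=3, key=0b1110, n_bits=4))  # (15, 2)
```

In this run peer 3 jumps to peer 12 and peer 12 hands the query to peer 15, two messages instead of the six needed when only immediate successors are known.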
46
Peer Churn
- To handle peer churn, require each peer to know the IP address of its two successors.
- Each peer periodically pings its two successors to see if they are still alive.
- Peer 5 abruptly leaves
- Peer 4 detects; makes 8 its immediate successor; asks 8 who its immediate successor is; makes 8’s immediate successor its second successor.
- What if peer 13 wants to join?
[Figure: ring of peers 1, 3, 4, 5, 8, 10, 12, 15]
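The repair step for peer 4 can be sketched with a table of two successors per peer. Function names are hypothetical, pings are modelled as membership in an `alive` set, and the branch that refreshes a dead second successor (needed by peer 3 in this scenario) goes beyond what the slide describes.

```python
def build_successors(peers):
    """Give every peer its first and second successor on the ring."""
    ring = sorted(peers)
    n = len(ring)
    return {p: [ring[(i + 1) % n], ring[(i + 2) % n]] for i, p in enumerate(ring)}


def stabilize(successors, peer, alive):
    """One periodic check by `peer`: ping both successors and repair the list."""
    first, second = successors[peer]
    if first not in alive:
        # Promote the second successor, then ask it for its immediate successor.
        first = second
        second = successors[first][0]
    elif second not in alive:
        # Not on the slide: refresh the second entry from the first successor.
        second = successors[first][0]
    successors[peer] = [first, second]


if __name__ == "__main__":
    peers = [1, 3, 4, 5, 8, 10, 12, 15]
    table = build_successors(peers)
    alive = set(peers) - {5}      # peer 5 abruptly leaves
    del table[5]
    stabilize(table, 4, alive)
    print(table[4])               # [8, 10]
```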