P2P Networking and Content Distribution


1 P2P Networking and Content Distribution
March 28, 2013 2: Application Layer

2 Announcements
H/W due today (Calendar, Packet pair)
Calendar app: a 1-week extension is possible (but with a 10% point deduction)
Meeting with project mentors by Monday
Project plan presentation:
  Introduction/background
  Problem definition (or research questions)
  Related work (no need to be complete)
  Approach (+ supporting materials)
  Plans (including refining research questions + experimenting with ideas)

3 Reviews
Network app: client-server, P2P, hybrid
Programming: socket
Addressing issues
Transport layer vs. service requirements
TCP vs. UDP (differences)
HTTP: persistent vs. non-persistent
HTTP: cookies
DNS: distributed, hierarchical DB
DNS name hierarchy vs. Internet's topology
DNS resolution: iterative vs. recursive

4 Contents
P2P architecture and benefits
P2P content distribution
Content distribution network (CDN)

5 Pure P2P architecture
no always-on server
arbitrary end systems communicate directly (peer-peer)
peers are intermittently connected and change IP addresses
Three topics:
  File distribution
  Searching for information
  Case study: Skype

6 File Distribution: Server-Client vs P2P
Question: How much time does it take to distribute a file from one server to N peers?
$u_s$: server upload bandwidth
$u_i$: peer i upload bandwidth
$d_i$: peer i download bandwidth
$F$: file size
[Figure: a server with upload rate $u_s$ and peers 1..N with upload rates $u_1 \dots u_N$ and download rates $d_1 \dots d_N$, all attached to a network with abundant bandwidth]

7 File distribution time: server-client
server sequentially sends N copies: $NF/u_s$ time
client i takes $F/d_i$ time to download
Time to distribute F to N clients using the client/server approach:
$$d_{cs} = \max\left\{ \frac{NF}{u_s},\ \frac{F}{\min_i d_i} \right\}$$
increases linearly w.r.t. N (for large N)

8 File distribution time: P2P
server must send one copy: $F/u_s$ time
client i takes $F/d_i$ time to download
NF bits must be downloaded (aggregate)
fastest possible upload rate: $u_s + \sum_i u_i$
$$d_{P2P} = \max\left\{ \frac{F}{u_s},\ \frac{F}{\min_i d_i},\ \frac{NF}{u_s + \sum_i u_i} \right\}$$

9 Server-client vs. P2P: example
Client upload rate = $u$, $F/u$ = 1 hour, $u_s = 10u$, $d_{min} \ge u_s$
Client-server ~ $NF/u_s$ vs. P2P ~ $NF/(u_s + \sum_i u_i)$
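The two bounds from slides 7 and 8 can be evaluated numerically for the parameters of this example. The following is only a sketch: the function names are made up for illustration, units are chosen so that u = 1 and F = 1 (hence F/u = 1 hour), and every peer is assumed to download at exactly d_min = u_s.

```python
def client_server_time(F, u_s, d):
    """Lower bound on client-server distribution time.

    F: file size, u_s: server upload rate, d: list of peer download rates.
    """
    N = len(d)
    return max(N * F / u_s, F / min(d))


def p2p_time(F, u_s, d, u):
    """Lower bound on P2P distribution time; u: list of peer upload rates."""
    N = len(d)
    return max(F / u_s, F / min(d), N * F / (u_s + sum(u)))


if __name__ == "__main__":
    # Units chosen so that u = 1 and F = 1, hence F/u = 1 hour (slide 9).
    u, F = 1.0, 1.0
    u_s = 10 * u
    for N in (10, 30, 100):
        d = [u_s] * N   # assumption: every peer downloads at d_min = u_s
        up = [u] * N    # assumption: every peer uploads at rate u
        print(N, client_server_time(F, u_s, d), round(p2p_time(F, u_s, d, up), 3))
```

Under these assumptions the client-server time grows as N/10 hours, while the P2P time equals N/(10 + N) hours and therefore stays below 1 hour for any N.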

10 Contents
P2P architecture and benefits
P2P content distribution
Content distribution network (CDN)

11 P2P content distribution issues
Group management and data search
Reliable and efficient file exchange
Security/privacy/anonymity/trust
Approaches for group management and data search (i.e., who has what?):
  Centralized (e.g., BitTorrent tracker)
  Unstructured (e.g., Gnutella)
  Structured (Distributed Hash Tables [DHT])

12 Centralized model (Napster)
original “Napster” design
1) when a peer connects, it informs the central server of its IP address and content
2) Alice queries for “Hey Jude”; the server notifies her that Bob has the file
3) Alice requests the file from Bob
[Figure: peers register with the centralized directory server (1); Alice asks for “Hey Jude” and is told that Bob has it (2); Alice fetches the file from Bob (3)]
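The three steps of the centralized design can be sketched with an in-memory index. This is an illustration only, not Napster's actual protocol: the class name, method names, and the example address are hypothetical.

```python
from collections import defaultdict


class DirectoryServer:
    """Hypothetical central index mapping a title to the peers that hold it."""

    def __init__(self):
        self._index = defaultdict(set)

    def register(self, peer_addr, titles):
        # Step 1: a connecting peer reports its address and its content.
        for title in titles:
            self._index[title].add(peer_addr)

    def unregister(self, peer_addr):
        # Peers are intermittently connected, so entries must be removable.
        for peers in self._index.values():
            peers.discard(peer_addr)

    def lookup(self, title):
        # Step 2: answer "who has it?"; the file itself never passes through here.
        return sorted(self._index.get(title, ()))


if __name__ == "__main__":
    server = DirectoryServer()
    server.register("bob.example:6699", ["Hey Jude"])  # made-up address
    print(server.lookup("Hey Jude"))  # Alice then contacts Bob directly (step 3)
```

Only the lookup is centralized; the transfer in step 3 happens directly between the two peers.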

13 Centralized model
file transfer is decentralized, but locating content is highly centralized
Upload index to central server when you come online
To search, consult central server
Request doc directly
[Figure: peers Bob, Alice, Jane, and Judy and the central server]

14 Centralized model
Benefits:
  Low per-node state
  Limited bandwidth usage
  Short search time
  High success rate
  Fault tolerant
Drawbacks:
  Single point of failure
  Limited scale
  Possibly unbalanced load

15 File distribution: BitTorrent
P2P file distribution
tracker: tracks peers participating in torrent
torrent: group of peers exchanging chunks of a file
[Figure: a peer obtains a list of peers from the tracker and starts trading chunks with them]

16 BitTorrent (1)
file divided into 256KB chunks
peer joining torrent:
  has no chunks, but will accumulate them over time
  registers with a tracker to get list of peers, connects to subset of peers (“neighbors”)
while downloading, peer uploads chunks to other peers
peers may come online and go offline
once peer has entire file, it may (selfishly) leave or (altruistically) remain
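As a rough sketch of the chunking described on this slide, the helpers below split a file into 256 KB chunks and list the chunks a joining peer still lacks. The function names are hypothetical, and real clients also verify each chunk's hash, which is omitted here.

```python
CHUNK_SIZE = 256 * 1024  # 256 KB, the chunk size given on the slide


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Split a file's bytes into fixed-size chunks; the last one may be shorter."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def missing_chunks(have: set, total: int):
    """Indices a peer still needs; a newly joined peer has have == set()."""
    return [i for i in range(total) if i not in have]


if __name__ == "__main__":
    chunks = split_into_chunks(b"x" * (2 * CHUNK_SIZE + 5))
    print(len(chunks), missing_chunks(set(), len(chunks)))
```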

17 BitTorrent (2)
Sending chunks: tit-for-tat
  Alice sends chunks to the four neighbors currently sending her chunks at the highest rate
    re-evaluate top 4 every 10 secs
  every 30 secs: randomly select another peer, start sending it chunks
    newly chosen peer may join top 4
    “optimistically unchoke”
Pulling chunks
  at any given time, different peers have different subsets of file chunks
  periodically, a peer (Alice) asks each neighbor for the list of chunks that it has
  Alice sends requests for her missing chunks
    rarest first
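The two policies on this slide can be sketched as small selection functions. This is a simplified illustration, not the BitTorrent wire protocol: the function names and data shapes are assumptions, and the 10-second and 30-second timers that would call them are left to the caller.

```python
import random
from collections import Counter


def top_uploaders(rates, top_n=4):
    """Neighbors currently sending us chunks at the highest rates.

    rates: dict mapping a neighbor to the rate it is sending to us.
    Intended to be re-evaluated periodically (every 10 secs on the slide).
    """
    return sorted(rates, key=rates.get, reverse=True)[:top_n]


def optimistic_unchoke(rates, unchoked, rng=random):
    """Pick one random neighbor outside the current top set, or None.

    Intended to be called every 30 secs; the chosen peer may later join the top 4.
    """
    candidates = [peer for peer in rates if peer not in unchoked]
    return rng.choice(candidates) if candidates else None


def rarest_first(my_chunks, neighbor_chunks):
    """Order our missing chunks so that those held by the fewest neighbors come first.

    neighbor_chunks: dict mapping a neighbor to the set of chunk indices it has.
    """
    counts = Counter(c for chunks in neighbor_chunks.values() for c in chunks)
    missing = [c for c in counts if c not in my_chunks]
    return sorted(missing, key=lambda c: (counts[c], c))


if __name__ == "__main__":
    rates = {"a": 50, "b": 40, "c": 30, "d": 20, "e": 10, "f": 5}
    top = top_uploaders(rates)
    print(top, optimistic_unchoke(rates, top))
    print(rarest_first({0}, {"p1": {0, 1, 2}, "p2": {1, 2}, "p3": {2, 3}}))
```

Ties between equally rare chunks are broken by chunk index here purely to keep the output deterministic.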

18 BitTorrent: Tit-for-tat
(1) Alice “optimistically unchokes” Bob
(2) Alice becomes one of Bob’s top-four providers; Bob reciprocates
(3) Bob becomes one of Alice’s top-four providers
With a higher upload rate, a peer can find better trading partners and get the file faster!

19 P2P Case study: Skype
inherently P2P: pairs of users communicate
proprietary application-layer protocol (inferred via reverse engineering)
hierarchical overlay with super nodes (SNs)
index maps usernames to IP addresses; distributed over SNs
[Figure: Skype clients (SC), supernodes (SN), and the Skype login server]

20 Peers as relays
Problem when both Alice and Bob are behind “NATs”:
  NAT prevents an outside peer from initiating a call to an inside peer
Solution:
  Using Alice’s and Bob’s SNs, a relay is chosen
  Each peer initiates a session with the relay
  Peers can now communicate through NATs via the relay

21 Contents
P2P architecture and benefits
P2P content distribution
Content distribution network (CDN)

22 Why Content Networks?
More hops between client and Web server → more congestion!
Same data flowing repeatedly over links between clients and Web server
[Figure: clients C1, C2, C3, and C4 and server S connected through IP routers]
Slides from

23 Why Content Networks?
Origin server is the bottleneck as the number of users grows
Flash crowds (for instance, Sept. 11)
The content distribution problem: arrange a rendezvous between a content source at the origin server and a content sink (us, as users)

24 Example: Web Server Farm
Simple solution to the content distribution problem: deploy a large group of servers
Arbitrate client requests to servers using an “intelligent” L4-L7 switch
Pretty widely used today
[Figure: requests from grad.umd.edu and ren.cis.udel.edu pass through an L4-L7 switch to server copies 1, 2, and 3]

25 Example: Caching Proxy
Mainly motivated by ISP business interests: reducing the bandwidth the ISP consumes from the Internet
Reduced network traffic
Reduced user-perceived latency
[Figure: inside the ISP, interceptors divert TCP port 80 traffic from clients ren.cis.udel.edu and merlot.cis.udel.edu to the proxy; other traffic goes to the Internet]

26 But on Sept. 11, 2001
[Figure: Web server www.cnn.com publishes new content (“WTC News!”); user mslab.kaist.ac.kr and 1,000,000 other hosts sit behind ISP caching proxies that hold old content, so their requests cause congestion / a bottleneck]

27 Problems with discussed approaches: Server farms and Caching proxies
Server farms do nothing about problems due to network congestion
Caching proxies serve only their clients, not all users on the Internet
Content providers (say, Web servers) cannot rely on the existence and correct implementation of caching proxies
Accounting issues with caching proxies: for instance, a content provider needs to know the number of hits to a webpage for the advertisements displayed on it

28 Again on Sept. 11, 2001 with CDN
[Figure: Web server www.cnn.com sends the new content (“WTC News!”) through the distribution infrastructure to surrogates in WA, CA, MI, IL, MA, FL, DE, and NY; user mslab.kaist.ac.kr and 1,000,000 other users request the new content from the surrogates]

29 Web replication - CDNs
Overlay network to distribute content from origin servers to users
Avoids large amounts of the same data repeatedly traversing potentially congested links on the Internet
Reduces Web server load
Reduces user-perceived latency
Tries to route around congested networks

30 CDN vs. Caching Proxies
Caches are used by ISPs to reduce bandwidth consumption; CDNs are used by content providers to improve quality of service to end users
Caches are reactive; CDNs are proactive
Caching proxies cater to their users (web clients) and not to content providers (web servers); CDNs cater to both the content providers (web servers) and the clients
CDNs give control over the content to the content providers; caching proxies do not

31 CDN Architecture
[Figure: origin server, CDN, and client; the CDN comprises the request routing infrastructure, the distribution & accounting infrastructure, and surrogates]

32 CDN Components
Distribution Infrastructure: moving or replicating content from the content source (origin server, content provider) to surrogates
Request Routing Infrastructure: steering or directing a content request from a client to a suitable surrogate
Content Delivery Infrastructure: delivering content to clients from surrogates
Accounting Infrastructure: logging and reporting of distribution and delivery activities

33 Server Interaction with CDN
1. Origin server pushes new content to CDN, OR CDN pulls content from origin server (Distribution Infrastructure)
2. Origin server requests logs and other accounting info from CDN, OR CDN provides logs and other accounting info to origin server (Accounting Infrastructure)

34 Client Interaction with CDN
1. Hi! I need
2. Go to surrogate newyork.cnn.akamai.com (answer from the Request Routing Infrastructure)
3. Hi! I need content /sept11
Q: How did the CDN choose the New York surrogate over the California surrogate?
[Figure: client, Request Routing Infrastructure, and CDN surrogates newyork.cnn.akamai.com (NY) and california.cnn.akamai.com (CA)]

35 Request Routing Techniques
Request routing techniques use a set of metrics to direct users to the “best” surrogate
Proprietary, but underlying techniques known:
  DNS based request routing
  Content modification (URL rewriting)
  Anycast based (how common is anycast?)
  URL based request routing
  Transport layer request routing
  Combination of multiple mechanisms

36 DNS based Request-Routing
Common due to the ubiquity of DNS as a directory service
Specialized DNS server inserted in the DNS resolution process
DNS server is capable of returning a different set of A, NS or CNAME records based on policies/metrics

37 DNS based Request-Routing
Q: How does the Akamai DNS know which surrogate is closest?
[Figure: client test.nyu.edu sends a DNS query to its local DNS server (dns.nyu.edu), which queries the Akamai DNS; the DNS response (an A record) is returned, and the client opens a session with an Akamai CDN surrogate, newyork.cnn.akamai.com or california.cnn.akamai.com]

38 DNS based Request-Routing
[Figure: client test.nyu.edu sends a DNS query to its local DNS server (dns.nyu.edu), which queries the Akamai DNS; Akamai CDN surrogates measure to the client DNS and report the measurement results to the Akamai DNS]

39 DNS based Request-Routing
[Figure: the Akamai DNS compares two surrogates as measured toward the requesting (client) DNS: one with available bandwidth = 10 kbps and RTT = 10 ms, the other with available bandwidth = 5 kbps and RTT = 100 ms; it answers with an A record with TTL = 10s]
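A metric-driven DNS answer of this kind could be sketched as follows. The selection policy (lowest RTT first, higher bandwidth as tie-breaker) is an assumption, since the slides note that the real policies are proprietary; the IP addresses are made-up documentation-range values, and the pairing of measurements with surrogate names is also assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    surrogate: str         # surrogate host name
    address: str           # hypothetical IP (documentation range)
    bandwidth_kbps: float  # available bandwidth toward the requesting DNS
    rtt_ms: float          # round-trip time toward the requesting DNS


def pick_surrogate(measurements):
    """Assumed policy: lowest RTT wins; higher bandwidth breaks ties."""
    return min(measurements, key=lambda m: (m.rtt_ms, -m.bandwidth_kbps))


def answer_query(measurements, ttl_s=10):
    """Build an A-record-like answer; the short TTL forces frequent re-resolution."""
    best = pick_surrogate(measurements)
    return {"type": "A", "name": best.surrogate, "address": best.address, "ttl": ttl_s}


if __name__ == "__main__":
    ms = [
        Measurement("newyork.cnn.akamai.com", "192.0.2.10", 10, 10),
        Measurement("california.cnn.akamai.com", "198.51.100.20", 5, 100),
    ]
    print(answer_query(ms))
```

The short TTL from the figure is what lets the CDN change its answer as measurements change, at the cost of more DNS queries.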

40 DNS based Request Routing: Discussion
Originator Problem: Client may be far removed from client DNS
Client DNS Masking Problem: Virtually all DNS servers, except for root DNS servers, honor requests for recursion
  Q: Which DNS server resolves a request for test.nyu.edu?
  Q: Which DNS server performs the last recursion of the DNS request?
Hidden Load Factor: A DNS resolution may result in drastically different load on the selected surrogate; this is an issue in load balancing requests and in predicting load on surrogates

41 CDN Strategies
Pushing content closer to the users: hop count reduction (overall network traffic reduction)
CDN strategies:
  Limelight → placing CDN servers near a small # of ISP core nets
  Akamai → placing CDN servers deep into a large # of ISP networks’ sites
  Nano Data Center (NaDa) → home gateways (STBs/modems) as CDN servers (peer-to-peer delivery among NaDa servers)
[Figure: NaDa digital media delivery platform spanning the access network (modem, DSLAM, ONT, OLT), the metro/edge network (edge router), and the core network (core router)]

42 Summary
P2P architecture and its benefits
P2P content distribution
  BitTorrent, Skype
Content distribution network (CDN)
  DNS-based request routing

