CSE 30264 Computer Networks Prof. Aaron Striegel Department of Computer Science & Engineering University of Notre Dame Lecture 25 – April 15, 2010.

Today's Lecture
Project 4
–Overview / Discussion
Applications – Part 1
–DNS
–SMTP
–HTTP
–P2P
[Figure: protocol stack – Application / Transport / Network / Data / Physical]
Spring 2010 – CSE 30264

Project 4
Implementation
–Multi-threaded server
–Performance will count
Experimentation
–Latency
–Loss
Due: Thursday, April 29th, 5 PM

Premise
Spaceship racing game in alpha development
–C# / WPF / Windows
–Up to 4 players
–Test network functionality
Your task
–Write a multi-threaded server that allows clients to connect and relays messages back and forth
Twist
–The protocol is changing
–Need to make your server agnostic with respect to message content

Operation
Client starts up
Client sets its player name to a unique value
–Responsibility of client, not server
Client connects to the server
–TCP, port defined in the client
Client periodically sends data
–Send status / information to server
–Server relays that information to all other clients

Client App
Provided the alpha version
–Core functionality
Will also be provided a beta version (next week)
–Additional polish → different message contents
Near-final version
–Different set of messages
–Only for evaluation of your code
Alpha, beta → provided to you

Your Task – Server
Write a multi-threaded server
–Listen for new TCP connections on a particular port
–Receive messages / relay to all other clients: receive and forward, no inspection or modification
What is different
–Server needs to be robust: client quits, error detection, etc.
–Thread safety: clients may exit or arrive while you are receiving / sending messages → mutexes
–Performance: speedy, or else the clients will lag

Notes – Implementation
Console input
–Queries / status
Thread – listen / accept
Thread – each client
Global object
–List of sockets / etc.
–Guard on write
–Guard on modifying the list
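The structure above (one listen/accept thread, one thread per client, and a mutex-guarded global socket list) can be sketched in Python; the class and method names here are invented for illustration, and the real project server would follow the same shape in whatever language you use.

```python
import socket
import threading

class RelayServer:
    """Minimal content-agnostic relay: every message received from one
    client is forwarded verbatim to all other connected clients."""

    def __init__(self, host="127.0.0.1", port=0):
        self.listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        self.listener.bind((host, port))
        self.listener.listen(8)
        self.port = self.listener.getsockname()[1]
        self.clients = []                 # global list of client sockets
        self.lock = threading.Lock()      # guards the list and all writes

    def start(self):
        threading.Thread(target=self._accept_loop, daemon=True).start()

    def _accept_loop(self):
        while True:                       # thread: listen / accept
            conn, _ = self.listener.accept()
            with self.lock:
                self.clients.append(conn)
            threading.Thread(target=self._client_loop, args=(conn,),
                             daemon=True).start()

    def _client_loop(self, conn):
        try:                              # thread: one per client
            while True:
                data = conn.recv(4096)
                if not data:              # client quit: clean up robustly
                    break
                with self.lock:           # mutex: no concurrent writes
                    for other in self.clients:
                        if other is not conn:
                            other.sendall(data)   # forward, no inspection
        except OSError:
            pass
        finally:
            with self.lock:
                self.clients.remove(conn)
            conn.close()
```

Because the relay never parses what it forwards, the same server keeps working when the message contents change between the alpha and beta clients.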

Experimentation
Human experimenting
–Alpha version – raw racing field
–Beta version – collision detection / etc.
Tinker with delay / loss
–Slider bar → inject loss: fails to report a new status
–Slider bar → inject delay: slightly queues the status report
–Where does it start to seem off?

Outline (Spring 2009 CS slides)
–Domain Name System
–Peer-to-Peer Networks

Name Service
–Names versus addresses
–Location-transparent versus location-dependent
–Flat versus hierarchical
–Resolution mechanism
–Name server
–DNS: Domain Name System

Examples
Hosts
–cheltenham.cs.princeton.edu → …:23:A8:33:5B:9F
Files
–/usr/llp/tmp/foo → fileid

Examples (cont.)
–Mailboxes

Domain Naming System – Hierarchy
[Figure: DNS name hierarchy – top-level domains (edu, com, gov, mil, org, net, uk, fr) over subdomains such as princeton, mit, cisco, yahoo, nasa, nsf, arpa, navy, acm, ieee; example name: wizard.cse.nd.edu]

Name Servers
–Partition hierarchy into zones
–Each zone implemented by two or more name servers
[Figure: the hierarchy partitioned into zones, each served by name servers – root name server; Princeton and Cisco name servers; CS and EE name servers]

Resource Records
Each name server maintains a collection of resource records: (Name, Value, Type, Class, TTL)
–Name/Value: not necessarily host names to IP addresses
–Type
  A: value is an IP address
  NS: value gives domain name for host running name server that knows how to resolve names within the specified domain
  CNAME: value gives canonical name for a host; used to define aliases
  MX: value gives domain name for host running mail server that accepts messages for the specified domain
–Class: allows other entities to define types
–TTL: how long the resource record is valid
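The record format above can be sketched as tuples in a toy in-memory server; the record values below are illustrative (the IP address is from the 192.0.2.0/24 documentation range, since the transcript's real addresses were lost), and `resolve` shows how a CNAME alias is chased to its canonical name.

```python
# Hypothetical resource records in the slide's (Name, Value, Type,
# Class, TTL) format; values are invented for illustration.
RECORDS = [
    ("cs.princeton.edu", "optima.cs.princeton.edu", "NS", "IN", 86400),
    ("optima.cs.princeton.edu", "192.0.2.10", "A", "IN", 86400),
    ("opt.cs.princeton.edu", "optima.cs.princeton.edu", "CNAME", "IN", 86400),
]

def resolve(name, rtype, records):
    """Return the value of the first matching record, following
    CNAME aliases until a record of the requested type is found."""
    for (n, value, t, c, ttl) in records:
        if n == name and t == rtype:
            return value
        if n == name and t == "CNAME":
            return resolve(value, rtype, records)  # restart at canonical name
    return None
```

Looking up the alias `opt.cs.princeton.edu` with type A first hits the CNAME record and then resolves the canonical name's A record.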

Root Server
(princeton.edu, cit.princeton.edu, NS, IN)
(cit.princeton.edu, …, A, IN)
(cisco.com, thumper.cisco.com, NS, IN)
(thumper.cisco.com, …, A, IN)
…

Princeton Server
(cs.princeton.edu, optima.cs.princeton.edu, NS, IN)
(optima.cs.princeton.edu, …, A, IN)
(ee.princeton.edu, helios.ee.princeton.edu, NS, IN)
(helios.ee.princeton.edu, …, A, IN)
(jupiter.physics.princeton.edu, …, A, IN)
(saturn.physics.princeton.edu, …, A, IN)
(mars.physics.princeton.edu, …, A, IN)
(venus.physics.princeton.edu, …, A, IN)

CS Server
(cs.princeton.edu, optima.cs.princeton.edu, MX, IN)
(cheltenham.cs.princeton.edu, …, A, IN)
(che.cs.princeton.edu, cheltenham.cs.princeton.edu, CNAME, IN)
(optima.cs.princeton.edu, …, A, IN)
(opt.cs.princeton.edu, optima.cs.princeton.edu, CNAME, IN)
(baskerville.cs.princeton.edu, …, A, IN)
(bas.cs.princeton.edu, baskerville.cs.princeton.edu, CNAME, IN)

Name Resolution Strategy
Local server
–Need to know the root at only one place (not each host)
–Site-wide cache

Electronic Mail
–RFC 822: header and body
–MIME: Multi-purpose Internet Mail Extensions

MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="…CA6E2DE4ABCAFBC5"
From: Alice Smith <…>
To: …
Subject: look at the attached image!
Date: Mon, 07 Sep … :45: …

--…CA6E2DE4ABCAFBC5
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Bob, here's the jpeg image I promised. -- Alice

--…CA6E2DE4ABCAFBC5
Content-Type: image/jpeg
Content-Transfer-Encoding: base64

[unreadable encoding of a jpeg figure]
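A message like the one above can be built with the standard library rather than by hand; the addresses and attachment bytes below are placeholders (the transcript's real addresses were lost), but the resulting structure is the same multipart/mixed layout with a base64-encoded image part.

```python
from email.message import EmailMessage

# Sketch of constructing the slide's MIME example; addresses and the
# "jpeg" bytes are invented placeholders.
msg = EmailMessage()
msg["From"] = "Alice Smith <alice@example.com>"
msg["To"] = "bob@example.com"
msg["Subject"] = "look at the attached image!"
msg.set_content("Bob, here's the jpeg image I promised. -- Alice")
msg.add_attachment(b"\xff\xd8 not a real jpeg", maintype="image",
                   subtype="jpeg", filename="photo.jpg")

text = msg.as_string()   # multipart/mixed with a generated boundary
```

The library chooses the boundary string and the transfer encodings (7bit for the ASCII text part, base64 for the binary attachment) automatically.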

SMTP
–Mail reader, mail daemon, mail gateway
–SMTP messages: HELO, MAIL, RCPT, DATA, QUIT; the server responds with a code

World Wide Web
–URL: uniform resource locator
–HTTP: START_LINE, MESSAGE_HEADER, MESSAGE_BODY

HTTP
Request:
–GET: fetch specified web page
–HEAD: fetch status information about specified page
–Example: GET http://… HTTP/1.1
Response:
–HTTP/1.1 202 Accepted
–HTTP/1.1 404 Not Found
–HTTP/1.1 301 Moved Permanently
  Location: http://…

HTTP
–HTTP 1.0: separate TCP connection for each request (each data item)
–HTTP 1.1: persistent connections
Caching:
–Client: faster retrieval of web pages
–Server: reduced load
–Location: client, site-wide cache, ISP, etc.
–EXPIRES header field (provided by server)
–IF-MODIFIED-SINCE header field (issued by cache)
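The interaction of the two caching headers can be modeled in a few lines; this is a toy sketch, not a real HTTP client: the `origin` callback and the tuple layout are invented, but the logic mirrors the slide (serve from cache until EXPIRES, then revalidate with IF-MODIFIED-SINCE and expect either 304 Not Modified or a fresh copy).

```python
from datetime import datetime

# Toy cache: url -> (expires, body, last_modified). The origin server
# is modeled as a callback origin(url, if_modified_since) returning
# (status, body, last_modified, expires) -- an invented interface.
cache = {}

def fetch(url, now, origin):
    """Return (body, how-it-was-served)."""
    if url in cache:
        expires, body, last_modified = cache[url]
        if now < expires:
            return body, "cache hit"                 # still fresh (EXPIRES)
        status, new_body, lm, exp = origin(url, last_modified)
        if status == 304:                            # Not Modified
            cache[url] = (exp, body, last_modified)  # keep cached body
            return body, "revalidated"
        cache[url] = (exp, new_body, lm)
        return new_body, "refetched"
    status, body, lm, exp = origin(url, None)        # cold miss
    cache[url] = (exp, body, lm)
    return body, "miss"
```

A conditional request avoids re-sending the body when the page has not changed, which is exactly the server-load benefit listed above.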

SNMP
–Request/reply protocol (on top of UDP)
–Two main operations:
  GET: retrieve state info from hosts
  SET: set new state on a host
–Relies on the Management Information Base (MIB)
  system: uptime, name, …
  interfaces: physical address, packets sent/received, …
  address translation: ARP (contents of table)
  IP: routing table, number of forwarded datagrams, reassembly statistics, dropped packets, …
  TCP: number of passive and active opens, resets, timeouts, …
  UDP: number of packets sent/received, …

P2P
Overview:
–Centralized database: Napster
–Query flooding: Gnutella
–Intelligent query flooding: KaZaA
–Swarming: BitTorrent
–Unstructured overlay routing: Freenet
–Structured overlay routing: distributed hash tables

Napster
Centralized database:
–Join: on startup, client contacts central server
–Publish: reports list of files to central server
–Search: query the server => return someone that stores the requested file
–Fetch: get the file directly from peer
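The join/publish/search steps above can be sketched as a toy central index; the peer and file names are made up, and the actual fetch (which happens peer-to-peer, not via the server) is not modeled.

```python
from collections import defaultdict

# Toy version of Napster's centralized database.
class CentralIndex:
    def __init__(self):
        self.files = defaultdict(set)          # filename -> peers holding it

    def join_and_publish(self, peer, filenames):
        for name in filenames:                 # Publish: report file list
            self.files[name].add(peer)

    def search(self, filename):
        return sorted(self.files[filename])    # Search: who stores the file?

    def leave(self, peer):
        for peers in self.files.values():      # server must track departures
            peers.discard(peer)
```

The single index makes search trivial, but it is also the single point of failure the summary slide warns about.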

Gnutella
Query flooding:
–Join: on startup, client contacts a few other nodes; these become its "neighbors"
–Publish: no need
–Search: ask neighbors, who ask their neighbors, and so on... when/if found, reply to sender
–Fetch: get the file directly from peer
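The "ask neighbors, who ask their neighbors" step is a breadth-first flood bounded by a TTL; the topology and file placement below are invented for illustration.

```python
# Toy Gnutella-style flood over a fixed neighbor graph.
NEIGHBORS = {
    "a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"],
    "d": ["b", "c", "e"], "e": ["d"],
}
FILES = {"a": set(), "b": set(), "c": set(), "d": set(), "e": {"song.mp3"}}

def flood_search(start, filename, ttl):
    """Breadth-first flood, one hop per TTL unit; returns the set of
    peers discovered to hold the file."""
    seen, frontier, hits = {start}, [start], set()
    for _ in range(ttl):
        nxt = []
        for node in frontier:
            for nb in NEIGHBORS[node]:
                if nb in seen:
                    continue            # drop duplicate copies of the query
                seen.add(nb)
                if filename in FILES[nb]:
                    hits.add(nb)        # reply travels back to the sender
                nxt.append(nb)
        frontier = nxt
    return hits
```

A small TTL keeps the flood from melting the network, but it also means a file more hops away than the TTL is simply never found, which is the "no guarantees on search scope" problem noted for KaZaA below.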

KaZaA (Kazaa)
–In 2001, KaZaA was created by the Dutch company KaZaA BV
–Single network called FastTrack, used by other clients as well: Morpheus, giFT, etc.
–Eventually the protocol changed so other clients could no longer talk to it
–2004: 2nd most popular file-sharing network; 1–5 million users at any given time, about 1000 downloads per minute (June 2004: average 2.7 million users; compare BitTorrent: 8 million)

KaZaA
"Smart" query flooding:
–Join: on startup, client contacts a "supernode"; may at some point become one itself
–Publish: send list of files to supernode
–Search: send query to supernode; supernodes flood the query amongst themselves
–Fetch: get the file directly from peer(s); can fetch simultaneously from multiple peers

KaZaA "Super Nodes"
[Figure: overlay with ordinary nodes attached to supernodes]

KaZaA: File Insert
[Figure: a peer publishes to its supernode – "I have X!" → insert(X, …)]

KaZaA: File Search
[Figure: "Where is file A?" – the query search(A) is flooded among supernodes; replies are returned to the asker]

KaZaA: Fetching
More than one node may have the requested file... how to tell?
–Must be able to distinguish identical files
–Not necessarily the same filename
–The same filename is not necessarily the same file...
Use a hash of the file
–KaZaA uses UUHash: fast, but not secure
–Alternatives: MD5, SHA-1
How to fetch?
–Get bytes […] from A, […] from B
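Hashing the file contents, not the filename, is what makes identical copies recognizable; a quick sketch using SHA-1 (one of the secure alternatives named above, rather than KaZaA's UUHash), with made-up files:

```python
import hashlib

# Identify a file by a hash of its bytes, independent of its name.
def file_id(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()

copy_a = ("movie.avi", b"\x00\x01 the same bytes")         # invented files
copy_b = ("renamed_film.avi", b"\x00\x01 the same bytes")  # same content
other  = ("movie.avi", b"completely different bytes")      # same name only
```

Two differently named copies hash to the same id (so byte ranges can safely be fetched from both), while a same-named but different file gets a different id.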

KaZaA
Pros:
–Tries to take into account node heterogeneity: bandwidth, host computational resources
–Rumored to take into account network locality
Cons:
–Mechanisms easy to circumvent
–Still no real guarantees on search scope or search time

BitTorrent
In 2002, B. Cohen debuted BitTorrent
Key motivation:
–Popularity exhibits temporal locality (flash crowds)
–E.g., Slashdot effect, CNN on 9/11, new movie/game release
Focused on efficient fetching, not searching:
–Distribute the same file to all peers
–Files split up into pieces (typically 250 kBytes)
–Single publisher, multiple downloaders
–Each downloader becomes a publisher (while still downloading)
Has some "real" publishers:
–Blizzard Entertainment using it to distribute the beta of their new games

BitTorrent
Swarming:
–Join: contact centralized "tracker" server, get a list of peers
–Publish: run a tracker server
–Search: out-of-band, e.g., use Google to find a tracker for the file you want
–Fetch: download chunks of the file from your peers; upload chunks you have to them
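The fetch step means deciding which missing piece to ask which peer for; a common BitTorrent heuristic is "rarest first". This sketch is illustrative (the peer piece maps are invented, and real clients also weigh bandwidth and choking state):

```python
from collections import Counter

def next_requests(have, peer_pieces):
    """Plan (piece, peer) requests for pieces we are missing, rarest
    piece across the known peers first."""
    counts = Counter(p for pieces in peer_pieces.values() for p in pieces)
    missing = sorted((p for p in counts if p not in have),
                     key=lambda p: counts[p])      # rarest first
    plan = []
    for piece in missing:
        for peer, pieces in peer_pieces.items():   # any peer holding it
            if piece in pieces:
                plan.append((piece, peer))
                break
    return plan
```

Requesting rare pieces first keeps every piece alive in the swarm, so the single original publisher is not a bottleneck for long.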

BitTorrent: Publish/Join
[Figure: a new peer contacts the tracker and receives a list of peers in the swarm]

BitTorrent: Fetch
[Figure: peers exchanging chunks of the file with one another]

BitTorrent: Sharing Strategy
Employ a "tit-for-tat" sharing strategy
–"I'll share with you if you share with me"
–Be optimistic: occasionally let freeloaders download
  Otherwise no one would ever start!
  Also allows you to discover better peers to download from when they reciprocate
Approximates Pareto efficiency
–Game theory: "no change can make anyone better off without making others worse off"
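A minimal tit-for-tat choker in the spirit of the slide: reciprocate with the peers uploading fastest to us, plus one randomly chosen "optimistic" slot so newcomers and potential better partners occasionally get a chance. Peer names and rates are made up; the seeded RNG just keeps the sketch deterministic.

```python
import random

def choose_unchoked(upload_rates, k, rng):
    """upload_rates: peer -> bytes recently uploaded to us.
    Unchoke the top-k uploaders plus one optimistic pick."""
    ranked = sorted(upload_rates, key=upload_rates.get, reverse=True)
    unchoked = ranked[:k]                  # "share with those who share"
    rest = ranked[k:]
    if rest:
        unchoked.append(rng.choice(rest))  # optimistic unchoke
    return unchoked
```

Without the optimistic slot a brand-new peer, which has uploaded nothing yet, could never earn its first piece.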

BitTorrent
Pros:
–Works reasonably well in practice
–Gives peers incentive to share resources; avoids freeloaders
Cons:
–Central tracker server needed to bootstrap the swarm

Freenet
In 1999, I. Clarke started the Freenet project
Basic idea:
–Employ Internet-like routing on the overlay network to publish and locate files
Additional goals:
–Provide anonymity and security
–Make censorship difficult

FreeNet
Routed queries:
–Join: on startup, client contacts a few other nodes it knows about; gets a unique node id
–Publish: route file contents toward the file id; the file is stored at the node with id closest to the file id
–Search: route a query for the file id toward the closest node id
–Fetch: when the query reaches a node containing the file id, it returns the file to the sender

Distributed Hash Tables (DHT)
Around 2000–2001, academic researchers said "we want to play too!"
Motivation:
–Frustrated by the popularity of all these "half-baked" P2P apps :)
–We can do better! (so we said)
–Guaranteed lookup success for files in the system
–Provable bounds on search time
–Provable scalability to millions of nodes
Hot topic in networking ever since

DHT
Abstraction: a distributed "hash table" (DHT) data structure:
–put(id, item);
–item = get(id);
Implementation: nodes in the system form a distributed data structure
–Can be a ring, tree, hypercube, skip list, butterfly network, ...
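The put/get abstraction can be sketched over the simplest of those structures, a ring, using consistent hashing (each key lives at its successor node). The node names and the small 16-bit id space are invented for illustration; real DHTs use 128- or 160-bit ids and no node sees the whole ring.

```python
import hashlib
from bisect import bisect_left

M = 2 ** 16   # toy id space

def ident(name):
    """Map any string (node name or key) into the shared id space."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big") % M

class ToyDHT:
    def __init__(self, node_names):
        self.ring = sorted(ident(n) for n in node_names)
        self.store = {nid: {} for nid in self.ring}

    def successor(self, key_id):
        """First node id >= key_id, wrapping around the ring."""
        i = bisect_left(self.ring, key_id)
        return self.ring[i % len(self.ring)]

    def put(self, key, item):
        self.store[self.successor(ident(key))][key] = item

    def get(self, key):
        return self.store[self.successor(ident(key))].get(key)
```

Because both nodes and keys hash into one id space, any node can compute which node is responsible for a key without a central index.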

DHT
Structured overlay routing:
–Join: on startup, contact a "bootstrap" node and integrate yourself into the distributed data structure; get a node id
–Publish: route the publication for a file id toward a close node id along the data structure
–Search: route a query for the file id toward a close node id; the data structure guarantees that the query will meet the publication
–Fetch: two options:
  Publication contains the actual file => fetch from where the query stops
  Publication says "I have file X" => the query tells you who has X; use IP routing to get X from that node

DHT Example: Chord
Associate to each node and file a unique id in a one-dimensional space (a ring)
–E.g., pick from the range [0...2^m]
–Usually the hash of the file or IP address
Properties:
–Routing table size is O(log N), where N is the total number of nodes
–Guarantees that a file is found in O(log N) hops

DHT: Consistent Hashing
A key is stored at its successor: the node with the next higher ID
[Figure: circular ID space with nodes N32, N90, N105 and keys K5, K20, K80; each key maps to the next node clockwise]

DHT: Chord Basic Lookup
[Figure: ring with nodes N10, N32, N60, N90, N105, N120; the query "Where is key 80?" is forwarded around the ring until the answer "N90 has K80" returns]

DHT: Chord Finger Table
–Entry i in the finger table of node n is the first node that succeeds or equals n + 2^i
–In other words, the ith finger points 1/2^(n−i) of the way around the ring
[Figure: node N80 with fingers spanning 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, 1/128 of the ring]
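The finger rule above can be checked in a few lines on the small [0..7] identifier space used in the join slides (m = 3, nodes {0, 1, 2, 6}); this is an illustrative sketch, not the full Chord maintenance protocol.

```python
def finger_table(n, nodes, m):
    """Finger i of node n = first node that succeeds or equals
    n + 2^i (mod 2^m), wrapping to the smallest id when needed."""
    nodes = sorted(nodes)
    table = []
    for i in range(m):
        target = (n + 2 ** i) % (2 ** m)
        table.append(next((x for x in nodes if x >= target), nodes[0]))
    return table
```

For example, node 6's first finger targets (6 + 1) mod 8 = 7, whose successor wraps around to node 0.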

DHT: Chord Join
–Assume an identifier space [0..7]
–Node n1 joins
[Figure: ring 0–7 with n1 and its successor table (entries: i, id+2^i, successor)]

DHT: Chord Join
–Node n2 joins
[Figure: ring with n1 and n2 and their updated successor tables]

DHT: Chord Join
–Nodes n0, n6 join
[Figure: ring with n0, n1, n2, n6 and their successor tables]

DHT: Chord Join
–Nodes: n1, n2, n0, n6
–Items: f7, f…
[Figure: each item stored at the successor of its id; successor tables shown for all four nodes]

DHT: Chord Routing
Upon receiving a query for an item id, a node:
–Checks whether it stores the item locally
–If not, forwards the query to the largest node in its successor table that does not exceed id
[Figure: query(7) routed across the ring via successor tables until it reaches the node storing item 7]
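The forwarding rule can be traced on the same invented 3-bit example ring (nodes {0, 1, 2, 6}); the tables below follow the finger rule (entry i of node n = successor of n + 2^i), and each hop goes to the largest table entry that does not pass the queried id, else to the immediate successor.

```python
BITS = 3                                   # ids live in [0..7]
TABLES = {0: [1, 2, 6], 1: [2, 6, 6], 2: [6, 6, 6], 6: [0, 0, 2]}
NODES = sorted(TABLES)

def dist(a, b):
    """Clockwise distance from a to b on the 2^BITS ring."""
    return (b - a) % (2 ** BITS)

def owner(key_id):
    """A key is stored at its successor: next node id >= key, wrapping."""
    return next((x for x in NODES if x >= key_id), NODES[0])

def route(start, key_id):
    """Return (responsible node, path of nodes the query visits)."""
    node, path = start, [start]
    while node != owner(key_id):
        cands = [x for x in TABLES[node]
                 if 0 < dist(node, x) <= dist(node, key_id)]
        node = (max(cands, key=lambda x: dist(node, x))
                if cands else TABLES[node][0])
        path.append(node)
    return node, path
```

Tracing the slide's query(7) from node 1: the query jumps to node 6 (the largest finger not past 7), then to node 0, which is the successor of id 7 and therefore stores the item.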

DHT
Pros:
–Guaranteed lookup
–O(log N) per-node state and search scope
Cons:
–No one uses them? (only one file-sharing app)
–Supporting non-exact-match search is hard

P2P Summary
Many different styles; remember the pros and cons of each
–Centralized, flooding, swarming, unstructured and structured routing
Lessons learned:
–Single points of failure are very bad
–Flooding messages to everyone is bad
–Underlying network topology is important
–Not all nodes are equal
–Need incentives to discourage freeloading
–Privacy and security are important
–Structure can provide theoretical bounds and guarantees