Published by Irma Perry. Modified over 9 years ago.
1
P2P Computing. Mira Yun. September 16, 2005
2
Outline: What is P2P; P2P taxonomies; Characteristics; Different P2P systems; Conclusion
3
P2P: “Peer-to-peer” (P2P) refers to a class of systems and applications that employ distributed resources to perform a function in a decentralized manner. It is generally contrasted with the client/server architecture.
4
Peers: A peer gives some resources and obtains other resources in return. “Peer” means the participants are alike: in the pure form of a P2P network, all participants are peers. Each peer depends on other peers, so a peer on its own is meaningless. Peers are autonomous (self-governing) when they are not wholly controlled by each other or by the same authority as everyone else.
5
What is P2P? “The sharing of computer resources and services by direct exchange between systems” [p2pwg, 2001]. “Systems and applications that employ distributed resources to perform critical functions in a decentralized manner.” P2P enables peers to share their resources (information, processing, presence, etc.) with at most limited interaction with a centralized server.
6
Taxonomy of computer systems
7
P2P models: pure, hybrid, super-peers. Pure: peers have the same capabilities and responsibilities; communication is symmetric; no host is superior, and every host can act as client or server. Examples: Gnutella, Freenet. Hybrid: servers facilitate the interaction between peers; addressing bypasses DNS, but a central server acts as a directory. Examples: Napster, ICQ, Jabber.
8
P2P models: pure, hybrid, super-peers. Super-peers: a super-peer is a node in a peer-to-peer network that operates both as a server to a set of clients and as an equal in a network of super-peers. Super-peer networks try to balance the efficiency of centralized search against the autonomy, load balancing, and robustness to attacks provided by distributed search. Example: Kazaa.
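The super-peer idea can be illustrated with a small sketch. It assumes each super-peer keeps an index of its own clients' files and forwards a search one hop to the other super-peers it knows; the class name `SuperPeer` and its methods are hypothetical and do not describe Kazaa's actual protocol.

```python
class SuperPeer:
    """Hypothetical super-peer: a server to its clients, an equal to other super-peers."""

    def __init__(self, name):
        self.name = name
        self.index = {}   # filename -> set of client names that share it
        self.peers = []   # other super-peers this one knows about

    def register(self, client, filenames):
        # A client uploads its file list to its super-peer (centralized-style indexing).
        for filename in filenames:
            self.index.setdefault(filename, set()).add(client)

    def local_lookup(self, filename):
        return sorted(self.index.get(filename, ()))

    def search(self, filename):
        # Answer from the local index, then ask the other super-peers (one hop only).
        hits = [(self.name, client) for client in self.local_lookup(filename)]
        for other in self.peers:
            hits += [(other.name, client) for client in other.local_lookup(filename)]
        return hits


sp1, sp2 = SuperPeer("sp1"), SuperPeer("sp2")
sp1.peers.append(sp2)
sp2.peers.append(sp1)
sp1.register("alice", ["a.mp3"])
sp2.register("bob", ["a.mp3", "b.mp3"])
result = sp1.search("a.mp3")
```

Clients only ever talk to their own super-peer for search, which is where the efficiency of a central index comes from; the super-peer layer itself stays decentralized.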
9
P2P search models: Centralized directory model. There is a central index. Once the requested file is located, the exchange takes place directly between peers.
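A minimal sketch of the centralized directory model, assuming the index simply maps file names to the addresses of peers that hold them. The class `CentralIndex` and the peer addresses are made up for illustration; the file transfer itself would happen directly between peers and is not modeled here.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical directory server: knows who has which file, stores no files itself."""

    def __init__(self):
        self._holders = defaultdict(set)  # filename -> set of peer addresses

    def register(self, peer, filename):
        self._holders[filename].add(peer)

    def unregister_peer(self, peer):
        # Called when a peer disconnects: remove it from every entry.
        for holders in self._holders.values():
            holders.discard(peer)

    def lookup(self, filename):
        # Returns peer addresses only; the requester then contacts one of them directly.
        return sorted(self._holders.get(filename, ()))


index = CentralIndex()
index.register("peer-a:6699", "song.mp3")
index.register("peer-b:6699", "song.mp3")
holders = index.lookup("song.mp3")
```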
10
P2P search models: Napster. Created in 1999 by Shawn Fanning, a freshman at Northeastern University, to share MP3 music files freely. Central index server, P2P exchange. Sued several times and suspended. The music industry is against Napster because people can get music for free instead of paying for a CD. Napster's defense is that the files are personal files that people maintain on their own machines, and that Napster is therefore not responsible.
11
P2P search models: Flooded requests model. Each request from a peer is flooded (broadcast) to its directly connected peers (1), which in turn flood their own peers (2). The request propagates until a maximum number of hops is reached (typically 5 to 9) or the request is answered. Used by Gnutella. Requires a lot of bandwidth and does not scale. Good for company networks.
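The flooding behavior and its bandwidth cost can be sketched as a breadth-first traversal with a hop limit. This is a simplified model, assuming a static neighbor graph and that peers drop requests they have already seen; `flood_search`, `graph`, and `files` are illustrative names.

```python
from collections import deque


def flood_search(graph, files, start, wanted, ttl):
    """Flood a request up to `ttl` hops; return (peers that answered, messages sent)."""
    seen = {start}
    queue = deque([(start, None, ttl)])  # (peer, neighbor it heard from, hops left)
    hits, messages = [], 0
    while queue:
        peer, sender, remaining = queue.popleft()
        if peer != start and wanted in files.get(peer, ()):
            hits.append(peer)
        if remaining == 0:
            continue  # hop limit reached: do not forward further
        for neighbor in graph[peer]:
            if neighbor == sender:
                continue  # never send the request back where it came from
            messages += 1  # duplicates still cost bandwidth even if dropped
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, peer, remaining - 1))
    return hits, messages


graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
files = {"e": {"x.mp3"}}
near = flood_search(graph, files, "a", "x.mp3", ttl=2)  # e is 3 hops away: no hit
far = flood_search(graph, files, "a", "x.mp3", ttl=3)   # reaches e
```

Even in this five-node graph, the successful search costs six messages to find one file, which hints at why flooding does not scale.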
12
P2P search models: Document routing model. Each peer is assigned a random ID, and each peer knows a number of other peers. When a document is published, an ID is computed by hashing the document's contents and name. Each peer routes the document toward the peer whose ID is most similar to the document ID, until the current peer's own ID is the nearest one.
13
P2P search models: Document routing model. When a peer requests the document, the request goes to the peer whose ID is most similar to the document ID. This process is repeated until a copy of the document is found. The document is then transferred back to the request originator, and each peer participating in the routing keeps a local copy.
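The routing rule can be sketched as a greedy walk over peer IDs. This sketch assumes that "most similar" means numerically closest and that IDs are 16-bit integers taken from a SHA-1 digest; both are simplifications chosen for illustration, not the closeness metric of any particular system.

```python
import hashlib


def doc_id(name, contents):
    """Hash the document name and contents into a 16-bit ID (assumed ID size)."""
    digest = hashlib.sha1(name.encode() + b"\0" + contents).digest()
    return int.from_bytes(digest[:2], "big")


def route(neighbors, start, target_id):
    """Hop to the known peer whose ID is closest to target_id until no peer is closer."""
    path = [start]
    current = start
    while True:
        candidates = neighbors[current] | {current}
        # Tie-break on the ID itself so every hop strictly improves and the walk ends.
        best = min(candidates, key=lambda p: (abs(p - target_id), p))
        if best == current:
            return path  # the current peer's ID is the nearest one it knows of
        current = best
        path.append(current)


# Peer ID -> set of peer IDs it knows (hypothetical overlay).
neighbors = {
    10: {200, 900},
    200: {10, 450},
    450: {200, 500},
    500: {450, 900},
    900: {10, 500},
}
path = route(neighbors, start=10, target_id=480)  # → [10, 200, 450, 500]
```

Publishing and requesting use the same walk: a publisher routes the document along `path`, and a requester follows the same rule until it meets a peer holding a copy. Caching on the way back would amount to storing the document at every peer in `path`.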
14
P2P search models: Document routing model. Efficient for large communities, but the document ID must be known before a request can be posted. Used in Freenet. Four improved algorithms: Chord, CAN, Tapestry, and Pastry.
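As one illustration of the improved algorithms, Chord places node IDs and keys on a ring and assigns each key to its successor, the first node whose ID is equal to or follows the key. The sketch below shows only that assignment rule, assuming an 8-bit ID space and a globally known node list; real Chord resolves the successor with finger tables in a logarithmic number of hops, which is not modeled here.

```python
from bisect import bisect_left


def successor(node_ids, key, ring_size=2 ** 8):
    """Return the node responsible for `key`: the first node ID >= key, wrapping around."""
    ring = sorted(node_ids)
    i = bisect_left(ring, key % ring_size)
    return ring[i % len(ring)]  # past the last node, wrap to the first one


nodes = [10, 80, 200]
owner_mid = successor(nodes, 81)    # → 200
owner_wrap = successor(nodes, 201)  # → 10 (wraps around the ring)
owner_exact = successor(nodes, 80)  # → 80
```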
15
Characteristics: Decentralization. Centralized systems are ideal for some applications, but they create bottlenecks, use resources inefficiently, and are expensive to set up and hard to maintain. Decentralized systems: P2P emphasizes the users' ownership and control of data and resources. Full decentralization is difficult in practice, hence the hybrid approach.
16
Characteristics: Scalability. Limited by several factors: the amount of centralized operations, the amount of state, and the inherent parallelism an application exhibits. Scalability also depends on the ratio of communication to computation between the nodes. Napster: can scale up to over 6 million users. SETI@home: close to 3.5 million users so far.
17
Characteristics: Anonymity. One goal of P2P is to allow people to use systems without concern for legal issues. Three kinds of anonymity: sender anonymity, receiver anonymity, and mutual anonymity. Gnutella: the request is broadcast and rebroadcast until it reaches a peer with the content. Freenet: the request is sent and forwarded to the peer that is most likely to have the content.
18
Characteristics: Self-organization. Needed because of scalability, fault resilience, and the cost of ownership. Adaptation is required to handle the changes caused by peers connecting to and disconnecting from the P2P system. Cost of ownership: P2P reduces the cost of owning the systems and the content, and the cost of maintaining them. SETI@home is faster than the fastest supercomputer in the world at 1% of the cost. Ad hoc connectivity: has a strong effect on all classes of P2P systems.
19
Characteristics: Performance. Influenced by three types of resources: processing, storage, and networking. Three key approaches to optimizing performance: Replication: puts copies of objects/files closer to the requesting peers. Caching: reduces the path length required to fetch a file/object and therefore the number of messages exchanged between the peers. Intelligent routing and network organization.
20
Taxonomy of P2P systems
21
Distributed Computing - Processing scalability in massive multi-parameter systems - Run by a central controller - Fork and join mechanism - Limitations: the work must split into independent small parts; Internet latencies - Intel claims a speed-up from 15 hours to 30 minutes for interest rate swap modeling using P2P
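The fork and join mechanism can be sketched with a local thread pool standing in for remote peers. This assumes the job splits into independent parameter sets, as the limitation above requires; `evaluate` is a made-up stand-in workload, not an interest rate swap model. The last line works out the speed-up Intel is quoted as claiming.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate(params):
    """Stand-in for one independent unit of work (hypothetical compound-growth formula)."""
    rate, years = params
    return (1 + rate) ** years


def fork_join(work_items, workers=4):
    # Fork: the controller hands each independent item to a worker.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(evaluate, work_items))
    # Join: the controller combines the partial results.
    return sum(partials)


total = fork_join([(0.0, 5), (1.0, 2)])  # 1.0 + 4.0
speedup = (15 * 60) / 30                 # 15 hours vs. 30 minutes, in minutes → 30.0
```

A 30x speed-up on this kind of workload is plausible only because the parts do not need to talk to each other; any cross-peer communication would be exposed to Internet latencies.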
22
SETI@home (Search for Extraterrestrial Intelligence). SETI is a collection of research projects aimed at discovering alien civilizations. Goal: to search for extraterrestrial radio emissions. Design: two major components, a data server and a client. Decentralization and scalability: distributes files (350 KB each) to its users.
23
Jay Sheth - Collaboration - Application-level collaboration between users - Event-based applications such as instant messaging, chat, and online games - Challenges: location of other peers (e.g., NetMeeting requires knowing the other peers' IP addresses); real-time constraints (e.g., the game DOOM)
24
Jay Sheth - Platforms - Platforms provide support for the primary P2P components: naming, discovery, communication, security, and resource aggregation - Candidates for future P2P platforms: .NET, JXTA
25
Platforms (JXTA). JXTA = Juxtapose = side by side. Open-source initiative from Sun (Java). “JXTA™ technology is a set of open protocols that allow any connected device on the network ranging from cell phones and wireless PDAs to PCs and servers to communicate and collaborate in a P2P manner.” “JXTA peers create a virtual network where any peer can interact with other peers and resources directly even when some of the peers and resources are behind firewalls and NATs or are on different network transports.” Objectives: Interoperability: across systems and communities. Platform independence: multiple/diverse languages, systems, and networks. Ubiquity: every device with a digital heartbeat.
26
Platforms (JXTA). Architecture: JXTA application layer, JXTA service layer, JXTA core layer. Set of 6 protocols: Peer Endpoint Protocol: find an available route to a destination. Peer Rendezvous Protocol: sign in/out, authentication. Peer Resolver Protocol: send/receive search queries for peers. Pipe Binding Protocol: bind a pipe advertisement to a pipe endpoint. Peer Information Protocol: learn a peer's status/properties. Peer Discovery Protocol: find peers, groups, and advertisements.
27
File Sharing - Content storage and exchange is where P2P is most successful: Napster, Gnutella, Kazaa
28
Gnutella Protocol v0.4 (1/5). One of the most popular file-sharing protocols. Operates without a central index server (unlike Napster). Clients (downloaders) are also servers => servents. Clients may join or leave the network at any time => highly fault-tolerant, but at a cost! Searches are done within the virtual network, while the actual downloads are done outside it (over HTTP). The core of the protocol consists of 5 descriptors (PING, PONG, QUERY, QUERYHIT, and PUSH).
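To make the five descriptors concrete, here is a sketch of the descriptor header as laid out in the Gnutella v0.4 specification: a 16-byte descriptor ID, one byte each for payload type, TTL, and hops, and a 4-byte little-endian payload length, 23 bytes in total. The helper names `pack_header` and `parse_header` are illustrative; payload parsing is omitted.

```python
import struct

# 16-byte descriptor ID, payload type, TTL, hops, payload length (little-endian).
HEADER = struct.Struct("<16sBBBI")

DESCRIPTORS = {0x00: "PING", 0x01: "PONG", 0x40: "PUSH", 0x80: "QUERY", 0x81: "QUERYHIT"}


def pack_header(descriptor_id, kind, ttl, hops, payload_length):
    return HEADER.pack(descriptor_id, kind, ttl, hops, payload_length)


def parse_header(data):
    descriptor_id, kind, ttl, hops, length = HEADER.unpack(data[:HEADER.size])
    return {
        "id": descriptor_id,
        "type": DESCRIPTORS.get(kind, "UNKNOWN"),
        "ttl": ttl,
        "hops": hops,
        "payload_length": length,
    }


raw = pack_header(b"\x01" * 16, 0x80, ttl=7, hops=0, payload_length=18)
header = parse_header(raw)
```

Each servent that forwards a descriptor decrements `ttl` and increments `hops`, so the two fields together bound how far a message can travel.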
29
Gnutella Protocol (2/5). A peer p needs to connect to one or more other Gnutella peers in order to participate in the virtual network. Initially, p does not know the IP addresses of its fellow file-sharers. [Figure: servent p outside Gnutella network N]
30
Gnutella Protocol (3/5). a. HostCaches: the initial connection. p connects to a HostCache H to obtain a set of IP addresses of active peers. Alternatively, p might probe its own cache to find peers it was connected to in the past. [Figure: (1) servent p requests and receives a set of active peers from H; (2) p connects to Gnutella network N]
31
Gnutella Protocol (4/5). b. Ping/Pong: the communication overhead. Although p is already connected, it must keep discovering new peers, since its current connections may break. It therefore periodically sends PING messages, which are broadcast (message flooding). If a host, e.g. p2, is available, it responds with a PONG, which is routed back only along the path the PING came from. p might use this response and attempt a connection to p2 in order to increase its degree. [Figure: servent p sends PING (1) into Gnutella network N; servent p2 answers with PONG (2)]
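The reverse-path rule for PONGs can be sketched as follows: every servent remembers which neighbor delivered a PING with a given descriptor ID and sends the matching PONG back through that neighbor. The `Servent` class below is a simplified in-memory model with synchronous calls instead of sockets, and its method names are hypothetical.

```python
class Servent:
    """Simplified servent: floods PINGs, routes PONGs back along the PING's path."""

    def __init__(self, name):
        self.name = name
        self.neighbors = []
        self.came_from = {}  # descriptor ID -> neighbor that delivered the PING
        self.pongs = []      # responders heard from, for PINGs this servent originated

    def connect(self, other):
        self.neighbors.append(other)
        other.neighbors.append(self)

    def send_ping(self, descriptor_id, ttl):
        self.came_from[descriptor_id] = None  # None marks this servent as the originator
        for neighbor in self.neighbors:
            neighbor.on_ping(descriptor_id, ttl, self)

    def on_ping(self, descriptor_id, ttl, sender):
        if descriptor_id in self.came_from:
            return  # duplicate PING: drop it
        self.came_from[descriptor_id] = sender
        sender.on_pong(descriptor_id, self.name)  # answer along the incoming link
        if ttl > 1:
            for neighbor in self.neighbors:
                if neighbor is not sender:
                    neighbor.on_ping(descriptor_id, ttl - 1, self)

    def on_pong(self, descriptor_id, responder):
        if descriptor_id not in self.came_from:
            return  # never saw the matching PING: drop the PONG
        previous = self.came_from[descriptor_id]
        if previous is None:
            self.pongs.append(responder)  # back at the originator
        else:
            previous.on_pong(descriptor_id, responder)  # one hop back toward it


p, p1, p2, p3 = (Servent(n) for n in ("p", "p1", "p2", "p3"))
p.connect(p1)
p1.connect(p2)
p2.connect(p3)
p.send_ping("ping-1", ttl=2)  # reaches p1 and p2, but not p3
```

After the call, `p.pongs` holds `["p1", "p2"]`, and `p` could open a direct connection to `p2` to increase its degree.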
32
Gnutella Protocol (5/5). c. Query/QueryHit: the utilization. Query descriptors contain unstructured queries, e.g. “celine dion mp3”. Like PINGs, they are broadcast, typically with TTL=7. If a host, e.g. p2, matches the query, it responds with a QueryHit descriptor. d. Push: enables downloads from peers that are firewalled. If a peer is firewalled, we cannot connect to it. Instead, we ask it to establish a connection to us and send us the file.
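A small sketch of how a servent might answer a Query and how a requester might choose between a direct download and a Push. The protocol leaves query matching to each servent, so the rule used here (every query term must appear in the file name, case-insensitively) is an assumption, and the function names are hypothetical.

```python
def matches(query, filename):
    """Assumed matching rule: every whitespace-separated term occurs in the file name."""
    name = filename.lower()
    return all(term in name for term in query.lower().split())


def answer_query(query, shared_files):
    """Return the shared files a QueryHit would list for this query."""
    return [f for f in shared_files if matches(query, f)]


def download_plan(responder_firewalled):
    # A firewalled responder cannot accept incoming connections, so the requester
    # sends a PUSH and the responder opens the connection in the other direction.
    if responder_firewalled:
        return "send PUSH and wait for the responder to connect back"
    return "connect to the responder and fetch the file over HTTP"


shared = ["Celine_Dion-Track01.mp3", "celine_dion_live.avi", "notes.txt"]
hits = answer_query("celine dion mp3", shared)
```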
33
Conclusions. Nothing new, but the right time to: take advantage of available resources; find an alternative to centralized client/server solutions. There is something attractive about the defiance or avoidance of authority. P2P has raised legal copyright issues. Currently, 60% to 89% of all Internet traffic is due to P2P traffic => source of revenue => marketing argument. Potentially a good match between ad hoc networks and P2P. Interesting architectural and technical issues behind it, and challenging requirements.
34
Summary of P2P computing