1
1 School of Computing Science Simon Fraser University CMPT 880: Peer-to-Peer Systems Mohamed Hefeeda 10 January 2005
2
2 Course Logistics Time & place -MW 3:00 – 4:20 PM, SUR 15-300 Instructor -Mohamed Hefeeda -Office: SUR 15-260 -Office hours: MW 4:30 – 5:30 or by appointment -mhefeeda@cs.sfu.ca Web page -www.cs.sfu.ca/~mhefeeda/Courses/05/P2P/
3
3 Course Objectives In-depth study of the peer-to-peer computing paradigm, a very active research area in networking and distributed systems
4
4 Course Objectives (cont’d) Learn how to effectively read, criticize, discuss, and present research papers Learn how to search for and develop new research ideas Learn how to write and defend your own work Hopefully, you will find interesting ideas that either strengthen your current research, or help you jump start a new research path
5
5 Course Prerequisites Enthusiasm! -To read and explore new research ideas -To actively participate in the discussion Some Computer Networks background -E.g., undergraduate course in networks or distributed systems -We will present the necessary concepts throughout the course
6
6 Course Load and Policy Paper presentations 25% -One or more Paper critique 15% -For each presented paper, write a one-page critique summarizing: (1) contributions; (2) weaknesses, concerns, and flaws; and (3) suggestions for improvements -Due BEFORE the presentation -You may skip up to 3 paper reviews Class participation 10%
7
7 Course Load and Policy (cont’d) Project 50% -On something related to P2P systems Theoretical (e.g., new algorithm), Measurement study, Performance analysis, Comparative study, Implementation and experimentation, Survey We will discuss possible projects -Propose your own project and get bonus points!
8
8 Research Facilities Access to a wide area test bed composed of 400+ nodes distributed all over the world (PlanetLab) Local area test bed (e.g., cluster of nodes, small LAN, …) can be arranged as well Access to traffic logs and statistics from SFU and other institutions may be arranged (subject to use and privacy policies) Most importantly, constructive critique and suggestions from your fellow students and the instructor
9
9 Course Schedule Weeks 1-2 -Introduction to P2P systems (instructor) -End of week 2, discussion of possible projects Weeks 3-11 -Paper presentations (students) -End of week 3, 1-2 page project proposal is due -In week 7, students will present the status of their projects (5-7 min talk for each student) Weeks 12-13 -Project presentations and discussions -In week 12, project final report is due
10
10 Course Topics (tentative) Introduction Basic algorithms -Routing, overlay management, replication, … Modeling and Analysis -Peer characteristics -Traffic analysis -System modeling Security -Possible attacks -Trust management -Anonymity and Privacy
11
11 Course Topics (cont’d) Rationality -Definitions -Incentive mechanisms to combat rationality -Designing incentive-compatible P2P protocols Current and Potential Applications -File sharing -Storage and file systems -Distributed cycle sharing -Streaming and content distribution
12
12 Advice on Reading and Writing Papers Jamin, Paper Reading and Writing Checklists Hanson and McNamee, Efficient Reading of Papers in Science and Technology
13
13 Introduction to Peer-to-Peer Systems
14
14 P2P Computing: Definitions Peers cooperate to achieve desired functions -Peers: End-systems (typically, user machines) Interconnected through an overlay network Peer ≡ Like the others (similar or behave in similar manner) -Cooperate: Share resources, e.g., data, CPU cycles, storage, bandwidth Participate in protocols, e.g., routing, replication, … -Functions: File-sharing, distributed computing, communications, content distribution, … Note: the P2P concept is much wider than file sharing
15
15 Overlay Network
16
16 When Did P2P Start? Napster (late 1990s) -Court shut Napster down in 2001 Gnutella (2000) Then the killer FastTrack (Kazaa, ...), BitTorrent, and many others Accompanied by significant research interest Claim -P2P is much older than Napster! Proof -The original Internet! -Remember UUCP (Unix-to-Unix Copy)?
17
17 What IS and IS NOT New in P2P? What is not new -Concepts! What is new -The term P2P (maybe!) -New characteristics of Nodes which constitute the System that we build
18
18 What IS NOT New in P2P? Distributed architectures Distributed resource sharing Node management (join/leave/fail) Group communications Distributed state management ….
19
19 What IS New in P2P? Nodes (Peers) -Quite heterogeneous Several orders of magnitude difference in resources Compare the bandwidth of a dial-up peer versus a high-speed LAN peer -Unreliable Failure is the norm! -Offer limited capacity Load sharing and balancing are critical -Autonomous Rational, i.e., maximize their own benefits! Incentives should be provided to peers to cooperate in a way that optimizes the system performance
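To make the dial-up versus LAN comparison concrete, here is a small back-of-the-envelope sketch. The link speeds (56 kbit/s and 100 Mbit/s) and the 5 MiB file size are illustrative assumptions, not figures from the slides, and the transfer times ignore protocol overhead.

```python
import math

# Assumed nominal link speeds (illustrative, not taken from the slides).
DIALUP_BPS = 56_000        # 56 kbit/s dial-up modem
LAN_BPS = 100_000_000      # 100 Mbit/s LAN


def transfer_seconds(size_bytes, link_bps):
    """Idealized time to move size_bytes over a link, ignoring overhead."""
    return size_bytes * 8 / link_bps


ratio = LAN_BPS / DIALUP_BPS
orders_of_magnitude = math.log10(ratio)

file_size = 5 * 1024 * 1024  # hypothetical 5 MiB file
print(f"bandwidth ratio: {ratio:.0f}x "
      f"({orders_of_magnitude:.1f} orders of magnitude)")
print(f"dial-up: {transfer_seconds(file_size, DIALUP_BPS):.0f} s")
print(f"LAN:     {transfer_seconds(file_size, LAN_BPS):.2f} s")
```

Under these assumptions the two peers differ by more than three orders of magnitude in bandwidth, which is why a protocol that treats all peers as equal tends to overload the weakest ones.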
20
20 What IS New in P2P? (cont’d) System -Scale Very large number of peers (millions) -Structure and topology Ad-hoc: No control over peer joining/leaving Highly dynamic -Membership/participation Typically open -More security concerns Trust, privacy, data integrity, … -Cost of building and running Small fraction of same-scale centralized systems How much would it cost to build/run a supercomputer with the processing power of 3 million SETI@Home PCs?
21
21 What IS New in P2P? (cont’d) So what? We need to design new lighter-weight algorithms and protocols to scale to millions (or billions!) of nodes given the new characteristics Question: why now, not two decades ago? -We did not have such abundant (and underutilized) computing resources back then! -And, network connectivity was very limited
22
22 Why is it Important to Study P2P? P2P traffic is a major portion of Internet traffic (50+%), current killer app P2P traffic has exceeded web traffic (former killer app)! Direct implications on the design, administration, and use of computer networks and network resources -Think of ISP designers or campus network administrators Many potential distributed applications
23
23 Sample P2P Applications File sharing -Gnutella, Kazaa, Napster, … Distributed cycle sharing -SETI@home, Genome@home, … File and storage systems -OceanStore, CFS, Freenet, Farsite, … Media streaming and content distribution -PROMISE -SplitStream, CoopNet, PeerCast, Bullet, Zigzag, NICE, …
24
24 P2P vs its Cousin (Grid Computing) Common Goal: -Aggregate resources (e.g., storage, CPU cycles, and data) into a common pool and provide efficient access to them Differences along five axes [Foster & Iamnitchi 03] -Target communities and applications -Type of shared resources -Scalability of the system -Services provided -Software required
25
25 P2P vs Grid Computing (cont’d)
Issue: Communities and Applications
  Grid: Established communities, e.g., scientific institutions; computationally intensive problems
  P2P: Grass-roots communities (anonymous); mostly file swapping
Issue: Resources Shared
  Grid: Powerful and reliable machines, clusters; high-speed connectivity; specialized instruments
  P2P: PCs with limited capacity and connectivity; unreliable; very diverse
26
26 P2P vs Grid Computing (cont’d)
Issue: System Scalability
  Grid: Hundreds to thousands of nodes
  P2P: Hundreds of thousands to millions of nodes
Issue: Services Provided
  Grid: Sophisticated services: authentication, resource discovery, scheduling, access control, and membership control; members usually trust each other
  P2P: Limited services: resource discovery; limited trust among peers
Issue: Software Required
  Grid: Sophisticated suite, e.g., Globus, Condor
  P2P: Simple (screen saver), e.g., Kazaa, SETI@Home
27
27 P2P vs Grid Computing: Discussion The differences mentioned are based on the traditional view of each paradigm -In the future, both paradigms are expected to converge and complement each other [e.g., Butt et al. 03] Target communities and applications -Grid: is becoming more open Type of shared resources -P2P: is to include various and more powerful resources Scalability of the system -Grid: is to increase number of nodes Services provided -P2P: is to provide authentication, data integrity, trust management, …
28
28 P2P Systems: Simple Model System architecture: Peers form an overlay according to the P2P Substrate [Figure: software architecture model on a peer, with layers P2P Application, Middleware, P2P Substrate, Operating System, Hardware]
29
29 Overlay Network An abstract layer built on top of the physical network Neighbors in the overlay can be several hops away in the physical network Why do we need overlays? -Flexibility in Choosing neighbors Forming and customizing topology to fit application needs (e.g., short delay, reliability, high BW, …) Designing communication protocols among nodes -Get around limitations in legacy networks -Enable new (and old!) network services
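The point that overlay neighbors can be several hops apart in the physical network can be sketched as follows. The topology below (hosts A, B, C and routers R1 to R4) is entirely hypothetical, chosen only to illustrate the idea: an overlay link is a single logical hop, while the packets it carries cross several physical links.

```python
from collections import deque

# Hypothetical physical topology: hosts A, B, C attached to routers R1..R4.
PHYSICAL = {
    "A": ["R1"],
    "B": ["R4"],
    "C": ["R2"],
    "R1": ["A", "R2"],
    "R2": ["R1", "C", "R3"],
    "R3": ["R2", "R4"],
    "R4": ["R3", "B"],
}

# Overlay links exist only between end systems (peers).
OVERLAY = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}


def physical_hops(src, dst, graph=PHYSICAL):
    """Breadth-first search: number of physical links on a shortest path."""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # dst is unreachable from src


for peer, neighbors in OVERLAY.items():
    for neighbor in neighbors:
        hops = physical_hops(peer, neighbor)
        print(f"overlay link {peer}-{neighbor}: {hops} physical hops")
```

In this toy topology, A and B are direct overlay neighbors even though they sit at opposite ends of the router chain, which is exactly the freedom (and the potential inefficiency) that overlay neighbor selection has to manage.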
30
30 Overlay Network (cont’d)
31
31 Overlay Network (cont’d) Some applications that use overlays -Application level multicast, e.g., ESM, Zigzag, NICE, … -Reliable inter-domain routing, e.g., RON -Content Distribution Networks (CDN) -Peer-to-peer file sharing Overlay design issues -Select neighbors -Handle node arrivals, departures -Detect and handle failures (nodes, links) -Monitor and adapt to network dynamics
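Two of the design issues listed above, handling arrivals and departures and detecting failures, are often approached with periodic heartbeats and a timeout. The sketch below is one minimal way to do that; the class name `NeighborTable`, the 30-second timeout, and the peer names are assumptions made for illustration, not part of any system named in the slides. Timestamps are passed in explicitly so the example is deterministic.

```python
class NeighborTable:
    """Tracks overlay neighbors and flags those whose heartbeats stop."""

    def __init__(self, timeout):
        self.timeout = timeout   # seconds of silence tolerated (assumed value)
        self.last_seen = {}      # peer id -> time of last heartbeat

    def join(self, peer, now):
        """A new neighbor arrives."""
        self.last_seen[peer] = now

    def leave(self, peer):
        """A neighbor departs gracefully."""
        self.last_seen.pop(peer, None)

    def heartbeat(self, peer, now):
        """Record a heartbeat from a known neighbor."""
        if peer in self.last_seen:
            self.last_seen[peer] = now

    def suspected_failed(self, now):
        """Neighbors that have been silent for longer than the timeout."""
        return sorted(p for p, seen in self.last_seen.items()
                      if now - seen > self.timeout)

    def evict_failed(self, now):
        """Drop suspected-failed neighbors and return their ids."""
        failed = self.suspected_failed(now)
        for peer in failed:
            del self.last_seen[peer]
        return failed


table = NeighborTable(timeout=30.0)
for name in ("peer-a", "peer-b", "peer-c"):
    table.join(name, now=0.0)
table.leave("peer-c")                    # graceful departure
table.heartbeat("peer-a", now=25.0)      # peer-a is still alive
failed = table.evict_failed(now=40.0)    # peer-b has been silent too long
remaining = sorted(table.last_seen)
print("evicted:", failed, "remaining:", remaining)
```

A timeout-based detector can only suspect a failure: a slow or congested peer looks the same as a dead one, so the timeout value trades detection speed against false evictions.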
32
32 Overlay Network (cont’d) IP Multicast
33
33 Overlay Network (cont’d) Application Level Multicast (ALM)
34
34 Peer Software Architecture Model A software client installed on each peer Three components: -P2P Substrate -Middleware -P2P Application [Figure: software architecture model on a peer, with layers P2P Application, Middleware, P2P Substrate, Operating System, Hardware]
35
35 Peer Software Architecture Model (cont’d) P2P Substrate (key component) -Overlay management Construction Maintenance (peer join/leave/fail and network dynamics) -Resource management Allocation (storage) Discovery (routing and lookup) Can be classified according to the flexibility of placing objects at peers
36
36 P2P Substrates: Classification Structured (or tightly controlled, DHT) −Objects are rigidly assigned to specific peers −Looks like a Distributed Hash Table (DHT) −Efficient search & guarantee of finding −Lack of partial name and keyword queries −Maintenance overhead −Ex: Chord, CAN, Pastry, Tapestry, Kademlia (Overnet) Unstructured (or loosely controlled) −Objects can be anywhere −Support partial name and keyword queries −Inefficient search & no guarantee of finding −Some heuristics exist to enhance performance −Ex: Gnutella, Kazaa (super node), GIA [Chawathe et al. 03]
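The contrast between the two classes can be sketched in a few lines. The first function shows structured placement in the spirit of consistent hashing: a key is hashed onto a ring and assigned to the first peer at or after that position. It illustrates only the placement rule, not the routing protocol of Chord or any other DHT listed above. The second function shows unstructured search as TTL-limited flooding with substring matching. All peer names, file names, the 16-bit ring size, and the TTL values are hypothetical.

```python
import hashlib
from bisect import bisect_left
from collections import deque

RING_BITS = 16  # assumed small identifier space, for readability


def ring_id(name):
    """Hash a peer name or object key onto the identifier ring."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (1 << RING_BITS)


def responsible_peer(key, peers):
    """Structured placement: the key belongs to its successor on the ring."""
    ring = sorted((ring_id(p), p) for p in peers)
    ids = [peer_id for peer_id, _ in ring]
    idx = bisect_left(ids, ring_id(key)) % len(ring)  # wrap past the top
    return ring[idx][1]


def flood_search(start, keyword, links, files, ttl):
    """Unstructured search: flood up to ttl hops, match by substring."""
    hits = []
    seen = {start}
    queue = deque([(start, ttl)])
    while queue:
        peer, remaining = queue.popleft()
        hits += [(peer, f) for f in files.get(peer, []) if keyword in f]
        if remaining == 0:
            continue  # TTL exhausted: do not forward further
        for nxt in links.get(peer, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, remaining - 1))
    return hits


peers = ["p1", "p2", "p3", "p4"]
owner = responsible_peer("p2p_lecture.pdf", peers)  # exact key only

links = {"p1": ["p2"], "p2": ["p1", "p3"], "p3": ["p2", "p4"], "p4": ["p3"]}
files = {"p2": ["notes.txt"], "p4": ["p2p_lecture.pdf"]}
near = flood_search("p1", "lecture", links, files, ttl=2)  # p4 is out of reach
far = flood_search("p1", "lecture", links, files, ttl=3)
print("owner:", owner, "ttl=2:", near, "ttl=3:", far)
```

The structured lookup always names exactly one responsible peer but needs the full key, whereas the flood accepts a partial keyword and can miss an object that lies beyond the TTL horizon, matching the trade-offs listed on the slide.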
37
37 Peer Software Architecture Model (cont’d) Middleware -Provides auxiliary services to the P2P application, e.g., Peer selection Trust management Data integrity validation Authentication and authorization Membership management Accounting (Economics and rationality) … -Ex: CollectCast, EigenTrust, Micropayment
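As one example of a middleware service, data integrity validation can be sketched with cryptographic digests: the receiver compares the hash of each downloaded segment against a digest obtained from a source it trusts. This is a generic illustration, not the mechanism of CollectCast, EigenTrust, or any other system named above; the function names and the assumption that trusted digests are available out of band are hypothetical.

```python
import hashlib


def digest(data):
    """SHA-256 digest of a segment, as a hex string."""
    return hashlib.sha256(data).hexdigest()


def validate_segments(segments, expected_digests):
    """Return the indices of segments whose digest does not match."""
    return [i for i, (seg, exp) in enumerate(zip(segments, expected_digests))
            if digest(seg) != exp]


# Digests published by a trusted source (assumed to arrive out of band).
original = [b"seg-0", b"seg-1", b"seg-2"]
expected = [digest(seg) for seg in original]

# Segments as received from (possibly misbehaving) peers.
received = [b"seg-0", b"tampered", b"seg-2"]
corrupted = validate_segments(received, expected)
print("segments to re-fetch:", corrupted)
```

Checking per segment rather than per file lets a peer re-fetch only the bad pieces, possibly from a different sender.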
38
38 Peer Software Architecture Model (cont’d) P2P Application -Potentially, there could be multiple applications running on top of a single P2P substrate -Applications include File sharing File and storage systems Distributed cycle sharing Content distribution -This layer provides some functions and bookkeeping relevant to the target application File assembly (file sharing) Buffering and rate smoothing (streaming) Ex: Promise, Bullet, CFS, Gnutella, Kazaa
39
39 Outline of the Rest of the Introduction P2P Substrates -Structured (DHT) Example: CAN -Unstructured Example 1: Gnutella Example 2: Kazaa Middleware and P2P Application -Example: CollectCast and Promise Course Roadmap: -Papers flash overview (1-2 min each!) Project discussion
40
40 Summary In the P2P computing paradigm: -Peers cooperate to achieve desired functions Started (or re-discovered) with Napster ’98 Old, well-researched distributed concepts BUT, with new characteristics (e.g., heterogeneity, unreliability, rationality, scale, ad hoc), new and lighter-weight algorithms are needed Simple model for P2P systems: -Peers form an abstract layer called overlay -A peer software client may have three components P2P substrate, middleware, and P2P application Borders between components may be blurred Next lecture: Structured P2P substrates (DHTs)