Zuse-Institute Berlin (ZIB) Computer Science Research Artur Andrzejak Zuse-Institute Berlin (ZIB) Overview: Challenges in P2P Systems.


1 Overview: Challenges in P2P Systems
Artur Andrzejak, Zuse-Institute Berlin (ZIB)

2 What is a Peer-To-Peer System?
>Participants are autonomous (different owners)
>Resources are distributed
>Sites have equal functionality; each site acts as
  >a client, when accessing information
  >a server, when serving information to other peers
  >a router, when forwarding information
  ... and so sites are called "peers"
>Large number of participants

3 P2P – a Bad Idea?
>"Distribution is expensive, specialized functionality is good!" (Garcia-Molina)
>If distribution is necessary (e.g. for reliability):
  >build a centralized directory and use backups
>=> computational efficiency suffers in the P2P scenario!

4 So Why Do P2P Systems Exist at All?
>User's view:
  >exploiting existing inexpensive resources
  >sharing costs among many
  >legal protection
  >autonomy
  >anonymity
>Researcher's view:
  >Scalability
  >Self-organization and low management cost
  >High availability and fault tolerance

5 Main Challenges
>Search
>Reliability and security
>(Resource Management)

6 Main Challenges
>Search
>Reliability and security

7 Search Mechanism Characteristics
>Comprehensiveness and guarantees
  >Many of today's systems do not guarantee that existing items will be found at all, or they do not find all items
>Query expressiveness
  >Today: only key/keyword searches; range queries, aggregates and SQL-like queries are desirable
>Efficiency
  >A major problem: too many messages for searching; some systems even use flooding
>Robustness
>Autonomy

8 Search Mechanism Determines ...
>Topology
  >From arbitrary (Gnutella) to rigid (Napster)
  >Rigid topology increases efficiency but decreases autonomy
>Placement of Data/Metadata
  >Gnutella: only own data; Chord: data/metadata is carefully distributed over the whole network; superpeers: metadata is centralized at the superpeers
>Message Routing
  >Each query message is sent to a group of peers
  >From unstructured flooding (Gnutella) to sophisticated protocols (Chord, CAN etc.)

9 Gnutella – How it Works
[Figure: message flow between peers, with arrows labelled query, hit, download]
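The query step on this slide can be sketched as a flood with a hop limit. This is a minimal, simplified model rather than the Gnutella protocol itself: the overlay graph, the peer names, the `has_item` callback and the `ttl` parameter are all illustrative assumptions, and a peer here forwards to every neighbour (including the one it heard the query from).

```python
from collections import deque

def flood_query(graph, start, has_item, ttl):
    """Flood a query from `start`; return (peers that report a hit, messages sent)."""
    seen = {start}
    hits = []
    messages = 0
    frontier = deque([(start, ttl)])
    while frontier:
        peer, remaining = frontier.popleft()
        if has_item(peer):
            hits.append(peer)
        if remaining == 0:
            continue                      # hop limit reached, do not forward
        for neighbour in graph[peer]:
            messages += 1                 # every forwarded copy costs one message
            if neighbour not in seen:     # duplicate copies are dropped on arrival
                seen.add(neighbour)
                frontier.append((neighbour, remaining - 1))
    return hits, messages

# Hypothetical five-peer overlay; only peer "E" holds the requested item.
overlay = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
hits, messages = flood_query(overlay, "A", lambda p: p == "E", ttl=3)
```

With `ttl=3` the query reaches "E"; with `ttl=2` it does not, although the item exists. That is the missing comprehensiveness guarantee, and the message count shows why flooding scores low on efficiency.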

10 Gnutella – Characteristics
Characteristics            Gnutella
Comprehensiveness          ++
Expressiveness             ++++
Efficiency                 +
Autonomy                   ++++
Robustness                 +++
Architectural Properties   Gnutella
Topology                   power law
Data Placement             arbitrary
Message Routing            flooding

11 Chord – How it Works
>A key is stored at its successor: the node with the next higher ID
>Circular 160-bit ID space
>Finger i points to the successor of n + 2^i
[Figure: ring with nodes N32, N90, N105 (N105 = node 105) and keys K5, K20, K80 (K5 = key 5)]
[Figure: fingers of node N80 covering 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, 1/128 of the ring; labels: 112, N120]
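The two rules on this slide (a key lives at its successor; finger i points to the successor of n + 2^i) can be tried out on a toy ring. The sketch below is an illustration under assumptions, not the Chord implementation: it uses a 7-bit ID space instead of 160 bits, takes the node IDs 32, 90 and 105 from the figure, computes fingers from a global node list instead of maintaining them per node, and the function names are made up for this example.

```python
from bisect import bisect_left

M = 7              # toy ring with 2**7 IDs (assumption; the slide's ring has 160 bits)
RING = 2 ** M

def successor(nodes, ident):
    """First node ID >= ident, wrapping around the ring. `nodes` must be sorted."""
    i = bisect_left(nodes, ident % RING)
    return nodes[i % len(nodes)]

def fingers(nodes, n):
    """Finger i of node n points to successor(n + 2**i)."""
    return [successor(nodes, n + 2 ** i) for i in range(M)]

def between(x, a, b):
    """True if x lies strictly inside the clockwise interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b

def lookup(nodes, start, key):
    """Route a query for `key` from node `start`.
    Returns (responsible node, list of nodes the query visited)."""
    n, path = start, [start]
    while True:
        succ = successor(nodes, n + 1)
        if key == succ or between(key, n, succ):
            return succ, path            # key falls in (n, succ]: succ stores it
        # otherwise jump to the farthest finger that still precedes the key
        n = next(f for f in reversed(fingers(nodes, n)) if between(f, n, key))
        path.append(n)

nodes = [32, 90, 105]
owner_k80, path_k80 = lookup(nodes, 32, 80)
owner_k5, path_k5 = lookup(nodes, 32, 5)
```

Keys 5 and 20 end up at N32 and key 80 at N90, matching the figure. Each finger jump covers a large part of the remaining distance on the ring, which is what makes routing directed rather than flooded.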

12 Chord – Characteristics
Characteristics            Gnutella     Chord
Comprehensiveness          ++           ++++
Expressiveness             ++++         +
Efficiency                 +            ++++
Autonomy                   ++++         ++
Robustness                 +++          ++
Architectural Properties   Gnutella     Chord
Topology                   power law    ring
Data Placement             arbitrary    hashing
Message Routing            flooding     directed

13 Decouple Efficiency, Autonomy, Robustness
[Figure: Gnutella and Chord placed on three axes: autonomy, robustness, efficiency]
(From "Open Problems in Data Sharing Peer-To-Peer Systems" by Hector Garcia-Molina)

14 Novelty: Location-Independent Routing
>Each unique document or endpoint has a globally unique identifier (GUID)
>Locating data can be seen as a routing problem:
  >clients construct messages addressed with GUIDs and let peers pass these messages on until the object is located
>Known as the Decentralized Object Location and Routing (DOLR) paradigm or Distributed Hash Table (DHT)
>Advantages:
  >allows routing messages to objects without knowing their location
  >data can be stored anywhere, amidst millions of peers => scalability
  >provides locality: local resources are used instead of distant ones, if possible
>Implemented in Chord, CAN, Pastry, Tapestry
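A small sketch of what "location-independent" means in practice: the client only ever handles a GUID, and the overlay decides which peer keeps the pointer to the current holder. Everything here is an assumption made for illustration: GUIDs are derived with SHA-1 (which yields 160 bits), the `Overlay` class and its methods are hypothetical, and the whole overlay is simulated in one process instead of routing messages between peers.

```python
import hashlib

def guid(name: bytes) -> int:
    """160-bit identifier derived from a name (SHA-1 is an assumption here)."""
    return int.from_bytes(hashlib.sha1(name).digest(), "big")

class Overlay:
    """Toy DHT: the pointer for a GUID lives on the peer with the next higher ID."""

    def __init__(self, peer_names):
        self.ids = sorted(guid(n.encode()) for n in peer_names)
        self.pointers = {pid: {} for pid in self.ids}   # one pointer table per peer

    def _home(self, g):
        for pid in self.ids:
            if pid >= g:
                return pid
        return self.ids[0]                               # wrap around the ID space

    def publish(self, doc_name, holder):
        g = guid(doc_name.encode())
        self.pointers[self._home(g)][g] = holder

    def locate(self, doc_name):
        g = guid(doc_name.encode())
        return self.pointers[self._home(g)].get(g)

overlay = Overlay(["p1", "p2", "p3"])
overlay.publish("report.pdf", "p2")
first = overlay.locate("report.pdf")
overlay.publish("report.pdf", "p3")      # the document moves, its GUID does not
second = overlay.locate("report.pdf")
```

The caller never names a storage location; moving the document only changes the pointer kept at the GUID's home peer.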

15 Main Challenges
>Search
>Reliability and security

16 Essence: Untrusted/Unreliable Components
>Centralized systems have components which are professionally maintained and trusted to behave well
>Components of a P2P system may crash or fail at any time (unreliable components)
>Also, the participants might be adversarial, attempting to damage the system (untrusted components)
>Failure rate ~ system size => larger P2P systems are guaranteed to have malfunctioning components
>P2P system builders must invoke new design principles to achieve guarantees
  >"only the aggregate behaviour of many peers can be trusted"
>Techniques for untrusted components also solve the issues of unreliable ones (the converse is not true)
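The "failure rate ~ system size" point can be made concrete with a short calculation. The sketch assumes that peers fail independently and with the same probability `p` at any given moment; both the model and the numbers are illustrative and not taken from the slides.

```python
def p_any_failed(n, p):
    """Probability that at least one of n peers is down, each down with probability p."""
    return 1.0 - (1.0 - p) ** n

def expected_failed(n, p):
    """Expected number of peers that are down."""
    return n * p

small = p_any_failed(10, 0.01)       # a handful of peers
large = p_any_failed(1000, 0.01)     # the same peers, a hundred times more of them
```

Under these assumptions a 10-peer system is usually fully healthy, while a 1000-peer system almost never is, so a large system has to treat broken components as its normal state.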

17 Achieving Reliability and Security
>Replication
>Cryptography
>Byzantine Agreement
>Exploiting differences
>"Thermodynamic" Systems Design

18 Replication
>Redundancy helps to achieve fault tolerance by providing online replacements for faulty resources
>Advanced P2P systems (Intermemory, OceanStore, FreeHaven) use so-called erasure coding
  >Each chunk of data is transformed into many fragments
  >Very low Fraction of Blocks Lost Per Year (FBLPY)
>Losses per year for a 6-month repair interval: Std: 0.03 blocks; Erasure: 10^-35 blocks
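Why fragments beat plain copies at equal storage cost can be shown with a binomial calculation. This is a sketch under stated assumptions and does not reproduce the figures on the slide: each copy or fragment is lost independently with probability `p` before the next repair, a block survives erasure coding if any `m` of its `n` fragments survive, and the parameters (2 copies versus 16-of-32 fragments, `p = 0.1`) are chosen only for illustration.

```python
from math import comb

def loss_replication(copies, p):
    """Block is lost only if every copy is lost."""
    return p ** copies

def loss_erasure(n, m, p):
    """Block is lost if fewer than m of its n fragments survive."""
    return sum(comb(n, k) * (1 - p) ** k * p ** (n - k) for k in range(m))

p = 0.1
replicated = loss_replication(2, p)      # 2 full copies: 2x storage
coded = loss_erasure(32, 16, p)          # 32 fragments, any 16 suffice: also 2x storage
```

Both schemes store twice the original data, yet the coded variant loses a block only when more than half of 32 independent fragments disappear at once, which is orders of magnitude less likely than losing two copies.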

19 Byzantine Agreement
>Immutable (read-only) data can easily be signed ("sealed") by cryptographic means to detect and discard faulty information
>Repairs are also possible with these techniques
>However, some decisions are active: e.g. changing, replacing or deleting information
>These decisions must be taken collectively to eliminate the influence of corrupted nodes
>Here Byzantine Agreement can be used: a unified decision is taken only if a sufficient number of nodes agree
>Works if fewer than 1/3 of the nodes are compromised
>Applied in OceanStore and Farsite
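The one-third bound translates into simple arithmetic: with n nodes, at most f = floor((n - 1) / 3) may be faulty, and a decision needs a quorum large enough that any two quorums overlap in at least one correct node. The sketch below only shows this arithmetic and a vote tally. It is not an agreement protocol (no message rounds, no signatures), and the function names are made up for this example.

```python
from collections import Counter

def max_faulty(n):
    """Largest number of faulty nodes tolerated: n >= 3f + 1."""
    return (n - 1) // 3

def quorum(n):
    """Smallest vote count such that two quorums share more than f nodes."""
    return (n + max_faulty(n)) // 2 + 1

def tally(votes, n):
    """Return the decided value, or None if no value reaches a quorum."""
    if not votes:
        return None
    value, count = Counter(votes).most_common(1)[0]
    return value if count >= quorum(n) else None

decided = tally(["update", "update", "update", "delete"], n=4)
undecided = tally(["update", "update", "delete", "delete"], n=4)
```

With four nodes one faulty node is tolerated and three matching votes are required, so a single compromised node can neither force nor block the decision on its own.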

20 Exploiting Differences
>Some peers are "more equal" than others:
  >Different CPUs, memory, storage capacity, network connectivity
  >Some are professionally managed, others not
  >Physically, some are locked in secure rooms, others are public
>We can exploit these differences to tune performance, availability, reliability, security
>Examples:
  >Computers with higher connectivity as supernodes
  >Actively managed nodes for Byzantine Agreement
  >Placing archival data on servers deep in mountains

21 "Thermodynamic" Systems Design
>A new concept by John Kubiatowicz: "Stability through Statistics"
>We can give guarantees on collective behaviour although individual nodes are not predictable
>Over time, the latent order of a system is destroyed; this resembles the 2nd law of thermodynamics: "entropy of closed systems increases"
>Therefore, self-organizing behaviour is necessary:
  >Servers must continuously collect, regenerate and redistribute fragments in a data storage system
  >They must adjust routing links in the DOLR to correct for changes
  >They must recognize faults without global communication
>Entropy reduction can also be achieved by introspection
  >System observes itself, applies analyses, then adapts accordingly
  >Research in the area of IBM's Autonomic Computing
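One reading of "collect, regenerate and redistribute fragments" is a periodic repair sweep. The sketch below is a hypothetical illustration, not taken from any of the systems named in the talk: `placement` maps each block to the set of servers holding one of its fragments, a block can be rebuilt while at least `m` fragments survive, and the sweep tops every such block back up to `n` fragments on live servers.

```python
def repair_sweep(placement, live, n, m):
    """Drop fragments on dead servers, regenerate missing ones, report lost blocks.

    placement: dict block -> set of servers holding a fragment (modified in place)
    live:      set of servers currently reachable
    n, m:      fragments per block, and how many are needed to rebuild it
    """
    lost = []
    for block, holders in placement.items():
        holders &= live                      # forget fragments on dead servers
        if len(holders) < m:
            lost.append(block)               # too few fragments left to rebuild
            continue
        spare = sorted(live - holders)       # live servers without a fragment
        while len(holders) < n and spare:
            holders.add(spare.pop(0))        # regenerate one fragment there
    return lost

placement = {
    "b1": {"s1", "s2", "s3", "s4"},
    "b2": {"s1", "s2"},
}
live = {"s3", "s4", "s5", "s6"}              # s1 and s2 have failed
lost = repair_sweep(placement, live, n=4, m=2)
```

Block "b1" is restored to four fragments on live servers, while "b2" had all its fragments on failed servers and is reported lost. Running the sweep more often shortens the window in which failures can accumulate past the point of repair.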

22 P2P-Research at ZIB: CSR-DMS
>Management of large scientific data sets (up to 400 million files)
>Should improve existing approaches in the area of GRID technologies
>Also serves as a framework for research
>Architecture is P2P-based
>Should exhibit self-management abilities
>Candidates for diploma theses (Diplomarbeiten) are very welcome!

