Peer-To-Peer Data Management


Peer-To-Peer Data Management Hector Garcia-Molina ICDE Conference, February 28, 2002

What is P2P? pastry, can, jxta, fiorana, napster, freenet, united devices, open cola, aim, ocean store, netmeeting, gnutella, farsite, icq, morpheus, ebay, limewire, bearshare, seti@home, uddi, grove, jabber, popular power, kazaa, folding@home, tapestry, process tree, mojo nation, chord

Napster: peers join a central index; a query goes to the index, the answer names a peer, and the file is fetched directly from that peer.
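A minimal sketch of the join / query / answer / get file cycle against a central index. The class name, method names, and peer identifiers are hypothetical; the slide only names the four steps.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical central index: maps each file name to the peers sharing it."""

    def __init__(self):
        self.holders = defaultdict(set)

    def join(self, peer, files):
        # A joining peer uploads the list of files it is willing to share.
        for name in files:
            self.holders[name].add(peer)

    def leave(self, peer):
        # Drop every entry for a departing peer.
        for peers in self.holders.values():
            peers.discard(peer)

    def query(self, name):
        # The answer lists peers, not file contents; the transfer itself
        # then happens directly between the two peers.
        return sorted(self.holders.get(name, ()))


index = CentralIndex()
index.join("peer-a", ["song.mp3", "talk.pdf"])
index.join("peer-b", ["song.mp3"])
print(index.query("song.mp3"))
```

Only the lookup is centralized here; the index never stores or relays the files themselves.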

Gnutella: there is no central index; a query is flooded from peer to peer.
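Flooding can be sketched as a breadth-first traversal bounded by a time-to-live (TTL). The graph, the file placement, and the function name below are made up for illustration, and the sketch simplifies by forwarding to every neighbor, including the one a message came from.

```python
from collections import deque


def flood_query(graph, files, origin, wanted, ttl):
    """Breadth-first flood from origin; returns (peers holding wanted, messages sent)."""
    seen = {origin}
    hits = []
    messages = 0
    frontier = deque([(origin, ttl)])
    while frontier:
        peer, hops_left = frontier.popleft()
        if wanted in files.get(peer, ()):
            hits.append(peer)
        if hops_left == 0:
            continue  # TTL exhausted: this peer does not forward
        for neighbor in graph[peer]:
            messages += 1  # every forwarded copy costs a message
            if neighbor not in seen:  # duplicates are dropped on arrival
                seen.add(neighbor)
                frontier.append((neighbor, hops_left - 1))
    return hits, messages


# Hypothetical five-peer overlay; peers d and e hold file "x".
graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
files = {"d": {"x"}, "e": {"x"}}
print(flood_query(graph, files, "a", "x", ttl=2))
print(flood_query(graph, files, "a", "x", ttl=3))
```

Raising the TTL reaches more peers but also sends more messages, many of them duplicates.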

Morpheus [figure: peers grouped around super peers]

Seti@Home [figure: satellite dish; central site; raw data chunks sent out; analyzed data returned]

Lockss [figure: libraries A, B, C, D, and E; copies D1, D2, D3]

PeerCast [figure: distribution from a Stanford source, before and after]

What is a P2P System? Multiple sites (at edge); distributed resources; sites are autonomous (different owners); sites are both clients and servers; sites have equal functionality. P2P purity.

P2P is a BAD IDEA!! Distribution is expensive! Specialized functionality is good!

Example: Distributed Data Management. Distribution is expensive. If you must distribute: build a centralized directory or index; use backups for reliability; for replicated data, use a primary copy.

Computational Efficiency is NOT the Main Goal. Main driving forces in a P2P system: exploiting existing (often free) resources; sharing costs among many; legal protection; autonomy; anonymity.

Should We Do P2P Research? Should we help people break the law? Analogy: Should we develop pillows, knives, hammers, drugs, bath tubs, cars, airplanes, ... ??

Should We Do P2P Research? YES: P2P is not exclusively for breaking the law. Remember the VCR. YES: P2P can liberate us from culture “plantation owners” (Lessig).

Is “Free Culture” Feasible? Example: legal texts. Can we afford it? [figure labels: economic activity; rules of the game; today]

Should DB community work on P2P? YES

P2P Challenges. Easier to list NON-Research-Topics: color schemes for P2P nodes; impact of P2P on Moroccan 15th century literature.

P2P Challenges: Search; Resource Management; Security & Privacy

Search Taxonomy [figure: freenet, can, gnutella, morpheus, and napster arranged by kind of search (lookup, content queries) and scope of index (single site, regional, global); other labels: routing, partial, replicated, SP]

Index Implementation Taxonomy [figure: freenet, gnutella, morpheus, napster, and can arranged by nature of index (centralized, distributed, P2P) and by whether index location is correlated with content location (yes, partial, no); other labels: routing, replicated, SP]

Content Addressable Network (CAN) [figure labels: 1. nodes; 2. data]
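A rough sketch of the placement idea behind CAN: each node owns a zone of a coordinate space, and a data key is hashed to a point that decides which node stores it. The two-dimensional unit square, the quadrant layout, the node names, and the use of SHA-256 are assumptions made for illustration; real CAN also routes the request hop by hop through neighboring zones, which is not shown.

```python
import hashlib


def key_to_point(key):
    """Hash a key to a point in the unit square [0, 1) x [0, 1)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # 48 bits per coordinate keeps the quotient strictly below 1.0.
    x = int.from_bytes(digest[:6], "big") / 2**48
    y = int.from_bytes(digest[6:12], "big") / 2**48
    return x, y


def owner_of(zones, point):
    """Return the node whose zone (x0, y0, x1, y1) contains the point."""
    x, y = point
    for node, (x0, y0, x1, y1) in zones.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return node
    raise ValueError("zones do not cover the point")


# Four hypothetical nodes, each owning one quadrant of the space.
zones = {
    "n1": (0.0, 0.0, 0.5, 0.5),
    "n2": (0.5, 0.0, 1.0, 0.5),
    "n3": (0.0, 0.5, 0.5, 1.0),
    "n4": (0.5, 0.5, 1.0, 1.0),
}
print(owner_of(zones, key_to_point("song.mp3")))
```

Because the index location follows from the key alone, any node can compute where a key lives without flooding.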

Can We Improve Flooding? [figure: the index implementation taxonomy, repeated from the earlier slide]

Directed BFS in Gnutella. Heuristics for selecting the direction of a query: >RES: returned most results; <TIME: shortest satisfaction time; <HOPS: min hops for results; >MSG: sent us most messages (all types); <QLEN: shortest queue; <LAT: shortest latency; >DEG: highest degree.
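One way to picture the selection step: a peer keeps a few statistics per neighbor and forwards the query only to the neighbors that rank best under the chosen heuristic. The statistic field names, the sample numbers, and the parameter `k` are hypothetical; only the heuristic labels come from the slide.

```python
# Each heuristic is (statistic name, True if larger is better).
HEURISTICS = {
    ">RES": ("results", True),
    "<TIME": ("satisfaction_time", False),
    "<HOPS": ("hops", False),
    ">MSG": ("messages", True),
    "<QLEN": ("queue_length", False),
    "<LAT": ("latency", False),
    ">DEG": ("degree", True),
}


def select_neighbors(stats, heuristic, k=1):
    """Pick the k neighbors ranked best by the chosen heuristic."""
    field, larger_is_better = HEURISTICS[heuristic]
    ranked = sorted(stats, key=lambda n: stats[n][field], reverse=larger_is_better)
    return ranked[:k]


# Made-up per-neighbor statistics gathered from past traffic.
stats = {
    "n1": {"results": 12, "latency": 80, "degree": 3},
    "n2": {"results": 40, "latency": 150, "degree": 5},
    "n3": {"results": 7, "latency": 20, "degree": 9},
}
print(select_neighbors(stats, ">RES"))
print(select_neighbors(stats, "<LAT"))
```

The query is then flooded as usual beyond the selected neighbors, so only the first hop is directed.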

How Does One Evaluate? Live Gnutella? Use real Gnutella as “laboratory”

Time to Satisfaction for Directed BFS

Routing Index [figure: nodes A, B, C, and D; each node keeps a routing index with per-neighbor document counts for topics AI and DB; example query Q(DB)]

Types of Routing Indexes: Compound; Hop Count; Exponential Decay. Strategies for Cycles: Ignore (for Hop-Count, Exponential); Avoid Update Cycles; Detect Update Cycles and Recover.
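A rough sketch of how a node might score a neighbor from such indexes. The function names, the independence assumption between topics, the sample counts, and the per-hop discount by a fanout factor are all assumptions made for illustration; the slides name the index types but do not give formulas.

```python
def compound_goodness(total_docs, topic_counts, query_topics):
    """Estimate matching documents, assuming topics are independent."""
    if total_docs == 0:
        return 0.0
    estimate = float(total_docs)
    for topic in query_topics:
        estimate *= topic_counts.get(topic, 0) / total_docs
    return estimate


def decayed_goodness(per_hop_estimates, fanout):
    """Discount the estimate at hop j (1-based) by fanout ** (j - 1)."""
    return sum(
        est / fanout ** (j - 1)
        for j, est in enumerate(per_hop_estimates, start=1)
    )


# Hypothetical index entry: 100 documents reachable through a neighbor,
# 20 of them on DB and 50 on AI.
entry = {"DB": 20, "AI": 50}
print(compound_goodness(100, entry, ["DB"]))
print(compound_goodness(100, entry, ["DB", "AI"]))

# Hop-count style entry: estimates at 1 hop and at 2 hops, assumed fanout 3.
print(decayed_goodness([20, 30], fanout=3))
```

Under this sketch a node forwards the query to the neighbor with the highest score; discounting distant documents reflects that they cost more messages to reach.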

Effect of Index Compression

Effect of Network Topology

Resource Management. Resources: storage (lockss); CPU processing (seti@home); bandwidth (PeerCast). Issues: fairness; load balancing.

Example: Data Trading [figure: site 1 and site 2 trade copies; labels A1, B1, C1, A2, B2, C2]

Example: Data Trading [figure: the same sites after a different sequence of trades; labels A1, B1, C1, A2, B2, C2]

Data Trading. Order of trades impacts reliability. Issues: swaps vs. deeds; fixed price vs. bids; preference to sites with a lot of space? reliable sites? “desperate” sites?
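A minimal sketch of a swap-style trade, in which two sites each store a copy of the other's collection. The `Site` class, the space units, and the all-or-nothing rule are assumptions for illustration; deeds, pricing, and bidding are not modeled.

```python
class Site:
    """Hypothetical site with a fixed amount of free space to offer."""

    def __init__(self, name, free_space):
        self.name = name
        self.free_space = free_space
        self.hosted = {}  # collection name -> size, held for other sites


def swap(site_a, coll_a, size_a, site_b, coll_b, size_b):
    """Each site stores a copy of the other's collection, or nothing changes."""
    if site_b.free_space < size_a or site_a.free_space < size_b:
        return False
    site_b.hosted[coll_a] = size_a
    site_b.free_space -= size_a
    site_a.hosted[coll_b] = size_b
    site_a.free_space -= size_b
    return True


site1, site2, site3 = Site("site1", 2), Site("site2", 1), Site("site3", 1)
print(swap(site1, "A", 1, site2, "B", 1))  # site2 spends its only free unit
print(swap(site2, "B", 1, site3, "C", 1))  # site2 has no space left to offer
```

In this made-up run, site 2 gets only one remote copy because its first trade used up its space, which illustrates why the order of trades can matter.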

Effect of Bid Policies. Policy 1: bid more (ask more in return) when I have less free space. Policy 2: bid more (ask more in return) when I have more free space.

Effect of One Maverick Site: always bids high

Security & Privacy. Issues: anonymity; reputation; accountability; information preservation; information quality; trust; denial of service attacks.

Information Preservation. Example policy: make 3 copies of documents. What can go wrong? [figure: document A1, make copies]

What Can Go Wrong? “Bad” sites make copies; “bad” site alters copy; “bad” site publishes fake; “bad” site makes many copies of other docs; ... [figure: copies of A1 and an altered copy A’1]

Conclusion. P2P systems are popular today. P2P systems are vulnerable and inefficient. Many challenges ahead: Search; Resource Management; Security and Privacy.

For Additional Information Google: “Stanford Peers” http://www-db.stanford.edu/peers/