XFilter and Distributed Data Storage
Zachary G. Ives, University of Pennsylvania
CIS 455 / 555 – Internet and Web Systems
November 22, 2015
Some portions derived from slides by Raghu Ramakrishnan

Readings & Reminders
- Homework 2 Milestone 1 due 3/1
- Please read for Wednesday: Stoica et al., “Chord”
  - Write a 1-paragraph summary of the key ideas and post it to the discussion list
- Next week:
  - Monday will be an abbreviated (1-hour) lecture
  - Wednesday: guest lecture by Marie Jacob on Q – search across databases

Recall: XFilter [Altinel & Franklin 00]

How Does It Work?
- Each XPath segment is basically a subset of regular expressions over element tags
  - Convert it into a finite state automaton
- Parse data as it comes in – use the SAX API
  - Match against the finite state machines
- Most of these systems use modified FSMs because they want to match many patterns at the same time

Path Nodes and FSMs
- The XPath parser decomposes XPath expressions into a set of path nodes
- These nodes act as the states of the corresponding FSM
- A node in the Candidate List denotes the current state
- The rest of the states are in the corresponding Wait Lists
- [Figure: a simple FSM whose states Q1_1, Q1_2, Q1_3 are entered on the elements politics, usa, and body]

Decomposing Into Path Nodes
Each XPath query is decomposed into path nodes; a path node carries:
- Query ID
- Position in the state machine
- Relative Position (RP) in the tree:
  - 0 for the root node if it is not preceded by “//”
  - -1 for any node preceded by “//”
  - else 1 + (number of “*” nodes since the predecessor node)
- Level:
  - if the current node has a fixed distance from the root, then 1 + that distance
  - else if RP = -1, then -1; otherwise 0
- Finally, NextPathNodeSet points to the next node(s)
[Figure: Q1 decomposed into path nodes Q1-1, Q1-2, Q1-3; Q2 = //usa/*/body/p decomposed into Q2-1, Q2-2, Q2-3]
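To make the decomposition concrete, here is a minimal sketch of how one path node might be represented. The class and field names are hypothetical, not XFilter's actual code, but the field semantics follow the rules above.

    // Hypothetical representation of one path node (one FSM state).
    class PathNode {
        String queryId;              // e.g., "Q2"
        int position;                // position in the query's state machine (2 for Q2-2)
        int relativePos;             // 0 = root, -1 = preceded by "//", else 1 + number of "*" steps
        int level;                   // 1 + fixed depth from root, or -1 / 0 when the depth is not fixed
        PathNode[] nextPathNodeSet;  // states to activate when this one matches
    }

Applying the rules above to Q2 = //usa/*/body/p gives three nodes: Q2-1 for usa (RP = -1, level = -1), Q2-2 for body (RP = 2 because of the intervening *, level = 0), and Q2-3 for p (RP = 1, level = 0).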

Thinking of XPath Matching as Threads

What Is a Thread?
- It includes a promise of CPU scheduling, plus context
- Suppose we do the scheduling based on events
- … Then the “thread” becomes a context:
  - Active state
  - What's to be matched next
  - Whether it's a final state

Query Index
- A query index entry for each XML tag
- Two lists, the Candidate List (CL) and the Wait List (WL), divided across the nodes
  - “Live” queries' states are in the CL; “pending” queries + states are in the WL
- Events that cause state transitions are generated by the XML parser
[Figure: query index with entries for politics, usa, body, and p; Q1-1 and Q2-1 sit on Candidate Lists, while Q1-2, Q1-3, Q2-2, and Q2-3 wait on Wait Lists]
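A plausible in-memory layout for the query index, reusing the hypothetical PathNode sketch from above (again illustrative, not XFilter's actual classes):

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // Hypothetical query index: one entry per element name, each holding a
    // Candidate List (currently reachable states) and a Wait List (the rest).
    class QueryIndex {
        static class Entry {
            final List<PathNode> candidateList = new ArrayList<>();  // CL
            final List<PathNode> waitList = new ArrayList<>();       // WL
        }

        private final Map<String, Entry> index = new HashMap<>();

        Entry entryFor(String elementName) {
            return index.computeIfAbsent(elementName, k -> new Entry());
        }

        // When a query is registered, its first path node goes on the CL for
        // that node's element name; all later path nodes start on the WLs.
        void register(String[] elementNames, PathNode[] pathNodes) {
            entryFor(elementNames[0]).candidateList.add(pathNodes[0]);
            for (int i = 1; i < pathNodes.length; i++) {
                entryFor(elementNames[i]).waitList.add(pathNodes[i]);
            }
        }
    }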

Encountering an Element
- Look up the element name in the Query Index and visit all nodes in the associated CL
- Validate that we actually have a match
[Figure: startElement: politics triggers a lookup of the CL entry for politics, which holds path node Q1-1 with its Query ID, Position, Relative Position, Level, and NextPathNodeSet]

Validating a Match
- We first check that the current XML depth matches the level in the user query:
  - If the level in the CL node is less than 1, then ignore the height
  - else the level in the CL node must equal the height
- This ensures we're matching at the right point in the tree!
- Finally, we validate any predicates against attributes

Processing Further Elements
- Queries that don't meet validation are removed from the Candidate Lists
- For other queries, we advance to the next state
  - We copy the next node of the query from the WL to the CL, and update the RP and level
- When we reach a final state (e.g., Q1-3), we can output the document to the subscriber
- When we encounter an end element, we must remove that element's entries from the CL
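Putting the pieces together, the matching loop can be driven directly by SAX callbacks. The sketch below is schematic: it reuses the hypothetical PathNode and QueryIndex classes from earlier, omits predicate checks, and leaves the WL-to-CL promotion as a stub, so it shows the control flow rather than a complete XFilter implementation.

    import org.xml.sax.Attributes;
    import org.xml.sax.helpers.DefaultHandler;

    // Schematic XFilter-style matcher driven by SAX events.
    class XFilterHandler extends DefaultHandler {
        private final QueryIndex index;
        private int depth = 0;   // current XML depth, used for level validation

        XFilterHandler(QueryIndex index) { this.index = index; }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            depth++;
            for (PathNode node : index.entryFor(qName).candidateList) {
                boolean levelOk = node.level < 1 || node.level == depth;
                if (!levelOk) continue;                  // fails validation at this depth
                if (node.nextPathNodeSet == null || node.nextPathNodeSet.length == 0) {
                    emitMatch(node.queryId);             // final state: deliver the document
                } else {
                    for (PathNode next : node.nextPathNodeSet) {
                        promote(next);                   // copy the next state from WL to CL
                    }
                }
            }
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            depth--;
            // States that were activated for this element would be removed from the CL here.
        }

        private void promote(PathNode next) { /* move WL -> CL, updating RP and level */ }
        private void emitMatch(String queryId) {
            System.out.println("Document matches " + queryId);
        }
    }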

A Simpler Approach
- Instantiate a DOM tree for each document
- Traverse it and recursively match XPaths
- Pros and cons?
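For comparison, the DOM-based approach can be written in a few lines with the standard JDK APIs. It is simple, but it materializes the whole document in memory and re-evaluates every XPath per document, which is exactly what XFilter's shared FSMs avoid.

    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathConstants;
    import javax.xml.xpath.XPathFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.NodeList;
    import java.io.File;
    import java.util.List;

    public class DomMatcher {
        // Prints each XPath expression that matches the given document.
        public static void matchAll(File xmlFile, List<String> xpaths) throws Exception {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().parse(xmlFile);
            XPath xp = XPathFactory.newInstance().newXPath();
            for (String expr : xpaths) {
                NodeList hits = (NodeList) xp.evaluate(expr, doc, XPathConstants.NODESET);
                if (hits.getLength() > 0) {
                    System.out.println(expr + " matches " + xmlFile.getName());
                }
            }
        }
    }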

Publish-Subscribe Model Summarized
- XFilter has an elegant model for matching XPaths
  - A good deal more complex than HW2, in that it supports wildcards (*) and //
- Useful for applications with RSS (Rich Site Summary or Really Simple Syndication)
  - Many news sites, web logs, mailing lists, etc. use RSS to publish daily articles
- An instance of a more general concept, a topic-specific crawler

Revisiting Storage and Crawling with a Distributed Spin
- In recent weeks:
  - Index structures primarily intended for single machines
    - B+ Trees
  - Data / document formats
  - Basics of single-machine crawling
    - Seed URLs, robots.txt, etc.
- Now: let's revisit (most of) the above in a setting where multiple machines work together!
- First: storage

How Do We Distribute a B+ Tree?
- We need to host the root at one machine and distribute the rest
- What are the implications for scalability?
  - Consider building the index as well as searching it

Eliminating the Root
- Sometimes we don't want a tree-structured system, because the higher levels can be a central point of congestion or failure
- Two strategies:
  - Modified tree structure (e.g., BATON, Jagadish et al.)
  - Non-hierarchical structure

A “Flatter” Scheme: Hashing
- Start with a hash function with a uniform distribution of values:
  - h(name) → a value (e.g., a 32-bit integer)
- Map from values to hash buckets
  - Generally using mod (# buckets)
- Put items into the buckets
  - May have “collisions” and need to chain
[Figure: h(x) maps values into an array of buckets, with overflow chains for collisions]
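A minimal single-machine version of this scheme, with mod-based bucket selection and chaining for collisions (illustrative only):

    import java.util.AbstractMap;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;

    // Minimal hash table with chaining: h(key) mod (# buckets) picks a bucket,
    // and collisions are handled by appending to that bucket's list.
    class ChainedHashTable<K, V> {
        private final List<List<Map.Entry<K, V>>> buckets;

        ChainedHashTable(int numBuckets) {
            buckets = new ArrayList<>();
            for (int i = 0; i < numBuckets; i++) buckets.add(new ArrayList<>());
        }

        private int bucketFor(K key) {
            // h(key) mod (# buckets); floorMod keeps the index non-negative
            return Math.floorMod(key.hashCode(), buckets.size());
        }

        void put(K key, V value) {
            buckets.get(bucketFor(key)).add(new AbstractMap.SimpleEntry<>(key, value));
        }

        V get(K key) {
            for (Map.Entry<K, V> e : buckets.get(bucketFor(key))) {
                if (e.getKey().equals(key)) return e.getValue();
            }
            return null;
        }
    }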

Dividing Hash Tables Across Machines
- Simple distribution – allocate some number of hash buckets to various machines
  - Can give this information to every client, or provide a central directory
  - Can evenly or unevenly distribute buckets
  - Lookup is very straightforward
- A possible issue – data skew: some ranges of values occur frequently
  - Can use dynamic hashing techniques
  - Can use a better hash function, e.g., SHA-1 (160-bit key)
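Distributing those buckets is then just a lookup table from bucket number to machine, which every client can hold a copy of (the class and names here are illustrative):

    import java.util.List;

    // Illustrative directory-based lookup: each client knows which machine
    // owns each bucket, so a lookup is a local hash plus one table access.
    class BucketDirectory {
        private final List<String> bucketToMachine;  // index = bucket, value = host

        BucketDirectory(List<String> bucketToMachine) {
            this.bucketToMachine = bucketToMachine;
        }

        String machineFor(String key) {
            int bucket = Math.floorMod(key.hashCode(), bucketToMachine.size());
            return bucketToMachine.get(bucket);
        }
    }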

Some Issues Not Solved with Conventional Hashing
- What if the set of servers holding the inverted index is dynamic?
  - Our number of buckets changes
  - How much work is required to reorganize the hash table?
- Solution: consistent hashing

Consistent Hashing – the Basis of “Structured P2P”
- Intuition: we want to build a distributed hash table where the number of buckets stays constant, even if the number of machines changes
  - Requires a mapping from hash entries to nodes
  - Don't need to re-hash everything if a node joins or leaves
  - Only the mapping (and allocation of buckets) needs to change when the number of nodes changes
- Many examples: CAN, Pastry, Chord
- For this course, you'll use Pastry
  - But Chord is simpler to understand, so we'll look at it

Basic Ideas
- We're going to use a giant hash key space
  - SHA-1 hash: 20 bytes, or 160 bits
- We'll arrange it into a “circular ring” (it wraps around at the top of the key space back to 0)
- We'll actually map both objects' keys (in our case, keywords) and nodes' IP addresses into the same hash key space
  - “abacus” → SHA-1 → k10
  - a node's IP address → SHA-1 → N12
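As a sketch of that mapping, both keywords and node addresses can be pushed through SHA-1 (the same MessageDigest API the Pastry slides use later) to get positions on the 160-bit ring; the example strings below are just placeholders.

    import java.math.BigInteger;
    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;

    // Object keys (e.g., keywords) and node identifiers (e.g., "IP:port"
    // strings) are hashed into the same 160-bit identifier space.
    class RingId {
        static BigInteger idFor(String name) throws Exception {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] digest = sha1.digest(name.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(1, digest);   // non-negative value in [0, 2^160)
        }
    }

    // e.g., RingId.idFor("abacus") and RingId.idFor("158.130.12.4:9001")
    // both land somewhere on the same ring.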

Chord Hashes a Key to its Successor
- Nodes and blocks have randomly distributed IDs in a circular hash ID space
- Successor: the node with the next-highest ID
[Figure: ring with nodes N10, N32, N60, N80, N100 and keys such as k10, k30, k52, k70, k99, each key stored at its successor node]
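The successor rule itself is easy to state in code: keep the node IDs in a sorted map and take the first ID at or above the key, wrapping around at the top. This is a sketch of the placement rule only, not of Chord's distributed routing.

    import java.math.BigInteger;
    import java.util.Map;
    import java.util.TreeMap;

    class Ring {
        private final TreeMap<BigInteger, String> nodes = new TreeMap<>();  // nodeId -> host

        void addNode(BigInteger nodeId, String host) { nodes.put(nodeId, host); }
        void removeNode(BigInteger nodeId)           { nodes.remove(nodeId); }

        // A key is stored at the first node whose ID is >= the key's ID,
        // wrapping around to the lowest node ID past the top of the ring.
        String successorFor(BigInteger keyId) {
            Map.Entry<BigInteger, String> e = nodes.ceilingEntry(keyId);
            if (e == null) e = nodes.firstEntry();
            return e.getValue();
        }
    }

Note that when a node joins or leaves, only the keys between its predecessor and itself change owners; every other key keeps the same successor.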

Basic Lookup: Linear Time
- Lookups find the ID's predecessor
- Correct if successors are correct
[Figure: a query “Where is k70?” is forwarded around the ring (N5, N10, N20, …) until it reaches the answer, N80]

“Finger Table” Allows O(log N) Lookups
- Goal: shortcut across the ring – a binary search
- Reasonable lookup latency
[Figure: N80's fingers point roughly ½, ¼, ⅛, 1/16, 1/32, 1/64, and 1/128 of the way around the ring]
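A sketch of what the fingers point at: finger i of a node with ID n targets (n + 2^i) mod 2^m, so the entries cover half the ring, a quarter, an eighth, and so on, which is where the O(log N) hop count comes from.

    import java.math.BigInteger;

    class Fingers {
        // Finger i of node n targets successor((n + 2^i) mod 2^m), i = 0 .. m-1.
        static BigInteger[] fingerTargets(BigInteger n, int m) {
            BigInteger ringSize = BigInteger.ONE.shiftLeft(m);   // 2^m
            BigInteger[] targets = new BigInteger[m];
            for (int i = 0; i < m; i++) {
                targets[i] = n.add(BigInteger.ONE.shiftLeft(i)).mod(ringSize);
            }
            return targets;
        }
    }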

Node Joins
- How does the node know where to go? (Suppose it knows one peer)
- What would need to happen to maintain connectivity?
- What data needs to be shipped around?
[Figure: a new node N120 joins a ring containing N5, N10, N20, N32, N40, N60, N80, N99, N110]

A Graceful Exit: Node Leaves
- What would need to happen to maintain connectivity?
- What data needs to be shipped around?
[Figure: the ring of nodes N5, N10, N20, N32, N40, N60, N80, N99, N110]

What about Node Failure?
- Suppose a node just dies?
- What techniques have we seen that might help?

Successor Lists Ensure Connectivity
- Each node stores r successors, r = 2 log N
- Lookup can skip over dead nodes to find objects
[Figure: each node on the ring stores its next three successors, e.g., N5 stores (N10, N20, N32) and N80 stores (N99, N110, N5)]

Objects are Replicated as Well
- When a “dead” peer is detected, repair the successor lists of those that pointed to it
- We can take the same scheme and replicate objects on each peer in the successor list
- Do we need to change the lookup protocol to find objects if a peer dies?
- Would there be a good reason to change the lookup protocol in the presence of replication?
- What model of consistency is supported here? Why?

Stepping Back for a Moment: DHTs vs. Gnutella and Napster 1.0
- Napster 1.0: central directory; data on peers
- Gnutella: no directory; flood peers with requests
- Chord, CAN, Pastry: no directory; a hashing scheme to look for data
- Clearly, Chord, CAN, and Pastry have guarantees about finding items, and they are decentralized
- But non-research P2P systems haven't adopted this paradigm:
  - Kazaa, BitTorrent, … still use variations of the Gnutella approach
  - Why? There must be some drawbacks to DHTs…?

Distributed Hash Tables, Summarized
- Provide a way of deterministically finding an entity in a distributed system, without a directory, and without worrying about failure
- Can also be a way of dividing up work: instead of sending data to a node, we might send a task
- Note that it's up to the individual nodes to do things like store data on disk (if necessary, e.g., using B+ Trees)

Applications of Distributed Hash Tables
- To build distributed file systems (CFS, PAST, …)
- To distribute “latent semantic indexing” (U. Rochester)
- As the basis of distributed data integration (U. Penn, U. Toronto, EPFL) and databases (UC Berkeley)
- To archive library content (Stanford)

Distributed Hash Tables and Your Project
- If you're building a mini-Google, how might DHTs be useful in:
  - Crawling + indexing URIs by keyword?
  - Storing and retrieving query results?
- The hard parts:
  - Coordinating different crawlers to avoid redundancy
  - Ranking different sites (often more difficult to distribute)
  - What if a search contains 2+ keywords?
- (You'll initially get to test out DHTs in Homework 3)

From Chord to Pastry
- What we saw were the basic data algorithms of the Chord system
- Pastry is slightly different:
  - It uses a different mapping mechanism than the ring (but one that works similarly)
  - It doesn't exactly use a hash table abstraction – instead there's a notion of routing messages
  - It allows for replication of data and finds the closest replica
  - It's written in Java, not C
- … And you'll be using it in your projects!

Pastry API Basics (v 1.4.3_02)
- See freepastry.org for details and downloads
- Nodes have identifiers that will be hashed: interface rice.p2p.commonapi.Id
  - Two main kinds of NodeIdFactories – we'll use the socket-based one
- Nodes are logical entities: a machine can host more than one virtual node
  - Several kinds of NodeFactories create virtual Pastry nodes
- All Pastry nodes have built-in functionality to manage routing
- Derive your application from the “common API” class rice.p2p.commonapi.Application

Creating a P2P Network
- Example code in DistTutorial.java
- Create a Pastry node:

    Environment env = new Environment();
    PastryNodeFactory d = new SocketPastryNodeFactory(new NodeFactory(keySize), env);
    // Need to compute the InetSocketAddress of a known host to use as addr
    NodeHandle aKnownNode = ((SocketPastryNodeFactory) d).getNodeHandle(addr);
    PastryNode pn = d.newNode(aKnownNode);
    MyApp app = new MyApp(pn);   // base class of your application!

- No need to call a simulator – this is real!

Pastry Client APIs
- Based on a model of routing messages
- Derive your message from class rice.p2p.commonapi.Message
- Every node has an Id (a NodeId implementation)
- Every message gets an Id corresponding to its key
- Call endpoint.route(id, msg, hint) (aka routeMsg) to send a message (endpoint is an instance of Endpoint)
  - The hint is the starting point, of type NodeHandle
- At each intermediate point, Pastry calls a notification: forward(id, msg, nextHop)
- At the end, Pastry calls a final notification: deliver(id, msg) (aka messageForAppl)

IDs
- Pastry has mechanisms for creating node IDs itself
- Obviously, we need to be able to create IDs for keys
- Need to use java.security.MessageDigest:

    MessageDigest md = MessageDigest.getInstance("SHA");
    byte[] content = myString.getBytes();
    md.update(content);
    byte[] shaDigest = md.digest();
    rice.pastry.Id keyId = new rice.pastry.Id(shaDigest);

How Do We Create a Hash Table (Hash Map / Multiset) Abstraction?
- We want the following operations:
  - put(key, value)
  - remove(key)
  - valueSet = get(key)
- How can we use Pastry to do this?
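One way to build it, sketched schematically: hash the key to an Id, route a put or get message to that Id, and let whichever node Pastry delivers the message to keep the values in an ordinary local map. The message class and the local store below are made up for illustration; only the route/deliver pattern comes from the API on the previous slides.

    import java.io.Serializable;
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.Map;
    import java.util.Set;

    class DhtSketch {
        // Each node keeps the key/value pairs it is responsible for locally.
        private final Map<String, Set<String>> localStore = new HashMap<>();

        static class PutMessage implements Serializable {
            final String key, value;
            PutMessage(String key, String value) { this.key = key; this.value = value; }
        }

        // put(key, value): hash the key to an Id, then route a PutMessage to it,
        // e.g., endpoint.route(idForKey(key), new PutMessage(key, value), null);
        // Pastry delivers the message to the node responsible for that Id.

        // On the responsible node, the deliver(id, msg) callback ends up here:
        void onDeliver(PutMessage msg) {
            localStore.computeIfAbsent(msg.key, k -> new HashSet<>()).add(msg.value);
        }

        // get(key) routes a query message the same way; the responsible node
        // sends back its value set (or an empty set), again as a routed message.
        Set<String> valuesFor(String key) {
            return localStore.getOrDefault(key, Set.of());
        }
    }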

Next Time
- Distributed filesystem and database storage: GFS, PNUTS