NCLAB 1 Supporting complex queries in a distributed manner without using DHT NodeWiz: Peer-to-Peer Resource Discovery for Grids Sujoy Basu, Sujata Banerjee,

Presentation transcript:

NCLAB 1 Supporting complex queries in a distributed manner without using DHT. NodeWiz: Peer-to-Peer Resource Discovery for Grids (Sujoy Basu, Sujata Banerjee, Puneet Sharma, and Sung-ju Lee). Brushwood: Distributed Trees in Peer-to-Peer Systems (Chi Zhang, Arvind Krishnamurthy, and Randolph Y. Wang). Presenter: Yongjoon Son, 2006/04/03

NCLAB 2 Table of Contents
  Growing Requirements of Information Services
  NodeWiz: Peer-to-Peer Resource Discovery for Grids
    Defining Basics
    NodeWiz Architecture
    Evaluation
  Brushwood: Distributed Trees in Peer-to-Peer Systems
    Introduction of Brushwood
    Brushwood Design
    Applications
  Comparison of PRISM with SWORD, Mercury, NodeWiz and Brushwood

NCLAB 3 Growing Requirements of Information Services
  Requirement: complex queries (multi-attribute range queries)
  Centralized systems: inappropriate in geographically large systems
  Hierarchical distributed systems: inefficiency caused by the static hierarchy; single point of failure; load-balancing problem
  DHT: unnatural to perform complex queries while maintaining load balance
  Example systems, in slide order: MDS; MDS-2; Chord, Pastry, Tapestry, CAN
  Supporting complex queries over DHTs: Prefix Hash Tree, PIER, SWORD, MAAN
  Supporting complex queries in a distributed environment without utilizing DHTs: SkipNet, DIM, Mercury, NodeWiz, Brushwood

NCLAB 4 NodeWiz: Peer-to-Peer Resource Discovery for Grids Sujoy Basu, Sujata Banerjee, Puneet Sharma, and Sung-ju Lee 2005 IEEE International Symposium on Cluster Computing and the Grid

NCLAB 5 Nodes in Grid Information Service. Node types: (information) service nodes, assumed to be stable infrastructure nodes; resource provider nodes; resource broker nodes; resource consumer nodes.

NCLAB 6 Scope and Policy of This Work. Scope of this work: defining how service nodes join NodeWiz, how load balancing is done, and how messages are routed. Policy: information is soft state, periodically advertised by the provider nodes.
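The soft-state policy can be illustrated with a small sketch. This is not NodeWiz code: the class name `SoftStateStore`, the `ttl_seconds` parameter, and the explicit `now` argument are assumptions made for illustration. The idea shown is only that an advertisement survives as long as the provider keeps re-advertising it.

```python
import time
from typing import Dict, Optional, Tuple


class SoftStateStore:
    """Hypothetical store: keeps advertisements only while they are refreshed."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl = ttl_seconds
        # provider id -> (advertised attributes, time of last advertisement)
        self._entries: Dict[str, Tuple[dict, float]] = {}

    def advertise(self, provider: str, attrs: dict,
                  now: Optional[float] = None) -> None:
        """Record (or refresh) a provider's advertisement."""
        now = time.time() if now is None else now
        self._entries[provider] = (attrs, now)

    def live(self, now: Optional[float] = None) -> Dict[str, dict]:
        """Return advertisements that were refreshed within the TTL."""
        now = time.time() if now is None else now
        # Drop entries whose last refresh is older than the TTL.
        self._entries = {p: (a, t) for p, (a, t) in self._entries.items()
                         if now - t <= self.ttl}
        return {p: a for p, (a, _) in self._entries.items()}
```

A provider that stops advertising simply disappears from `live()` after `ttl_seconds`, so no explicit removal message is needed.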

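NCLAB 7 NodeWiz Bootstrapping [Figure: service node A starts alone with an empty routing table and holds all information; brokers, providers, and consumers attach to it. When service node E joins, the Load attribute space is split at 0.6 into Load < 0.6 and Load ≥ 0.6, and the information is divided between A and E. Two level-0 routing table entries result: (Attr Load, Min 0.6, Max +inf, IP addr E) and (Attr Load, Min 0, Max 0.6, IP addr A).]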

NCLAB 8 Distributed Decision Tree
(d) Overlay routing table of node B:
  Level  Attr  Min  Max   IP addr
  0      Load  0.6  +inf  E
  1      Mem   0    2     C
  2      Load  0.3  0.6   A
It is possible to split the attribute space into more than two partitions.
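A minimal sketch of how such a routing table might be consulted for a single point (an advertisement or a point query). The entries reproduce one reading of node B's table from the slide; the half-open interval convention `[Min, Max)`, the level-by-level scan, and the names `RouteEntry` and `next_hop` are assumptions, not details confirmed by the slides.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class RouteEntry:
    level: int
    attr: str
    lo: float   # Min column (assumed inclusive)
    hi: float   # Max column (assumed exclusive)
    addr: str   # node responsible for that subspace


def next_hop(table: List[RouteEntry], point: Dict[str, float]) -> Optional[str]:
    """Return the node to forward to, or None if this node owns the point."""
    for entry in sorted(table, key=lambda e: e.level):
        if entry.lo <= point[entry.attr] < entry.hi:
            return entry.addr
    return None


# One reading of node B's routing table from the slide.
TABLE_B = [
    RouteEntry(0, "Load", 0.6, math.inf, "E"),
    RouteEntry(1, "Mem", 0.0, 2.0, "C"),
    RouteEntry(2, "Load", 0.3, 0.6, "A"),
]
```

Each level corresponds to one split on the path from the root of the decision tree to the node, so a point that matches no entry falls in the node's own subspace. A range query could match several entries and would need to be forwarded to each of them; the sketch covers only the single-point case.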

NCLAB 9 Distributed Decision Tree. Identify the splitting node, i.e. the existing node with the maximum workload (Top-K Algorithm). Identify the splitting attribute and value (Splitting Algorithm). The distributed decision tree will grow in a balanced fashion. Dynamic tree reorganization (node leave and rejoin). Proximity-aware replication.

NCLAB 10 Load Balancing. Identify the most overloaded node and split its attribute space with the newly joining node. Top-K Algorithm: run periodically or on demand, depending on how frequently nodes join; runs in two phases and in a distributed fashion. [Figure labels: root in the Top-K algorithm; counter; Top-K list for overloaded nodes (updated); list of the top K workloads among all nodes in NodeWiz.]
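The slide gives only the outline of the Top-K algorithm (two phases, distributed, with a root). The following sketch is one plausible reading, not the published algorithm: in phase 1 each node merges its own workload with the top-K lists of its children and passes the result toward the root; in phase 2 the root's list is made known to every node. The tree shape, the `children` and `workload` dictionaries, and the function names are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

Load = Tuple[float, str]  # (workload, node id)


def collect_top_k(node: str, children: Dict[str, List[str]],
                  workload: Dict[str, float], k: int) -> List[Load]:
    """Phase 1: merge this node's workload with its children's top-K lists."""
    candidates: List[Load] = [(workload[node], node)]
    for child in children.get(node, []):
        candidates.extend(collect_top_k(child, children, workload, k))
    # Only K entries travel upward, so message size stays bounded.
    return heapq.nlargest(k, candidates)


def run_top_k(root: str, children: Dict[str, List[str]],
              workload: Dict[str, float], k: int) -> Dict[str, List[Load]]:
    """Phase 2: the root's merged list is pushed back down to every node."""
    top = collect_top_k(root, children, workload, k)
    return {node: top for node in workload}
```

With the list available everywhere, a joining node can be directed to the most overloaded entry, whose attribute space is then split.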

NCLAB 11 Splitting the Attribute Space. Splitting Algorithm: a clustering algorithm selects the splitting value as the boundary between clusters of attribute values. In the experiments, the k-means algorithm is used. Among all the attributes, select the one for which the clustering algorithm leads to the most even-sized clusters.
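A rough sketch of the splitting step under stated assumptions: each attribute is clustered on its own with a one-dimensional 2-means, the split value is taken as the midpoint between the two clusters, and "most even-sized" is measured as the smallest difference in cluster sizes. None of these specifics come from the slide; the function names and sample data are illustrative.

```python
from typing import Dict, List, Tuple


def two_means_1d(values: List[float],
                 iterations: int = 20) -> Tuple[List[float], List[float]]:
    """Cluster one attribute's values into a low and a high group."""
    lo_c, hi_c = min(values), max(values)  # initial centroids
    low, high = list(values), []
    for _ in range(iterations):
        low = [v for v in values if abs(v - lo_c) <= abs(v - hi_c)]
        high = [v for v in values if abs(v - lo_c) > abs(v - hi_c)]
        if not high:  # all values identical: nothing to split
            break
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if (new_lo, new_hi) == (lo_c, hi_c):  # converged
            break
        lo_c, hi_c = new_lo, new_hi
    return low, high


def choose_split(samples: Dict[str, List[float]]) -> Tuple[str, float]:
    """Pick the attribute with the most even clusters and its split value."""
    best = None
    for attr, values in samples.items():
        low, high = two_means_1d(values)
        if not high:
            continue
        imbalance = abs(len(low) - len(high))
        boundary = (max(low) + min(high)) / 2  # midpoint between clusters
        if best is None or imbalance < best[0]:
            best = (imbalance, attr, boundary)
    if best is None:
        raise ValueError("no attribute can be split")
    return best[1], best[2]
```

For example, with Load values clustered evenly around 0.2 and 0.8 and Mem values almost all equal, the sketch splits on Load near 0.5, since splitting on Mem would leave one side nearly empty.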

NCLAB 12 Evaluation. Simulation: synthetic datasets; real datasets (reported by Ganglia for PlanetLab nodes). Scalability. Load balancing.

NCLAB 13 Evaluation. Complex queries. Routing diversity. Optimization (caching).

NCLAB 14 Brushwood: Distributed Trees in Peer-to-Peer Systems Chi Zhang, Arvind Krishnamurthy, and Randolph Y. Wang 2005 International Workshop on Peer-To-Peer Systems

NCLAB 15 Introduction of Brushwood. A general paradigm for distributing and searching tree data structures in peer-to-peer environments while preserving data locality. Each peer has a partial view (brushwood) of the whole logical tree hierarchy. Scope of this work: how to partition a tree while preserving locality; how to balance load; how to search the partitioned tree efficiently in peer-to-peer systems.

NCLAB 16 Brushwood Design. Linearization of the tree. Skip Graphs. Partial view of the tree. [Figure: lookup of /bin/X11/X. l_edge: edge label. f_node: comparison function. TreeID: partition id.]
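A small sketch of what linearization buys, under assumptions not stated on the slide: tree nodes are ordered depth-first with children sorted by edge label, each partition is a contiguous range in that order identified by its first path, and a local binary search stands in for the Skip Graph routing that a real deployment would use. The class `LinearizedTree`, the boundary paths, and the owner names are hypothetical.

```python
import bisect
from typing import List, Tuple

Path = Tuple[str, ...]


def to_key(path: str) -> Path:
    """Turn '/bin/X11/X' into the tuple of edge labels ('bin', 'X11', 'X')."""
    return tuple(part for part in path.split("/") if part)


class LinearizedTree:
    def __init__(self, boundaries: List[str], owners: List[str]) -> None:
        # boundaries[i] is the first path of partition i in depth-first order.
        self.keys = [to_key(b) for b in boundaries]
        if self.keys != sorted(self.keys):
            raise ValueError("boundaries must be in depth-first order")
        self.owners = owners

    def owner_of(self, path: str) -> str:
        """Find the partition whose range contains the path."""
        index = bisect.bisect_right(self.keys, to_key(path)) - 1
        return self.owners[max(index, 0)]


tree = LinearizedTree(["/", "/bin/X11", "/home"], ["p0", "p1", "p2"])
```

Tuple comparison on edge labels gives the depth-first order directly: a parent sorts before its descendants, and siblings sort by label. Nearby paths therefore tend to land in the same partition, which is the locality property the design aims for.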

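NCLAB 17 Choice of Routing Substrate: Skip Graphs. [Figure: skip graph over the keys A, G, J, M, R, W. Membership vectors as extracted: A 001 J M 011 G 100 W 101 R 110. Level 0: a single list of all keys. Level 1: lists G R W and A J M. Level 2: grouping not recoverable from the transcript.] Link at level i to nodes with a matching membership-vector prefix of length i. Think of a tree of skip lists that share lower layers.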
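The linking rule can be sketched as follows. The membership vectors below are illustrative values chosen so that the level-1 lists match the slide (A J M and G R W); in particular the vector for J is an assumption, since it is not legible in the transcript. The function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# Illustrative membership vectors (J's value is assumed).
VECTORS = {"A": "001", "J": "001", "M": "011",
           "G": "100", "W": "101", "R": "110"}


def level_lists(vectors: Dict[str, str], level: int) -> Dict[str, List[str]]:
    """Group keys, in key order, by the first `level` bits of their vector."""
    lists: Dict[str, List[str]] = defaultdict(list)
    for key in sorted(vectors):
        lists[vectors[key][:level]].append(key)
    return dict(lists)


def neighbors(vectors: Dict[str, str], key: str,
              level: int) -> Tuple[Optional[str], Optional[str]]:
    """Left and right neighbors of `key` in its level-`level` list."""
    members = level_lists(vectors, level)[vectors[key][:level]]
    i = members.index(key)
    left = members[i - 1] if i > 0 else None
    right = members[i + 1] if i + 1 < len(members) else None
    return left, right
```

At level 0 the prefix is empty, so every key is in one sorted list; each higher level splits the lists by one more bit, so a node keeps two neighbors per level.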

NCLAB 18 Choice of Routing Substrate Skip Graphs

NCLAB 19 Load Balancing. Each processor maintains load information about the nodes in its partial tree. The load of an internal node is the aggregated load of all processors managing portions of this node. This information is spread by gossip, i.e. periodic peer-to-peer exchanges. With this information: when a processor joins, find a processor with high load and partition it; if a processor is overloaded, find an underloaded processor and make it rejoin.
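A minimal sketch of how aggregated loads could guide a joining processor, assuming a complete local copy of the tree rather than gossiped partial views. The `TreeNode` type and the greedy descent are illustrative choices, not the scheme described in the paper.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TreeNode:
    name: str
    processor: Optional[str] = None  # set on leaves only (hypothetical field)
    load: float = 0.0                # leaf load; overwritten on internal nodes
    children: List["TreeNode"] = field(default_factory=list)


def aggregate(node: TreeNode) -> float:
    """Set each internal node's load to the sum of the loads below it."""
    if node.children:
        node.load = sum(aggregate(child) for child in node.children)
    return node.load


def find_heavy_processor(root: TreeNode) -> Optional[str]:
    """Greedy descent: always follow the child with the largest load."""
    node = root
    while node.children:
        node = max(node.children, key=lambda child: child.load)
    return node.processor
```

The descent is a heuristic: it reaches a processor in a heavily loaded region, which matches the slide's "find a processor with high load", but not necessarily the single most loaded processor.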

NCLAB 20 Applications: Multi-dimensional Indexing (SkipIndex). K-D tree. l_edge: 0 or 1. f_node: compare the target to the splitting plane. TreeID: Build its routing state.
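For the K-D tree case, the edge labels and comparison function can be sketched as below. The convention that label 0 means "coordinate below the splitting plane" and label 1 means "at or above it", along with the `KDNode` type and the `tree_id` helper, are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class KDNode:
    axis: int = 0                     # dimension the splitting plane cuts
    split: float = 0.0                # position of the splitting plane
    left: Optional["KDNode"] = None   # edge label 0: coordinate < split
    right: Optional["KDNode"] = None  # edge label 1: coordinate >= split


def tree_id(root: KDNode, target: Sequence[float]) -> str:
    """Concatenate the edge labels on the path from the root to the leaf."""
    bits = []
    node = root
    # Internal nodes have both children; a node without them is a leaf.
    while node.left is not None and node.right is not None:
        bit = 0 if target[node.axis] < node.split else 1
        bits.append(str(bit))
        node = node.left if bit == 0 else node.right
    return "".join(bits)


# Example: split on x at 0.5, then the right half on y at 0.3.
example = KDNode(axis=0, split=0.5,
                 left=KDNode(),
                 right=KDNode(axis=1, split=0.3, left=KDNode(), right=KDNode()))
```

The resulting bit strings sort in the same order as a depth-first walk of the K-D tree, so they can serve as keys in the linearized order that the routing layer works with.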

NCLAB 21 Applications Multi-dimensional Indexing (SkipIndex)

NCLAB 22 Comparison of PRISM with SWORD, Mercury, NodeWiz and Brushwood. [Table residue, PRISM entries only:] Logical hierarchy representing membership over a DHT. Quantization-based resource information. Advertisement from resource provider (DHT routing). Advertisement from service node (PDP). Quantization-based query. DHT routing. Not yet implemented. Dynamic logical hierarchy (dynamic scope partition).

NCLAB 23 Backup Slides

NCLAB 24 Distributed Decision Tree. The distributed decision tree will grow in a balanced fashion, assuming the workload does not show a sudden change in characteristics. In practice, since the workload pattern can change, the tree may grow in an unbalanced way. Dynamic Tree Reorganization: node leave and rejoin (details are future work). Proximity-Aware Replication: a joining node can be a replica of an existing node. Latency vs. traffic: additional traffic is generated to keep the primary node and its replicas consistent.