Online Balancing of Range-Partitioned Data with Applications to P2P Systems. Prasanna Ganesan, Mayank Bawa, Hector Garcia-Molina (Stanford University)

Presentation transcript:

1 Online Balancing of Range-Partitioned Data with Applications to P2P Systems
Prasanna Ganesan, Mayank Bawa, Hector Garcia-Molina
Stanford University

2 Motivation
Parallel databases use range partitioning
Advantages: Inter-query parallelism
– Data locality → low-cost range queries → high throughput
[Figure: key range]

3 The Problem
How to achieve load balance?
– Partition boundaries have to change over time
– Cost: data movement
Goal: Guarantee load balance at low cost
– Assumption: load balance is beneficial!
Contribution
– Online balancing: self-tuning system
– Slows down updates by a small constant factor

4 Roadmap
Model and Definitions
Load Balancing Operations
The Algorithms
Extension to P2P Setting
Experimental Results

5 Model and Definitions (1)
Nodes maintain range partition (on a key)
– Load of a node = # tuples in its partition
– Load imbalance σ = largest load / smallest load
Arbitrary sequence of tuple inserts and deletes
– Queries not relevant
– Automatically directed to relevant node
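The imbalance ratio defined on this slide can be computed directly from per-node tuple counts. A minimal sketch, assuming loads are given as a list of positive integers; the function name `imbalance_ratio` and the sample numbers are illustrative, not from the slides:

```python
def imbalance_ratio(loads):
    """Return sigma = largest load / smallest load over all nodes."""
    if not loads or min(loads) <= 0:
        # Assumption of this sketch: every node holds at least one tuple.
        raise ValueError("every node needs a positive load")
    return max(loads) / min(loads)


# Hypothetical per-node tuple counts for four nodes.
loads = [120, 80, 100, 40]
sigma = imbalance_ratio(loads)
print(sigma)
```

With these sample loads the largest node holds three times as many tuples as the smallest.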

6 Model and Definitions (2)
After each insert/delete:
– Potentially fix "imbalance" by modifying partitioning
– Cost = # tuples moved
Assume no inserts/deletes during balancing
– Non-critical simplification
Goal: σ < constant always
– Constant amortized cost per insert/delete
– Implication: faster queries, slower updates

7 Load Balancing Operations (1)
NbrAdjust: Transfer data between "neighbors"
[Figure: A = [0,50), B = [50,100) before; A = [0,35), B = [35,100) after]
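NbrAdjust can be pictured as sliding the boundary between two adjacent partitions. The sketch below assumes each partition is a sorted list of keys and that the operation equalizes the two loads; the slide does not say how much data is transferred, so the even split and the name `nbr_adjust` are assumptions made for illustration:

```python
def nbr_adjust(left, right):
    """Rebalance two adjacent partitions given as sorted key lists.

    Returns (new_left, new_right, tuples_moved). Assumes every key in
    `left` is smaller than every key in `right`.
    """
    merged = left + right            # still sorted, by the assumption above
    half = len(merged) // 2          # even split: an assumption of this sketch
    moved = abs(len(left) - half)    # cost = number of tuples that change node
    return merged[:half], merged[half:], moved


# Hypothetical example: the left node is overloaded.
new_left, new_right, moved = nbr_adjust([1, 2, 3, 4, 5, 6], [8, 9])
print(new_left, new_right, moved)
```

The new partition boundary is the first key of `new_right`, and `moved` is the cost charged in the model (number of tuples moved).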

8 Is NbrAdjust good enough?
Can be highly inefficient
– Ω(n) amortized cost per insert/delete (n = # nodes)
[Figure: nodes labeled A, E, D, C, B, F]

9 Load Balancing Operations (2)
Reorder: Hand over data to neighbor and split load of some other node
[Figure: nodes labeled A, E, D, C, B, F; ranges [0,10), [10,20), [20,30), [30,40), [40,50), [50,60) before Reorder; [40,60), [0,5), [5,10) after]
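Reorder combines a hand-over with a split. The sketch below assumes nodes are kept as a list of (name, sorted keys) pairs in key order, that the moving node prefers its left neighbor for the hand-over, and that it takes the upper half of the target's keys; those choices and the name `reorder` are assumptions for illustration, not details given on the slide:

```python
def reorder(nodes, mover, target):
    """Sketch of Reorder on a list of (name, sorted_keys) pairs in key order.

    The node `mover` hands all its tuples to an adjacent node, then re-enters
    next to `target` and takes the upper half of its keys.
    Assumes at least three nodes and mover != target.
    Returns (new_nodes, tuples_moved).
    """
    nodes = [(name, list(keys)) for name, keys in nodes]  # work on a copy
    i = [name for name, _ in nodes].index(mover)
    _, handed = nodes.pop(i)
    if i > 0:
        nodes[i - 1][1].extend(handed)   # left neighbor absorbs the range
    else:
        nodes[0][1][:0] = handed         # no left neighbor: prepend to the right one
    j = [name for name, _ in nodes].index(target)
    keys = nodes[j][1]
    half = len(keys) // 2
    taken = keys[half:]
    nodes[j] = (target, keys[:half])
    nodes.insert(j + 1, (mover, taken))  # mover now sits right after target
    return nodes, len(handed) + len(taken)


# Hypothetical example: lightly loaded B moves next to heavily loaded C.
before = [("A", [1, 2]), ("B", [3]), ("C", [4, 5, 6, 7, 8, 9])]
after, cost = reorder(before, "B", "C")
print(after, cost)
```

Unlike NbrAdjust, the cost here depends only on the two loads involved, not on how far apart the nodes are in key order.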

10 Roadmap
Model and Definitions
Load Balancing Operations
The Algorithms
Experimental Results
Extension to P2P Setting

11 The Doubling Algorithm
Geometrically divide loads into levels
– Level i ⇔ load in (2^i, 2^(i+1)]
– Will try balancing on level change
Two Invariants
– Neighbors tightly balanced → max 1 level apart
– All nodes within 3 levels → guarantees σ ≤ 8
[Figure: load scale marked at 2^i, 2^(i+1), 2^(i+2), showing Level 0, Level 1, Level 2, ..., Level i]
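The level of a load and the two invariants can be checked mechanically. A minimal sketch, assuming integer loads of at least 1 and reading "within 3 levels" as "at most three consecutive levels are occupied"; the function names are illustrative:

```python
def level(load):
    """Return i such that load lies in (2**i, 2**(i+1)], for integer load >= 1."""
    if load < 1:
        raise ValueError("load must be a positive integer")
    return (load - 1).bit_length() - 1


def invariants_hold(loads):
    """Check both invariants on loads listed in key (neighbor) order."""
    levels = [level(x) for x in loads]
    # Invariant 1: adjacent nodes are at most one level apart.
    neighbors_ok = all(abs(a - b) <= 1 for a, b in zip(levels, levels[1:]))
    # Invariant 2 (interpretation): all nodes span at most three consecutive levels.
    span_ok = max(levels) - min(levels) <= 2
    return neighbors_ok and span_ok


print(level(8), level(9))
print(invariants_hold([5, 8, 9, 16, 30]))
```

Under this reading, loads confined to levels i, i+1, and i+2 all lie in (2^i, 2^(i+3)], which is where the bound of 8 on σ comes from.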

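12 The Doubling Algorithm (2) [Figure: nodes labeled A, E, D, C, B, F]

13 The Doubling Algorithm (2) [Figure: nodes labeled A, E, D, C, B, F]

14 The Doubling Algorithm (2) [Figure: nodes labeled A, E, D, C, B, F]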

15 The Doubling Algorithm: Case 2
Search for a blue node
– If none, do nothing!
[Figure: nodes labeled A, E, D, C, B, F]

16 The Doubling Algorithm: Case 2
Search for a blue node
– If none, do nothing!
[Figure: nodes labeled A, D, C, E, B, F]

17 The Doubling Algorithm (3)
Similar operations when load goes down a level
– Try balancing with neighbor
– Otherwise, find a red node and reorder yourself
Costs and Guarantees
– σ ≤ 8
– Constant amortized cost per insert/delete

18 From Doubling to Fibbing
Change thresholds to Fibonacci numbers
– σ ≤ φ^3 ≈ 4.2
– Can also use other geometric sequences
– Costs are still constant
[Figure: F_i + F_(i+1) = F_(i+2)]
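The change from powers of two to Fibonacci thresholds, and the resulting bound, can be illustrated numerically. This is a sketch under the assumption that the thresholds start at 1, 2, 3, 5, ...; the starting point and the name `fib_thresholds` are not specified on the slide:

```python
import math


def fib_thresholds(limit):
    """Fibonacci numbers 1, 2, 3, 5, ... up to the first one >= limit."""
    a, b = 1, 2
    out = [a]
    while out[-1] < limit:
        out.append(b)
        a, b = b, a + b
    return out


thresholds = fib_thresholds(20)
phi = (1 + math.sqrt(5)) / 2           # golden ratio
bound = phi ** 3                       # the sigma bound quoted on the slide
ratio = thresholds[-1] / thresholds[-4]  # thresholds three levels apart
print(thresholds, round(bound, 3), ratio)
```

Thresholds three levels apart have a ratio that approaches φ^3, which is why spanning three Fibonacci levels gives a tighter bound than spanning three doubling levels.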

19 More Generalizations
Improve σ to (1+ε) for any ε > 0 [BG04]
– Generalize neighbors to c-neighbors
– Still constant cost O(1/ε)
Dealing with concurrent inserts/deletes
– Allow multiple balancing actions in parallel
– Paper claims it is ok

20 Application to P2P Systems
Goal: Construct P2P system supporting efficient range queries
– Provide asymptotic performance à la DHTs
What is a P2P system? A parallel DB with
– Nodes joining and leaving at will
– No centralized components
– Limited communication primitives
Enhance load-balancing algorithms to
– Allow dynamic node joins/leaves
– Decentralize implementation

21 Experiments
Goal: Study cost of balancing for different workloads
– Compare to periodic re-balancing algorithms (Paper)
– Trade-off between cost and imbalance ratio (Paper)
Results presented on Fibbing Algorithm (n=256)
Three-phase Workload
– (1) Inserts (2) Alternating inserts and deletes (3) Deletes
Workload 1: Zipf
– Random draws from Zipf-like distribution
Workload 2: HotSpot
– Think key=timestamp
Workload 3: ShearStress
– Insert at most-loaded, delete from least-loaded

22 Load Imbalance (Zipf)
[Plot: load imbalance vs. time (x1000) across the growing, steady, and shrinking phases]

23 Load Imbalance (ShearStress)

24 Cost of Load Balancing

25 Related Work
Karger & Ruhl [SPAA 04]
– Dynamic model, weaker guarantees
Load balancing in DBs
– Partitioning static relations, e.g., [GD92, RZML02, SMR00]
– Migrating fragments across disks, e.g., [SWZ93]
– Intra-node data structures, e.g., [LKOTM00]
Litwin et al. SDDS

26 Conclusions
Indeed possible to maintain well-balanced range partitions
– Range partitions competitive with hashing
Generalize to more complex load functions
– Allow tuples to have dynamic weights
– Change load definition in algorithms! *
– Range partitioning is powerful
Enables P2P system supporting range queries
– Generalizes DHTs with same asymptotic guarantees
* Lots of caveats apply. Need load to be evenly divisible. No guarantees offered on costs. This offer not valid with any other offers. Etc., etc., etc.