Elastic Data Partitioning for Cloud-based SQL Processing Systems
Lipyeow Lim, Information & Computer Science Department, University of Hawai`i at Mānoa (9/8/2010)

Presentation transcript:


Outline

Shared Nothing Parallel DBMS
[Figure labels: "query", "results", "Parallel DB layer", "Network", "DBMS" nodes]
Slide footer: 1/14/2013, Lipyeow Lim, University of Hawaii at Manoa

Cloud-based Architecture
[Figure labels: "(Virtualized) Network"; four "CPU / Memory / Disk" groups; "Virtual Machines"; "Physical Resources"; "Amazon EC2"]

“Scaling” Up and Down
[Figure labels: "query", "results", "Parallel DB layer", "Network", "DBMS" nodes]

Problem Statement
Given:
- A relation T
- A partitioning function F on a fixed partitioning key
- An initial number p of partitions/fragments
- An initial mapping of p fragments to p database nodes
- A target number q of partitions
Find a mapping of {T1, T2, ..., Tp} to {T1, T2, ..., Tq} and an assignment of the q fragments to q database nodes,
such that we minimize:
- The number of tuples re-partitioned
- The number of tuples moved between database nodes
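The cost terms in the problem statement can be made concrete with a small sketch. It assumes, purely for illustration, that F is "key modulo the number of fragments" and that fragment i is stored on node i; neither assumption comes from the slides, and the names `fragment_of` and `count_moved` are hypothetical. Under these assumptions a re-partitioned tuple is also a moved tuple, so one count covers both terms.

```python
def fragment_of(key: int, num_fragments: int) -> int:
    """Hypothetical partitioning function F: key modulo the fragment count."""
    return key % num_fragments


def count_moved(keys, p: int, q: int) -> int:
    """Count tuples whose fragment number changes when resizing from p to q.

    Assumes fragment i lives on node i, so a changed fragment number
    means the tuple moves between database nodes.
    """
    return sum(1 for k in keys if fragment_of(k, p) != fragment_of(k, q))


keys = list(range(12))            # twelve tuples with partitioning keys 0..11
moved = count_moved(keys, 2, 3)   # grow from 2 to 3 fragments
print(f"{moved} of {len(keys)} tuples move")  # prints "8 of 12 tuples move"
```

With this naive modulo scheme most tuples move on a resize, which is the kind of cost the problem statement asks to minimize.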

Partitioning a Relation
- Partitioning attribute/key
- Partitioning type, e.g. range or hash
- Partitioning constraint, e.g. equi-width, equi-size
- Number of partitions/fragments
[Figure label: "hash function"]

Horizontal Fragmentation: Range Partition

Sailors
sid  sname    rating  age
22   dustin   7       45
29   brutus   1       33
31   lubber   8       55
32   andy     4       23
58   rusty    10      35
64   horatio  7       35

Range partition on the rating column:
Partition 1: 0 <= rating < 5
Partition 2: 5 <= rating <= 10

Partition 1
sid  sname    rating  age
29   brutus   1       33
32   andy     4       23

Partition 2
sid  sname    rating  age
22   dustin   7       45
31   lubber   8       55
58   rusty    10      35
64   horatio  7       35
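A minimal sketch of the range partitioning shown on this slide, using the Sailors rows as data. The function name `range_partition` is hypothetical, and the rating and age values for rusty are taken from the hash-partition slides, where that row is legible.

```python
# Sailors rows from the slides: (sid, sname, rating, age)
SAILORS = [
    (22, "dustin", 7, 45),
    (29, "brutus", 1, 33),
    (31, "lubber", 8, 55),
    (32, "andy", 4, 23),
    (58, "rusty", 10, 35),
    (64, "horatio", 7, 35),
]


def range_partition(rows):
    """Split rows on rating: partition 1 holds 0 <= rating < 5,
    partition 2 holds 5 <= rating <= 10."""
    parts = {1: [], 2: []}
    for row in rows:
        rating = row[2]
        parts[1 if rating < 5 else 2].append(row)
    return parts


parts = range_partition(SAILORS)
print([r[1] for r in parts[1]])  # ['brutus', 'andy']
print([r[1] for r in parts[2]])  # ['dustin', 'lubber', 'rusty', 'horatio']
```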

Range Partition: Query Processing
Which partitions? Better than non-parallel?

Partition 1
sid  sname    rating  age
29   brutus   1       33
32   andy     4       23

Partition 2
sid  sname    rating  age
22   dustin   7       45
31   lubber   8       55
58   rusty    10      35
64   horatio  7       35

Queries:
SELECT * FROM Sailors S
SELECT * FROM Sailors S WHERE rating = 2
SELECT * FROM Sailors S WHERE rating < 2 and age < 30
SELECT * FROM Sailors S WHERE age > 30
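One way to reason about "which partitions?" under range partitioning is to compare the query's bounds on rating with each partition's range. The sketch below is an illustration, not the slides' method: it assumes integer ratings, expresses each partition as inclusive bounds, and uses a hypothetical helper `partitions_to_scan`. Predicates on other columns, such as age, do not narrow the set of partitions.

```python
# Rating range of each partition as inclusive integer bounds
# (assumes ratings are integers, so "rating < 5" becomes "rating <= 4").
PARTITION_RANGES = {1: (0, 4), 2: (5, 10)}


def partitions_to_scan(lo=None, hi=None):
    """Return partitions whose rating range overlaps [lo, hi].

    None means the query places no bound on that side.
    """
    lo = float("-inf") if lo is None else lo
    hi = float("inf") if hi is None else hi
    return [
        part
        for part, (p_lo, p_hi) in PARTITION_RANGES.items()
        if p_lo <= hi and lo <= p_hi
    ]


print(partitions_to_scan())            # no rating predicate: [1, 2]
print(partitions_to_scan(lo=2, hi=2))  # rating = 2: [1]
print(partitions_to_scan(hi=1))        # rating < 2 (integers): [1]
```

The query with only `age > 30` has no bound on rating, so it behaves like the first call and touches both partitions.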

Horizontal Fragmentation: Hash Partition
Hash partitioning using hash function:
- Partition = rating mod 2

Sailors
sid  sname    rating  age
22   dustin   7       45
29   brutus   1       33
31   lubber   8       55
32   andy     4       23
58   rusty    10      35
64   horatio  7       35

Partition 1 (rating mod 2 = 0)
sid  sname    rating  age
31   lubber   8       55
32   andy     4       23
58   rusty    10      35

Partition 2 (rating mod 2 = 1)
sid  sname    rating  age
22   dustin   7       45
29   brutus   1       33
64   horatio  7       35
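The hash function on this slide, rating mod 2, can be sketched as follows. The function name `hash_partition` is hypothetical, and buckets are keyed by the hash value (0 or 1) rather than by the slide's partition labels.

```python
from collections import defaultdict

# Sailors rows from the slides: (sid, sname, rating, age)
SAILORS = [
    (22, "dustin", 7, 45),
    (29, "brutus", 1, 33),
    (31, "lubber", 8, 55),
    (32, "andy", 4, 23),
    (58, "rusty", 10, 35),
    (64, "horatio", 7, 35),
]


def hash_partition(rows, num_partitions=2):
    """Assign each row to bucket (rating mod num_partitions)."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[2] % num_partitions].append(row)
    return dict(buckets)


buckets = hash_partition(SAILORS)
print([r[1] for r in buckets[0]])  # even ratings: ['lubber', 'andy', 'rusty']
print([r[1] for r in buckets[1]])  # odd ratings: ['dustin', 'brutus', 'horatio']
```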

Hash Partition: Query Processing
Which partitions? Better than non-parallel?

Partition 1
sid  sname    rating  age
31   lubber   8       55
32   andy     4       23
58   rusty    10      35

Partition 2
sid  sname    rating  age
22   dustin   7       45
29   brutus   1       33
64   horatio  7       35

Queries:
SELECT * FROM Sailors S
SELECT * FROM Sailors S WHERE rating = 2
SELECT * FROM Sailors S WHERE rating < 2 and age < 30
SELECT * FROM Sailors S WHERE age > 30
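For hash partitioning, a simple pruning rule is that only an equality predicate on the partitioning key identifies a single bucket. The sketch below illustrates that rule under the slide's hash function (rating mod 2); the helper name `hash_partitions_to_scan` is hypothetical, and buckets are numbered by hash value. In this sketch, range predicates on rating and predicates on other columns fall back to scanning every bucket.

```python
def hash_partitions_to_scan(rating_equals=None, num_partitions=2):
    """Return the hash buckets a query must read.

    An equality predicate on rating maps to exactly one bucket;
    anything else is treated as requiring all buckets.
    """
    if rating_equals is None:
        return list(range(num_partitions))
    return [rating_equals % num_partitions]


print(hash_partitions_to_scan(rating_equals=2))  # rating = 2: [0]
print(hash_partitions_to_scan())                 # rating < 2, age > 30, or no predicate: [0, 1]
```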

Method N: Naive Resize

Method C: Chunk-based

Method T: Tree-based

Method H: Hash-based
