Published by Alannah Stephens. Modified over 9 years ago.
1
Elastic Data Partitioning for Cloud-based SQL Processing Systems
Lipyeow Lim
Information & Computer Science Department, University of Hawai`i at Mānoa
9/8/2010
2
Outline
3
Shared Nothing Parallel DBMS
[Figure labels: query, results, parallel DB layer, network, DBMS nodes]
1/14/2013 Lipyeow Lim -- University of Hawaii at Manoa
4
Cloud-based Architecture
[Figure labels: (virtualized) network; four units, each with CPU, memory, and disk; virtual machines; physical resources; Amazon EC2]
5
“Scaling” Up and Down
[Figure labels: query, results, parallel DB layer, network, DBMS nodes]
6
Problem Statement
Given:
  - A relation T
  - A partitioning function F on a fixed partitioning key
  - An initial number p of partitions/fragments
  - An initial mapping of the p fragments to p database nodes
  - A target number q of partitions
Find:
  - A mapping of {T1, T2, ..., Tp} to {T1, T2, ..., Tq}, and an assignment of the q fragments to q database nodes
Such that we minimize:
  - The number of tuples re-partitioned
  - The number of tuples moved between database nodes
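To make the second cost concrete, here is a minimal sketch that counts how many tuples change node when the number of partitions goes from p to q. It assumes fragment i lives on node i and uses plain modulo hashing as the partitioning function; both are illustrative assumptions, not the methods proposed in this talk, and `tuples_moved` is a hypothetical helper name.

```python
from typing import Callable, Iterable


def tuples_moved(keys: Iterable[int],
                 old_node: Callable[[int], int],
                 new_node: Callable[[int], int]) -> int:
    """Count tuples whose node differs between the old and new placement."""
    return sum(1 for k in keys if old_node(k) != new_node(k))


# Hypothetical data: 1000 tuples with partitioning keys 0..999.
keys = range(1000)
p, q = 4, 5  # scale up from 4 to 5 nodes

# Assumption: fragment i is stored on node i, and F(key) = key mod n.
moved = tuples_moved(keys, lambda k: k % p, lambda k: k % q)
print(moved)  # 800: only keys with (k mod 20) in {0, 1, 2, 3} stay put
```

With this simple rehash, 80% of the tuples move for a one-node change, which is the kind of cost the problem statement asks to minimize.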
7
Partitioning a Relation
  - Partitioning attribute/key
  - Partitioning type, e.g. range or hash
  - Partitioning constraint, e.g. equi-width or equi-size
  - Number of partitions/fragments
[Figure: key values 2, 4, 6, 7, 13, 20, ... mapped to partitions by a hash function]
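The choices listed on this slide can be sketched as two small partitioning functions applied to the key values from the figure. The function names, the key domain [0, 20], and the choice of three partitions are assumptions made for illustration only.

```python
def hash_partition_id(key: int, n: int) -> int:
    """Hash partitioning: the partition is the key modulo the partition count."""
    return key % n


def equi_width_partition_id(key: int, lo: int, hi: int, n: int) -> int:
    """Range partitioning with equi-width ranges over the key domain [lo, hi]."""
    idx = (key - lo) * n // (hi - lo)  # integer arithmetic avoids float rounding
    return min(idx, n - 1)             # the upper bound hi falls in the last range


keys = [2, 4, 6, 7, 13, 20]  # key values shown in the figure
by_hash = [hash_partition_id(k, 3) for k in keys]
by_range = [equi_width_partition_id(k, 0, 20, 3) for k in keys]
print(by_hash)   # [2, 1, 0, 1, 1, 2]
print(by_range)  # [0, 0, 0, 1, 1, 2]
```

Range partitioning keeps neighbouring keys together, while hashing scatters them; the equi-width ranges here end up uneven in size (three, two, and one key), which is what an equi-size constraint would avoid.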
8
Horizontal Fragmentation: Range Partition
Sailors
  sid  sname    rating  age
  22   dustin   7       45
  29   brutus   1       33
  31   lubber   8       55
  32   andy     4       23
  58   rusty    10      35
  64   horatio  7       35
Range partition on the rating column:
  Partition 1: 0 <= rating < 5
  Partition 2: 5 <= rating <= 10
Partition 1
  sid  sname    rating  age
  29   brutus   1       33
  32   andy     4       23
Partition 2
  sid  sname    rating  age
  22   dustin   7       45
  31   lubber   8       55
  58   rusty    10      35
  64   horatio  7       35
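The range split on this slide can be reproduced with a short sketch. It represents each tuple as a plain Python tuple and uses a sorted list of boundaries with the standard `bisect` module; `range_partition` and its parameters are hypothetical names for illustration.

```python
from bisect import bisect_right

# Sailors rows from the slide: (sid, sname, rating, age)
SAILORS = [
    (22, "dustin", 7, 45),
    (29, "brutus", 1, 33),
    (31, "lubber", 8, 55),
    (32, "andy", 4, 23),
    (58, "rusty", 10, 35),
    (64, "horatio", 7, 35),
]


def range_partition(rows, boundaries, key_index=2):
    """Split rows into len(boundaries) + 1 fragments.

    boundaries holds the sorted lower bounds of fragments 2..n, so
    boundaries=[5] means fragment 1 is key < 5 and fragment 2 is key >= 5.
    """
    fragments = [[] for _ in range(len(boundaries) + 1)]
    for row in rows:
        fragments[bisect_right(boundaries, row[key_index])].append(row)
    return fragments


partition1, partition2 = range_partition(SAILORS, boundaries=[5])
print([r[0] for r in partition1])  # [29, 32]
print([r[0] for r in partition2])  # [22, 31, 58, 64]
```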
9
Range Partition: Query Processing
Which partitions? Better than non-parallel?
  SELECT * FROM Sailors S
  SELECT * FROM Sailors S WHERE rating = 2
  SELECT * FROM Sailors S WHERE rating < 2 AND age < 30
  SELECT * FROM Sailors S WHERE age > 30
Partition 1
  sid  sname    rating  age
  29   brutus   1       33
  32   andy     4       23
Partition 2
  sid  sname    rating  age
  22   dustin   7       45
  31   lubber   8       55
  58   rusty    10      35
  64   horatio  7       35
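One way to reason about the "which partitions?" question is partition pruning: a partition needs to be read only if its rating range overlaps the query's rating predicate. The sketch below assumes integer ratings, so Partition 1 covers 0..4 and Partition 2 covers 5..10 inclusive; `partitions_to_scan` is a hypothetical helper, not part of any DBMS API.

```python
# Inclusive rating bounds per partition, assuming integer ratings.
RANGES = {1: (0, 4), 2: (5, 10)}


def partitions_to_scan(lo=None, hi=None):
    """Return partitions whose rating range overlaps the inclusive
    predicate range [lo, hi]; None means unbounded on that side."""
    selected = []
    for pid, (p_lo, p_hi) in RANGES.items():
        if (lo is None or lo <= p_hi) and (hi is None or hi >= p_lo):
            selected.append(pid)
    return selected


print(partitions_to_scan())            # no predicate on rating -> [1, 2]
print(partitions_to_scan(lo=2, hi=2))  # rating = 2 -> [1]
print(partitions_to_scan(hi=1))        # rating < 2 (integers) -> [1]
```

A predicate on age alone, as in the last query, gives no bound on rating, so both partitions are read; the parallel system can still scan them at the same time.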
10
Horizontal Fragmentation: Hash Partition
Hash partitioning using the hash function: Partition = rating mod 2
Sailors
  sid  sname    rating  age
  22   dustin   7       45
  29   brutus   1       33
  31   lubber   8       55
  32   andy     4       23
  58   rusty    10      35
  64   horatio  7       35
Partition 1
  sid  sname    rating  age
  31   lubber   8       55
  32   andy     4       23
  58   rusty    10      35
Partition 2
  sid  sname    rating  age
  22   dustin   7       45
  29   brutus   1       33
  64   horatio  7       35
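The hash split on this slide can be sketched by grouping rows on rating mod 2. Buckets are labelled by the remainder (0 or 1) rather than by partition number, since the mapping of remainders to "Partition 1" and "Partition 2" is not stated explicitly; `hash_partition` is a hypothetical helper name.

```python
from collections import defaultdict

# Sailors rows from the slide: (sid, sname, rating, age)
SAILORS = [
    (22, "dustin", 7, 45),
    (29, "brutus", 1, 33),
    (31, "lubber", 8, 55),
    (32, "andy", 4, 23),
    (58, "rusty", 10, 35),
    (64, "horatio", 7, 35),
]


def hash_partition(rows, n, key_index=2):
    """Group rows into buckets keyed by (partitioning key mod n)."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[key_index] % n].append(row)
    return dict(buckets)


buckets = hash_partition(SAILORS, n=2)
print([r[0] for r in buckets[0]])  # even ratings: [31, 32, 58]
print([r[0] for r in buckets[1]])  # odd ratings:  [22, 29, 64]
```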
11
Hash Partition: Query Processing
Which partitions? Better than non-parallel?
  SELECT * FROM Sailors S
  SELECT * FROM Sailors S WHERE rating = 2
  SELECT * FROM Sailors S WHERE rating < 2 AND age < 30
  SELECT * FROM Sailors S WHERE age > 30
Partition 1
  sid  sname    rating  age
  31   lubber   8       55
  32   andy     4       23
  58   rusty    10      35
Partition 2
  sid  sname    rating  age
  22   dustin   7       45
  29   brutus   1       33
  64   horatio  7       35
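For hash partitioning, a simple pruning rule is that only an equality predicate on the partitioning key pins down a single bucket; a range predicate on rating, or a predicate on another column, generally requires every bucket. The sketch below illustrates that rule under the rating mod n scheme; `buckets_to_scan` is a hypothetical helper and returns remainders, not partition numbers.

```python
def buckets_to_scan(n, rating_eq=None):
    """Return the hash buckets (remainders mod n) a query must read.

    rating_eq is the constant of an equality predicate on rating,
    or None when the query has no such predicate.
    """
    if rating_eq is not None:
        return [rating_eq % n]  # equality: exactly one bucket can match
    return list(range(n))       # otherwise matches may be in any bucket


print(buckets_to_scan(2, rating_eq=2))  # rating = 2 -> [0]
print(buckets_to_scan(2))               # range or non-key predicate -> [0, 1]
```

Compared with range partitioning, the query with rating < 2 loses its pruning opportunity here, while the equality query still touches a single bucket.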
12
Method N: Naive Resize
13
Method C: Chunk-based
14
Method T: Tree-based
15
Method H: Hash-based