Data Partitioning in VLDB Tal Olier
Why am I here? Tal Olier – ~15 years in various software development positions. All of them involved database practice. I work in HP Software, I love working there and I came to tell you this; the lecture is just an excuse getting me into the building :)
Agenda RDBMS in short (basic terms) SQL reminder A bit about (RDBMS) architecture Performance - access paths What is table join VLDB - the size factor VLDB - industry practice How joins are executed Summary
R elational D atabase M anagement S ystem
A little history Was invented in 1970 – By Edgar Frank "Ted" Codd – In IBM labs – Oracle emerged first to the market
Basics – a table Rows Columns Primary key Emp_idEmp_nameSalary 1Dany10,000 2Yosi20,000 3Moshe30,000 4Eli40,000
Basics – a relation A foreign key (constraint) A reference – Source table – Source column/s – Target table – Target column/s
People example People: name, height, smoking, father Books read: title, author Schedule details: from, to, activity Resume details: from, to, salary
People example
S tructured Q uery L anguage
Query language SQL – Structured Query Language Declarative (vs. procedural) Requires Internal optimization
SELECT query structure SELECT FROM… JOIN WHERE GROUP BY HAVING ORDER BY
SQL modules DML (+Select) – Data manipulation language DDL – Data definition language TC – Transaction controls (commit/rollback) DCL – Data control language (grant/revoke) PE – Procedural extensions
A bit about architecture
Database server Memory Process I/O System Client Process Data Files Log Files Server Process Buffer cache Log cache Other cache Everything is blocks
IO bound vs. CPU bound CPU – what is it consumed for? IO – what is it consumed for?
Performance?
FTS – full table scan Scan the whole table – from top to bottom
B Tree index B tree – allows great spanning that derives small tree height
B+ tree The leaves are organized in a doubly linked list B+ tree – allows searching through all values by searching the leaf level only
Database index Data is sorted according to the index columns The leaf contain pointers to rows in the table Search of 1 value in a tree - o (log n) Smaller index height in B+ trees Index (database) operations: – Add/remove values – Index seek – Index scan
Index seek/scan …
Join (logical)
Inner join Use join predicate to match rows from 2 table: A and B Each row in table A is compared to each row in table B to find the pairs of rows that satisfy the join predicate Than column values for each matched pairs are combined into a result row
dept_idDept_nam e 1Sales 2Engineering 3Marketing department employee Emp_nameDept_id Rina1 Moshe2 Shira2 Yossinull emp_dept_ id emp_namedept_dep t_id dept_name 1Rina1Sales 1Rina2Engineering 1Rina3Marketing 2Moshe1Sales 2Moshe2Engineering 2Moshe3Marketing 2Shira1Sales 2Shira2Engineering 2Shira3Marketing nullYossi1Sales nullYossi2Engineering nullYossi3Marketing Cartesian product
Equi join A inner join that uses equality comparison in the join predicate Example: select * from employee emp join department dept on emp.dept_id = dept.dept_id
Equi join OK emp_dept_i d emp_namedept_dept _id dept_name 1Rina1Sales 1Rina2Engineering 1Rina3Marketing 2Moshe1Sales 2Moshe2Engineering 2Moshe3Marketing 2Shira1Sales 2Shira2Engineering 2Shira3Marketing nullYossi1Sales nullYossi2Engineering nullYossi3Marketing
RDBMS – summary in a nutshell Tables References Joins Indexes Blocks I/O
V ery L arge D ata B ase
RDBMS – summary in a nutshell Tables References Indexes Blocks I/O
VLDB – a table – size factor
Use case: Sales Information Table: – Customer name – Order number – Order date and time – List of items, amount and prices
Order details ( )
Remove 2007’s orders …
Order details kept in 4 tables 2007 … 2008 … 2009 … 2010 …
… 4 tables – remove 2007’s data 2007 … 2008 … 2009 … 2010 …
Union view Select * from t2007 Union all Select * from t2008 Union all Select * from t2009 Union all Select * from t2010
Order details kept in 4 tables and a view 2007 … 2008 … 2009 … 2010 …
Partitioned table 2007 … 2008 … 2009 … 2010 …
Get back to: Remove 2007’s data? 2007 … 2008 … 2009 … 2010 …
Impact on index behavior … 2007 … 2008 … 2009 … 2010 …
Partitioned index (local index) … 2007 … 2008 … 2009 … 2010 …
Local indexes Index is bound to it’s partition Drop partition derives drop index Smaller index heights Index is always usable Harder to maintain uniqueness with it
Partitioned table - concepts Partition column is the key for dividing the data Performance – only relevant partitions used Add/drop partition – DDL Local index – index is bound to a partition
Star schema
Data tables block T a b l e - A T a b l e - B T a b l e - C
Let’s get back to our partitioned table 2007 … 2008 … 2009 … 2010 …
Dimension referencing YearCust_id… … … Cust_idNameSerial signature… … 1715Bank of America Ltd/FFAA … … … …
Making fact tables thin 2007 … 2008 … 2009 … 2010 … Dimension
Join (physical)
Join To perform a join the optimizer need make the following decisions: Access path how to access each table Join order if more than 2 tables/views are joined, which join to do first Join method for each pair of row resource how to perform the join
Join methods– nested loop One input is the outer loop, the other input is the inner loop The inner loop is executed for each row in the outer loop Effective when – The outer loop is small – The inner loop is pre indexed
Join methods– hash The smaller of the 2 inputs is named the build input The second is probe input Hash table is build from build input Each row in the build input is put in the appropriate bucket The entire probe input is scanned
Join methods– hash cont’ For each row the hash value is calculated The corresponding hash bucket is scanned to find matched rows in the build input Good for joining large amount of data
Join methods– merge There is no concept of driving table Both input sources are sorted according the join key ( or use sorted source such as index) The sorted lists are merged together The merge itself is very fast, but it can be expensive to sort the sources
Summary It’s all about I/O Star schema – facts and dimensions Partitions + local indexes SQL joins (probably hash)
Q & A