A Scalable, Predictable Join Operator for Highly Concurrent Data Warehouses George Candea (EPFL & Aster Data) Neoklis Polyzotis (UC Santa Cruz) Radek Vingralek (Aster Data)

Presentation transcript:

A Scalable, Predictable Join Operator for Highly Concurrent Data Warehouses George Candea (EPFL & Aster Data) Neoklis Polyzotis (UC Santa Cruz) Radek Vingralek (Aster Data)

Highly Concurrent Data Warehouses
- Data analytics is a core service of any data warehouse (DW).
- High query concurrency is becoming important.
- At the same time, customers need predictability.
  - Requirement of an actual customer: increasing concurrency from one query to 40 should not increase latency by more than 6x.

Shortcoming of Existing Systems
- DWs employ the query-at-a-time model.
  - Each query executes as a separate physical plan.
- Result: concurrent plans contend for resources.
- This creates a situation of “workload fear”.

Our Contribution: CJOIN
- A novel physical operator for star queries.
  - Star queries arise frequently in ad-hoc analytics.
- Main ideas:
  - A single physical plan for all concurrent queries.
  - The plan is always “on”.
  - Deep work sharing: I/O, join processing, storage.

Outline
- Preliminaries
- The CJOIN operator
- Experimental study
- Conclusions

Setting
- We assume a star-schema DW.
- We target the class of star queries.
- Goal: execute concurrent star queries efficiently.
  - Low latency.
  - Graceful scale-up.

Further Assumptions
- The fact table is too large to fit in main memory.
- Dimension tables are “small”.
  - Example from TPC-DS: 2.5GB of dimension data for a 1TB warehouse.
- Indices and materialized views may exist.
- The workload is volatile.

Design Overview
[Figure: architecture diagram. Labels: Query Stream; Star Queries; Other Queries; CJOIN (Preprocessor, Filter, Filter, Distributor); Optimizer; Conventional Query Processor.]

Running Example
Schema: Fact Table F (measure m), Dimension X, Dimension Y.
Queries:
Q1: select COUNT(*) from F join X join Y where φ1(X) and ψ1(Y)
Q2: select SUM(F.m) from F join Y where ψ2(Y)
    (annotated on the slide with: join X and TRUE(X))
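One way to read the annotation on Q2 is that every star query is treated as joining all dimensions, with a trivially true predicate on any dimension it does not mention. The following minimal Python sketch shows that normalization. The `StarQuery` class, the column names `color` and `year`, and the concrete predicates standing in for φ1, ψ1, and ψ2 are all hypothetical; the slides do not define them.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

Row = Dict[str, object]
Predicate = Callable[[Row], bool]

DIMENSIONS = ("X", "Y")  # dimensions of the running example's star schema


def TRUE(_row: Row) -> bool:
    """Trivial predicate used for dimensions a query does not restrict."""
    return True


@dataclass
class StarQuery:
    name: str
    aggregate: str
    # Only the dimensions the query explicitly restricts appear here.
    predicates: Dict[str, Predicate] = field(default_factory=dict)

    def normalized(self) -> Dict[str, Predicate]:
        """Return one predicate per dimension, filling gaps with TRUE."""
        return {dim: self.predicates.get(dim, TRUE) for dim in DIMENSIONS}


# Hypothetical stand-ins for phi_1, psi_1 and psi_2.
q1 = StarQuery("Q1", "COUNT(*)", {
    "X": lambda r: r["color"] == "red",
    "Y": lambda r: r["year"] >= 2009,
})
q2 = StarQuery("Q2", "SUM(F.m)", {
    "Y": lambda r: r["year"] < 2009,
})

q2_preds = q2.normalized()
```

After normalization, Q1 and Q2 have the same shape (one predicate per dimension), which is what lets a single plan serve both.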

The CJOIN Operator
[Figure: CJOIN pipeline. Labels: Continuous Scan of Fact Table F; Preprocessor; Filter; Filter; Distributor; aggregation operators COUNT (Q1) and SUM (Q2).]

The CJOIN Operator
[Figure: the CJOIN pipeline with Hash Table X and Hash Table Y attached to the Filters. Each hash table entry carries one bit per query (columns Q1, Q2), and each table has a “*” entry. Dimension Y is shown split into the regions Q1 ∧ ¬Q2, ¬Q1 ∧ Q2, and Q1 ∧ Q2. A Query Start table lists Q1: a and Q2: b against the continuous scan of Fact Table F.]

Processing Fact Tuples
[Figure: fact tuples from the continuous scan of Fact Table F carry Q1/Q2 bits through the Preprocessor, the Filters (Hash Table X, Hash Table Y), and the Distributor to COUNT (Q1) and SUM (Q2). The Query Start table lists Q1: a and Q2: b.]
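The figure suggests that each fact tuple carries one bit per registered query and that every Filter narrows those bits using its dimension hash table. The sketch below works under that assumption: bit i of a hash table entry means "query i accepts this dimension row", the “*” entry is modeled as a default bitvector for rows not stored in the table, and a tuple whose bits reach zero is dropped. All function names, keys, and data values are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
Predicate = Callable[[Row], bool]
# A query maps a dimension name to a predicate; a missing entry means TRUE.
Query = Dict[str, Predicate]
Filter = Tuple[Dict[int, int], int]  # (hash table, default bits)


def build_filter(dim: str, rows: Dict[int, Row], queries: List[Query]) -> Filter:
    """Build the hash table and default ("*") bits for one dimension."""
    default = 0
    for i, q in enumerate(queries):
        if dim not in q:          # no predicate on this dimension: accept all rows
            default |= 1 << i
    table: Dict[int, int] = {}
    for key, row in rows.items():
        bits = default
        for i, q in enumerate(queries):
            if dim in q and q[dim](row):
                bits |= 1 << i
        if bits != default:       # store only rows that an explicit predicate selects
            table[key] = bits
    return table, default


def process_fact(fact: Row, filters: List[Tuple[str, Filter]], all_queries: int) -> int:
    """AND the tuple's bitvector with each Filter; a result of 0 means dropped."""
    bits = all_queries
    for dim, (table, default) in filters:
        bits &= table.get(fact[dim], default)
        if bits == 0:
            break
    return bits


# Hypothetical data: bit 0 is Q1, bit 1 is Q2.
X = {1: {"color": "red"}, 2: {"color": "blue"}}
Y = {10: {"year": 2010}, 20: {"year": 2005}}
queries: List[Query] = [
    {"X": lambda r: r["color"] == "red", "Y": lambda r: r["year"] >= 2009},  # Q1
    {"Y": lambda r: r["year"] < 2009},                                       # Q2
]
filters = [("X", build_filter("X", X, queries)),
           ("Y", build_filter("Y", Y, queries))]
facts = [
    {"X": 1, "Y": 10, "m": 5},
    {"X": 2, "Y": 20, "m": 7},
    {"X": 2, "Y": 10, "m": 3},
    {"X": 1, "Y": 20, "m": 4},
]

count_q1, sum_q2 = 0, 0
for fact in facts:                 # the Distributor step: route by surviving bits
    bits = process_fact(fact, filters, 0b11)
    if bits & 0b01:
        count_q1 += 1
    if bits & 0b10:
        sum_q2 += fact["m"]
```

With this data, Hash Table X holds a single stored entry with both bits set and a default of "Q2 only", which matches the `11` and `* 01` pattern visible in the figure residue. Each fact tuple costs one hash lookup and one bitwise AND per Filter regardless of how many queries are registered, which is consistent with the low per-tuple cost claimed later in the deck.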

Registering New Queries
New query Q3: select AVG(F.m) from F join X where φ3(X)
    (annotated on the slide with: join Y and TRUE(Y))
Dimension query shown for Q3: select * from X where φ3(X)
[Figure: the CJOIN pipeline from the previous slides, with a Q3 bit column being added to Hash Table X next to Q1 and Q2. The Query Start table lists Q1: a and Q2: b.]

Registering New Queries
Q3: select AVG(F.m) from F join X where φ3(X)
    (annotated on the slide with: join Y and TRUE(Y))
[Figure: Q3 is now registered. The Query Start table lists Q1: a, Q2: b, and Q3: c; a “Begin Q3” marker appears in the continuous scan of Fact Table F; the hash tables and fact tuples carry a Q3 bit; an AVG operator for Q3 sits after the Distributor alongside COUNT (Q1) and SUM (Q2).]
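The Query Start table and the “Begin Q3” marker suggest that a query registered mid-scan records the current scan position and is complete once the continuous scan has wrapped around to that position. A minimal sketch of that bookkeeping follows, assuming an in-memory list stands in for the fact table; the `ContinuousScan` class and its method names are hypothetical.

```python
from typing import Dict, List, Tuple


class ContinuousScan:
    """Circular scan over a fact table that tracks per-query start points."""

    def __init__(self, facts: List[str]) -> None:
        if not facts:
            raise ValueError("fact table must not be empty")
        self.facts = facts
        self.pos = 0                     # index of the next tuple to read
        self.start: Dict[str, int] = {}  # query id -> position at registration
        self.seen: Dict[str, int] = {}   # query id -> tuples seen so far

    def register(self, qid: str) -> None:
        """Admit a query at the current scan position."""
        self.start[qid] = self.pos
        self.seen[qid] = 0

    def step(self) -> Tuple[str, List[str], List[str]]:
        """Read one tuple; return (tuple, active queries, queries that finished)."""
        fact = self.facts[self.pos]
        active = list(self.start)
        self.pos = (self.pos + 1) % len(self.facts)
        finished = []
        for qid in active:
            self.seen[qid] += 1
            if self.seen[qid] == len(self.facts):  # wrapped back to its start
                finished.append(qid)
        for qid in finished:
            del self.start[qid]
            del self.seen[qid]
        return fact, active, finished


scan = ContinuousScan(["f0", "f1", "f2", "f3"])
scan.register("Q1")
seen_by: Dict[str, List[str]] = {"Q1": [], "Q3": []}
done_order: List[str] = []
for i in range(6):
    if i == 2:
        scan.register("Q3")              # Q3 arrives while the scan is running
    fact, active, finished = scan.step()
    for qid in active:
        seen_by[qid].append(fact)
    done_order += finished
```

Q3 starts in the middle of the table and still sees every fact tuple exactly once, only in a rotated order, so registration never restarts the scan for the queries already running.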

Properties of CJOIN Processing
- CJOIN enables a deep form of work sharing:
  - Join computation.
  - Tuple storage.
  - I/O.
- Computational cost per tuple is low.
  - Hence, CJOIN can sustain a high I/O throughput.
- Predictable query latency.
  - Continuous scan can provide a progress indicator.
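Since a query finishes after one full pass of the continuous scan, its progress can be estimated from how much of the fact table the scan has covered since the query started. The sketch below illustrates this, assuming a roughly constant scan rate; the function names and the numbers in the example are hypothetical.

```python
def scan_progress(tuples_seen: int, table_size: int) -> float:
    """Fraction of the fact table scanned since the query was registered."""
    if table_size <= 0:
        raise ValueError("table_size must be positive")
    return min(tuples_seen / table_size, 1.0)


def estimated_seconds_left(tuples_seen: int, table_size: int,
                           tuples_per_second: float) -> float:
    """Remaining time, assuming the scan keeps its current rate."""
    if tuples_per_second <= 0:
        raise ValueError("tuples_per_second must be positive")
    remaining = max(table_size - tuples_seen, 0)
    return remaining / tuples_per_second


# Example: 250 of 1000 fact tuples seen, scan running at 50 tuples per second.
progress = scan_progress(250, 1000)
eta = estimated_seconds_left(250, 1000, 50.0)
```

In this example the query is a quarter done and has 750 tuples left, so the estimate is 15 seconds. The estimate depends on the scan rate rather than on which other queries are running, which is the sense in which latency is predictable here.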

Other Details (in the paper)
- Run-time optimization of Filter ordering.
- Updates.
- Implementation on multi-core systems.
- Extensions:
  - Column stores.
  - Fact table partitioning.
  - Galaxy schemata.
[Figure: pipeline of Preprocessor, Filter (x n), and Distributor.]
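The slide names run-time Filter ordering without describing the policy. As an illustration only, one plausible heuristic is to place the Filters that drop the most tuples first, so fewer tuples reach the later Filters. The sketch below assumes that heuristic and per-Filter counters; it is not taken from the paper, and `FilterStats` and `reorder` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FilterStats:
    name: str
    seen: int = 0      # tuples that reached this Filter
    dropped: int = 0   # tuples this Filter eliminated

    @property
    def drop_rate(self) -> float:
        return self.dropped / self.seen if self.seen else 0.0


def reorder(filters: List[FilterStats]) -> List[FilterStats]:
    """Assumed heuristic: most selective Filter (highest drop rate) first."""
    return sorted(filters, key=lambda f: f.drop_rate, reverse=True)


# Hypothetical counters collected over a window of fact tuples.
pipeline = [FilterStats("X", seen=100, dropped=10),
            FilterStats("Y", seen=100, dropped=60)]
new_order = [f.name for f in reorder(pipeline)]
```

One caveat with this sketch: a Filter's observed drop rate depends on which tuples earlier Filters let through, so counters gathered under one ordering are only an approximation for another.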

Experimental Methodology
- Systems:
  - CJOIN prototype on top of Postgres.
  - Postgres with shared scans enabled.
  - Commercial system X.
- We use the Star Schema Benchmark (SSB).
  - Scale factor = 100 (100GB of data).
  - Workload comprises parameterized SSB queries.
- Hardware:
  - Quad-core Intel Xeon.
  - 8GB of shared RAM.
  - RAID-5 array of four 15K RPM SAS disks.

Effect of Concurrency
Throughput increases with more concurrent queries.

Response Time Predictability
Query latency is predictable; no more workload fear.

Influence of Data Scale
CJOIN is effective even for small data sets.
Concurrency level: 128

Related Work
- Materialized views [R+95, HRU96].
- Multiple query optimization [T88].
- Work sharing.
  - Staged DBs [HSA05].
  - Scan sharing [F94, Z+07, Q+08].
  - Aggregation [CR07].
- BLINK [R+08].
- Streaming database systems [M+02, B+04].

Conclusions
- High query concurrency is crucial for DWs.
- Query-at-a-time leads to poor performance.
- Our solution: CJOIN.
  - Target: class of star queries.
  - Deep work sharing: I/O, join, tuple storage.
  - Efficient realization on multi-core architectures.
- Experiments show an order of magnitude improvement over a commercial system.

THANK YOU!