© 1999 FORWISS

MISTRAL
Performance of TPC-D Benchmark and Datawarehouses

Prof. R. Bayer, Ph.D., Dr. Volker Markl
Dept. of Computer Science, Technical University Munich, and Bavarian Research Center for Knowledge-Based Systems (FORWISS)

Test Bed for Performance Measurements

- Hardware
  - Compaq Proliant 5000
  - 4 Pentium II 200 MHz
  - 512 MB RAM
  - hard disk: 7 * 4 GB = 28 GB
- Operating System
  - Windows NT 4.0
- RDBMS
  - Oracle 8
  - 8 kB pages
- Access Methods
  - Tetris Algorithm for UB-Trees
  - Oracle IOT (clustering B*-Tree)
  - Oracle FTS (full table scan)

TPC-D Schema

Shipping Priority Query (Q3)

    SELECT L_ORDERKEY, SUM(L_EXTENDEDPRICE*(1-L_DISCOUNT)) AS REVENUE,
           O_ORDERDATE, O_SHIPPRIORITY
    FROM CUSTOMER, ORDERS, LINEITEM
    WHERE C_MKTSEGMENT = 'FOOD'
      AND C_CUSTKEY = O_CUSTKEY
      AND L_ORDERKEY = O_ORDERKEY
      AND O_ORDERDATE < [date]
      AND L_SHIPDATE > [date]
    GROUP BY L_ORDERKEY, O_ORDERDATE, O_SHIPPRIORITY
    ORDER BY REVENUE DESC, O_ORDERDATE
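To make the shape of Q3 concrete, the following Python sketch runs it against a tiny in-memory SQLite database. It is an illustration under stated assumptions, not the measured setup: the date predicates are taken to be O_ORDERDATE < [date] AND L_SHIPDATE > [date] as in the TPC-D definition of Q3, the order table is named ORDERS, dates are stored as ISO strings, and all rows are made up.

```python
import sqlite3

# Toy stand-in for the TPC-D tables; only the columns Q3 touches (assumed layout).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CUSTOMER (C_CUSTKEY INTEGER, C_MKTSEGMENT TEXT);
CREATE TABLE ORDERS   (O_ORDERKEY INTEGER, O_CUSTKEY INTEGER,
                       O_ORDERDATE TEXT, O_SHIPPRIORITY INTEGER);
CREATE TABLE LINEITEM (L_ORDERKEY INTEGER, L_EXTENDEDPRICE REAL,
                       L_DISCOUNT REAL, L_SHIPDATE TEXT);
INSERT INTO CUSTOMER VALUES (1, 'FOOD'), (2, 'OTHER');
INSERT INTO ORDERS VALUES (10, 1, '1995-03-01', 0),
                          (11, 1, '1995-03-20', 0),
                          (12, 2, '1995-03-01', 0);
INSERT INTO LINEITEM VALUES (10, 100.0, 0.5, '1995-03-20'),
                            (10, 200.0, 0.0, '1995-03-25'),
                            (10,  50.0, 0.0, '1995-03-10'),
                            (11, 400.0, 0.0, '1995-03-30'),
                            (12, 999.0, 0.0, '1995-03-30');
""")

# Q3 with the [date] parameter bound as :d (ISO strings compare chronologically).
Q3 = """
SELECT L_ORDERKEY, SUM(L_EXTENDEDPRICE*(1-L_DISCOUNT)) AS REVENUE,
       O_ORDERDATE, O_SHIPPRIORITY
FROM CUSTOMER, ORDERS, LINEITEM
WHERE C_MKTSEGMENT = 'FOOD'
  AND C_CUSTKEY = O_CUSTKEY
  AND L_ORDERKEY = O_ORDERKEY
  AND O_ORDERDATE < :d
  AND L_SHIPDATE > :d
GROUP BY L_ORDERKEY, O_ORDERDATE, O_SHIPPRIORITY
ORDER BY REVENUE DESC, O_ORDERDATE
"""

rows = conn.execute(Q3, {"d": "1995-03-15"}).fetchall()
print(rows)  # only order 10 qualifies; two of its three line items ship after the date
```

The query combines two joins, restrictions on three tables, a grouping, and a sort on the aggregate, which is why its cost depends so much on how LINEITEM is accessed and sorted.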

Tetris algorithm Q3

Response times 50% LINEITEM (Q3)

Temporary Storage 50% LINEITEM (Q3)

Sorting 50% of LINEITEM

Forecasting Revenue Change Query (Q6)

    SELECT SUM(L_EXTENDEDPRICE*L_DISCOUNT) AS REVENUE
    FROM LINEITEM
    WHERE L_SHIPDATE >= [date]
      AND L_SHIPDATE <= [date] + INTERVAL 1 YEAR
      AND L_DISCOUNT BETWEEN [discount] AND [discount]
      AND L_QUANTITY < [quantity]
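Q6 restricts three LINEITEM attributes at once (L_SHIPDATE, L_DISCOUNT, L_QUANTITY), so it can be read as a range query over a three-dimensional box. The Python sketch below spells out that reading on made-up rows. The function name q6_revenue, the tuple layout, and the sample values are assumptions for illustration and are not part of the benchmark.

```python
from datetime import date


def q6_revenue(lineitems, start, discount_lo, discount_hi, max_quantity):
    """Sum L_EXTENDEDPRICE * L_DISCOUNT over the rows inside the Q6 query box.

    Each row is (shipdate, discount, quantity, extendedprice); this layout is
    a hypothetical simplification of LINEITEM.
    """
    # [date] + 1 year; this simple form assumes start is not February 29.
    end = start.replace(year=start.year + 1)
    total = 0.0
    for shipdate, discount, quantity, price in lineitems:
        in_box = (start <= shipdate <= end
                  and discount_lo <= discount <= discount_hi
                  and quantity < max_quantity)
        if in_box:
            total += price * discount
    return total


rows = [
    (date(1994, 1, 1), 0.25, 10, 100.0),   # inside the box
    (date(1994, 6, 15), 0.5, 5, 200.0),    # inside the box
    (date(1995, 6, 1), 0.25, 10, 100.0),   # ship date out of range
    (date(1994, 3, 1), 0.75, 10, 100.0),   # discount out of range
    (date(1994, 3, 1), 0.25, 30, 100.0),   # quantity too large
]
revenue = q6_revenue(rows, date(1994, 1, 1), 0.25, 0.5, 24)
print(revenue)
```

The loop is a full scan; an access method that clusters on all three attributes only needs to read the pages that overlap the box.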

Forecasting Revenue Change Query (Q6)

Performance of Q6

Retrieving 3.3% of LINEITEM

GFK Snowflake Schema

TETRIS & MHC

Performance Measurements GFK

- DBMS
  - TransBase (covering, clustering compound B*-Trees)
  - UB/API on top of TransBase (UB-Tree; two ESQL statements are optimized and processed per UB-Tree page access)
  - TransBase Hypercube (UB-Tree inside the DBMS kernel)
- Database
  - real-world data warehouse from GFK
  - 3D snowflake schema
    - Time (3 years = 18 MP)
    - Segment (10500 outlets)
    - Product (~ items in 604 product groups)
  - 42 million fact tuples (~ 4 GB fact table size)
- Computer
  - Sun ULTRA 1 workstation (64 MB main memory)
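Both UB-Tree variants listed above rest on the same idea: a multidimensional key is mapped to a single Z-address by interleaving the bits of its coordinates, and tuples are kept in a B-Tree ordered by that address. The Python sketch below shows only the address computation. The function name z_address and the simplification that every dimension has the same bit length are assumptions; a production UB-Tree also handles dimensions of different lengths.

```python
def z_address(coords, bits):
    """Interleave the bits of each coordinate into one integer (Z-order).

    coords: tuple of non-negative integers, one per dimension
    bits:   number of bits per dimension (assumed equal for all dimensions)
    """
    address = 0
    for bit in range(bits - 1, -1, -1):        # most significant bit first
        for value in coords:                   # one bit from each dimension
            address = (address << 1) | ((value >> bit) & 1)
    return address


# Points of a 4 x 4 grid, listed in the order a Z-ordered B-Tree would store them.
grid = [(x, y) for x in range(4) for y in range(4)]
z_sorted = sorted(grid, key=lambda p: z_address(p, 2))
print(z_sorted[:4])  # the first quadrant is visited before any other
```

Because neighbouring points tend to receive nearby addresses, a range restriction on any subset of the dimensions touches comparatively few B-Tree pages.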

Indexes

- MHC to encode hierarchies:
  - TIME_CS (5 bits)
  - SEGMENT_CS (24 bits)
  - PRODUCT_CS (29 bits)
- Compound on (PRODUCT_CS, TIME_CS, SEGMENT_CS) or (TIME_CS, SEGMENT_CS, PRODUCT_CS)
- UB-Tree (UB/API) on {TIME_CS, PRODUCT_CS, SEGMENT_CS}
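MHC (multidimensional hierarchical clustering) encodes a path through a dimension hierarchy as a compound surrogate: the surrogates of the individual levels are concatenated into one fixed-length bit string, so that all members below a hierarchy node form one contiguous range. The Python sketch below illustrates this for TIME_CS. The split of its 5 bits into 2 bits for the year and 3 bits for the period within the year is an assumption made for the example, as are the function names; the slides give only the total lengths.

```python
def compound_surrogate(path, level_bits):
    """Concatenate per-level surrogates, most general level first."""
    if len(path) != len(level_bits):
        raise ValueError("one surrogate per hierarchy level expected")
    cs = 0
    for surrogate, bits in zip(path, level_bits):
        if not 0 <= surrogate < (1 << bits):
            raise ValueError("surrogate does not fit into its level")
        cs = (cs << bits) | surrogate
    return cs


def subtree_range(prefix, level_bits):
    """Interval of compound surrogates below the hierarchy node 'prefix'."""
    free_bits = sum(level_bits[len(prefix):])
    lo = compound_surrogate(prefix, level_bits[:len(prefix)]) << free_bits
    return lo, lo + (1 << free_bits) - 1


# Hypothetical level layout for TIME_CS: 2 bits year + 3 bits period = 5 bits.
TIME_LEVEL_BITS = (2, 3)

cs = compound_surrogate((1, 4), TIME_LEVEL_BITS)     # year 1, period 4
year_range = subtree_range((1,), TIME_LEVEL_BITS)    # every period of year 1
print(cs, year_range)

# Total key length when all three dimensions from the slide are combined.
total_bits = 5 + 24 + 29
```

A restriction on a hierarchy level, such as one year or one product group, thus becomes an interval restriction on the corresponding surrogate, which is the kind of predicate both the compound B*-Tree and the UB-Tree can exploit.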

GFK Datawarehouse Reports → selectivity << 1%

Compound: fixed 2 MP, varying PG

UB-Tree: fixed 2 MP, varying PG

Response Time & Result Set Size

Clustering of UB-Trees: clustering factor c = s / d, Ø = 0.85

Clustering depending on Result Set Size

Summary UB-Tree

- Excellent performance on large real DBs, > factor 10
- Very low storage requirement
- 1st answer extremely fast, interactive use!!
- Response time proportional to size of answer
- Wide applicability: all DBs are multidimensional!!
- Easy integration into DBMS, simple DDL extension
- Very useful as middleware
- Patent applications