© 2000 FORWISS, 1 MISTRAL Processing Relational Queries Using a Multidimensional Access Method.

© 2000 FORWISS, 1 MISTRAL Processing Relational Queries Using a Multidimensional Access Method Volker Markl Rudolf Bayer FORWISS (Bayerisches Forschungszentrum für Wissensbasierte Systeme)

© 2000 FORWISS, 2 Staff Members
MISTRAL Project Management: Prof. Rudolf Bayer, Ph.D. (FORWISS Knowledge Bases Group Head); Dr. Volker Markl (MISTRAL Project Leader, Deputy Research Group Head)
MISTRAL Research Assistants: Dipl. Inform. Robert Fenk, Dipl. Inform. Roland Pieringer, Frank Ramsak, M.Sc., Dipl. Inform. Martin Zirkel
MISTRAL Master Students and Interns: Ralf Acker, Bulent Altan, Sonja Antunes, Michael Bauer, Sascha Catelin, Naoufel Boulila, Nils Frielinghaus, Sebastian Hick, Stefan Krause, Jörg Lanzinger, Christian Leiter, Yiwen Lue, Stephan Merkel, Nasim Nadjafi, Oliver Nickel, Daniel Ovadya, Markus Pfadenauer, Timka Piric, Sabine Rauschendorfer, Antonius Salim, Maximilian Schramm, Michael Streichsbier, Anton Tichatschek

© 2000 FORWISS, 3 Range Queries, Tetris Algorithm

© 2000 FORWISS, 4 Overview
1. Concept of the UB-Tree: Z-Regions
2. Insertion
3. Range Query Algorithm
4. Tetris Algorithm
5. Kernel Integration
6. Performance Overview

© 2000 FORWISS, 5 Relations and MD Space
• Decision Support Relation (similar to TPC-D)
  – Fact(customer, product, time, Sales) ⇒ defines a three-dimensional cube
• Point Query
  – All sales for one customer for one specific product on a certain day
• Partial Match Query
  – All sales for product X
• Range Query
  – All sales for year 1999 for a specific product group for a specific customer group

© 2000 FORWISS, 6 Design Goals
• clustering tuples on disk pages while preserving spatial proximity
• efficient incremental organization
• logarithmic worst-case guarantees for insertion, deletion and point queries
• efficient handling of range queries
• good average memory utilization

© 2000 FORWISS, 7 Z-Ordering [Figure: panels (a) and (b)]

© 2000 FORWISS, 8 Z-regions/UB-Trees
A Z-region [α : β] is the space covered by an interval on the Z-curve and is defined by two Z-addresses α and β.
UB-Tree partitioning: [0 : 3], [4 : 20], [21 : 35], [36 : 47], [48 : 63]
[Figure: Z-region [4 : 20]; point data creating the UB-Tree on the left for a page capacity of 2 points]
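A Z-address can be computed by bit interleaving, and the region that holds it can be found with a binary search over the region boundaries. The following is a minimal Python sketch under stated assumptions: 3 bits per attribute (an 8 x 8 universe), the first attribute contributing the more significant bit at each level, and hypothetical helper names `z_address` and `find_region`. Only the partitioning itself is taken from the slide.

```python
from bisect import bisect_left

def z_address(coords, bits):
    """Interleave the bits of all coordinates, most significant bit first.

    Assumption: at each bit level the first coordinate comes first.
    """
    z = 0
    for bit in range(bits - 1, -1, -1):
        for c in coords:
            z = (z << 1) | ((c >> bit) & 1)
    return z

# Upper bounds of the Z-regions [0:3], [4:20], [21:35], [36:47], [48:63].
REGION_ENDS = [3, 20, 35, 47, 63]

def find_region(z):
    """Return the (start, end) Z-region that contains Z-address z."""
    i = bisect_left(REGION_ENDS, z)
    start = 0 if i == 0 else REGION_ENDS[i - 1] + 1
    return (start, REGION_ENDS[i])

z = z_address((2, 3), bits=3)
print(z, find_region(z))
```

Because every region is an interval on the Z-curve, a one-dimensional ordered structure is enough to locate the page for any point.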

© 2000 FORWISS, 9 UB-Tree Insertion 1/2/3/4

© 2000 FORWISS, 10 UB-Tree Insertion 18/19

© 2000 FORWISS, 11 Multidimensional Range Query
SELECT * FROM table
WHERE (A_1 BETWEEN a_1 AND b_1)
  AND (A_2 BETWEEN a_2 AND b_2)
  AND ...
  AND (A_n BETWEEN a_n AND b_n)

© 2000 FORWISS, 12 Theoretical Comparison of the Range Query Performance
ideal case: s_1 * s_2 * P
multidimensional index: s_1' * s_2' * P
multiple B-Trees, bitmap indexes: s_1 * I_1 + s_2 * I_2 + s_1 * s_2 * T
composite key clustering B-Tree: s_1 * P
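To get a feel for these formulas, the sketch below evaluates three of them for one set of numbers. The slide does not define its symbols, so the reading used here is an assumption: s_1 and s_2 are the selectivities of the two restrictions, P is the number of table pages, I_1 and I_2 are the page counts of the two secondary indexes, and T is the number of tuples. All input values are hypothetical. The multidimensional-index row is left out because its adjusted selectivities are not given.

```python
def page_accesses(s1, s2, P, I1, I2, T):
    """Evaluate the slide's cost formulas (symbol meanings are assumed)."""
    return {
        "ideal case": s1 * s2 * P,
        "multiple B-Trees, bitmap indexes": s1 * I1 + s2 * I2 + s1 * s2 * T,
        "composite key clustering B-Tree": s1 * P,
    }

# Hypothetical workload: both restrictions select 12.5% of their attribute range.
costs = page_accesses(s1=0.125, s2=0.125, P=6400, I1=400, I2=400, T=640_000)
for method, cost in costs.items():
    print(f"{method}: {cost:.0f}")
```

With these inputs the composite key only benefits from the first restriction, and the secondary-index plan is dominated by the s_1 * s_2 * T term for fetching the qualifying tuples.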

© 2000 FORWISS, 13
    rangeQuery(Tuple ql, Tuple qh) {
        Zaddress start = Z(ql);   // Z-address of the lower corner of the query box
        Zaddress cur = start;
        Zaddress end = Z(qh);     // Z-address of the upper corner of the query box
        Page page = {};
        while (1) {
            cur = getRegionSeparator(cur);   // end of the Z-region containing cur
            page = getPage(cur);
            outputMatchingTuples(page, ql, qh);
            if (cur >= end) break;
            cur = getNextZAddress(cur, start, end);   // next Z-address inside the query box
        }
    }
[Figure: B*-Tree, linear Z-space, UB-Tree]
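The loop above can be sketched in runnable form as follows. This is a minimal Python sketch under stated assumptions: a two-dimensional 8 x 8 universe (3 bits per attribute), the region partitioning from the Z-regions slide, hypothetical sample points, and a brute-force `next_z_in_box` standing in for `getNextZAddress`, which the summary slide says is done with bit operations in the real system.

```python
from bisect import bisect_left

BITS = 3                              # assumed: 3 bits per attribute
REGION_ENDS = [3, 20, 35, 47, 63]     # Z-region upper bounds from the slide
POINTS = [(0, 1), (1, 1), (2, 1), (2, 3), (4, 0),   # hypothetical data
          (5, 1), (4, 2), (6, 1), (5, 6), (7, 7)]

def z_address(coords, bits=BITS):
    """Bit-interleave the coordinates (first attribute first)."""
    z = 0
    for bit in range(bits - 1, -1, -1):
        for c in coords:
            z = (z << 1) | ((c >> bit) & 1)
    return z

def z_decode(z, dims=2, bits=BITS):
    """Invert the interleaving: recover the coordinates of a Z-address."""
    coords = [0] * dims
    for i in range(dims * bits):
        bit = (z >> (dims * bits - 1 - i)) & 1
        coords[i % dims] = (coords[i % dims] << 1) | bit
    return tuple(coords)

def in_box(p, ql, qh):
    return all(lo <= v <= hi for v, lo, hi in zip(p, ql, qh))

def next_z_in_box(cur, ql, qh):
    """Brute-force stand-in for getNextZAddress: first address after cur in the box.

    Only called while cur < Z(qh), and Z(qh) is in the box, so it terminates.
    """
    z = cur + 1
    while not in_box(z_decode(z), ql, qh):
        z += 1
    return z

def range_query(pages, ql, qh):
    cur, end = z_address(ql), z_address(qh)
    hits, pages_read = [], 0
    while True:
        idx = bisect_left(REGION_ENDS, cur)   # region containing cur
        cur = REGION_ENDS[idx]                # getRegionSeparator
        pages_read += 1                       # getPage
        hits.extend(p for p in pages[idx] if in_box(p, ql, qh))
        if cur >= end:
            break
        cur = next_z_in_box(cur, ql, qh)
    return hits, pages_read

# Distribute the sample points over the pages of their Z-regions.
PAGES = [[] for _ in REGION_ENDS]
for point in POINTS:
    PAGES[bisect_left(REGION_ENDS, z_address(point))].append(point)

hits, pages_read = range_query(PAGES, ql=(2, 1), qh=(5, 3))
print(hits, pages_read)
```

For this query box the loop touches three of the five pages: the regions [0 : 3] and [48 : 63] do not intersect the box and are skipped by the jump to the next Z-address inside it.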

© 2000 FORWISS, 31 Range Queries and Data Distributions

© 2000 FORWISS, 32 Growing Databases 1000 tuples tuples

© 2000 FORWISS, 33 Summary UB-Trees
✓ 50% storage utilization, dynamic updates
✓ Efficient Z-address calculation (bit-interleaving)
✓ Logarithmic performance for the basic operations
✓ Efficient range query algorithm (bit-operations)
✓ Prototype UB/API above RDBMS (Oracle 8, Informix, DB2 UDB, TransBase, MS SQL 7.0) using ESQL/C
• Patent application

© 2000 FORWISS, 34 Standard Query Pattern
SELECT * FROM table
WHERE (A_1 BETWEEN a_1 AND b_1)
  AND (A_2 BETWEEN a_2 AND b_2)
  AND ...
  AND (A_n BETWEEN a_n AND b_n)
ORDER BY A_i, A_j, A_k, ...
(GROUP BY A_i, A_j, A_k, ...)

© 2000 FORWISS, 35 Z-Order/Tetris Order
$T_j(x) = x_j \circ Z(x_1, \ldots, x_{j-1}, x_{j+1}, \ldots, x_d)$
[Figure: Z-order and Tetris order over the axes A_1 and A_2]
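The Tetris order places the sort attribute in front of the Z-address of the remaining attributes. A minimal Python sketch of that concatenation is shown here. It assumes fixed-width attributes with `bits` bits each and a 0-based index `j` (the slide counts from 1); the function names are hypothetical.

```python
def z_address(coords, bits):
    """Bit-interleave the coordinates (first attribute first)."""
    z = 0
    for bit in range(bits - 1, -1, -1):
        for c in coords:
            z = (z << 1) | ((c >> bit) & 1)
    return z

def tetris_address(coords, j, bits):
    """T_j(x): x_j as the high-order part, then Z of the other attributes."""
    rest = coords[:j] + coords[j + 1:]
    return (coords[j] << (bits * len(rest))) | z_address(rest, bits)

points = [(5, 1), (2, 3), (4, 2), (2, 1)]
# Sorting by T_j orders the points by attribute j first.
print(sorted(points, key=lambda p: tetris_address(p, 1, bits=3)))
```

Since x_j occupies the most significant bits, ordering by the Tetris address yields the points in ascending order of attribute j, with ties broken by the Z-order of the other attributes.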

© 2000 FORWISS, 36 The Tetris Algorithm sort direction

© 2000 FORWISS, 37 Summary Tetris
• Combines the sort process and the evaluation of multi-attribute restrictions in one processing step
• I/O time linear with respect to result set size
• Temporary storage sublinear with respect to result set size
• Sorting is no longer a “blocking operation”
• Patent application
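The idea of a non-blocking sort can be illustrated with a simplified stand-in. The Python sketch below is not the Tetris algorithm itself: it reads each page that intersects the query box once, in order of the smallest value the sort attribute can take inside that region, and emits cached tuples as soon as no unread page can still contain a smaller value. The universe size, the sample pages, the brute-force cell enumeration and the name `sorted_range_scan` are all assumptions made for illustration.

```python
BITS = 3  # assumed: 3 bits per attribute, i.e. an 8 x 8 universe
REGIONS = [(0, 3), (4, 20), (21, 35), (36, 47), (48, 63)]
PAGES = [[(0, 1), (1, 1)], [(2, 1), (2, 3)], [(4, 0), (5, 1)],
         [(4, 2), (6, 1)], [(5, 6), (7, 7)]]   # hypothetical data

def z_decode(z, dims=2, bits=BITS):
    """Recover the coordinates of a bit-interleaved Z-address."""
    coords = [0] * dims
    for i in range(dims * bits):
        bit = (z >> (dims * bits - 1 - i)) & 1
        coords[i % dims] = (coords[i % dims] << 1) | bit
    return tuple(coords)

def in_box(p, ql, qh):
    return all(lo <= v <= hi for v, lo, hi in zip(p, ql, qh))

def sorted_range_scan(regions, pages, ql, qh, j):
    """Yield the tuples inside the box in ascending order of attribute j."""
    todo = []
    for (lo, hi), page in zip(regions, pages):
        # Brute force: cells of this region that lie inside the query box.
        cells = [c for c in map(z_decode, range(lo, hi + 1)) if in_box(c, ql, qh)]
        if cells:
            todo.append((min(c[j] for c in cells), page))
    todo.sort(key=lambda entry: entry[0])
    cache = []
    for i, (_, page) in enumerate(todo):
        cache += [p for p in page if in_box(p, ql, qh)]
        # No unread page can hold a tuple with attribute j below this bound.
        bound = todo[i + 1][0] if i + 1 < len(todo) else float("inf")
        ready = sorted((p for p in cache if p[j] < bound), key=lambda p: p[j])
        cache = [p for p in cache if p[j] >= bound]
        yield from ready

print(list(sorted_range_scan(REGIONS, PAGES, ql=(2, 1), qh=(5, 3), j=0)))
```

In this example the first two result tuples are produced after a single page has been read, and the cache never holds more than two tuples, which mirrors the claims about non-blocking output and small temporary storage.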

© 2000 FORWISS, 38 Integration Issues
• Starting point with TransBase:
  – clustering B*-Tree
  – appropriate data type for Z-values: variable bit strings
• Modifications to B*-Tree in TransBase:
  – support for computed keys:
    » Z-values are only stored in the index, not together with the tuples
    » tuples are stored in Z-order
  – generalization of splitting algorithm:
    » computed page separators for improved space partitioning

© 2000 FORWISS, 39
• Minor extensions:
• Major extensions:
• New modules:
• NO changes for:
  – DML
  – Multi-user support, i.e., locking, logging facilities ⇒ handled by underlying B*-Tree

© 2000 FORWISS, 40 Summary Integration
• The integration of the UB-Tree was achieved within one year
• TransBase HyperCube has been shipping since Systems 1999 and was awarded the 2001 IT-Prize by EUROCASE and the European Commission
• UB-Trees speed up relational DBMS for multidimensional applications such as Geo-DB and data warehouses by up to two orders of magnitude
• The speedup is even more dramatic for CD-ROM databases (archives)

© 2000 FORWISS, 41 Application Fields of the UB-Tree
• Data Warehouses
  – Measurements with SAP BW Data
    » UB-Tree/API for Oracle
    » UB-Tree on top of Oracle outperforms conventional B-Tree and Bitmap indexes in Oracle!
  – Measurements with the GfK Data Warehouse
    » UB-Tree in TransBase HyperCube
    » significant performance increases (factor of 10)
• Geographic Databases
• “Multidimensional Problems”
  – Archiving Systems, Lifecycle-Management, Data Mining, OLAP, OLTP, etc.

© 2000 FORWISS, 42 The UB-Tree

© 2000 FORWISS, 43 Further Information [Logos: Tetris, HI, MISTRAL, UB-Tree, Temptris]