Prof. R. Bayer, Ph.D. Dr. Volker Markl

Slides:



Advertisements
Similar presentations
Vorlesung Datawarehousing Table of Contents Prof. Rudolf Bayer, Ph.D. Institut für Informatik, TUM SS 2002.
Advertisements

DBMS 2001Notes 4.2: Hashing1 Principles of Database Management Systems 4.2: Hashing Techniques Pekka Kilpeläinen (after Stanford CS245 slide originals.
Hashing and Indexing John Ortiz.
Data Indexing Herbert A. Evans. Purposes of Data Indexing What is Data Indexing? Why is it important?
©Silberschatz, Korth and Sudarshan12.1Database System Concepts Chapter 12: Part A Part A:  Index Definition in SQL  Ordered Indices  Index Sequential.
Advanced Querying OLAP Part 2. Context OLAP systems for supporting decision making. Components: –Dimensions with hierarchies, –Measures, –Aggregation.
CS CS4432: Database Systems II. CS Index definition in SQL Create index name on rel (attr) (Check online for index definitions in SQL) Drop.
Chapter 61 Chapter 6 Index Structures for Files. Chapter 62 Indexes Indexes are additional auxiliary access structures with typically provide either faster.
Cloud Computing Lecture Column Store – alternative organization for big relational data.
1 © Prentice Hall, 2002 Physical Database Design Dr. Bijoy Bordoloi.
1 Chapter 12: Indexing and Hashing Indexing Indexing Basic Concepts Basic Concepts Ordered Indices Ordered Indices B+-Tree Index Files B+-Tree Index Files.
The X-Tree An Index Structure for High Dimensional Data Stefan Berchtold, Daniel A Keim, Hans Peter Kriegel Institute of Computer Science Munich, Germany.
Indexing for Multidimensional Data An Introduction.
1 CPS216: Advanced Database Systems Notes 04: Operators for Data Access Shivnath Babu.
Chapter 6 1 © Prentice Hall, 2002 The Physical Design Stage of SDLC (figures 2.4, 2.5 revisited) Project Identification and Selection Project Initiation.
© 1998 FORWISS FORWISS Oracle Measurement Results Prof. Bayer, PhD Dipl.-Inform. Volker Markl Roland Pieringer.
© 1999 FORWISS FORWISS MISTRAL und DWH 6-2 Processing Relational Queries Using the Multidimensional Access Method UB-Tree Prof. R. Bayer, Ph.D. Dr. Volker.
Efficiently Processing Queries on Interval-and-Value Tuples in Relational Databases Jost Enderle, Nicole Schneider, Thomas Seidl RWTH Aachen University,
© 2000 FORWISS, 1 MISTRAL Processing Relational Queries Using a Multidimensional Access Method.
Indexing for Multidimensional Data An Introduction.
© 1999 FORWISS FORWISS MISTRAL Performance of TPC-D Benchmark and Datawarehouses Prof. R. Bayer, Ph.D. Dr. Volker Markl Dept. of Computer Science, Technical.
Reporter : Yu Shing Li 1.  Introduction  Querying and update in the cloud  Multi-dimensional index R-Tree and KD-tree Basic Structure Pruning Irrelevant.
Prof. Bayer, DWH, Ch.5, SS Chapter 5. Indexing for DWH D1Facts D2.
© 1999 FORWISS General Research Report Implementation and Optimization Issues of the ROLAP Algebra F. Ramsak, M.S. (UIUC) Dr. V. Markl Prof. R. Bayer,
Physical Database Design Purpose- translate the logical description of data into the technical specifications for storing and retrieving data Goal - create.
Chapter 8 Physical Database Design. Outline Overview of Physical Database Design Inputs of Physical Database Design File Structures Query Optimization.
Indexing OLAP Data Sunita Sarawagi Monowar Hossain York University.
Database System Concepts, 6 th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 11: Indexing.
CS411 Database Systems Kazuhiro Minami 10: Indexing-1.
7 1 Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 7.6 Advanced Select Queries SQL provides useful functions that.
Query Optimization Cases. D. ChristozovINF 280 DB Systems Query Optimization: Cases 2 Executable Block 1 Algorithm using Indices (if available) Temporary.
Database Management Systems, R. Ramakrishnan and J. Gehrke1 File Organizations and Indexing Chapter 8 Jianping Fan Dept of Computer Science UNC-Charlotte.
CHAPTER 19 Query Optimization. CHAPTER 19 Query Optimization.
Indexing Multidimensional Data
Database System Architecture and Implementation
Data Indexing Herbert A. Evans.
Module 11: File Structure
15.1 – Introduction to physical-Query-plan operators
CPS216: Data-intensive Computing Systems
Indexing Structures for Files and Physical Database Design
CS 540 Database Management Systems
Indexing Goals: Store large files Support multiple search keys
CS 440 Database Management Systems
Physical Database Design
Multidimensional Access Structures
Lecture 16: Data Storage Wednesday, November 6, 2006.
COMP 430 Intro. to Database Systems
Database Management Systems (CS 564)
Database Performance Tuning and Query Optimization
Lecture 11: DMBS Internals
External Memory Hashing
File organization and Indexing
Chapter 11: Indexing and Hashing
Database.
Physical Database Design
CPSC-310 Database Systems
Indexing and Hashing Basic Concepts Ordered Indices
Selected Topics: External Sorting, Join Algorithms, …
Lecture 19: Data Storage and Indexes
The Physical Design Stage of SDLC (figures 2.4, 2.5 revisited)
Database Design and Programming
Overview of Query Evaluation
Chapter 11 Database Performance Tuning and Query Optimization
Indexing 4/11/2019.
Chapter 11: Indexing and Hashing
CPSC-608 Database Systems
Indexing, Access and Database System Architecture
Chapter 11: Indexing and Hashing
Chapter 10.1: UB-tree for Multidimensional Indexing
Chapter 6: UB-tree for Multidimensional Indexing
Presentation transcript:

MISTRAL Processing Relational Queries Using the Multidimensional Access Method UB-Tree Prof. R. Bayer, Ph.D. Dr. Volker Markl Dept. of Computer Science, Technical University Munich and Bavarian Research Center for Knowledgebased Systems (FORWISS)

Agenda 1. Concept of the UB-Tree 2. Range Query Algorithm 3. Tetris Algorithm

Next Generation DB-Applications - huge databases - multidimensional - complex queries Examples: - Datawarehouses - geographic informationsystems - time series

B-Tree and UB-Tree B-Tree UB-Tree invented in 1969, published in 1970/72 enabling technology for all commercial DBMS for 25 years UB-Tree invented in 1996 enabling technology for next generation DB applications

UB-Tree Access method (TransBase HC) enable next generations DB-applications!! buying patterns of consumers geographic data and time series datamining mobile phone calls 10 million users * 10 calls/day = 100 million records caller callee time duration geographic location ~ 100 Bytes/call  10 GB/day = 3.6 TeraByte/year

Design Goals  revolution in DB-applications clustering tuples on disk pages while preserving spatial proximity efficient incremental organization logarithmic worst-case guarantees for insertion, deletion and point queries efficient handling of range queries good average memory utilization  revolution in DB-applications

Z-regions/UB-Trees A Z-region [a : b] is the space covered by an interval on the Z-curve and is defined by two Z-addresses a and b. Z-region [0.1 : 1.1.1] UB-Tree partitioning: 0.1] 1.1.1] 2.1] 3] 4] point data creating the UB-Tree on the left for a page capacity of 2 points

UB-Tree Insertion 1/2/3/4 Note: partioning is not unique!!

UB-Tree Insertion 18/19

Multidimensional Range Query SELECT * FROM table WHERE (A1 BETWEEN a1 AND b1) AND (A2 BETWEEN a2 AND b2) AND ..... (An BETWEEN an AND bn)

Comparison of the Rangequery Performance compound primary B-Tree s1*P multiple B-Trees, bitmap indexes s1*I1+s2*I2+s1*s2*T multidimensional index s1 *s2 *P ideal case s1*s2*P

Example of a Range Query rangequery(ql,qh) = a := Z(ql);  := Z(qh) while a <  p := read page(a) output intersection(p,ql,qh) a := getnext(a,ql,qh) getnext(a,ql,qh) = l := last-intersection-in-region(a,ql,qh) f := first-intersection-in-brother/father(l,ql,qh)) return alpha(f)

Two Visualized Range-Queries

Range Queries in sparsely and densely populated parts of the Universe

Range Queries and Growing Databases 1000 tuples 50 000 tuples

Summary UB-Trees most DBMS applications benefit 50%-83% storage utilization, dynamic updates Efficient Z-address calculation (bit-interleaving) Logarithmic performance for the basic operations Efficient range query algorithm (bit-operations) Prototype UB/API above RDBMS (Oracle 8, Informix, DB2 UDB, TransBase, soon: MS SQL 7.0) using ESQL/C Patent application most DBMS applications benefit

Sorting Query Boxes SELECT * FROM table WHERE (A1 BETWEEN a1 AND b1) AND (A2 BETWEEN a2 AND b2) AND ..... (An BETWEEN an AND bn) ORDER BY Ai, Aj, Ak, ... (GROUP BY Ai, Aj, Ak, ...)

The Tetris Algorithm sort direction

Summary Tetris  speedup for all DB operations Combines sort process and evaluation of multi-attribute restrictions in one processing step I/O-time linear w.r. to result set size temporary storage sublinear w.r. to result set size Sorting no longer a “blocking operation” Patent application  speedup for all DB operations