To Tune or not to Tune? A Lightweight Physical Design Alerter Nico Bruno, Surajit Chaudhuri DMX Group, Microsoft Research VLDB’06.


2 A DBA’s Dilemma
- Physical design tuning is important
  - Workloads and data change over time
  - Installations often become suboptimal
- Current tools: good but expensive
- DBAs:
  - Avoid suboptimal installations
  - Periodically run expensive tools
  - If no improvement, wasted resources
[Diagram: a workload (SELECT …, INSERT …, SELECT …) runs against the DBMS; the Tuner outputs Recommendation: {Index1, Index2, View1, View2}]

3 A Lightweight Alerter
- Low-overhead diagnostics
- Reliable lower-bound improvement
  - No false positives
  - “Proof” with valid configuration
- Upper-bound improvement
  - Reduce false negatives

4 Outline
- Instrumenting the optimizer
  - Access path selection
  - Index requests
- Lower bounds
  - Local transformations
  - Alerting algorithm
- Upper bounds
- Experimental results

5 Access Path Selection
- Single entry point for access-path selection (System-R, Cascades)
- Intercept requests during optimization and save their logical properties for later

6 Access Path Requests
    SELECT T.b
    FROM T1, T2, T3
    WHERE T1.x=T2.y AND T1.w=T3.z
      AND T1.a=5 AND T3.b=8

7 Monitoring Access Path Requests
- “AND/OR trees” encode relationships between requests
- Aggregated across queries
- 2-level normalized AND/OR tree
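The request tree on this slide can be pictured with a small data structure. The sketch below is illustrative only: the `Request` and `AndOr` classes, their field names, and all cost values are assumptions made for this example, not the instrumentation used inside the optimizer. It assumes that children of an AND node can all appear in one plan, while children of an OR node are mutually exclusive alternatives. The `combine` function only flattens top-level AND nodes; it is a step toward, not a full implementation of, the 2-level normalization.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Request:
    """One intercepted access-path request (hypothetical fields)."""
    table: str
    sargable: Tuple[str, ...]   # columns usable for index seeks
    needed: Tuple[str, ...]     # columns the sub-plan must return
    cost: float                 # cost of the sub-plan under the current design


@dataclass
class AndOr:
    """Inner node: 'AND' children coexist, 'OR' children are alternatives."""
    kind: str
    children: List[Union["AndOr", Request]] = field(default_factory=list)


def combine(per_query):
    """Aggregate per-query trees under one AND root, flattening nested ANDs."""
    root = AndOr("AND")
    for tree in per_query:
        if isinstance(tree, AndOr) and tree.kind == "AND":
            root.children.extend(tree.children)
        else:
            root.children.append(tree)
    return root


def leaves(node):
    """Return all requests stored in the tree, left to right."""
    if isinstance(node, Request):
        return [node]
    found = []
    for child in node.children:
        found.extend(leaves(child))
    return found


# Requests loosely modelled on the three-table query of slide 6 (made-up costs).
q1 = AndOr("AND", [
    Request("T1", ("a",), ("a", "x", "w"), 0.08),
    AndOr("OR", [
        Request("T2", ("y",), ("y",), 0.30),   # e.g. index lookups on T2.y
        Request("T2", (), ("y",), 0.25),       # e.g. a full scan of T2
    ]),
    Request("T3", ("b", "z"), ("b", "z"), 0.12),
])
q2 = Request("T1", ("a",), ("a",), 0.05)       # a second, single-request query

workload = combine([q1, q2])
```

Keeping only logical properties and costs per request is what lets later steps reason about new indexes without re-invoking the optimizer.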

8 Local Transformations
- Requests encode properties of any physical plan rooted at the corresponding operator
- Allow cost inferences for varying physical designs without calling the optimizer
- Result is an upper bound on the query cost after true optimization
- If the cost is 0.02, the query is 0.06 faster

9 Impact of Hypothetical Indexes
- Single index, single request
  - Exploits logical information about the request
  - Safe inferences on a subset of valid plans
  - Only need costs, do not “build” plans
- Multiple indexes, multiple requests
  - Analyze all available indexes for each request
  - Exploit the AND/OR tree for multiple requests
  - Yields a lower bound on the cost difference between the original and the new configuration
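The multi-index, multi-request case can be sketched as a recursive walk over the AND/OR tree. This is a simplified model under stated assumptions, not the inference rules of the paper: per-request costs for each hypothetical index are supplied as input (here made-up numbers, including an assumed original cost of 0.08 for request `r1`), AND nodes add the gains of their children, and OR nodes keep only the best alternative. The tuple encoding of the tree and the names `best_cost` and `improvement` are hypothetical.

```python
def best_cost(request, config, current_cost, index_cost):
    """Cheapest known way to serve a request: the current plan, or any
    index in `config` for which a cost estimate exists."""
    candidates = [current_cost[request]]
    candidates += [cost for idx, cost in index_cost[request].items()
                   if idx in config]
    return min(candidates)


def improvement(node, config, current_cost, index_cost):
    """Lower bound on the cost saved under `config` for an AND/OR tree.

    Leaves are request names; inner nodes are ("AND" | "OR", [children]).
    """
    if isinstance(node, str):
        return current_cost[node] - best_cost(node, config,
                                              current_cost, index_cost)
    kind, children = node
    gains = [improvement(child, config, current_cost, index_cost)
             for child in children]
    return sum(gains) if kind == "AND" else max(gains)


# Hypothetical costs: current plan per request, and cost per candidate index.
current_cost = {"r1": 0.08, "r2": 0.30, "r3": 0.25}
index_cost = {
    "r1": {"I1": 0.02},
    "r2": {"I2": 0.10},
    "r3": {"I2": 0.20, "I3": 0.05},
}
tree = ("AND", ["r1", ("OR", ["r2", "r3"])])

gain = improvement(tree, {"I1", "I2"}, current_cost, index_cost)
print(round(gain, 2))
```

Because only a subset of valid plans is considered, a real re-optimization could find something cheaper, so the computed gain is a lower bound on the true improvement.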

10 Alerting Algorithm
- For each request in T, obtain the index that results in the best strategy
- Repeat while the space constraint is not satisfied and the improvement is still large enough
- The AND/OR tree is gathered during the original optimization
  - No additional optimizer calls!
- If the size is between the storage bounds and the improvement is big enough, save the configuration for the alert
- Transformations:
  - Index merge
  - Index deletion
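A greedy loop in the spirit of this slide might look as follows. This is a minimal sketch under several assumptions: requests form a flat AND list, only index deletion is implemented (index merge is omitted), the index to drop is the one losing the least improvement per unit of space freed, and the parameter names `b_min`, `b_max`, and `min_improvement` are invented for the example. All sizes and costs are made up.

```python
def improvement(config, requests):
    """Total cost saved when each request uses its cheapest index in config."""
    total = 0.0
    for current, options in requests:
        best = min([current] + [c for i, c in options.items() if i in config])
        total += current - best
    return total


def alerter(requests, sizes, b_min, b_max, min_improvement):
    """Return (size, improvement, configuration) triples worth alerting on."""
    # Step 1: start from the best index for every request.
    config = set()
    for current, options in requests:
        if options:
            best_idx = min(options, key=options.get)
            if options[best_idx] < current:
                config.add(best_idx)

    alerts = []
    # Step 2: shrink the configuration while it is still useful.
    while config:
        size = sum(sizes[i] for i in config)
        gain = improvement(config, requests)
        if gain < min_improvement:
            break
        if b_min <= size <= b_max:
            alerts.append((size, gain, frozenset(config)))
        if size <= b_min:
            break

        def penalty(idx):
            # Improvement lost per unit of space freed by deleting idx.
            lost = gain - improvement(config - {idx}, requests)
            return lost / sizes[idx]

        config.remove(min(sorted(config), key=penalty))
    return alerts


# Each request: (cost under current design, {candidate index: estimated cost}).
requests = [
    (0.08, {"I1": 0.02}),
    (0.30, {"I2": 0.10}),
    (0.25, {"I2": 0.20, "I3": 0.05}),
]
sizes = {"I1": 10, "I2": 40, "I3": 50}

alerts = alerter(requests, sizes, b_min=20, b_max=60, min_improvement=0.1)
for size, gain, cfg in alerts:
    print(size, round(gain, 2), sorted(cfg))
```

Every quantity used here comes from information captured during normal optimization, which is why the loop needs no additional optimizer calls.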

11 Upper Bounds
- Reduce false negatives
  - Alert if: improvement is at least 25% OR maximum improvement is 75%
- Fast upper bounds
  - Track all requests (not only the AND/OR tree)
  - Group requests by table
  - Calculate “required work”
- Tighter upper bounds
  - Add a new optimization phase that only considers viable plans
  - More expensive, but tightest upper bound
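One possible reading of the alert policy on this slide, written out as code. The function names, the interpretation of "maximum improvement is 75%" as "at least 75%", and the example costs are assumptions for illustration. The guaranteed improvement is derived from an upper bound on the new cost, and the potential improvement from a lower bound on it (the "required work").

```python
def improvement(current_cost, new_cost):
    """Relative improvement of new_cost over current_cost, as a fraction."""
    return 1.0 - new_cost / current_cost


def should_alert(current_cost, cost_upper, cost_lower,
                 guaranteed=0.25, potential=0.75):
    """Alert if the guaranteed gain or the best possible gain is large enough.

    cost_upper: upper bound on the workload cost under a valid configuration,
                which gives a lower bound on the improvement.
    cost_lower: lower bound on the workload cost under any configuration,
                which gives an upper bound on the improvement.
    """
    lower_bound = improvement(current_cost, cost_upper)
    upper_bound = improvement(current_cost, cost_lower)
    return lower_bound >= guaranteed or upper_bound >= potential


print(should_alert(100.0, 70.0, 40.0))  # guaranteed gain of about 30%
print(should_alert(100.0, 90.0, 20.0))  # potential gain of about 80%
print(should_alert(100.0, 90.0, 50.0))  # neither threshold reached
```

The second branch is what reduces false negatives: a weak lower bound alone does not suppress the alert when the upper bound shows that a large improvement is still possible.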

12 Handling Updates
- Update queries are handled as (select core) + (update shell)
- Optimizer instrumentation also gathers update information
- Lower bounds: small changes to the main algorithm (skyline of alternatives, non-monotonic improvement)
- Upper bounds: add the necessary work for update shells

13 Experimental Evaluation
- Real and synthetic databases
- Metrics: execution time and improvement
- Experiments:
  - Monitoring overhead (server optimization)
  - Diagnostics overhead (alerting client)
  - Quality of bounds/recommendation

14 Performance
- Server overhead for upper bounds (lower-bound overhead << 1%)
- Client overhead for lower + upper bounds
- TPC-H database and workloads

15 Varying Workloads
- TPC-H workloads: W1 (first 11 queries), W2 (last 11 queries), W3 (mix)
- Initial design tuned for W1

16 Varying Initial Physical Design
- TPC-H database and workloads
- C_i is the recommendation of the alerter after executing the workload under C_{i-1}

17 Conclusions
- The alerter fills a gap in automatic physical design tools
- Low server/client overhead; can monitor and diagnose very efficiently
- Lower bounds are supported by valid (applicable) configurations
- Upper bounds provide additional flexibility for defining policies

18 Lower and Upper Bounds for Improvement
- Single-query workloads
- TPC-H database and workloads

19 Complex Workloads
- TPC-H
- MIRMS