On-line Index Selection for Physical Database Tuning Karl Schnaitter UCSC & Aster Data Advisor: Neoklis Polyzotis ISSDM Mentor: John Bent SRL/ISSDM Symposium 2010
Index Tuning for Databases SELECT b FROM R WHERE a = 10 Execution Plan 1 R a b 5 “blue” 10 “red” 97 “green” · · · Reads whole table Discards rows where a ≠ 10 Returns b from remaining rows Table Scan · · · Execution Plan 2 R a b 5 “blue” 10 “red” 97 “green” · · · Only accesses rows where a = 10 May be much faster than a table scan Index Scan index(R(a))
Introduction The index selection problem for databases: The goal is to choose the indexes that improve database performance A crucial part of performance tuning Index selection is very challenging Query benefit must be weighed against the cost of creation, maintenance and storage Dynamic query loads add more difficulty In practice, index selection is typically handled by an experienced database administrator
Previous Work Auto-index tuning is an active research topic Hammer and Chan, SIGMOD 1976 Finkelstein et al., TODS 1988 Chaudhuri and Narasaya, VLDB 1997 Bruno and Chaudhuri, ICDE 2007 …and numerous other references
Off-line Index Advisor Off-line Tuning Tuning Workload User Actual Workload Off-line Index Advisor Recommended Indexes DBA Create/Drop Indexes DBA controls index maintenance DBA must know the workload in advance Limited support for interactive tuning What-if Optimizer Database Server
On-line Tuning Tuning is based on actual workload On-line Index Advisor DBA User Tuning is based on actual workload DBA is out of the loop Performance can become unpredictable What-if Optimizer Create/Drop Indexes Actual Workload Database Server
Our Proposal: Semi-automatic Tuning Feedback Semi-automatic Index Advisor Recommended Indexes User DBA Create/Drop Indexes What-if Optimizer Actual Workload Database Server
Semi-automatic Index Advisor Example Session I like a; I want d I don’t like b Semi-automatic Index Advisor {a, b, c} User DBA {a, d, e, c} {a, f, c} DBA makes decisions on index-maintenance Workload is analyzed online DBA expertise is coupled with index advisor What-if Optimizer Create/Drop Indexes Actual Workload Database Server
Our Proposal: Semi-automatic Tuning Feedback Semi-automatic Index Advisor Recommended Indexes User DBA Create/Drop Indexes What-if Optimizer Actual Workload Database Server
Generating Recommendations workload candidate selection User index candidates Work Function Algorithm (WFA) of Borodin & El Yaniv 1998 + performance guarantees - exponential complexity recommendation logic Work Function Algorithm recommended index set DBA
Example of Index Interaction SELECT b FROM R WHERE a = 10 Execution Plan 1 R a b 5 “blue” 10 “red” 97 “green” · · · Only accesses rows where a = 10 May be much faster than a table scan Index Scan index(R(a)) Execution Plan 2 ++ Example of index interaction Only accesses index If plan 2 is cheaper, the index on R.a becomes obsolete Index-Only Scan index(R(a,b))
Interaction-Based Partitioning Consider an undirected graph Vertex for each candidate index Edge between interacting indexes ++ Describe stable partitioning We call this a stable partition of the indexes Different components have independent benefits
Improvement: Candidate Partitioning workload candidate selection partitioned index candidates User recommendation logic Need to handle feedback WFA WFA WFA recommended index set DBA
Feedback Model User provides votes for some indexes Positive votes for indexes they want Negative votes for indexes they don’t want Votes may be explicit E.g., user may reject the system’s recommendation to create an index Votes may also be implicit Creating an index casts a positive vote Dropping an index casts a negative vote ++ What is the interface for feedback
Handling Feedback Votes create a soft constraint on future recommendations Include indexes with positive votes Exclude indexes with negative votes Reverse these decisions if sufficient evidence is seen after the votes were cast ++ High-level description of how feedback is taken into account
partitioned index candidates Final Solution: WFIT workload candidate selection partitioned index candidates User recommendation logic feedback WFA WFA WFA recommended index set DBA
Behavior of WFIT with Feedback “Good” votes follow the best indexes chosen off-line “Bad” votes are opposite of “Good” votes Normalized against optimal schedule ++ Example of empirical results
Conclusions Semi-automatic tuning is a promising new approach to physical database design Combines best features of existing approaches User feedback is the key ingredient We recommend indexes with provable performance guarantees Using work function algorithm as the foundation Future work may include: More efficient workload analysis A GUI frontend to present recommendations