CMU SCS Faloutsos & PavloCMU SCS 15-415/6151 Carnegie Mellon Univ. Dept. of Computer Science 15-415/615 - DB Applications Lecture #18: Physical Database.

Slides:



Advertisements
Similar presentations
CMU SCS Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications C. Faloutsos - A. Pavlo Lecture#2: E-R diagrams.
Advertisements

Physical Database Design and Tuning R&G - Chapter 20 Although the whole of this life were said to be nothing but a dream and the physical world nothing.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide
Database Tuning. Overview v After ER design, schema refinement, and the definition of views, we have the conceptual and external schemas for our database.
CMU SCS Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications Lecture #19 (not in book) Database Design Methodology handout.
CMU SCS Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications Lecture #17: Schema Refinement & Normalization - Normal Forms (R&G,
CMU SCS Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications C. Faloutsos – A. Pavlo Lecture#12: External Sorting.
CMU SCS Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications C. Faloutsos – A. Pavlo Lecture#13: Query Evaluation.
CMU SCS Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications C. Faloutsos – A. Pavlo Lecture#14: Implementation of Relational Operations.
CMU SCS Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications C. Faloutsos – A. Pavlo Lecture#14(b): Implementation of Relational.
CMU SCS Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications C. Faloutsos & A. Pavlo Lecture#6: Rel. model - SQL part1 (R&G, chapter.
Database Management Systems, R. Ramakrishnan and J. Gehrke1 Relational Query Optimization Chapters 14.
Database Management Systems 3ed, R. Ramakrishnan and Johannes Gehrke1 Evaluation of Relational Operations: Other Techniques Chapter 14, Part B.
Database Management Systems, R. Ramakrishnan and Johannes Gehrke1 Evaluation of Relational Operations: Other Techniques Chapter 12, Part B.
Database Management Systems, R. Ramakrishnan and Johannes Gehrke1 Evaluation of Relational Operations: Other Techniques Chapter 12, Part B.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Physical Database Design Section Chapter 20.
1 Normalization. 2 Normal Forms v If a relation is in a certain normal form (BCNF, 3NF etc.), it is known that certain kinds of redundancies are avoided/minimized.
1 Overview of Indexing Chapter 8 – Part II. 1. Introduction to indexing 2. First glimpse at indices and workloads.
Manajemen Basis Data Pertemuan 7 Matakuliah: M0264/Manajemen Basis Data Tahun: 2008.
1 Physical Design: Indexing Yanlei Diao UMass Amherst Feb 13, 2006 Slides Courtesy of R. Ramakrishnan and J. Gehrke.
DB performance tuning using indexes Section 8.5 and Chapters 20 (Raghu)
Physical Database Design R&G Chapter 16 Lecture 26.
1 Physical Database Design and Tuning Module 5, Lecture 3.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Physical Database Design Chapter 20.
1 Evaluation of Relational Operations: Other Techniques Chapter 12, Part B.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Relational Query Optimization Chapter 15.
1 Overview of Indexing Chapter 8 – Part II. 1. Introduction to indexing 2. First glimpse at indices and workloads.
Physical Database Design and Tuning R&G - Chapter 20 Although the whole of this life were said to be nothing but a dream and the physical world nothing.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Overview of Storage and Indexing Chapter 8.
Practical Database Design and Tuning. Outline  Practical Database Design and Tuning Physical Database Design in Relational Databases An Overview of Database.
CSC271 Database Systems Lecture # 30.
Lecture 9 Methodology – Physical Database Design for Relational Databases.
Physical Database Design and Database Tuning, R. Ramakrishnan and J. Gehrke, modified by Ch. Eick 1 Physical Database Design Part II.
Database Management Systems, R. Ramakrishnan and J. Gehrke1 Query Evaluation Chapter 12: Overview.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Physical Database Design and Tuning.
Database Tuning Prerequisite Cluster Index B+Tree Indexing Hash Indexing ISAM (indexed Sequential access)
1 ICS 184: Introduction to Data Management Lecture Note 10 SQL as a Query Language (Cont.)
Chapter 16 Practical Database Design and Tuning Copyright © 2004 Pearson Education, Inc.
Physical Database Design I, Ch. Eick 1 Physical Database Design I About 25% of Chapter 20 Simple queries:= no joins, no complex aggregate functions Focus.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Physical Database Design and Tuning Chapter 20.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Overview of Storage and Indexing Chapter 8.
1 Overview of Storage and Indexing Chapter 8. 2 Data on External Storage  Disks: Can retrieve random page at fixed cost  But reading several consecutive.
Database Management COP4540, SCS, FIU Physical Database Design (2) (ch. 16 & ch. 6)
Introduction to Query Optimization, R. Ramakrishnan and J. Gehrke 1 Introduction to Query Optimization Chapter 13.
Database Management Systems, R. Ramakrishnan and J. Gehrke1 Introduction to Query Optimization Chapter 13.
Physical Database Design I, Ch. Eick 1 Physical Database Design I Chapter 16 Simple queries:= no joins, no complex aggregate functions Focus of this Lecture:
Lec 7 Practical Database Design and Tuning Copyright © 2004 Pearson Education, Inc.
Relational Operator Evaluation. Overview Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g.,
Physical DB Design Jianlin Feng School of Software SUN YAT-SEN UNIVERSITY courtesy of Joe Hellerstein for some slides.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Physical Database Design Section Chapter 20.
1 CS122A: Introduction to Data Management Lecture #15: Physical DB Design Instructor: Chen Li.
Practical Database Design and Tuning
Introduction to Query Optimization
Modeling Your Data Chapter 2 cs542
Evaluation of Relational Operations: Other Operations
CS222: Principles of Data Management Notes #13 Set operations, Aggregation Instructor: Chen Li.
Examples of Physical Query Plan Alternatives
Faloutsos/Pavlo C. Faloutsos – A. Pavlo Lecture#27: Final Review
Practical Database Design and Tuning
Physical Database Design
Chapter 8 – Part II. A glimpse at indices and workloads
Physical Design and Tuning Example
CS222P: Principles of Data Management Notes #13 Set operations, Aggregation, Query Plans Instructor: Chen Li.
Evaluation of Relational Operations: Other Techniques
CSE594: REVIEW.
Physical Database Design and Tuning
Physical Database Design
Evaluation of Relational Operations: Other Techniques
Physical Database Design
Physical Database Design and Tuning
Presentation transcript:

CMU SCS Faloutsos & PavloCMU SCS /6151 Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications Lecture #18: Physical Database Design (R&G ch. 20)

CMU SCS Faloutsos & PavloCMU SCS /6152 Overview Introduction Index selection and clustering Database tuning (de-normalization etc) Impact of concurrency

CMU SCS Faloutsos & PavloCMU SCS /6153 Introduction After ER design, schema refinement, and the definition of views, we have the conceptual and external schemas for our database. Next step?

CMU SCS Faloutsos & PavloCMU SCS /6154 Introduction After ER design, schema refinement, and the definition of views, we have the conceptual and external schemas for our database. Next step? choose indexes, make clustering decisions, and to refine the conceptual and external schemas (if necessary) to meet performance goals. How to decide the above?

CMU SCS Faloutsos & PavloCMU SCS /6155 Introduction How to decide the above? Paraphrasing [Sun Tzu / Sun Wu / Sunzi] Know [the] other, know [the] self, hundred battles without danger

CMU SCS Faloutsos & PavloCMU SCS /6156 Introduction How to decide the above? Paraphrasing [Sun Tzu / Sun Wu / Sunzi] Know [the] other, know [the] self, hundred battles without danger workload Q-opt internals

CMU SCS Faloutsos & PavloCMU SCS /6157 Introduction We must begin by understanding the workload: – The most important queries and how often they arise. – …………………… updates ……………………….. – The desired performance for these queries and updates.

CMU SCS Faloutsos & PavloCMU SCS /6158 Decisions to Make ??

CMU SCS Faloutsos & PavloCMU SCS /6159 Decisions to Make What indexes should we create? For each index, what kind of an index should it be? Should we make changes to the conceptual schema?

CMU SCS Faloutsos & PavloCMU SCS /61510 Decisions to Make What indexes should we create? – Which relations should have indexes? What field(s) should be the search key? Should we build several indexes? For each index, what kind of an index should it be? – Clustered? Hash/tree? Should we make changes to the conceptual schema? – Consider alternative normalized schemas? (Remember, there are many choices in decomposing into BCNF, etc.) – Should we ``undo some decomposition steps and settle for a lower normal form? (Denormalization.) – Horizontal partitioning, replication, views...

CMU SCS Faloutsos & PavloCMU SCS /61511 Overview Introduction Index selection and clustering Database tuning (de-normalization etc) Impact of concurrency

CMU SCS Faloutsos & PavloCMU SCS /61512 Example 1 which index, if any, would you build? SELECT E.ename, D.mgr FROM Emp E, Dept D WHERE D.dname=Toy AND E.dno=D.dno EMPDEPT dnodnamemgr toy enamedno

CMU SCS Faloutsos & PavloCMU SCS /61513 Example 1 Hash index on D.dname supports Toy selection. – Given this, index on D.dno is not needed. Hash index on E.dno allows us to get matching (inner) Emp tuples for each selected (outer) Dept tuple. SELECT E.ename, D.mgr FROM Emp E, Dept D WHERE D.dname=Toy AND E.dno=D.dno EMP DEPT toy dno INL outer inner

CMU SCS Faloutsos & PavloCMU SCS /61514 Example 1 What if WHERE included: ``... AND E.age=25 ? SELECT E.ename, D.mgr FROM Emp E, Dept D WHERE D.dname=Toy AND E.dno=D.dno EMPDEPT dnodnamemgr toy enamednoage

CMU SCS Faloutsos & PavloCMU SCS /61515 Example 1 What if WHERE included: ``... AND E.age=25 ? – Could retrieve Emp tuples using index on E.age, then join with Dept tuples satisfying dname selection. Comparable to strategy that used E.dno index. – So, if E.age index is already created, this query provides much less motivation for adding an E.dno index. SELECT E.ename, D.mgr FROM Emp E, Dept D WHERE D.dname=Toy AND E.dno=D.dno

CMU SCS Faloutsos & PavloCMU SCS /61516 Example 2 SELECT E.ename, D.mgr FROM Emp E, Dept D WHERE E.sal BETWEEN AND AND E.hobby=Stamps AND E.dno=D.dno EMPDEPT dnodnamemgrenamednosalhobby

CMU SCS Faloutsos & PavloCMU SCS /61517 Example 2 Emp should be the outer relation – indices for Dept? –. What index for Emp? SELECT E.ename, D.mgr FROM Emp E, Dept D WHERE E.sal BETWEEN AND AND E.hobby=Stamps AND E.dno=D.dno

CMU SCS Faloutsos & PavloCMU SCS /61518 Example 2 Emp should be the outer relation – indices for Dept? – hash index on D.dno. What index for Emp? SELECT E.ename, D.mgr FROM Emp E, Dept D WHERE E.sal BETWEEN AND AND E.hobby=Stamps AND E.dno=D.dno

CMU SCS Faloutsos & PavloCMU SCS /61519 Example 2 Emp should be the outer relation – indices for Dept? – hash index on D.dno. What index for Emp? – B+ tree on E.sal, OR an index on E.hobby. Only one is probably enough – whichever has better selectivity. As a rule of thumb, equality selections more selective than range selections. Notice: in both examples, the choice of indexes depends on the plan(s) we expect from the optimizer. Have to understand optimizers! SELECT E.ename, D.mgr FROM Emp E, Dept D WHERE E.sal BETWEEN AND AND E.hobby=Stamps AND E.dno=D.dno

CMU SCS Faloutsos & PavloCMU SCS /61520 Clustering and Joins What plan? what clustering? SELECT E.ename, D.mgr FROM Emp E, Dept D WHERE D.dname=Toy AND E.dno=D.dno EMPDEPT dnodnamemgr toy enamedno

CMU SCS Faloutsos & PavloCMU SCS /61521 Clustering and Joins Index on Emp.dno – clustered or not? SELECT E.ename, D.mgr FROM Emp E, Dept D WHERE D.dname=Toy AND E.dno=D.dno EMP DEPT toy dno

CMU SCS Faloutsos & PavloCMU SCS /61522 Clustering and Joins Index on Emp.dno – clustered or not? A: clustered will be much faster SELECT E.ename, D.mgr FROM Emp E, Dept D WHERE D.dname=Toy AND E.dno=D.dno EMP DEPT toy dno

CMU SCS Faloutsos & PavloCMU SCS /61523 Overview Introduction Index selection and clustering Database tuning (de-normalization etc) Impact of concurrency

CMU SCS Faloutsos & PavloCMU SCS /61524 Tuning the Conceptual Schema The choice of conceptual schema should be guided by the workload, in addition to redundancy issues: – We may settle for a 3NF schema rather than BCNF. – Workload may influence the choice we make in decomposing a relation into 3NF or BCNF. – We may further decompose a BCNF schema! – We might denormalize (i.e., undo a decomposition step), or we might add fields to a relation. – We might consider horizontal decompositions.

CMU SCS Faloutsos & PavloCMU SCS /61525 Tuning the Conceptual Schema If such changes are made after a database is in use: called schema evolution Q: How to mask these changes from applications? Ssnname Student Ssnnameyear New_Student

CMU SCS Faloutsos & PavloCMU SCS /61526 Tuning the Conceptual Schema If such changes are made after a database is in use: called schema evolution Q: How to mask these changes from applications? A: Views! Ssnname Student Ssnnameyear New_Student

CMU SCS Faloutsos & PavloCMU SCS /61527 Tuning the Conceptual Schema If such changes are made after a database is in use: called schema evolution Q: How to mask these changes from applications? A: Views! Ssnname student Ssnnameyear new_student create view student as select ssn, name from new_student

CMU SCS Faloutsos & PavloCMU SCS /61528 Tuning the Conceptual Schema The choice of conceptual schema should be guided by the workload, in addition to redundancy issues: – We may settle for a 3NF schema rather than BCNF. – Workload may influence the choice we make in decomposing a relation into 3NF or BCNF. – We may further decompose a BCNF schema! – We might denormalize (i.e., undo a decomposition step), or we might add fields to a relation. – We might consider horizontal decompositions.

CMU SCS Faloutsos & PavloCMU SCS /61529 Example? Q: When would we choose 3NF instead of BCNF?

CMU SCS Faloutsos & PavloCMU SCS /61530 Example? Q: When would we choose 3NF instead of BCNF? A: Student-Teacher-subJect (STJ) S J -> T T -> J and queries ask for all three attributes ( select * )

CMU SCS Faloutsos & PavloCMU SCS /61531 Tuning the Conceptual Schema The choice of conceptual schema should be guided by the workload, in addition to redundancy issues: – We may settle for a 3NF schema rather than BCNF. – Workload may influence the choice we make in decomposing a relation into 3NF or BCNF. – We may further decompose a BCNF schema! – We might denormalize (i.e., undo a decomposition step), or we might add fields to a relation. – We might consider horizontal decompositions.

CMU SCS Faloutsos & PavloCMU SCS /61532 Decomposition of a BCNF Relation Q: Scenario?

CMU SCS Faloutsos & PavloCMU SCS /61533 Decomposition of a BCNF Relation Q: Scenario? A: eg., STUDENT(ssn, name, address, ph#,...) with many queries like select ssn, name from student

CMU SCS Faloutsos & PavloCMU SCS /61534 Tuning the Conceptual Schema The choice of conceptual schema should be guided by the workload, in addition to redundancy issues: – We may settle for a 3NF schema rather than BCNF. – Workload may influence the choice we make in decomposing a relation into 3NF or BCNF. – We may further decompose a BCNF schema! – We might denormalize (i.e., undo a decomposition step), or we might add fields to a relation. – We might consider horizontal decompositions.

CMU SCS Faloutsos & PavloCMU SCS /61535 De-normalization Q: Scenario?

CMU SCS Faloutsos & PavloCMU SCS /61536 De-normalization Q: Scenario? A: E.g., STUDENT (ssn, name) TAKES (ssn, cid, grade) COURSE (cid, cname) –and many queries like: class roster for db- apps Q: resulting table(s) and views?

CMU SCS Faloutsos & PavloCMU SCS /61537 Tuning the Conceptual Schema The choice of conceptual schema should be guided by the workload, in addition to redundancy issues: – We may settle for a 3NF schema rather than BCNF. – Workload may influence the choice we make in decomposing a relation into 3NF or BCNF. – We may further decompose a BCNF schema! – We might denormalize (i.e., undo a decomposition step), or we might add fields to a relation. – We might consider horizontal decompositions.

CMU SCS Faloutsos & PavloCMU SCS /61538 Horizontal Decompositions Sometimes, might want to replace relation by a collection of relations that are selections. Eg., STUDENT (ssn, name, status) decomposed to CurrentStudent (ssn, name, status) Alumni (ssn, name, status) Q: under what scenario would this help performance?

CMU SCS Faloutsos & PavloCMU SCS /61539 Horizontal Decompositions Sometimes, might want to replace relation by a collection of relations that are selections. Eg., STUDENT (ssn, name, status) decomposed to CurrentStudent (ssn, name, status) Alumni (ssn, name, status) Q: under what scenario would this help performance? A: most queries on current, & too large alumni table

CMU SCS Faloutsos & PavloCMU SCS /61540 Horizontal Decompositions Sometimes, might want to replace relation by a collection of relations that are selections. Eg., STUDENT (ssn, name, status) decomposed to CurrentStudent (ssn, name, status) Alumni (ssn, name, status) Q: How to mask the change?

CMU SCS Faloutsos & PavloCMU SCS /61541 Masking Conceptual Schema Changes Masks change But performance-minded users should query the correct table CREATE VIEW STUDENT(ssn, name, status) AS SELECT * FROM CurrentStudent UNION SELECT * FROM Alumni

CMU SCS Faloutsos & PavloCMU SCS /61542 Tuning Queries and Views If a query runs slower than expected, what to check?

CMU SCS Faloutsos & PavloCMU SCS /61543 Tuning Queries and Views If a query runs slower than expected, check –the plan that is used! (and adjust indices/query/views)

CMU SCS Faloutsos & PavloCMU SCS /61544 Tuning Queries and Views If a query runs slower than expected, check –the plan that is used! (and adjust indices/query/views) –whether statistics are too old

CMU SCS Faloutsos & PavloCMU SCS /61545 Tuning Queries and Views If a query runs slower than expected, check –the plan that is used! (and adjust indices/query/views) –whether statistics are too old or –whether an index needs to be re-built, or

CMU SCS Faloutsos & PavloCMU SCS /61546 Tuning Queries and Views Sometimes, the DBMS may not be executing the plan you had in mind. Common areas of weakness:

CMU SCS Faloutsos & PavloCMU SCS /61547 Tuning Queries and Views Sometimes, the DBMS may not be executing the plan you had in mind. Common areas of weakness: – Selections involving null values. – Selections involving arithmetic or string expressions. – Selections involving OR conditions. – Lack of evaluation features like index-only strategies or certain join methods or poor size estimation.

CMU SCS Faloutsos & PavloCMU SCS /61548 Tuning Queries and Views Sometimes, the DBMS may not be executing the plan you had in mind. Common areas of weakness: – Selections involving null values. – Selections involving arithmetic or string expressions. – Selections involving OR conditions. – Lack of evaluation features like index-only strategies or certain join methods or poor size estimation. > 3* salary like %main%

CMU SCS Faloutsos & PavloCMU SCS /61549 Rewriting SQL Queries Complicated by interaction of: – NULLs, duplicates, aggregation, subqueries Guideline: Use only one query block, if possible. SELECT DISTINCT * FROM Sailors S WHERE S.sname IN (SELECT Y.sname FROM YoungSailors Y) ?? =

CMU SCS Faloutsos & PavloCMU SCS /61550 Rewriting SQL Queries Complicated by interaction of: – NULLs, duplicates, aggregation, subqueries Guideline: Use only one query block, if possible. SELECT DISTINCT * FROM Sailors S WHERE S.sname IN (SELECT Y.sname FROM YoungSailors Y) SELECT DISTINCT S.* FROM Sailors S, YoungSailors Y WHERE S.sname = Y.sname =

CMU SCS Faloutsos & PavloCMU SCS /61551 More Guidelines for Query Tuning Minimize the use of DISTINCT : when?

CMU SCS Faloutsos & PavloCMU SCS /61552 More Guidelines for Query Tuning Minimize the use of DISTINCT : when? A1: when duplicates are acceptable, or A2: if answer contains a key. SELECT DISTINCT ssn FROM Student;

CMU SCS Faloutsos & PavloCMU SCS /61553 More Guidelines for Query Tuning Consider DBMS use of index when writing arithmetic expressions: E.age=2*D.age will benefit from index on E.age, but might not benefit from index on D.age !

CMU SCS Faloutsos & PavloCMU SCS /61554 More Guidelines for Query Tuning Minimize the use of GROUP BY and HAVING : SELECT MIN (E.age) FROM Employee E GROUP BY E.dno HAVING E.dno=102 ??

CMU SCS Faloutsos & PavloCMU SCS /61555 More Guidelines for Query Tuning Minimize the use of GROUP BY and HAVING : SELECT MIN (E.age) FROM Employee E GROUP BY E.dno HAVING E.dno=102 SELECT MIN (E.age) FROM Employee E WHERE E.dno=102

CMU SCS Faloutsos & PavloCMU SCS /61556 Guidelines for Query Tuning (Contd.) Avoid using intermediate relations: SELECT * INTO Temp FROM Emp E, Dept D WHERE E.dno=D.dno AND D.mgrname=Joe SELECT T.dno, AVG (T.sal) FROM Temp T GROUP BY T.dno vs. and ???

CMU SCS Faloutsos & PavloCMU SCS /61557 Guidelines for Query Tuning (Contd.) Avoid using intermediate relations: SELECT * INTO Temp FROM Emp E, Dept D WHERE E.dno=D.dno AND D.mgrname=Joe SELECT T.dno, AVG (T.sal) FROM Temp T GROUP BY T.dno vs. SELECT E.dno, AVG (E.sal) FROM Emp E, Dept D WHERE E.dno=D.dno AND D.mgrname=Joe GROUP BY E.dno and

CMU SCS Faloutsos & PavloCMU SCS /61558 Guidelines for Query Tuning (Contd.) Avoid using intermediate relations: SELECT * INTO Temp FROM Emp E, Dept D WHERE E.dno=D.dno AND D.mgrname=Joe SELECT T.dno, AVG (T.sal) FROM Temp T GROUP BY T.dno vs. SELECT E.dno, AVG (E.sal) FROM Emp E, Dept D WHERE E.dno=D.dno AND D.mgrname=Joe GROUP BY E.dno and EmpDept join Group-by selection EmpDept join Group-by selection

CMU SCS Faloutsos & PavloCMU SCS /61559 Guidelines for Query Tuning (Contd.) Avoid using intermediate relations: SELECT * INTO Temp FROM Emp E, Dept D WHERE E.dno=D.dno AND D.mgrname=Joe SELECT T.dno, AVG (T.sal) FROM Temp T GROUP BY T.dno vs. SELECT E.dno, AVG (E.sal) FROM Emp E, Dept D WHERE E.dno=D.dno AND D.mgrname=Joe GROUP BY E.dno and Does not materialize the intermediate reln Temp. If there is a dense B+ tree index on, an index-only plan can be used to avoid retrieving Emp tuples in the second query!

CMU SCS Faloutsos & PavloCMU SCS /61560 Overview Introduction Index selection and clustering Database tuning (de-normalization etc) Impact of concurrency

CMU SCS Faloutsos & PavloCMU SCS /61561 Concurrency Reduce lock durations Reduce hot spots

CMU SCS Faloutsos & PavloCMU SCS /61562 Concurrency Reduce lock durations

CMU SCS Faloutsos & PavloCMU SCS /61563 Concurrency Reduce lock durations –make transactions faster –break long transactions in shorter ones (but...) –build a data warehouse –consider lower isolation level

CMU SCS Faloutsos & PavloCMU SCS /61564 Concurrency Reduce hot spots

CMU SCS Faloutsos & PavloCMU SCS /61565 Concurrency Reduce hot spots –delay operations on hot spots –optimize access patterns –partition (batch) operations on hot spots –choice of index (root of B-tree -> hot spot)

CMU SCS Faloutsos & PavloCMU SCS /61566 Summary Database design consists of several tasks: requirements analysis, conceptual design, schema refinement, physical design and tuning. – In general, have to go back and forth between these tasks to refine a database design, and decisions in one task can influence the choices in another task. Also see the paper by Roussopoulos + Yeh (on the course web site)

CMU SCS Faloutsos & PavloCMU SCS /61567 Summary (contd) workload is vital: – What are the important queries and updates? What attributes/relations are involved? then: – refine conceptual schema and views – tune queries (indices, clustering, re-writing) Know the workload Know the Q-opt internals

CMU SCS Faloutsos & PavloCMU SCS /61568 Summary - schema refinement May choose 3NF or lower normal form over BCNF. May denormalize, or undo some decompositions. May decompose a BCNF relation further! May choose a horizontal decomposition of a relation. Importance of dependency-preservation based upon the dependency to be preserved, and the cost of the IC check (see text) 1 of 3

CMU SCS Faloutsos & PavloCMU SCS /61569 Tuning Queries and Views Sometimes, the DBMS may not be executing the plan you had in mind. Common areas of weakness: – Selections involving null values. – Selections involving arithmetic or string expressions. – Selections involving OR conditions. – Lack of evaluation features like index-only strategies or certain join methods or poor size estimation. > 3* salary like %main% 2 of 3

CMU SCS Faloutsos & PavloCMU SCS /61570 Summary - Tuning So, may have to rewrite the query/view: Avoid nested queries, temporary relations, complex conditions, and operations like DISTINCT and GROUP BY. 3 of 3