Information Resources Management April 3, 2001
Agenda n Administrivia n Physical Database Design n Database Integrity n Performance
Administrivia n Exam 2
Regrade Requests Exam SQL n Create Database n Enter query(s) as submitted n Submit to me n Database (electronic) n Graded homework (paper) n Reserve the right to change test data and reexecute query
Foreign Keys n Inserts require every FK value to match a primary key value in the referenced table n Update and delete constraints are also possible
Referential Integrity n ON DELETE CASCADE/RESTRICT/SET NULL n ON UPDATE CASCADE/RESTRICT/SET NULL n Default n ON DELETE RESTRICT n ON UPDATE CASCADE
Example
CREATE TABLE PCAccess
  (PC# INTEGER,
   EmpID CHAR(9),
   AccessType CHAR(15),
   PRIMARY KEY (PC#, EmpID),
   FOREIGN KEY (EmpID) REFERENCES Employee,
   FOREIGN KEY (PC#) REFERENCES PC)
Example - PCAccess Table
Example #1 INSERT INTO PCAccess (PC#) VALUES (4)
Example #2 INSERT INTO PCAccess (PC#, EmpID) VALUES (4,5)
Example #3 UPDATE Employee SET EmpID = 10 WHERE EmpID = 1
Example #4 UPDATE PCAccess SET EmpID = 10 WHERE EmpID = 1
Example #5 DELETE FROM Employee WHERE EmpID = 2
Example #6 DELETE FROM PCAccess WHERE EmpID = 2
Example #7 DELETE FROM PC WHERE PC# = 3
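The outcomes of statements like Examples #1 to #7 can be checked on a small test database. The following is a minimal sketch using Python's built-in sqlite3 module, not the course DBMS. Several things are assumptions: the column is renamed PCNum (a hypothetical stand-in for PC#, which would need quoting), the sample rows are invented because the PCAccess table contents are not reproduced in these notes, and the slide's stated defaults (ON DELETE RESTRICT, ON UPDATE CASCADE) are written out explicitly because SQLite's own defaults differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks FKs only when enabled
conn.executescript("""
CREATE TABLE Employee (EmpID CHAR(9) PRIMARY KEY);
CREATE TABLE PC (PCNum INTEGER PRIMARY KEY);
CREATE TABLE PCAccess (
    PCNum      INTEGER NOT NULL,  -- SQLite needs NOT NULL spelled out for key columns
    EmpID      CHAR(9) NOT NULL,
    AccessType CHAR(15),
    PRIMARY KEY (PCNum, EmpID),
    FOREIGN KEY (EmpID) REFERENCES Employee (EmpID)
        ON DELETE RESTRICT ON UPDATE CASCADE,
    FOREIGN KEY (PCNum) REFERENCES PC (PCNum)
        ON DELETE RESTRICT ON UPDATE CASCADE
);
-- Hypothetical sample rows (assumption, not the slide's data).
INSERT INTO Employee VALUES ('1'), ('2');
INSERT INTO PC VALUES (3), (4);
INSERT INTO PCAccess VALUES (3, '1', 'Read'), (4, '2', 'Write');
""")


def attempt(sql):
    """Run one statement and report whether the integrity checks accepted it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


results = {
    "insert without EmpID": attempt("INSERT INTO PCAccess (PCNum) VALUES (4)"),
    "insert unknown EmpID": attempt(
        "INSERT INTO PCAccess (PCNum, EmpID) VALUES (4, '5')"),
    "update parent key": attempt(
        "UPDATE Employee SET EmpID = '10' WHERE EmpID = '1'"),
    "delete referenced employee": attempt(
        "DELETE FROM Employee WHERE EmpID = '2'"),
    "delete child rows": attempt("DELETE FROM PCAccess WHERE EmpID = '2'"),
    "delete unreferenced employee": attempt(
        "DELETE FROM Employee WHERE EmpID = '2'"),
}
# The parent key update should have cascaded to the child row for PC 3.
cascaded = conn.execute(
    "SELECT EmpID FROM PCAccess WHERE PCNum = 3").fetchone()[0]

for label, outcome in results.items():
    print(f"{label}: {outcome}")
print("PCAccess row for PC 3 now references EmpID", cascaded)
```

With these sample rows, the two inserts and the first employee delete are rejected, while the parent key update cascades into PCAccess. Results on the actual exam database depend on its data and on the DBMS.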
Cascading n Chain followed until the end n Especially for deletes n If mix of CASCADE, RESTRICT, SET NULL n Will get all or nothing
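The "all or nothing" behavior of a mixed chain can be illustrated with three tables. This is a sketch under stated assumptions: Parent, Child, and Grandchild are hypothetical tables, and SQLite (via Python's sqlite3 module) stands in for the course DBMS. The delete cascades from Parent to Child, but a RESTRICT one level further down blocks it, so no rows are removed at any level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable FK enforcement in SQLite
conn.executescript("""
CREATE TABLE Parent (id INTEGER PRIMARY KEY);
CREATE TABLE Child (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES Parent (id) ON DELETE CASCADE
);
CREATE TABLE Grandchild (
    id       INTEGER PRIMARY KEY,
    child_id INTEGER REFERENCES Child (id) ON DELETE RESTRICT
);
INSERT INTO Parent VALUES (1);
INSERT INTO Child VALUES (10, 1), (11, 1);
INSERT INTO Grandchild VALUES (100, 11);
""")


def counts():
    """Row counts for (Parent, Child, Grandchild)."""
    return tuple(
        conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        for table in ("Parent", "Child", "Grandchild")
    )


before = counts()
try:
    with conn:
        # Cascades to Child, where row 11 is protected by the RESTRICT.
        conn.execute("DELETE FROM Parent WHERE id = 1")
    outcome = "deleted"
except sqlite3.IntegrityError:
    outcome = "rejected"
after_blocked = counts()

# Remove the blocking row first; the same delete then runs to the end of the chain.
with conn:
    conn.execute("DELETE FROM Grandchild WHERE id = 100")
    conn.execute("DELETE FROM Parent WHERE id = 1")
after_cleared = counts()

print(outcome, before, after_blocked, after_cleared)
```

Child row 10 has no dependents, yet it survives the first attempt because the statement fails as a whole.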
Update & Delete Constraints
CREATE TABLE T1
  (A CHAR(5),
   B CHAR(5),
   PRIMARY KEY (A, B),
   FOREIGN KEY (A) REFERENCES T2
     ON DELETE RESTRICT
     ON UPDATE RESTRICT)
CREATE TABLE T2
  (C CHAR(5),
   D VARCHAR(30),
   PRIMARY KEY (C))
Constraints (tables T1 and T2) n Want to update the value X1 to X11. What has to happen?
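One possible answer, sketched with Python's sqlite3 module. It assumes X1 is a key value in T2.C that rows of T1 reference through T1.A (the slide's table contents are not reproduced here, so the sample rows are hypothetical). With ON UPDATE RESTRICT the key cannot be changed in place, so the new parent row is inserted first, the child rows are repointed, and the old parent row is deleted last.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable FK enforcement in SQLite
conn.executescript("""
CREATE TABLE T2 (C CHAR(5) PRIMARY KEY, D VARCHAR(30));
CREATE TABLE T1 (
    A CHAR(5) NOT NULL,
    B CHAR(5) NOT NULL,
    PRIMARY KEY (A, B),
    FOREIGN KEY (A) REFERENCES T2 (C)
        ON DELETE RESTRICT ON UPDATE RESTRICT
);
-- Hypothetical sample rows (assumption).
INSERT INTO T2 VALUES ('X1', 'sample description');
INSERT INTO T1 VALUES ('X1', 'Y1'), ('X1', 'Y2');
""")

# Attempt 1: change the parent key directly. RESTRICT should block this.
try:
    with conn:
        conn.execute("UPDATE T2 SET C = 'X11' WHERE C = 'X1'")
    direct_update = "accepted"
except sqlite3.IntegrityError:
    direct_update = "rejected"

# Attempt 2: three steps inside one transaction.
with conn:
    # 1. Create the new parent row, copying the non-key data.
    conn.execute("INSERT INTO T2 (C, D) SELECT 'X11', D FROM T2 WHERE C = 'X1'")
    # 2. Repoint the child rows to the new parent key.
    conn.execute("UPDATE T1 SET A = 'X11' WHERE A = 'X1'")
    # 3. The old parent row is no longer referenced, so it can be deleted.
    conn.execute("DELETE FROM T2 WHERE C = 'X1'")

t2_keys = [row[0] for row in conn.execute("SELECT C FROM T2")]
t1_keys = sorted(row[0] for row in conn.execute("SELECT DISTINCT A FROM T1"))
print(direct_update, t2_keys, t1_keys)
```

Running the three steps in a single transaction keeps other users from seeing the intermediate state in which both X1 and X11 exist.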
Performance n Requires Knowledge of n DBMS n Applications n Data n Users & Expectations n Environment
Performance Classes n OLTP n On-Line Transaction Processing n OLAP n On-Line Analytic Processing n Mix of OLTP and OLAP
OLTP n Throughput Driven n Throughput - number of transactions per unit of time n Lots of Transactions n Mix of Update and Query n High Concurrency
OLAP n Response Time Driven n Response Time - single transaction n Very Large, Possibly Complex, Transactions n Query Evaluation and Optimization
Performance Tuning n Consider the Mix of OLTP & OLAP n Interference Between Types n Example: one large analytic transaction runs daily and the rest are simple transactions; the locks held by the large transaction could prevent the others from running.
Tuning Levels n DBMS n Hardware n Design n Interactions Between Levels
DBMS Parameter Tuning n Specific to DBMS n Buffers - Buffer Pool n Logging - Checkpoints n Lock Management n Space Allocation - Log, Data, Freespace n Thread Management n Operating System Tuning
Hardware Tuning n Memory n CPU n Disk n RAID n Number of Drives n Partitioning n Architecture -- Parallel Systems?
RAID n Redundant Array of Inexpensive Disks n Appears as single disk n Physical storage difference - no database differences n Increase performance n Provide recovery from disk failure
Negative Effect of RAID n MTBF (mean time between failures) n Array MTBF decreases by a factor = # of drives used (some drive in the array fails that much more often)
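The arithmetic behind this effect can be sketched as follows. It assumes drives fail independently at a constant rate and that the array has no redundancy, which is the usual textbook approximation; the function name and the 100,000-hour figure are hypothetical.

```python
def array_mtbf(disk_mtbf_hours: float, n_disks: int) -> float:
    """Approximate time until the first drive failure in an array.

    Assumes independent drives with constant failure rates, so the
    failure rates add and the MTBF divides by the number of drives.
    """
    if n_disks < 1:
        raise ValueError("n_disks must be at least 1")
    return disk_mtbf_hours / n_disks


single = array_mtbf(100_000, 1)  # one drive
eight = array_mtbf(100_000, 8)   # eight drives striped together
print(single, eight)
```

Under these assumptions an eight-drive array sees a drive failure about eight times as often as a single drive, which is why the redundant RAID levels matter.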
How RAID Works n Striping - dividing data equally across all disks n Stripe 1, Stripe 2, Stripe 3, ..., Stripe n
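Striping can be sketched as a mapping from a logical block number to a drive and a stripe. This is a minimal illustration assuming simple round-robin placement with one block per drive per stripe; real controllers differ in stripe unit size and layout, and the function name is hypothetical.

```python
def locate_block(logical_block: int, n_disks: int) -> tuple:
    """Return (disk index, stripe number) for a logical block.

    Round-robin striping: consecutive blocks go to consecutive disks,
    and a new stripe starts each time every disk has received a block.
    """
    if logical_block < 0 or n_disks < 1:
        raise ValueError("block must be >= 0 and n_disks >= 1")
    stripe, disk = divmod(logical_block, n_disks)
    return disk, stripe


# First six logical blocks on a four-disk array.
layout = [locate_block(block, 4) for block in range(6)]
print(layout)
```

Because consecutive blocks land on different drives, a large sequential read can be served by all drives at once.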
RAID Levels n RAID-0 n RAID-1 n RAID-2 n RAID-3 n RAID-4 n RAID-5 n RAID-6
RAID-0 n All disks store unique data n Very fast n No fault tolerance or recovery
RAID-1 n Fully Redundant n Faster Reads/Slower Writes n High fault tolerance -- easy recovery
RAID-2 n Each record spans all drives n Some disks store ECC (error correction codes) n Parity checks allow error detection and correction
RAID-3 n Each record spans all drives n One disk stores ECC n Single-User
RAID-4 n Each record stored on a single disk n One drive for ECC n Multi-user reads; Single-User writes
RAID-5 (Rotating Parity Array) n Each drive has both data and ECC n ECCs rotate to different drives n Multi-user reads and writes
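The slides label the redundant information "ECC". For the single-parity levels it is commonly a bytewise XOR across the data blocks of a stripe, which is what the sketch below uses. It shows how a lost block is rebuilt from the survivors plus parity, and one way parity could rotate across drives. The rotation formula is only one possible layout (an assumption), and the function names and sample bytes are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_disk(stripe: int, n_disks: int) -> int:
    """Drive holding parity for a stripe when parity rotates (one possible layout)."""
    return (n_disks - 1 - stripe) % n_disks


# Three data blocks of one stripe and their parity block.
data = [b"\x0f\xaa", b"\xf0\x55", b"\x33\x33"]
parity = xor_blocks(data)

# Simulate losing the drive that held data[1]; rebuild it from the rest.
rebuilt = xor_blocks([data[0], data[2], parity])

# Parity location for the first five stripes of a four-drive array.
rotation = [parity_disk(stripe, 4) for stripe in range(5)]
print(rebuilt == data[1], rotation)
```

With a dedicated parity drive every write must update that one drive, which is why rotating the parity removes the single-writer bottleneck.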
RAID-6 P+Q Redundancy n P - “parity” n Q - “extra parity” n 2 bits of ECC per 4 bits of data n Handles multiple disk failures n Reed-Solomon codes n Reference: Introduction to the Theory of Error-Correcting Codes, Pless (1989)
Your Mileage May Vary “We note that numerous improvements have been proposed to the basic RAID schemes described here. As a result, sometimes there is confusion about the exact definitions of the different RAID levels.”
RAID Usage n 0, 3, and 5 outperform others n RAID-0 - fastest, no storage cost, but not fault tolerant n RAID-3 - single-user only n RAID-5 - higher speed than single disk, fault tolerant, multi-user, but some storage cost and slower write times
Design Tuning n Transactions n Physical Database
Transaction Tuning n The DBMS optimizes so why worry? n An optimized poorly written transaction can always be outperformed by a well-written nonoptimized one. n EXPLAIN (DB2) n What did the optimizer come up with?
Transaction Tuning n Distributed Databases n Client-Server n Network performance becomes an additional concern
Transaction Tuning n DBA participation in program reviews and walkthroughs n Continuous Monitoring
Transaction Tuning Heuristics n Single query instead of multiple queries n “multiple” includes sub-queries n Avoid long-running transactions n Avoid large quantities of updates n Locking and logging n Reduce number of tables joined
Transaction Tuning Heuristics n Reduce sorting n Return less data rather than more n Don’t shift logic from query to program n Optimizer is likely to be faster n Less data is returned
Transaction Tuning Example n Get the names of all managers whose offices have property listed in Pgh.
SELECT *
FROM Property as P, Office as O, Manager as M, Employee as E
WHERE P.OfficeNbr = O.OfficeNbr
  AND O.OfficeNbr = M.OfficeNbr
  AND M.EmpID = E.EmpID
  AND PropertyID IN
      (SELECT PropertyID
       FROM Property as P2
       WHERE P2.City = 'Pgh'
         AND P2.OfficeNbr = M.OfficeNbr)
Transaction Tuning Example (same query) n More is selected than is needed.
Transaction Tuning Example (same query) n Some joined tables can be eliminated.
Transaction Tuning Example (same query) n Subquery is executed once per office.
Transaction Tuning Example n Version without any joins - 2 single table queries only.
SELECT E.Name
FROM Employee as E
WHERE E.MgrFlag = 1
  AND OfficeNbr IN
      (SELECT OfficeNbr
       FROM Property as P
       WHERE P.City = 'Pgh')
Transaction Tuning Example n Single query with join.
SELECT DISTINCT E.Name
FROM Property as P, Employee as E
WHERE P.OfficeNbr = E.OfficeNbr
  AND E.MgrFlag = 1
  AND P.City = 'Pgh'
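Before comparing the speed of two rewrites, it is worth checking that they return the same answer. The sketch below runs both versions on a tiny SQLite database through Python's sqlite3 module. The table definitions and sample rows are assumptions made for illustration; only the column names used in the queries come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical schema and rows (assumption).
CREATE TABLE Employee (
    EmpID INTEGER PRIMARY KEY, Name TEXT, OfficeNbr INTEGER, MgrFlag INTEGER
);
CREATE TABLE Property (
    PropertyID INTEGER PRIMARY KEY, City TEXT, OfficeNbr INTEGER
);
INSERT INTO Employee VALUES
    (1, 'Ann', 10, 1), (2, 'Bob', 10, 0), (3, 'Cy', 20, 1), (4, 'Di', 30, 1);
INSERT INTO Property VALUES
    (100, 'Pgh', 10), (101, 'Pgh', 10), (102, 'Erie', 20), (103, 'Pgh', 30);
""")

subquery_version = """
    SELECT E.Name
    FROM Employee AS E
    WHERE E.MgrFlag = 1
      AND E.OfficeNbr IN (SELECT P.OfficeNbr FROM Property AS P
                          WHERE P.City = 'Pgh')
"""
join_version = """
    SELECT DISTINCT E.Name
    FROM Property AS P, Employee AS E
    WHERE P.OfficeNbr = E.OfficeNbr
      AND E.MgrFlag = 1
      AND P.City = 'Pgh'
"""
# Same join without DISTINCT, to show why DISTINCT is needed.
join_without_distinct = join_version.replace("SELECT DISTINCT", "SELECT")

names_subquery = sorted(row[0] for row in conn.execute(subquery_version))
names_join = sorted(row[0] for row in conn.execute(join_version))
names_raw_join = sorted(row[0] for row in conn.execute(join_without_distinct))
print(names_subquery, names_join, names_raw_join)
```

On this data both versions return the same two managers, while the join without DISTINCT repeats a manager once per matching property. The two versions can still differ if two managers share a name, since DISTINCT collapses them.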
Transaction Tuning n EXPLAIN (or a similar tool) can help to identify how each transaction will access the data and what temporary tables will have to be created to execute the query n With multiple options, test them n Order of conditions in WHERE can affect the optimization and performance n e.g., put MgrFlag = 1 first
Physical Database Tuning n Indices n Schema Tuning n Retaining Normalization n Denormalization
Indices n Unique n Nonunique n Single Attribute n Multiple Attributes (concatenated or composite key) n Primary Key n Secondary Index
Additional Indices n Index decreases read time but increases update time n Based on queries - even single query n (EXPLAIN) n Indices need reorganization n Inserts, Updates, Deletes n Specify freespace n Reduce frequency of reorganizations
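The effect of adding an index for a specific query can be checked with the optimizer's own report. The sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module as a stand-in for DB2's EXPLAIN; the two tools differ in syntax and output. The table contents and the index name idx_property_city are hypothetical, and the exact wording of the plan text varies between SQLite versions, so only its general shape is relied on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Property "
    "(PropertyID INTEGER PRIMARY KEY, City TEXT, OfficeNbr INTEGER)"
)
# Hypothetical data: every tenth property is in Pgh.
conn.executemany(
    "INSERT INTO Property (City, OfficeNbr) VALUES (?, ?)",
    [("Pgh" if i % 10 == 0 else "Other", i % 50) for i in range(1000)],
)

query = "SELECT PropertyID FROM Property WHERE City = 'Pgh'"


def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


plan_before = plan(query)  # expected to be a full table scan
conn.execute("CREATE INDEX idx_property_city ON Property (City)")
plan_after = plan(query)   # expected to search via the new index

print(plan_before)
print(plan_after)
```

The same check works in reverse: if the plan never mentions an index, that index is costing update time without helping this query.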
Schema Tuning - Staying Normal n Split Tables - Vertical Partitioning n Highly used vs. infrequently used columns n Don’t partition if result will be more joins n Keys are duplicated
Schema Tuning - Staying Normal n Variable length fields (VARCHAR, others) n Indeterminate record lengths n Row locations vary n Vertically partition row into two tables, one with fixed and one with variable columns
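Vertical partitioning can be sketched as two tables that share the same primary key. The example below uses Python's sqlite3 module; the table names EmployeeCore and EmployeeDetail, the Resume column, and the sample rows are hypothetical. SQLite does not store fixed-length rows, so the sketch shows only the shape of the schema and queries, not a storage benefit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable FK enforcement in SQLite
conn.executescript("""
-- Fixed-length, frequently used columns.
CREATE TABLE EmployeeCore (
    EmpID     INTEGER PRIMARY KEY,
    Name      CHAR(30),
    OfficeNbr INTEGER
);
-- Variable-length, infrequently used column; the key is duplicated here.
CREATE TABLE EmployeeDetail (
    EmpID  INTEGER PRIMARY KEY REFERENCES EmployeeCore (EmpID),
    Resume VARCHAR(2000)
);
INSERT INTO EmployeeCore VALUES (1, 'Ann', 10), (2, 'Bob', 20);
INSERT INTO EmployeeDetail VALUES (1, 'Ten years in property management.');
""")

# Common query: touches only the fixed-length table.
hot = conn.execute(
    "SELECT Name FROM EmployeeCore WHERE OfficeNbr = 10").fetchall()

# Rare query: needs both halves, so the split costs a join.
full = conn.execute("""
    SELECT C.Name, D.Resume
    FROM EmployeeCore AS C
    LEFT JOIN EmployeeDetail AS D ON D.EmpID = C.EmpID
    ORDER BY C.EmpID
""").fetchall()
print(hot, full)
```

If most queries looked like the second one, the split would add joins rather than remove work, which is the case the earlier slide warns against.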
Schema Tuning - Leaving Normal n Normalization n Eliminates duplication n Reduces anomalies n Does not result in efficiency n Denormalize for performance
Denormalization Warnings n Increases chance of errors or inconsistencies n May result in reprogramming if business rules change n Optimizes based on current transaction mix n Increases duplication and space required n Increases programming complexity n Always normalize first then denormalize
Denormalization n Partition Rows n Combine Tables n Combine and Partition n Replicate Data
Combining Opportunities n One-to-one (optional) n allow nulls n Many-to-many (assoc. entity) n 2 tables instead of 3 n Reference data (one-to-many) n “one” not used elsewhere n few of “many”
Combining Examples n Employee-Spouse (name and SSN only) n Owner-PctOwned - Property n few owners with multiple properties n Property-Type (description) n one type per property
Partitioning n Horizontal n By row type n Separate processing by type n Supertype/subtype decision n Vertical (already seen) n Both
Replication n Intentionally repeating data n Example: Owner-PctOwned-Property n Owner includes PctOwned & PropertyID n Property includes majority OwnerSSN and PctOwned
Performance Tuning n Not a one-time event n Monitoring probably more important n Things change n applications, database (table) sizes, data characteristics n hardware, operating system, DBMS