Physical Database Design and Referential Integrity

Slides:



Advertisements
Similar presentations
Tuning: overview Rewrite SQL (Leccotech)Leccotech Create Index Redefine Main memory structures (SGA in Oracle) Change the Block Size Materialized Views,
Advertisements

SLIDE 1IS 257 – Fall 2010 Physical Database Design and Referential Integrity University of California, Berkeley School of Information IS 257:
SLIDE 1IS 257 – Spring 2004 Physical Database Design and Referential Integrity University of California, Berkeley School of Information Management.
IS 4420 Database Fundamentals Chapter 6: Physical Database Design and Performance Leon Chen.
SLIDE 1IS 257 – Fall 2008 Physical Database Design and Referential Integrity University of California, Berkeley School of Information IS 257:
SLIDE 1IS 257 – Fall 2006 Physical Database Design and Referential Integrity University of California, Berkeley School of Information IS 257:
9/28/2000SIMS 257: Database Management -- Ray Larson More on Physical Database Design and Referential Integrity University of California, Berkeley School.
SLIDE 1IS Fall 2002 Physical Database Design and Referential Integrity University of California, Berkeley School of Information Management.
10/3/2000SIMS 257: Database Management -- Ray Larson Relational Algebra and Calculus University of California, Berkeley School of Information Management.
Chapter 6 Physical Database Design. Introduction The purpose of physical database design is to translate the logical description of data into the technical.
Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall 1 1. Chapter 2: Relational Databases and Multi-Table Queries Exploring Microsoft Office.
With Microsoft Office 2007 Intermediate© 2008 Pearson Prentice Hall1 PowerPoint Presentation to Accompany GO! with Microsoft ® Office 2007 Intermediate.
With Microsoft Access 2007 Volume 1© 2008 Pearson Prentice Hall1 PowerPoint Presentation to Accompany GO! with Microsoft ® Access 2007 Volume 1 Chapter.
Copyright © 2003 by Prentice Hall Module 4 Database Management Systems 1.What is a database? Data hierarchy and data organization Field, record, file,
DAY 15: ACCESS CHAPTER 2 Larry Reaves October 7,
Copyright © Curt Hill Index Creation SQL.
TM 7-1 Copyright © 1999 Addison Wesley Longman, Inc. Physical Database Design.
Physical Database Design Chapter 6. Physical Design and implementation 1.Translate global logical data model for target DBMS  1.1Design base relations.
1 © Prentice Hall, 2002 Chapter 6: Physical Database Design and Performance Modern Database Management 6 th Edition Jeffrey A. Hoffer, Mary B. Prescott,
Chapter 6 1 © Prentice Hall, 2002 The Physical Design Stage of SDLC (figures 2.4, 2.5 revisited) Project Identification and Selection Project Initiation.
CpSc 462/662: Database Management Systems (DBMS) (TEXNH Approach) Constraints, Triggers and Index James Wang.
FALL 2004CENG 351 File Structures and Data Management1 Relational Model Chapter 3.
Database Systems Design, Implementation, and Management Coronel | Morris 11e ©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or.
Oracle 11g: SQL Chapter 4 Constraints.
Chapter 4 Constraints Oracle 10g: SQL. Oracle 10g: SQL 2 Objectives Explain the purpose of constraints in a table Distinguish among PRIMARY KEY, FOREIGN.
SQL/Lesson 7/Slide 1 of 32 Implementing Indexes Objectives In this lesson, you will learn to: * Create a clustered index * Create a nonclustered index.
Physical Database Design Purpose- translate the logical description of data into the technical specifications for storing and retrieving data Goal - create.
Session 1 Module 1: Introduction to Data Integrity
11-1 © Prentice Hall, 2004 Chapter 11: Physical Database Design Object-Oriented Systems Analysis and Design Joey F. George, Dinesh Batra, Joseph S. Valacich,
CDT/1 Creating data tables and Referential Integrity Objective –To learn about the data constraints supported by SQL2 –To be able to relate tables together.
1 CS122A: Introduction to Data Management Lecture #14: Indexing Instructor: Chen Li.
Data Integrity & Indexes / Session 1/ 1 of 37 Session 1 Module 1: Introduction to Data Integrity Module 2: Introduction to Indexes.
Database Constraints ICT 011. Database Constraints Database constraints are restrictions on the contents of the database or on database operations Database.
Database Constraints Ashima Wadhwa. Database Constraints Database constraints are restrictions on the contents of the database or on database operations.
SQL IMPLEMENTATION & ADMINISTRATION Indexing & Views.
Getting started with Accurately Storing Data
Fundamentals of DBMS Notes-1.
Managing Tables, Data Integrity, Constraints by Adrienne Watt
Indexes By Adrienne Watt.
Indexing Structures for Files and Physical Database Design
Record Storage, File Organization, and Indexes
CHAPTER 7 DATABASE ACCESS THROUGH WEB
Chapter 6 - Database Implementation and Use
Constraints and Triggers
Modern Systems Analysis and Design Third Edition
Database Performance Tuning and Query Optimization
What is Database Administration
Module 5: Implementing Data Integrity by Using Constraints
CHAPTER 5: PHYSICAL DATABASE DESIGN AND PERFORMANCE
Introduction lecture1.
Lecturer: Mukhtar Mohamed Ali “Hakaale”
Chapter 11: Indexing and Hashing
Disk storage Index structures for files
Lecture 12 Lecture 12: Indexing.
Teaching slides Chapter 8.
Physical Database Design
Chapter 6: Physical Database Design and Performance
SQL DATA CONSTRAINTS.
CS122 Using Relational Databases and SQL
Unit I-2.
The Physical Design Stage of SDLC (figures 2.4, 2.5 revisited)
Relational Database Design
Chapter 11 Database Performance Tuning and Query Optimization
Database Design: Relational Model
CS122 Using Relational Databases and SQL
CS1222 Using Relational Databases and SQL
CS122 Using Relational Databases and SQL
SQL – Constraints & Triggers
Indexes and more Table Creation
CS122 Using Relational Databases and SQL
Presentation transcript:

Physical Database Design and Referential Integrity University of California, Berkeley School of Information IS 257: Database Management IS 257 – Fall 2015

Lecture Outline Indexes and What to index Parallel storage systems (RAID) Integrity constraints IS 257 – Fall 2015

Indexes Most database applications require: locating rows in tables that match some condition (e.g. SELECT operations) Joining one table with another based on common values of attributes in each table Indexes can greatly speed up these processes and avoid having to do sequential scanning of database tables to resolve queries IS 257 – Fall 2015

Type of Keys Primary keys -- as we have seen before -- uniquely identify a single row in a relational table Secondary keys -- are search keys that may occur multiple times in a table Bitmap Indexes Table of bits where each row represents a distinct key value and each column is a bit – 0 or 1 for each record IS 257 – Fall 2015

Primary Key Indexes In MySQL – will create a unique index In Access – also this will be created automatically when a field is selected as primary key in the table design view select an attribute row (or rows) and clock on the key symbol in the toolbar. The index is created automatically as one with (No Duplicates) In SQL CREATE UNIQUE INDEX indexname ON tablename(attribute); IS 257 – Fall 2015

Secondary Key Indexes In Access -- Secondary key indexes can be created on any field. In the table design view, select the attribute to be indexed In the “Indexed” box on the General field description information at the bottom of the window, select “Yes (Duplicates OK)” In SQL (including MySQL) CREATE INDEX indxname on tablename(attribute); MySQL suggests that CREATE TABLE be used for most index creation. E.g. adding “create table..,idnum int index using btree,” IS 257 – Fall 2015

MySQL Index Creation syntax CREATE [ONLINE|OFFLINE] [UNIQUE|FULLTEXT|SPATIAL] INDEX index_name [index_type] ON tbl_name (index_col_name,...) [index_option] ... index_col_name: col_name [(length)] [ASC | DESC] index_type: USING {BTREE | HASH} index_option: KEY_BLOCK_SIZE [=] value | index_type | WITH PARSER parser_name | COMMENT 'string’ CREATE INDEX cannot be used to create a PRIMARY KEY; use ALTER TABLE instead IS 257 – Fall 2015

When to Index Tradeoff between time and space: Indexes permit faster processing for searching But they take up space for the index They also slow processing for insertions, deletions, and updates, because both the table and the index must be modified Thus they SHOULD be used for databases where search is the main mode of interaction The might be skipped if high rates of updating and insertions are expected, and access or retrieval operations are rare IS 257 – Fall 2015

When to Use Indexes Rules of thumb Indexes are most useful on larger tables Specify a unique index for the primary key of each table (automatically done for many DBMS) Indexes are most useful for attributes used as search criteria or for joining tables Indexes are useful if sorting is often done on the attribute Most useful when there are many different values for an attribute Some DBMS limit the number of indexes and the size of the index key values Some indexes will not retrieve NULL values IS 257 – Fall 2015

Lecture Outline Review Access Methods Indexes and What to index Physical Database Design Access Methods Indexes and What to index Parallel storage systems (RAID) Integrity constraints IS 257 – Fall 2015

Parallel Processing with RAID In reading pages from secondary storage, there are often situations where the DBMS must retrieve multiple pages of data from storage -- and may often encounter rotational delay seek positioning delay in getting each page from the disk IS 257 – Fall 2015

Disk Timing (and Problems) Rotational Delay Read Head fingerprint Hair Seek Positioning Delay A rotational delay is the amount of time between information requests and how long it takes the hard drive to move to the sector where the requested data is located. In other words, it is a time measurement, in milliseconds (ms), of how long before a rotating drive can transfer data. Seek time: Say you're reading some data from the (0,4). You receive instructions to read from track (2,5). The time it takes for you to move from track 0 to track 2 is seek time. IS 257 – Fall 2015

IS 257 – Fall 2015

RAID Provides parallel disks (and software) so that multiple pages can be retrieved simultaneously RAID stands for “Redundant Arrays of Inexpensive Disks” invented by Randy Katz and Dave Patterson here at Berkeley Some manufacturers have renamed the “inexpensive” part (for obvious reasons) IS 257 – Fall 2015

RAID Technology Parallel Writes Disk 2 Disk 3 Disk 4 Disk 1 1 2 3 4 5 6 7 8 9 10 11 12 * * * * Reads Stripe One logical disk drive IS 257 – Fall 2015

Raid 0 1 2 3 4 5 6 7 8 9 10 11 12 * * * * One logical disk drive Parallel Writes Disk 2 Disk 3 Disk 4 Disk 1 1 2 3 4 5 6 7 8 9 10 11 12 * * * * Reads Stripe One logical disk drive IS 257 – Fall 2015

Raid 1 provides full redundancy for any data stored Parallel Writes Disk 1 Disk 2 Disk 3 Disk 4 1 1 2 2 Stripe 3 3 4 4 Stripe 5 5 6 6 Stripe * * * * * * * * * * * * Parallel Reads Raid 1 provides full redundancy for any data stored IS 257 – Fall 2015

RAID-2 1a 1b ecc ecc 2a 2b ecc ecc 3a 3b ecc ecc * * * * Writes span all drives Disk 2 Disk 3 Disk 4 Disk 1 1a 1b ecc ecc 2a 2b ecc ecc 3a 3b ecc ecc * * * * Reads span all drives Stripe Raid 2 divides blocks across multiple disks with error correcting codes IS 257 – Fall 2015

RAID-3 1a 1b 1c ecc 2a 2b 2c ecc 3a 3b 3c ecc * * * * Writes span all drives Disk 2 Disk 3 Disk 4 Disk 1 1a 1b 1c ecc 2a 2b 2c ecc 3a 3b 3c ecc * * * * Reads span all drives Stripe Raid 3 divides very long blocks across multiple disks with a single drive for ECC IS 257 – Fall 2015

Raid 4 like Raid 3 for smaller blocks with multiple blocks per stripe Disk 2 Disk 3 Disk 4 Disk 1 1 2 3 ecc 4 5 6 ecc 7 8 9 ecc * * * * Stripe Parallel Writes Reads Raid 4 like Raid 3 for smaller blocks with multiple blocks per stripe IS 257 – Fall 2015

RAID-5 1 2 3 4 5 6 7 8 9 10 11 12 * * * * ecc ecc ecc ecc Parallel Writes Disk 2 Disk 3 Disk 4 Disk 1 1 2 3 4 5 6 7 8 9 10 11 12 ecc ecc ecc ecc * * * * Reads Stripe Raid 5 divides blocks across multiple disks with error correcting codes IS 257 – Fall 2015

RAID for DBMS What works best for Database storage? RAID-1 is best when 24/7 fault tolerant processing is needed RAID-5 is best for read-intensive applications with very large data sets IS 257 – Fall 2015

Lecture Outline Review Access Methods Indexes and What to index Physical Database Design Access Methods Indexes and What to index Parallel storage systems (RAID) Integrity constraints IS 257 – Fall 2015

Integrity Constraints The constraints we wish to impose in order to protect the database from becoming inconsistent. Five types Required data attribute domain constraints entity integrity referential integrity enterprise constraints IS 257 – Fall 2015

Required Data Some attributes must always contain a value -- they cannot have a null For example: Every employee must have a job title. Every diveshop diveitem must have an order number and an item number. IS 257 – Fall 2015

Attribute Domain Constraints Every attribute has a domain, that is a set of values that are legal for it to use For example: The domain of sex in the employee relation is “M” or “F” Domain ranges can be used to validate input to the database. IS 257 – Fall 2015

Entity Integrity The primary key of any entity cannot be NULL. IS 257 – Fall 2015

Referential Integrity A “foreign key” links each occurrence in a relation representing a child entity to the occurrence of the parent entity containing the matching candidate key Referential Integrity means that if the foreign key contains a value, that value must refer to an existing occurrence in the parent entity For example: Since the ‘Order ID’ in the diveitem relation refers to a particular diveords primary key, that key –and row- must exist for referential integrity to be satisfied IS 257 – Fall 2015

Referential Integrity Referential integrity options are declared when tables are defined (in most systems) There are many issues having to do with how particular referential integrity constraints are to be implemented to deal with insertions and deletions of data from the parent and child tables. IS 257 – Fall 2015

Insertion rules A row should not be inserted in the referencing (child) table unless there already exists a matching entry in the referenced table. Inserting into the parent table should not cause referential integrity problems Unless it is itself a child… Sometimes a special NULL value may be used to create child entries without a parent or with a “dummy” parent. IS 257 – Fall 2015

Deletion rules A row should not be deleted from the referenced table (parent) if there are matching rows in the referencing table (child). Three ways to handle this Restrict -- disallow the delete Nullify -- reset the foreign keys in the child to some NULL or dummy value Cascade -- Delete all rows in the child where there is a foreign key matching the key in the parent row being deleted IS 257 – Fall 2015

Referential Integrity This can be implemented using external programs that access the database newer databases implement executable rules or built-in integrity constraints For example in Diveshop… IS 257 – Fall 2015

DIVEORDS SQL CREATE TABLE `DIVEORDS` ( `Order_No` int(11) NOT NULL, `Customer_No` int(11) default NULL, `Sale_Date` datetime default NULL, `Ship_Via` varchar(255) default NULL, `Ship_Cost` double default NULL, …some things deleted for space…, `VacationCost` double default NULL, PRIMARY KEY (`Order_No`), KEY `Customer_No` (`Customer_No`), KEY `DESTDIVEORDS` (`Destination`), KEY `DIVECUSTDIVEORDS` (`Customer_No`), KEY `DIVEORDSShip_Via` (`Ship_Via`), KEY `SHIPVIADIVEORDS` (`Ship_Via`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1; IS 257 – Fall 2015

DIVEITEM SQL CREATE TABLE `DIVEITEM` ( `Order_No` int(11) NOT NULL, `Item_No` int(11) default NULL, `Rental_Sale` varchar(255) default NULL, `Qty` smallint(6) default NULL, `Line_Note` varchar(255) default NULL, KEY `DIVEORDSDIVEITEM` (`Order_No`), KEY `DIVESTOKDIVEITEM` (`Item_No`), KEY `Item_No` (`Item_No`), FOREIGN KEY (`Order_No`) REFERENCES DIVEORDS(`Order_No`) ON DELETE CASCADE ) ENGINE=InnoDB DEFAULT CHARSET=latin1; Note that only the InnoDB or NDB Engines in MySQL support actual actions and checking for Foreign Keys IS 257 – Fall 2015

Enterprise Constraints These are business rule that may affect the database and the data in it for example, if a manager is only permitted to manage 10 employees then it would violate an enterprise constraint to manage more IS 257 – Fall 2015