Physical Database Design CIT 381 - alternate keys - named constraints - indexes.



Constraints
We have seen primary key constraints and NOT NULL constraints. We can name a constraint:

    CREATE TABLE Student (
        STUD_NUM      INTEGER,
        STUD_FNAME    CHAR(10),
        STUD_LNAME    CHAR(20) CONSTRAINT stud_ln NOT NULL,
        STUD_ADDRESS  CHAR(30),
        STUD_DEPT_ID  INTEGER,
        CONSTRAINT stud_pk PRIMARY KEY (STUD_NUM)
    );

A named NOT NULL constraint is written on the column itself; the exact placement of the name varies by DBMS.

Why name constraints?
For easier control:

    ALTER TABLE Student DROP CONSTRAINT stud_ln;

- easy to remove a constraint without rebuilding the table

    SET CONSTRAINTS stud_pk DEFERRED;

- this says do not enforce the constraint until the transaction commits (Informix)
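Deferred checking can be tried out with Python's built-in sqlite3 module. This is only a sketch under stated assumptions: SQLite has no SET CONSTRAINTS statement and can defer only foreign keys, so the example declares a foreign key as DEFERRABLE INITIALLY DEFERRED instead of deferring stud_pk. The two-column tables are cut down for illustration.

```python
import sqlite3

# isolation_level=None leaves transaction control to explicit BEGIN/COMMIT.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE Department (DEPT_ID INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE Student (
        STUD_NUM     INTEGER PRIMARY KEY,
        STUD_DEPT_ID INTEGER,
        CONSTRAINT stud_fk1 FOREIGN KEY (STUD_DEPT_ID)
            REFERENCES Department (DEPT_ID)
            DEFERRABLE INITIALLY DEFERRED
    )
""")

# The student row arrives before its department; the check waits for COMMIT.
conn.execute("BEGIN")
conn.execute("INSERT INTO Student VALUES (1, 10)")
conn.execute("INSERT INTO Department VALUES (10)")
conn.execute("COMMIT")
committed_rows = conn.execute("SELECT COUNT(*) FROM Student").fetchone()[0]

# A violation that is still present at COMMIT is rejected.
conn.execute("BEGIN")
conn.execute("INSERT INTO Student VALUES (2, 99)")  # department 99 never appears
try:
    conn.execute("COMMIT")
    violation_rejected = False
except sqlite3.IntegrityError:
    conn.execute("ROLLBACK")
    violation_rejected = True
```

The first transaction is temporarily inconsistent but commits, because the constraint is checked only at the end. The second is refused at COMMIT.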

UNIQUE
A way to specify alternate keys. Let’s add such a constraint to the Student table - say the student's full name forms another (candidate) key.

    ALTER TABLE Student ADD CONSTRAINT stud_name_key
        UNIQUE (STUD_FNAME, STUD_LNAME);

Also Foreign Key Constraint

    ALTER TABLE Student ADD CONSTRAINT stud_fk1
        FOREIGN KEY (STUD_DEPT_ID) REFERENCES Department (DEPT_ID);

Of course, these constraints can be declared when the table is created (or added in the Relationship View of Access). Naming the constraint is optional.
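As a small check of how these constraints behave, the sketch below declares the UNIQUE and FOREIGN KEY constraints at table creation and then attempts three inserts. It uses Python's built-in sqlite3 module purely for convenience; SQLite does not support ALTER TABLE ... ADD CONSTRAINT, which is why the constraints are declared inside CREATE TABLE. The sample rows and the helper name `accepts` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE Department (DEPT_ID INTEGER PRIMARY KEY);
    CREATE TABLE Student (
        STUD_NUM     INTEGER,
        STUD_FNAME   CHAR(10),
        STUD_LNAME   CHAR(20),
        STUD_DEPT_ID INTEGER,
        CONSTRAINT stud_pk PRIMARY KEY (STUD_NUM),
        CONSTRAINT stud_name_key UNIQUE (STUD_FNAME, STUD_LNAME),
        CONSTRAINT stud_fk1 FOREIGN KEY (STUD_DEPT_ID)
            REFERENCES Department (DEPT_ID)
    );
    INSERT INTO Department VALUES (1);
""")

def accepts(row):
    """Return True if the row is inserted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO Student VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

first_ok = accepts((1, "Ada", "Lovelace", 1))       # satisfies every constraint
duplicate_name = accepts((2, "Ada", "Lovelace", 1))  # violates stud_name_key
unknown_dept = accepts((3, "Alan", "Turing", 99))    # violates stud_fk1
```

Only the first row is stored; the other two are rejected by the alternate key and the foreign key respectively.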

Physical Design
There are four main aspects to physical design:
- ER Model to Relational Model mapping
- Denormalization
- Indexing
- Physical storage issues (such as fragmentation)

Relational Mapping
Here we convert entity-relationship diagrams to relations (= tables):
- Entities become tables
- Relationships become foreign keys, except …
- Many-to-many (non-specific) relationships become tables
- Data types get set, depending on the chosen DBMS (MySQL, Oracle, Access, etc.)
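The many-to-many rule can be sketched as follows. The Course entity and the Enrollment table are hypothetical (they do not appear in the slides), and SQLite via Python's sqlite3 module stands in for whichever DBMS is chosen.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Entities become tables.
    CREATE TABLE Student (STUD_NUM INTEGER PRIMARY KEY, STUD_LNAME CHAR(20));
    CREATE TABLE Course  (COURSE_ID INTEGER PRIMARY KEY, TITLE CHAR(30));

    -- The many-to-many relationship becomes its own table:
    -- one row per (student, course) pair, with a foreign key to each side.
    CREATE TABLE Enrollment (
        STUD_NUM  INTEGER REFERENCES Student (STUD_NUM),
        COURSE_ID INTEGER REFERENCES Course (COURSE_ID),
        PRIMARY KEY (STUD_NUM, COURSE_ID)
    );

    INSERT INTO Student VALUES (1, 'Lovelace');
    INSERT INTO Student VALUES (2, 'Turing');
    INSERT INTO Course VALUES (100, 'Databases');
    INSERT INTO Course VALUES (200, 'Algorithms');
    INSERT INTO Enrollment VALUES (1, 100);
    INSERT INTO Enrollment VALUES (1, 200);
    INSERT INTO Enrollment VALUES (2, 100);
""")

# Following the relationship means joining through the Enrollment table.
courses_of_student_1 = [
    title
    for (title,) in conn.execute(
        "SELECT c.TITLE FROM Enrollment e "
        "JOIN Course c ON c.COURSE_ID = e.COURSE_ID "
        "WHERE e.STUD_NUM = ? ORDER BY c.TITLE",
        (1,),
    )
]
```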

Denormalization (from ER Studio user guide)
Denormalization is an unavoidable part of designing databases. No matter how elegant a logical design can appear on paper, it often breaks down in practice because of the complex or expensive queries required to make it work. Sometimes, the best remedy for the performance problems is to depart from the blueprint, the logical design. Indeed, denormalization is perhaps the most important reason for separating logical and physical designs - you need not compromise your blueprint while still addressing real-world performance problems.

Indexing
An index is a data structure associated with a table that allows faster look-up access to that table.
- Usually it is a B-tree
- Others: hash table (common), R-tree (not common)
- Note: in DB-speak, the plural of index is indexes, not the usual indices.

Creating an index

    CREATE INDEX stud_idx1 ON Student (STUD_NUM);

This creates an index on the primary key. Usually the DBMS does this by default when the primary key is declared. If you expect queries to read that field in descending order, consider instead

    CREATE INDEX stud_idx1 ON Student (STUD_NUM DESC);

Secondary Indexes
If we expect many queries on the student last name:

    CREATE INDEX stud_idx2 ON Student (STUD_LNAME);

… or if we have many queries on the (lastName, firstName) pair:

    CREATE INDEX stud_idx3 ON Student (STUD_LNAME, STUD_FNAME);

If we did not have the UNIQUE constraint, we could have enforced it through the index:

    CREATE UNIQUE INDEX stud_idx3 ON Student (STUD_LNAME, STUD_FNAME);
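The uniqueness enforced by such an index can be checked with a short sketch. It uses Python's built-in sqlite3 module; the trimmed table and the sample rows are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student ("
    "STUD_NUM INTEGER PRIMARY KEY, STUD_FNAME CHAR(10), STUD_LNAME CHAR(20))"
)
conn.execute("CREATE UNIQUE INDEX stud_idx3 ON Student (STUD_LNAME, STUD_FNAME)")

conn.execute("INSERT INTO Student VALUES (1, 'Ada', 'Lovelace')")
# Same first name but a different (last, first) pair: allowed.
conn.execute("INSERT INTO Student VALUES (2, 'Ada', 'Byron')")

try:
    # Repeats the pair ('Lovelace', 'Ada'), so the unique index refuses it.
    conn.execute("INSERT INTO Student VALUES (3, 'Ada', 'Lovelace')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Student").fetchone()[0]
```

Uniqueness applies to the pair as a whole, so either column may repeat on its own.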

B Trees
The most common indexing structure, using a tree structure:
- each node is sized to fit one disk block
- hence smaller search keys increase fan-out
[Figure: example B-tree with a root node above leaf entries 3*, 5*, 7*, 14*, 16*, 19*, 20*, 22*, 24*, 27*, 29*, 33*, 34*, 38*, 39*]
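The link between key size and fan-out can be made concrete with rough arithmetic. The figures below (a 4096-byte block, 8-byte pointers, one million keys) are assumptions chosen for illustration, and the model ignores node headers and partially filled nodes, so real trees will have somewhat lower fan-out.

```python
def fan_out(block_bytes, key_bytes, pointer_bytes=8):
    """Approximate number of (key, pointer) entries that fit in one node."""
    return block_bytes // (key_bytes + pointer_bytes)

def levels_needed(num_keys, fanout):
    """Smallest number of levels h such that fanout**h >= num_keys."""
    levels, reach = 1, fanout
    while reach < num_keys:
        reach *= fanout
        levels += 1
    return levels

small_key_fanout = fan_out(4096, key_bytes=8)    # e.g. an integer key
large_key_fanout = fan_out(4096, key_bytes=56)   # e.g. a long character key

small_key_levels = levels_needed(1_000_000, small_key_fanout)
large_key_levels = levels_needed(1_000_000, large_key_fanout)
```

Under these assumptions the 8-byte key gives a fan-out of 256 and a three-level tree, while the 56-byte key gives a fan-out of 64 and needs a fourth level, which means one more block read per look-up.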

Use of Indexes
- Speed up many sorts of queries
- Assist in computation of join operations
- Used in sorting a table (for ORDER BY or GROUP BY)
Downsides:
- table updates become slower - an insertion into a table requires insertion of the search key into each of its indexes
- indexes can use a lot of space, sometimes more than the table itself
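One way to see whether a query benefits from an index is to ask the DBMS for its query plan. The sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module; other systems have their own EXPLAIN variants, and the wording of the plan text differs between SQLite versions, so the helper `plan` (a name made up here) simply returns whatever description SQLite gives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student ("
    "STUD_NUM INTEGER PRIMARY KEY, STUD_FNAME CHAR(10), "
    "STUD_LNAME CHAR(20), STUD_ADDRESS CHAR(30))"
)

def plan(sql):
    """Return SQLite's plan description for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)

query = "SELECT * FROM Student WHERE STUD_LNAME = 'Lovelace'"

plan_without_index = plan(query)   # no usable index: the whole table is scanned
conn.execute("CREATE INDEX stud_idx2 ON Student (STUD_LNAME)")
plan_with_index = plan(query)      # the plan now names stud_idx2
```

Printing the two strings shows the plan change from a table scan to a search that names stud_idx2.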

From ER Studio user guide
“One purpose of indexes is to improve performance by providing a more efficient mechanism for locating data. Indexes work like a card catalog in a library: instead of searching every shelf for a book, you can find a reference to the book in the card catalog, which directs you to the book’s specific location. Logical indexes store pointers to data so that a search of all of the underlying data is not necessary. Indexes are one of the most important mechanisms for improving query performance.”

“However, injudiciously using indexes can negatively affect performance. You must determine the optimal number of indexes to place on a table, the index type and their placement in order to maximize query efficiency.”

Index Number (from guide)
“While indexes can improve read (query) performance, they slow write (insert, update, and delete) performance. This is because the indexes themselves are modified whenever the data is modified. As a result, you must be judicious in the use of indexes. If you know that a table is subject to a high level of insert, update and delete activity, you should limit the number of indexes placed on the table. Conversely, if a table is basically static, like most lookup tables, then a high number of indexes should not impair overall performance.”

Index Type (from guide)
“Generally, there are two types of queries: point queries, which return a narrow data set, and range queries, which return a larger data set. For those databases that support them, clustered indexes are better suited to satisfying range queries, or a set of index columns that have a relatively low cardinality. Non-clustered indexes are well suited to satisfying point queries.”

Bulk Loading
To insert a large amount of data into a table:
1. Drop all indexes
2. Sort the data to be inserted
3. Insert the data (sorting helps disk blocks line up)
4. Rebuild the indexes
Reconstruction from scratch is often faster than one-by-one insertion.
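The four steps can be sketched as a single function. This is a minimal illustration using Python's sqlite3 module with a two-column Student table and one secondary index; the function name `bulk_load` and the sample rows are made up, and a real load would usually go through the DBMS's own bulk-load utility.

```python
import sqlite3

def bulk_load(conn, rows):
    """Load (STUD_NUM, STUD_LNAME) rows using drop / sort / insert / rebuild."""
    conn.execute("DROP INDEX IF EXISTS stud_idx2")                    # 1. drop indexes
    ordered = sorted(rows)                                            # 2. sort by STUD_NUM
    conn.executemany("INSERT INTO Student VALUES (?, ?)", ordered)    # 3. insert
    conn.execute("CREATE INDEX stud_idx2 ON Student (STUD_LNAME)")    # 4. rebuild
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (STUD_NUM INTEGER PRIMARY KEY, STUD_LNAME CHAR(20))")
conn.execute("CREATE INDEX stud_idx2 ON Student (STUD_LNAME)")

bulk_load(conn, [(3, "Hopper"), (1, "Lovelace"), (2, "Turing")])

loaded = conn.execute("SELECT COUNT(*) FROM Student").fetchone()[0]
index_names = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 'Student'"
    )
]
```

Sorting on the primary key means rows arrive in the order the table stores them, and the index is built once over the full data set instead of being updated row by row.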

Fragmentation
Split the contents of the table …
- into separate locations on disk
- onto several disks
Problem: disk I/O is slow
Two types:
- vertical fragmentation: some columns here, some there
- horizontal fragmentation: some rows here, some there
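The two types can be illustrated logically with a small sketch. All fragment table names below are hypothetical, and every fragment lives in one in-memory SQLite database here; in a real deployment each fragment would be placed on its own disk or storage area using DBMS-specific options that this example does not cover.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (
        STUD_NUM INTEGER PRIMARY KEY, STUD_LNAME CHAR(20),
        STUD_ADDRESS CHAR(30), STUD_DEPT_ID INTEGER
    );
    INSERT INTO Student VALUES (1, 'Lovelace', '12 Main St', 1);
    INSERT INTO Student VALUES (2, 'Turing',   '34 Oak Ave', 2);
    INSERT INTO Student VALUES (3, 'Hopper',   '56 Elm Rd',  1);

    -- Vertical fragmentation: some columns here, some there.
    -- Each fragment keeps the key so rows can be matched up again.
    CREATE TABLE Student_names AS
        SELECT STUD_NUM, STUD_LNAME, STUD_DEPT_ID FROM Student;
    CREATE TABLE Student_addresses AS
        SELECT STUD_NUM, STUD_ADDRESS FROM Student;

    -- Horizontal fragmentation: some rows here, some there.
    CREATE TABLE Student_dept1 AS
        SELECT * FROM Student WHERE STUD_DEPT_ID = 1;
    CREATE TABLE Student_other AS
        SELECT * FROM Student WHERE STUD_DEPT_ID <> 1;
""")

original = conn.execute("SELECT * FROM Student ORDER BY 1").fetchall()

# Vertical fragments are recombined with a join on the key.
from_vertical = conn.execute(
    "SELECT n.STUD_NUM, n.STUD_LNAME, a.STUD_ADDRESS, n.STUD_DEPT_ID "
    "FROM Student_names n JOIN Student_addresses a ON a.STUD_NUM = n.STUD_NUM "
    "ORDER BY 1"
).fetchall()

# Horizontal fragments are recombined with a union.
from_horizontal = conn.execute(
    "SELECT * FROM Student_dept1 UNION ALL SELECT * FROM Student_other ORDER BY 1"
).fetchall()
```

Both reconstructions return the same rows as the original table: vertical fragments cost a join to reassemble, horizontal fragments a union.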

Physical Placement
Put frequently joined tables on separate hard drives. This yields parallel I/O. Alternatively, very frequently joined tables can be merged (denormalized).
Note: about 80% of CPU cycles are spent performing joins.

From ER Studio guide
Two key concerns of every database administrator are free space management and data fragmentation. If you do not properly plan for the volume and growth of your tables and indexes, these two administrative issues could severely impact system availability and performance. Therefore, when designing your physical model, you should consider the initial extent size and logical partition size.