Issues in Database Performance


Issues in Database Performance
Read/write performance is a hardware issue => throw money at it
Performance of the DB = ability of the engine to locate data
Factors affecting speed of retrieval:
  – Cache (sizing of objects)
  – Access method (table scan / indexes…)
  – Contention between processes
  – Indirect processes (rollback, archiving…)

Reading data in Oracle
1 – Determine how to access the block where the data is located
  – Read the data dictionary (DD), stored in a separate part of the DB
  – The DD may be loaded into cache (aka row cache) to limit I/O activity
2 – The DD indicates the preferred access method for the block
  – B-tree index, partitioning, hash clusters etc…
3 – The search begins, either as a full scan or via an index, until the data is found

Designing DB for reading
Supply methods for high-precision access to data
But some queries will defeat these strategies
  – E.g. credit card transactions: a monthly report of scattered items
No solution = take off-line

Changing data
Oracle makes hard work of changes
  – Rollback data (immediate)
  – Log files (long term)
Changed blocks are read and updated in the buffer
They are released to disk as the buffer is cleared
But rollback information generates most of the I/O operations
In sensitive environments, simultaneous archiving makes it worse (ARCHIVELOG mode)

Indexing Problems
Super-fast indexes need updating as data is changed => DB slows down
More complex index = more complex update mechanism = more rollback…
As the DB physical structure degrades, so does the index (e.g. split blocks)
Performance decreases over time
A rebuild is needed (which interferes with operations)
INDEX_STATS tells you how big the index has grown

Side effects
The role of the DBMS is to enforce data consistency
A reading process may need an older version of the data
  – Need to create a private version of the data => processes that should be read-only require writes
Attempt to describe all the operations required to execute a query that needs old data

Solution
To create an older version of the data:
  – Must apply rollback
  – Find the old rollback block (I/O)
  – Roll back the index (I/O)
  – Find the data (I/O)
  – Roll back the data
  – Read the old data
  – Reverse all changes (multiple I/O)
Significant I/O implications + buffer full of old stuff
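
The rollback steps above can be sketched in a few lines of Python. This is a simplified illustration, not Oracle's actual mechanism: the undo record layout (`scn`, `key`, `old_value`), the function name `read_consistent` and the "one I/O per undo record" cost are all assumptions made for the example.

```python
def read_consistent(block, undo_log, as_of_scn):
    """Return (private_copy, reads): the block as it looked at as_of_scn.

    block     -- current contents of a data block, as a dict (hypothetical)
    undo_log  -- list of {"scn", "key", "old_value"} records (hypothetical)
    as_of_scn -- the point in time the reader needs to see
    """
    private = dict(block)  # private version: the shared copy is never touched
    reads = 1              # reading the current data block
    # Walk the undo records newest first, reversing every later change.
    for record in sorted(undo_log, key=lambda r: r["scn"], reverse=True):
        if record["scn"] <= as_of_scn:
            break          # everything older was already visible to the reader
        reads += 1         # assumption: each undo record costs one extra I/O
        private[record["key"]] = record["old_value"]
    return private, reads


# History: balance was 100, changed to 200 at SCN 10, then to 300 at SCN 20.
current_block = {"balance": 300}
undo = [
    {"scn": 10, "key": "balance", "old_value": 100},
    {"scn": 20, "key": "balance", "old_value": 200},
]
old_view, io_count = read_consistent(current_block, undo, as_of_scn=15)
```

The further back the reader has to go, the more undo records are applied, which is where the extra I/O and the "buffer full of old stuff" come from.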

Conclusions
Writers and readers DO interfere with each other
The mechanisms used by Oracle to bypass locking have performance side effects
Performance comes from minimising I/O
  – i.e. with good access techniques
  – Precision of data location
  – Physical proximity of related data (cf. caching)
However: techniques that reduce the number of I/Os tend to reduce the speed of access!

Indexing example (see figure 3.1)
Alphabetic search using a B-tree index for name = Oscar Smith
  – Split the table in sections (e.g. in half) and read until you find an entry starting beyond the letter “S”, then go back one
  – Do the same in the branch block: Smith, N is the one
  – Then look for the page with Smith, N in its header
  – Scan for the actual entry in the index
  – Read the address
  – Move to the table
  – Read
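
The search steps above can be mimicked with a toy two-level index. This is only a sketch under assumptions: the names, rowids and block layout are invented for illustration, and a single branch block stands in for the root and branch levels.

```python
from bisect import bisect_right

# Hypothetical leaf blocks: sorted (key, rowid) entries.
LEAF_BLOCKS = [
    [("Adams, J", "r1"), ("Brown, K", "r2"), ("Jones, P", "r3")],
    [("Smith, N", "r4"), ("Smith, O", "r5"), ("Taylor, A", "r6")],
    [("Young, B", "r7"), ("Zola, E", "r8")],
]
# The branch block holds the first key of each leaf block.
BRANCH = [block[0][0] for block in LEAF_BLOCKS]
# Hypothetical table, addressed by rowid.
TABLE = {rowid: {"name": key} for block in LEAF_BLOCKS for key, rowid in block}


def lookup(key):
    """Return (row, blocks_read) for an exact-match search on key."""
    reads = 1  # the branch block
    # Find the first branch entry beyond the key, then go back one.
    pos = bisect_right(BRANCH, key) - 1
    if pos < 0:
        return None, reads  # key sorts before every indexed value
    reads += 1  # the leaf block whose first entry is BRANCH[pos]
    for entry_key, rowid in LEAF_BLOCKS[pos]:  # scan for the actual entry
        if entry_key == key:
            reads += 1  # move to the table block given by the rowid
            return TABLE[rowid], reads
    return None, reads


row, blocks_read = lookup("Smith, O")
```

For "Smith, O" the branch search lands on the leaf headed by "Smith, N", matching the slide: one branch read, one leaf read and one table read.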

About B-tree indexes
Root and branch blocks = approx. 2% of the index (small)
In frequent-hit situations, both blocks stay loaded in the data buffer all the time
Then only 2 I/Os may be required:
  – One to read the leaf block
  – One to read the table
In practice, reads in the index and in the table may require reading several blocks
  – ?

Creating and Using Indexes
Important for live access, even more so for querying multiple tables
Value matching is a costly process in an RDB
  – No pointers
  – Connection on a comparison basis only
  – One value against all values in the join field
If the link is between 2 huge tables, performance is low
All RDBs use some form of indexing
Some complexity is involved, as an index can reduce physical I/O at the cost of logical I/O (CPU time)

B-tree indexing
Creating an index means creating a table with X+1 columns
  – X = number of columns in the index
  – Rowid (added field) [table block + row]
The index is then copied into consecutive blocks
The PCTFREE parameter leaves space for data growth
  – A high value will generate many leaf blocks but reduce the occurrence of split blocks over time
A pointer to the previous and next leaf blocks is added in the header of each block

B-tree indexing (2)
Then the branch layer is built, if the index is larger than one block:
  – Collect the first entry + block address of each leaf block
  – Write them into the first-level branch block (packed)
  – If the branch block is full, start a second level of branch blocks, etc.
Room is saved in branch blocks:
  – No forward and backward pointers in branch blocks
  – Entries are “trimmed” to the bare minimum
  – First entries are omitted
See figure 6.1
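
A rough Python sketch of this build process follows. The block capacity of 4 entries is an illustrative assumption, not an Oracle figure, and the sketch ignores entry trimming and the leaf-to-leaf pointers; it only shows how PCTFREE changes the number of leaf blocks and how branch levels stack up.

```python
def build_index(entries, block_capacity=4, pctfree=50):
    """Pack sorted (key, rowid) entries into leaf blocks, then add branch levels.

    Returns a list of levels: levels[0] is the leaf level, levels[-1] the top.
    block_capacity and pctfree are illustrative values only.
    """
    # Leaf blocks are filled to (100 - pctfree)% to leave room for growth.
    per_leaf = max(1, block_capacity * (100 - pctfree) // 100)
    ordered = sorted(entries)
    leaves = [ordered[i:i + per_leaf] for i in range(0, len(ordered), per_leaf)]

    levels = [leaves]
    current = leaves
    # A branch level is needed only while the level below has several blocks.
    while len(current) > 1:
        # Collect the first key of each block plus its position ("address").
        pointers = [(block[0][0], n) for n, block in enumerate(current)]
        # Branch blocks are packed full: no free space is reserved.
        current = [pointers[i:i + block_capacity]
                   for i in range(0, len(pointers), block_capacity)]
        levels.append(current)
    return levels


entries = [(k, f"row{k}") for k in range(10)]
half_full = build_index(entries, pctfree=50)   # many leaf blocks, room to grow
packed = build_index(entries, pctfree=0)       # fewer leaf blocks, no slack
```

With these toy numbers, PCTFREE 50 gives 5 leaf blocks and two branch levels, while PCTFREE 0 gives 3 leaf blocks under a single branch block.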

Syntax
CREATE INDEX index_name ON table_name (field1, field2, …) PCTFREE 50;
Utility programmes assess the performance of indexes, e.g. the INDEX_STATS view

Updating indexes
Index entries are NEVER changed
  – They are marked as deleted and re-inserted
The space made available cannot be reused until the index is rebuilt
Inserts that don’t fit split the block (rarely 50/50!)
If a block becomes empty, it is marked as free, but is never removed
Also, blocks never merge automatically
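
The behaviour described above can be illustrated with a toy leaf block in Python. The class and function names are hypothetical, the capacity of 4 is arbitrary, and the split is done at the midpoint purely for simplicity (the slide notes that real splits are rarely 50/50).

```python
class LeafBlock:
    """Toy leaf block: entries are [key, rowid, deleted] lists, sorted by key."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = []

    def used_slots(self):
        # Deleted entries still occupy space until the index is rebuilt.
        return len(self.entries)

    def live_keys(self):
        return [entry[0] for entry in self.entries if not entry[2]]


def insert(block, key, rowid):
    """Insert an entry; return a new right-hand block if a split was needed."""
    block.entries.append([key, rowid, False])
    block.entries.sort(key=lambda entry: entry[0])
    if len(block.entries) <= block.capacity:
        return None
    half = len(block.entries) // 2  # simplification: split at the midpoint
    right = LeafBlock(block.capacity)
    right.entries = block.entries[half:]
    block.entries = block.entries[:half]
    return right


def update_key(block, old_key, new_key, rowid):
    """An update never edits an entry: mark the old one deleted, insert a new one."""
    for entry in block.entries:
        if entry[0] == old_key and entry[1] == rowid and not entry[2]:
            entry[2] = True
            break
    return insert(block, new_key, rowid)


block = LeafBlock(capacity=4)
for key, rowid in [(10, "r1"), (20, "r2"), (30, "r3")]:
    insert(block, key, rowid)
update_key(block, 20, 25, "r2")      # block now holds 3 live + 1 dead entry
new_block = insert(block, 15, "r5")  # the dead entry forces a split
```

Only four live keys exist after the last insert, yet the block splits because the deleted entry still takes up a slot.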

Some problems
Some situations cannot be addressed with indexes
  – E.g. in a FIFO processing situation (a queue), indexes will prove counterproductive
  – The index may grow out of all proportion even with a small error rate (unsuccessful processing of data)
  – Every time a transaction is added or processed (deleted), the index must change

Alternative: Bitmap indexing
Imagine the following query on a huge table:
  Find customers living in London, with 2 cars and 3 children, occupying a 4-bed house
A conventional index is not useful. Why?
  – Too big
  – If the query changes in any way => new index needed
  – Maintaining a set of indexes for each query would just be too costly
Use a bitmap (see tables 6.1, 6.2 and 6.3)

Bitmap indexes (2)
Special for data-warehouse-type DBs
Build one bitmap for each relevant parameter
Combine bitmaps using the “and” SQL keyword
Also possible to use “not”ing of a bitmap (see tables 6.4 and 6.5)
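
To make the idea concrete, here is a small Python sketch that builds one bitmap per condition and combines them with bitwise AND and NOT. The customer rows are invented, and Python integers stand in for the bitmaps; it shows the principle, not how Oracle stores or evaluates bitmap indexes.

```python
# Hypothetical customer rows: (city, cars, children, bedrooms)
ROWS = [
    ("London", 2, 3, 4),
    ("Paris", 2, 3, 4),
    ("London", 1, 3, 4),
    ("London", 2, 3, 4),
    ("Cork", 2, 0, 2),
]


def build_bitmap(rows, column, value):
    """Return an int whose bit i is set when rows[i][column] == value."""
    bitmap = 0
    for i, row in enumerate(rows):
        if row[column] == value:
            bitmap |= 1 << i
    return bitmap


def matching_rows(bitmap):
    """List the row positions whose bit is set."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# One bitmap per relevant parameter.
london = build_bitmap(ROWS, 0, "London")
two_cars = build_bitmap(ROWS, 1, 2)
three_children = build_bitmap(ROWS, 2, 3)
four_beds = build_bitmap(ROWS, 3, 4)

# "and": rows that satisfy every condition.
hits = london & two_cars & three_children & four_beds

# "not": mask with all rows so that only valid row positions remain.
all_rows = (1 << len(ROWS)) - 1
not_london = all_rows & ~london
```

Changing the query only changes which bitmaps are combined; no new index has to be built.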

Key points to remember
What is the key advantage of a bitmap index?
What situation does it suit best?
Bitmaps can also be packed by Oracle compression features
But size is unpredictable. Why?

Example
Table with 1,000,000 rows
Bitmap on one column that can contain one of 8 different values (e.g. city names)
Scenario 1: the data is such that all rows for the same city are together (125,000 rows each)
  – Write the bitmap
  – Imagine what compression can be achieved
Scenario 2: the cities are in random order, but with the same number of rows for each
  – Same questions

Solution
First scenario:
  – Bitmap for first city: 125,000 ones and 875,000 zeros [trimmed off]
  – Size ~ 125,000 bits or approx. 18 Kbytes
  – Full bitmap = 156 Kbytes
Second scenario:
  – Bitmap for first city: sequences of a 1 and 7 zeros, repeated 125,000 times
  – Size ~ 1,000,000 bits or approx. 140 Kbytes
  – Full bitmap = 1.12 Mbytes
But a B-tree index for such data would be around 12 MB
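
The bit counts in this solution can be checked with a short Python sketch. It models the second scenario the way the slide does (a 1 followed by 7 zeros, repeated) and counts runs of identical bits as a rough indicator of how well a run-length style compression could do. The raw byte figures it produces (about 15.6 KB and 125 KB) are a little lower than the sizes quoted above, which presumably include some storage overhead; that reading is an assumption.

```python
from itertools import groupby

ROWS = 1_000_000
CITIES = 8
PER_CITY = ROWS // CITIES  # 125,000 rows per city

# One byte per row for readability (1 = row belongs to the first city);
# a real bitmap would use one bit per row.
clustered = b"\x01" * PER_CITY + b"\x00" * (ROWS - PER_CITY)
interleaved = (b"\x01" + b"\x00" * (CITIES - 1)) * PER_CITY


def trimmed_bits(bitmap):
    """Bits needed once the trailing zeros are trimmed off."""
    return len(bitmap.rstrip(b"\x00"))


def run_count(bitmap):
    """Number of runs of identical bits: fewer runs compress better."""
    return sum(1 for _ in groupby(bitmap))


def bits_to_bytes(bits):
    return (bits + 7) // 8


for label, bitmap in [("clustered", clustered), ("interleaved", interleaved)]:
    bits = trimmed_bits(bitmap)
    print(label, bits, "bits,", bits_to_bytes(bits), "bytes,",
          run_count(bitmap), "runs")
```

The clustered bitmap is just 2 runs, while the interleaved one has 250,000, which is why the size of a compressed bitmap depends so heavily on how the data is ordered.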

Conclusion
Bitmap indexes work best when combined
They are very quick to build
  – Up to a million rows in 10 seconds
They work best with a limited number of values and high repetition
Best way to deal with huge volumes => make a drastic selection of interesting rows before reading the table
Warning: one entry in a bitmap = hundreds of records => locking can be crazy (OLTP systems) => use for data-warehouse-type applications (no contention)

What Oracle says about it
Use an index for queries with a low hit ratio, or when queries access < 2–4% of the data
Index maintenance is costly, so indexing “just in case” is silly
Must analyse the type of data when deciding what kind of index to use
Do NOT use columns with loads of changes in an index
Use indexed fields in the “where” clause
Can also write queries with the NO_INDEX hint

Administering indexes
Indexes degrade over time
They should stabilise around 75% efficiency, but don’t
Run stats:
  ANALYZE INDEX name VALIDATE STRUCTURE;
  ANALYZE INDEX name COMPUTE STATISTICS;
  ANALYZE INDEX name ESTIMATE STATISTICS SAMPLE 1 PERCENT;
See table 6.6