Data Warehousing Seminar Chapter 13 Indexing the Warehouse

Presentation transcript:

Data Warehousing Seminar Chapter 13 Indexing the Warehouse M.S. 2 Hyeyoung Cho

Oracle Storage
Logical: Database > Tablespace > Segment > Extent > Oracle block
Physical: Data file > OS block

Data block and Row
Data block: row header, free space, data
Row: row header, column length, column value

What Is an Index?
- A structure separate from the table
- Stores the location of rows based on the specified column values
- Speeds up the retrieval of rows by using a pointer
- Reduces disk I/O by using a rapid path access method to locate the data quickly
- Used and maintained automatically by the Oracle Server
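The idea of a separate structure that stores row locations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `rows` list, the use of a list position as a stand-in for a ROWID, and the function names are hypothetical, and a real Oracle index is a disk-based B-tree rather than an in-memory dictionary.

```python
# Hypothetical table: the list position stands in for a ROWID.
rows = [
    {"id": 7499, "name": "ALLEN"},
    {"id": 7369, "name": "SMITH"},
    {"id": 7521, "name": "WARD"},
    {"id": 7566, "name": "JONES"},
]


def full_scan(table, key):
    """Read rows one by one until the key is found; returns (row, rows_read)."""
    for reads, row in enumerate(table, start=1):
        if row["id"] == key:
            return row, reads
    return None, len(table)


def build_index(table):
    """Separate structure mapping each key value to the location of its row."""
    return {row["id"]: pos for pos, row in enumerate(table)}


def index_lookup(table, index, key):
    """Follow the stored pointer straight to the row; at most one table read."""
    pos = index.get(key)
    if pos is None:
        return None, 0
    return table[pos], 1


id_index = build_index(rows)
print(full_scan(rows, 7566))               # the scan reads 4 rows to reach JONES
print(index_lookup(rows, id_index, 7566))  # the index reads 1 row
```

The difference in rows read is the disk I/O saving the slide refers to; the price is that the index has to be maintained whenever the table changes.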

When to Create an Index?
- The table is large
- The columns are often used as a condition in the query
- The column contains a wide range of values or a large number of null values
- Most queries are expected to retrieve less than 5% of the rows
- An index is created automatically when a PRIMARY KEY or UNIQUE constraint is defined

Classification of Indexes
- Logical (application perspective)
  - Single column or concatenated
  - Unique or nonunique
  - Function-based
- Physical (storage perspective)
  - B-tree: normal or reverse key
  - Bitmap
  - Partitioned or nonpartitioned

Single-column and Composite Index
- Single-column index: one column in the index key
- Composite index: multiple columns in the index key (maximum of 32 columns; the key is limited to about 1/3 of the data block size)

Single-column example:

    create index purchase1 on purchase (purchase_id)
    storage (initial 2m next 2m pctincrease 0)
    tablespace purch_ind1;

Composite example:

    create index purchase1 on purchase (purchase_id, purchase_date, total_amt)
    storage (initial 2m next 2m pctincrease 0)
    tablespace purch_ind1;

Unique or Nonunique
- Unique: a single key points to only one row
- Nonunique: a single key is associated with multiple rows

Function-based indexes (1/3)
- Oracle 8i new feature
- QUERY REWRITE privilege required
- Uses functions or expressions that involve one or more columns of the table
- Precomputes the value of the function or expression and stores it in the index
- Created as either a B-tree or a bitmap index

Function-based indexes (2/3)
Example 1. Client names stored in mixed case

Index creation:

    create index billing_upcl on billing (upper(client))
    storage (initial 20m next 80m maxextents unlimited pctincrease 0)
    tablespace my_indexes;

Statement predicate:

    select bill_id, client, state_nm
    from billing
    where upper(client) = 'MONSANTO';

Function-based indexes (3/3)
Example 2. Commission exceeding 25% of salary for a given time period

Index creation:

    create index sale_bc_amt on sale (comm/(base+comm)*100) …
    storage (initial 20m next 80m maxextents unlimited pctincrease 0)
    tablespace my_indexes;

Statement predicate:

    select sum(comm)
    from sale
    where comm/(base+comm) * 100 > 25
    and tr_date between to_date('01-MAY-2002', 'DD-MON-YYYY')
                    and to_date('30-JUN-2002', 'DD-MON-YYYY');
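The precomputation behind a function-based index can be sketched as follows. This is a minimal illustration, not Oracle's implementation: the sample `billing` rows (including the client "Acme") and the helper names are hypothetical. The key point is that the index stores the result of the function, so a predicate written with the same function can be answered from the index.

```python
from collections import defaultdict

# Hypothetical rows with client names stored in mixed case.
billing = [
    {"bill_id": 1, "client": "Monsanto"},
    {"bill_id": 2, "client": "MONSANTO"},
    {"bill_id": 3, "client": "Acme"},
]


def build_function_index(table, func, column):
    """Index keyed on func(column value) instead of the raw column value."""
    index = defaultdict(list)
    for pos, row in enumerate(table):
        index[func(row[column])].append(pos)
    return index


# Rough counterpart of: create index ... on billing (upper(client))
upper_client_idx = build_function_index(billing, str.upper, "client")


def find_by_upper_client(value):
    """Rough counterpart of: where upper(client) = value"""
    return [billing[pos]["bill_id"] for pos in upper_client_idx.get(value, [])]


print(find_by_upper_client("MONSANTO"))
```

Both mixed-case spellings land under the same precomputed key, which is why the query finds them without applying `upper()` to every row at query time.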

B-tree indexes (1/3)
- Traditional indexing technique
- Stores a list of ROWIDs for each key
- A hierarchy of highest-level and lower-level index blocks (root > branch > leaf)
- Leaf entry format
  - Header: chaining info, row lock status, number of columns
  - Key column length and value pairs
  - ROWID: locates the row that contains the key values (block number, row number, file number)
- Simple and easy to maintain; suited to high-cardinality columns
- Suitable for exact-match and range queries

B-tree indexes (2/3)
Structure (figure): a root block points to branch blocks, which point to leaf blocks.
Index entry: index entry header, key column length, key column value, ROWID

B-tree indexes (3/3)
Creating normal B-tree indexes

    create index employee_last_name_idx on employee (last_name)
    pctfree 30
    storage (initial 200k next 200k pctincrease 0 maxextents 50)
    tablespace indx;

Syntax:

    CREATE [UNIQUE] INDEX [schema.]index
    ON [schema.]table (column [ASC | DESC] [, column [ASC | DESC]] …)
    [TABLESPACE tablespace]
    [PCTFREE integer]
    [INITRANS integer]
    [MAXTRANS integer]
    [storage-clause]
    [LOGGING | NOLOGGING]
    [NOSORT]
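Why a B-tree suits both exact-match and range queries can be seen from its leaf level, where entries are kept in key order. The sketch below models only that sorted leaf level with Python's `bisect` module; the root and branch blocks are replaced by a binary search, and the names and pointer values are hypothetical.

```python
import bisect

# Leaf entries kept in key order: (key value, ROWID-like pointer).
leaf_entries = sorted([
    ("SMITH", 1), ("ALLEN", 0), ("WARD", 2), ("JONES", 3), ("MARTIN", 4),
])
keys = [key for key, _ in leaf_entries]


def exact_match(key):
    """Locate the first entry for the key, then collect all equal keys."""
    i = bisect.bisect_left(keys, key)
    pointers = []
    while i < len(keys) and keys[i] == key:
        pointers.append(leaf_entries[i][1])
        i += 1
    return pointers


def range_scan(low, high):
    """Find the start of the range once, then read neighbouring entries in order."""
    start = bisect.bisect_left(keys, low)
    stop = bisect.bisect_right(keys, high)
    return [pointer for _, pointer in leaf_entries[start:stop]]


print(exact_match("JONES"))
print(range_scan("J", "N"))
```

A range scan needs only one search for the lower bound because neighbouring keys sit next to each other in the leaf level.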

Reverse Key indexes (1/3)
- Reverses the bytes of each indexed column (except the ROWID)
- Spreads the workload across multiple blocks
- Unsuitable for range queries
- Created with the keyword REVERSE

Reverse Key indexes (2/3)

Index on EMPLOYEE(ID)

    KEY     ROWID
    ID      (BLOCK# ROW# FILE#)
    ----    ---------------------
    1257    0000000000F.0002.0001
    2877    0000000000F.0006.0001
    4567    0000000000F.0004.0001
    6657    0000000000F.0003.0001
    8967    0000000000F.0005.0001
    …       …

EMPLOYEE table

    ID      FIRST_NAME   JOB
    ----    ----------   --------
    7499    ALLEN        SALESMAN
    7369    SMITH        CLERK
    7521    WARD         SALESMAN
    7566    JONES        MANAGER
    7654    MARTIN       SALESMAN
    …       …            …

Reverse Key indexes (3/3)
Creating a reverse key index

    create unique index orders_id_idx on orders (id) reverse
    pctfree 30
    storage (initial 200k next 200k pctincrease 0 maxextents 50)
    tablespace indx;

Syntax (NOSORT cannot be combined with REVERSE):

    CREATE [UNIQUE] INDEX [schema.]index
    ON [schema.]table (column [ASC | DESC] [, column [ASC | DESC]] …)
    [TABLESPACE tablespace]
    [PCTFREE integer]
    [INITRANS integer]
    [MAXTRANS integer]
    [storage-clause]
    [LOGGING | NOLOGGING]
    REVERSE
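The effect of key reversal can be sketched with decimal digits, the same simplification the EMPLOYEE example uses (Oracle itself reverses the bytes of the stored key). The function name and the sequential ID range below are hypothetical.

```python
def reverse_key(key: int) -> str:
    """Reverse the decimal digits of a key, e.g. 7521 -> '1257'."""
    return str(key)[::-1]


# IDs from the EMPLOYEE table and their order inside a reverse key index.
employee_ids = [7499, 7369, 7521, 7566, 7654]
reversed_keys = sorted(reverse_key(i) for i in employee_ids)
print(reversed_keys)

# Ten consecutive IDs, as produced by a sequence-generated key.
sequential_ids = range(1000, 1010)
plain_prefixes = {str(k)[0] for k in sequential_ids}
reversed_prefixes = {reverse_key(k)[0] for k in sequential_ids}
print(len(plain_prefixes), len(reversed_prefixes))
```

Consecutive keys share a leading digit and would all be inserted into the same part of a normal index, while their reversed forms start with ten different digits and are spread across the index. The same scattering is why a range such as 1000 to 1009 can no longer be read as one contiguous run of entries.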

Bitmap indexes (1/4)
- Stores a bitmap for each key value
- For low-cardinality columns
- Leaf entry format
  - Header: chaining information, row lock status, number of columns
  - Key column length and value pairs
  - Start ROWID, End ROWID: the first row and the last row pointed to by the bitmap (block number, row number, file number)
  - Bitmap: a string of bits, one per row in the range, set where the row holds the key value

    create bitmap index person_region on person (region);

Bitmap indexes (2/4)
Structure (figure labels): Table (File 3; Block 10, Block 11), Index

    create bitmap index person_region on person (region);

    Key      Start ROWID   End ROWID   Bitmap
    Blue     10.0.3        12.8.3      1000100100010
    Green    10.0.3        12.8.3      0001010000100
    Red      10.0.3        12.8.3      0100000011000
    Yellow   10.0.3        12.8.3      0010001000001

Bitmap indexes (3/4)
Creating a bitmap index

    create bitmap index person_region on person (region)
    pctfree 30
    storage (initial 200k next 200k pctincrease 0 maxextents 50)
    tablespace indx;

Syntax (a bitmap index cannot be declared UNIQUE):

    CREATE BITMAP INDEX [schema.]index
    ON [schema.]table (column [ASC | DESC] [, column [ASC | DESC]] …)
    [TABLESPACE tablespace]
    [PCTFREE integer]
    [INITRANS integer]
    [MAXTRANS integer]
    [storage-clause]
    [LOGGING | NOLOGGING]
    [NOSORT]

Bitmap indexes (4/4)
Example: a bitmap index on the REGION column of the PERSON table

    Row   Region   North bitmap   East bitmap   West bitmap   South bitmap
    1     North    1              0             0             0
    2     East     0              1             0             0
    3     West     0              0             1             0
    4     West     0              0             1             0
    5     East     0              1             0             0
    6     West     0              0             1             0
    7     South    0              0             0             1
    8     North    1              0             0             0
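The PERSON example can be reproduced with Python integers used as bit strings. This is a sketch of the idea only: the function names are hypothetical, and Oracle stores compressed bitmaps together with start and end ROWIDs rather than plain integers.

```python
# REGION values of PERSON rows 1..8, as in the example table.
regions = ["North", "East", "West", "West", "East", "West", "South", "North"]


def build_bitmap_index(values):
    """One bitmap per distinct key value; bit n is set if row n holds that value."""
    index = {}
    for row, value in enumerate(values):
        index[value] = index.get(value, 0) | (1 << row)
    return index


def rows_of(bitmap):
    """Translate a bitmap back into 1-based row numbers."""
    return [row + 1 for row in range(bitmap.bit_length()) if (bitmap >> row) & 1]


region_idx = build_bitmap_index(regions)

# where region = 'East' or region = 'South' becomes a single bitwise OR
east_or_south = region_idx["East"] | region_idx["South"]
print(rows_of(east_or_south))
```

Combining predicates is a bitwise operation on two short bit strings, which is what makes bitmap indexes efficient for OR conditions on low-cardinality columns.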

B-Tree index vs. Bitmap index

    B-tree                                       Bitmap
    Suitable for high-cardinality columns        Suitable for low-cardinality columns
    Row-level locking                            Bitmap-segment-level locking
    Updates on keys relatively inexpensive       Updates on keys very expensive
    More storage                                 Less storage
    Inefficient for queries using OR predicates  Efficient for queries using OR predicates
    Useful for OLTP                              Useful for data warehousing

B-Tree space vs. Bitmap space
A bitmap index can use as little as 1/100 of the space of a B-tree index.

Table with 1,000,000 rows:

    Unique Column Values   Cardinality (%)   B-Tree Space   Bitmap Space
    500,000                50.00             15.29          12.35
    100,000                10.00             15.21          5.25
    10,000                 1.00              14.34          2.99
    100                    0.01              13.40          1.38
    5                      < 0.01            (not shown)    0.78

Index-organized tables (IOT) (1/3)
- Merges the data and index pieces into the same segment
- No duplication of the values of the key columns
- Faster key-based access for queries involving exact-match and range searches
- Must have a primary key
- An overflow tablespace name and a threshold percentage can be specified
- Secondary indexes are supported (Oracle 8i new feature)

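Index-organized tables (IOT) (2/3)
- Regular table access: the index (key columns, ROWID) is read first, then the ROWID is used to read the table (row header, non-key columns)
- IOT access: key columns and non-key columns are stored together in the index, so only one scan is needed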

Index-organized tables (IOT) (3/3)
Creating an index-organized table

    create table sales (
        office_cd  number(3),
        qtr_end    date,
        revenue    number(10,2),
        review     varchar2(1000),
        constraint sales_pk primary key (office_cd, qtr_end))
    organization index tablespace indx
    pctthreshold 20 including revenue
    overflow tablespace user_data;

Indexes on Partitioned Tables (1/4)
- An index stored in several segments
- Can be spread across many tablespaces
- Decreases contention for index lookups
- Increases manageability and scalability
- Used with partitioned tables
- An index partition can be created for each table partition

Indexes on Partitioned Tables (2/4)
- Local index: the partition keys of the index match those of its underlying table
- Global index: the partition keys of the index differ from those of its underlying table
- Prefixed index: the left-most column of the partitioned index matches the left-most column of that index's partition key
- Nonprefixed index: the left-most column of the partitioned index differs from the left-most column of that index's partition key

Indexes on Partitioned Tables (3/4)
Creating a local partitioned index

    create table rumors (
        thorn_id  number(10),
        rumor_id  number(4),
        …)
    partition by range (rumor_id)
    (partition rumors_p001 values less than (41),
     partition rumors_p002 values less than (50),
     …
     partition rumors_pmax values less than (maxvalue));

    create unique index rumors_u1 on rumors (thorn_id, rumor_id)
    local
    (partition rumors_u1_p001,
     partition rumors_u1_p002,
     …
     partition rumors_u1_pmax);
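How a local index follows the table's range partitions can be sketched as below. This is an illustration under assumptions: only the three partitions named in the example are modelled (the elided ones are left out), and the sample rows and helper names are hypothetical.

```python
import bisect

# Upper bounds from "values less than (41)" and "values less than (50)";
# the last partition (maxvalue) takes everything else.
upper_bounds = [41, 50]
table_partitions = ["rumors_p001", "rumors_p002", "rumors_pmax"]
index_partitions = ["rumors_u1_p001", "rumors_u1_p002", "rumors_u1_pmax"]


def partition_number(rumor_id):
    """Range partitioning: first partition whose bound is greater than the key."""
    return bisect.bisect_right(upper_bounds, rumor_id)


# A local index has one index partition per table partition, same boundaries.
local_index = [dict() for _ in index_partitions]


def insert(thorn_id, rumor_id, rowid):
    local_index[partition_number(rumor_id)][(thorn_id, rumor_id)] = rowid


def lookup(thorn_id, rumor_id):
    """Only the single matching index partition is probed."""
    n = partition_number(rumor_id)
    return index_partitions[n], local_index[n].get((thorn_id, rumor_id))


insert(1, 40, "row-a")
insert(1, 41, "row-b")
insert(2, 75, "row-c")
print(lookup(1, 41))
```

Because the index partitions share the table's boundaries, a query that supplies `rumor_id` touches one index partition, and maintenance on one table partition affects only its own index partition.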

Indexes on Partitioned Tables (4/4)
Creating a global partitioned index

    create table billing (
        bill_id    number(10),
        region_id  varchar2(3),
        …)
    partition by range (bill_id)
    (partition bill_p001 values less than (90000),
     partition bill_p002 values less than (130000),
     …
     partition bill_pmax values less than (maxvalue));

    create unique index billing_u1 on billing (bill_id)
    global partition by range (bill_id)
    (partition bill_u1_p001 values less than (100000),
     partition bill_u1_p002 values less than (200000),
     …
     partition bill_u1_pmax values less than (maxvalue));

Optimizer Histograms (1/2)
- Rule-based optimizer
  - Uses a set of rules for ranking access paths
  - Syntax- and data dictionary-driven
- Cost-based optimizer
  - Chooses the least-cost (resource, time) path
  - Statistics-driven

    analyze table table_name compute statistics;

Optimizer Histograms (2/2)
- Describe the data distribution of a particular column in more detail
- Give a better predicate selectivity estimate for unevenly distributed data
- Buckets: the SIZE clause sets the number of histogram buckets (6 in the statement below)

    analyze table table_name compute statistics
    for table
    for all indexed columns size 6;
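The benefit for unevenly distributed data can be shown with a small calculation. The sketch below uses a simple frequency histogram (one bucket per distinct value) on made-up data; it is an assumption-laden illustration, not the bucket scheme Oracle builds.

```python
from collections import Counter

# Hypothetical skewed column: 90% of the rows hold the value "A".
column = ["A"] * 90 + ["B"] * 5 + ["C"] * 5


def uniform_selectivity(values):
    """Without a histogram: assume every distinct value is equally frequent."""
    return 1 / len(set(values))


def build_frequency_histogram(values):
    """One bucket per distinct value, holding its row count."""
    return Counter(values)


def histogram_selectivity(histogram, value):
    """Estimated fraction of rows matching column = value."""
    total = sum(histogram.values())
    return histogram.get(value, 0) / total


hist = build_frequency_histogram(column)
print(uniform_selectivity(column))       # the same estimate for every value
print(histogram_selectivity(hist, "A"))  # most of the table: a scan is cheaper
print(histogram_selectivity(hist, "B"))  # a few rows: an index is worthwhile
```

With the uniform assumption both predicates look identical; the histogram shows that `= 'A'` returns most of the table while `= 'B'` returns a small fraction, which lets a cost-based optimizer choose a different access path for each.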

Optimizer Histograms(2/2) http://www.akadia.com/services/oratips/costbased_optimizer/optm.htm

Guidelines (1)
- Use NOLOGGING when creating large indexes
- Rebuild an index to move it to a different tablespace or to convert it into a reverse key index

Index build times with and without NOLOGGING:

    Rows in Table   Indexes   With NOLOGGING   Without NOLOGGING
    46,34013        13        57s              3m57s
    1,094,8146      6         10m5s            24m36s
    4,4013,309      4         27m54s           60m48s

    alter index orders_region_id_idx rebuild reverse
    tablespace indx02 nologging;

Guidelines (2)
- Temporary workspace
  - Created during the life of a CREATE INDEX statement
  - Dropped after the activity completes
- Sort space parameter: SORT_AREA_SIZE
- Shared pool parameter: shared_pool_size = 10000000

SGA (System Global Area)
- Oracle instance = memory structures + background processes
- SGA (System Global Area): data buffer cache, redo log buffer, shared pool (library cache, data dictionary cache)
- Background processes: SMON, DBW0, PMON, CKPT, LGWR
- PGA (Program Global Area): sort area, cursor state, session info, stack space
- Other figure labels: user process, server process, SQL, database