Oracle 8i/9i Features Which Support Data Warehousing
Author: Krasen Paskalev, Certified Oracle DBA
Semantec GmbH, D-71083 Herrenberg, www.semantec.de

Agenda
- ETL Features
- Data Warehouse Management
- Data Warehouse Querying
- Parallel Operations

Agenda
- ETL (Extraction, Transformation, Transportation and Loading)
  - Transportable Tablespaces
  - External Tables
  - Table Functions
  - MERGE Statement
- Data Warehouse Management
- Data Warehouse Querying
- Parallel Operations

Transportable Tablespaces
- The fastest method for moving data between databases
- The tablespaces, with all their data, are plugged into the data warehouse database
[Diagram: a tablespace is copied via ftp from the Production database to the Data Warehouse]
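
The workflow behind this slide can be sketched roughly as follows. This is a minimal outline of the Oracle 8i/9i procedure, not the author's script: the tablespace name sales_0102, the datafile path and the dump file name are all hypothetical, and the exp/imp connection details are left out.

```sql
-- On the production database (tablespace name SALES_0102 is hypothetical).
-- 1. Check that the tablespace set is self-contained.
EXECUTE DBMS_TTS.TRANSPORT_SET_CHECK('SALES_0102', TRUE);
SELECT * FROM transport_set_violations;

-- 2. Make the tablespace read-only before its datafiles are copied.
ALTER TABLESPACE sales_0102 READ ONLY;

-- 3. From the OS shell, export only the tablespace metadata, then copy
--    (e.g. with ftp) the datafiles and the dump file to the warehouse host:
--      exp TRANSPORT_TABLESPACE=y TABLESPACES=sales_0102 FILE=sales_0102.dmp
-- 4. On the warehouse host, plug the tablespace in (path is hypothetical):
--      imp TRANSPORT_TABLESPACE=y DATAFILES='/u01/oradata/dwh/sales_0102.dbf' FILE=sales_0102.dmp

-- 5. Optionally return the source tablespace to read-write mode.
ALTER TABLESPACE sales_0102 READ WRITE;
```

Only metadata passes through export and import; the data itself moves as a plain file copy, which is where the speed comes from.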

External Tables
- Can be directly queried and joined in SQL, PL/SQL and Java
- Avoid data staging
- One-step loading and transformation
- Save DB space
[Diagram: external files (ASCII file, Excel sheet) exposed as read-only virtual tables]
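
A minimal sketch of an external table over a comma-separated file, assuming Oracle 9i syntax. The directory path, the file name new_sales.csv and the table name new_sales_ext are hypothetical; the target columns follow the sales(id, amount) example used later in the slides.

```sql
-- Directory object pointing at the folder with the flat files (path is hypothetical).
CREATE DIRECTORY ext_data_dir AS '/data/incoming';

-- Read-only table whose rows live in the external file, not in the database.
CREATE TABLE new_sales_ext (
  id      NUMBER,
  amount  NUMBER
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY ext_data_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY ','
  )
  LOCATION ('new_sales.csv')
)
REJECT LIMIT UNLIMITED;

-- Load and transform in one step, without a staging table.
INSERT /*+ APPEND */ INTO sales (id, amount)
SELECT id, ROUND(amount, 2)
FROM   new_sales_ext
WHERE  amount IS NOT NULL;
```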

Table Functions
- Can take a set of rows as input
- Can return a set of rows as output
- Can be used in the FROM clause
- Can be parallelized
- Can be pipelined
- User-defined in PL/SQL, Java or C
[Diagram: Sales -> Table Function -> Region / % (West, Central, East)]

Table Functions: Pipelining
[Diagram: Data Transformation. Source -> Table Function (Step 1) -> Table Function (Step 2) -> Target, with a Log table]
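
A pipelined table function of the kind described on these slides might look like the sketch below. The type names, the function name clean_sales and the filtering rule are illustrative assumptions, not taken from the presentation.

```sql
-- Row and collection types returned by the function (names are hypothetical).
CREATE TYPE sales_row_t AS OBJECT (id NUMBER, amount NUMBER);
/
CREATE TYPE sales_tab_t AS TABLE OF sales_row_t;
/
-- Takes a set of rows as a cursor and pipes transformed rows back one by one.
CREATE OR REPLACE FUNCTION clean_sales (p_rows SYS_REFCURSOR)
  RETURN sales_tab_t PIPELINED
IS
  v_id     NUMBER;
  v_amount NUMBER;
BEGIN
  LOOP
    FETCH p_rows INTO v_id, v_amount;
    EXIT WHEN p_rows%NOTFOUND;
    -- Example transformation: drop rows with non-positive amounts.
    IF v_amount > 0 THEN
      PIPE ROW (sales_row_t(v_id, v_amount));
    END IF;
  END LOOP;
  CLOSE p_rows;
  RETURN;
END;
/
-- The function is used in the FROM clause like a table.
SELECT *
FROM   TABLE(clean_sales(CURSOR(SELECT id, amount FROM new_sales)));
```

Because rows are piped as soon as they are produced, a second table function can consume them as its cursor input without waiting for the first step to finish, which is the chaining the pipelining slide shows.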

MERGE Statement
[Diagram: rows of new_sales (id, amount) are merged into sales (id, amount); rows with a matching id are UPDATEd, rows with a new id are INSERTed]

MERGE INTO sales s
USING new_sales n
ON (s.id = n.id)
WHEN MATCHED THEN
  UPDATE SET s.amount = s.amount + n.amount
WHEN NOT MATCHED THEN
  INSERT (s.id, s.amount) VALUES (n.id, n.amount)

MERGE Advantages
- Single, simple SQL statement
- Can be parallelized
- Can use bulk DML
- Fewer scans of the base table

More ETL Features
- Direct-path interface
  - SQL*Loader
  - CREATE TABLE ... AS SELECT
  - INSERT
  - Oracle Call Interface
- Multi-table INSERTs
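
As a rough illustration of two of these features, the sketch below shows a direct-path INSERT and a conditional multi-table INSERT. The tables sales_history, large_sales and small_sales and the 10000 threshold are hypothetical.

```sql
-- Direct-path insert: the APPEND hint writes blocks above the high-water mark
-- instead of going through the conventional insert path.
INSERT /*+ APPEND */ INTO sales_history (id, amount)
SELECT id, amount
FROM   sales;
COMMIT;

-- Multi-table insert: one scan of the source routes each row to one target.
INSERT FIRST
  WHEN amount >= 10000 THEN
    INTO large_sales (id, amount) VALUES (id, amount)
  ELSE
    INTO small_sales (id, amount) VALUES (id, amount)
SELECT id, amount
FROM   new_sales;
```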

Agenda
- ETL Features
- Data Warehouse Management
  - Partitioning
  - Materialized Views
  - DBMS_STATS
- Data Warehouse Querying
- Parallel Operations

Partitioning
[Diagram: table Sales split into monthly partitions, each stored in its own tablespace]
- Jan'2002: Tablespace 0102
- Feb'2002: Tablespace 0202
- ...
- Dec'2002: Tablespace

Advantages of Partitioning
- Partition independence
  - LOAD, MOVE, PURGE and DROP partitions
  - MERGE, SPLIT, EXCHANGE partitions
  - BACKUP, RESTORE, SET READ ONLY
- Partition elimination
  - SELECT or JOIN only the partitions needed
- Parallel operations
  - SELECT, UPDATE, DELETE, MERGE
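
A few of these operations, sketched against a hypothetical sales table that is range-partitioned by a sale_date column into monthly partitions (p_0102, p_0202, ...) stored in tablespaces ts_0102, ts_0202 and so on. All of these names are assumptions for illustration.

```sql
-- Partition independence: swap a fully loaded staging table into one partition.
-- This is a data dictionary operation; no rows are physically moved.
ALTER TABLE sales
  EXCHANGE PARTITION p_0202 WITH TABLE sales_feb_stage
  INCLUDING INDEXES WITHOUT VALIDATION;

-- Purge old data by dropping a whole partition instead of deleting rows.
ALTER TABLE sales DROP PARTITION p_0102;

-- Freeze a closed month so its tablespace needs to be backed up only once.
ALTER TABLESPACE ts_0202 READ ONLY;

-- Partition elimination: the predicate on the partitioning key lets the
-- optimizer read only the February partition.
SELECT SUM(amount)
FROM   sales
WHERE  sale_date >= DATE '2002-02-01'
AND    sale_date <  DATE '2002-03-01';
```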

Partitioning Methods
- Hash partitioning
  - Even row distribution by hash function
- Range partitioning
  - < | < | ... | <
- List partitioning
  - Stuttgart, Munich | Mannheim, Frankfurt | ...
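
The three methods could be declared as in the sketch below (list partitioning requires Oracle 9i). Table, column, partition and tablespace names are illustrative; only the city groupings come from the slide.

```sql
-- Range partitioning: one partition per month, each in its own tablespace.
CREATE TABLE sales (
  id         NUMBER,
  sale_date  DATE,
  amount     NUMBER
)
PARTITION BY RANGE (sale_date) (
  PARTITION p_0102 VALUES LESS THAN (TO_DATE('2002-02-01', 'YYYY-MM-DD')) TABLESPACE ts_0102,
  PARTITION p_0202 VALUES LESS THAN (TO_DATE('2002-03-01', 'YYYY-MM-DD')) TABLESPACE ts_0202
);

-- List partitioning: each partition holds an explicit set of key values.
CREATE TABLE customers_by_city (
  cust_id  NUMBER,
  name     VARCHAR2(100),
  city     VARCHAR2(30)
)
PARTITION BY LIST (city) (
  PARTITION p_south VALUES ('Stuttgart', 'Munich'),
  PARTITION p_west  VALUES ('Mannheim', 'Frankfurt')
);

-- Hash partitioning: rows are spread evenly over a fixed number of partitions.
CREATE TABLE sales_by_customer (
  cust_id  NUMBER,
  amount   NUMBER
)
PARTITION BY HASH (cust_id) PARTITIONS 8;
```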

Table Compression
- Stores tables or partitions in compressed format
- Reduces disk space requirements
- Reduces memory requirements
- Speeds up query execution
- Speeds up backup and recovery
- Very efficient for highly redundant data, such as the fact table
- Compression ratios of 2:1 to 4:1 are typical
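
A short sketch of how compression might be enabled, assuming Oracle 9i Release 2, where rows are compressed only when they arrive through direct-path operations such as CREATE TABLE ... AS SELECT or a partition move. The table and partition names are hypothetical.

```sql
-- Build a compressed copy of an existing table in one direct-path operation.
CREATE TABLE sales_compressed COMPRESS
AS SELECT * FROM sales;

-- Compress a single closed partition in place by rebuilding it.
-- Index partitions on the moved partition typically need a rebuild afterwards.
ALTER TABLE sales MOVE PARTITION p_0102 COMPRESS;
```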

Materialized Views
[Diagram: table sales (region, month, invc_sum, ...) is summarized into the materialized view revenue_sum (region, month, revenue)]

SELECT region, month, SUM(invc_sum) revenue
FROM sales
GROUP BY region, month

Advantages of Materialized Views
- Improved query/reporting performance for:
  - Summaries
  - Aggregates
  - Joins
- Fast refresh
  - Data change tracking
  - Partition change tracking
- No application change needed: the optimizer uses them automatically
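
The revenue_sum example might be created along the lines below. This is a sketch under assumptions: the extra COUNT columns and the materialized view log are what Oracle generally requires before an aggregate materialized view can be fast refreshed, and the exact prerequisites depend on the release.

```sql
-- Change log on the base table, needed for fast (incremental) refresh.
CREATE MATERIALIZED VIEW LOG ON sales
  WITH ROWID, SEQUENCE (region, month, invc_sum)
  INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW revenue_sum
  BUILD IMMEDIATE
  REFRESH FAST ON DEMAND
  ENABLE QUERY REWRITE
AS
SELECT region,
       month,
       SUM(invc_sum)   AS revenue,
       COUNT(invc_sum) AS revenue_cnt,  -- required alongside SUM for fast refresh
       COUNT(*)        AS row_cnt       -- required for fast refresh
FROM   sales
GROUP  BY region, month;

-- With query rewrite enabled for the session, a query written against the
-- base table can be answered from revenue_sum transparently.
ALTER SESSION SET query_rewrite_enabled = TRUE;

SELECT region, SUM(invc_sum)
FROM   sales
GROUP  BY region;
```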

DBMS_STATS
- New package for gathering table and index statistics
- Gathers statistics in parallel
- Can export and import statistics
[Diagram: statistics are copied from the Production Data Warehouse to the Development Data Warehouse]
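
Gathering statistics in parallel and moving them to another database could look like this sketch. The schema name DWH, the statistics table STATS_TRANSFER, the sample size and the degree of parallelism are hypothetical choices.

```sql
BEGIN
  -- Gather table and index statistics with four parallel slaves.
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'DWH',
    tabname          => 'SALES',
    estimate_percent => 10,
    degree           => 4,
    cascade          => TRUE);

  -- Copy the statistics into an ordinary table that can be transported.
  DBMS_STATS.CREATE_STAT_TABLE(ownname => 'DWH', stattab => 'STATS_TRANSFER');
  DBMS_STATS.EXPORT_TABLE_STATS(
    ownname => 'DWH',
    tabname => 'SALES',
    stattab => 'STATS_TRANSFER');
END;
/
-- After STATS_TRANSFER has been copied to the development database
-- (for example with exp/imp), load the statistics into its dictionary.
BEGIN
  DBMS_STATS.IMPORT_TABLE_STATS(
    ownname => 'DWH',
    tabname => 'SALES',
    stattab => 'STATS_TRANSFER');
END;
/
```

Importing production statistics lets the development optimizer choose the same plans it would choose in production, even when the development tables hold far less data.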

More Data Warehouse Management Features
- Index-organized tables
- Online index rebuild
- Online table rebuild

Agenda
- ETL Features
- Data Warehouse Management
- Data Warehouse Querying
  - Bitmap Indexing
  - Star Query Transformation
  - Aggregation: ROLLUP, CUBE, Grouping Sets
  - Analytic functions
- Parallel Operations

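Bitmap Indexes
[Diagram: a bitmap index on Region keeps one bitmap per value (east, central, west, NULL) with one bit per rowid; bitmaps are combined with OR, AND and NOT to evaluate a condition]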

Advantages of Bitmap Indexes
- Reduced response time for ad hoc queries
- Use much less space than a B-tree index
- Dramatic performance gains for a large class of queries:
  - Multiple AND, OR and NOT conditions
  - IS NULL conditions
  - COUNT
  - NOT IN (bitmap MINUS)
  - BETWEEN (bitmap UNION)
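
A small sketch of the query types listed above. The region column comes from the slide; the channel column and the index names are hypothetical.

```sql
-- One bitmap per distinct value; suited to low-cardinality columns.
-- On a partitioned table the LOCAL keyword would be required.
CREATE BITMAP INDEX sales_region_bix  ON sales (region);
CREATE BITMAP INDEX sales_channel_bix ON sales (channel);

-- AND / OR / NOT conditions can be resolved by combining bitmaps
-- before any table row is visited.
SELECT COUNT(*)
FROM   sales
WHERE  (region = 'east' OR region = 'west')
AND    channel <> 'web';

-- Unlike B-tree indexes, bitmap indexes also record NULL keys.
SELECT COUNT(*)
FROM   sales
WHERE  region IS NULL;
```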

Star Query Transformation
- The query is rewritten for efficient execution
[Diagram: fact table sales (cust_id, prod_id, q_id, amount) joined to the dimension tables customers (cust_id, name), products (prod_id, name) and quarters (q_id, name)]
Steps:
1. Filter all dimensions
2. Combine the bitmap indexes of the fact table's foreign keys
3. Retrieve the fact rows and the other dimension rows
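
The setup for the schema in the diagram might be sketched as follows. The index names and the filter values are made up for illustration; whether the optimizer actually applies the transformation depends on statistics and cost.

```sql
-- Bitmap indexes on every foreign key column of the fact table.
CREATE BITMAP INDEX sales_cust_bix ON sales (cust_id);
CREATE BITMAP INDEX sales_prod_bix ON sales (prod_id);
CREATE BITMAP INDEX sales_q_bix    ON sales (q_id);

-- Allow the cost-based optimizer to consider the star transformation.
ALTER SESSION SET star_transformation_enabled = TRUE;

-- A typical star query: filters on the dimensions, aggregation on the fact.
SELECT c.name        AS customer,
       p.name        AS product,
       q.name        AS quarter,
       SUM(s.amount) AS total_amount
FROM   sales s, customers c, products p, quarters q
WHERE  s.cust_id = c.cust_id
AND    s.prod_id = p.prod_id
AND    s.q_id    = q.q_id
AND    c.name    = 'ACME'
AND    q.name    IN ('Q1', 'Q2')
GROUP  BY c.name, p.name, q.name;
```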

Aggregation Operators
- Oracle extends the GROUP BY clause with:
  - ROLLUP
  - CUBE
  - Grouping Sets

SELECT SUM(amount) FROM sales GROUP BY country, quarter

[Diagram: result grid of quarters (Q1, Q2) by country (UK, US)]

ROLLUP and CUBE
- ROLLUP: subtotals at increasing levels of aggregation, from right to left (n+1 groupings for n columns)
- CUBE: subtotals on all combinations (2^n groupings for n columns)

ROLLUP(country, department, quarter)
  (country, department, quarter)
  (country, department)
  (country)
  () - Grand Total

CUBE(country, department, quarter)
  (country, department, quarter)
  (country, department)
  (country, quarter)
  (department, quarter)
  (country)
  (department)
  (quarter)
  () - Grand Total
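
The three GROUP BY extensions, sketched against a hypothetical sales table with country, department, quarter and amount columns (GROUPING SETS requires Oracle 9i).

```sql
-- ROLLUP: 3 columns give 3 + 1 = 4 grouping levels, down to the grand total.
SELECT country, department, quarter, SUM(amount) AS total
FROM   sales
GROUP  BY ROLLUP (country, department, quarter);

-- CUBE: 3 columns give 2^3 = 8 groupings, one per combination of columns.
SELECT country, department, quarter, SUM(amount) AS total
FROM   sales
GROUP  BY CUBE (country, department, quarter);

-- GROUPING SETS: request only the groupings that are actually needed.
-- GROUPING() returns 1 when the column is rolled up in a subtotal row.
SELECT country,
       quarter,
       SUM(amount)       AS total,
       GROUPING(country) AS country_rolled_up,
       GROUPING(quarter) AS quarter_rolled_up
FROM   sales
GROUP  BY GROUPING SETS ((country, quarter), (country), ());
```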

Aggregation Operators: Advantages
- Applicable to many aggregate functions:
  - SUM, AVG, COUNT
  - MIN, MAX
  - STDDEV, VARIANCE
- Flexible aggregation groups and levels
- Run in parallel

Analytic Functions
- Significantly improved performance for complex reports such as:
  - Ranking: find the top 10 sales in each region
  - Moving aggregates: what is the 90-day moving sales average?
  - Period-over-period comparison: how do the revenues of January 2002 compare to those of January 2001?
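
The ranking and period-over-period questions could be answered roughly as below. The sales columns (region, id, amount) and the monthly_revenue table with one row per month are assumptions made for the sketch.

```sql
-- Ranking: top 10 sales in each region.
SELECT region, id, amount, sales_rank
FROM  (SELECT region,
              id,
              amount,
              RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS sales_rank
       FROM   sales)
WHERE  sales_rank <= 10;

-- Period-over-period: compare each month with the same month a year earlier.
-- Assumes monthly_revenue has exactly one row per month with no gaps,
-- so that 12 rows back is the same month of the previous year.
SELECT month_start,
       revenue,
       LAG(revenue, 12) OVER (ORDER BY month_start) AS revenue_year_ago
FROM   monthly_revenue;
```

Both queries read the table once; without analytic functions they would typically need self-joins or correlated subqueries.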

Example: Moving Window

SELECT c.cust_id, t.month,
       SUM(amount_sold) SALES,
       AVG(SUM(amount_sold))
         OVER (ORDER BY c.cust_id, t.month ROWS 2 PRECEDING) MOV_3_MONTH
FROM sales s, times t, customers c
WHERE s.time_id = t.time_id
  AND s.cust_id = c.cust_id
  AND t.year = 1999
  AND c.cust_id IN (6380)
GROUP BY c.cust_id, t.month
ORDER BY c.cust_id, t.month;

CUST_ID MONTH SALES MOV_3_MONTH
,642 19, ,324 19, ,655 20, ,091 22, ,367 21, ,755 22,738

More Data Warehouse Querying Features
- Function-based indexes
- Optimizer plan stability
- Statistics for long-running operations
- Resumable statements
- Full outer join
- WITH operator
- Oracle Text: "Advanced Searching with Oracle Text", 2nd conference day, 11:50-12:30, Konferenzraum EG

Agenda
- ETL Features
- Data Warehouse Management
- Data Warehouse Querying
- Parallel Operations

Parallel Operations
- Dramatically reduce the execution time of data-intensive operations
- Loading
  - Direct-path load
- DDL statements
  - CREATE TABLE ... AS SELECT, CREATE INDEX
  - REBUILD INDEX, REBUILD INDEX PARTITION
  - MOVE, SPLIT, COALESCE PARTITION
- DML statements
  - INSERT ... SELECT
  - UPDATE, DELETE and MERGE

Parallel Operations
- Access methods
  - Table and index range and full scans
- Join methods
  - Nested loops, sort merge, hash, star transformation
- SQL operations
  - GROUP BY, ROLLUP, CUBE
  - DISTINCT, UNION, UNION ALL
  - Aggregate functions
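
A few of the parallel operations listed on these two slides, as a sketch. The degree of parallelism (4) and the object names are arbitrary examples; a suitable degree depends on the CPUs and I/O bandwidth available.

```sql
-- Parallel DDL: build a table and an index with four parallel slaves.
CREATE TABLE sales_copy PARALLEL 4
AS SELECT * FROM sales;

CREATE INDEX sales_copy_amount_idx ON sales_copy (amount) PARALLEL 4;

-- Parallel DML must be enabled explicitly for the session.
ALTER SESSION ENABLE PARALLEL DML;

INSERT /*+ APPEND PARALLEL(h, 4) */ INTO sales_history h
SELECT /*+ PARALLEL(s, 4) */ *
FROM   sales s;
COMMIT;

-- Parallel query: full scan and aggregation split across four slaves.
SELECT /*+ PARALLEL(s, 4) */ region, SUM(amount)
FROM   sales s
GROUP  BY region;
```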

Parallel System Requirements
- Symmetric multiprocessor systems, clusters or massively parallel systems
- Sufficient I/O bandwidth
- Sufficient (underutilized) CPUs
- Sufficient memory

Summary
- Effective handling of multi-terabyte data warehouses
- Rich feature set for all data warehouse operations
- Flexible aggregation and analytical features for high-performance queries
- Effective parallelism

Want to know more?
Company: Semantec GmbH
Name: Krasen Paskalev, Armin Singer, Peter Kopecki
Address: Benzstr. 32, D-71083 Herrenberg, Germany
Telephone, Fax, Internet: +49(7032) (7032) (7032)
Meet us here: booth 2C on the ground floor