ETL There’s a New Sheriff in Town: Oracle OR… Not Just another Pretty Face Presented by: Bonnie O’Neil.



Introduction

ETL (Extraction, Transformation, and Loading) is undergoing a paradigm shift. Previously, ETL was done with tools outside the database: Ab Initio, DataStage, and Informatica were the kings of the ETL world. However, this paradigm had major disadvantages.

- Third-party ETL tools required a separate box because they had no controls on their consumption of CPU resources.
- Third-party ETL tools were not integrated into the database. ETL developers became experts in running the tool, but often had no knowledge of how the tool related to the database.
- These tools often did not keep up with new releases of Oracle and did not take advantage of Oracle’s new features.

To create fast and efficient ETL processes, you needed to know both the tool and the database.

Introduction

Oracle 9i is changing the ETL paradigm. Oracle 9i supports such important features as:

- External tables
- MERGE statements (INSERT/UPDATE)
- Conditional multi-table inserts
- Pipelined table functions
- Native compilation of PL/SQL programs

Introduction

With external tables, you can now read flat files in parallel into a virtual table. This virtual table can be accessed just like any other table.

(Diagram: Flat Files -> Virtual Table)

With the MERGE statement, you can replace complex “update the row if it exists, otherwise insert it” logic with one easy-to-understand statement.

With conditional multi-table inserts, I can read a row from an external table and insert it into many tables based on conditions. I can replace most ETL processes with a relatively simple INSERT statement.

Pipelined table functions allow Oracle to pass a set of rows to a function and send the results out as each row is finished.

With native compilation of PL/SQL programs, I can now take a slow PL/SQL program, compile it into C code, and link it to the database.

External Tables

The first component of the new Oracle ETL package is external tables. External tables are flat files that Oracle can read and treat as regular tables.

- No indexes can be created on external tables.
- External tables are read-only (no DML allowed).

In the past, you had to load the data into temporary tables with SQL*Loader, do some transformations, and then load the data into the target tables. Now you can read the data straight from the flat files. This eliminates much of the overhead of rollback segments, redo log activity, and writing data out to the temporary table. If you have enough memory, you can cache the table in memory. Additionally, you can specify a degree of parallelism for accessing the external flat files.

External Tables

Steps to using external tables:

1. Create a directory object for accessing the flat files. This is required by the access driver.

    CREATE DIRECTORY extdir AS '/u01/app/oracle/extfiles';

2. Grant access to the directory.

    GRANT read, write ON DIRECTORY extdir TO scott;

This directory object is used in the external table clause:

- in the LOCATION clause, for input files
- to specify the location for output files

External Tables

External tables are based on the SQL*Loader format. In the future, an EXPORT/IMPORT format will also be supported.

KEYWORD: CREATE TABLE … ORGANIZATION EXTERNAL
TYPE: type oracle_loader

    create table tag_ext(
      log_type   varchar2(60),
      user_id    varchar2(128),
      trans_id   varchar2(64),
      trans_name varchar2(128),
      dttm       timestamp(4))
    organization external (
      type oracle_loader
      default directory extdir
      access parameters …

DEFAULT DIRECTORY specifies the directory to use for input/output files if no location is specified.

External Tables

Access Parameters - contains two sections.

Record Format: contains information about the record, such as the format of the records, the names of output files, and what rules are used to exclude data from being loaded.

    … access parameters (
        records delimited by newline
        load when log_type='T1'
        badfile extdir:'t1.bad'
        nodiscardfile
        logfile extdir:'t1.log'
        skip 1
        fields terminated by '|' …

- Use the SKIP clause if your flat file contains a header row.
- Output file names can contain %p (process number) or %a (agent number) to create unique output files.

External Tables

Field Format: describes what characters are used to separate fields, what character is optionally used to enclose fields, and the data format of the fields in the datafile.

    … fields terminated by '|'
        missing field values are null
        (
          log_type   char(60),
          user_id    char(128),
          trans_id   char(64),
          trans_name char(128),
          trx_class  char(10),
          dttm       date(24) mask "dd-mm-yyyy hh24:mi:ssxff"
        )

External Tables

- Use the LOCATION clause to specify the filenames of the input files.
- REJECT LIMIT specifies the maximum number of rejected records that are allowed. This number applies to each parallel slave used to query the data.
- PARALLEL specifies the number of access drivers that are started to process the datafiles.

    … location('t1.dat','t2.dat')
    )
    reject limit unlimited
    parallel 2
    /
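Once the external table is defined, it can be queried and used as the source of a load like any other table. The following is a minimal sketch, not part of the original slides. It assumes a regular target table named tag whose columns match tag_ext; the exact definition of tag is an assumption.

```sql
-- Sanity check: count the rows Oracle can read from t1.dat and t2.dat.
SELECT COUNT(*) FROM tag_ext;

-- Load straight from the flat files into the target table.
-- Assumption: tag has the same five columns as tag_ext.
-- The APPEND hint requests a direct-path insert.
INSERT /*+ APPEND */ INTO tag (log_type, user_id, trans_id, trans_name, dttm)
SELECT log_type, user_id, trans_id, trans_name, dttm
FROM   tag_ext;

COMMIT;
```

No staging table or separate SQL*Loader run is involved; rejected records go to the bad file named in the access parameters.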

External Tables

Data dictionary views:

- DBA_EXTERNAL_TABLES
- DBA_EXTERNAL_LOCATIONS

Some DDL (ALTER TABLE) clauses are allowed on external tables: REJECT LIMIT, PARALLEL, DEFAULT DIRECTORY, ACCESS PARAMETERS, LOCATION, ADD/MODIFY/DROP COLUMN, RENAME TO.

Performance tips:

- Fixed-width fields are faster than delimited fields.
- Not writing log, bad, and discard files is faster.
- Avoid condition clauses.
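As an illustration of the dictionary views and the ALTER TABLE clauses, the sketch below inspects the tag_ext table and repoints it at a new file. The file name t3.dat and the limit values are hypothetical and chosen only for the example.

```sql
-- Which external tables exist, and which directory do they default to?
SELECT owner, table_name, type_name, default_directory_name, reject_limit
FROM   dba_external_tables;

-- Which files does tag_ext read from?
SELECT table_name, directory_name, location
FROM   dba_external_locations
WHERE  table_name = 'TAG_EXT';

-- Point the external table at a new input file without recreating it.
-- Assumption: t3.dat is a hypothetical file in the default directory.
ALTER TABLE tag_ext LOCATION ('t3.dat');

-- Tighten the reject limit and change the degree of parallelism.
ALTER TABLE tag_ext REJECT LIMIT 100;
ALTER TABLE tag_ext PARALLEL 4;
```

Changing LOCATION this way lets one external table definition serve each new batch of files.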

Merge Statement

Before Oracle 9i, a common load pattern, updating a row if it exists and inserting it if it does not, required two statements or PL/SQL code. This load can now be done in a single statement. The statement is sometimes called an UPSERT, a cross between an UPDATE and an INSERT statement.

- The MERGE fires any insert or update triggers.
- The same row of the target table cannot be updated multiple times in the same MERGE statement. Doing so raises:

    ORA-30926: unable to get a stable set of rows in the source tables
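For contrast, the two-statement approach that MERGE replaces can be sketched roughly as follows. This is an illustration rather than code from the original slides; it assumes the tag and tag_ext tables share the columns shown and are matched on user_id and trans_id.

```sql
-- Step 1: update the rows that already exist in the target.
UPDATE tag t
SET    (log_type, trans_name, dttm) =
       (SELECT x.log_type, x.trans_name, x.dttm
        FROM   tag_ext x
        WHERE  x.user_id  = t.user_id
        AND    x.trans_id = t.trans_id)
WHERE  EXISTS (SELECT 1
               FROM   tag_ext x
               WHERE  x.user_id  = t.user_id
               AND    x.trans_id = t.trans_id);

-- Step 2: insert the rows that are not in the target yet.
INSERT INTO tag (log_type, user_id, trans_id, trans_name, dttm)
SELECT x.log_type, x.user_id, x.trans_id, x.trans_name, x.dttm
FROM   tag_ext x
WHERE  NOT EXISTS (SELECT 1
                   FROM   tag t
                   WHERE  t.user_id  = x.user_id
                   AND    t.trans_id = x.trans_id);
```

The source is scanned several times and the matching logic is repeated in each statement, which is the overhead a single MERGE avoids.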

Merge Statement

    MERGE INTO tag t
    USING tag_ext x
    ON (t.user_id=x.user_id AND t.trans_id=x.trans_id)
    WHEN NOT MATCHED THEN
      INSERT(log_type,user_id,trans_id,trans_name,dttm)
      VALUES(x.log_type,x.user_id,x.trans_id,x.trans_name,x.dttm)
    WHEN MATCHED THEN
      UPDATE SET log_type=x.log_type, trans_name=x.trans_name, dttm=x.dttm

- The UPDATE clause cannot update columns used in the ON condition.
- The source can be a table or the result of a query.
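Because the source can be a query, one way to avoid ORA-30926 when the flat file holds several rows for the same key is to deduplicate in the USING clause. The sketch below is an assumption about how that might be done, not the presenter's method: it keeps only the row with the latest dttm for each (user_id, trans_id) pair.

```sql
MERGE INTO tag t
USING (
        -- Keep one source row per key so no target row is updated twice.
        SELECT log_type, user_id, trans_id, trans_name, dttm
        FROM  (SELECT x.*,
                      ROW_NUMBER() OVER (PARTITION BY x.user_id, x.trans_id
                                         ORDER BY x.dttm DESC) AS rn
               FROM   tag_ext x)
        WHERE  rn = 1
      ) s
ON (t.user_id = s.user_id AND t.trans_id = s.trans_id)
WHEN MATCHED THEN
  UPDATE SET t.log_type   = s.log_type,
             t.trans_name = s.trans_name,
             t.dttm       = s.dttm
WHEN NOT MATCHED THEN
  INSERT (log_type, user_id, trans_id, trans_name, dttm)
  VALUES (s.log_type, s.user_id, s.trans_id, s.trans_name, s.dttm);
```

Whether "latest dttm wins" is the right rule depends on the data; the point is only that each target row must match at most one source row.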

Merge Statement

Common errors:

Running a MERGE statement fails with

    ORA-00904: : invalid identifier

CAUSE: The same column was specified in both the ON clause and the UPDATE clause. Columns used in the ON clause for the join cannot be updated.

Multi-table Inserts

In Oracle 9i, a single INSERT statement can place data values into multiple tables, both unconditionally and conditionally. This is more efficient than having to parse and execute several INSERT statements. The format is an extension of the INSERT … SELECT statement.

Unconditional multi-table insert (the ALL keyword is required):

    INSERT ALL
      INTO emp VALUES(empno,ename,title,salary)
      INTO commision VALUES(empno,comm)
    SELECT empno,ename,title,salary,salary*.10 comm
    FROM employees_external;

Multi-Table Inserts

In addition, Oracle 9i allows a conditional clause to be included in a multi-table insert. A conditional insert inserts into a table if the WHEN condition is true. You can insert based on the FIRST WHEN clause that evaluates to true or on ALL WHEN clauses that evaluate to true.

    INSERT FIRST
      WHEN (title='Oracle DBA') THEN
        INTO high_paid_employees VALUES(empno,ename,title,salary)
      WHEN (title='SQL Server DBA') THEN
        INTO low_paid_employees VALUES(empno,ename,title,salary)
    SELECT * FROM employees_external;

With the FIRST keyword, each row is tested against the WHEN clauses in order until one evaluates to true. After that, the row is not evaluated against the remaining WHEN conditions.

Multi-Table Inserts

Use the ALL keyword to specify that the INSERT should occur for all WHEN clauses that evaluate to true.

    INSERT ALL
      WHEN (title='Oracle DBA') THEN
        INTO bonus_due VALUES(empno,ename,title,salary)
      WHEN (salary > ) THEN
        INTO high_paid_employees VALUES(empno,ename,title,salary)
      ELSE
        INTO low_paid_employees VALUES(empno,ename,title,salary)
    SELECT * FROM employees_external;

Table Functions - Pipelined

Table functions produce sets of rows as output. Pipelined table functions return the data iteratively instead of in a batch, thus eliminating the need for intermediate staging.

- Table functions use the TABLE keyword.
- Table functions return not a single row but a set or collection of rows. The result set can be a nested table or varray.
- Table functions can be queried like any table in the FROM clause of a query.
- Table functions can accept a collection type or a REF cursor as input.
- Table functions can be parallelized.
- Table functions can return all the rows at once or PIPELINE the results as they are produced (one row at a time).

Table Functions - Pipelined

KEYWORD: PIPELINED

- PIPELINED functions use less memory because the object cache does not have to materialize the entire result set.
- PIPELINED functions can accept a REF cursor as an input parameter.

    CREATE FUNCTION managerlist( cur cursor_emp_pkg.emp_cur )
      RETURN emp_type_table PIPELINED IS …

The PIPE ROW statement returns a row immediately rather than waiting for all rows to be processed. Table functions can perform complex transformations in an efficient manner.

Table Functions - Pipelined

    -- Create a type to define the result type collection
    CREATE OR REPLACE TYPE emp_type AS OBJECT(
      empno number(4),
      ename varchar2(10),
      job   varchar2(9),
      sal   number(7,2))
    /

    -- Create a collection used as the return type
    CREATE TYPE emp_type_table AS TABLE OF emp_type
    /

    -- Create a REF CURSOR as a package variable
    CREATE OR REPLACE PACKAGE cursor_emp_pkg AS
      type emp_rec is record (
        empno number(4),
        ename varchar2(10),
        job   varchar2(9),
        sal   number(7,2));
      type strong_emp_cur is ref cursor return emp_rec;
    END;
    /

Table Functions - Pipelined

    -- Create Function
    CREATE FUNCTION DBA_LIST(cur cursor_emp_pkg.strong_emp_cur)
      RETURN emp_type_table PIPELINED
    IS
      out_rec emp_type := emp_type(NULL,NULL,NULL,NULL);
      in_rec  cur%ROWTYPE;
    BEGIN
      LOOP
        FETCH cur INTO in_rec;
        EXIT WHEN cur%NOTFOUND;
        IF in_rec.job = 'DBA' THEN
          out_rec.empno := in_rec.empno;
          out_rec.ename := in_rec.ename;
          out_rec.job   := in_rec.job;
          out_rec.sal   := in_rec.sal;
          PIPE ROW(out_rec);
          out_rec.sal   := in_rec.sal *.10;
          PIPE ROW(out_rec);
        END IF;
      END LOOP;
      CLOSE cur;
      RETURN;
    END;
    /

Table Functions - Pipelined

    SELECT empno,ename,job,sal from emp where job='DBA';

    EMPNO ENAME JOB SAL
          SCOTT DBA
          FORD  DBA 3000

    SELECT * FROM TABLE(DBA_LIST(CURSOR(SELECT empno,ename,job,sal FROM emp)));

    EMPNO ENAME JOB SAL
          SCOTT DBA
          SCOTT DBA
          FORD  DBA
          FORD  DBA 300
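Because a table function can sit in a FROM clause, its output can feed a load directly, with no intermediate staging table. A minimal sketch, assuming a hypothetical target table dba_pay that is not part of the original slides:

```sql
-- Assumption: dba_pay is a hypothetical target table for the transformed rows.
CREATE TABLE dba_pay (
  empno NUMBER(4),
  ename VARCHAR2(10),
  job   VARCHAR2(9),
  sal   NUMBER(7,2)
);

-- Rows are piped out of DBA_LIST and inserted as they are produced.
INSERT INTO dba_pay (empno, ename, job, sal)
SELECT f.empno, f.ename, f.job, f.sal
FROM   TABLE(dba_list(CURSOR(SELECT empno, ename, job, sal FROM emp))) f;
```

The column names come from the attributes of emp_type, the object type the function returns.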

Table Functions - Pipelined

OBSERVATIONS:

Explain plan output for pipelined table functions:

    Rows  Execution Plan
          SELECT STATEMENT GOAL: CHOOSE
    4     VIEW
    4     COLLECTION ITERATOR (PICKLER FETCH) OF 'DBA_LIST'

- Although the number of records returned is twice as many, the SQL*Net roundtrips to/from the client remained the same.
- There are many more recursive calls for the pipelined function.

Native Compilation

Before Oracle 9i, PL/SQL programs were compiled to P-code and interpreted at runtime. Interpreted languages are much slower than compiled languages, although being interpreted is what makes PL/SQL portable.

Oracle 9i offers an option to turn PL/SQL code into C code automatically. You write the same PL/SQL code, and Oracle converts it, compiles it, and executes it when the PL/SQL code is called.

- SQL statements do not run much faster.
- NO CODE CHANGES IN THE PL/SQL CODE!

Native Compilation

To utilize native compilation, you must set several INIT.ORA parameters.

    plsql_compiler_flags='NATIVE'
        Default is INTERPRETED (can be changed at the session level)
    plsql_native_make_utility='/usr/ccs/bin/make'
    plsql_native_make_file_name='/u01/app/oracle/product/9.2.0/plsql/spnc_makefile.mk'
    plsql_native_library_dir='/u01/app/oracle/lib'
        Location where shared libraries are created
    plsql_native_c_compiler='/opt/SUNWspro/bin/cc'
    plsql_native_linker='/usr/sbin/link'

Native Compilation

    CREATE FUNCTION CALC_BONUS(…….

This produces a C file called CALC_BONUS__SCOTT__1.c and the shared library file CALC_BONUS__SCOTT__1.so in the directory defined by the parameter plsql_native_library_dir.

You can check how code was compiled by viewing PARAM_NAME and PARAM_VALUE in the dictionary view DBA_STORED_SETTINGS (or USER_STORED_SETTINGS for your own objects):

    SELECT param_name,param_value
    FROM user_stored_settings
    WHERE object_name='CALC_BONUS';

    PARAM_NAME            PARAM_VALUE
    plsql_compiler_flags  NATIVE,NON_DEBUG
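Since plsql_compiler_flags can be changed at the session level, an existing function can be switched to native code by recompiling it, without touching its source. A rough sketch for Oracle 9i, assuming the other plsql_native_* parameters are already set and that CALC_BONUS exists in the current schema:

```sql
-- Compile PL/SQL natively for the rest of this session (Oracle 9i parameter).
ALTER SESSION SET plsql_compiler_flags = 'NATIVE';

-- Recompile the existing function; its source code is unchanged.
ALTER FUNCTION calc_bonus COMPILE;

-- Confirm how the function was compiled.
SELECT param_name, param_value
FROM   user_stored_settings
WHERE  object_name = 'CALC_BONUS';
```

Setting the flag back to 'INTERPRETED' and recompiling reverts the function if native compilation causes problems.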

Native Compilation

Check MetaLink for the latest patches and documents regarding NATIVE compilation. There are known bugs (unexpected features) with compiling on a client versus the server.

Error when compiling from a client:

    PLS-00923: Native compilation failed: make:spdtexmk:?

There are also issues with 32-bit versus 64-bit operating systems. On Sun 2.8, I ran into a problem that had to be fixed by setting an environment variable and bouncing the listener.

Conclusion

The ETL paradigm shift is under way. We will start seeing more ETL done inside the database as opposed to outside the database. With ETL processing done at the database level, Oracle 9i can take advantage of resource allocations using resource groups. These changes will eventually result in a simpler and more efficient ETL process at a lower cost of ownership.

Oracle 9i offers vast improvements in ETL functionality:

- Economics
- Efficiency
- Performance

Thank You

Suzanne Riddell, President
Apex Solutions, Inc.
"The Business Intelligence Source"