ETL - Oracle Database Features and PL/SQL Techniques Boyan Boev CNsys BGOUG
Agenda ETL Concepts ETL Process Development - Recommendations –Oracle Database Configuration –PL/SQL Techniques –Database Features Conclusion BGOUG
What is ETL? ETL Elements ETL Purpose ETL as a Part of Business Intelligence Solution Types of ETL Using Tools Custom PL/SQL Scripts BGOUG
ETL – Extract, Transform and Load Extract Pull the Data From the Source Transform Convert the Input Format to the Target Format Encode any Values Load Insert the Transformed Data to the Target Tables BGOUG
Agenda ETL Concepts ETL Process Development - Recommendations –Oracle Database Configuration –PL/SQL Techniques –Database Features Conclusion BGOUG
Oracle database configuration Don’t use ARCHIVE LOG Mode Reduce the Amount of I/O Activity Use NOLOGGING DML Statements ALTER TABLE … NOLOGGING; INSERT /*+ nologging*/ … Change the stage schema data model Indexes Constraints Triggers BGOUG
Agenda ETL Concepts ETL Process Development - Recommendations –Oracle Database Configuration –PL/SQL Techniques –Database Features Conclusion BGOUG
ETL – recommendations PL/SQL Techniques Bulk Binding Collection Types Pipelined Table Functions Package Variables Native Compilation BGOUG
PL/SQL – Binding Types of Binds IN-Binds OUT-Binds Bind Options Single Row Binds Bulk Binds BGOUG
Single Row Binds Cursor For Loop Example: BGOUG DECLARE CURSORexample_cursor( p_id_in NUMBER ) IS SELECT * FROM customer_action WHERE cust_id = p_id_in; v_cust_id NUMBER := 1681; BEGIN FOR rec INexample_cursor( v_cust_id ) LOOP INSERT INTO cust_action_hist(…) VALUES( … ); END LOOP; END; /
BGOUG Context Switching
BGOUG Bulk Binding IN-Binds - An Array of Values is passed to SQL Engine OUT-Binds - SQL Engine populates a PL/SQL Bind Array Context Switching once per Batch Instead of Once Per Row Performance Increase to Up to 15 Times
BGOUG Bulk Bind Operators Bulk Collect Specifying of the LIMIT clause Be Careful to handle last batch OPEN cursor … LOOP FETCH cursor BULK COLLECT INTO … LIMIT 100; EXIT WHEN cursor%NOTFOUND; END LOOP;
BGOUG Bulk Bind Operators FORALL Bulk DML Operator Not Looping Construct PL/SQL table is Referenced in the statement Handling and Reporting Exceptions VALUES OF, INDICES OF ( New in Oracle 10g ) FORALL i IN INSERT INTO emp2 VALUES (enums(i), names(i), median(sals),...);
BGOUG Collection Types Associative Arrays( PL/SQL Tables ) PL/SQL type only - not a SQL Type Easy to Use Nested Tables Shared Type
BGOUG Associative Arrays( PL/SQL Tables ) automatic element allocation no need to initialize DECLARE TYPE EmpTabTyp IS TABLE OF employees%ROWTYPE INDEX BY PLS_INTEGER; emp_tab EmpTabTyp; BEGIN SELECT * BULK COLLECT INTO emp_tab FROM employees WHERE employee_id = 100; END; /
BGOUG Nested Tables PL/SQL Example DECLARE TYPE nest_tab_t IS TABLE OF NUMBER; nt nest_tab_t := nest_tab_t(); BEGIN FOR i IN LOOP nt.EXTEND; nt(i) := i; END LOOP; END; /
BGOUG Nested Tables Global Example CREATE OR REPLACE TYPE _demo_obj_t AS OBJECT ( _idNUMBER, demo_codeNUMBER, valueVARCHAR2(30) ); / CREATE OR REPLACE TYPE _demo_nt_t AS TABLE OF _demo_obj_t; /
BGOUG Nested Tables SQL-Defined Nested Tables Sorted Aggregated Ised in Dynamic In-Lists Joined with SQL Tables Joined with other PL/SQL nested tables Nested Tables Enable Table Functions TABLE Operator otell Oracle to treat the Variable like SQL Table CAST Operator oExplicitly Tells Oracle the Data Type to be used to Handle the Operation -
BGOUG Nested Tables Global Nested tables – example: DECLARE eml_dmo_nt _demo_nt_t := _demo_nt_t(); BEGIN -- Some logic that populates the nested table … eml_dmo_nt.EXTEND(3); eml_dmo_nt(1) := _demo_obj_t(45, 3, '23'); eml_dmo_nt(2) := _demo_obj_t(22, 3, '41'); eml_dmo_nt(3) := _demo_obj_t(18, 7, 'over_100k'); -- Process the data in assending order of id. FOR r IN (SELECT * FROM TABLE(CAST(eml_dmo_nt AS _demo_nt_t)) ORDER BY 1) LOOP dbms_output.put_line(r. _id || ' ' || r.demo_id); END LOOP; END; /
BGOUG Table Functions Data is Buffered in the Local Variables During Function Execution Pipelined Table Functions –SELECT * FROM plsql_function;
BGOUG Pipelined Functions - example FUNCTION pipelined_demo( data SYS_REFCURSOR ) RETURN demo_nt_t PIPELINED IS CURSOR demo_cursor IS SELECT demo_obj_t(emp_id, value) FROM emp; emp_nt demo_nt_t; BEGIN OPEN _demo_cur; LOOP EXIT WHEN _demo_cur%NOTFOUND; FETCH _demo_cur BULK COLLECT INTO eml_dmo_nt LIMIT 1000; FOR i IN 1..eml_dmo_nt.COUNT LOOP /* Apply some business logic on the object here, and return a row. */ PIPE ROW (eml_dmo_nt(i)); END LOOP; RETURN; END;
BGOUG Package Variables What are Package Variables Globally Declared Variables in Package, outside of a Procedure or Function Definition Values set by Package Initialization Code Enable Data Caching
BGOUG Native Compilation How it works? PL/SQL First is Compiled Down To P-Code C Source Code is Generated from P-Code Native Compiler is invoked Creating a 'C' Shared Object Library Subsequent Calls to PL/SQL Object are Run by the 'C' Library NComp and Performance
Agenda ETL Concepts ETL Process Development - Recommendations –Oracle Database Configuration –PL/SQL Techniques –Database Features Conclusion BGOUG
ETL – recommendations Database Features and Utilities External Tables( 9i ) Transportable Tablespaces DML Exception Handling( 10g R2 ) Regular Expressions Support( 10g R1 ) SQL*Loader Direct Path INSERT Parallel DML Statements BGOUG
Database Features External tables File can be Queried as if it is a Real Database Table Can Sort, Aggregate, Filter Rows, etc. External File Can be Queried in Parallel Oracle 9i - Read-Only using SQL*Loader Engine Only read-only access to external files Oracle 10g - Read-Write using Data Pump Engine Can do read/write operations with external files Transportable Tablespaces Oracle 10g - transport across different platforms
BGOUG Database Features DML Exception Handling ( New in 10g R2 ) Catch the Error and Move On: Error Logging Clause Enabling: SQL> exec dbms_errlog.CREATE_ERROR_LOG ('ACCOUNTS','ERR_ACCOUNTS'); SQL> insert into accounts select * from accounts_ny log errors into err_accounts reject limit 200; SQL> select ORA_ERR_NUMBER$, ORA_ERR_MESG$, ACC_NO from err_accounts;
BGOUG Database Features Regular Expression Support ( New in 10g R1 ) SQL*Loader Direct Path Insert SQL*Loader CREATE TABLE AS SELECT … INSERT /*+ APPEND*/ … SELECT Parallel DML Statements
BGOUG Agenda ETL Concepts ETL Process Development - Recommendations –Oracle Database Configuration –PL/SQL Techniques –Database Features Conclusion
BGOUG Example create package with PIPELINED Function.... FUNCTION transform (new_data SYS_REFCURSOR) RETURN _demo_nt_t PIPELINED PARALLEL_ENABLE ( PARTITION new_data BY ANY ) IS.... Transformation is Just a Simple INSERT as SELECT Elegant solution to Parallel, Transactional Co-Processing INSERT /*+ append nologging*/ INTO... SELECT /*+ parallel(a,4)*/ FROM.. TABLE( CAST( package.pipe_func_name( select * from table_3 ) as TYPE ) a);
BGOUG Conclusion High Perofmance ETL Solution
BGOUG Recommended Readings – Sites – tmlhttp:// tml – – – – –
BGOUG Recommended Readings – Books / /qid= /sr=2-1/ref=pd_bbs_b_2_1/ ?v=glance&s=books
BGOUG Recommended Readings – Books / /ref=pd_sim_b_1/ ?%5Fencoding=UTF8&v=glance
BGOUG Recommended Readings – Books / /qid= /sr=2- 2/ref=pd_bbs_b_2_2/ ?v=glance&s=books
BGOUG Questions or Comments Boyan Boev