Pipelined Table Functions and Transformation Pipelines to Replace Traditional Load Operations
Tim Hall, Oracle ACE Director; Oracle ACE of the Year 2006; OCP DBA (7, 8, 8i, 9i, 10g, 11g); OCA PL/SQL Developer; author of Oracle PL/SQL Tuning (Rampant) and Oracle Job Scheduling (Rampant)
Table functions and pipelining
What are table functions? Pipelining table functions. Parallel enabled table functions. Creating transformation pipelines. Miscellaneous facts.
What are table functions?
Functions that return collections are known as table functions (tf_setup.sql). In combination with the TABLE collection expression, they can be used in the FROM clause of a query like a regular table (tf_test.sql). For regular table functions, the collection must be defined using database OBJECT types.
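As a sketch of what setup scripts like tf_setup.sql typically contain (the type, function, and column names here are illustrative, not the actual script contents), a regular table function looks like this:

```sql
-- Collection types must be declared at schema level as database OBJECT types.
CREATE TYPE t_tf_row AS OBJECT (
  id          NUMBER,
  description VARCHAR2(50)
);
/

CREATE TYPE t_tf_tab IS TABLE OF t_tf_row;
/

-- A regular table function builds the whole collection before returning it.
CREATE OR REPLACE FUNCTION get_tab (p_rows IN NUMBER) RETURN t_tf_tab AS
  l_tab t_tf_tab := t_tf_tab();
BEGIN
  FOR i IN 1 .. p_rows LOOP
    l_tab.EXTEND;
    l_tab(l_tab.LAST) := t_tf_row(i, 'Description for ' || i);
  END LOOP;
  RETURN l_tab;
END;
/

-- Queried via the TABLE collection expression, like a regular table.
SELECT * FROM TABLE(get_tab(10));
```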
Pipelining table functions
A table function builds the entire collection before returning any data, while a pipelined table function "pipes" each row out as soon as it is created (ptf_schema.sql, ptf_package.sql, ptf_query.sql). Since the collection is never fully resident in memory, pipelining can produce a considerable memory saving (memory_usage.sql). Since Oracle 9i Release 2, the types used to define a pipelined table function can be declared implicitly in a package, but this method can cause management problems, so I prefer the explicit method (implicit_types.sql).
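A minimal sketch of the pipelined form (type and function names are illustrative, not the contents of ptf_schema.sql or ptf_package.sql): the differences from a regular table function are the PIPELINED keyword and the PIPE ROW calls.

```sql
-- Schema-level types, as for a regular table function.
CREATE TYPE t_ptf_row AS OBJECT (id NUMBER, description VARCHAR2(50));
/
CREATE TYPE t_ptf_tab IS TABLE OF t_ptf_row;
/

-- Each PIPE ROW sends one row to the consumer immediately, so the
-- complete collection is never held in memory.
CREATE OR REPLACE FUNCTION get_tab_pipelined (p_rows IN NUMBER)
  RETURN t_ptf_tab PIPELINED AS
BEGIN
  FOR i IN 1 .. p_rows LOOP
    PIPE ROW (t_ptf_row(i, 'Description for ' || i));
  END LOOP;
  RETURN;  -- returns no value; every row was already piped out
END;
/

SELECT * FROM TABLE(get_tab_pipelined(10));
```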
Parallel enabled table functions
[Diagram: in the single-process case, one process handles all 1,000,000 rows; in the parallel case, a master process distributes the 1,000,000 rows across four slave processes, each handling 250,000.] Parallel enabled table functions can improve performance by dividing the workload between slave processes.
Parallel enabled table functions
To parallel enable a table function, you must: Include the PARALLEL_ENABLE clause. Have a single REF CURSOR as an input parameter. Use the PARTITION BY clause to define a partitioning method for the workload. Weakly typed REF CURSORs can only use the PARTITION BY ANY clause, which randomly partitions the workload (ptf_syntax.sql). Let's see it in action (parallel_setup.sql, parallel_tf.sql). Running multiple slaves and coordinating the parallel operation takes extra resources, so it may not be beneficial for every type of operation.
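The three requirements above can be sketched as follows. The types, the parallel_test table, and its columns are hypothetical stand-ins, not the actual parallel_setup.sql objects:

```sql
CREATE TYPE t_par_row AS OBJECT (id NUMBER, country VARCHAR2(3));
/
CREATE TYPE t_par_tab IS TABLE OF t_par_row;
/

-- PARALLEL_ENABLE with a single REF CURSOR parameter; a weakly typed
-- SYS_REFCURSOR can only use PARTITION ... BY ANY (random distribution).
CREATE OR REPLACE FUNCTION parallel_ptf (p_cursor IN SYS_REFCURSOR)
  RETURN t_par_tab PIPELINED
  PARALLEL_ENABLE (PARTITION p_cursor BY ANY) AS
  l_id      NUMBER;
  l_country VARCHAR2(3);
BEGIN
  LOOP
    FETCH p_cursor INTO l_id, l_country;
    EXIT WHEN p_cursor%NOTFOUND;
    PIPE ROW (t_par_row(l_id, l_country));
  END LOOP;
  RETURN;
END;
/

-- A CURSOR expression feeds the function; the parallel degree of the
-- source (or a PARALLEL hint) drives how many slaves are used.
SELECT *
FROM   TABLE(parallel_ptf(CURSOR(
         SELECT /*+ PARALLEL(pt, 4) */ id, country
         FROM   parallel_test pt)));
```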
Ordering and clustering PTFs
The ORDER BY and CLUSTER BY clauses can be used to alter the order of the data entering each instance of the parallel enabled table function, such that: When neither the ORDER BY nor the CLUSTER BY clause is specified, the data entry is random. The ORDER BY clause orders the data by the specified column. The CLUSTER BY clause groups data with the same values together, but does not order the data between values. For example, given the workload ENG, ENG, ENG, IRE, SCO, SCO spread over three slaves: with random entry one slave might receive ENG, SCO and another ENG, IRE; with ORDER one slave might receive ENG, ENG and another SCO, SCO; with CLUSTER all three ENG entries land on one slave and IRE on another. This is difficult to demo due to the volume of data needed, but the following test does work, provided you are willing to search through the data (parallel_order.sql).
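Syntactically, ordering and clustering require a strongly typed REF CURSOR so the database knows the columns at compile time. A sketch, reusing hypothetical names (parallel_test and its columns are illustrative, not the parallel_order.sql objects):

```sql
CREATE TYPE t_clu_row AS OBJECT (id NUMBER, country VARCHAR2(3));
/
CREATE TYPE t_clu_tab IS TABLE OF t_clu_row;
/

-- A strongly typed REF CURSOR, declared in a package specification.
CREATE OR REPLACE PACKAGE ptf_types AS
  TYPE t_country_rc IS REF CURSOR RETURN parallel_test%ROWTYPE;
END;
/

-- CLUSTER ... BY keeps rows with the same country on the same slave;
-- swap in ORDER p_cursor BY (country) to fully order each slave's input.
CREATE OR REPLACE FUNCTION clustered_ptf (p_cursor IN ptf_types.t_country_rc)
  RETURN t_clu_tab PIPELINED
  PARALLEL_ENABLE (PARTITION p_cursor BY HASH (country))
  CLUSTER p_cursor BY (country) AS
  l_row parallel_test%ROWTYPE;
BEGIN
  LOOP
    FETCH p_cursor INTO l_row;
    EXIT WHEN p_cursor%NOTFOUND;
    PIPE ROW (t_clu_row(l_row.id, l_row.country));
  END LOOP;
  RETURN;
END;
/
```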
Creating transformation pipelines
Data loads and Extraction Transformation Load (ETL) processes have traditionally relied on loading data from flat files into staging tables, where it is processed before being loaded into the main schema tables. Pipelined table functions are often discussed as a replacement for these traditional ETL processes, since they can be strung together to perform complex transformations in a single load operation (tp_file.sql, tp_schema.sql).
Creating transformation pipelines
[Diagram: Source File (tp_test.txt) → External Table (tp_test_ext) → t_step_1_in_rc → Step 1 (tp_api.step_1) → t_step_1_out_tab → t_step_2_in_rc → Step 2 (tp_api.step_2) → t_step_2_out_tab → Destination Table (tp_test)]
Creating transformation pipelines
Two pipelined table functions are defined to perform the transformation steps, along with a procedure to initiate the load process (tp_api.sql, tp_api_test.sql). Each step is conceptually quite simple, but stringing them together can look complex. The main advantage is that complex loads can be performed in a single DML statement, rather than requiring many table inserts and updates for data staging.
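The nesting shape of that single DML statement can be sketched as below, using the object names from the pipeline diagram (the actual column lists and function signatures live in tp_api.sql; this only shows how the steps chain):

```sql
-- One statement: rows stream from the external table, through the
-- step_1 pipelined function, into step_2, and on into the destination
-- table, with no intermediate staging tables.
INSERT INTO tp_test
SELECT *
FROM   TABLE(tp_api.step_2(CURSOR(
         SELECT *
         FROM   TABLE(tp_api.step_1(CURSOR(
                  SELECT * FROM tp_test_ext))))));
```

Each CURSOR expression turns the inner query into the REF CURSOR parameter of the next pipelined function, which is what lets the steps compose.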
Miscellaneous facts Table functions can only contain DML statements if they are defined with the AUTONOMOUS_TRANSACTION pragma, or the DML is itself wrapped in a procedure call that is an autonomous transaction. DML statements cannot be executed against table functions, but if the table function is incorporated into a view, the view can have INSTEAD OF triggers defined against it. Exception handling is the same for table functions as it is for other PL/SQL functions, such that any unhandled exceptions are propagated back to the calling PL/SQL or SQL.
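The autonomous-transaction point can be sketched as follows (the tf_log table and all names here are hypothetical): without the pragma, a table function containing DML raises ORA-14551 when called from a query.

```sql
-- Hypothetical log table and types for the example.
CREATE TABLE tf_log (msg VARCHAR2(100), logged TIMESTAMP DEFAULT SYSTIMESTAMP);

CREATE TYPE t_log_row AS OBJECT (id NUMBER);
/
CREATE TYPE t_log_tab IS TABLE OF t_log_row;
/

-- The pragma runs the DML in its own transaction, independent of the
-- query that calls the function.
CREATE OR REPLACE FUNCTION logging_ptf RETURN t_log_tab PIPELINED AS
  PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
  INSERT INTO tf_log (msg) VALUES ('logging_ptf called');
  COMMIT;  -- an autonomous transaction must commit or roll back
  PIPE ROW (t_log_row(1));
  RETURN;
END;
/

SELECT * FROM TABLE(logging_ptf);
```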
Summary The definition of table functions and pipelined table functions. The performance benefits of pipelined table functions in comparison to regular table functions. The use of parallel enabled pipelined table functions, which share the total workload between multiple slave processes. Creating transformation pipelines to replace traditional Extraction Transformation Load (ETL) processes.