1
Performance Tuning ETL Process
Mark McNeely
2
Test Yourself: “Matching Game”
3
Component Matching Answers
4
Source Systems: E-Business Suite R12, PeopleSoft Enterprise, Siebel CRM, JD Edwards
Source Systems -> Extract -> Staging -> Transformation -> Delivery -> End-User
5
DAC ETL Scheduler
6
Source System Stats
What: gathers important information such as read times for single-block and multiblock reads, CPU speed, and other system throughput figures.
Why: before a query is executed, the optimizer calculates its cost. Without system stats, full-table scans and index scans are evaluated as equivalent.
Remember to gather stats while the system is busy, so that they reflect a real workload.
7
SQL Trace files
SQL trace files record:
Parse, execute, and fetch counts
CPU and elapsed times
Physical reads and logical reads
Number of rows processed
Misses on the library cache
Username under which each parse occurred
Each commit and rollback
8
TKPROF
You can run the TKPROF program to format the contents of the trace file and write the result to a readable output file.
9
Explain Plan
Explain Plan shows the sequence of operations performed to run a SQL query. It tells you how tables are joined and which indexes are used.
10
SDE vs. SIL tasks
11
DAC Details
12
Informatica Workflow Manager
13
ETL Run
15
Informatica Workflow Monitor
16
Informatica Session Log
17
Session Log usage
Busy % = (Total Run Time - Total Idle Time) / Total Run Time × 100
If Busy % is above 70–80% for the READER thread, review the Source Qualifier.
If Busy % is above 60–70% for the TRANSF thread, review the transformations.
If Busy % is high for the WRITER thread, review the target load type (bulk mode).
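The Busy % arithmetic can be worked through in a few lines of Python. This is only a sketch: the function names and sample timings are made up for illustration, the READER and TRANSF thresholds take the lower end of the ranges on the slide, and the WRITER threshold is an assumed placeholder because the slide only says "high".

```python
# Thresholds in percent. READER and TRANSF use the lower bound of the
# ranges quoted on the slide; the WRITER value is an assumption.
THRESHOLDS = {"READER": 70.0, "TRANSF": 60.0, "WRITER": 70.0}


def busy_percent(total_run_time: float, total_idle_time: float) -> float:
    """Busy % = (Total Run Time - Total Idle Time) / Total Run Time * 100."""
    if total_run_time <= 0:
        raise ValueError("total_run_time must be positive")
    return (total_run_time - total_idle_time) / total_run_time * 100.0


def needs_review(thread: str, busy: float) -> bool:
    """Return True when a thread's Busy % exceeds its threshold."""
    return busy > THRESHOLDS[thread]


if __name__ == "__main__":
    # Hypothetical reader thread: 120 s total run time, 30 s idle.
    busy = busy_percent(120.0, 30.0)
    print(busy)                          # 75.0
    print(needs_review("READER", busy))  # True
```

With these sample numbers the reader thread is busy 75% of the time, so the Source Qualifier would be the first place to look.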
18
Hash Joins vs. Nested Loops
The optimizer often chooses nested loops because it estimates a lower cost for them. Nested loops do return the first rows sooner, but for large volumes (over 10 million rows), use a USE_HASH hint to make the optimizer use a hash join. I’ve shaved a couple of hours off a poor performer this way.
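To see why the two join methods behave so differently at volume, here is a conceptual Python sketch of each. It is not how the database implements them (a real nested loop usually probes an index on the inner table, for example), and the function names and sample rows are hypothetical.

```python
from collections import defaultdict


def nested_loop_join(outer, inner, key):
    """For each outer row, scan every inner row looking for a match.

    The first match can be returned almost immediately, but the total
    work grows with len(outer) * len(inner).
    """
    for o in outer:
        for i in inner:
            if o[key] == i[key]:
                yield {**o, **i}


def hash_join(build, probe, key):
    """Build a hash table on one input once, then probe it row by row.

    Nothing is returned until the build phase finishes, but the total
    work grows with len(build) + len(probe).
    """
    table = defaultdict(list)
    for b in build:
        table[b[key]].append(b)
    for p in probe:
        for b in table.get(p[key], []):
            yield {**b, **p}


if __name__ == "__main__":
    # Hypothetical sample rows.
    customers = [{"cust_id": 1, "name": "A"}, {"cust_id": 2, "name": "B"}]
    orders = [
        {"cust_id": 1, "order_id": 10},
        {"cust_id": 2, "order_id": 11},
        {"cust_id": 1, "order_id": 12},
        {"cust_id": 3, "order_id": 13},  # no matching customer
    ]
    for row in hash_join(customers, orders, "cust_id"):
        print(row)
```

Both functions return the same rows. The difference is the amount of work: with millions of rows on each side, the row-by-row scanning of the nested loop dominates, which is the situation the USE_HASH hint is meant to avoid.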
19
Partitioning Guidelines for large tables
Consider partitioning tables with more than 20 million rows. Find a reasonable partition key, for example year. A couple of advantages: improved query performance and quicker ETL loads.
20
Source System -> Extract -> Staging -> Transformation -> Delivery -> End-User