307: BW Performance Tuning - Queries / Data Loads
Ramesh Sampath, Mike Maddox, Suresh Kandoor, VJ Sudarsan
http://www.codongroup.com
Objectives
To provide a comprehensive understanding of the factors affecting data load and query performance, and to discuss strategies that help identify, monitor and resolve performance issues.
Agenda
- Why Load Performance & Query Run-Times are important
- Query Performance
  - Identifying long-running queries
  - Query tuning techniques
- Data Load Performance
  - Identifying long-running loads
  - Improving data load times
- Questions
Why Query Run-Times?
- More reporting functions are moving from R/3 to BW
- User frustration with the data warehouse promise
- Impacts the number of analyses performed
- User productivity is affected by query run-times
- Quick response times reduce the number of concurrent users
- Affects the bottom line!
Query Performance
- Identifying and isolating the slow queries
  - Using BW Statistics
  - Using SAP transactions
- Exploring possible solutions to achieve better query run-times
Identifying long-running queries
Using BW Statistics
[Figure: average run-time of query in seconds]
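As a rough illustration of the analysis behind this view, the sketch below averages run-times per query and lists the queries above a limit. The record layout (query name, run-time in seconds), the sample query names and the 30-second limit are assumptions made for illustration; they are not the BW Statistics data model.

```python
from collections import defaultdict


def slow_queries(records, limit_seconds=30.0):
    """Return (query, average run-time) pairs above the limit, slowest first."""
    totals = defaultdict(lambda: [0.0, 0])  # query -> [sum of seconds, executions]
    for query, seconds in records:
        totals[query][0] += seconds
        totals[query][1] += 1
    averages = {q: s / n for q, (s, n) in totals.items()}
    slow = [(q, avg) for q, avg in averages.items() if avg > limit_seconds]
    return sorted(slow, key=lambda item: item[1], reverse=True)


# Hypothetical sample: (query name, run-time in seconds)
sample = [
    ("SALES_OVERVIEW", 12.0), ("SALES_OVERVIEW", 18.0),
    ("OPEN_ORDERS", 95.0), ("OPEN_ORDERS", 65.0),
    ("STOCK_DETAIL", 40.0),
]
print(slow_queries(sample))  # [('OPEN_ORDERS', 80.0), ('STOCK_DETAIL', 40.0)]
```

Ranking by average rather than by a single worst execution keeps one-off outliers from dominating the list.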
Identifying long-running queries
Using SAP transactions
- ST22: query run-time exceeded the limit set for queries
Identifying long-running queries
Using SAP transactions
- Analyze SM50 and SM66 to identify real-time query runs and the users affected
Query Performance
Query Tuning Techniques
- Analyze the query plan
- Identify degenerated indexes or incorrect statistics
- Design considerations (data modeling)
- Queries on InfoCube and ODS
- Selections on hierarchies
- Do's and don'ts in query building
Query Tuning Techniques
Analyze Query Plan (RSRT)
- Check the read mode of queries
- Displays the list of aggregates applicable to the query based on its selection
- Displays the query plan
Query Tuning Techniques
Specify Database Hints
- Use hints on index selection to observe the change to the execution plan and query performance
Query Tuning Techniques
Specify Database Hints
- Query cost estimate
- Use hints on index selection to observe the change to the execution plan and query performance
- Check the generated SQL to identify secondary indexes on dimension tables and master data tables
Query Tuning Techniques
- A full table scan on the fact table may lead to bad performance
- Are the database statistics accurate?
- Use hints on index selection to observe the change to the execution plan and query performance
Query Tuning Techniques
Identify degenerated indexes or incorrect statistics (RSRV)
- Check DB parameter settings
- Check degenerated indexes
- Check database statistics
Query Tuning Techniques
Transaction RSRV in BW 3.X
- Check DB parameter settings
Query Tuning Techniques
Design considerations for:
- Dimension tables
- Fact tables
- Aggregates
- ODS objects
- Master data
Query Tuning Techniques
Design Considerations for Dimension Tables:
- Smaller dimension tables: dimension table size < 5% of the fact table
- Avoid M:N relationships
- Logical grouping of objects in a dimension
- Fewer dimension tables per cube
- Avoid near line-item dimensions
- For large dimension tables, change the index type from Bitmap (default) to B-tree
- Create secondary indexes on dimension tables
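The 5% guideline can be checked mechanically once row counts are known. The sketch below is a minimal illustration; the table names and row counts are invented, and how the counts are obtained (for example from database statistics) is left open.

```python
def oversized_dimensions(fact_rows, dimension_rows, max_ratio=0.05):
    """Return dimension tables whose row count is not below max_ratio of the fact table."""
    return sorted(
        name for name, rows in dimension_rows.items()
        if rows >= max_ratio * fact_rows
    )


# Hypothetical row counts for one cube
dimensions = {"CUSTOMER": 300_000, "DOCUMENT": 4_000_000, "TIME": 1_000}
print(oversized_dimensions(10_000_000, dimensions))  # ['DOCUMENT']
```

A dimension flagged this way is a candidate for regrouping its objects or for the B-tree index change listed above.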
Query Tuning Techniques
Design Considerations for Fact Tables:
- Partition large fact tables (over 10 million records)
- Create physical partitioning (one million records per partition)
- Compression
- Aggregates
- Avoid virtual key figures
- Avoid exception aggregation
Query Tuning Techniques
Logical Partitioning of Cubes
- Query against a MultiProvider: consolidated view of all data
- The power of parallel processing: parallel SELECT statements
- Sub-queries run against small structures (pruning)
[Diagram: MultiProvider over cubes partitioned by region (Europe, America, Asia) and year (2000, 2001, 2002)]
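The idea of pruning plus parallel sub-queries can be sketched outside BW. In the illustration below, the part cubes, their contents and the function names are all hypothetical; it only shows the pattern of skipping irrelevant partitions and combining the partial results of parallel sub-queries.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical part cubes keyed by year, each holding (region, revenue) rows.
PART_CUBES = {
    2000: [("Europe", 100), ("Asia", 40)],
    2001: [("Europe", 120), ("America", 80)],
    2002: [("Asia", 60), ("America", 90)],
}


def query_part(rows, region):
    """Sub-query against one small structure."""
    return sum(revenue for r, revenue in rows if r == region)


def multiprovider_query(region, years):
    """Prune to the requested years, run the sub-queries in parallel, then combine."""
    relevant = [PART_CUBES[y] for y in years if y in PART_CUBES]  # pruning
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda rows: query_part(rows, region), relevant))
    return sum(partials)


print(multiprovider_query("Europe", [2000, 2001]))  # 220
```

The 2002 cube is never touched by this call, which is the benefit a time restriction in the query brings on a logically partitioned model.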
Query Tuning Techniques
Design Considerations for Aggregates:
- Ratio of records selected from the DB to records reported > 20:1
- Best on delta-capable cubes
- Use navigational attributes in aggregates (pro / con)
  - Avoid adding frequently changing attributes
- Create aggregates on hierarchy nodes
- Create a base aggregate and build summarized aggregates from it
- Avoid virtual key figures / virtual characteristics
- Avoid "Before Aggregation" formulas in queries
- Split the delta-capable data from full re-loads to create aggregates on delta-capable cubes
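The 20:1 guideline is a simple ratio test. The sketch below assumes the two record counts are already available for a query (for example from query statistics); the function name and the sample figures are illustrative only.

```python
def aggregate_candidate(rows_selected, rows_reported, min_ratio=20.0):
    """True when the DB reads more than min_ratio records per record reported."""
    if rows_reported <= 0:
        return False  # nothing reported, so the ratio is undefined
    return rows_selected / rows_reported > min_ratio


print(aggregate_candidate(500_000, 1_000))  # True, ratio 500:1
print(aggregate_candidate(10_000, 1_000))   # False, ratio 10:1
```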
Query Tuning Techniques
Design Considerations for ODS Objects:
- Create secondary indexes
- Physical partitioning using DB tools
- Locally managed indexes on a partitioned ODS
- Use a MultiProvider to split the data
- Create summarized cubes
  - Run high-level analysis on cubes; for detailed analysis, run ODS queries or jump to them
- Avoid compounded InfoObjects in the ODS: inefficient filtering by SAP
- Index on master data P & Q tables
Query Tuning Techniques
Design Considerations for Master Data:
- Additional overhead with time-dependent master data and hierarchies
- Hierarchy vs. navigational attributes: navigational attributes perform better than hierarchies
- Create additional indexes on the master data attributes that are used in query selections
  - X, Y tables are used in cube-based query filters
  - X, Y, P, Q tables are used in ODS-based query filters
- Create secondary indexes on master data tables based on their usage
Query Tuning Techniques
Query Selection on Hierarchies
- SAP generates temporary tables when a query selects on hierarchy nodes
- Solution: flatten the hierarchy nodes into attributes and select on them
[Screenshot callout: HIERNODE]
Query Tuning Techniques
Do's and Don'ts in Query Building:
- Avoid generic queries
- Break down queries by summary-level and detail-level analysis
- Always have mandatory variables on the time component
  - Take advantage of partition pruning
- Limit the use of hierarchies in query selections; use them extensively in query output
- Use trailing wildcards in query selections so that an index can be used
- Create indexes on master data and dimension table objects used in query selections
- Limit the use of 'Before Aggregation' formulas that provide high-level summarized information and detailed information
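Why trailing wildcards are index-friendly can be shown with a sorted list standing in for an index. The sketch below is an analogy only, not database code; the material numbers are invented.

```python
from bisect import bisect_left


def prefix_search(sorted_values, prefix):
    """Trailing wildcard ('PUMP*'): binary search finds the start of the matching range."""
    start = bisect_left(sorted_values, prefix)
    matches = []
    for value in sorted_values[start:]:
        if not value.startswith(prefix):
            break  # sorted order guarantees no later value can match
        matches.append(value)
    return matches


def suffix_search(values, suffix):
    """Leading wildcard ('*-100'): sort order does not help, every value is inspected."""
    return [v for v in values if v.endswith(suffix)]


# Hypothetical material numbers, sorted like an index on the column
materials = sorted(["PUMP-100", "PUMP-200", "VALVE-100", "VALVE-300", "SEAL-100"])
print(prefix_search(materials, "PUMP"))   # ['PUMP-100', 'PUMP-200']
print(suffix_search(materials, "-100"))   # ['PUMP-100', 'SEAL-100', 'VALVE-100']
```

The prefix search stops as soon as the prefix no longer matches, while the suffix search has to look at every entry, which corresponds to a full scan.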
Query Tuning Techniques
Summary of Query Enhancing Techniques
- Aggregates
- Compression
- Partitioning
- Secondary indexes
- Accurate database statistics
- MultiProviders
- Data class / tablespace assignment for cubes / ODS objects and indexes
Why Load Performance?
- With BW evolving into the corporate data warehouse, more functional areas (Logistics / Financials / HR / ...) are being serviced by BW
- Small window to load large amounts of data
- Shrinking data load batch windows
- Requests for more frequent data loads: middle-of-day loads, hourly loads, ...
Data Load Performance
- Identifying and isolating load performance problems:
  - Using BW Statistics
  - Using SAP standard transactions
- Exploring possible solutions to improve load run-times
Identifying long-running loads
BW Statistics: load times by data target
[Figure: record count and load time per data target]
Identifying long-running loads
Detailed analysis for InfoCube loads
[Figure callouts: number of records processed; slow update rules; slow inserts to cube]
Identifying long-running loads
BW Statistics: load times by InfoSource
[Figure: load time and record count per InfoSource]
Identifying long-running loads
Detailed analysis to identify the cause on all loads (cube / ODS / master data)
[Figure callouts: slow transfer rules; slow update rules; number of records processed; slow R/3 extractor]
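The detailed analysis comes down to finding which stage takes the largest share of the load time. The sketch below assumes the per-stage times have already been read from the statistics; the stage names and figures are illustrative.

```python
def slowest_stage(stage_seconds):
    """Return the stage with the largest share of the load time, and that share."""
    total = sum(stage_seconds.values())
    if total <= 0:
        raise ValueError("no load time recorded")
    stage = max(stage_seconds, key=stage_seconds.get)
    return stage, stage_seconds[stage] / total


# Hypothetical breakdown of one load, in seconds
load = {
    "extractor": 120.0,
    "transfer rules": 60.0,
    "update rules": 540.0,
    "inserts to cube": 180.0,
}
stage, share = slowest_stage(load)
print(f"{stage}: {share:.0%} of load time")  # update rules: 60% of load time
```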
Identifying long-running loads
- SM37 on the BW system: 'BI_ODSA*' jobs for ODS activation times
- SM50, SM66: process overview
- ST04: database analysis overview
Identifying long-running loads
Summary of factors affecting data loads
- Inefficient extract programs
  - Inefficient logic in user exits
- Slow Data Transformation Services
  - Transfer rules
  - Update rules
- Slow data loading services
  - Loads into InfoCubes
  - Loads into ODS objects
Improving Data Load Times
Strategies to improve Load Performance:
- Extractor performance
- InfoCube data loads
- Data Transformation Services (DTS)
- ODS data loads
- Master data loads
- Flat file data loads
Improving Data Load Times
Strategies to improve Load Performance: (Contd.)
- Perform run-time analysis on the extractor to identify processing times by ABAP logic and database selects
- Check SELECT statements on large R/3 tables
  - Ensure that selects are based on primary keys or secondary indexes
- Check the usage of run-time ABAP memory by internal tables
- Schedule set-up jobs in parallel
Improving Data Load Times
Everything has been checked, but the extract is still slow?
- Do the selection parameters entered in the InfoPackage facilitate the use of indexes?
[Screenshot callout: selection parameters]
Improving Data Load Times
Strategies to improve InfoCube Data Loads:
- Load master data before transaction data to pre-create SIDs
- Number range buffering on large or near line-item dimensions for large data loads
- Packet size: reduce the number of records per data packet
- Secondary indexes: drop indexes on complete re-loads or when loading a significant amount of data (over 25% of existing cube data)
- 'Turn off' archival logs before large initial loads
- On a complete re-load of a cube, do not delete dimension table entries if you do not anticipate any changes to them
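The index guideline above can be written as a small decision helper. This is a sketch only: the function name is made up, and treating an empty cube like a complete re-load is an assumption added here, not a rule from the slides.

```python
def drop_indexes_before_load(existing_rows, incoming_rows,
                             full_reload=False, threshold=0.25):
    """Decide whether to drop secondary indexes before loading into a cube."""
    if full_reload or existing_rows == 0:
        return True  # complete re-load (empty cube treated the same: assumption)
    return incoming_rows / existing_rows > threshold


print(drop_indexes_before_load(8_000_000, 3_000_000))  # True, 37.5% of cube data
print(drop_indexes_before_load(8_000_000, 200_000))    # False, incremental load
```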
Improving Data Load Times
Strategies to improve InfoCube Data Loads: (Contd.)
- Incremental data loads
  - Do not drop secondary indexes on incremental loads
  - Do not select 'Refresh statistics' after the load
Improving Data Load Times
Strategies to improve DTS
- Eliminate single selects in update routines and transfer routines
- Utilize start routines
- Avoid transfer rules to enhance array inserts
Utilize start routines (data packet level)
- A start routine runs once per data packet. Loop over the data packet and update all relevant data fields at once.
- There will still be scenarios where you need update routines, such as calculations and reversals, but they should not impact load performance.
- Resist the temptation to write SELECT statements in transfer routines. Depending on the size of, and the indexes on, the table you select from, they perform very badly, because the select is executed for every data record sent from the source system.
Data integrity when writing DB selects in update / transfer rules
- Are you selecting from another ODS in update rules to ODS / cubes?
- In R/3, all related tables are updated as part of one LUW, so code in the R/3 BW DataSource user exits always gets the correct information.
- In BW, the data is loaded independently, and routines should not impose dependencies between data loads. Doing so could lead to data integrity problems in the run-maintain environment.
Examine update routines (InfoObject level)
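The difference between a per-record select and a start routine can be made visible by counting database calls. The sketch below uses Python rather than ABAP and a stand-in table class, so every name in it is hypothetical; it only illustrates the one-select-per-packet pattern.

```python
class LookupTable:
    """Stand-in for a database table that counts how many selects it receives."""

    def __init__(self, rows):
        self._rows = rows
        self.selects = 0

    def select_single(self, key):
        self.selects += 1
        return self._rows.get(key)

    def select_many(self, keys):
        self.selects += 1
        return {k: self._rows[k] for k in keys if k in self._rows}


def per_record_routine(packet, table):
    """Update-routine style: one select for every record in the data packet."""
    for record in packet:
        record["region"] = table.select_single(record["customer"])
    return packet


def start_routine(packet, table):
    """Start-routine style: one select per data packet, then an in-memory loop."""
    lookup = table.select_many({r["customer"] for r in packet})
    for record in packet:
        record["region"] = lookup.get(record["customer"])
    return packet


def make_packet():
    return [{"customer": c} for c in ("C1", "C2", "C1", "C9")]


rows = {"C1": "Europe", "C2": "Asia"}
slow_table, fast_table = LookupTable(rows), LookupTable(rows)
slow = per_record_routine(make_packet(), slow_table)
fast = start_routine(make_packet(), fast_table)
print(slow == fast, slow_table.selects, fast_table.selects)  # True 4 1
```

Both routines produce the same records; the per-record version issues one select per record, while the start-routine version issues one per data packet regardless of its size.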
Improving Data Load Times
Strategies to improve ODS Data Loads
- 'Turn off' BEx reporting when the ODS is not intended to be used in BEx queries. The ODS can still be used in InfoSets, and in MultiProviders through InfoSets.
- Mark line-item characteristics, such as document number, as attributes only
- Large initial loads to ODS
  - 'Turn off' database archival logs
  - Delete secondary indexes
- First-time load to ODS (ODS is empty)
  - Load up to 1 million records, but do not exceed it
  - Activate after the first load to facilitate bulk inserts
  - Adjust the data packet size so that the request contains < 10 data packets. SAP has hard-coded logic that checks whether the data packet size is < 10% of the total request to decide between processing in memory and single selects.
- Why this difference? SAP does bulk processing vs. single-record processing based on record counts when the ODS is empty.
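Keeping a request below 10 data packets is a matter of choosing a large enough packet size. The sketch below works in records per packet for simplicity, which is an assumption; it is not a description of how BW sizes packets internally.

```python
def min_packet_size(total_records, max_packets=9):
    """Smallest records-per-packet that keeps the request at max_packets or fewer."""
    if total_records <= 0:
        raise ValueError("total_records must be positive")
    return (total_records + max_packets - 1) // max_packets  # ceiling division


def packet_count(total_records, packet_size):
    """Number of data packets a request splits into at the given packet size."""
    return (total_records + packet_size - 1) // packet_size


size = min_packet_size(1_000_000)
print(size, packet_count(1_000_000, size))  # 111112 9
```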
Improving Data Load Times
Activating data in ODS objects
[Screenshot: settings in SPRO]
Improving Data Load Times
Strategies to improve subsequent data loads to ODS:
- Activation: single-record inserts vs. mass insert when the ODS is empty
- Enable 'Unique records' if true: bulk inserts
- Data packet size (< 10 data packets per request)
- Eliminate unused indexes
- Focus on smaller data loads, with fewer records per activation, to reduce the commit interval
[Callout: New in BW 3.X]
Improving Data Load Times
Why does deleting an active request from an ODS take a long time, even for a few thousand records?
- Database partitioning on the active table
- No index on the ODS primary key in the ODS change log table
- Add an index on the ODS primary key to the change log table (SE11)
Improving Data Load Times
Strategies to improve Master Data Loads:
- Smaller data packet size to minimize commit intervals
- Convert full loads to delta extracts based on date selection in the extract tables
- Create accurate DB statistics
- 'Turn off' the consistency check. It is useful to switch on after data load errors to identify the row numbers that have incorrect data, but it has a performance impact.
Improving Data Load Times
Strategies to improve Flat File Data Loads:
- Use logical file names / application server files
  - Run in background batch mode
- Break large files into smaller files and load them in parallel
- Use fixed-length files in place of CSV files
  - SAP converts CSV file inputs to fixed length before it starts processing the file
  - A CSV file is easier to view and analyze; change to fixed length only if it improves performance significantly
- Maximize parallel loads
  - Increase the number of batch sessions at peak load times to run several data load jobs in parallel
  - Set up as a job triggered by the start and end events of data loads, or integrate into data load process chains
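Splitting a large file and loading the pieces in parallel can be sketched as follows. The fixed-length record layout (10-character key, 8-character amount), the sample records and the use of threads as stand-ins for parallel load jobs are all assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_lines(lines, parts):
    """Split a list of records into at most `parts` contiguous chunks of similar size."""
    size = max(1, -(-len(lines) // parts))  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def load_chunk(chunk):
    """Stand-in for one load job: parse fixed-length records and total the amounts."""
    return sum(int(line[10:18]) for line in chunk)


def parallel_load(lines, parts=4):
    """Load the chunks in parallel and combine the per-chunk totals."""
    chunks = split_lines(lines, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        return sum(pool.map(load_chunk, chunks))


# Hypothetical fixed-length file: 10-character key followed by 8-character amount
records = [f"{'MAT' + str(i):<10}{i * 10:>8}" for i in range(1, 11)]
print(parallel_load(records))  # 550
```

Fixed-length records can be sliced by position, as in `load_chunk`, without the delimiter parsing a CSV file needs.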
Improving Data Load Times
Summary of Data Load Enhancing Techniques
- Extractor performance on the source system
- InfoCube data loads (initial / delta loads)
- Data Transformation Services (transfer rules, update rules)
- ODS data loads (initial, subsequent loads)
- Master data loads
- Flat file loads
Questions?
- Ramesh Sampath: rsampath@codongroup.com
- VJ Sudharsan: vssood@codongroup.com
- Mike Maddox: mike.maddox@shell.com
- Suresh Kandoor: suresh@codongroup.com
http://www.codongroup.com
Thank you for attending! Please remember to complete and return your evaluation form following this session. Session Code: [307]