Enterprise Data Management Optimization Dr. Boris Zibitsker BEZ Systems St. Louis CMG.




1 Enterprise Data Management Optimization Dr. Boris Zibitsker BEZ Systems boris@bez.com www.bez.com St. Louis CMG

2 Outline
• Enterprise Data Management with Moving Target
• Enterprise Data Management Options and Tradeoffs
• Role of Modeling in EDM Optimization
  – How to use performance prediction models to evaluate and justify enterprise data management alternatives, set performance expectations, verify results and organize a continuous proactive EDM process
• Examples Illustrating the Best Practice of EDM Proactive Performance Management During Application and Information Life Cycle
• Applying Modeling for Optimizing EDM Strategic Decisions
  – How to justify enterprise data warehouse
  – How to justify master data management
• Applying Modeling for Optimizing EDM Tactical Decisions
  – How to reduce time of loading growing volume of data
  – How to reduce data access time
  – How to predict the impact of new application implementation
• Applying Modeling for Optimizing EDM Operational Decisions
  – Predicting how change of the workload's priority will affect performance
  – Comparison of actual results vs. expected and organizing continuous proactive service level management
• Summary
© Boris Zibitsker, BEZ, St Louis CMG - Feb 12, 2008

3 Challenges of Enterprise Data Management
Data:
• Changing business demand
• Loading more data
• Increasing number of users
• Implementing new applications
• Upgrading hardware and software
How to optimize EDM to provide accurate and timely information at minimum cost and with a moving target

4 Scaling Tradeoffs in a Multi-tier Distributed Environment
• Distribution: adding more servers, nodes
• Centralization: server consolidation
• Data compression
• More CPUs/server, JVMs/server, disks/server: reduce queueing time
• Faster CPUs, disks: reduce service time
• Parallelization
• Concurrency
Diagram labels: Web Servers, Application Servers, DBMS Servers, Storage Subsystem; EDW, Sales, Marketing, HR

5 Optimization of Strategic, Tactical and Operational Enterprise Data Management Decisions
Strategic Decisions (Yearly)
• Architecture: Centralized EDW vs. Distributed DW and DM vs. Master Data Management
• Where to place data
• Where to run applications
Tactical Decisions (Weekly/Monthly)
• Dormant data
• Indexes
• Partitioning
• Compression
Operational Decisions (Hourly)
• Concurrency
• Parallelism
• Priority
• Resource reallocation
Optimization approach:
• Compare different options
• Select criteria of comparison, such as cost, response time, throughput, availability, accuracy, consistency, manageability, flexibility
• Define the relative importance/weight of each criterion
• Build models showing the relationship between different parameters and each criterion for each option
• Find an optimum option/solution as a compromise between different criteria
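The comparison steps on this slide can be illustrated with a weighted-score calculation. This is only a minimal sketch: the option names, criteria, weights and scores below are hypothetical and do not come from the presentation, and the slide's "models" are replaced here by fixed, pre-normalized scores.

```python
# Hypothetical criteria weights and normalized scores (0 = worst, 1 = best).
# None of these numbers come from the presentation; they only illustrate the method.
WEIGHTS = {"cost": 0.4, "response_time": 0.4, "availability": 0.2}

OPTIONS = {
    "centralized_edw": {"cost": 0.5, "response_time": 0.9, "availability": 0.8},
    "distributed_dm": {"cost": 0.8, "response_time": 0.6, "availability": 0.6},
}


def weighted_score(scores, weights):
    """Weighted average of per-criterion scores; weights need not sum to 1."""
    total_weight = sum(weights.values())
    return sum(scores[name] * w for name, w in weights.items()) / total_weight


def best_option(options, weights):
    """Return the name of the option with the highest weighted score."""
    return max(options, key=lambda name: weighted_score(options[name], weights))


if __name__ == "__main__":
    for name, scores in OPTIONS.items():
        print(f"{name}: {weighted_score(scores, WEIGHTS):.2f}")
    print("best:", best_option(OPTIONS, WEIGHTS))
```

Changing the weights changes the winner, which is the point of defining the relative importance of each criterion before comparing options.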

6 Wrong EDM Decisions Can Delay Action Time and Negatively Affect Business
Chart: Business Value vs. Time. Labels: Bus Event, ETL, Data Access, Information, Action, Action Time, Value lost
• How Long Will It Take to Load and Aggregate More Data?
• How Long Will It Take to Access More Data?
• How Can the Accuracy and Timeliness of Information Be Improved?

7 Difference Between Efficiency and Effectiveness of EDM Decisions
Effectiveness
• Accurate and timely information
• Ability to make right decisions
• Impact on the bottom line
Efficiency
• Cost
• Performance
• Scalability
• Availability
• Consistency
Diagram labels: Strategy, Operations, Tactics

8 Performance Prediction & Optimization
Input: Workloads, Hardware, Software (OS, DBMS Server, Application Server)
Engines: Prediction Engine, Optimization Engine
Options: Hardware, Software, DBMS
Plan: Workload Growth, Database Size Growth, Hardware Upgrade, Software Parameters, New Application, Server Consolidation
DBMS Wizards: Index Adviser, MV Adviser, Data Partitioning, Data Compression
Output: Recommendation & Expectations By Workload

9 Simplified Model of 3-tier Architecture
Diagram labels: Users, Client, Network, Arriving Requests, Rejected Requests, No Rejected Requests; Web Servers, Application Servers, DBMS Servers, each with servers 1, 2, … n (CPU, Disk, Memory) and a "Max?" check on Threads or Active Sessions
Values shown in the diagram: 200, 125, 75, 60, 15, 50, 25
• # of Threads & Active Sessions Control Concurrency
• Memory Limitation
• Level of Parallelism Affects Performance

10 Workload Characterization

11 Each Workload Has Unique Performance, Data & Resource Utilization Profiles
Diagram labels: Business Process; Workloads (User, Appl, SQL); Data (Table 1, Table 2, Table 3, … Table m); Resource Utilization (CPU, Disk)

12 Workload Centric Approach to Service Level Management
Strategic Decisions
• Justification of Architecture: Setting Realistic SLO and SLA
• Capacity Planning
• New Application Implementation
• Virtualization
• Consolidations
Tactical Decisions
• Concurrency Control
• Priority
• Database Tuning
• Index Creation
• Memory Adjustments
• Partitioning
• Compression
• Appl. Server Tuning: #JVM & #JVM Threads, Connection Pool Size
Operational Decisions
• Problem Isolation
• Current and Predicted Service Breach
• Business to Infrastructure Drill down
• Zoom In / Out
• Include / Exclude Filters
• Performance
• Utilization
• Data Access
• Scheduling, Workload Management
Verification and Control
• Trend Analysis, Baseline Analysis for Fixed/Rolling Period
• Trend Analysis Period-to-Period Comparisons & Change Validation
• Proactive Corrective Actions & New Expectations

13 Typical Steps of Applying Modeling During Application and Information Life Cycle
Application Life Cycle
• Feasibility study
• New application implementation
• Performance management
• Capacity planning
• Disaster recovery
• Application consolidation
Information Life Cycle
• Data loading (ETL)
• Data modeling
• Database tuning
• Data growth
• Backup and restore
• Data replication
• Data consolidation
• Enterprise data management
• Information integration
Diagram labels: Measure, Characterize, Plan, Advise, Manage, Model & Optimize

14 Example of Configuration Planning Tasks for Multi-tier Distributed Environment
• For each workload, identify how many users can be supported by one JVM
• How many JVMs will be required to support each of the workloads
• The number of servers required to support all workloads
• The optimum number of CPUs per server
• CPU type and speed
• Server memory size
• Number of host channels
• Storage subsystem type
• Control unit cache size
• Number of disk channels
• Number of disks per server
• Maximum number of active sessions within DBMS server per workload
• Dispatching priority for each workload
• Maximum degree of parallelism
• Indexing
• Materialized views
• Partitioning
• Data compression
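The first three planning tasks reduce to simple ceiling arithmetic once the users-per-JVM figure is known for each workload. The sketch below assumes that figure has already been measured or modeled; the workload sizes, users-per-JVM values and JVMs-per-server limit are hypothetical placeholders, not values from the presentation.

```python
import math

# Hypothetical inputs: (expected concurrent users, users one JVM can support).
WORKLOADS = {
    "Sales": (900, 200),
    "Marketing": (350, 150),
    "HR": (120, 100),
}
JVMS_PER_SERVER = 4  # assumed limit, e.g. from memory or CPU constraints


def jvms_needed(users, users_per_jvm):
    """Smallest number of JVMs that can hold all users of one workload."""
    return math.ceil(users / users_per_jvm)


def servers_needed(total_jvms, jvms_per_server):
    """Smallest number of servers that can host all JVMs."""
    return math.ceil(total_jvms / jvms_per_server)


jvms_by_workload = {
    name: jvms_needed(users, per_jvm) for name, (users, per_jvm) in WORKLOADS.items()
}
total_servers = servers_needed(sum(jvms_by_workload.values()), JVMS_PER_SERVER)

if __name__ == "__main__":
    print(jvms_by_workload)
    print("servers:", total_servers)
```

This ignores headroom for growth and failover; a real plan would add both before fixing the server count.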

15 Performance Prediction

16 Predicting Impact of Workload Growth

                          This Month  Next Month  In 2 Months  In 3 Months  In 4 Months
Arrival Rate (Req/sec)    5           6           7            8            9
Service Time (sec)        0.1         0.12        0.14         0.16         0.18
Utilization (fraction)    0.5         0.6         0.7          0.8          0.9
Response Time (sec)       0.2         0.3         0.46         0.8          1.8

For the CPU: A = 5 Req/sec, Scpu = 0.1 sec
Utilization Law: U = A * S; Ucpu = 5 Req/sec * 0.1 sec = 0.5
Response Time Law: R = S / (1 - U); Rcpu = 0.1 sec / (1 - 0.5) = 0.2 sec
Little's Law: N = A * R

Based on expected workload growth of 20% per month, predict when the system will not be able to meet SLO (0.6 sec). What will be the impact of doubling CPU speed? How long will the system satisfy SLO?
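The three laws on this slide can be written as small functions. The sketch below reproduces only the slide's worked CPU example (A = 5 Req/sec, S = 0.1 sec); the function names are illustrative, and the response time formula is the single-queue approximation shown on the slide, valid only for U < 1.

```python
def utilization(arrival_rate, service_time):
    """Utilization Law: U = A * S."""
    return arrival_rate * service_time


def response_time(service_time, util):
    """Response Time Law from the slide: R = S / (1 - U), defined for U < 1."""
    if util >= 1.0:
        raise ValueError("resource is saturated (U >= 1)")
    return service_time / (1.0 - util)


def requests_in_system(arrival_rate, resp_time):
    """Little's Law: N = A * R."""
    return arrival_rate * resp_time


# Worked CPU example from the slide.
A = 5.0   # arrival rate, Req/sec
S = 0.1   # CPU service time, sec
U = utilization(A, S)
R = response_time(S, U)
N = requests_in_system(A, R)

if __name__ == "__main__":
    print(f"U = {U:.2f}, R = {R:.2f} sec, N = {N:.2f}")
```

With these inputs the CPU is half busy, each request spends 0.2 sec at the CPU, and on average one request is at the CPU at any moment.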

17 Predicting Impact of Doubling CPU Speed

                                         This Month  Next Month  In 2 Months  In 3 Months  In 4 Months
Arrival Rate (Req/sec)                   5           6           7            8            9
Service Time (sec)                       0.1         0.12        0.14         0.16         0.18
Utilization (fraction)                   0.5         0.6         0.7          0.8          0.9
Response Time (sec)                      0.2         0.3         0.46         0.8          1.8
Response Time, Doubled CPU Speed (sec)   0.06        0.09        0.13         0.22         0.47

Based on expected workload growth of 20% per month, predict when the system will not be able to meet SLO (0.6 sec). What will be the impact of doubling CPU speed? How long will the system satisfy SLO?
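The question on this slide (when is the SLO missed, and how much time does a faster CPU buy?) can be approached with a short search over future months. The sketch below is a simplification under stated assumptions: service time is held constant, the arrival rate grows by 1 Req/sec per month as in the Arrival Rate row, and R = S / (1 - U) is used throughout. It is not meant to reproduce the response time rows of the table, which vary the service time as well.

```python
def first_month_over_slo(service_time, slo=0.6, start_rate=5.0,
                         growth_per_month=1.0, horizon=60):
    """Return the first month (0 = this month) in which R exceeds the SLO.

    Assumes constant service time and linear arrival-rate growth.
    Returns None if the SLO holds for the whole horizon.
    """
    for month in range(horizon + 1):
        rate = start_rate + growth_per_month * month
        util = rate * service_time          # Utilization Law: U = A * S
        if util >= 1.0:                     # saturated: no finite response time
            return month
        if service_time / (1.0 - util) > slo:   # R = S / (1 - U)
            return month
    return None


current_cpu = first_month_over_slo(service_time=0.1)
doubled_cpu = first_month_over_slo(service_time=0.05)  # doubling speed halves S

if __name__ == "__main__":
    print("SLO first missed with current CPU: month", current_cpu)
    print("SLO first missed with doubled CPU speed: month", doubled_cpu)
```

Under these assumptions the current CPU misses the 0.6 sec SLO in month 4 and the doubled CPU in month 14; different growth or service time assumptions will move both dates.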

18 Example of Planning (see spreadsheet)

19 Workload Characterization & Forecasting

20 Modeling AS Hardware Upgrade Impact

21 Predicted AS Upgrade Impact

22 Modeling DBMS Server Upgrade Impact

23 Modeling DBMS Server Upgrade Impact

24 Predicted DBMS Server Upgrade Impact

25 Predicted Parallel Processing Impact

26 Predicted Parallel Processing Impact

27 Predicted Parallel Processing Impact

28 Strategic Decisions
How to Justify:
• Enterprise Data Warehouse
• Master Data Management
• Hardware
• DBMS

29 Optimization of Placement of Data and Applications
Diagram labels: Very Large Disks, Large Disks, Small Disks, Tapes, Solid State; Data; EDW, DW, AS, Hub; Applications

30 Justification of EDW: Predicting How EDW Will Affect ETL and Information Access Time
Diagram, data mart architecture: Source, ETL, Data Mart 1 to Data Mart 5; ETL (DM) Time; Information Access Time (DM)
Diagram, EDW architecture: Source, Extract, Standard Transform, Stage, Data Mart Transform, EDW, Data Mart 1 to Data Mart 6; A, B, C, ∑(A,B); ETL (EDW) Time; Information Access Time (EDW)
Factors Affecting EDW Justification:
• Hardware cost
• Software licenses
• ETL process
• Support personnel

31 What Is the Best Architecture and Hardware Configuration for Specific EDW Workloads? DB2 UDB vs. Oracle RAC vs. Teradata

32 Differences Between Parallel Processing on Teradata and Oracle
Limited # of Available AMP Worker Tasks

33 Modeling Scaling Out

34 Predicting Impact of Different Hardware Platforms and Configurations

35 Prediction Results Show That an Increase in # of Oracle RAC Nodes Will Reduce CPU Utilization and Improve Response Time and Throughput, but Will Increase Contention for Disk
Chart label: I/O Rate * 10

36 Master Data Store (MDS): Planning and Managing Challenges
• What are the performance implications of supporting centralized Master Data Store vs. distributed repositories of Master Data?
Diagram labels: ODS, EDW, DM, MDS, MD, Current, Historical

37 What Are the Performance Implications of Supporting Centralized MDM vs. Distributed Repositories for MDM? (Hub vs. Spoke Architectures)
Start with hub and, when frequency of accesses increases, consider spoke.
Diagram labels: Hub, MDS, Current & Historical Data

38 Tactical Decisions
• How to Reduce Time to Load Growing Volume of Data
• How to Reduce Data Access Time
• How to Predict the Impact of New Application Implementation

39 Increase in Volume of Data and Change of Data Access Pattern Affect Each Workload's Performance
Diagram labels: Technology, Processes, Workload, Data
• Predict Future Bottleneck
• Identify Critical Workload Users, SQL, Tables, Which Will Cause Problems
• Use DBMS Wizards to Find Tuning Options
• Use Modeling to Justify Change & Verify Results
Increase in volume of data and pattern of data access affects:
• ETL time
• Disk utilization
• Aggregation and summarization time
• Data access time
• Session, thread usage time
• Buffer utilization and hit ratio
• DBMS server and application server CPU utilization
• Internode communication utilization
• Enterprise service bus utilization
• Response time and throughput

40 Predicting Database Tuning Impact: Creation of the Index (See Spreadsheet)

41 Predicting Database Tuning Impact

42 Predicting Database Tuning Impact

43 Predicting Database Tuning Impact

44 Example: Can I load growing volume of data on time, and how will data load affect other workloads?
It will take 6 times longer to load the growing volume of data in 10 months. RT for the HR application will increase almost 2 times, and throughput for ETL will be reduced almost 2 times.
Diagram labels: ETL Source, Extract, Transform, Transport, Transform, Load, ETL Target

45 What Is the Minimum Hardware Upgrade Required to Load Growing Volume of Data on Time?

46 What If We Increase the Number of ETL Utilities Loading Data in Parallel Starting Next Month (p2) and Upgrade Hardware (p5)?
An increase in the # of loads will allow a significant reduction of load time, but RT for the HR, Marketing and Sales workloads will become significantly longer.

47 Predicted Impact of the Implementation of Parallel Processing Based on Oracle 10g RAC
Implementation of parallel processing will improve response time for complex queries almost 2 times.

48 Example Showing How Modeling Results Identify Potential Bottlenecks and SQL That Will Cause Problems in the Future by Workload
• When will SLO not be met?
• What will cause the problem?
• Who will cause the problem?
• How do you fix the problem?
• What are database and application tuning alternatives?
• What are the expected savings?

49 DB Advice: Capacity Planning Recommendations
Processing SQL through SQL/DBMS Access Advisor gives a list of recommendations.

50 Example Showing Predicted Impact of Recommended Indexes

51 Predicted Data Compression Impact on Different Workloads
Data compression will have different impact on different workloads. DW workloads with primarily SELECT type of requests will benefit more.

52 Predicted Impact of Data Partitioning
Data partitioning will have a positive impact on performance for all workloads.

53 Performance Prediction Results Based on Oracle Memory Advisor Reflect the Impact of the Workload Growth and Memory Pool Size Change

54 Predicted Impact of Adding a New Application
• Set up realistic expectations and reduce risk of surprises
• Prediction on how new application will perform in production environment
• Prediction on how new application will affect performance of existing applications
Diagram labels: Test, Production, Alternatives, In a future, Database Replay

55 Predicting New HR Application Implementation Impact

56 Modeling Results Help Customers to Set Up Realistic SLO and Negotiate SLA for Major Workload
• Users and IT select SLO level that will provide acceptable performance with acceptable Total Cost of Ownership (TCO)
• Prediction results allow customers to negotiate SLA between business and IT
• For expected workload and database size growth, IT guarantees delivery of a certain level of responsiveness and throughput
Chart labels: Hardware Configuration & TCO, SLO, SLA, Expected workload & DB growth, Predicted RT for one of the Workloads

57 Operational Decisions
• Workload Priority
• Scheduling
• Organizing Continuous Proactive Performance Management
• Virtual Tape Library

58 Predicting How Change of the Workload's Priority Will Affect Performance
Sales workload priority increase will improve Sales RT, but other workloads will suffer.

59 Comparison of Actual Results vs. Expected and Organizing Continuous Proactive Service Level Management
• Find difference between predicted results or expectations (red line) and actual measurement data
• Track how often the actual results do not meet expectation (SLA)
• When number of exceptions exceeds the threshold, generate alert
• Explain difference and develop new corrective recommendations
• Identify when SLO will not be met for each workload
• Identify what will be a bottleneck
• Identify which workload will cause the performance degradation
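The exception-tracking step on this slide can be sketched as a comparison of measured response times against predicted ones. All numbers, the 10% tolerance and the alert threshold below are hypothetical; the presentation does not specify how an exception is defined, so "actual exceeds expected by more than a tolerance" is an assumption made for illustration.

```python
def count_exceptions(actual, expected, tolerance=0.10):
    """Count intervals where actual RT exceeds expected RT by more than `tolerance`."""
    return sum(1 for a, e in zip(actual, expected) if a > e * (1.0 + tolerance))


def should_alert(actual, expected, max_exceptions, tolerance=0.10):
    """Alert when the number of exceptions exceeds the allowed threshold."""
    return count_exceptions(actual, expected, tolerance) > max_exceptions


# Hypothetical response times in seconds for five measurement intervals.
EXPECTED_RT = [0.30, 0.32, 0.35, 0.40, 0.45]   # model predictions
ACTUAL_RT = [0.29, 0.36, 0.41, 0.39, 0.55]     # measurements

exceptions = count_exceptions(ACTUAL_RT, EXPECTED_RT)
alert = should_alert(ACTUAL_RT, EXPECTED_RT, max_exceptions=2)

if __name__ == "__main__":
    print("exceptions:", exceptions)
    print("generate alert:", alert)
```

In this sample, three of the five intervals exceed the prediction by more than 10%, which is above the assumed threshold of two, so an alert would be generated and the difference would then need to be explained.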

60 Summary
• Use modeling to evaluate options and justify EDM strategic, tactical and operational decisions to satisfy conflicting business requirements for timeliness and flexibility, accuracy, acceptable performance and minimum cost
• Organize a continuous process of applying models for justifying EDM decisions, setting expectations, verifying results and finding effective proactive corrections during application and information life cycle
• Workload characterization and modeling identify which data and applications are used by individual lines of business and business processes, and focus EDM decisions and efforts on proactively addressing the most important strategic, tactical and operational IT issues

61 Thank You! Questions? Dr. Boris Zibitsker boris@bez.com www.bez.com

