Workload Management for Complex Data Warehousing Environments


1 Workload Management for Complex Data Warehousing Environments
Hermann Bär, Data Warehousing Product Management

2 Why Anglo-German? © 2010 Oracle Corporation

3 Agenda
What is a concurrent (mixed workload) environment? Planning for workload management: tools and methods; resource definition and management. Step-by-step workload management: identify workloads; manage system resources; restrict resource usage; curb runaway queries; monitor and tune.

4 A Mixed Workload 2005: Major Changes for your Data Warehouse
Department A supplies data to the DW daily and runs reports. Department B supplies data to the DW daily and runs reports. Data marts. Daily batch windows. Ad-hoc queries. Downtime OK.

5 A Mixed Workload 2011: Major Changes for your Data Warehouse
All departments; on-line applications; CEO; Strategy; Finance; Marketing; CRM; live systems; stock tracking; direct business impact; real-time feeds; Enterprise Data Warehouse; write-backs; classic reporting; deep analytics; long running reports; heavy analytical content; investigative querying; predictive modeling; scenario analysis; data mining.

6 A Mixed Workload: Sample Requirements
Workloads should use critical system resources (CPU, I/O) according to their priority. Tactical workload must run with the expected degree of parallelism (DOP): running at diminished DOP or queuing results in unacceptable performance. Full utilization of critical resources: avoid inefficient schemes that require dedicated resources, such as servers dedicated to services or separate data marts / warehouses. Flexible resource allocation, e.g. priority of ETL is based on time of day.

8 Workload Management for DW: Three Main Components
Workload management (define workload plans, filter exceptions, manage resources, monitor workloads, adjust workload plans), database architecture, and hardware architecture. Also shown: EDW data layers; data mart strategy; sandboxes; active HA/DR strategy; compression strategies; storage media hierarchy.

9 Workload Management for DW: What we are covering today…
Define workload plans, filter exceptions, manage resources, execute workloads, monitor workloads, adjust workload plans. Tools: DBRM, IORM, RAC, OEM. The RAC piece includes things like: Services; Server Pools (Grid Infrastructure) to provide elasticity (add servers to a pool to increase memory); Instance Caging (consolidation).

11 Tools and Methods
Resource allocation: database (processing) nodes; services, server pools, and consumer groups; instance caging; I/O resource management. Resource management: consumer groups, within a single database or across multiple databases; workload-driven database resource management; thresholds and actions.

12 Services
Use services to restrict the number of nodes. Dynamic allocation and re-routing. Example: divide an 8-node cluster (nodes 1 to 8) between Service Gold and Service Silver, where Service Gold is 3 nodes.

13 Services and Server Pools
Expand a service by expanding the pool of servers it has access to. Example on the 8-node cluster: expand Service Gold to 4 nodes, shrink Service Silver to 4 nodes.
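
The expand and shrink step on this slide amounts to moving one node from one pool to the other. A minimal Python sketch of that bookkeeping follows; the `ServerPool` class and `move_node` helper are hypothetical names introduced for illustration and do not correspond to any Grid Infrastructure API.

```python
from dataclasses import dataclass, field


@dataclass
class ServerPool:
    """Hypothetical stand-in for the server pool behind one service."""
    service: str
    nodes: list = field(default_factory=list)


def move_node(source: ServerPool, target: ServerPool) -> int:
    """Shrink `source` by one node and expand `target` with that node."""
    if not source.nodes:
        raise ValueError(f"{source.service} has no node left to give up")
    node = source.nodes.pop(0)
    target.nodes.append(node)
    return node


# Starting point from the Services slide: Gold holds 3 of the 8 nodes.
gold = ServerPool("Gold", [1, 2, 3])
silver = ServerPool("Silver", [4, 5, 6, 7, 8])

# Expand Service Gold to 4 nodes, which shrinks Service Silver to 4 nodes.
moved = move_node(silver, gold)
```

Which node is handed over (here the first one in Silver's list) is an arbitrary choice made for the sketch.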

14 Instance Caging
Limit (“cage”) the amount of CPU for a given instance. Example: divide an 8-node cluster so that two databases each get half of the CPUs per node.

15 Sample Instance Caging
4 CPU server Workload is a mix of OLTP transactions, parallel queries, and DMLs from Oracle Financials

16 I/O Resource Management on Exadata
Global I/O resource management across Data Mart A, Data Mart B, and the Enterprise Data Warehouse. Prioritize multiple individual databases. Prioritize workloads within a single database. Prioritize a certain type of workload across all databases: prioritize all tactical queries; deprioritize all ad-hoc queries.

17 Sample I/O Utilization
Queries from TPC-H benchmark suite Disk utilization measured via iostat

18 Database Resource Manager
Single framework to do workload management, including: CPU; session control; thresholds; I/O (Exadata has I/O Resource Manager); parallel statement queuing. Each consumer group now needs to be managed in terms of parallel statement queuing. New settings / screens to control queuing in Enterprise Manager and in DBRM packages.

19 Database Resource Manager (DBRM)
Resource management within a single database. Divide a system horizontally across nodes (diagram: Grp 1, Grp 2, and Grp 3 across nodes 1 to 8). Uses Resource Plans and Groups to model and assign resources. Allows for prioritization and flexibility in resource allocation.

20 DBRM with Services
Resource management within a single database. Service-aware resource management (diagram: Grp 1 to Grp 5 spread over Service Gold and Service Silver on nodes 1 to 8). Make sure to fully utilize the resources.

21 DBRM with Services and Instance Caging
Three individual databases (diagram: Grp 1 to Grp 7 with Service Gold and Service Silver on nodes 1 to 8). Resource management across the cluster between databases. Fine-grained resource management within single databases.

23 Step 1: Understand the Workload
Review the workload to find out: Who is doing the work? What types of work are done on the system? When are certain types being done? Where are performance problem areas? What are the priorities, and do they change during a time window? Are there priority conflicts?

24 Workload Management
Request: each request executes on a RAC Service, which limits the physical resources and allows scalability across racks. Assign: each request is assigned to a consumer group by OS or DB username, application or module, action within module, or administrative function. Queue and execute: each consumer group (for example, Ad-hoc Workload) has a resource allocation (example: 10% of CPU/IO resources), directives (example: 20 active sessions), and thresholds (example: no jobs longer than 2 min). Possible outcomes shown: reject, downgrade.

25 Workload Management
[Diagram: a request is assigned to one of three workloads (Static Reports, Tactical Queries, Ad-hoc Workload), each with its own queue, and is then executed, rejected, or downgraded.]

26 Step 2: Map the Workload to the System
Create the resource consumer groups: map to users or applications; map to estimated execution time; other criteria. Create the required resource plans, for example nighttime vs. daytime, online vs. offline. Set the overall priorities: which resource group gets most resources; cap max utilizations. Drill down into parallelism, queuing and session throttles.

27 Resource Manager User Interface

28 Database Resource Manager
Create Consumer Groups for each type of workload: Tactical, Reports, Low-Priority, ETL. Create rules to dynamically map sessions to Consumer Groups. Session to Consumer Group mapping rules shown: service = ‘CRM’; client program = ‘OBIEE’; client program = ‘OBIEE’ && module = ‘AdHoc’; client program = ‘Oracle Data Mining’; query has been running > 1 hour; estimated execution time of query > 12 hours; service = ‘ETL’.
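
Rule-based session mapping can be pictured as an ordered list of predicates where the first match wins. The Python sketch below is only an illustration of that idea: the transcript lists the rules and the consumer groups but not which rule feeds which group, so the pairing, the rule order, and the default group are all assumptions, and `Session`, `RULES`, and `map_session` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Session:
    service: str = ""
    client_program: str = ""
    module: str = ""


# (predicate, consumer group) pairs, checked top to bottom; first match wins.
# ASSUMPTION: the rule-to-group pairing below is a guess for illustration only.
RULES = [
    (lambda s: s.service == "CRM", "Tactical"),
    (lambda s: s.service == "ETL", "ETL"),
    (lambda s: s.client_program == "OBIEE" and s.module == "AdHoc", "Low-Priority"),
    (lambda s: s.client_program == "Oracle Data Mining", "Low-Priority"),
    (lambda s: s.client_program == "OBIEE", "Reports"),
]


def map_session(session: Session, default: str = "Low-Priority") -> str:
    """Return the consumer group of the first rule that matches the session."""
    for matches, group in RULES:
        if matches(session):
            return group
    return default  # assumed fallback for sessions that match no rule


crm_group = map_session(Session(service="CRM"))
report_group = map_session(Session(client_program="OBIEE"))
adhoc_group = map_session(Session(client_program="OBIEE", module="AdHoc"))
```

The more specific OBIEE rule (program plus module) is placed before the general one so that it is not shadowed. The two runtime rules on the slide (running > 1 hour, estimated > 12 hours) depend on query state rather than session attributes and are left out of this sketch.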

29 Step 3: Manage CPU
CPU is a critical resource, even more critical on Exadata: Exadata Smart Scan only returns useful data blocks, and Exadata Flash Cache completes I/Os in microseconds; the result is heavy CPU loads. Goal: allocate sufficient CPU to Tactical, Reports, and ETL to satisfy performance objectives; allocate excess CPU to Low-Priority workloads. Solution: configure CPU allocations in the Database Resource Plan.

30 Step 3: Manage CPU
Day Time Plan (Level 1, Level 2): Tactical %, Reports %, ETL %, Low-Priority %. Any CPU unused by Tactical, Reports, or ETL is allocated to Low-Priority sessions. The DBA can create a Night Time Plan that allocates more CPU to ETL. Very fine-grained scheduling: Resource Manager mimics an OS scheduler and schedules at a 100 ms quantum. All sessions run, but some sessions run more frequently than others. A low-priority session yields to a high-priority session within a quantum. Background processes are not managed; backgrounds are high-priority and not CPU-intensive.

31 CPU Scheduling with Resource Manager
Sessions wait on the “resmgr:cpu quantum” event in an Oracle-internal CPU queue. Resource Manager schedules sessions onto the CPU every 100 ms. Example Resource Plan: Tactical 75%, Reports 25% (Tactical is picked 3 out of 4 times).
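
The "picked 3 out of 4 times" behaviour can be reproduced with any weighted scheduler. The Python sketch below uses smooth weighted round-robin as a stand-in; this is a swapped-in textbook technique chosen for determinism, not a description of how Resource Manager is implemented, and the `schedule` function is a hypothetical name.

```python
def schedule(shares, quanta):
    """Pick one consumer group per quantum using smooth weighted round-robin.

    `shares` maps group name -> percentage from the resource plan.
    """
    total = sum(shares.values())
    credit = {group: 0 for group in shares}
    picks = []
    for _ in range(quanta):
        # Every group earns credit in proportion to its share...
        for group, share in shares.items():
            credit[group] += share
        # ...and the group with the most credit gets this quantum.
        chosen = max(credit, key=credit.get)
        credit[chosen] -= total
        picks.append(chosen)
    return picks


# Plan from the slide: Tactical 75%, Reports 25%; 8 quanta of 100 ms each.
picks = schedule({"Tactical": 75, "Reports": 25}, quanta=8)
tactical_quanta = picks.count("Tactical")
reports_quanta = picks.count("Reports")
```

Over any multiple of four quanta, Tactical receives three quarters of the picks, which matches the ratio stated on the slide.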

32 Step 4: Manage I/O
Disk bandwidth is a critical resource. Key to exceptional query performance? One query can utilize a high percentage of each disk’s bandwidth; multiple concurrent parallel queries result in heavy disk loads. Goal: allocate sufficient I/O bandwidth to Tactical, Reports, and ETL to satisfy performance objectives; allocate excess I/O bandwidth to Low-Priority workloads. Solution: configure I/O allocations in the Database Resource Plan; enable Exadata I/O Resource Manager.

33 Exadata I/O Resource Manager
Issue enough I/Os to keep each disk busy. Queue the rest. When an I/O completes: 1) pick a Consumer Group queue; 2) issue the I/O request from the head of that queue.
[Diagram: inside the Exadata Storage Cell, the I/O Resource Manager holds one queue per consumer group (Tactical, Reports, ETL, and Low-Priority I/Os) and, guided by the Database Resource Plan, feeds the disk's outstanding I/O requests.]
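
The issue, queue, and pick cycle on this slide can be sketched as a small Python model of a single disk. Everything here is illustrative: `CellDiskScheduler` is a hypothetical class, and the queue-selection policy (first non-empty queue in plan order) is an assumption standing in for the allocation-based choice the real I/O Resource Manager makes.

```python
from collections import deque


class CellDiskScheduler:
    """Toy model of one disk: bounded outstanding I/Os plus one queue per group."""

    def __init__(self, plan_order, max_outstanding):
        self.queues = {group: deque() for group in plan_order}
        self.max_outstanding = max_outstanding
        self.outstanding = []

    def submit(self, group, request):
        """Issue the I/O if the disk has room, otherwise queue it for its group."""
        if len(self.outstanding) < self.max_outstanding:
            self.outstanding.append((group, request))
        else:
            self.queues[group].append(request)

    def complete(self, group, request):
        """An I/O finished: pick a queue, then issue the request at its head."""
        self.outstanding.remove((group, request))
        # ASSUMED policy: take the first non-empty queue in plan order.
        for candidate, queue in self.queues.items():
            if queue:
                self.outstanding.append((candidate, queue.popleft()))
                return


disk = CellDiskScheduler(["Tactical", "Reports", "ETL", "Low-Priority"], max_outstanding=2)
disk.submit("ETL", "e1")            # issued, disk has room
disk.submit("Low-Priority", "l1")   # issued, disk is now busy
disk.submit("Low-Priority", "l2")   # queued
disk.submit("Tactical", "t1")       # queued
disk.complete("ETL", "e1")          # t1 is issued ahead of the older l2
```

The last call shows the point of per-group queues: a tactical I/O that arrived later still overtakes queued low-priority work.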

34 Exadata I/O Resource Manager
Configure Exadata I/O Resource Manager using the Database Resource Plan: the same plan is used to manage CPU; specify resource allocations per Consumer Group; resource allocation == disk utilization. Background and ASM I/Os are managed automatically; critical I/Os are prioritized: instance recovery, LGWR, control file, etc. Use IORM metrics to track I/O load per Consumer Group (IOPS, MBPS, disk utilization %) and I/O throttling per Consumer Group.

35 Step 5: Manage Parallel Execution
Parallel servers are a limited resource: the global database limit is specified by parallel_max_servers. Too many concurrent parallel statements cause thrashing. When there are no more parallel servers, critical statements may run serially; when parallel servers free up, there is no way to boost the DOP of running statements. With 11.2, Oracle automatically decides whether a statement executes in parallel and what DOP it will use, and whether it can execute immediately or will be queued.

36 Parallel Statement Queuing
Parallel servers are available: parallel statements run immediately. No more parallel servers available: parallel statements are now queued.
[Diagram: Tactical, Batch, and Ad-Hoc statements pass through the Parallel Statement Queue Coordinator; the available server count is shown at 128, 64, 32, and 0, after which new statements wait in the Parallel Statement Queue.]
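
The run-or-queue decision can be sketched as a first-in, first-out queue in front of a fixed pool of parallel servers. The Python model below is a simplification under stated assumptions: `ParallelStatementQueue` is a hypothetical class, each statement is assumed to need exactly the number of servers it asks for, and the pool size of 128 is taken from the diagram.

```python
from collections import deque


class ParallelStatementQueue:
    """Toy model: statements run if enough servers are free, otherwise wait in FIFO order."""

    def __init__(self, total_servers):
        self.available = total_servers
        self.running = {}      # statement name -> servers held
        self.queue = deque()   # (name, servers) in arrival order

    def submit(self, name, servers):
        """Run now if nobody is waiting and enough servers are free, else queue."""
        if not self.queue and servers <= self.available:
            self._start(name, servers)
        else:
            self.queue.append((name, servers))

    def finish(self, name):
        """Release the statement's servers, then start queued statements from the head."""
        self.available += self.running.pop(name)
        while self.queue and self.queue[0][1] <= self.available:
            self._start(*self.queue.popleft())

    def _start(self, name, servers):
        self.available -= servers
        self.running[name] = servers


psq = ParallelStatementQueue(total_servers=128)
psq.submit("tactical1", 64)   # runs, 64 servers left
psq.submit("batch1", 32)      # runs, 32 servers left
psq.submit("batch2", 32)      # runs, 0 servers left
psq.submit("adhoc1", 32)      # no servers left, so it is queued
queued_before = [name for name, _ in psq.queue]
psq.finish("batch1")          # frees 32 servers, adhoc1 starts
```

Because the statement waits rather than starting with fewer servers, it runs at its requested parallelism once it is dequeued.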

37 Queuing Shown in Enterprise Manager

38 Ordering Parallel Statements
DBAs want to control the order in which parallel queries are dequeued: prioritize tactical queries over batch and ad-hoc queries; impose a user-defined policy for ordering queued parallel statements. Solution: separate queues per Consumer Group; the Resource Plan specifies which queue’s parallel statements are issued next.

39 Ordering Parallel Statements
When parallel servers become available, the resource plan is used to select a queue. The head parallel statement from that queue is run. Since Tactical is Priority 1, its parallel statements are always selected first. Once there are no more Tactical parallel statements, we pick either Batch or Ad-Hoc: Batch is selected 70% of the time, Ad-Hoc 30%. Resource Plan: Priority 1: Tactical; Priority 2, 70%: Batch; Priority 2, 30%: Ad-Hoc.
[Diagram: the Parallel Statement Queue Coordinator with separate Tactical, Batch, and Ad-Hoc queues feeding the running queries as the available server count changes.]
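
The two-level plan on this slide (Tactical first, then Batch and Ad-Hoc at 70/30) can be sketched as a dequeue routine over per-group queues. The Python code below is an illustration only: `dequeue_order` is a hypothetical function, it ignores server availability, and it replaces the probabilistic "70% of the time" choice with a deterministic rule (serve the group furthest behind its share) so the result is reproducible.

```python
from collections import deque

# Plan from the slide: Tactical alone at priority 1; Batch and Ad-Hoc share priority 2.
PLAN = [
    {"Tactical": 100},
    {"Batch": 70, "Ad-Hoc": 30},
]


def dequeue_order(queues, plan=PLAN):
    """Drain the per-group queues and return statements in the order the plan selects them."""
    served = {group: 0 for level in plan for group in level}
    order = []
    while True:
        for level in plan:
            ready = [group for group in level if queues.get(group)]
            if ready:
                # ASSUMED tie-break: pick the group with the lowest served/share ratio.
                group = min(ready, key=lambda g: served[g] / level[g])
                served[group] += 1
                order.append(queues[group].popleft())
                break  # restart from priority 1 for the next pick
        else:
            return order  # every queue named in the plan is empty


queues = {
    "Tactical": deque(["t1", "t2"]),
    "Batch": deque(["b1", "b2", "b3"]),
    "Ad-Hoc": deque(["a1", "a2"]),
}
order = dequeue_order(queues)
```

Restarting from priority 1 after every pick is what guarantees that a newly queued Tactical statement would always be chosen before any Batch or Ad-Hoc statement.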

40 Reserving Parallel Servers for Critical Workloads
Problem: a flood of batch queries can use up all parallel servers, so Tactical queries are forced to queue. Solution: limit the percentage of parallel servers a Consumer Group can use. For example, parallel queries from the Batch Consumer Group can only use 50% of the parallel servers, which reserves parallel servers for Tactical queries. Limit the degree of parallelism of non-critical workloads.

41 Reserving Parallel Servers for Critical Workloads
Batch is limited to 50% of the parallel servers. Since parallel servers are available, Tactical queries can be run immediately. Resource Plan: Priority 1: Tactical; Priority 2, 70%: Batch; Priority 2, 30%: Ad-Hoc.
[Diagram: Batch statements wait in the Batch queue while Tactical statements start; the available server count is shown at 64, 48, and 32.]
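
The per-group cap adds one more check to the start decision: a statement needs free servers and its group must stay under its percentage limit. A minimal Python sketch follows; `can_start` is a hypothetical helper, and the pool size of 64 and the requests of 16 servers are assumptions chosen to resemble the diagram.

```python
def can_start(group, requested, in_use, total_servers, limit_pct):
    """Return True if `group` may take `requested` more parallel servers.

    `in_use` maps group -> servers currently held; `limit_pct` maps
    group -> maximum percentage of the pool (groups not listed are uncapped).
    """
    free = total_servers - sum(in_use.values())
    cap = total_servers * limit_pct.get(group, 100) / 100
    return requested <= free and in_use.get(group, 0) + requested <= cap


TOTAL_SERVERS = 64                # assumed pool size
LIMITS = {"Batch": 50}            # Batch may hold at most 50% of the servers
in_use = {"Batch": 32}            # Batch is already at its cap

batch_ok = can_start("Batch", 16, in_use, TOTAL_SERVERS, LIMITS)
tactical_ok = can_start("Tactical", 16, in_use, TOTAL_SERVERS, LIMITS)
```

Here another Batch statement has to queue even though 32 servers are idle, while a Tactical statement of the same size starts immediately.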

42 Step 6: Restrict Resource Usage
Requirement: consistent, predictable performance for workloads; useful for hosted environments and departmental apps. Solution: cap the CPU utilization for a Consumer Group; cap the disk utilization for a Consumer Group. Day Time Plan (Allocation, Limit): Tactical %; Sales Reports % %; Marketing Reports % %; ETL %.

43 Step 7: Manage Runaway Queries
Runaway queries are caused by missing indexes, unexpected inputs, and bad execution plans. They severely impact the performance of well-behaved queries and are very hard to completely eradicate!
[Chart: query time for Query 1 to Query 4.]

44 Manage Runaway Queries
Define runaway queries by: estimated execution time; actual execution time; actual number of I/Os (11.1); actual bytes of I/O (11.1). Manage runaway queries: switch to another consumer group (a lower-priority consumer group, or a consumer group with a max CPU utilization limit (11.2)); abort call; kill session.

45 Manage Runaway Queries
For the Tactical consumer group, runaway means 30+ sec: switch to the “Low Priority” consumer group! For the Reports consumer group, runaway means 32GB+ of I/O: abort the query! For the Ad-Hoc consumer group, runaway means a 24+ hour estimated execution time: don’t execute!
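
The three per-group policies on this slide can be written down as a small decision function. The Python sketch below is illustrative only: `Query`, `runaway_action`, and the action labels are hypothetical names, while the thresholds (30 seconds, 32 GB, 24 hours) are the ones given on the slide.

```python
from dataclasses import dataclass


@dataclass
class Query:
    group: str
    estimated_hours: float = 0.0   # optimizer estimate before execution
    elapsed_seconds: float = 0.0   # actual execution time so far
    io_gigabytes: float = 0.0      # actual bytes of I/O so far, in GB


def runaway_action(query: Query) -> str:
    """Return the action the slide's policy prescribes for this query."""
    if query.group == "Ad-Hoc" and query.estimated_hours >= 24:
        return "reject"                  # don't execute
    if query.group == "Reports" and query.io_gigabytes >= 32:
        return "abort"                   # abort the query
    if query.group == "Tactical" and query.elapsed_seconds >= 30:
        return "switch:Low Priority"     # move to the lower-priority group
    return "continue"


slow_tactical = runaway_action(Query("Tactical", elapsed_seconds=45))
heavy_report = runaway_action(Query("Reports", io_gigabytes=40))
huge_adhoc = runaway_action(Query("Ad-Hoc", estimated_hours=30))
normal = runaway_action(Query("Tactical", elapsed_seconds=2))
```

Note that the Ad-Hoc rule acts on an estimate before the query starts, whereas the other two act on values measured while the query runs.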

46 Consumer Group Settings Overview

47 Step 8: Run and Adjust the Workload
Run a workload for a period of time and look at the results. DBRM adjustments: overall priorities; scheduling of switches in plans; queuing. System adjustments: how many PX statements; PX queuing levels vs. utilization levels (should we queue less?).

48 Resource Manager - End to End
Test scenario: 2 workloads in a data warehouse: short tactical queries and long running deep (batch) analysis. Goal: run batch and tactical analysis concurrently. Don’t impact response time of tactical queries!

49 Resource Manager - End to End

50 Questions

51 Additional Information
Instance Caging; Resource Manager

52

