Workload Management for Complex Data Warehousing Environments

Presentation transcript:

Workload Management for Complex Data Warehousing Environments (German title: Workload-Management für komplexe Data Warehousing Umgebungen)
Hermann Bär, Data Warehousing Product Management

Why Anglo-German? © 2010 Oracle Corporation

Agenda
- What is a concurrent (mixed workload) environment?
- Planning for workload management
- Tools and methods
- Resource definition and management
- Step-by-step workload management
  - Identify workloads
  - Manage system resources
  - Restrict resource usage
  - Curb runaway queries
  - Monitor and tune

A Mixed Workload, 2005: Major Changes for your Data Warehouse
- Department A supplies data to the DW daily and runs reports
- Department B supplies data to the DW daily and runs reports
- Data Marts
- Daily batch windows
- Ad-hoc queries
- Downtime OK

A Mixed Workload, 2011: Major Changes for your Data Warehouse
- All Departments (CEO, Strategy, Finance, Marketing) and On-Line Applications (CRM, Live Systems, Stock Tracking): Direct Business Impact
- Enterprise Data Warehouse: Real Time Feeds, Write-Backs
- Classic Reporting and Deep Analytics: long running reports, heavy analytical content, investigative querying, predictive modeling, scenario analysis, data mining

A Mixed Workload: Sample Requirements
- Workloads should use critical system resources according to their priority
  - CPU, I/O
- Tactical workload must run with expected DOP
  - Running at diminished DOP or queuing results in unacceptable performance
- Full utilization of critical resources
  - Avoid inefficient schemes that require dedicated resources
    - Servers dedicated to services
    - Separate data marts / warehouses
- Flexible resource allocation
  - E.g. priority of ETL is based on time of day


Workload Management for DW: Three Main Components
- Workload management: Define Workload Plans, Filter Exceptions, Manage Resources, Monitor Workloads, Adjust Workload Plans
- Database Architecture and Hardware Architecture: EDW Data Layers, Data Mart Strategy, Sandboxes, Active HA/DR Strategy, Compression Strategies, Storage Media Hierarchy

Workload Management for DW: What we are covering today…
- Define Workload Plans, Filter Exceptions, Manage Resources, Execute Workloads, Monitor Workloads, Adjust Workload Plans
- Tools: DBRM, IORM, RAC, OEM
- The RAC piece includes things like:
  - Services
  - Server Pools (Grid Infrastructure) to provide elasticity (add servers to pool to increase memory)
  - Instance Caging (consolidation)


Tools and Methods
- Resource allocation
  - Database (processing) nodes
    - Services, Server Pools, and consumer groups
    - Instance caging
  - IO resource management
- Resource management
  - Consumer Groups
    - Within a single database
    - Across multiple databases
  - Workload-driven database resource management
    - Thresholds and actions

Services
- Use services to restrict the number of nodes
- Dynamic allocation and re-routing
- Divide an 8-node cluster, where Service Gold gets 3 nodes
[Diagram: nodes 1 to 8 divided between Service Gold and Service Silver]

Services and Server Pools
- Expand a service by expanding the pool of servers it has access to
- Expand Service Gold to 4 nodes
- Shrink Service Silver to 4 nodes
[Diagram: nodes 1 to 8 divided between Service Gold and Service Silver]

Instance Caging
- Limit (“cage”) the amount of CPU for a given instance
- Divide an 8-node cluster, where two databases each get half of the CPUs on every node
[Diagram: nodes 1 to 8, each shared by two database instances]

Sample Instance Caging
- 4 CPU server
- Workload is a mix of OLTP transactions, parallel queries, and DMLs from Oracle Financials

I/O Resource Management on Exadata
- Global I/O resource management
- Prioritize multiple individual databases
- Prioritize workloads within a single database
- Prioritize a certain type of workload across all databases
  - Prioritize all tactical queries
  - Deprioritize all ad-hoc queries
[Diagram: Data Mart A, Data Mart B, Enterprise Data Warehouse]

Sample I/O Utilization
- Queries from TPC-H benchmark suite
- Disk utilization measured via iostat

Database Resource Manager
- Single framework to do workload management including
  - CPU
  - Session control
  - Thresholds
  - IO (Exadata has IO Resource Manager)
  - Parallel statement queuing
- Each consumer group now needs to be managed in terms of parallel statement queuing
- New settings / screens to control queuing in Enterprise Manager and in DBRM packages

Database Resource Manager (DBRM)
- Resource management within a single database
- Divide a system horizontally across nodes
- Uses Resource Plans and Groups to model and assign resources
- Allows for prioritization and flexibility in resource allocation
[Diagram: Grp 1 to Grp 3 spanning nodes 1 to 8]

DBRM with Services
- Resource management within a single database
- Service-aware resource management
- Make sure to fully utilize the resources
[Diagram: Grp 1 to Grp 5 distributed across Service Gold and Service Silver on nodes 1 to 8]

DBRM with Services and Instance Caging
- Three individual databases
- Resource management across cluster between databases
- Fine grain resource management within single databases
[Diagram: Grp 1 to Grp 7 distributed across Service Gold and Service Silver on nodes 1 to 8]


Step 1: Understand the Workload
Review the workload to find out:
- Who is doing the work?
- What types of work are done on the system?
- When are certain types being done?
- Where are performance problem areas?
- What are the priorities, and do they change during a time window?
- Are there priority conflicts?

Workload Management
- Each request:
  - Executes on a RAC Service
    - Which limits the physical resources
    - Allows scalability across racks
- Assign: each request is assigned to a consumer group by:
  - OS or DB Username
  - Application or Module
  - Action within Module
  - Administrative function
- Each consumer group (example: Ad-hoc Workload) has:
  - Resource Allocation (example: 10% of CPU/IO resources)
  - Directives (example: 20 active sessions)
  - Thresholds (example: no jobs longer than 2 min)
[Diagram: Request, Assign, Queue, Execute, with Reject and Downgrade]

Workload Management
[Diagram: each Request is assigned to one of three consumer groups (Static Reports, Tactical Queries, Ad-hoc Workload), each with its own Queue, and is then executed, rejected, or downgraded]

Step 2: Map the Workload to the System
- Create the resource consumer groups
  - Map to users or applications
  - Map to estimated execution time
  - Other criteria
- Create the required resource plans
  - For example: nighttime vs. daytime, online vs. offline
- Set the overall priorities
  - Which resource group gets most resources
  - Cap max utilizations
- Drill down into parallelism, queuing and session throttles

Resource Manager User Interface

Database Resource Manager: Session to Consumer Group Mapping Rules
- Create Consumer Groups for each type of workload
- Create rules to dynamically map sessions to Consumer Groups
- Consumer Groups: Tactical, Reports, Low-Priority, ETL
- Mapping rules:
  - service = ‘CRM’
  - client program = ‘OBIEE’
  - client program = ‘OBIEE’ && module = ‘AdHoc’
  - client program = ‘Oracle Data Mining’
  - query has been running > 1 hour
  - estimated execution time of query > 12 hours
  - service = ‘ETL’
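The rule-based mapping on this slide can be sketched as an ordered list of predicates where the first match wins. This is only an illustration in Python, not the DBRM API. The slide's diagram does not survive in the transcript, so the pairing of each rule with a consumer group below is an assumption, as are the `Session` class, the `map_session` function, and the fallback group.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Session:
    """Hypothetical session attributes used by the mapping rules."""
    service: str = ""
    client_program: str = ""
    module: str = ""


# Ordered (predicate, consumer group) pairs; the first matching rule wins.
# ASSUMPTION: which rule feeds which group is guessed for illustration.
RULES: List[Tuple[Callable[[Session], bool], str]] = [
    (lambda s: s.service == "CRM", "Tactical"),
    (lambda s: s.service == "ETL", "ETL"),
    (lambda s: s.client_program == "OBIEE" and s.module == "AdHoc", "Low-Priority"),
    (lambda s: s.client_program == "Oracle Data Mining", "Low-Priority"),
    (lambda s: s.client_program == "OBIEE", "Reports"),
]

DEFAULT_GROUP = "Low-Priority"  # hypothetical fallback for unmatched sessions


def map_session(session: Session) -> str:
    """Return the consumer group of the first rule that matches the session."""
    for predicate, group in RULES:
        if predicate(session):
            return group
    return DEFAULT_GROUP
```

The more specific OBIEE rule (with `module = 'AdHoc'`) is listed before the general OBIEE rule so that ad-hoc sessions are not swallowed by the broader match.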

Step 3: Manage CPU
- CPU is a critical resource
  - Even more critical on Exadata
  - Exadata Smart Scan only returns useful data blocks
  - Exadata Flash Cache completes I/Os in microseconds
  - Result is heavy CPU loads
- Goal
  - Allocate sufficient CPU to Tactical, Reports, and ETL to satisfy performance objectives
  - Allocate excess CPU to Low-Priority workloads
- Solution
  - Configure CPU allocations in Database Resource Plan

Step 3: Manage CPU
Day Time Plan:
  Consumer Group   Level 1   Level 2
  Tactical         60%
  Reports          20%
  ETL              20%
  Low-Priority               100%
- Any CPU unused by Tactical, Reports, or ETL is allocated to Low-Priority sessions
- The DBA can create a Night Time Plan that allocates more CPU to ETL
- Very fine-grained scheduling
  - Resource Manager mimics an OS scheduler
  - Resource Manager schedules at a 100 ms quantum
  - All sessions run, but some sessions run more frequently than others
  - Low-priority session yields to a high-priority session within a quantum
- Background processes are not managed
  - Backgrounds are high-priority and not CPU-intensive
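The two-level Day Time Plan can be illustrated with a small calculation: level 1 groups take up to their share, and whatever they leave unused falls through to level 2. This is a simplified sketch, not Resource Manager's actual algorithm; the function name `allocate_cpu` and the demand figures are assumptions made for the example.

```python
def allocate_cpu(demand, level1, level2):
    """Return the CPU percentage granted to each group.

    demand: percent of total CPU each group would use if unconstrained
            (hypothetical input for this sketch).
    level1 / level2: plan shares in percent at each level.
    """
    granted = {}
    # Level 1: each group gets at most its share, and no more than it asks for.
    for group, share in level1.items():
        granted[group] = min(demand.get(group, 0), share)
    leftover = 100 - sum(granted.values())
    # Level 2: split whatever level 1 left unused according to level-2 shares.
    for group, share in level2.items():
        granted[group] = min(demand.get(group, 0), leftover * share / 100)
    return granted


day_plan_l1 = {"Tactical": 60, "Reports": 20, "ETL": 20}
day_plan_l2 = {"Low-Priority": 100}

# Tactical only needs 30% right now, so 30% falls through to Low-Priority.
demand = {"Tactical": 30, "Reports": 50, "ETL": 20, "Low-Priority": 100}
result = allocate_cpu(demand, day_plan_l1, day_plan_l2)
```

In this sketch Reports stays capped at its 20% share even though it wants more; a real plan may redistribute unused CPU differently.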

CPU Scheduling with Resource Manager
- Sessions wait on “resmgr:cpu quantum” event
- Sessions scheduled every 100 ms
- Resource Plan: Tactical 75%, Reports 25% (Tactical picked 3 out of 4 times)
[Diagram: Tactical and Reports sessions in the Oracle-internal CPU queue, scheduled by the CPU Resource Manager]
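The "picked 3 out of 4 times" behaviour can be reproduced with any weighted scheduler. The sketch below uses smooth weighted round-robin, a generic technique chosen here because it is deterministic. The slides do not say which algorithm Resource Manager uses internally, so this only illustrates the proportions, and the function name `schedule` is hypothetical.

```python
def schedule(weights, quanta):
    """Pick one consumer group per quantum using smooth weighted round-robin."""
    current = {group: 0 for group in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(quanta):
        # Every group earns credit in proportion to its plan share.
        for group, weight in weights.items():
            current[group] += weight
        # The group with the most credit runs for this quantum...
        chosen = max(current, key=current.get)
        # ...and pays back one full round of credit.
        current[chosen] -= total
        picks.append(chosen)
    return picks


# Eight quanta of 100 ms each under the plan from the slide.
picks = schedule({"Tactical": 75, "Reports": 25}, 8)
```

Over any four consecutive quanta starting from an idle state, Tactical is picked three times and Reports once, matching the 75/25 plan.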

Step 4: Manage I/O
- Disk bandwidth is a critical resource
  - Key to exceptional query performance?
  - One query can utilize a high percentage of each disk’s bandwidth
  - Multiple concurrent parallel queries result in heavy disk loads
- Goal
  - Allocate sufficient I/O bandwidth to Tactical, Reports, and ETL to satisfy performance objectives
  - Allocate excess I/O bandwidth to Low-Priority workloads
- Solution
  - Configure I/O allocations in Database Resource Plan
  - Enable Exadata I/O Resource Manager

Exadata I/O Resource Manager
- Issue enough I/Os to keep each disk busy. Queue the rest.
- When an I/O completes:
  1) Pick a Consumer Group queue
  2) Issue the I/O request from the head of that queue
[Diagram: inside the Exadata Storage Cell, the I/O Resource Manager applies the Database Resource Plan to per-group queues (Tactical, Reports, ETL, and Low-Priority I/Os) and issues outstanding I/O requests to disk]
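The issue-then-queue loop described on this slide can be sketched as follows. This is an illustrative model, not Exadata's implementation: the class name `IoScheduler`, the `max_outstanding` parameter, and the strict-priority way of picking a queue are all assumptions (a real plan would pick queues according to its allocations).

```python
from collections import deque


class IoScheduler:
    """Keeps at most `max_outstanding` I/Os on a disk and queues the rest per group."""

    def __init__(self, max_outstanding, group_order):
        self.max_outstanding = max_outstanding
        self.group_order = group_order            # ASSUMPTION: strict priority order
        self.queues = {g: deque() for g in group_order}
        self.outstanding = []

    def submit(self, group, request):
        if len(self.outstanding) < self.max_outstanding:
            self.outstanding.append(request)      # disk not yet busy: issue at once
        else:
            self.queues[group].append(request)    # disk busy: queue the rest

    def complete(self, request):
        self.outstanding.remove(request)
        # 1) Pick a consumer group queue, 2) issue the request at its head.
        for group in self.group_order:
            if self.queues[group]:
                self.outstanding.append(self.queues[group].popleft())
                return


sched = IoScheduler(2, ["Tactical", "Reports", "ETL", "Low-Priority"])
sched.submit("ETL", "e1")
sched.submit("ETL", "e2")
sched.submit("Low-Priority", "l1")   # queued: two I/Os are already outstanding
sched.submit("Tactical", "t1")       # queued behind nothing in its own queue
sched.complete("e1")                 # frees a slot; the Tactical queue is served first
```

After `e1` completes, the Tactical request is issued ahead of the Low-Priority request even though it arrived later.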

Exadata I/O Resource Manager
- Configure Exadata I/O Resource Manager using the Database Resource Plan
  - Same plan used to manage CPU
  - Specify resource allocations per Consumer Group
  - Resource allocation == disk utilization
- Background and ASM I/Os automatically managed
  - Critical I/Os prioritized: instance recovery, LGWR, control file, etc.
- Use IORM metrics to track I/O load per Consumer Group (IOPS, MBPS, disk utilization %)
- I/O throttling per Consumer Group

Step 5: Manage Parallel Execution
- Parallel servers are a limited resource
  - Global database limit specified by parallel_max_servers
  - Too many concurrent parallel statements causes thrashing
- When there are no more parallel servers
  - Critical statements may run serially
  - When parallel servers free up, no way to boost DOP of running statements
- With 11.2, Oracle automatically decides if a statement
  - Executes in parallel or not and what DOP it will use
  - Can execute immediately or will be queued

Parallel Statement Queuing
- Parallel servers are available: parallel statements run immediately
- No more parallel servers available: parallel statements are now queued
[Diagram: Tactical, Batch, and Ad-Hoc statements pass through the Parallel Statement Queue Coordinator into the set of running parallel statements; the available server count is shown as 128, 64, 0, and 32]
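A minimal first-in, first-out model of this behaviour is sketched below. It is not Oracle's implementation: the class `ParallelStatementQueue`, its method names, and the idea of passing the number of servers each statement needs are assumptions made for illustration.

```python
from collections import deque


class ParallelStatementQueue:
    """Runs a statement if enough parallel servers are free, otherwise queues it."""

    def __init__(self, max_servers):
        self.available = max_servers
        self.queue = deque()      # waiting (name, servers_needed) pairs, FIFO
        self.running = {}         # name -> servers held

    def submit(self, name, servers_needed):
        # Queue behind earlier waiters so that arrival order is preserved.
        if not self.queue and servers_needed <= self.available:
            self._start(name, servers_needed)
            return "running"
        self.queue.append((name, servers_needed))
        return "queued"

    def finish(self, name):
        """Release the statement's servers, then start waiters in FIFO order."""
        self.available += self.running.pop(name)
        while self.queue and self.queue[0][1] <= self.available:
            self._start(*self.queue.popleft())

    def _start(self, name, servers_needed):
        self.available -= servers_needed
        self.running[name] = servers_needed


psq = ParallelStatementQueue(max_servers=128)
states = [
    psq.submit("tactical_1", 64),   # 128 -> 64 servers available
    psq.submit("batch_1", 64),      # 64 -> 0 servers available
    psq.submit("batch_2", 32),      # nothing free: queued
]
psq.finish("tactical_1")            # frees 64; batch_2 starts, leaving 32
```

The available-server counts in this run (128, 64, 0, 32) were chosen to mirror the figures on the slide.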

Queuing Shown in Enterprise Manager

Ordering Parallel Statements
- DBAs want to control the order that parallel queries are dequeued
  - Prioritize tactical queries over batch and ad-hoc queries
  - Impose a user-defined policy for ordering queued parallel statements
- Solution with 11.2.0.2
  - Separate queues per Consumer Group
  - Resource Plan specifies which queue’s parallel statements are issued next

Ordering Parallel Statements
- Resource Plan: Priority 1: Tactical; Priority 2, 70%: Batch; Priority 2, 30%: Ad-Hoc
- When parallel servers become available, the resource plan is used to select a queue. The head parallel statement from that queue is run.
- Since Tactical is Priority 1, its parallel statements are always selected first.
- Once there are no more Tactical parallel statements, we pick either Batch or Ad-Hoc. Batch is selected 70% of the time and Ad-Hoc 30%.
[Diagram: the Parallel Statement Queue Coordinator draws from the Tactical, Batch, and Ad-Hoc queues into the set of running queries; the available server count is shown as 16, 0, and 64]
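The dequeue policy on this slide can be sketched as a function that chooses which queue to serve next. The code is illustrative only: `pick_queue` is a hypothetical helper, the statement names are made up, and the random generator is seeded purely so the example is repeatable.

```python
import random
from collections import deque


def pick_queue(queues, rng):
    """Choose which consumer group's queue is served next.

    Priority 1: Tactical always wins when it has waiting statements.
    Priority 2: Batch 70% / Ad-Hoc 30% among the non-empty queues.
    """
    if queues["Tactical"]:
        return "Tactical"
    candidates = {g: w for g, w in (("Batch", 70), ("Ad-Hoc", 30)) if queues[g]}
    if not candidates:
        return None
    groups = list(candidates)
    return rng.choices(groups, weights=[candidates[g] for g in groups])[0]


queues = {
    "Tactical": deque(["t1", "t2"]),
    "Batch": deque(["b1", "b2", "b3"]),
    "Ad-Hoc": deque(["a1", "a2"]),
}
rng = random.Random(0)  # seeded only so the sketch is repeatable
order = []
while (group := pick_queue(queues, rng)) is not None:
    # Always run the head statement of the selected queue.
    order.append(queues[group].popleft())
```

Whatever the random draws are, both Tactical statements come out first, and statements within each group keep their arrival order.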

Reserving Parallel Servers for Critical Workloads
- Flood of batch queries can use up all parallel servers
  - Tactical queries are forced to queue
- Solution
  - Limit the percentage of parallel servers a Consumer Group can use
    - For example, parallel queries from the Batch Consumer Group can only use 50% of the parallel servers
    - Reserves parallel servers for Tactical queries
  - Limit the degree of parallelism of non-critical workloads

Reserving Parallel Servers for Critical Workloads
- Resource Plan: Priority 1: Tactical; Priority 2, 70%: Batch; Priority 2, 30%: Ad-Hoc
- Batch limited to 50% of the parallel servers
- Since parallel servers are available, Tactical queries can be run immediately
[Diagram: the Parallel Statement Queue Coordinator draws from the Tactical, Batch, and Ad-Hoc queues into the set of running queries; the available server count is shown as 32, 64, and 48]
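The effect of a per-group cap can be sketched as an admission check. The function `can_start`, its parameters, and the 64-server total are assumptions made for illustration; this is not how the limit is configured in a real resource plan.

```python
def can_start(group, servers_needed, in_use, total_servers, limits):
    """Return True if the statement may start now.

    in_use: parallel servers currently held, per consumer group.
    limits: maximum percentage of all parallel servers a group may hold
            (groups without an entry are uncapped).
    """
    free = total_servers - sum(in_use.values())
    if servers_needed > free:
        return False
    cap = total_servers * limits.get(group, 100) / 100
    return in_use.get(group, 0) + servers_needed <= cap


TOTAL = 64                 # hypothetical total number of parallel servers
LIMITS = {"Batch": 50}     # Batch may hold at most 50% of the servers
in_use = {"Batch": 32}     # Batch is already at its cap

batch_ok = can_start("Batch", 16, in_use, TOTAL, LIMITS)        # must queue
tactical_ok = can_start("Tactical", 16, in_use, TOTAL, LIMITS)  # may run now
```

Even though 32 servers are free, the next Batch statement has to wait, which is exactly what keeps those servers reserved for Tactical queries.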

Step 6: Restrict Resource Usage
- Requirement
  - Consistent, predictable performance for workloads
  - Useful for hosted environments and departmental apps
- Solution
  - Cap the CPU utilization for a Consumer Group
  - Cap the disk utilization for a Consumer Group
Day Time Plan:
  Consumer Group      Allocation   Limit
  Tactical            60%
  Sales Reports       15%          30%
  Marketing Reports   15%          30%
  ETL                 10%

Step 7: Manage Runaway Queries
- Runaway queries are caused by
  - Missing indexes
  - Unexpected inputs
  - Bad execution plans
- Severely impact performance of well-behaved queries
- Very hard to completely eradicate!
[Chart: Query 1 to Query 4 plotted against query time]

Manage Runaway Queries
- Define runaway queries:
  - Estimated execution time
  - Actual execution time
  - Actual number of I/Os (11.1)
  - Actual bytes of I/O (11.1)
- Manage runaway queries:
  - Switch to another consumer group
    - Lower-priority consumer group
    - Consumer group with max CPU utilization limit (11.2)
  - Abort call
  - Kill session

Manage Runaway Queries
- For Tactical consumer group, runaway means: 30+ sec
  - Switch to “Low Priority” consumer group!
- For Reports consumer group, runaway means: 32GB+ I/Os
  - Abort query!
- For Ad-Hoc consumer group, runaway means: 24+ hour estimated execution time
  - Don’t execute!
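The three example thresholds can be written down as a small decision function. This is a sketch of the policy only; `QueryStats` and `runaway_action` are hypothetical names, and treating 32GB as 32 × 1024³ bytes is an assumption.

```python
from dataclasses import dataclass

GIB = 1024 ** 3  # ASSUMPTION: the slide's "GB" is read as 2**30 bytes


@dataclass
class QueryStats:
    """Hypothetical per-query measurements used by the thresholds."""
    group: str
    elapsed_seconds: float = 0.0
    io_bytes: int = 0
    estimated_seconds: float = 0.0


def runaway_action(q: QueryStats) -> str:
    """Return the action the example policy takes for a query."""
    if q.group == "Tactical" and q.elapsed_seconds >= 30:
        return "switch to Low Priority"
    if q.group == "Reports" and q.io_bytes >= 32 * GIB:
        return "abort query"
    if q.group == "Ad-Hoc" and q.estimated_seconds >= 24 * 3600:
        return "do not execute"
    return "continue"
```

Note that the Ad-Hoc rule works on the estimate before the query runs, while the Tactical and Reports rules react to measurements taken during execution.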

Consumer Group Settings Overview

Step 8: Run and Adjust the Workload
- Run a workload for a period of time and look at the results
- DBRM Adjust:
  - Overall priorities
  - Scheduling of switches in plans
  - Queuing
- System Adjust:
  - How many PX statements
  - PX Queuing levels vs. Utilization levels (should we queue less?)

Resource Manager - End to End
- Test scenario: 2 workloads in a data warehouse
  - Short tactical queries
  - Long running deep (batch) analysis
- Goal:
  - Run batch and tactical analysis concurrently
  - Don’t impact response time of tactical queries!

Resource Manager - End to End

Questions

Additional Information
- Instance Caging: http://www.oracle.com/technetwork/database/features/performance/instance-caging-wp-166854.pdf
- Resource Manager: http://www.oracle.com/technetwork/database/features/performance/resource-manager-twp-133705.pdf