Manage large RAC Clusters Session# 851 Tom S. Reddy Database Administration, Inc.

About the Speaker IOUG Conference Committee 2013 Upgrades, Migrations and New Features Oracle Certified Database Administrator Presented at previous COLLABORATE Conferences Focus on MAA, HA, RAC and performance tuning Chief Technology Officer at Database Administration, Inc.

Survey DBAs Developers Sys Admins Managers Others How many use RAC? How many use Exadata? Versions: 10g? 11g? 12c! Familiar with MAA?

Manage Large RAC Clusters Basic Commands Challenges Solutions Workload Management Services Performance Management Standby and Backups

Overview of RAC What is RAC? How does RAC work? How many nodes can you have? What are the advantages and disadvantages of having a large number of RAC nodes?

Overview of RAC …Cont’d High Availability Highly Scalable Commodity Servers Database Cloud vs Pluggable DB’s Oracle® Real Application Clusters Administration and Deployment Guide 11g Release 2 (11.2)

What is a large cluster? 4 nodes 8 nodes 12 nodes 16 nodes 32 nodes 100 nodes? Nodes vs CPUs

Design of a large cluster Hardware Chassis Network Servers CPU Memory OS Oracle Binaries Disk Layout

Sample Specs Servers: 16 Core Count: 2x6x16 = 192 cores at 3.46 GHz! Memory: 48x16 = 768 GB! SGA > 400 GB!! PGA > 250 GB!!

New System Specs Servers: 8 Core Count: 4x8x8 = 256 cores at 2.7 GHz! Memory: 256x8 = 2048 GB! SGA > 1024 GB!! PGA > 768 GB!! Disk: SSD SAN Infiniband Private Network
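The totals on the two spec slides above can be checked with a few lines of shell arithmetic. This is only a sketch: the function names are made up for illustration, and reading the factors as servers x sockets x cores per socket is an assumption, since the slides give only the products.

```shell
#!/bin/sh
# Hypothetical helpers (not from the slides) to re-derive the slide totals.

# total_cores SERVERS SOCKETS CORES_PER_SOCKET
total_cores() { echo $(( $1 * $2 * $3 )); }

# total_mem_gb SERVERS GB_PER_SERVER
total_mem_gb() { echo $(( $1 * $2 )); }

echo "Sample system: $(total_cores 16 2 6) cores, $(total_mem_gb 16 48) GB"
echo "New system:    $(total_cores 8 4 8) cores, $(total_mem_gb 8 256) GB"
```

Half as many servers end up with more cores and more than twice the memory, which is the point the two slides make side by side.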

Oracle Binary Setup Installation Binaries on Internal Hard Drives Home1: Grid Infrastructure/ASM Home2: Oracle Database Home3: Grid Control Agent This allows for rolling upgrades/patching OCR & Voting Disk Files on dedicated ASM disk group (Normal or High Redundancy)

Challenges Sheer number of nodes How do you build them? How do you install oracle? How do you patch oracle? How do you perform other maintenance?

Challenges …Cont’d How do you monitor them? How do you manage them? Performance Monitoring Performance Management Performance Tuning

Solutions Oracle Tools OEM crsctl srvctl Other Tools Custom Scripts

Get familiar with the following commands:
crsctl -help
crsctl status resource -t

NAME                     TARGET  STATE   SERVER    STATE_DETAILS
Local Resources
ora.LISTENER.lsnr
                         ONLINE  ONLINE  housrv01
                         ONLINE  ONLINE  housrv02
                         ONLINE  ONLINE  housrv03
                         ONLINE  ONLINE  housrv04
                         ONLINE  ONLINE  housrv05
                         ONLINE  ONLINE  housrv06
                         ONLINE  ONLINE  housrv07
                         ONLINE  ONLINE  housrv08
ora.asm
                         ONLINE  ONLINE  housrv01
                         ONLINE  ONLINE  housrv02
                         ONLINE  ONLINE  housrv03
                         ONLINE  ONLINE  housrv04
                         ONLINE  ONLINE  housrv05
                         ONLINE  ONLINE  housrv06
                         ONLINE  ONLINE  housrv07
                         ONLINE  ONLINE  housrv08
ora.LISTENER_SCAN1.lsnr
      1                  ONLINE  ONLINE  housrv05
ora.LISTENER_SCAN2.lsnr
      1                  ONLINE  ONLINE  housrv03
ora.LISTENER_SCAN3.lsnr
      1                  ONLINE  ONLINE  housrv01
ora.r1test.db
      1                  ONLINE  ONLINE  housrv01  Open
      2                  ONLINE  ONLINE  housrv02  Open
      3                  ONLINE  ONLINE  housrv03  Open
      4                  ONLINE  ONLINE  housrv04  Open
      5                  ONLINE  ONLINE  housrv05  Open
      6                  ONLINE  ONLINE  housrv06  Open
      7                  ONLINE  ONLINE  housrv07  Open
      8                  ONLINE  ONLINE  housrv08  Open
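On a large cluster this listing runs to many screens, so it can help to filter it down to the rows that are not ONLINE. The following is a minimal sketch, assuming the tabular layout shown above (a resource name line starting with "ora.", followed by rows of optional instance number, TARGET, STATE, SERVER). The function name crs_not_online is hypothetical and not part of any Oracle tool.

```shell
#!/bin/sh
# Hypothetical filter (not from the slides): reads "crsctl status resource -t"
# style output on stdin and prints "resource state server" for every row
# whose STATE is anything other than ONLINE.
crs_not_online() {
  awk '
    /^ora\./ { res = $1; next }            # remember the current resource name
    {
      i = ($1 ~ /^[0-9]+$/) ? 2 : 1        # skip the instance-number column if present
      if ($i ~ /^(ONLINE|OFFLINE)$/ && $(i + 1) != "ONLINE")
        printf "%s %s %s\n", res, $(i + 1), ($(i + 2) == "" ? "-" : $(i + 2))
    }
  '
}

# Intended use on a cluster node (not executed here):
#   crsctl status resource -t | crs_not_online
```

An empty result means every listed resource row reported ONLINE. A row with no server column is printed with "-" in its place.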

Solutions …Cont’d
Get familiar with the following commands:
srvctl -help
Usage: srvctl <command> <object> [<options>]
    commands: enable|disable|start|stop|relocate|status|add|remove|modify|getenv|setenv|unsetenv|config
    objects: database|instance|service|nodeapps|vip|network|asm|diskgroup|listener|srvpool|server|scan|scan_listener|oc4j|home|filesystem|gns|cvu
For detailed help on each command and object and its options use:
    srvctl <command> -h or
    srvctl <command> <object> -h

srvctl status database -d R1TEST
Instance R1TEST1 is running on node housrv01
Instance R1TEST2 is running on node housrv02
Instance R1TEST3 is running on node housrv03
Instance R1TEST4 is running on node housrv04
Instance R1TEST5 is running on node housrv05
Instance R1TEST6 is running on node housrv06
Instance R1TEST7 is running on node housrv07
Instance R1TEST8 is running on node housrv08
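With eight or more instances, the useful part of this status output is usually the exceptions. A small sketch, assuming a stopped instance is reported as "Instance NAME is not running on node HOST"; the function name srv_not_running is hypothetical.

```shell
#!/bin/sh
# Hypothetical filter (not from the slides): reads "srvctl status database"
# output on stdin and prints "instance node" for each instance reported
# as not running.
srv_not_running() {
  awk '/^Instance .* is not running on node / { print $2, $NF }'
}

# Intended use on a cluster node (not executed here):
#   srvctl status database -d R1TEST | srv_not_running
```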

Solutions …Cont’d Custom scripts raccmd_serial scp to move files cygwin

Solutions …Cont’d
raccmd_serial.sh housrv "ps -ef | grep -v grep | grep smon "
Running command $2 on housrv01
ssh ps -ef | grep -v grep | grep smon
oracle Feb11 ? 00:13:53 ora_smon_r1srvc1
root Feb07 ? 08:29:30 /oracle/grid/11.2.0/grid/bin/osysmond.bin
oracle Feb07 ? 00:00:04 asm_smon_+ASM1
Running command $2 on housrv02
ssh ps -ef | grep -v grep | grep smon
oracle Feb11 ? 00:17:01 ora_smon_r1srvc2
root Feb07 ? 06:31:09 /oracle/grid/11.2.0/grid/bin/osysmond.bin
oracle Feb07 ? 00:00:02 asm_smon_+ASM2
Running command $2 on housrv03
ssh ps -ef | grep -v grep | grep smon
root Feb07 ? 06:40:01 /oracle/grid/11.2.0/grid/bin/osysmond.bin
oracle Feb07 ? 00:00:02 asm_smon_+ASM3
oracle Feb11 ? 00:16:53 ora_smon_r1srvc3
…

Solutions …Cont’d
raccmd_serial.sh (Courtesy of Joel N)

#!/bin/bash
# Run one command over ssh on a numbered range of cluster nodes, one node at a time.
# Arguments: racname (host name prefix), "command", start server number, end server number.
USR=root

usage() {
    echo "$1"
    echo 'SYNTAX raccmd racname "command" start end'
    exit 1
}

[ -z "$1" ] && usage "No Servername"
[ -z "$2" ] && usage "No command"
[ -z "$3" ] && usage "No start server number"
[ -z "$4" ] && usage "No end server number"

for i in $(seq "$3" "$4")
do
    # Node names are the prefix plus a two-digit number, e.g. housrv01.
    host=$(printf '%s%02d' "$1" "$i")
    echo "Running command '$2' on $host"
    echo "ssh $USR@$host $2"
    ssh "$USR@$host" "$2"
done
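A serial loop gets slow as the node count grows. One possible variation, sketched here and not taken from the slides, starts the command on every node in the background, waits for all of them, and then prints each node's output in node order. The function name raccmd_parallel and the RUNNER override are hypothetical. The two-digit host naming and the root default follow the serial script.

```shell
#!/bin/sh
# Hypothetical parallel counterpart to raccmd_serial.sh (not from the slides).
# RUNNER defaults to ssh; set RUNNER=echo for a dry run that touches no host.
raccmd_parallel() {
  prefix=$1 cmd=$2 first=$3 last=$4
  runner=${RUNNER:-ssh}
  usr=${USR:-root}
  tmp=$(mktemp -d) || return 1
  for i in $(seq "$first" "$last"); do
    host=$(printf '%s%02d' "$prefix" "$i")
    # One background job per node; output is buffered in a per-host file.
    "$runner" "$usr@$host" "$cmd" > "$tmp/$host" 2>&1 &
  done
  wait    # block until every background job has finished
  for i in $(seq "$first" "$last"); do
    host=$(printf '%s%02d' "$prefix" "$i")
    echo "== $host"
    cat "$tmp/$host"
  done
  rm -rf "$tmp"
}

# Dry run: shows what would be sent to each node.
RUNNER=echo raccmd_parallel housrv "ps -ef | grep smon" 1 3
```

Buffering per host keeps the output readable. The trade-off is that nothing is printed until the slowest node has answered.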

Solutions …Cont’d Cygwin xterm_all_nodes (Courtesy of Kevin L) Opens an xterm on each node in the cluster exe_all_nodes Opens an xterm on each node (including the one you are logged in on) and runs the specified executable

Workload Management Services Load Balancing Resource Manager Performance Management SQL Monitoring

Services Enable Workload management by routing work to optimal instances Helps lower Interconnect traffic vs Cache Fusion One or more Services per Instance Typically one per Application or type of Workload Large Database Cloud can be divided into smaller/manageable resources DBMS_SCHEDULER & Job Classes

Services
R1COMPC

Node1        Node2        Node3        Node4        Node5        Node6        Node7        Node8
hourCOMPa01  hourCOMPa02  hourCOMPa03  hourCOMPa04  hourCOMPa05  hourCOMPa06  hourCOMPa07  hourCOMPa08
R1COMPC1     R1COMPC2     R1COMPC3     R1COMPC4     R1COMPC5     R1COMPC6     R1COMPC7     R1COMPC8

        ARAision  Job Class  Service Name
PPAA    95        ARA_095    R1COMPC_ARA_095
AAPP    30        ARA_030    R1COMPC_ARA_030
AAPP    60        ARA_060    R1COMPC_ARA_060
PPAA    20        ARA_020    R1COMPC_ARA_020
PPAA    10        ARA_010    R1COMPC_ARA_010
PPAA    50        ARA_050    R1COMPC_ARA_050
PPAA    12        ARA_012    R1COMPC_ARA_012
PPAA    14        ARA_014    R1COMPC_ARA_014
AAPP    63        ARA_063    R1COMPC_ARA_063
AAPP    293       ARA_293    R1COMPC_ARA_293
AAPP    305       ARA_305    R1COMPC_ARA_305
AAPP    72        ARA_072    R1COMPC_ARA_072
AAPP    306       ARA_306    R1COMPC_ARA_306
AAPP    93        ARA_093    R1COMPC_ARA_093
AAPP    91        ARA_091    R1COMPC_ARA_091
AAPP    80        ARA_080    R1COMPC_ARA_080
AAPP    94        ARA_094    R1COMPC_ARA_094
PPAA    NA                   R1COMPC_IDR

95    30,60    20,10,50,12,14    63,293,305,72,306,93,91,80,94
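The P and A flags in the table read like preferred and available instance placement, though that reading is an assumption. Under it, each row corresponds to one service definition. The sketch below only prints the srvctl command it would run. The function name service_cmd is hypothetical, and so is the mapping of the first row's flags onto instances R1COMPC1 to R1COMPC4. The -d, -s, -r and -a options are the 11.2 srvctl add service options for database, service, preferred list and available list.

```shell
#!/bin/sh
# Hypothetical helper (not from the slides): prints, but does not run, the
# srvctl command that would define one service.
# service_cmd DB SERVICE PREFERRED_LIST AVAILABLE_LIST
service_cmd() {
  printf 'srvctl add service -d %s -s %s -r %s -a %s\n' "$1" "$2" "$3" "$4"
}

# Illustrative only: the instance lists below are assumed, not from the slide.
service_cmd R1COMPC R1COMPC_ARA_095 R1COMPC1,R1COMPC2 R1COMPC3,R1COMPC4
```

Printing the commands first makes it easy to review a whole table of services before anything is changed on the cluster.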

Resource Manager Resource throttling based on Consumer Groups Typically used to manage CPU Parallelism Services Basic: Limit user CPU load to 90% Complex: Consumer Groups/additional resources

Performance Management OEM Performance Tab SQL Monitoring Top Activity Tab Top Sessions Top Services

OEM Performance Tab

AWR Trends [table: Date, CPU %, User IO %, Other %]

SQL Monitoring Best way to monitor real-time SQL using OEM Real-Time Monitoring of SQL Database or Instance Level SQL must have consumed at least 5 seconds of CPU or I/O time, or be run in parallel, to be captured!

Top Activity Sessions

Top Activity Services

Performance Management ADDM addmrpt.sql vs addmrpti.sql AWR awrgrpt.sql vs awrrpti.sql awrgdrpt.sql vs awrgdrpi.sql ASH ashrpti.sql
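The script pairs above are easy to mix up. A small lookup, sketched below, maps a report type and scope to the script name. The function name report_script and the scope labels (current, global, instance, instances) are made up for illustration; only the script names come from the slide.

```shell
#!/bin/sh
# Hypothetical lookup (not from the slides): report type + scope -> script name.
report_script() {
  case "$1:$2" in
    addm:current)      echo addmrpt.sql ;;   # ADDM for the current instance
    addm:instance)     echo addmrpti.sql ;;  # ADDM for a chosen instance
    awr:global)        echo awrgrpt.sql ;;   # AWR across the whole cluster
    awr:instance)      echo awrrpti.sql ;;   # AWR for a chosen instance
    awrdiff:global)    echo awrgdrpt.sql ;;  # AWR compare periods, cluster wide
    awrdiff:instances) echo awrgdrpi.sql ;;  # AWR compare periods, chosen instances
    ash:instance)      echo ashrpti.sql ;;   # ASH for a chosen instance
    *) echo "unknown report: $1 $2" >&2; return 1 ;;
  esac
}

# Intended use (not executed here); the scripts ship in $ORACLE_HOME/rdbms/admin:
#   sqlplus / as sysdba @"$ORACLE_HOME/rdbms/admin/$(report_script awr global)"
```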

ADDM

AWR

AWR Instance

ASH

Standby Physical Standby Equal or fewer nodes All standby nodes receive archived logs Only one apply node Switchover/Failover DGMGRL Manual (shutdown all other nodes) and work on one node in each location OEM…doesn’t always work

Standby Check
Manual
Primary
select thread#, max(sequence#) from v$archived_log group by thread# order by 1;
Standby
select thread#, max(sequence#) from v$archived_log where applied = 'YES' group by thread# order by 1;
select name, value, time_computed from v$dataguard_stats;
DGMGRL
show configuration verbose
show database verbose psrva
show instance verbose psrva1
show database verbose s1srva
OEM
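With one redo thread per instance, comparing the two query results by eye gets tedious on a large cluster. The sketch below assumes each query's output has been saved to a file as plain "thread# sequence#" lines and prints the per-thread difference. The function name apply_gap and the file format are assumptions, not part of the slides.

```shell
#!/bin/sh
# Hypothetical helper (not from the slides).
# apply_gap PRIMARY_FILE STANDBY_FILE
# Each file holds lines of "thread# max_sequence#". Prints "thread# gap",
# where gap = highest archived sequence on the primary minus highest
# applied sequence on the standby.
apply_gap() {
  awk 'NR == FNR { primary[$1] = $2; next }       # first file: primary
       ($1 in primary) { print $1, primary[$1] - $2 }' "$1" "$2"
}

# Example with made-up numbers:
p=$(mktemp); s=$(mktemp)
printf '1 1200\n2 1187\n' > "$p"
printf '1 1200\n2 1185\n' > "$s"
apply_gap "$p" "$s"
rm -f "$p" "$s"
```

A gap of 0 on every thread means the standby has applied everything the primary has archived. A growing gap on one thread points at that instance's redo.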

Standby Check
THREAD# MAX(SEQUENCE#)
rows selected

NAME                    VALUE         TIME_COMPUTED
transport lag           +00 00:00:00  4/7/ :42
apply lag               +00 00:00:00  4/7/ :42
apply finish time       +00 00:00:    /7/ :42
estimated startup time  122           4/7/ :42

Backups RMAN Image Copy Backups & Incremental Updates Backups at the Standby Create channels by Instances CONFIGURE CHANNEL DEVICE TYPE sbt CONNECT Change Tracking file for Incrementals Managed by OEM Flashback Technologies

Duplicates Duplicate from large RAC production environments to an equivalent or smaller number of nodes Regression or Development Refreshes RMAN Live Duplicate Takes just a few hours on a multi-TB DB RMAN Backup-based Duplicate SAN based Snapshots can be much faster…Testing

Summary Basic Commands Challenges Solutions Workload Management Services Performance Management Standby and Backups

Manage large RAC Clusters Session# 851 Questions? Please fill out evaluations!
