Superdome Performance: Partitions, Dynamic CPU Allocation, and Other Techniques
Tom Anderson, HP

Contents
– Hard partitions and virtual partitions
– Tuning advantages of partitions
– Software performance optimizations
– Mixing processors in a Superdome
– Instant capacity on demand (iCOD)
– Dynamic CPU allocation options
– Variable usage (pay for use)
– HP-UX, Windows, and Linux all in the same Itanium Superdome (future)
– Summary

Hard partitions and soft partitions
Hard partitions (nPartitions)
– "Separate" servers, but without separate "skins"
– Electrical isolation
– Up to 16 per Superdome
– Independent instance of HP-UX
Virtual partitions
– Software isolation
– Dynamic CPU allocation
– Up to 64 per Superdome
– Independent instance of HP-UX
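Both partition layers can be inspected from the command line. A minimal sketch using the HP-UX nPartition and vPars status commands (output columns vary by release; the vPar name "vpar1" is a hypothetical example):

  # List all nPartitions in the complex and their cell assignments
  parstatus -P

  # Verbose detail for nPartition 0
  parstatus -V -p 0

  # From inside a virtual partition: list all vPars in this nPartition,
  # including CPU counts and boot state
  vparstatus

  # Verbose view of a single vPar
  vparstatus -p vpar1 -v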

nPartitions for Superdome
Increased system utilization
– Partitioning Superdome into physical entities: up to 16 nPartitions
Increased flexibility: multi-OS
– Multi-OS support: HP-UX, Linux (*), Windows (*)
– Multi-OS version support
– Multiple patch level support
Increased uptime
– Hardware and software isolation across nPartitions
– MC/ServiceGuard support (within Superdome or to another HP 9000 server)
* Available in 2H2003

partition examples: isolation of production from test and development
[Diagram: a 32-way Superdome split into three partitions: Partition 1 Development, Partition 2 Test, Partition 3 Customer Interaction / Production Environment]

partition examples: multi-tier applications
[Diagram: a 32-way Superdome split into two partitions: Partition 1 Oracle Database Tier, Partition 2 SAP R/3 Application Tier]

virtual partitions
– A single nPartition may be soft-partitioned into multiple virtual servers
– Each virtual partition (vPar) runs an independent instance of HP-UX, providing complete name-space isolation
– vPars may run separate release and patch levels of HP-UX
– vPars may be individually reconfigured and rebooted (a creation sketch follows this list)
– Dynamic reconfiguration of CPUs
– 16 Superdome nPartitions can be further partitioned into up to 64 virtual partitions
– Available on Superdome now
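A vPar is defined with a single vparcreate command naming its CPU, memory, and I/O resources. A minimal sketch, assuming a hypothetical vPar "vpar1" with two CPUs (allowed to range between one and four), 4 GB of memory, and one I/O path; the resource values are illustrative, not a tested configuration:

  # Define a new virtual partition inside the current nPartition:
  #   cpu::2      - two CPUs at boot
  #   cpu:::1:4   - allow between 1 and 4 CPUs over its lifetime
  #   mem::4096   - 4096 MB of memory
  #   io:0/0      - an I/O hardware path (site-specific)
  vparcreate -p vpar1 -a cpu::2 -a cpu:::1:4 -a mem::4096 -a io:0/0

  # Boot the new vPar
  vparboot -p vpar1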

three virtual partitions on a single-cell Superdome nPartition
[Diagram: one cell with CPUs P0-P3, 16 GB of memory (0-8 GB, 8-12 GB, 12-16 GB ranges), a cell controller, an I/O controller, and PCI slots 0-11, divided among three vPars]
– Blue partition uses two CPUs, 8 GB memory, 6 PCI slots
– Red partition uses one CPU, 4 GB memory, 3 PCI slots
– Yellow partition uses one CPU, 4 GB memory, 3 PCI slots
Unbound CPUs can dynamically move between vPars without a reboot. Example: CPU P1 (an unbound CPU in this case) can move from the blue partition to the red partition when the red partition needs more processing power.

customer example: vPartitions to isolate development environments
[Diagram: a 32-way Superdome with five partitions: Partition 1 Oracle Database, Partition 2 Content Mgmt, and Development Projects 1-4 each isolated in its own vPar (vPar 1-4)]
Significant development productivity boost

Virtual or hard partitions: a partition performance payoff
Unix likes to be tuned for a single application.
[Diagram: Partition 1 runs app1 on HP-UX 1; Partition 2 runs app2 on HP-UX 2. In each partition the application and its HP-UX instance are optimally tuned for each other, forming a "tuned pair".]

Actual customer example of performance improvement with hard partitions
A large Asian bank running an ISV banking package on the original 550 MHz PA-8600 processors:
– One large partition: 1,347 tx/sec
– Four hard partitions: 1,506 tx/sec
– A 12% improvement simply by partitioning the system

Virtual partitions for increased performance: dynamic CPU migration
Before: vp1 has 4 CPUs at 40% utilization (app1 on HP-UX 1); vp2 has 4 CPUs at 85% utilization (app2 on HP-UX 2).
After one CPU is moved: vp1 has 3 CPUs at 50% utilization; vp2 has 5 CPUs at 70% utilization, and its response time drops accordingly.
vPars can cross cell boundaries. The move itself is sketched below.
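The migration is one vparmodify call per partition. A minimal sketch, assuming vPars named "vp1" and "vp2" as in the example above; -m cpu::N sets the partition's new total CPU count, and only unbound CPUs can move while both vPars stay up:

  # Shrink vp1 from 4 CPUs to 3, releasing one unbound CPU
  vparmodify -p vp1 -m cpu::3

  # Grow vp2 from 4 CPUs to 5, claiming the released CPU (no reboot)
  vparmodify -p vp2 -m cpu::5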

Performance optimization
[Diagram: two partitions, each with its own application, Oracle, file system, HP-UX kernel parameters, and patches]
Tools: IOzone, GlancePlus
– Application tuning
– Kernel tuning for the most common environment
– Optimize the file system if I/O intensive
– Run multiple copies of applications if they don't scale to a larger number of CPUs
– Use partitions to allow HP-UX to be ideally tuned to the application
– Use patches PHCO_26466 and PHKL_26468 for POSIX threads applications, for a big payoff

Performance boost with faster processors while protecting investment
[Diagram: seven cells across three partitions: partition 1 with 12 CPUs (PA-8600), partition 2 with 8 CPUs (PA-8700), partition 3 with 8 CPUs (PA-8700+)]
– Partition 1: keep the PA-8600s for investment protection and use this partition for non-performance-sensitive applications
– Partition 2: use for moderately performance-sensitive applications
– Partition 3: upgrade to PA-8700+ for performance-demanding applications
You can upgrade to PA-8700+ online, one partition at a time, so applications running in the other partitions can keep running.

iCOD (instant capacity on demand)
Two types of iCOD:
– Permanent
– Temporary capacity (new)
[Diagram: a cell board with three active CPUs plus one iCOD standby CPU, activated when needed]

iCOD temporary capacity
Provides a performance boost when needed:
– End of month/quarter/year
– Unexpected peak in demand: activate by command
– During development stress testing, etc.
Extremely flexible: spread out the "30 CPU-days" however you need them.
Moves closer to the "computing utility" concept.

add CPUs when needed for performance: iCOD temporary capacity
iCOD CPUs are now available in 30 CPU-day "chunks".
[Graph: number of CPUs vs. number of days] Example, starting from 60 CPU-days of processing power:
– 1 CPU for 5 days (5 CPU-days)
– 3 CPUs for 7 days (21 CPU-days)
– 5 CPUs for 2 days (10 CPU-days)
That consumes 36 CPU-days, leaving 24 of the original 60.

Dynamic CPU Allocation Capabilities
Virtual partitions
– Dynamically move unbound CPUs between partitions without reboot, by command or automatically via WLM
iCOD (instant capacity on demand)
– Dynamically activate a CPU without reboot by a simple command, a script, or automatically by goal-based WLM when more resources are needed, such as to maintain an SLA of 2-second response times (see the sketch after this list)
– For load balancing, you can activate a CPU in one hard partition and deactivate a CPU in another hard partition, without reboot and at no charge
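Command-line activation can also be scripted. A minimal sketch, assuming the HP-UX iCOD client software is installed; the icod_modify argument syntax here is an assumption from memory and should be verified against the icod_modify(1M) man page for your release:

  # Show current iCOD status: active vs. standby processors
  icod_stat

  # Activate two standby iCOD processors in this partition (no reboot;
  # syntax assumed, see man page)
  icod_modify -a 2

  # Later, deactivate one processor once the peak has passed
  icod_modify -d 1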

Dynamic CPU Allocation Capabilities
[Diagram: a Superdome with four partitions; legend: VP = virtual partition, iCPU = iCOD standby processor, aCPU = active processor]
– Virtual partition example: dynamically move unbound CPUs across virtual partitions (no reboot required)
– "Classic" iCOD example: activate an iCOD CPU by (1) command or (2) automatically by WLM to meet a response time goal (goal-based WLM); no reboot required
– Load balancing example: activate an iCPU in one partition (3) and deactivate an aCPU in another partition (4); no charge, no reboot required

Dynamic CPU Allocation Capabilities
Utility pricing
– Can activate and deactivate CPUs without reboot to meet usage demands and save money
Deallocation of "misbehaving" CPUs
– Automatically deactivate one or more "troubled" CPUs and keep the application running without reboot (CPU granularity of 1, not Sun's 4)
– A: later replace the CPU and reboot at a convenient time, or
– B: with iCOD, automatically replace the deactivated CPU immediately, without charge and without a reboot
Hard partitions
– Today: can add CPUs by adding a new hard partition
– Today: can remove cells from one partition and add cells to another partition without rebooting any of the unaffected partitions (dynamic nPars); a sketch follows this list
– Future: can add or delete cells (CPUs) online (cell online add and delete)
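Cell moves between nPartitions use the parmodify command. A minimal sketch, assuming cell 6 moves from partition 1 to partition 2; the full cell attribute string (boot and failure policies) is omitted here, and the exact syntax should be checked against parmodify(1M):

  # Remove cell 6 from nPartition 1
  parmodify -p 1 -d cell:6

  # Add cell 6 to nPartition 2
  parmodify -p 2 -a cell:6

  # Only the two partitions involved reboot; the rest of the
  # Superdome keeps running. Confirm the new layout:
  parstatus -P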

Dynamic CPU Allocation Capabilities
[Diagram: a Superdome with five partitions and their cells; legend: VP = virtual partition, iCPU = iCOD standby processor, aCPU = active processor]
– Utility pricing example: at the end of the month an iCPU gets activated for heavy processing loads; in the middle of the month an aCPU gets deactivated (no reboot required)
– CPU replacement example: aCPU2 "misbehaves" and is automatically deallocated; an iCPU is automatically activated to replace it, at no charge (no reboot required)
– "Dynamic" cell example: "dynamically" reallocate cells between partitions (only requires a reboot of the partitions involved; the rest of the Superdome continues to run)
– New partition example: "dynamically" add CPUs by adding a new partition, Partition 5 (requires an initial boot of only the new partition)

Pay for use (utility concept)
– Activate CPUs when needed for performance; deactivate them when not needed
– Pay based on the average number of CPUs turned on per month
– Great for peak periods: tax time, end of month/quarter/year, election time, holidays, noon/mid-morning
– Helps smooth out IT headaches

metering server usage
The utility customer's invoice is based on the monthly average of metered usage.

Future Itanium-based Superdome (2003)
[Diagram: three hard partitions connected by a 100BT or 1000BT LAN: an HP-UX hard partition (application1, database e.g. Oracle, fibre channel driver1), a Windows Data Center hard partition (application2, database e.g. SQL Server, fibre channel driver2), and a Linux hard partition (application3, database e.g. Oracle or MySQL, fibre channel driver3), each with its own LAN card]
For performance optimization you can pick the best application and operating system combination.

Future Itanium-based Superdome (2003)
[Same diagram as the previous slide]
The database, and the database's performance, will most likely differ across the three operating systems.

Summary
Superdome has many opportunities for increased performance:
– Normal tuning and optimization with only one partition
– Normal tuning and optimization within each partition
– Use of partitions in general
– iCOD addition of CPUs
– Dynamic allocation of CPUs
– Utility CPUs on demand
– Addition of a faster-CPU (PA-8700+) partition

Attachment: detailed tuning tips
Goal: these tuning tips are intended to be a short, simplified list (not a whole chapter) of some "high payoff" tunes.
Three areas:
– Application tuning
– Kernel tuning
– System tuning

Application tuning
o Higher compiler optimization levels are not always optimal for all applications. Profile (use prof, gprof, and Caliper) and use a balanced set of optimization flags for best performance.
o If aggressive optimizations break your applications, identify the offending routines (by binary search) and compile them at lower optimization levels.
o Some key compiler and linker flags worth trying are: +O3, +Odataprefetch, +Onolimit, +Olibcalls, +FPD (read more via "man f90", "man cc", "man aCC", etc.).
o Build your executable using archive libraries as much as possible; that means linking with "-Wl,-archive" or "-Wl,-archive_shared".
o Build and run your applications in the 32-bit address space unless you have a need to go to 64-bit.
o Set the initial data page size to the most appropriate page size using the "chatr" command (see "man chatr", and the sketch after this list).
o For parallel applications, pay special attention to compiler and linker environments to build the most efficient parallel code.
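Putting a few of these flags together in one build. A minimal sketch, assuming a C program compiled with HP's ANSI C compiler; the 64 MB page size is an illustrative choice, not a recommendation:

  # Compile with aggressive-but-balanced optimization and data prefetching
  cc +O3 +Odataprefetch +Onolimit -o myapp myapp.c

  # Prefer archive libraries at link time
  cc +O3 -Wl,-archive -o myapp myapp.c

  # Set the initial data (+pd) and instruction (+pi) page sizes
  # on the finished binary, then verify
  chatr +pd 64M +pi 64M myapp
  chatr myapp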

Kernel tuning
o Investigate and optimize the kernel tuning parameters that are most optimal for most of your applications (a kmtune sketch follows this list).
o For I/O-intensive applications, set a large buffer cache using the static buffer cache model (set nbuf and bufpages to non-zero values) rather than the dynamic model (set dbc_min_pct and dbc_max_pct as a percentage of memory).
o If your runtime environment consists of large I/O-intensive processes, set maxfiles, maxfiles_lim, and nfile to large values.
o Make sure swchunk and maxswapchunks are set large enough to provide sufficient swap space. If you cannot for some reason, set swapmem_on=1 to turn on the pseudo-swap feature.
o For pthread-based parallel applications, you may have to increase the nkthread parameter appropriately.
o For shared-memory parallel applications, set appropriately high values for shmmax, shmmni, and shmseg.
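On HP-UX 11i these parameters are set with kmtune, followed by a kernel rebuild and reboot for static parameters. A minimal sketch; the values are illustrative placeholders, not sizing recommendations:

  # Inspect current values
  kmtune -q shmmax -q nfile -q nkthread

  # Queue new values (example numbers only)
  kmtune -s shmmax=0x40000000     # 1 GB max shared memory segment
  kmtune -s nfile=65536           # system-wide open file table
  kmtune -s nkthread=6000         # kernel thread limit for pthread apps

  # Rebuild the kernel, then reboot at a convenient time
  mk_kernel -o /stand/vmunix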

System tuning
o Install, and periodically update, all the OS patches to get the best performance.
o Pay special attention to the specific patches for your environment. For example, pthread-based applications will show a significant boost in performance with the patches PHCO_26466 and PHKL_26468.
o Configure your system with enough swap space. Swap should be more than (or at least equal to) the physical memory. Also, distribute the swap space evenly across drives and controllers (a sketch follows this list).
o For I/O-intensive applications, build, test, and optimize the file systems with the most suitable mount options. Test using applications such as IOzone or Bonnie.
o Investigate system-wide CPU, memory, and I/O bottlenecks using HP tools such as GlancePlus, tusc, sar, and vmstat.
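Checking the swap layout against physical memory is quick with standard HP-UX commands. A minimal sketch; the device paths and mount point are hypothetical placeholders for real volumes on your system:

  # Total and per-device swap, in MB
  swapinfo -tm

  # Add a second swap device on another controller to spread the load
  swapon /dev/vg01/lvol2

  # For I/O-intensive VxFS file systems, experiment with mount options,
  # e.g. direct I/O for large sequential workloads
  mount -F vxfs -o mincache=direct,convosync=direct /dev/vg01/lvol3 /data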