FermiGrid Highly Available Grid Services
Eileen Berman, Keith Chadwick, Fermilab (Apr 11, 2008)
Work supported by the U.S. Department of Energy under contract No. DE-AC02-07CH11359.

Outline
FermiGrid - Architecture & Performance
FermiGrid-HA - Why?
FermiGrid-HA - Requirements & Challenges
FermiGrid-HA - Implementation
Future Work
Conclusions

FermiGrid - Architecture (mid 2007)
[Architecture diagram: a site-wide gateway sits between the exterior and the interior clusters (CMS WC1/WC2/WC3, CDF OSG1/OSG2, D0 CAB1/CAB2, GP Farm, GP MPI), alongside the VOMRS, VOMS, GUMS and SAZ servers, Gratia accounting, BlueArc storage and the FERMIGRID SE (dCache SRM). Clusters send ClassAds via CEMon to the site-wide gateway; the VOMRS and VOMS servers synchronize periodically.]
Step 1 - User registers with a VO.
Step 2 - User issues voms-proxy-init and receives VOMS-signed credentials.
Step 3 - User submits their grid job via globus-job-run, globus-job-submit, or condor-g.
Step 4 - Gateway checks against the Site AuthoriZation (SAZ) Service.
Step 5 - Gateway requests a GUMS mapping based on VO & role.
Step 6 - Grid job is forwarded to the target cluster.

FermiGrid-HA - Why?
The FermiGrid "core" services (VOMS, GUMS & SAZ) control access to:
- Over 2,500 systems with more than 12,000 batch slots (and growing!).
- Petabytes of storage (via gPlazma / GUMS).
An outage of VOMS can prevent users from submitting jobs. An outage of either GUMS or SAZ can cause 5,000 to 50,000 jobs to fail for each hour of downtime (see the rough calculation below). Manual recovery or intervention for these services can have long recovery times (best case 30 minutes, worst case multiple hours). Automated service recovery scripts can minimize the downtime (and the impact to Grid operations), but their response to a failure can still take several tens of minutes, limited by:
- How often the scripts run,
- The fact that scripts can only deal with failures that have known "signatures",
- The startup time for the service,
- A script cannot fix dead hardware.
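To make the outage cost concrete, here is a back-of-the-envelope calculation; it is only a sketch, using the per-hour failure range and recovery times quoted above rather than any new measurement:

```python
# Rough impact of a GUMS/SAZ outage, using the slide's quoted range of
# 5,000 - 50,000 failed jobs per hour of downtime.
FAIL_RATE_LOW, FAIL_RATE_HIGH = 5_000, 50_000   # jobs lost per hour of outage

for outage_hours in (0.5, 1, 4):                # best-case recovery to "multiple hours"
    low = outage_hours * FAIL_RATE_LOW
    high = outage_hours * FAIL_RATE_HIGH
    print(f"{outage_hours:>4} h outage: {low:>9,.0f} - {high:>9,.0f} failed jobs")
```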

FermiGrid-HA - Requirements
Requirements:
- Critical services hosted on multiple systems (n ≥ 2).
- Small number of "dropped" transactions when failover is required (ideally 0).
- Support the use of service aliases (see the lookup sketch below):
  - VOMS: fermigrid2.fnal.gov -> voms.fnal.gov
  - GUMS: fermigrid3.fnal.gov -> gums.fnal.gov
  - SAZ: fermigrid4.fnal.gov -> saz.fnal.gov
- Implement "HA" services using services that did not include "HA" in their design, without modification of the underlying service.
Desirables:
- Active-Active service configuration.
- Active-Standby if Active-Active is too difficult to implement.
- A design which can be extended to provide redundant services.
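The alias requirement can be checked from any client machine. The following is a minimal sketch, not part of the FermiGrid tooling; the hostnames come from the slide, and the output naturally depends on the site DNS at the time it is run:

```python
# Minimal sketch: look up the service aliases named above and print what they
# currently resolve to.
import socket

for alias in ("voms.fnal.gov", "gums.fnal.gov", "saz.fnal.gov"):
    try:
        canonical, _aliases, addresses = socket.gethostbyname_ex(alias)
        print(f"{alias:>15} -> {canonical} ({', '.join(addresses)})")
    except socket.gaierror as err:
        print(f"{alias:>15} -> lookup failed: {err}")
```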

FermiGrid-HA - Challenges #1
Active-Standby:
- Easier to implement,
- Can result in "lost" transactions to the backend databases,
- Lost transactions can then result in inconsistencies or unexpected configuration changes following a failover:
  - GUMS pool account mappings.
  - SAZ whitelist and blacklist changes.
Active-Active:
- Significantly harder to implement (correctly!).
- Allows greater "transparency".
- Reduces the risk of a "lost" transaction, since any transaction which results in a change to the underlying MySQL databases is "immediately" replicated to the other service instance.
- Very low likelihood of inconsistencies.
  - Any service failure is highly correlated in time with the process which performs the change.

FermiGrid-HA - Challenges #2
DNS:
- The initial FermiGrid-HA design called for DNS names, each of which would resolve to two (or more) IP numbers.
- If a service instance failed, the surviving service instance could restore operations by "migrating" the IP number of the failed instance to the Ethernet interface of the surviving instance.
- Unfortunately, the tool used to build the DNS configuration for the Fermilab network did not support DNS names resolving to more than one IP number.
  - Back to the drawing board.
Linux Virtual Server (LVS):
- Route all IP connections through a system configured as a Linux Virtual Server.
  - Direct routing: the request goes to the LVS director, the director redirects the packets to the real server, and the real server replies directly to the client.
- Increases complexity, parts and system count:
  - More chances for things to fail.
- The LVS director must itself be implemented as an HA service.
  - The LVS director is implemented as an Active-Standby HA service.
- The LVS director performs "service pings" every six (6) seconds to verify service availability.
  - A custom script that uses curl for each service (see the sketch after this list).
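The production check is a site-specific script; the sketch below only illustrates the idea of a curl-based "service ping" that the director could invoke as an external check every six seconds. The URL, timeout, and the use of Python are assumptions, not the actual FermiGrid script:

```python
# Sketch of a curl-based "service ping" for one service instance. Exit status 0
# means the instance answered; anything else tells the director to take the
# real server out of rotation. The target URL below is illustrative only.
import subprocess
import sys

def service_alive(url: str, timeout_seconds: int = 5) -> bool:
    # curl exits 0 only if the transfer succeeded; -k skips certificate
    # verification in this sketch, -s/-o discard the response body.
    result = subprocess.run(
        ["curl", "-k", "-s", "-o", "/dev/null",
         "--max-time", str(timeout_seconds), url],
        check=False,
    )
    return result.returncode == 0

if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "https://gums.fnal.gov:8443/"
    sys.exit(0 if service_alive(target) else 1)
```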

FermiGrid-HA - Challenges #3
MySQL databases underlie all of the FermiGrid-HA services (VOMS, GUMS, SAZ):
- Fortunately, all of these Grid services employ relatively simple database schemas.
- Utilize multi-master MySQL replication:
  - Requires MySQL 5.0 (or greater).
  - The databases perform circular replication.
- Currently have two (2) MySQL databases:
  - MySQL 5.0 circular replication has been shown to scale up to ten (10) databases.
  - A failed database "cuts" the circle, and the circle must then be "retied".
- Transactions to either MySQL database are replicated to the other database within 1.1 milliseconds (measured).
- Tables which include auto-incrementing columns are handled with the following MySQL 5.0 configuration entries (see the sketch below):
  - auto_increment_offset (1, 2, 3, … n)
  - auto_increment_increment (10, 10, 10, …)
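A short illustration of why those two settings prevent primary-key collisions between the masters; this is only a sketch of the arithmetic, with per-server values following the pattern given above:

```python
# With auto_increment_increment = 10 and a distinct auto_increment_offset per
# server, each master hands out keys from a disjoint arithmetic sequence, so
# rows inserted concurrently on different masters never collide.
def auto_increment_keys(offset: int, increment: int = 10, count: int = 5):
    return [offset + i * increment for i in range(count)]

print(auto_increment_keys(offset=1))  # server 1: [1, 11, 21, 31, 41]
print(auto_increment_keys(offset=2))  # server 2: [2, 12, 22, 32, 42]
```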

FermiGrid-HA - Technology
Xen:
- SL + Xen (from the XenSource community version).
  - 64-bit Xen Domain 0 hosts; 32- and 64-bit Xen VMs.
- Paravirtualisation.
Linux Virtual Server (LVS 1.38):
- Shipped with Piranha V0.8.4 from Red Hat.
Grid Middleware:
- Virtual Data Toolkit (VDT 1.8.1).
- VOMS V1.7.20, GUMS V1.2.10, SAZ V1.9.2.
MySQL:
- MySQL V5 with multi-master database replication.

FermiGrid-HA - Component Design
[Component diagram: a client connects through an Active/Standby pair of LVS directors (linked by heartbeat); the active director fronts Active-Active pairs of VOMS, GUMS and SAZ services, which in turn use a pair of Active-Active MySQL servers that replicate to each other.]

FermiGrid-HA - Client Communication
1. The client starts by making a standard request for the desired grid service (VOMS, GUMS or SAZ) using the corresponding service "alias": voms.fnal.gov, gums.fnal.gov, saz.fnal.gov (and fg-mysql.fnal.gov for the database).
2. The active LVS director receives the request and, based on the currently available servers and the load-balancing algorithm, chooses a "real server" to forward the grid service request to (see the sketch after this list), specifying a respond-to address of the original client.
   voms = fg5x1.fnal.gov, fg6x1.fnal.gov
   gums = fg5x2.fnal.gov, fg6x2.fnal.gov
   saz = fg5x3.fnal.gov, fg6x3.fnal.gov
3. The "real server" grid service receives the request and makes the corresponding query to the MySQL database on fg-mysql.fnal.gov (through the LVS director).
4. The active LVS director receives the MySQL query request to fg-mysql.fnal.gov and, based on the currently available MySQL servers and the load-balancing algorithm, chooses a "real server" to forward the MySQL request to, specifying a respond-to address of the service client.
   mysql = fg5x4.fnal.gov, fg6x4.fnal.gov
5. The selected MySQL server performs the requested database query and returns the results to the grid service.
6. The selected grid service then returns the appropriate results to the original client.
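The sketch below models the forwarding decisions in the steps above. The real-server hostnames are the ones listed; the round-robin policy is an assumption (LVS offers several scheduling algorithms), so treat it as illustrative rather than the actual director configuration:

```python
# Toy model of the LVS director's choice of a "real server" behind each alias.
from itertools import cycle

REAL_SERVERS = {
    "voms.fnal.gov":     cycle(["fg5x1.fnal.gov", "fg6x1.fnal.gov"]),
    "gums.fnal.gov":     cycle(["fg5x2.fnal.gov", "fg6x2.fnal.gov"]),
    "saz.fnal.gov":      cycle(["fg5x3.fnal.gov", "fg6x3.fnal.gov"]),
    "fg-mysql.fnal.gov": cycle(["fg5x4.fnal.gov", "fg6x4.fnal.gov"]),
}

def forward(alias: str) -> str:
    """Pick the next real server for a request addressed to this alias."""
    return next(REAL_SERVERS[alias])

# One GUMS mapping request (step 2) and its backend database query (step 4):
print("GUMS request handled by", forward("gums.fnal.gov"))      # fg5x2.fnal.gov
print("MySQL query handled by ", forward("fg-mysql.fnal.gov"))  # fg5x4.fnal.gov
```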

FermiGrid-HA - Client Communication Animation
[Animated version of the component-design diagram, stepping through the client request path: client to the active LVS director, to an active VOMS/GUMS/SAZ instance, to MySQL (again via the LVS director), with replication between the MySQL servers and heartbeat between the LVS directors.]

FermiGrid-HA - Host Configuration
The fermigrid5 & fermigrid6 Xen hosts are Dell 2950 systems. Each of the Dell 2950s is configured with:
- Two 3.0 GHz Core 2 Duo processors (4 cores total).
- 16 GBytes of RAM.
- RAID-1 system disks (2 x 147 GBytes, 10K RPM, SAS).
- RAID-1 non-system disks (2 x 147 GBytes, 10K RPM, SAS).
- Dual 1 Gig-E interfaces:
  - 1 connected to the public network,
  - 1 connected to the private network.
System software configuration:
- Each Domain 0 system is configured with 5 Xen VMs.
  - Previously we had 4 Xen VMs.
- Each Xen VM is dedicated to running a specific service:
  - LVS director, VOMS, GUMS, SAZ, MySQL.
  - Previously we were running the LVS director in the Domain 0.

FermiGrid-HA - Actual Component Deployment
[Deployment diagram: two Xen Domain 0 hosts, fermigrid5 and fermigrid6, each with five Xen VMs. fermigrid5 runs the active LVS director (Xen VM 0) plus active VOMS (fg5x1), GUMS (fg5x2), SAZ (fg5x3) and MySQL (fg5x4) instances in Xen VMs 1-4. fermigrid6 runs the standby LVS director (Xen VM 0) plus active VOMS (fg6x1), GUMS (fg6x2), SAZ (fg6x3) and MySQL (fg6x4) instances in Xen VMs 1-4.]

FermiGrid-HA - Performance
Stress tests of the FermiGrid-HA GUMS deployment:
- A stress test demonstrated that this configuration can support ~9.7M mappings/day.
  - The load on the GUMS VMs during this stress test was ~9.5 and the CPU idle time was 15%.
  - The load on the backend MySQL database VM during this stress test was under 1 and the CPU idle time was 92%.
Stress tests of the FermiGrid-HA SAZ deployment:
- The SAZ stress test demonstrated that this configuration can support ~1.1M authorizations/day.
  - The load on the SAZ VMs during this stress test was ~12 and the CPU idle time was 0%.
  - The load on the backend MySQL database VM during this stress test was under 1 and the CPU idle time was 98%.
Stress tests of the combined FermiGrid-HA GUMS and SAZ deployment:
- Using a GUMS:SAZ call ratio of ~7:1, the combined GUMS-SAZ stress test demonstrated that this configuration can support ~6.5M GUMS mappings/day and ~900K authorizations/day (per-second equivalents are sketched below).
  - The load on the SAZ VMs during this stress test was ~12 and the CPU idle time was 0%.
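For scale, converting the quoted daily throughput into sustained per-second rates is simple arithmetic on the figures above (nothing here beyond that):

```python
# Per-second equivalents of the stress-test throughput quoted above.
SECONDS_PER_DAY = 86_400
rates_per_day = {
    "GUMS mappings (GUMS-only test)":     9.7e6,
    "SAZ authorizations (SAZ-only test)": 1.1e6,
    "GUMS mappings (combined test)":      6.5e6,
    "SAZ authorizations (combined test)": 0.9e6,
}
for name, per_day in rates_per_day.items():
    print(f"{name:<38} ~{per_day / SECONDS_PER_DAY:6.1f} per second")
```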

FermiGrid-HA - Production Deployment
FermiGrid-HA was deployed in production on 03-Dec-2007.
- In order to allow an adiabatic transition for the OSG and our user community, we ran the regular FermiGrid services and the FermiGrid-HA services simultaneously for a three-month period (which ended on 29-Feb-2008).
We have already utilized the HA service redundancy on several occasions:
- One operating system "wedge" of the Domain 0 hypervisor, together with a "wedged" Domain U VM, that required a reboot of the hardware to resolve.
- Multiple software updates.
- Without any user impact!!!

FermiGrid - Current Architecture
[Architecture diagram, identical in flow to the mid-2007 slide but with redundant VOMS, GUMS and SAZ servers and an additional CDF OSG3/4 cluster: the site-wide gateway fronts CMS WC1/WC2/WC3, CDF OSG1/OSG2, CDF OSG3/4, D0 CAB1/CAB2, GP Farm and GP MPI, alongside the VOMRS server, Gratia accounting, BlueArc storage and the FERMIGRID SE (dCache SRM). Steps 1-6 of the user/job flow are unchanged.]

FermiGrid-HA - Future Work
Over the next three to four months we will be deploying "HA" instances of our other services:
- Squid, MyProxy (with DRBD), Syslog-Ng, Ganglia, and others.
Redundant site-wide gatekeeper:
- We have a preliminary "Gatekeeper-HA" design.
- It is based on the "manual" procedure used to keep jobs alive during an OSG upgrade.
- We expect that this should keep Globus and Condor jobs running.
We also plan to install a test gatekeeper that will be configured to receive Xen VMs as Grid jobs and execute them:
- This is a test of a possible future dynamic "VOBox" or "Edge Service" capability within FermiGrid.

FermiGrid-HA - Conclusions
Virtualisation benefits:
+ Significant performance increase,
+ Significant reliability increase,
+ Automatic service failover,
+ Cost savings,
+ Can be scaled as the load and the reliability needs increase,
+ Can perform "live" software upgrades and patches without client impact.
Virtualisation drawbacks:
- More complex design,
- More "moving parts",
- More opportunities for things to fail,
- More items that need to be monitored.

Fin
Any questions?