FermiCloud Project Overview and Progress
Keith Chadwick, Grid & Cloud Computing Department Head, Fermilab
FermiCloud Review, 5-Feb-2013
Work supported by the U.S. Department of Energy under contract No. DE-AC02-07CH11359

Outline

Experience with FermiGrid
Drivers for FermiCloud
FermiCloud Project Program of Work
Requirements & Technology Evaluation
Security
Current FermiCloud Capabilities
Current FermiCloud Stakeholders
Effort
Summary and Acknowledgements

On November 10, 2004, Vicky White (then Fermilab CD Head) wrote the following:

In order to better serve the entire program of the laboratory the Computing Division will place all of its production resources in a Grid infrastructure called FermiGrid. This strategy will continue to allow the large experiments who currently have dedicated resources to have first priority usage of certain resources that are purchased on their behalf. It will allow access to these dedicated resources, as well as other shared Farm and Analysis resources, for opportunistic use by various Virtual Organizations (VOs) that participate in FermiGrid (i.e. all of our lab programs) and by certain VOs that use the Open Science Grid.

The strategy will allow us:
- to optimize use of resources at Fermilab
- to make a coherent way of putting Fermilab on the Open Science Grid
- to save some effort and resources by implementing certain shared services and approaches
- to work together more coherently to move all of our applications and services to run on the Grid
- to better handle a transition from Run II to LHC (and eventually to BTeV) in a time of shrinking budgets and possibly shrinking resources for Run II worldwide
- to fully support Open Science Grid and the LHC Computing Grid and gain positive benefit from this emerging infrastructure in the US and Europe.

FermiGrid – A "Quick" History

In 2004, the Fermilab Computing Division undertook the strategy of placing all of its production resources in a Grid "meta-facility" infrastructure called FermiGrid (the Fermilab Campus Grid).
In April 2005, the first "core" FermiGrid services were deployed.
In 2007, the FermiGrid-HA project was commissioned with the goal of delivering 99.999% core service availability (not including building or network outages).
During 2008, all of the FermiGrid services were redeployed in Xen virtual machines.

FermiGrid-HA2 Experience

In 2009, based on operational experience and plans for redevelopment of the FCC-1 computer room, the FermiGrid-HA2 project was established to split the set of FermiGrid services across computer rooms in two separate buildings (FCC-2 and GCC-B).
This project was completed on 7-Jun-2011 (and was tested by a building failure less than two hours later). FermiGrid-HA2 worked exactly as designed.
Our operational experience with FermiGrid-HA and FermiGrid-HA2 has shown the benefits of virtualization and service redundancy:
- Benefits to the user community – increased service reliability and uptime.
- Benefits to the service maintainers – flexible scheduling of maintenance and upgrade activities.

What Environment Do We Operate In?

Risk: Mitigation
Service Failure: Redundant copies of the service (with service monitoring).
System Failure: Redundant copies of the service (on separate systems, with service monitoring).
Network Failure: Redundant copies of the service (on separate systems with independent network connections, and service monitoring).
Building Failure: Redundant copies of the service (on separate systems with independent network connections in separate buildings, and service monitoring).

FermiGrid-HA – Deployed in 2007
FermiGrid-HA2 – Deployed in 2011
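Every row of the table above reduces to the same pattern: run redundant copies of a service and monitor them. Below is a minimal monitoring sketch in Python; the host names and port are hypothetical placeholders, not actual FermiGrid endpoints.

```python
# Minimal sketch: probe redundant service instances and report how many are up.
# Host names and port are hypothetical, not real FermiGrid endpoints.
import socket

REDUNDANT_INSTANCES = {
    "GUMS": [("gums1.example.org", 8443), ("gums2.example.org", 8443)],
    "SAZ":  [("saz1.example.org", 8443),  ("saz2.example.org", 8443)],
}

def probe(host, port, timeout=5.0):
    """Return True if a TCP connection to the service endpoint succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for service, endpoints in REDUNDANT_INSTANCES.items():
    up = [f"{host}:{port}" for host, port in endpoints if probe(host, port)]
    status = "OK" if up else "CRITICAL"
    print(f"{service}: {status} ({len(up)}/{len(endpoints)} instances up)")
```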

FermiGrid Service Availability (measured over the past year)

Service | Raw Availability | HA Configuration | Measured HA Availability | Minutes of Downtime
VOMS – VO Management Service | | Active-Active | 99.883% | 600
GUMS – Grid User Mapping Service | | Active-Active | 100.000% | 0
SAZ – Site AuthoriZation Service | | Active-Active | 100.000% | 0
Squid – Web Cache | 99.817% | Active-Active | 99.988% | 60
MyProxy – Grid Proxy Service | 99.751% | Active-Standby | 99.874% | 600
ReSS – Resource Selection Service | | Active-Active | 99.988% | 60
Gratia – Fermilab and OSG Accounting | | Active-Standby | 99.964% | 180
MySQL Database | 99.937% | Active-Active | 100.000% | 0
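The "Measured HA Availability" and "Minutes of Downtime" columns are tied together by the length of the measurement window. The short Python sketch below shows that arithmetic, assuming a one-year window of 525,960 minutes; small differences from the table's last digit would come from the exact window used.

```python
# Sketch: convert measured downtime into an availability percentage,
# assuming a one-year measurement window (365.25 days).
MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960 minutes

def availability(downtime_minutes, window_minutes=MINUTES_PER_YEAR):
    """Return availability as a percentage over the given window."""
    return 100.0 * (1.0 - downtime_minutes / window_minutes)

# Downtime figures taken from the FermiGrid service availability table above.
downtime = {
    "VOMS": 600,     # table reports 99.883% measured HA availability
    "Squid": 60,     # table reports 99.988%
    "MyProxy": 600,  # table reports 99.874%
    "Gratia": 180,   # table reports 99.964%
}

for service, minutes in downtime.items():
    print(f"{service:8s} {minutes:4d} min down -> {availability(minutes):.3f}%")
```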

Experience with FermiGrid = Drivers for FermiCloud

Access to pools of resources using common interfaces:
- Monitoring, quotas, allocations, accounting, etc.
Opportunistic access:
- Users can use the common interfaces to "burst" to additional resources to meet their needs.
Efficient operations:
- Deploy common services centrally.
High availability services:
- Flexible and resilient operations.

Additional Drivers for FermiCloud

The existing development and integration facilities (AKA the FAPL cluster) were:
- Technically obsolescent and unable to be used effectively to test and deploy the current generations of Grid middleware,
- Over 8 years old and falling apart.
The needs of the developers and service administrators in the Grid and Cloud Computing Department for reliable and "at scale" development and integration facilities were growing.
Operational experience with FermiGrid had demonstrated that virtualization could be used to deliver production class services.

Additional Developments

Large multi-core servers have evolved from 2 to 64 cores per box:
- A single "rogue" user/application can impact 63 other users/applications,
- Virtualization can provide mechanisms to securely isolate users/applications.
Typical "bare metal" hardware has significantly more performance than is usually needed for a single-purpose server:
- Virtualization can provide mechanisms to harvest/utilize the remaining cycles.
Complicated software stacks are difficult to distribute on the Grid:
- Distribution of preconfigured virtual machines together with GlideinWMS can aid in addressing this problem.
There is large demand for transient development/testing/integration work:
- Virtual machines are ideal for this work.
Science is increasingly turning to complex, multiphase workflows:
- Virtualization coupled with cloud can provide the ability to flexibly reconfigure hardware "on demand" to meet the changing needs of science.

FermiCloud – Hardware Acquisition

In the FY2009 budget input cycle (Jun-Aug 2008), based on the experience with the "static" virtualization in FermiGrid, the (then) Grid Facilities Department proposed that hardware to form a "FermiCloud" infrastructure be purchased. This was not funded for FY2009.
Following an effort to educate the potential stakeholder communities during FY2009, the hardware request was repeated in the FY2010 budget input cycle (Jun-Aug 2009), and this time it was funded.
The hardware specifications were developed in conjunction with the GPCF acquisition, and the hardware was delivered in May 2010.

FermiCloud – Initial Project Specifications

Once funded, the FermiCloud Project was established with the goal of developing and establishing Scientific Cloud capabilities for the Fermilab Scientific Program:
- Building on the very successful FermiGrid program that supports the full Fermilab user community and makes significant contributions as members of the Open Science Grid Consortium,
- Reusing High Availability, AuthZ/AuthN, and virtualization experience from the Grid.
In (very) broad brush, the mission of the FermiCloud project is:
- To deploy a production quality Infrastructure as a Service (IaaS) Cloud Computing capability in support of the Fermilab Scientific Program,
- To support additional IaaS, PaaS and SaaS Cloud Computing capabilities based on the FermiCloud infrastructure at Fermilab.
The FermiCloud project is a program of work that is split over several overlapping phases. Each phase builds on the capabilities delivered as part of the previous phases.

Overlapping Phases

Phase 1: "Build and Deploy the Infrastructure"
Phase 2: "Deploy Management Services, Extend the Infrastructure and Research Capabilities"
Phase 3: "Establish Production Services and Evolve System Capabilities in Response to User Needs & Requests"
Phase 4: "Expand the service capabilities to serve more of our user communities"

(Timeline figure: the four phases overlap in time, with "Today" marked on the timeline.)

FermiCloud Phase 1: "Build and Deploy the Infrastructure"

Specify, acquire and deploy the FermiCloud hardware,
Establish initial FermiCloud requirements and select the "best" open source cloud computing framework that met these requirements (OpenNebula),
Deploy capabilities to meet the needs of the stakeholders (JDEM analysis development, Grid Developers and Integration test stands, Storage/dCache Developers, LQCD testbed).

Completed late 2010

FermiCloud-HA

Upon delivery, the FermiCloud hardware was initially sited in two adjacent racks in the GCC-B computer room.
The FY2011 "Worker Node" acquisition needed additional rack space in the GCC-B computer room, so the Grid & Cloud Computing Department agreed to move ~50% of FermiCloud from GCC-B to the (then) new FCC-3 computer room:
- GCC-B had several cooling related outages that affected FermiCloud during the summer of 2011,
- As we had learned from the operation of the FermiGrid services, having a high availability services infrastructure that is split across multiple buildings offers benefits to both the user and service administration communities,
- This move would allow us to eventually build high availability and redundancy capabilities into the FermiCloud architecture.
The systems were relocated in August 2011 and the process started of learning how to:
- Deploy the OpenNebula software in an HA configuration,
- Acquire and deploy a redundant HA SAN.

Included as part of Phases 2 & 3

FermiCloud Phase 2: "Deploy Management Services, Extend the Infrastructure and Research Capabilities"

Implement X.509 based authentication (patches contributed back to the OpenNebula project and generally available in OpenNebula V3.2),
Perform secure contextualization of virtual machines at launch,
Perform virtualized filesystem I/O measurements,
Develop (draft) economic model,
Implement monitoring and accounting,
Collaborate with KISTI personnel to demonstrate Grid and Cloud Bursting capabilities,
Perform initial benchmarks of virtualized MPI,
Target "small" low-cpu-load servers such as Grid gatekeepers, forwarding nodes, small databases, monitoring, etc.,
Begin the hardware deployment of a distributed SAN,
Investigate automated provisioning mechanisms (puppet & cobbler).

Completed late 2012
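The X.509 authentication itself lives in the OpenNebula patches referenced above and is not reproduced here. As a rough illustration of the underlying idea, mapping an authenticated certificate subject (DN) to a local account in the style of a grid-mapfile, here is a hypothetical Python sketch; the file path and function names are assumptions, not the FermiCloud or OpenNebula code.

```python
# Hypothetical sketch of X.509 DN -> local user mapping, grid-mapfile style.
# The file path and helper names are illustrative only.
import shlex

def load_gridmap(path="/etc/grid-security/grid-mapfile"):
    """Parse lines of the form: "/DC=org/.../CN=Some User" localuser"""
    mapping = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            parts = shlex.split(line)  # shlex handles the quoted DN
            if len(parts) >= 2:
                mapping[parts[0]] = parts[1]
    return mapping

def authorize(subject_dn, gridmap):
    """Return the local account for an authenticated DN, or raise if unmapped."""
    try:
        return gridmap[subject_dn]
    except KeyError:
        raise PermissionError(f"DN not authorized: {subject_dn}")
```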

FermiCloud Phase 3: "Establish Production Services and Evolve System Capabilities in Response to User Needs & Requests"

Deploy highly available 24x7 production services,
- Both infrastructure and user services.
Deploy puppet & cobbler in production,
- Done.
Develop and deploy real idle machine detection,
- Idle VM detection tool written by a summer student (see the sketch after this list).
Research possibilities for a true multi-user filesystem on top of a distributed & replicated SAN,
- GFS2 on FibreChannel SAN across FCC-3 and GCC-B.
Live migration becomes important for this phase:
- Manual migration has been used, live migration is currently in test, and automatically triggered live migration is yet to come.
Formal ITIL Change Management "Go-Live",
- Have been operating under "almost" ITIL Change Management for the past several months.

Underway
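The idle VM detection tool mentioned above is not shown here; the following is a minimal sketch of one way such detection can work, sampling guest CPU time through the libvirt Python bindings and flagging domains whose usage stays under a threshold. The connection URI, threshold, and sample interval are illustrative assumptions.

```python
# Minimal sketch: flag VMs whose CPU usage stays below a threshold.
# URI, threshold, and interval are illustrative, not FermiCloud's actual tool.
import time
import libvirt

IDLE_CPU_THRESHOLD = 0.02   # less than 2% of one core over the sample window
SAMPLE_SECONDS = 60

def idle_domains(uri="qemu:///system"):
    conn = libvirt.openReadOnly(uri)
    try:
        doms = [d for d in conn.listAllDomains() if d.isActive()]
        first = {d.name(): d.info()[4] for d in doms}   # cpuTime in nanoseconds
        time.sleep(SAMPLE_SECONDS)
        idle = []
        for d in doms:
            used_ns = d.info()[4] - first[d.name()]
            usage = used_ns / (SAMPLE_SECONDS * 1e9)    # fraction of one core
            if usage < IDLE_CPU_THRESHOLD:
                idle.append((d.name(), usage))
        return idle
    finally:
        conn.close()

if __name__ == "__main__":
    for name, usage in idle_domains():
        print(f"{name}: ~{usage:.1%} of one core over the last minute")
```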

FermiCloud Phase 4: "Expand the service capabilities to serve more of our user communities"

Complete the deployment of the true multi-user filesystem on top of a distributed & replicated SAN,
Demonstrate interoperability and federation:
- Accepting VMs as batch jobs,
- Interoperation with other Fermilab virtualization infrastructures (GPCF, VMware),
- Interoperation with the KISTI cloud, Nimbus, Amazon EC2, and other community and commercial clouds,
Participate in the Fermilab 100 Gb/s network testbed,
- Have just taken delivery of 10 Gbit/second cards.
Perform more virtualized MPI benchmarks and run some real world scientific MPI codes,
- The priority of this work will depend on finding a scientific stakeholder that is interested in this capability.
Reevaluate available open source Cloud computing stacks,
- Including OpenStack,
- We will also reevaluate the latest versions of Eucalyptus, Nimbus and OpenNebula.

Specifications Being Finalized

FermiCloud Phase 5

See the program of work in the (draft) Fermilab-KISTI FermiCloud CRADA,
This phase will also incorporate any work or changes that arise out of the Scientific Computing Division strategy on Clouds and Virtualization,
Additional requests and specifications are also being gathered!

Specifications Under Development

FermiCloud – Stack Evaluation

The first order of business was the evaluation of the (then) available Cloud stacks:
- Eucalyptus
- Nimbus
- OpenNebula
OpenStack was not yet "on the market", so it was not included in the evaluations.
We will evaluate OpenStack this year; we will also reevaluate the latest versions of Eucalyptus, Nimbus and OpenNebula.

Evaluation Criteria

Criteria | Max Points | Eucalyptus | Nimbus | OpenNebula
OS and Hypervisor | | | |
Provisioning and Contextualization | 9 | 4 | 8 | 8
Object Store | | | |
Interoperability | | | |
Functionality | | | |
Accounting/Billing | 6 | 1 | 1 | 3
Network Topology | | | |
Clustered File Systems | 2 | 0 | 2 | 2
Security | | | |
Totals | | | |

See CD-DocDB #s 4195, 4276, 4536 for the details of this evaluation.

Results of the FermiCloud Stack Evaluation

OpenNebula was the clear "winner":
- Biggest strengths: reliability of operations, flexibility in the types of virtual machines you can make.
- Biggest lacks: X.509 authentication, HA configuration, accounting.
Nimbus was next:
- Biggest strength: developers ~30 miles away at ANL.
- Biggest lacks: poor documentation, uses SimpleCA rather than IGTF CAs for internal authentication.
Eucalyptus was a distant third:
- Show-stoppers: a reboot of the head node lost all virtual machines, and the Intellectual Property policy on patches.

Security

Main areas of cloud security development:
X.509 Authentication/Authorization:
- X.509 authentication written by T. Hesselroth; code submitted to and accepted by OpenNebula, publicly available since Jan 2012.
Secure Contextualization:
- Secrets such as X.509 service certificates and Kerberos keytabs are not stored in virtual machines (see the following talk for more details).
Security Policy:
- A security taskforce met and delivered a report to the Fermilab Computer Security Board, recommending the creation of a new Cloud Computing Environment, now in progress.
- We also participated in the HEPiX Virtualisation Task Force; we respectfully disagree with the recommendations regarding VM endorsement.
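The secure contextualization mechanism itself is covered in the following talk; as a rough illustration of the general pattern (secrets fetched at boot over an authenticated channel rather than baked into the image), here is a hypothetical sketch. The endpoint URL, client credential paths, and keytab destination are assumptions, not the FermiCloud implementation.

```python
# Hypothetical sketch of a boot-time contextualization step that pulls a
# Kerberos keytab from a secrets service instead of shipping it in the image.
# URL and paths are illustrative only.
import os
import requests

SECRETS_URL = "https://secrets.example.org/keytab"          # assumed endpoint
CLIENT_CERT = ("/etc/grid-security/hostcert.pem",           # assumed host credential
               "/etc/grid-security/hostkey.pem")
KEYTAB_PATH = "/etc/krb5.keytab"

def fetch_keytab():
    """Authenticate with the host credential and install the keytab with tight permissions."""
    resp = requests.get(SECRETS_URL, cert=CLIENT_CERT, timeout=30)
    resp.raise_for_status()
    tmp = KEYTAB_PATH + ".tmp"
    with open(tmp, "wb") as fh:
        fh.write(resp.content)          # write to a temporary file first
    os.chmod(tmp, 0o600)                # secret readable by root only
    os.replace(tmp, KEYTAB_PATH)        # atomic move into place

if __name__ == "__main__":
    fetch_keytab()
```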

Current FermiCloud Capabilities

The current FermiCloud hardware capabilities include:
Public network access via the high performance Fermilab network,
- This is a distributed, redundant network (see next presentation).
Private 1 Gb/sec network,
- This network is bridged across FCC and GCC on private fiber (see next presentation).
High performance InfiniBand network to support HPC based calculations,
- Performance under virtualization is ~equivalent to "bare metal" (see next presentation),
- Currently split into two segments,
- Segments could be bridged via Mellanox MetroX.
Access to a high performance FibreChannel based SAN,
- This SAN spans both buildings (see next presentation).
Access to the high performance BlueArc based filesystems,
- The BlueArc is located on FCC-2.
Access to the Fermilab dCache and enStore services,
- These services are split across FCC and GCC.
Access to the 100 Gbit Ethernet test bed in LCC (via FermiCloud integration nodes),
- Intel 10 Gbit Ethernet converged network adapter X540-T1.

FermiCloud – FTE Months/Year

(Table of FermiCloud effort in FTE-months per year, broken out into Management, Operations, and Project + Development, with the total to date and the average per year.)

FermiCloud FTE Effort Plot (figure)

Current Stakeholders

Grid & Cloud Computing personnel,
Run II – CDF & D0,
Intensity Frontier experiments,
Cosmic Frontier (LSST),
Korea Institute of Science and Technology Information (KISTI),
Open Science Grid (OSG) software refactoring from pacman to RPM based distribution,
Grid middleware development.

FermiCloud Project Summary - 1

Science is directly and indirectly benefiting from FermiCloud:
- CDF, D0, Intensity Frontier, Cosmic Frontier, CMS, ATLAS, Open Science Grid, ...
FermiCloud operates at the forefront of delivering cloud computing capabilities to support scientific research:
- By starting small, developing a list of requirements, and building on existing Grid knowledge and infrastructure to address those requirements, FermiCloud has managed to deliver a production class Infrastructure as a Service cloud computing capability that supports science at Fermilab.
- FermiCloud has provided FermiGrid with an infrastructure that has allowed us to test Grid middleware at production scale prior to deployment.
- The Open Science Grid software team used FermiCloud resources to support their RPM "refactoring" and is currently using it to support their ongoing middleware development/integration.

FermiCloud Project Summary - 2

The FermiCloud collaboration with KISTI has leveraged the resources and expertise of both institutions to achieve significant benefits:
- vCluster has demonstrated proof of principle "Grid Bursting" using FermiCloud and Amazon EC2 resources.
- Using SR-IOV drivers on FermiCloud virtual machines, MPI performance has been demonstrated to be >96% of the native "bare metal" performance.
- We are currently working on finalizing a CRADA with KISTI for future collaboration.

FermiCloud Project Summary - 3

FermiCloud personnel are currently working to:
- Deliver the Phase 3 deliverables, including a SAN storage deployment that will offer a true multi-user filesystem on top of a distributed & replicated SAN,
- Finalize the Phase 4 specifications,
- Collaborate in the development of the Phase 5 specifications.
The future is mostly cloudy.

Acknowledgements

None of this work could have been accomplished without:
- The excellent support from other departments of the Fermilab Computing Sector – including Computing Facilities, Site Networking, and Logistics,
- The excellent collaboration with the open source communities – especially Scientific Linux and OpenNebula,
- As well as the excellent collaboration and contributions from KISTI.

Thank You! Any Questions?

Extra Slides Are present in the “Extra Slides Presentation”