OSG Campus Grids
Dr. Sebastien Goasguen, Clemson University

Outline
 A few examples
 Clemson's pool in detail
  o Windows
  o Backfill
  o OSG
 Other pools and CI-TEAM
 OSG:CElite
 Condor and clouds

~14,000 CPUs available
US-CMS Tier-2, TeraGrid site, Regional VO
Campus Condor pool backfills idle nodes in PBS clusters - provided 5.5 million CPU-hours in 2006, all from idle nodes in clusters
Use on TeraGrid: 2.4 million hours in 2006 spent building a database of hypothetical zeolite structures; 5.5 million hours allocated to TG for 2007

Grid Laboratory of Wisconsin (GLOW)
Users submit jobs to their own private or department scheduler as members of a group (e.g. “CMS” or “MedPhysics”)
Jobs are dynamically matched to available machines
Jobs run preferentially at the “home” site, but may run anywhere when machines are available
Computers at each site give highest priority to jobs from the same group (via machine RANK; see the sketch below)
Crosses multiple administrative domains
 o No common uid-space across campus
 o No cross-campus NFS for file access
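
To make the per-group priority concrete, here is a minimal sketch of a machine-side RANK policy. It assumes the group's jobs flag themselves with a custom ClassAd attribute in their submit files; the attribute and group names are illustrative, not the actual GLOW configuration.

  # Submit-file side (hypothetical): a CMS user's job flags itself as a CMS job
  +IsCMSJob = True

  # startd configuration on a machine contributed by the CMS group:
  # advertise ownership and rank CMS jobs above everything else.
  IsCMSMachine = True
  STARTD_ATTRS = $(STARTD_ATTRS) IsCMSMachine
  RANK = ifThenElse(TARGET.IsCMSJob =?= True, 1000, 0)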

Housing the Machines (Grid Laboratory of Wisconsin, GLOW)
Condominium style
 o centralized computing center
 o space, power, cooling, management
 o standardized packages
Neighborhood association style
 o each group hosts its own machines
 o each contributes to the administrative effort
 o base standards (e.g. Linux & Condor) to make sharing of resources easy
GLOW and Clemson have elements of both

Clemson’s Pool
 o Originally mostly Windows, 100+ locations on campus
 o Now 6,000 Linux slots as well
 o Working on an 11,500-slot setup, ~120 TFlops
 o Maintained by central IT
 o CS department tests new configs
 o Other departments adopt the central IT images
 o BOINC backfill to maximize utilization
 o Connected to OSG via an OSG CE

[condor_status totals by platform - Total, Owner, Claimed, Unclaimed, Matched, Preempting and Backfill columns for INTEL/LINUX, INTEL/WINNT, SUN4u/SOLARIS and X86_64/LINUX; the slot counts did not survive transcription]
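
The platform summary above is the totals section that condor_status prints; a quick way to get just that summary from a machine with the HTCondor command-line tools installed is:

  $ condor_status -total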

Clemson’s pool history

Clemson’s pool: BOINC backfill
Put Clemson in World Community Grid and reached #1 on WCG in the world, contributing ~4 CPU-years of computation per day when no local jobs are running.

  # Turn on backfill functionality, and use BOINC
  ENABLE_BACKFILL = TRUE
  BACKFILL_SYSTEM = BOINC
  BOINC_Executable = C:\PROGRA~1\BOINC\boinc.exe
  BOINC_Universe = vanilla
  BOINC_Arguments = --dir $(BOINC_HOME) --attach_project cbf9dNOTAREALKEYGETYOUROWN035b4b2
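
Not shown on the slide, but for context: HTCondor's backfill mechanism also has START_BACKFILL and EVICT_BACKFILL expressions that control when the BOINC client may start and when it must be killed. A minimal sketch, with illustrative idle thresholds rather than Clemson's actual values:

  # Start backfill only after the keyboard has been idle for 15 minutes;
  # evict the BOINC client as soon as someone touches the machine again.
  START_BACKFILL = (KeyboardIdle > 900)
  EVICT_BACKFILL = (KeyboardIdle < 60)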

Clemson’s pool: BOINC backfill
Reached #1 on WCG in the world, contributing ~4 CPU-years of computation per day when no local jobs are running = lots of pink

OSG VO through BOINC
 o LIGO VO: very few jobs to grab
 o Could we count BOINC work for an OSG VO-led project in OSG accounting, i.e. count jobs that do not come through the CE?

Clemson’s pool on OSG
Multi-tier job queues to fill the pool: local users, then OSG, then BOINC (see the policy sketch below)
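
A minimal sketch of how such a tiered policy can be expressed on the execute nodes, assuming OSG jobs arrive through the CE under a dedicated account; the account name "osg" and the rank values are illustrative assumptions, not Clemson's actual configuration. Local jobs outrank grid jobs, and BOINC only fills in when no Condor job is running:

  # Prefer local users' jobs over jobs submitted by the grid account.
  RANK = ifThenElse(TARGET.Owner =?= "osg", 0, 100)

  # BOINC is the lowest tier: it runs only via the backfill mechanism,
  # i.e. when no Condor job (local or OSG) occupies the slot.
  ENABLE_BACKFILL = TRUE
  BACKFILL_SYSTEM = BOINC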

Other Pools and CI-TEAM
CI-TEAM is an NSF award for outreach to campuses, helping them build their cyberinfrastructure and make use of it as well as of the national OSG infrastructure: “Embedded Immersive Engagement for Cyberinfrastructure, EIE-4CI”
 o Provide help to build cyberinfrastructure on campus
 o Provide help to make your application run on “the Grid”
 o Train experts

Other Pools and CI-TEAM
Other large campus pools:
 o Purdue: 14,000 slots (led by the US-CMS Tier-2)
 o GLOW in Wisconsin (also US-CMS leadership)
 o FermiGrid (multiple experiments as stakeholders)
 o RIT and Albany have created 1,000+ slot pools after CI-Days in Albany in December 2007

Campus site “levels”
Different levels of effort, different commitments, different results. How much to do?
 o Duke, ATLAS Tier-3: one day of work, not registered on OSG
 o Harvard, SBGrid VO: weeks/months of work, registered, VO members highly engaged
 o RIT, NYSGrid VO (regional VO): Windows-based Condor pool, BOINC backfill
 o SURAgrid: interop partnership, different CA policies
Trend towards regional “grids” (NWICG, NYSGRID, NJEDGE, SURAGRID, LONI…) that leverage the OSG framework to access more resources and share their own resources.

OSG:CElite
Low-level entry to an OSG CE (and SE in the future). What is the minimum required set of software to set up an OSG CE?
Options: physical appliance, virtual appliance, Live CD, new VDT cache, or a new P2P network with separate security.

OSG:CElite
 o Physical appliance: prep machine, configure software, ship machine; the receiving site just turns it on
 o Virtual appliance: same as physical but no shipping, no buying of machines
 o Live CD: size of the image?
 o VDT cache: pacman -get OSG:CElite (see the sketch below)
   Problems: dropping in valid certificates for the hosts, registration of the resource. Use a different CA to issue these certs?
 o P2P network of Tier-3s: create a “VPN” and an isolated testbed for sysadmins… more of an academic exercise
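
For context, a VDT/pacman cache install would look roughly like the following; the cache name OSG:CElite is the proposal on this slide, not an existing cache, and the install directory is arbitrary:

  $ mkdir /opt/osg-celite && cd /opt/osg-celite
  $ pacman -get OSG:CElite        # proposed cache name from this slide
  $ source setup.sh               # VDT convention: set up the environment
  $ vdt-control --on              # start the installed services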

What software?

  # vdt-control --list
  Service            | Type  | Desired State
  fetch-crl          | cron  | enable
  vdt-rotate-logs    | cron  | enable
  gris               | init  | do not enable
  globus-gatekeeper  | inetd | enable
  gsiftp             | inetd | enable
  mysql              | init  | enable
  globus-ws          | init  | enable
  edg-mkgridmap      | cron  | do not enable
  gums-host-cron     | cron  | do not enable
  MLD                | init  | do not enable
  vdt-update-certs   | cron  | do not enable
  condor-devel       | init  | enable
  apache             | init  | enable
  osg-rsv            | init  | do not enable
  tomcat-5           | init  | enable
  syslog-ng          | init  | enable
  gratia-condor      | cron  | enable
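
After adjusting which services are desired, the same tool starts and stops them; a minimal sketch (vdt-control also accepts individual service names, but those per-service invocations are not shown on this slide):

  $ vdt-control --on     # start all services whose desired state is "enable"
  $ vdt-control --off    # stop them again, e.g. before an upgrade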

Condor and Clouds
For us, clouds are clusters of workstations/servers that are dedicated to a particular Virtual Organization. Their software environments can be tailored to the particular needs of a VO, and they can be provisioned dynamically.
Condor can help us build clouds:
 o Easy to target specific machines for specific VOs with ClassAds
 o Easy to add nodes to clouds by sending ads to collectors
 o Easy to integrate with existing grid computing environments, OSG for instance
Implementation:
 o Use virtual machines (VMs) to provide a different running environment for each VO
 o Each VM advertised with different ClassAds (see the sketch below)
 o Run Condor within the VMs
 o Start and stop VMs depending on job load
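
A minimal sketch of how a startd inside a VO-specific VM image might advertise its environment and restrict itself to that VO's jobs; the attribute names (VOEnvironment, TARGET.VOName) are illustrative assumptions, not the actual implementation described in this talk:

  # condor_config fragment inside the CMS-flavored VM image (hypothetical)
  VOEnvironment = "CMS"
  STARTD_ATTRS = $(STARTD_ATTRS) VOEnvironment
  # Only accept jobs that declare themselves as belonging to the same VO
  START = (TARGET.VOName =?= MY.VOEnvironment)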

Condor and Clouds
Use IPOP to build WAN “VPNs” that traverse NATs, with the ability to isolate clouds in different address spaces.
 o VM as a job
 o Job “glides” in the VM
 o VM destroyed
 o “VPN” for all VMs
 o Different OS/software for each VO
 o Use EC2…
 o Use VM universe…
 o Under test as we speak

Acknowledgements
Lots of folks at Clemson: Dru, Matt, Nell, John-Mark, Ben…
Lots of Condor folks: Miron, Todd, Alain, Jaime, Dan, Ben, Greg…

Questions?
yum repo for Condor…