Presentation transcript: "Open Science Grid: Linking Universities and Laboratories in National Cyberinfrastructure" (Paul Avery, University of Florida; Physics Colloquium, RIT, Rochester, NY, May 23, 2007)

Slide 1: Open Science Grid: Linking Universities and Laboratories in National Cyberinfrastructure
Paul Avery, University of Florida (avery@phys.ufl.edu)
Physics Colloquium, RIT (Rochester, NY), May 23, 2007
www.opensciencegrid.org

Slide 2: Cyberinfrastructure and Grids
 Grid: geographically distributed computing resources configured for coordinated use
 Fabric: physical resources & networks providing raw capability
 Ownership: resources controlled by their owners and shared with others
 Middleware: software tying it all together (tools, services, etc.)
 Enhancing collaboration via transparent resource sharing
(Figure: the US-CMS "Virtual Organization")
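
To make the middleware idea on this slide concrete, here is a minimal sketch (not from the talk) of how a Virtual Organization member might hand a job to grid middleware of the Condor + Globus kind listed later in the VDT slide. The gatekeeper host, executable and input file are hypothetical placeholders; the grid-universe submit keywords follow Condor-G's generally documented style, but an actual OSG site would supply its own values.

```python
"""Illustrative only: submitting one job through grid middleware (Condor-G style)."""
import subprocess
import tempfile

# Hypothetical submit description; hostnames and filenames are made up.
SUBMIT_DESCRIPTION = """\
universe      = grid
grid_resource = gt2 gatekeeper.example.edu/jobmanager-condor
executable    = analyze_events.sh
transfer_input_files = run123.config
output        = job.out
error         = job.err
log           = job.log
queue
"""

def submit_grid_job() -> None:
    # Write the submit description to a file and hand it to condor_submit,
    # which forwards the job to the remote site's gatekeeper.
    with tempfile.NamedTemporaryFile("w", suffix=".sub", delete=False) as f:
        f.write(SUBMIT_DESCRIPTION)
        submit_file = f.name
    subprocess.run(["condor_submit", submit_file], check=True)

if __name__ == "__main__":
    submit_grid_job()
```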

Slide 3: Motivation: Data Intensive Science
 21st century scientific discovery is computationally and data intensive: theory + experiment + simulation, with internationally distributed resources and collaborations
 Dominant factor: data growth (1 petabyte = 1000 terabytes)
   2000: ~0.5 petabyte
   2007: ~10 petabytes
   2013: ~100 petabytes
   2020: ~1000 petabytes
 Powerful cyberinfrastructure is needed
   Computation: massive, distributed CPU
   Data storage & access: large-scale, distributed storage
   Data movement: international optical networks
   Data sharing: global collaborations (100s to 1000s of members)
   Software: managing all of the above
 How do we collect, manage, access and interpret this quantity of data?

Slide 4: Open Science Grid: July 20, 2005
 Consortium of many organizations spanning multiple disciplines
 Production grid cyberinfrastructure
 80+ sites, 25,000+ CPUs in the US, UK, Brazil and Taiwan

Slide 5: The Open Science Grid Consortium
Members include: U.S. grid projects, LHC experiments, laboratory centers, education communities, science projects & communities, technologists (network, HPC, ...), computer science, university facilities, multi-disciplinary facilities, and regional and campus grids.

Slide 6: Open Science Grid Basics
 Who: computer scientists, IT specialists, physicists, biologists, etc.
 What: shared computing and storage resources; high-speed production and research networks; a meeting place for research groups, software experts and IT providers
 Vision: maintain and operate a premier distributed computing facility; provide education and training opportunities in its use; expand reach & capacity to meet the needs of stakeholders; dynamically integrate new resources and applications
 Members and partners
   Members: HPC facilities; campus, laboratory and regional grids
   Partners: interoperation with TeraGrid, EGEE, NorduGrid, etc.

Slide 7: Crucial Ingredients in Building OSG
 Science "push": ATLAS, CMS, LIGO, SDSS
   1999: foresaw an overwhelming need for distributed cyberinfrastructure
 Early funding: the "Trillium" consortium
   PPDG: $12M (DOE), 1999 - 2006
   GriPhyN: $12M (NSF), 2000 - 2006
   iVDGL: $14M (NSF), 2001 - 2007
   Supplements + newly funded projects
 Social networks: ~150 people with many overlaps
   Universities, labs, SDSC, foreign partners
 Coordination: pooling resources, developing broad goals
   Common middleware: Virtual Data Toolkit (VDT)
   Multiple grid deployments/testbeds using the VDT
   Unified entity when collaborating internationally
   Historically, a strong driver for funding-agency collaboration

Slide 8: OSG History in Context (timeline, 1999 - 2009)
(Figure: PPDG (DOE), GriPhyN (NSF) and iVDGL (NSF) feed into Trillium and Grid3, which lead to OSG; parallel tracks show the European grids + Worldwide LHC Computing Grid, campus and regional grids, LHC construction and preparation leading to LHC operations, and LIGO preparation leading to LIGO operation.)

Slide 9: Principal Science Drivers (data growth and community growth, 2001 - 2009)
 High energy and nuclear physics
   Several petabytes by 2005; 100s of petabytes (LHC) from 2007
 LIGO (gravity wave search)
   0.5 to several petabytes from 2002
 Digital astronomy
   10s of terabytes from 2001; 10s of petabytes by 2009
 Other sciences coming forward: bioinformatics (10s of petabytes), nanoscience, environmental science, chemistry, applied mathematics, materials science?

Slide 10: OSG Virtual Organizations
  ATLAS     HEP/LHC            HEP experiment at CERN
  CDF       HEP                HEP experiment at Fermilab
  CMS       HEP/LHC            HEP experiment at CERN
  DES       Digital astronomy  Dark Energy Survey
  DOSAR     Regional grid      Regional grid in the Southwest US
  DZero     HEP                HEP experiment at Fermilab
  ENGAGE    Engagement effort  A place for new communities
  FermiLab  Lab grid           HEP laboratory grid
  fMRI      Functional MRI
  GADU      Bio                Bioinformatics effort at Argonne
  Geant4    Software           Simulation project
  GLOW      Campus grid        Campus grid at U of Wisconsin, Madison
  GRASE     Regional grid      Regional grid in Upstate NY

Slide 11: OSG Virtual Organizations (2)
  GridChem  Chemistry           Quantum chemistry grid
  GPN       Great Plains Network  www.greatplains.net
  GROW      Campus grid         Campus grid at U of Iowa
  I2U2      EOT                 E/O consortium
  LIGO      Gravity waves       Gravitational wave experiment
  Mariachi  Cosmic rays         Ultra-high-energy cosmic rays
  nanoHUB   Nanotech            Nanotechnology grid at Purdue
  NWICG     Regional grid       Northwest Indiana regional grid
  NYSGRID   NY State Grid       www.nysgrid.org
  OSGEDU    EOT                 OSG education/outreach
  SBGRID    Structural biology  Structural biology at Harvard
  SDSS      Digital astronomy   Sloan Digital Sky Survey
  STAR      Nuclear physics     Nuclear physics experiment at Brookhaven
  UFGrid    Campus grid         Campus grid at U of Florida

Slide 12: Partners: Federating with OSG
 Campus and regional
   Grid Laboratory of Wisconsin (GLOW)
   Grid Operations Center at Indiana University (GOC)
   Grid Research and Education Group at Iowa (GROW)
   Northwest Indiana Computational Grid (NWICG)
   New York State Grid (NYSGrid) (in progress)
   Texas Internet Grid for Research and Education (TIGRE)
   nanoHUB (Purdue)
   LONI (Louisiana)
 National
   Data Intensive Science University Network (DISUN)
   TeraGrid
 International
   Worldwide LHC Computing Grid Collaboration (WLCG)
   Enabling Grids for E-SciencE (EGEE)
   TWGrid (from Academia Sinica Grid Computing)
   Nordic Data Grid Facility (NorduGrid)
   Australian Partnerships for Advanced Computing (APAC)

Slide 13: Defining the Scale of OSG: Experiments at the Large Hadron Collider (LHC @ CERN)
 Search for the origin of mass, new fundamental forces, supersymmetry and other new particles (2007 onward)
 27 km tunnel spanning Switzerland and France
 Experiments: ATLAS, CMS, LHCb, ALICE, TOTEM

Slide 14: CMS: "Compact" Muon Solenoid
(Figure: detector cutaway, with "inconsequential humans" shown for scale)

Slide 15: Collision Complexity: CPU + Storage
 (Figure: all charged tracks with pT > 2 GeV vs. reconstructed tracks with pT > 25 GeV, with 30 minimum-bias events superimposed)
 10^9 collisions/sec; selectivity of 1 in 10^13

Slide 16: LHC Data and CPU Requirements (CMS, ATLAS, LHCb)
 Storage
   Raw recording rate of 0.2 - 1.5 GB/s
   Large Monte Carlo data samples
   100 PB by ~2013; 1000 PB later in the decade?
 Processing
   PetaOps (> 300,000 3 GHz PCs)
 Users
   100s of institutes, 1000s of researchers
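
A rough back-of-envelope check of the figures on this slide, as a sketch rather than the experiments' actual computing model: the duty factor and the ops-per-clock value below are illustrative assumptions, not LHC parameters.

```python
# Back-of-envelope check of the slide's numbers; duty factor and ops/clock are guesses.

SECONDS_PER_YEAR = 3.15e7

def yearly_volume_pb(rate_gb_per_s: float, duty_factor: float = 0.3) -> float:
    """Petabytes written per year at a given raw recording rate."""
    return rate_gb_per_s * duty_factor * SECONDS_PER_YEAR / 1e6  # 1 PB = 1e6 GB

def aggregate_petaops(n_pcs: int = 300_000, ops_per_pc: float = 3.3e9) -> float:
    """Rough aggregate compute, assuming ~1 operation per clock on a 3 GHz PC."""
    return n_pcs * ops_per_pc / 1e15

if __name__ == "__main__":
    for rate in (0.2, 1.5):  # GB/s, from the slide
        print(f"{rate} GB/s -> ~{yearly_volume_pb(rate):.0f} PB/yr")
    print(f"300,000 x 3 GHz PCs -> ~{aggregate_petaops():.1f} PetaOps")
```

With these assumptions a single experiment writes of order 2 - 14 PB per year of raw data, which, summed over experiments and Monte Carlo, is consistent with the ~100 PB-by-2013 figure quoted above.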

Slide 17: OSG and the LHC Global Grid (CMS experiment)
 (Figure: tiered data-distribution model, with link speeds of 200 - 1500 MB/s from the online system into CERN and 2.5 - 40 Gb/s between tiers)
   Tier 0: CERN Computer Center
   Tier 1: national centers (FermiLab, plus centers in Korea, Russia, the UK, ...)
   Tier 2: university centers on OSG (Caltech, UCSD, U Florida, Iowa, Maryland, ...)
   Tier 3/4: physics caches and PCs at institutes such as FIU
 5000 physicists, 60 countries
 10s of petabytes/yr by 2009
 CERN / outside resource ratio = 10 - 20%
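
A minimal sketch of what the tiered model on this slide implies for data movement, using the link speeds quoted above. The specific bandwidth assigned to each tier pair and the 100 TB dataset size are simplifying assumptions for the example, not the experiment's computing model.

```python
# Illustrative transfer-time estimates over the tiered links on this slide.
from dataclasses import dataclass

@dataclass
class Link:
    src: str
    dst: str
    gbit_per_s: float  # nominal bandwidth

# Representative links, using speeds quoted on the slide (assignments are assumed).
LINKS = [
    Link("Online system", "Tier 0 (CERN)", 1.5 * 8),     # 1500 MB/s ~ 12 Gb/s
    Link("Tier 0 (CERN)", "Tier 1 (FermiLab)", 10.0),
    Link("Tier 1 (FermiLab)", "Tier 2 (U Florida)", 2.5),
]

def transfer_hours(dataset_tb: float, link: Link) -> float:
    """Hours to move dataset_tb terabytes over the link at its full nominal rate."""
    bits = dataset_tb * 8e12
    return bits / (link.gbit_per_s * 1e9) / 3600

if __name__ == "__main__":
    for link in LINKS:
        hours = transfer_hours(100, link)  # a hypothetical 100 TB dataset
        print(f"{link.src} -> {link.dst}: ~{hours:.0f} h for 100 TB")
```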

Slide 18: LHC Global Collaborations (ATLAS, CMS)
 2000 - 3000 physicists per experiment
 The USA is 20 - 31% of the total

Slide 19: LIGO: Search for Gravity Waves
 LIGO Grid: 6 US sites plus 3 EU sites (UK & Germany): Birmingham, Cardiff, AEI/Golm
 LHO, LLO: LIGO observatory sites; LSC: LIGO Scientific Collaboration

Slide 20: Sloan Digital Sky Survey: Mapping the Sky

Slide 21: Bioinformatics: GADU / GNARE (Genome Analysis Research Environment)
 Public databases: genomic databases available on the web, e.g. NCBI, PIR, KEGG, EMP, InterPro
 GADU runs on the grid (TeraGrid, OSG, DOE Science Grid): applications are executed as workflows and the results are stored in an integrated database. GADU performs:
   Acquisition: acquire genome data from a variety of publicly available databases and store it temporarily on the file system
   Analysis: run publicly available and in-house tools on the grid, using the acquired data and data from the integrated database
   Storage: store the parsed data acquired from public databases and the parsed results of the tools and workflows used during analysis
 Integrated database: parsed sequence data and annotation data from public web sources, plus results of the analysis tools (BLAST, Blocks, TMHMM, ...); data flow is bidirectional
 Applications (web interfaces) built on the integrated database: PUMA2 (evolutionary analysis of metabolism), Chisel (protein function analysis tool), TARGET (targets for structural analysis of proteins), PATHOS (pathogen database for bio-defense research), Phyloblocks (evolutionary analysis of protein families)
 Services to other groups: SEED (data acquisition), Shewanella Consortium (genome analysis), others
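
A minimal sketch of the three-stage GADU workflow described on this slide (acquire -> analyze -> store), written as plain Python functions. The function names, placeholder file paths and the stubbed database load are hypothetical; in the real system each step is dispatched as grid workflow jobs rather than run locally.

```python
# Hypothetical acquire -> analyze -> store pipeline mirroring the slide's description.
from typing import Iterable

PUBLIC_DATABASES = ["NCBI", "PIR", "KEGG", "EMP", "InterPro"]  # from the slide
ANALYSIS_TOOLS = ["BLAST", "Blocks", "TMHMM"]                  # from the slide

def acquire(databases: Iterable[str]) -> list[str]:
    """Fetch genome records from public databases to temporary local files."""
    return [f"/tmp/{db.lower()}_dump.fasta" for db in databases]  # placeholder paths

def analyze(sequence_files: Iterable[str]) -> list[dict]:
    """Run each analysis tool over each acquired file (conceptually one grid job per pair)."""
    return [{"tool": tool, "input": path, "result": f"{path}.{tool.lower()}.out"}
            for path in sequence_files for tool in ANALYSIS_TOOLS]

def store(results: Iterable[dict]) -> None:
    """Parse tool output and load it into the integrated database (stubbed here)."""
    for record in results:
        print(f"loading {record['result']} into integrated DB")

if __name__ == "__main__":
    store(analyze(acquire(PUBLIC_DATABASES)))
```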

Slide 22: Bioinformatics (continued)
(Figure: Shewanella oneidensis genome)

Slide 23: Nanoscience Simulations (nanoHUB.org)
 Collaboration offering courses, tutorials, seminars, learning modules and online simulation
 Real users and real usage: >10,100 users, 1881 simulation users, >53,000 simulations

Slide 24: OSG Engagement Effort
 Purpose: bring non-physics applications to OSG
 Led by RENCI (UNC + NC State + Duke)
 Specific targeted opportunities: develop a relationship, then provide direct assistance with the technical details of connecting to OSG
 Feedback and new requirements for the OSG infrastructure (to facilitate inclusion of new communities): more & better documentation, more automation

Slide 25: OSG and the Virtual Data Toolkit
 VDT: a collection of software
   Grid software: Condor, Globus, VOMS, dCache, GUMS, Gratia, ...
   Virtual Data System
   Utilities
 VDT is the basis for the OSG software stack
   Goal: easy installation with automatic configuration
   Now widely used in other projects
   Has a growing support infrastructure

Slide 26: Why Have the VDT?
 Everyone could download the software directly from the providers, but the VDT:
   Figures out dependencies between software packages
   Works with providers on bug fixes
   Automatically configures & packages the software
   Tests everything on 15 platforms (and growing): Debian 3.1; Fedora Core 3; Fedora Core 4 (x86, x86-64); RedHat Enterprise Linux 3 AS (x86, x86-64, ia64); RedHat Enterprise Linux 4 AS (x86, x86-64); ROCKS Linux 3.3; Scientific Linux Fermi 3; Scientific Linux Fermi 4 (x86, x86-64, ia64); SUSE Linux 9 (IA-64)
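
To illustrate the kind of dependency bookkeeping mentioned above, here is a toy sketch that turns a "package depends on package" table into an install order. It is a generic topological sort, not the VDT's actual packaging logic, and the dependency edges below are made up for the example.

```python
# Toy dependency resolution: produce an install order from a dependency graph.
from graphlib import TopologicalSorter

# package -> set of packages it depends on (hypothetical edges for illustration)
DEPENDENCIES = {
    "Condor": set(),
    "Globus": set(),
    "VOMS": {"Globus"},
    "GUMS": {"VOMS"},
    "dCache": {"Globus"},
    "Gratia": {"Condor"},
}

def install_order(deps: dict[str, set[str]]) -> list[str]:
    """Return an order in which every package comes after its prerequisites."""
    return list(TopologicalSorter(deps).static_order())

if __name__ == "__main__":
    print(" -> ".join(install_order(DEPENDENCIES)))
```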

Slide 27: VDT Growth Over 5 Years (1.6.1i now)
(Figure: number of components per release; see vdt.cs.wisc.edu)

Slide 28: Collaboration with Internet2 (www.internet2.edu)

Slide 29: Collaboration with National Lambda Rail (www.nlr.net)
 Optical, multi-wavelength, community owned or leased "dark fiber" (10 GbE) networks for research & education
 Spawning state-wide and regional networks (FLR, SURA, LONI, ...)
 Bulletin: NLR-Internet2 merger announcement

Slide 30: Integrating Advanced Networking in Applications: UltraLight
 10 Gb/s+ network linking Caltech, UF, FIU, UM, MIT, SLAC, FNAL and international partners
 Industry partners: Level(3), Cisco, NLR
 Funded by NSF; http://www.ultralight.org

Slide 31: REDDnet: National Networked Storage
 NSF-funded project led by Vanderbilt
 8 initial sites (Brazil?)
 Multiple disciplines: satellite imagery, HEP, Terascale Supernova Initiative, structural biology, bioinformatics
 Storage: 500 TB disk, 200 TB tape

Slide 32: OSG Jobs Snapshot: 6 Months
(Figure: ~5000 simultaneous jobs from multiple VOs, September through March)

Slide 33: OSG Jobs Per Site: 6 Months
(Figure: ~5000 simultaneous jobs at multiple sites, September through March)

Slide 34: Completed Jobs/Week on OSG
(Figure: completed jobs per week, September through March, reaching ~400K during the CMS "Data Challenge")

Slide 35: Number of Jobs Per VO
(Figure: per-VO job counts from the new accounting system, Gratia)

Slide 36: Massive 2007 Data Reprocessing by the D0 Experiment at Fermilab
(Figure: reprocessing carried out on SAM, OSG and LCG resources; ~400M total, of which ~250M on OSG)

Slide 37: CDF Discovery of Bs Oscillations

Slide 38: Communications: International Science Grid This Week (SGTW -> iSGTW)
 Published since April 2005
 Diverse audience, >1000 subscribers
 www.isgtw.org

Slide 39: OSG News: Monthly Newsletter
 18 issues by April 2007
 www.opensciencegrid.org/osgnews

Slide 40: Grid Summer Schools
 Summer 2004, 2005, 2006: one week at South Padre Island, Texas
   Lectures plus hands-on exercises for ~40 students
   Students of differing backgrounds (physics + CS), including minorities
 Reaching a wider audience
   Lectures, exercises and video on the web
   More tutorials, 3 - 4 per year, for students, postdocs and scientists
   Agency-specific tutorials

Slide 41: Project Challenges
 Technical constraints
   Commercial tools fall far short and require (too much) invention
   Integration of advanced cyberinfrastructure, e.g. networks
 Financial constraints (see funding slide)
   Fragmented, short-term funding injections (recently $30M over 5 years)
   Fragmentation of individual efforts
 Distributed coordination and management
   Tighter organization within member projects than across OSG
   Coordination of schedules & milestones
   Many phone/video meetings and much travel
   Knowledge is dispersed; few people have a broad overview

Slide 42: Funding & Milestones: 1999 - 2007 (timeline)
 Funded projects: PPDG ($9.5M), GriPhyN ($12M), iVDGL ($14M), UltraLight ($2M), CHEPREO ($4M), DISUN ($10M), OSG ($30M, NSF + DOE)
 Milestones: first US-LHC grid testbeds, VDT 1.0, Grid3 start, VDT 1.3, OSG start, LIGO Grid, LHC start
 Education, outreach, training: Grid Summer Schools 2004, 2005, 2006; Digital Divide Workshops 2004, 2005, 2006; grid communications
 (Timeline legend: grid & networking projects; large experiments; education, outreach, training)

Slide 43: Challenges from Diversity and Growth
 Management of an increasingly diverse enterprise
   Science/engineering projects, organizations and disciplines as distinct cultures
   Accommodating new member communities (and their expectations)
 Interoperation with other grids
   TeraGrid; international partners (EGEE, NorduGrid, etc.); multiple campus and regional grids
 Education, outreach and training
   Training for researchers and students, but also project PIs and program officers
 Operating a rapidly growing cyberinfrastructure
   25K -> 100K CPUs, 4 -> 10 PB of disk
   Management of and access to rapidly increasing data stores (see slide)
   Monitoring, accounting, achieving high utilization
   Scalability of the support model (see slide)

Slide 44: Rapid Cyberinfrastructure Growth: LHC
 (Figure: CERN, Tier-1 and Tier-2 CPU capacity growing to ~140,000 PCs by 2008)
 Meeting LHC service challenges & milestones
 Participating in worldwide simulation productions

Slide 45: OSG Operations
 Distributed model (scalability!): VOs, sites, providers
 Rigorous problem tracking & routing
 Security, provisioning, monitoring, reporting
 Partners with EGEE operations

Slide 46: Five-Year Project Timeline & Milestones (2006 - 2011)
 Project start in 2006; end of Phase I, then end of Phase II by 2011
 LHC: simulations; event data distribution and analysis; support 1000 users and a 20 PB data archive; contribute to the Worldwide LHC Computing Grid
 LIGO: data runs; contribute to LIGO workflow and data analysis; LIGO Data Grid dependent on OSG; Advanced LIGO
 STAR, CDF, D0, astrophysics: D0 reprocessing and simulations; CDF simulation and analysis; STAR data distribution and jobs; 10K jobs per day
 Additional science communities: +1 community each year
 Facility security: risk assessment, audits, incident response, management, operations, technical controls (Plan V1, then an annual risk assessment and audit)
 VDT and OSG software releases: major release every 6 months, minor updates as needed (VDT 1.4.0, 1.4.1, 1.4.2, ..., plus incremental updates; OSG 0.6.0, 0.8.0, 1.0, 2.0, 3.0, ...)
 Facility operations and metrics: increase robustness and scale; operational metrics defined and validated each year; interoperate and federate with campus and regional grids
 Extended capabilities and increased scalability/performance for jobs and data to meet stakeholder needs: dCache with role-based authorization, accounting, auditing, VDS with SRM, SRM/dCache extensions, "just in time" workload management, VO services infrastructure, improved workflow and resource selection, integrated network management, federated monitoring and information services, data analysis (batch and interactive), workflow
 Interoperation: common software distribution with TeraGrid; EGEE using VDT 1.4.X; transparent data and job movement with TeraGrid; transparent data management with EGEE; work with SciDAC-2 CEDS and Security with Open Science

Slide 47: Extra Slides

Slide 48: VDT Release Process ("subway map", from Alain Roy)
(Figure: timeline from Day 0 to Day N: gather requirements -> build software -> test -> validation test bed -> ITB release candidate -> VDT release -> integration test bed -> OSG release)

Slide 49: VDT Challenges
 How should we smoothly update a production service?
   In-place vs. on-the-side updates
   Preserve the old configuration while making big changes
   A full install and setup from scratch still takes hours
 How do we support more platforms?
   A struggle to keep up with the onslaught of Linux distributions (Fedora Core 3, 4, 6; RHEL 3, 4; BCCD; ...)
   AIX? Mac OS X? Solaris?
 How can we accommodate native packaging formats (RPM, Deb)?

