GriPhyN Project Overview
Paul Avery
University of Florida
GriPhyN NSF Project Review
January 2003
Chicago

29 Jan 2003 Paul Avery, University of Florida

GriPhyN = Experiments + CS + Grids
GriPhyN = Grid Physics Network
–Computer Scientists (Globus, Condor, SRB, …)
–Physicists from 4 frontier physics/astronomy expts.
GriPhyN basics (2000 – 2005)
–$11.9M (NSF) + $1.6M (matching)
–17 universities, SDSC, 3 labs, ~80 people
–Integrated Outreach effort (UT Brownsville)
Management
–Paul Avery (Florida), co-Director
–Ian Foster (Chicago), co-Director
–Mike Wilde (Argonne), Project Coordinator
–Rick Cavanaugh (Florida), Deputy Coordinator

GriPhyN Institutions (Sep. 2000)
–U Florida
–U Chicago
–Boston U
–Caltech
–U Wisconsin, Madison
–USC/ISI
–Harvard
–Indiana
–Johns Hopkins
–Texas A&M
–Stanford
–U Illinois at Chicago
–U Penn
–U Texas, Brownsville
–UC Berkeley
–U Wisconsin, Milwaukee
–UC San Diego
–SDSC
–Lawrence Berkeley Lab
–Argonne
–Fermilab
–Brookhaven
Legend: Funded by GriPhyN

GriPhyN Vision
Create tools to enable collaborative research
–Large research teams
… by global scientific communities
–International distribution of people and resources
… at petascale levels
–PetaOps + PetaBytes + Performance
… in a transparent way
–Scientists think in terms of their science

GriPhyN Science Drivers
US-CMS & US-ATLAS
–HEP experiments at LHC/CERN
–100s of Petabytes
LIGO
–Gravitational wave experiment
–100s of Terabytes
Sloan Digital Sky Survey
–Digital astronomy (1/4 sky)
–10s of Terabytes
Figure labels: Data growth; Community growth; Massive CPU; Large, distributed datasets; Large, distributed communities

GriPhyN Goals
Conduct CS research to achieve vision
–Virtual Data as unifying principle
–Planning, execution, performance monitoring
Disseminate through Virtual Data Toolkit
Integrate into GriPhyN science experiments
–Common Grid tools, services
Impact other disciplines
–HEP, biology, medicine, virtual astronomy, eng.
–Other Grid projects
Educate, involve, train students in IT research
–Undergrads, grads, postdocs
–Underrepresented groups

Goal: PetaScale Virtual-Data Grids
Diagram labels:
–Production Team; Single Researcher; Workgroups
–Interactive User Tools
–Virtual Data Tools; Request Planning & Scheduling Tools; Request Execution & Management Tools
–Resource Management Services; Security and Policy Services; Other Grid Services
–Transforms; Raw data source
–Distributed resources (code, storage, CPUs, networks)
–PetaOps; Petabytes; Performance

CMS Experiment Example: Global LHC Data Grid
Diagram labels:
–Online System; CERN Computer Center > 20 TIPS
–Tier 0; Tier 1; Tier 2; Tier 3; Tier 4
–USA; Korea; Russia; UK
–Tier2 Center; Institute; Physics cache; PCs, other portals
–Link speeds: 2.5 Gbits/s; 1 Gbits/s; ~0.6 Gbits/s (remaining values missing)

GriPhyN Project Challenges
We balance and coordinate
–CS research with “goals, milestones & deliverables”
–GriPhyN schedule/priorities/risks with those of the 4 experiments
–General tools developed by GriPhyN with specific tools developed by 4 experiments
–Data Grid design, architecture & deliverables with those of other Grid projects
Appropriate balance requires
–Tight management, close coordination, trust
We have (so far) met these challenges
–But requires constant attention, good will

GriPhyN Management
External Advisory Committee
Physics Experiments
Project Directors: Paul Avery, Ian Foster
Internet 2; DOE Science; NSF PACIs
Project Coordination: Mike Wilde, Rick Cavanaugh
Outreach/Education: Manuela Campanelli
Industrial Connections: Ian Foster / Paul Avery
EDG, LCG, Other Grid Projects
Architecture: Carl Kesselman
VDT Development, Coord.: M. Livny
–Requirements, Definition & Scheduling (Miron Livny)
–Integration, Testing, Documentation, Support (Alain Roy)
–Globus Project & NMI Integration (Carl Kesselman)
CS Research, Coord.: I. Foster
–Virtual Data (Mike Wilde)
–Request Planning & Scheduling (Ewa Deelman)
–Execution Management (Miron Livny)
–Measurement, Monitoring & Prediction (Valerie Taylor)
Applications, Coord.: R. Cavanaugh
–ATLAS (Rob Gardner)
–CMS (Rick Cavanaugh)
–LIGO (Albert Lazzarini)
–SDSS (Alexander Szalay)
Inter-Project Coordination: R. Pordes
–HICB (Larry Price)
–HIJTB (Carl Kesselman)
–PPDG (Ruth Pordes)
–TeraGrid, NMI, etc. (TBD)
–International (EDG, etc) (Ruth Pordes)
iVDGL: Rob Gardner

External Advisory Committee
Members
–Fran Berman (SDSC Director)
–Dan Reed (NCSA Director)
–Joel Butler (former head, FNAL Computing Division)
–Jim Gray (Microsoft)
–Bill Johnston (LBNL, DOE Science Grid)
–Fabrizio Gagliardi (CERN, EDG Director)
–David Williams (former head, CERN IT)
–Paul Messina (former CACR Director)
–Roscoe Giles (Boston U, NPACI-EOT)
Met with us 3 times: 4/2001, 1/2002, 1/2003
–Extremely useful guidance on project scope & goals

Integration of GriPhyN and iVDGL
International Virtual-Data Grid Laboratory
–A global Grid laboratory (US, EU, Asia, …)
–A place to conduct Data Grid tests “at scale”
–A mechanism to create common Grid infrastructure
–A laboratory for Grid tests by other disciplines
Tight integration with GriPhyN
–Testbeds
–VDT support
–Outreach
–Common External Advisory Committee
International participation
–DataTAG (EU)
–UK e-Science programme: support 6 CS Fellows

GriPhyN/iVDGL Basics
Both NSF funded, overlapping periods
–GriPhyN: $11.9M (NSF) + $1.6M (match) (2000–2005)
–iVDGL: $13.7M (NSF) + $2M (match) (2001–2006)
Basic composition
–GriPhyN: 12 universities, SDSC, 3 labs (~82 people)
–iVDGL: 16 institutions, SDSC, 3 labs (~84 people)
–Large overlap: people, institutions, experiments
GriPhyN (Grid research) vs iVDGL (Grid deployment)
–GriPhyN: 2/3 “CS” + 1/3 “physics” (0% H/W)
–iVDGL: 1/3 “CS” + 2/3 “physics” (20% H/W)
–Virtual Data Toolkit (VDT) in common
–Testbeds in common

iVDGL Institutions
T2 / Software
–U Florida: CMS
–Caltech: CMS, LIGO
–UC San Diego: CMS, CS
–Indiana U: ATLAS, iGOC
–Boston U: ATLAS
–U Wisconsin, Milwaukee: LIGO
–Penn State: LIGO
–Johns Hopkins: SDSS, NVO
CS support
–U Chicago: CS
–U Southern California: CS
–U Wisconsin, Madison: CS
T3 / Outreach
–Salish Kootenai: Outreach, LIGO
–Hampton U: Outreach, ATLAS
–U Texas, Brownsville: Outreach, LIGO
T1 / Labs (not funded)
–Fermilab: CMS, SDSS, NVO
–Brookhaven: ATLAS
–Argonne Lab: ATLAS, CS

US-iVDGL Sites (Spring 2003)
Map legend: Tier1; Tier2; Tier3
Sites: UF, Wisconsin, BNL, Indiana, Boston U, SKC, Brownsville, Hampton, PSU, J. Hopkins, Caltech, FIU, FSU, Arlington, Michigan, LBL, Oklahoma, Argonne, Vanderbilt, UCSD/SDSC, NCSA, Fermilab
Partners? EU, CERN, Brazil, Australia, Korea, Japan

Example: US-CMS Grid Testbed

iVDGL Management & Coordination
Diagram labels:
–U.S. Piece; International Piece
–US Project Directors
–US External Advisory Committee
–US Project Steering Group
–Project Coordination Group
–Outreach Team; Core Software Team; Facilities Team; Operations Team; Applications Team
–GLUE Interoperability Team
–Collaborating Grid Projects: TeraGrid, EDG, Asia, DataTAG, BTEV, LCG?, Bio, ALICE, Geo?, D0, PDC, CMS HI, ?
–GriPhyN: Mike Wilde

Meetings in 2000–2001
GriPhyN/iVDGL meetings
–Oct. 2000    All-hands           Chicago
–Dec. 2000    Architecture        Chicago
–Apr. 2001    All-hands, EAC      USC/ISI
–Aug. 2001    Planning            Chicago
–Oct. 2001    All-hands, iVDGL    USC/ISI
Numerous smaller meetings
–CS-experiment
–CS research
–Liaisons with PPDG and EU DataGrid
–US-CMS and US-ATLAS computing reviews
–Experiment meetings at CERN

Meetings in 2002
GriPhyN/iVDGL meetings
–Jan. 2002    EAC, Planning, iVDGL        Florida
–Mar. 2002    Outreach Workshop           Brownsville
–Apr. 2002    All-hands                   Argonne
–Jul. 2002    Reliability Workshop        ISI
–Oct. 2002    Provenance Workshop         Argonne
–Dec. 2002    Troubleshooting Workshop    Chicago
–Dec. 2002    All-hands technical         ISI + Caltech
–Jan. 2003    EAC                         SDSC
Numerous other 2002 meetings
–iVDGL facilities workshop (BNL)
–Grid activities at CMS, ATLAS meetings
–Several computing reviews for US-CMS, US-ATLAS
–Demos at IST2002, SC2002
–Meetings with LCG (LHC Computing Grid) project
–HEP coordination meetings (HICB)

Progress: CS, VDT, Outreach
Lots of good CS research (Later talks)
Installation revolution: VDT + Pacman (Later talk)
–Several major releases this year: VDT
–VDT/Pacman vastly simplify Grid software installation
–Used by all experiments
–Agreement to use VDT by LHC Computing Grid Project
Grid integration in experiment s/w (Later talks)
Expanding education/outreach (Later talk)
–Integration with iVDGL
–Collaborations: PPDG, NPACI-EOT, SkyServer, QuarkNet
–Meetings, brochures, talks, …

Progress: Student Participation
Integrated student involvement
–CS research
–VDT deployment, testing, support
–Integrating Grid tools in physics experiments
–Cluster building, testing
–Grid software deployment
–Outreach, web development
Integrated postdoc involvement
–Involvement in all areas
–Necessary when students not sufficient

Global Context: Data Grid Projects
U.S. Infrastructure Projects
–GriPhyN (NSF)
–iVDGL (NSF)
–Particle Physics Data Grid (DOE)
–TeraGrid (NSF)
–DOE Science Grid (DOE)
EU, Asia major projects
–European Data Grid (EDG) (EU, EC)
–EDG related national Projects (UK, Italy, France, …)
–CrossGrid (EU, EC)
–DataTAG (EU, EC)
–LHC Computing Grid (LCG) (CERN)
–Japanese Project
–Korea project

U.S. Project Coordination: Trillium
Trillium = GriPhyN + iVDGL + PPDG
–Large overlap in leadership, people, experiments
Benefit of coordination
–Common S/W base + packaging: VDT + PACMAN
–Low overhead for collaborative or joint projects: security, monitoring, newsletter, prod. grids, demos
–Wide deployment of new technologies, e.g. Virtual Data
–Stronger, more extensive outreach effort
Forum for US Grid projects
–Joint view, strategies, meetings and work
–Unified entity to deal with EU & other Grid projects
“Natural” collaboration across DOE and NSF projects
–Funding agency interest?

International Grid Coordination
Close collaboration with EU DataGrid (EDG)
–Many connections with EDG activities
HICB: HEP Inter-Grid Coordination Board
–Non-competitive forum, strategic issues, consensus
–Cross-project policies, procedures and technology
–International joint projects
HICB-JTB Joint Technical Board
–Definition, oversight and tracking of joint projects
–GLUE interoperability group
Participation in LHC Computing Grid (LCG)
–Software Computing Committee (SC2)
–Project Execution Board (PEB)
–Grid Deployment Board (GDB)

Creation of WorldGrid
Joint US-EU Grid deployment
GriPhyN contribution: VDT
–WorldGrid is major driver for VDT
–Demonstrated at IST2002 (Copenhagen)
–Demonstrated at SC2002 (Baltimore)
Becoming major outreach tool in 2003
–Meeting in February to continue development

WorldGrid Sites

What Coordination Takes

Extending GriPhyN’s Reach
Dynamic workspaces proposal
–Expansion of virtual data technologies to global analysis communities
FIU: Creation of “CHEPREO” in Miami area
–HEP research, participation in WorldGrid
–Strong minority E/O, coordinate with GriPhyN/iVDGL
–Research & int’l network: Brazil / South America
Also, MRI, SciDAC, other proposals

Summary
CS research
–Unified approach based around Virtual Data
–Virtual Data, Planning, Execution, Monitoring
Education/Outreach
–Student & postdoc involvement at all levels
–New collaborations with other E/O efforts, WorldGrid
Organization and management
–Clear management coordinating CS + experiments
–Collaboration/coordination US and international
Research dissemination, broad impact
–Wide deployment of VDT (US, WorldGrid, EDG, LCG)
–Demo projects, experiment testbeds, major productions
–New projects extending virtual data technologies
