Outreach Workshop (Mar. 1, 2002), Paul Avery, University of Florida: Global Data Grids for 21st Century Science


Presentation transcript:

Slide 1: Global Data Grids for 21st Century Science
Paul Avery, University of Florida
GriPhyN/iVDGL Outreach Workshop, University of Texas, Brownsville, March 1, 2002

Slide 2: What is a Grid?
- Grid: Geographically distributed computing resources configured for coordinated use
- Physical resources & networks provide raw capability
- “Middleware” software ties it together

Slide 3: What Are Grids Good For? (From Ian Foster)
- Climate modeling
  - Climate scientists visualize, annotate, & analyze Terabytes of simulation data
- Biology
  - A biochemist exploits 10,000 computers to screen 100,000 compounds in an hour
- High energy physics
  - 3,000 physicists worldwide pool Petaflops of CPU resources to analyze Petabytes of data
- Engineering
  - Civil engineers collaborate to design, execute, & analyze shake table experiments
  - A multidisciplinary analysis in aerospace couples code and data in four companies

Slide 4: What Are Grids Good For? (From Ian Foster)
- Application Service Providers
  - A home user invokes architectural design functions at an application service provider…
  - …which purchases computing cycles from cycle providers
- Commercial
  - Scientists at a multinational toy company design a new product
- Cities, communities
  - An emergency response team couples real time data, weather model, population data
  - A community group pools members’ PCs to analyze alternative designs for a local road
- Health
  - Hospitals and international agencies collaborate on stemming a major disease outbreak

Slide 5: Proto-Grid:
- Community: SETI researchers + enthusiasts
- Arecibo radio data sent to users (250 KB data chunks)
- Over 2M PCs used

Slide 6: More Advanced Proto-Grid: Evaluation of AIDS Drugs
- Community
  - Research group (Scripps)
  - 1000s of PC owners
  - Vendor (Entropia)
- Common goal
  - Drug design
  - Advance AIDS research

Slide 7: Why Grids?
- Resources for complex problems are distributed
  - Advanced scientific instruments (accelerators, telescopes, …)
  - Storage and computing
  - Groups of people
- Communities require access to common services
  - Scientific collaborations (physics, astronomy, biology, eng. …)
  - Government agencies
  - Health care organizations, large corporations, …
- Goal is to build “Virtual Organizations” (VOs)
  - Make all community resources available to any VO member
  - Leverage strengths at different institutions
  - Add people & resources dynamically

Slide 8: Grids: Why Now?
- Moore’s law improvements in computing
  - Highly functional end systems
- Burgeoning wired and wireless Internet connections
  - Universal connectivity
- Changing modes of working and problem solving
  - Teamwork, computation
- Network exponentials
  - (Next slide)

Slide 9: Network Exponentials & Collaboration (Scientific American, Jan. 2001)
- Network vs. computer performance
  - Computer speed doubles every 18 months
  - Network speed doubles every 9 months
  - Difference = order of magnitude per 5 years
- 1986 to 2000
  - Computers: x 500
  - Networks: x 340,000
- 2001 to 2010?
  - Computers: x 60
  - Networks: x 4000
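The doubling times on this slide can be checked with a few lines of arithmetic. The sketch below assumes pure exponential growth with the stated doubling periods; the function names are illustrative and not from the talk. Under that assumption the network/computer gap after 5 years comes out at about a factor of 10, and the 2001 to 2010 projections (9 years) come out at x64 and x4096, close to the slide's rounded x60 and x4000.

```python
def growth_factor(months: float, doubling_months: float) -> float:
    """Growth over `months` for a quantity that doubles every `doubling_months`."""
    return 2.0 ** (months / doubling_months)


COMPUTER_DOUBLING_MONTHS = 18.0  # from the slide
NETWORK_DOUBLING_MONTHS = 9.0    # from the slide


def network_advantage(years: float) -> float:
    """Ratio of network growth to computer growth after `years`."""
    months = 12.0 * years
    return (growth_factor(months, NETWORK_DOUBLING_MONTHS)
            / growth_factor(months, COMPUTER_DOUBLING_MONTHS))


if __name__ == "__main__":
    # Gap after 5 years: roughly one order of magnitude.
    print(f"5-year network/computer ratio: {network_advantage(5):.1f}")
    # 2001 to 2010 is 9 years = 108 months.
    print(f"Computers over 9 years: x{growth_factor(108, COMPUTER_DOUBLING_MONTHS):.0f}")
    print(f"Networks over 9 years:  x{growth_factor(108, NETWORK_DOUBLING_MONTHS):.0f}")
```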

Slide 10: Grid Challenges
- Overall goal: Coordinated sharing of resources
- Technical problems to overcome
  - Authentication, authorization, policy, auditing
  - Resource discovery, access, allocation, control
  - Failure detection & recovery
  - Resource brokering
- Additional issue: lack of central control & knowledge
  - Preservation of local site autonomy
  - Policy discovery and negotiation important

Slide 11: Layered Grid Architecture (Analogy to Internet Architecture) (From Ian Foster)
Grid layers, bottom to top:
- Fabric: “Controlling things locally”: accessing, controlling resources
- Connectivity: “Talking to things”: communications, security
- Resource: “Sharing single resources”: negotiating access, controlling use
- Collective: “Managing multiple resources”: ubiquitous infrastructure services
- User: “Specialized services”: app. specific distributed services
- Application
Internet Protocol Architecture, bottom to top:
- Link
- Internet
- Transport
- Application

Slide 12: Globus Project and Toolkit
- Globus Project™ (Argonne + USC/ISI)
  - O(40) researchers & developers
  - Identify and define core protocols and services
- Globus Toolkit™ 2.0
  - A major product of the Globus Project
  - Reference implementation of core protocols & services
  - Growing open source developer community
- Globus Toolkit used by all Data Grid projects today
  - US: GriPhyN, PPDG, TeraGrid, iVDGL
  - EU: EU-DataGrid and national projects
- Recent announcement of applying “web services” to Grids
  - Keeps Grids in the commercial mainstream
  - GT 3.0

Slide 13: Globus General Approach
- Define Grid protocols & APIs
  - Protocol-mediated access to remote resources
  - Integrate and extend existing standards
- Develop reference implementation
  - Open source Globus Toolkit
  - Client & server SDKs, services, tools, etc.
- Grid-enable wide variety of tools
  - Globus Toolkit, FTP, SSH, Condor, SRB, MPI, …
- Learn about real world problems
  - Deployment
  - Testing
  - Applications
Diagram labels: Applications; Diverse global services; Core services; Diverse resources

Slide 14: Data Intensive Science:
- Scientific discovery increasingly driven by IT
  - Computationally intensive analyses
  - Massive data collections
  - Data distributed across networks of varying capability
  - Geographically distributed collaboration
- Dominant factor: data growth (1 Petabyte = 1000 TB)
  - 2000: ~0.5 Petabyte
  - 2005: ~10 Petabytes
  - 2010: ~100 Petabytes
  - 2015: ~1000 Petabytes?
How to collect, manage, access and interpret this quantity of data?
Drives demand for “Data Grids” to handle additional dimension of data access & movement
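To put the growth estimates on this slide in the same terms as the doubling times quoted for computers and networks, the implied doubling time between two estimates can be computed as below. This is a minimal sketch that assumes smooth exponential growth between the listed years; the dictionary and function names are illustrative only. Under that assumption, data volume doubles roughly every 1.2 years from 2000 to 2005 and every 1.5 years after that.

```python
import math

# Estimates from the slide, in Petabytes.
ESTIMATES_PB = {2000: 0.5, 2005: 10.0, 2010: 100.0, 2015: 1000.0}


def doubling_time_years(start_value: float, end_value: float, years: float) -> float:
    """Doubling time implied by exponential growth from start_value to end_value."""
    return years * math.log(2) / math.log(end_value / start_value)


if __name__ == "__main__":
    points = sorted(ESTIMATES_PB.items())
    # Walk consecutive pairs of (year, volume) estimates.
    for (year0, pb0), (year1, pb1) in zip(points, points[1:]):
        t = doubling_time_years(pb0, pb1, year1 - year0)
        print(f"{year0} to {year1}: doubles every {t:.2f} years")
```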

Slide 15: Data Intensive Physical Sciences
- High energy & nuclear physics
  - Including new experiments at CERN’s Large Hadron Collider
- Gravity wave searches
  - LIGO, GEO, VIRGO
- Astronomy: Digital sky surveys
  - Sloan Digital Sky Survey, VISTA, other Gigapixel arrays
  - “Virtual” Observatories (multi-wavelength astronomy)
- Time-dependent 3-D systems (simulation & data)
  - Earth Observation, climate modeling
  - Geophysics, earthquake modeling
  - Fluids, aerodynamic design
  - Pollutant dispersal scenarios

Slide 16: Data Intensive Biology and Medicine
- Medical data
  - X-Ray, mammography data, etc. (many petabytes)
  - Digitizing patient records (ditto)
- X-ray crystallography
  - Bright X-Ray sources, e.g. Argonne Advanced Photon Source
- Molecular genomics and related disciplines
  - Human Genome, other genome databases
  - Proteomics (protein structure, activities, …)
  - Protein interactions, drug delivery
- Brain scans (3-D, time dependent)
- Virtual Population Laboratory (proposed)
  - Database of populations, geography, transportation corridors
  - Simulate likely spread of disease outbreaks
Image caption: Craig Venter

Slide 17: Example: High Energy Physics
“Compact” Muon Solenoid at the LHC (CERN)
Image label: Smithsonian standard man

Slide 18: LHC Computing Challenges
… Physicists, 150 Institutes, 32 Countries
- Complexity of LHC interaction environment & resulting data
- Scale: Petabytes of data per year (100 PB by ~…)
- Global distribution of people and resources

Slide 19: Global LHC Data Grid
Tiers shown in the diagram:
- Tier0: CERN
- Tier1: National Lab
- Tier2: Regional Center (University, etc.)
- Tier3: University workgroup
- Tier4: Workstation
Key ideas:
- Hierarchical structure
- Tier2 centers

Slide 20: Global LHC Data Grid
Diagram elements, top to bottom:
- Experiment and Online System
- Tier 0 + 1: CERN Computer Center (> 20 TIPS)
- Tier 1: USA Center, France Center, Italy Center, UK Center
- Tier 2: Tier2 Centers
- Tier 3: Institutes (~0.25 TIPS each), with physics data cache
- Tier 4: Workstations, other portals
Link rates labeled in the diagram: ~PBytes/sec, ~100 MBytes/sec, 2.5 Gbits/sec, ~622 Mbits/sec, … Mbits/sec
Annotations:
- Bunch crossing per 25 nsecs
- 100 triggers per second
- Event is ~1 MByte in size
- Physicists work on analysis “channels”. Each institute has ~10 physicists working on one or more channels
- CERN/Outside Resource Ratio ~1:2
- Tier0/(Σ Tier1)/(Σ Tier2) ~1:1:1
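The trigger and event-size annotations on this slide are enough to reproduce the ~100 MBytes/sec figure and the "Petabytes of data per year" scale quoted earlier. The sketch below works through that arithmetic. The live time of 10^7 seconds per year is an assumption made for illustration (it does not appear on the slide), and the function names are hypothetical.

```python
BUNCH_SPACING_NS = 25.0       # from the slide: one bunch crossing per 25 ns
TRIGGERS_PER_SEC = 100.0      # from the slide: events kept per second
EVENT_SIZE_MB = 1.0           # from the slide: ~1 MByte per event
LIVE_SECONDS_PER_YEAR = 1e7   # ASSUMPTION: not on the slide, illustrative only


def crossing_rate_hz(spacing_ns: float) -> float:
    """Bunch crossings per second for a given spacing in nanoseconds."""
    return 1e9 / spacing_ns


def storage_rate_mb_per_s(triggers_per_sec: float, event_size_mb: float) -> float:
    """Data rate written out after the trigger selects events."""
    return triggers_per_sec * event_size_mb


def annual_volume_pb(rate_mb_per_s: float, live_seconds: float) -> float:
    """Yearly raw data volume in Petabytes (1 PB = 1e9 MB)."""
    return rate_mb_per_s * live_seconds / 1e9


if __name__ == "__main__":
    crossings = crossing_rate_hz(BUNCH_SPACING_NS)
    rate = storage_rate_mb_per_s(TRIGGERS_PER_SEC, EVENT_SIZE_MB)
    print(f"Crossing rate: {crossings:.0f} per second")
    print(f"Crossings per kept event: {crossings / TRIGGERS_PER_SEC:.0f}")
    print(f"Storage rate: {rate:.0f} MB/s")
    print(f"Raw data per year: {annual_volume_pb(rate, LIVE_SECONDS_PER_YEAR):.1f} PB")
```

With these inputs the trigger keeps one crossing in 400,000, and a single experiment writes about 1 PB of raw data per year under the assumed live time.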

Slide 21: Sloan Digital Sky Survey Data Grid

Slide 22: LIGO (Gravity Wave) Data Grid
Diagram elements:
- Sites: Hanford Observatory, Livingston Observatory, Caltech, MIT
- Network: INet2 Abilene
- Legend: Tier1, LSC Tier2
- Link labels: OC3, OC12, OC48

Slide 23: Data Grid Projects
- Particle Physics Data Grid (US, DOE)
  - Data Grid applications for HENP expts.
- GriPhyN (US, NSF)
  - Petascale Virtual-Data Grids
- iVDGL (US, NSF)
  - Global Grid lab
- TeraGrid (US, NSF)
  - Dist. supercomp. resources (13 TFlops)
- European Data Grid (EU, EC)
  - Data Grid technologies, EU deployment
- CrossGrid (EU, EC)
  - Data Grid technologies, EU
- DataTAG (EU, EC)
  - Transatlantic network, Grid applications
- Japanese Grid Project (APGrid?) (Japan)
  - Grid deployment throughout Japan
- Collaborations of application scientists & computer scientists
- Infrastructure devel. & deployment
- Globus based

Slide 24: Coordination of U.S. Grid Projects
- Three U.S. projects
  - PPDG: HENP experiments, short term tools, deployment
  - GriPhyN: Data Grid research, Virtual Data, VDT deliverable
  - iVDGL: Global Grid laboratory
- Coordination of PPDG, GriPhyN, iVDGL
  - Common experiments + personnel, management integration
  - iVDGL as “joint” PPDG + GriPhyN laboratory
  - Joint meetings (Jan. 2002, April 2002, Sept. 2002)
  - Joint architecture creation (GriPhyN, PPDG)
  - Adoption of VDT as common core Grid infrastructure
  - Common Outreach effort (GriPhyN + iVDGL)
- New TeraGrid project (Aug. 2001)
  - 13 TFlops across 4 sites, 40 Gb/s networking
  - Goal: integrate into iVDGL, adopt VDT, common Outreach

Slide 25: Worldwide Grid Coordination
- Two major clusters of projects
  - “US based”: GriPhyN Virtual Data Toolkit (VDT)
  - “EU based”: Different packaging of similar components

Slide 26: GriPhyN = App. Science + CS + Grids
- GriPhyN = Grid Physics Network
  - US-CMS: High Energy Physics
  - US-ATLAS: High Energy Physics
  - LIGO/LSC: Gravity wave research
  - SDSS: Sloan Digital Sky Survey
  - Strong partnership with computer scientists
- Design and implement production-scale grids
  - Develop common infrastructure, tools and services (Globus based)
  - Integration into the 4 experiments
  - Broad application to other sciences via “Virtual Data Toolkit”
  - Strong outreach program
- Multi-year project
  - R&D for grid architecture (funded at $11.9M + $1.6M)
  - Integrate Grid infrastructure into experiments through VDT

Slide 27: GriPhyN Institutions
- U Florida
- U Chicago
- Boston U
- Caltech
- U Wisconsin, Madison
- USC/ISI
- Harvard
- Indiana
- Johns Hopkins
- Northwestern
- Stanford
- U Illinois at Chicago
- U Penn
- U Texas, Brownsville
- U Wisconsin, Milwaukee
- UC Berkeley
- UC San Diego
- San Diego Supercomputer Center
- Lawrence Berkeley Lab
- Argonne
- Fermilab
- Brookhaven

Slide 28: GriPhyN: PetaScale Virtual-Data Grids
Diagram elements:
- Users: Production Team, Individual Investigator, Workgroups
- Interactive User Tools
- Virtual Data Tools; Request Planning & Scheduling Tools; Request Execution & Management Tools
- Resource Management Services; Security and Policy Services; Other Grid Services
- Transforms
- Raw data source
- Distributed resources (code, storage, CPUs, networks)
Scale labels: ~1 Petaflop, ~100 Petabytes

Slide 29: GriPhyN Research Agenda
- Virtual Data technologies (fig.)
  - Derived data, calculable via algorithm
  - Instantiated 0, 1, or many times (e.g., caches)
  - “Fetch value” vs “execute algorithm”
  - Very complex (versions, consistency, cost calculation, etc)
- LIGO example
  - “Get gravitational strain for 2 minutes around each of 200 gamma-ray bursts over the last year”
- For each requested data value, need to
  - Locate item location and algorithm
  - Determine costs of fetching vs calculating
  - Plan data movements & computations required to obtain results
  - Execute the plan
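The “fetch value” vs “execute algorithm” choice listed above can be illustrated with a toy planner. This is only a sketch under stated assumptions: the `Replica` and `Transformation` classes, the `plan` function, and the single scalar cost are all hypothetical and do not describe the GriPhyN software, whose cost calculation the slide itself calls very complex.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Replica:
    url: str           # where an already materialized copy lives
    fetch_cost: float  # hypothetical cost of transferring that copy


@dataclass
class Transformation:
    name: str            # algorithm that can regenerate the item
    compute_cost: float  # hypothetical cost of running it


def plan(replicas: List[Replica],
         transformation: Optional[Transformation]) -> Tuple[str, str, float]:
    """Pick the cheapest way to obtain one data item.

    Returns (action, target, cost), where action is "fetch" or "compute".
    On a tie, fetching wins because fetch options are listed first.
    """
    options = [("fetch", r.url, r.fetch_cost) for r in replicas]
    if transformation is not None:
        options.append(("compute", transformation.name, transformation.compute_cost))
    if not options:
        raise LookupError("item has no replica and no known transformation")
    return min(options, key=lambda option: option[2])


if __name__ == "__main__":
    copies = [Replica("url-a", 30.0), Replica("url-b", 12.0)]
    # Recomputing (cost 10) beats the cheapest transfer (cost 12).
    print(plan(copies, Transformation("F", 10.0)))
    # With no known algorithm, the cheapest replica is fetched.
    print(plan(copies, None))
```

A real planner would also have to account for the data movements needed by the computation itself, which this sketch folds into a single number.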

Slide 30: Virtual Data in Action
- Data request may
  - Compute locally
  - Compute remotely
  - Access local data
  - Access remote data
- Scheduling based on
  - Local policies
  - Global policies
  - Cost
Diagram labels: Fetch item; Major facilities, archives; Regional facilities, caches; Local facilities, caches

Slide 31: GriPhyN Research Agenda (cont.)
- Execution management
  - Co-allocation of resources (CPU, storage, network transfers)
  - Fault tolerance, error reporting
  - Interaction, feedback to planning
- Performance analysis (with PPDG)
  - Instrumentation and measurement of all grid components
  - Understand and optimize grid performance
- Virtual Data Toolkit (VDT)
  - VDT = virtual data services + virtual data tools
  - One of the primary deliverables of R&D effort
  - Technology transfer mechanism to other scientific domains

Slide 32: GriPhyN/PPDG Data Grid Architecture
Diagram components, with the tools labeled next to them:
- Application, Planner, Executor, connected by DAGs
- Executor: DAGMAN, Kangaroo
- Catalog Services: MCAT; GriPhyN catalogs
- Info Services: MDS
- Policy/Security: GSI, CAS
- Monitoring: MDS
- Repl. Mgmt.: GDMP
- Reliable Transfer Service
- Compute Resource: GRAM
- Storage Resource: GridFTP; GRAM; SRM
Diagram note: Globus = initial solution is operational

Slide 33: Catalog Architecture

Transparency wrt materialization

Derived Metadata Catalog
  App-specific attr.   id
  …                    i2, i10

Derived Data Catalog (updated upon materialization)
  Id    Trans   Param   Name
  i1    F       X       F.X
  i2    F       Y       F.Y
  i10   G       Y P     G(P).Y

Transformation Catalog (Trans. name; URLs for program location, pointing to program storage)
  Trans   Prog    Cost
  F       URL:f   10
  G       URL:g   20

Transparency wrt location

Metadata Catalog (Object Name)
  Name     LObjN
  X        logO1
  Y        logO2
  F.X      logO3
  G(1).Y   logO4

Replica Catalog (Logical Container Name; URLs for physical file location, pointing to physical file storage)
  LCN     PFNs
  logC1   URL1
  logC2   URL2, URL3
  logC3   URL4
  logC4   URL5, URL6

Other diagram label: GCMS
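The two kinds of transparency on this slide can be sketched as a chain of dictionary lookups using the example rows from the catalogs. This is an illustration, not the GriPhyN implementation. In particular, the slide does not show how logical object names map to logical container names, so the `OBJECT_TO_CONTAINER` table below (logO1 to logC1, and so on) is an assumption, as is the `resolve` function itself.

```python
from typing import Dict, List, Tuple

# Rows copied from the slide.
METADATA_CATALOG: Dict[str, str] = {
    "X": "logO1", "Y": "logO2", "F.X": "logO3", "G(1).Y": "logO4",
}
REPLICA_CATALOG: Dict[str, List[str]] = {
    "logC1": ["URL1"], "logC2": ["URL2", "URL3"],
    "logC3": ["URL4"], "logC4": ["URL5", "URL6"],
}
DERIVED_DATA_CATALOG: Dict[str, Tuple[str, str]] = {
    "F.X": ("F", "X"), "F.Y": ("F", "Y"),  # name -> (transformation, parameter)
}
TRANSFORMATION_CATALOG: Dict[str, Tuple[str, int]] = {
    "F": ("URL:f", 10), "G": ("URL:g", 20),  # name -> (program URL, cost)
}

# ASSUMPTION: object-to-container mapping is not given on the slide.
OBJECT_TO_CONTAINER: Dict[str, str] = {
    "logO1": "logC1", "logO2": "logC2", "logO3": "logC3", "logO4": "logC4",
}


def resolve(name: str):
    """Return physical URLs if the item is materialized, else a recipe for it."""
    logical_object = METADATA_CATALOG.get(name)
    if logical_object is not None:
        # Transparency wrt location: logical name -> container -> physical files.
        container = OBJECT_TO_CONTAINER[logical_object]
        return ("replicas", REPLICA_CATALOG[container])
    if name in DERIVED_DATA_CATALOG:
        # Transparency wrt materialization: fall back to the derivation recipe.
        trans, param = DERIVED_DATA_CATALOG[name]
        program_url, cost = TRANSFORMATION_CATALOG[trans]
        return ("derive", (trans, param, program_url, cost))
    raise KeyError(name)


if __name__ == "__main__":
    print(resolve("F.X"))  # already materialized, so replica URLs are returned
    print(resolve("F.Y"))  # not materialized, so the recipe for F on Y is returned
```

Once `F.Y` has been computed, the "update upon materialization" step on the slide would correspond to adding `F.Y` to the metadata and replica catalogs so that later requests take the first branch.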

Slide 34: iVDGL: A Global Grid Laboratory
- International Virtual-Data Grid Laboratory
  - A global Grid laboratory (US, EU, South America, Asia, …)
  - A place to conduct Data Grid tests “at scale”
  - A mechanism to create common Grid infrastructure
  - A facility to perform production exercises for LHC experiments
  - A laboratory for other disciplines to perform Data Grid tests
  - A focus of outreach efforts to small institutions
- Funded for $13.65M by NSF
“We propose to create, operate and evaluate, over a sustained period of time, an international research laboratory for data-intensive science.” (From NSF proposal, 2001)

Slide 35: iVDGL Components
- Computing resources
  - Tier1, Tier2, Tier3 sites
- Networks
  - USA (TeraGrid, Internet2, ESNET), Europe (Géant, …)
  - Transatlantic (DataTAG), Transpacific, AMPATH, …
- Grid Operations Center (GOC)
  - Indiana (2 people)
  - Joint work with TeraGrid on GOC development
- Computer Science support teams
  - Support, test, upgrade GriPhyN Virtual Data Toolkit
- Outreach effort
  - Integrated with GriPhyN
- Coordination, interoperability

Slide 36: Current iVDGL Participants
- Initial experiments (funded by NSF proposal)
  - CMS, ATLAS, LIGO, SDSS, NVO
- U.S. Universities and laboratories
  - (Next slide)
- Partners
  - TeraGrid
  - EU DataGrid + EU national projects
  - Japan (AIST, TITECH)
  - Australia
- Complementary EU project: DataTAG
  - 2.5 Gb/s transatlantic network

Slide 37: U.S. iVDGL Proposal Participants
- U Florida: CMS
- Caltech: CMS, LIGO
- UC San Diego: CMS, CS
- Indiana U: ATLAS, GOC
- Boston U: ATLAS
- U Wisconsin, Milwaukee: LIGO
- Penn State: LIGO
- Johns Hopkins: SDSS, NVO
- U Chicago/Argonne: CS
- U Southern California: CS
- U Wisconsin, Madison: CS
- Salish Kootenai: Outreach, LIGO
- Hampton U: Outreach, ATLAS
- U Texas, Brownsville: Outreach, LIGO
- Fermilab: CMS, SDSS, NVO
- Brookhaven: ATLAS
- Argonne Lab: ATLAS, CS
Group labels on the slide: T2 / Software; CS support; T3 / Outreach; T1 / Labs (funded elsewhere)

Slide 38: Initial US-iVDGL Data Grid
Map legend: Tier1 (FNAL); Proto-Tier2; Tier3 university
Sites shown: UCSD, Florida, Wisconsin, Fermilab, BNL, Indiana, BU, SKC, Brownsville, Hampton, PSU, JHU, Caltech
Other sites to be added in 2002

Slide 39: iVDGL Map ( )
Map legend: Tier0/1 facility; Tier2 facility; Tier3 facility; 10 Gbps link; 2.5 Gbps link; 622 Mbps link; Other link
Network labels: DataTAG, Surfnet
Later:
- Brazil
- Pakistan
- Russia
- China

Slide 40: Summary
- Data Grids will qualitatively and quantitatively change the nature of collaborations and approaches to computing
- The iVDGL will provide vast experience for new collaborations
- Many challenges during the coming transition
  - New grid projects will provide rich experience and lessons
  - Difficult to predict situation even 3-5 years ahead

Slide 41: Grid References
- Grid Book
- Globus
- Global Grid Forum
- TeraGrid
- EU DataGrid
- PPDG
- GriPhyN
- iVDGL