Extending the Grid Reach in Europe
Paul Avery, University of Florida
Brussels Grid Meeting, Mar. 23, 2001

Presentation transcript:

Slide 1: Global Data Grids: The Need for Infrastructure
- Paul Avery, University of Florida
- http://www.phys.ufl.edu/~avery/ | avery@phys.ufl.edu
- Brussels, Mar. 23, 2001
- Slides: http://www.phys.ufl.edu/~avery/griphyn/talks/avery_brussels_23mar01.ppt

Slide 2: Global Data Grid Challenge
"Global scientific communities, served by networks with bandwidths varying by orders of magnitude, need to perform computationally demanding analyses of geographically distributed datasets that will grow by at least 3 orders of magnitude over the next decade, from the 100 Terabyte to the 100 Petabyte scale."

Slide 3: Data Intensive Science: 2000-2015
- Scientific discovery increasingly driven by IT
  - Computationally intensive analyses
  - Massive data collections
  - Rapid access to large subsets
  - Data distributed across networks of varying capability
- Dominant factor: data growth (1 Petabyte = 1000 TB)
  - 2000: ~0.5 Petabyte
  - 2005: ~10 Petabytes
  - 2010: ~100 Petabytes
  - 2015: ~1000 Petabytes?
How to collect, manage, access, and interpret this quantity of data?
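The milestones above imply a remarkably steady compounding rate, which a quick sketch makes concrete. The petabyte figures come from the slide; the growth-rate arithmetic is our own illustration.

```python
# Back-of-envelope check of the data-growth projection on this slide.
# Milestone volumes (in petabytes) are taken from the slide itself.
milestones = {2000: 0.5, 2005: 10, 2010: 100, 2015: 1000}

def annual_growth(pb_start, pb_end, years):
    """Compound annual growth factor between two data volumes."""
    return (pb_end / pb_start) ** (1 / years)

# 0.5 PB in 2000 -> ~1000 PB in 2015: a factor of 2000 over 15 years.
g = annual_growth(0.5, 1000, 15)
print(f"Implied growth: x{g:.2f} per year (~{(g - 1) * 100:.0f}% annually)")
```

At roughly 66% growth per year, storage volume does not merely grow; it outpaces any single site's ability to keep up, which is the core argument for distributing the data.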

Slide 4: Data Intensive Disciplines
- High energy & nuclear physics
- Gravity wave searches (e.g., LIGO, GEO, VIRGO)
- Astronomical sky surveys (e.g., Sloan Sky Survey)
- Global "Virtual" Observatory
- Earth Observing System
- Climate modeling
- Geophysics

Slide 5: Data Intensive Biology and Medicine
- Radiology data
- X-ray sources (APS crystallography data)
- Molecular genomics (e.g., Human Genome)
- Proteomics (protein structure, activities, ...)
- Simulations of biological molecules in situ
- Human Brain Project
- Global Virtual Population Laboratory (disease outbreaks)
- Telemedicine
- Etc.
Commercial applications are not far behind.

Slide 6: The Large Hadron Collider at CERN
[figure: the "Compact" Muon Solenoid detector at the LHC, shown with a standard man for scale]

Slide 7: LHC Computing Challenges
- Complexity of the LHC environment and resulting data
- Scale: Petabytes of data per year (100 PB by 2010)
- Global distribution of people and resources
CMS Experiment: 1800 physicists, 150 institutes, 32 countries

Slide 8: Global LHC Data Grid Hierarchy
[diagram: Tier 0 (CERN) at the root, fanning out to Tier 1, Tier 2, Tier 3, and Tier 4 sites]
- Tier 0: CERN
- Tier 1: National Lab
- Tier 2: Regional Center at University
- Tier 3: University workgroup
- Tier 4: Workstation
GriPhyN: R&D, Tier2 centers, unify all IT resources

Slide 9: Global LHC Data Grid Hierarchy (data flow)
[diagram: Online System feeding the CERN Computer Center (Tier 0+1, >20 TIPS), then national Tier 1 centers (France, USA, Italy, UK), then Tier 2 centers, then institutes (Tier 3, ~0.25 TIPS), then workstations and other portals (Tier 4)]
- One bunch crossing every 25 nsec; 100 triggers per second; each event is ~1 MByte
- Experiment to online system: ~PBytes/sec
- Online system to Tier 0: ~100 MBytes/sec
- Tier 0 to Tier 1: 2.5-10 Gb/sec
- Tier 1 to Tier 2: ~622 Mbits/sec
- Tier 2 to institutes: 100-1000 Mbits/sec
- Physicists work on analysis "channels"; each institute has ~10 physicists working on one or more channels, with a physics data cache
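The trigger figures on this slide are enough to reproduce the "Petabytes per year" claim. The 100 Hz trigger rate and ~1 MB event size come from the slide; the ~1e7 live seconds per accelerator year is a common HEP rule of thumb and is our assumption, not a figure from the talk.

```python
# Rough annual data-volume estimate from the slide's trigger parameters.
TRIGGER_RATE_HZ = 100        # events written out per second (from slide)
EVENT_SIZE_MB = 1.0          # ~1 MByte per event (from slide)
LIVE_SECONDS_PER_YEAR = 1e7  # assumed accelerator live time per year

mb_per_year = TRIGGER_RATE_HZ * EVENT_SIZE_MB * LIVE_SECONDS_PER_YEAR
pb_per_year = mb_per_year / 1e9  # 1 PB = 1e9 MB
print(f"Raw data volume: ~{pb_per_year:.1f} PB/year")
```

That single petabyte per year is raw data only; reconstructed and simulated copies multiply it further, consistent with the slide's 100 PB-by-2010 projection.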

Slide 10: Global Virtual Observatory
[diagram: discovery tools connected through standards to distributed archives]
- Source catalogs and image data
- Specialized data: spectroscopy, time series, polarization
- Information archives: derived & legacy data (NED, Simbad, ADS, etc.)
- Discovery tools: visualization, statistics
- Standards tying together multi-wavelength astronomy and multiple surveys

Slide 11: GVO: The New Astronomy
- Large, globally distributed database engines
  - Integrated catalog and image databases
  - Multi-Petabyte data size
  - Gbyte/s aggregate I/O speed per site
- High-speed (>10 Gbits/s) backbones
  - Cross-connecting and correlating the major archives
- Scalable computing environment
  - 100s-1000s of CPUs for statistical analysis and discovery
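The "Gbyte/s aggregate I/O" requirement follows directly from the data volumes: even reading a petabyte once is slow at these rates. The petabyte scale and Gbyte/s figure come from the slide; the scan-time scenario is our own illustration.

```python
# How long a full scan of a multi-petabyte archive takes at a given
# aggregate I/O rate -- the arithmetic behind the slide's requirements.
def scan_days(dataset_pb, aggregate_gb_per_s):
    """Days needed to read a dataset once at the given aggregate rate."""
    seconds = dataset_pb * 1e6 / aggregate_gb_per_s  # 1 PB = 1e6 GB
    return seconds / 86400

print(f"1 PB at  1 GB/s: {scan_days(1, 1):.1f} days")
print(f"1 PB at 10 GB/s: {scan_days(1, 10):.1f} days")
```

A single pass at 1 GB/s takes over eleven days, which is why the slide pairs high aggregate I/O with hundreds to thousands of CPUs working in parallel.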

Slide 12: Infrastructure for Global Grids

Slide 13: Grid Infrastructure
- Grid computing is sometimes compared to the electric grid
  - You plug in to get a resource (CPU, storage, ...)
  - You don't care where the resource is located
- This analogy has an unfortunate downside: you might need different sockets!

Slide 14: Role of Grid Infrastructure
- Provide essential common Grid infrastructure
  - We cannot afford to develop separate infrastructures
- Meet the needs of high-end scientific collaborations
  - Already international and even global in scope
  - Need to share heterogeneous resources among members
  - Experiments drive future requirements
- Be broadly applicable outside science
  - Government agencies: national, regional (EU), UN
  - Non-governmental organizations (NGOs)
  - Corporations, business networks (e.g., supplier networks)
  - Other "virtual organizations"
- Be scalable to the global level
  - But EU + US is a good starting point

Slide 15: A Path to Common Grid Infrastructure
- Make a concrete plan
- Have a clear focus on infrastructure and standards
- Be driven by high-performance applications
- Leverage resources & act coherently
- Build large-scale Grid testbeds
- Collaborate with industry

Slide 16: Building Infrastructure from Data Grids
Three Data Grid projects recently funded:
- Particle Physics Data Grid (US, DOE)
  - Data Grid applications for HENP
  - Funded 2000, 2001
  - http://www.ppdg.net/
- GriPhyN (US, NSF)
  - Petascale Virtual-Data Grids
  - Funded 9/2000 - 9/2005
  - http://www.griphyn.org/
- European Data Grid (EU)
  - Data Grid technologies, EU deployment
  - Funded 1/2001 - 1/2004
  - http://www.eu-datagrid.org/
Common traits: HEP in common; focus on infrastructure development & deployment; international scope

Slide 17: Background on Data Grid Projects
- They support several disciplines
  - GriPhyN: CS, HEP (LHC), gravity waves, digital astronomy
  - PPDG: CS, HEP (LHC + current experiments), nuclear physics, networking
  - DataGrid: CS, HEP, earth sensing, biology, networking
- They are already joint projects
  - Each serves the needs of multiple constituencies
  - Each is driven by high-performance scientific applications
  - Each has international components
  - Their management structures are interconnected
- Each project is developing and deploying infrastructure
  - US$23M funded (additional proposals for US$35M)
What if they join forces?

Slide 18: A Common Infrastructure Opportunity
- GriPhyN + PPDG + EU-DataGrid + national efforts
  - France, Italy, UK, Japan
- They have agreed to collaborate and develop joint infrastructure
  - Initial meeting March 4 in Amsterdam to discuss issues
  - Future meetings in June, July
- Preparing a management document
  - Joint management, technical boards + steering committee
  - Coordination of people and resources
  - An expectation that this will lead to real work
- Collaborative projects
  - Grid middleware
  - Integration into applications
  - Grid testbed: iVDGL
  - Network testbed (Foster): T3 = Transatlantic Terabit Testbed

Slide 19: iVDGL
- International Virtual-Data Grid Laboratory
  - A place to conduct Data Grid tests at scale
  - A concrete manifestation of worldwide Grid activity
  - A continuing activity that will drive Grid awareness
  - A basis for further funding
- Scale of effort
  - For national and international scale Data Grid tests and operations
  - Computationally and data intensive computing
  - Fast networks
- Who
  - Initially US-UK-EU; other world regions later
  - Discussions with Russia, Japan, China, Pakistan, India, South America

Slide 20: iVDGL Parameters
- Local control of resources is vitally important
  - Experiments and politics demand it
  - US, UK, France, Italy, Japan, ...
- Grid exercises
  - Must serve clear purposes
  - Will require configuration changes (not trivial)
  - "Easy" intra-experiment tests first (10-20%, national, transatlantic)
  - "Harder" wide-scale tests later (50-100% of all resources)
- Strong interest from other disciplines
  - Our CS colleagues (wide-scale tests)
  - Other HEP + NP experiments
  - Virtual Observatory (VO) community in Europe/US
  - Gravity wave community in Europe/US/(Japan?)
  - Bioinformatics

Slide 21: Revisiting the Infrastructure Path
- Make a concrete plan
  - GriPhyN + PPDG + EU DataGrid + national projects
- Have a clear focus on infrastructure and standards
  - Already agreed
  - COGS (Consortium for Open Grid Software) to drive standards?
- Be driven by high-performance applications
  - Applications are manifestly high-performance: LHC, GVO, LIGO/GEO/Virgo, ...
  - Identify challenges today to create tomorrow's Grids

Slide 22: Revisiting the Infrastructure Path (cont.)
- Leverage resources & act coherently
  - Well-funded experiments depend on Data Grid infrastructure
  - Collaborate with national laboratories: FNAL, BNL, RAL, Lyon, KEK, ...
  - Collaborate with other Data Grid projects: US, UK, France, Italy, Japan
  - Leverage new resources: DTF, CAL-IT2, ...
  - Work through the Global Grid Forum
- Build and maintain large-scale Grid testbeds
  - iVDGL and T3 (part of the same infrastructure)
- Collaborate with industry (next slide)
- EC investment in this opportunity
  - Leverage and extend existing projects and worldwide expertise
  - Invest in testbeds
  - Work with national projects (US/NSF, UK/PPARC, ...)

Slide 23: Collaboration with Industry
- Industry efforts are similar, but only in spirit
  - ASP, P2P, home PCs, ...
  - The IT industry has mostly not invested in Grid R&D
  - We have different motives, objectives, and timescales
- Still many areas of common interest
  - Clusters, storage, I/O
  - Low-cost cluster management
  - High-speed, distributed databases
  - Local and wide-area networks, end-to-end performance
  - Resource sharing, fault tolerance, ...
- Fruitful collaboration requires clear objectives
- The EC could play an important role in enabling collaborations

Slide 24: Status of Data Grid Projects
- GriPhyN
  - US$12M funded by the NSF/ITR 2000 program (5-year R&D)
  - 2001 supplemental funds requested for initial deployments
  - Submitting a 5-year proposal (US$15M) to NSF
  - Intends to fully develop production Data Grids
- Particle Physics Data Grid
  - Funded in 1999, 2000 by DOE (US$1.2M per year)
  - Submitting a 3-year proposal (US$12M) to the DOE Office of Science
- EU DataGrid
  - 10M Euros funded by the EU (3 years, 2001-2004)
  - Submitting a proposal in April for additional funds
- Other projects?

Slide 25: Grid References
- Grid Book: www.mkp.com/grids
- Globus: www.globus.org
- Global Grid Forum: www.gridforum.org
- PPDG: www.ppdg.net
- EU DataGrid: www.eu-datagrid.org
- GriPhyN: www.griphyn.org

Slide 26: Summary
- Grids will qualitatively and quantitatively change the nature of collaborations and approaches to computing
- Global Data Grids provide the challenges needed to build tomorrow's Grids
- We have a major opportunity to create common infrastructure
- Many challenges during the coming transition
  - New Grid projects will provide rich experience and lessons
  - It is difficult to predict the situation even 3-5 years ahead

