Enabling Grids for E-sciencE (EGEE), INFSO-RI-508833, www.eu-egee.org

Large Scale Grid Infrastructures: Status and Future
Erwin Laure, EGEE Technical Director, CERN, Switzerland
Grid Crunching Day 2006

Scientific Grid Infrastructures
DEISA, EGEE, NAREGI, NorduGrid, OSG, PRAGMA, TeraGrid, GIN
eScience
Science is becoming increasingly digital and needs to deal with growing amounts of data and computation.
Simulations get ever more detailed:
–Nanotechnology: design of new materials from the molecular scale
–Modelling and predicting complex systems (weather forecasting, river floods, earthquakes)
–Decoding the human genome
Experimental science uses ever more sophisticated sensors to make precise measurements:
–Needs high statistics
–Produces huge amounts of data
–Serves user communities around the world
High Energy Physics
Large Hadron Collider (LHC): one of the most powerful instruments ever built to investigate matter.
–4 experiments: ALICE, ATLAS, CMS, LHCb
–27 km circumference tunnel
–Due to start up in 2007
[Figure: aerial view of the LHC site, with Mont Blanc (4810 m) and downtown Geneva]
Accelerating and Colliding Particles
–40 million particle collisions per second
–An online filter reduces this to a few hundred "good" events per second, recorded on disk and magnetic tape at 100-1,000 megabytes/sec
–~15 petabytes per year for all four experiments
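A back-of-the-envelope check connects the quoted recording rate to the yearly volume. The effective running time of ~1e7 seconds per year is a common HEP rule of thumb, not a figure from the slides:

```python
# Rough estimate of annual LHC data volume (illustrative only; the
# ~1e7 s/year effective beam time is an assumed HEP rule of thumb).
rate_mb_per_s = 1000          # upper end of the quoted 100-1,000 MB/s
seconds_per_year = 1e7        # assumed effective running time per year
volume_pb = rate_mb_per_s * seconds_per_year / 1e9  # MB -> PB
print(f"~{volume_pb:.0f} PB/year")
```

This lands at ~10 PB/year, the same order of magnitude as the ~15 PB/year quoted for all four experiments.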
Data Handling and Computation for Physics Analysis
[Diagram: data flows from the detector through the event filter (selection & reconstruction) to raw data; reconstruction produces event summary data; event reprocessing and event simulation feed batch physics analysis, which extracts analysis objects by physics topic for interactive physics analysis. Credit: les.robertson@cern.ch]
Example: Addressing Emerging Diseases
Emerging diseases know no frontiers, and time is a critical factor (e.g. human casualties from avian influenza).
International collaboration is required for:
–Early detection
–Epidemiological watch
–Prevention
–Search for new drugs
–Search for vaccines
WISDOM, the First Step
WISDOM focuses on drug discovery for neglected and emerging diseases.
–Summer 2005: World-wide In Silico Docking On Malaria
–Spring 2006: drug design against the H5N1 neuraminidase involved in virus propagation
  - impact of selected point mutations on the efficiency of existing drugs
  - identification of new potential drugs acting on mutated N1
–A new challenge is currently being run
Challenges for High-Throughput Virtual Docking
–300,000 chemical compounds: ZINC & a chemical combinatorial library
–Target (PDB): neuraminidase (8 structures)
–Millions of chemical compounds are available in laboratories; high-throughput screening at ~2 $/compound is nearly impossible
–Molecular docking (AutoDock): 100s of CPU years, TBs of data
–Data challenge on EGEE, AuverGrid, TWGrid: ~6 weeks on ~2,000 computers
–In vitro screening of 100 hits after hits sorting and refining
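The scale of the challenge can be sanity-checked from the numbers on the slide. The per-docking CPU time is not given, so only the aggregate figures are used here:

```python
# Sanity check of the docking challenge scale (illustrative; only the
# aggregate figures from the slide are used).
compounds = 300_000
structures = 8
dockings = compounds * structures      # 2,400,000 docking runs in total
cpu_years = 100                        # lower end of "100s of CPU years"
cpus = 2000                            # the ~2,000 computers used
wall_days = cpu_years * 365 / cpus     # ideal, perfectly parallel wall time
print(dockings, round(wall_days, 1))
```

100 CPU-years on ~2,000 machines gives ~18 days of ideal wall time, so the observed ~6 weeks is consistent with a total effort of a few hundred CPU-years, matching the slide's "100s of CPU years".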
Example: Determining Earthquake Mechanisms
A seismic software application determines the epicentre, magnitude, and mechanism.
Analysis of the Indonesian earthquake (28 March 2005):
–Seismic data available within 12 hours after the earthquake
–Analysis performed within 30 hours after the earthquake occurred
–Results: not an aftershock of the December 2004 earthquake; different location (a different part of the fault line, further south); different mechanism
Rapid analysis of earthquakes is important for relief efforts.
[Figure: focal mechanisms for Peru, June 23, 2001 (Mw=8.4) and Sumatra, March 28, 2005 (Mw=8.5)]
Main Trend
The size of the data an organization owns, manages, and depends on is dramatically increasing:
–The ownership cost of storage capacity goes down
–The amount of data generated and consumed goes up
–Network capacity goes up
–Distributed computing technology matures and is more widely adopted
The Grid
–Integrates computing power and data storage capacities across administrative boundaries
–Provides seamless access to resources distributed around the globe
–Enables more effective collaboration of dispersed communities, both scientific and commercial
–Gives the ability to run large-scale applications comprising thousands of computers, for a wide range of uses
EGEE
Infrastructure operation:
–Currently includes ~200 sites across 40 countries
–Continuous monitoring of grid services and automated site configuration/management (http://gridportal.hep.ph.ic.ac.uk/rtm/launch_frame.html)
–Used by >160 VOs running >50,000 jobs/day
Middleware:
–Production-quality middleware distributed under a business-friendly open source licence
User support: a managed process from first contact through to production usage
–Training
–Expertise in grid-enabling applications
–Online helpdesk
–Networking events (User Forum, conferences, etc.)
Interoperability:
–Expanding geographical reach and interoperability with collaborating e-infrastructures
EGEE Applications
>160 VOs from several scientific domains:
–Archaeology
–Astronomy
–Life sciences
–Computational chemistry
–Earth sciences
–Financial simulation
–Geophysics
–High energy physics
–Plasma physics
Further applications are under evaluation. Applications have moved from testing to routine daily usage, with ~80-90% efficiency.
Grid of Grids: from Local to Global
[Diagram: community, campus, and national grids layered into a global grid of grids]
OSG Sites
[Map: Open Science Grid sites]
32 Virtual Organizations (Participating Groups)
–3 with >1,000 jobs max. (all particle physics)
–3 with 500-1,000 max. (all outside physics)
–5 with 100-500 max. (particle, nuclear, and astrophysics)
The DEISA Supercomputing Environment (21,900 processors and 145 Tflops in 2006, more than 190 Tflops in 2007)
Slide: V. Alessandrini, IDRIS-CNRS, from the EGEE Workshop on Management of Rights in Production Grids, Paris, June 19th, 2006
IBM AIX super-cluster:
–FZJ Jülich: 1,312 processors, 8.9 teraflops peak
–RZG Garching: 748 processors, 3.8 teraflops peak
–IDRIS: 1,024 processors, 6.7 teraflops peak
–CINECA: 512 processors, 2.6 teraflops peak
–CSC: 512 processors, 2.6 teraflops peak
–ECMWF: 2 systems of 2,276 processors each, 33 teraflops peak
–HPCx: 1,600 processors, 12 teraflops peak
Other systems:
–BSC: IBM PowerPC Linux system (MareNostrum), 4,864 processors, 40 teraflops peak
–SARA: SGI Altix Linux system, 416 processors, 2.2 teraflops peak
–LRZ: Linux cluster (2.7 teraflops), moving to an SGI Altix system (5,120 processors and 33 teraflops peak in 2006, 70 teraflops peak in 2007)
–HLRS: NEC SX8 vector system, 646 processors, 12.7 teraflops peak
Systems are interconnected with a dedicated 1 Gb/s network (currently upgrading to 10 Gb/s) provided by GEANT and the NRENs.
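Summing the per-site peak figures listed above gives a quick consistency check against the headline totals. LRZ is counted here at its upgraded 2006 Altix figure; the slide's own 145/190 Tflops headline presumably reflects a different snapshot of the rolling upgrades:

```python
# Aggregate peak performance from the per-site figures on the slide
# (LRZ counted at its upgraded 33 Tflops Altix figure, an assumption).
peak_tflops = {
    "FZJ": 8.9, "RZG": 3.8, "IDRIS": 6.7, "CINECA": 2.6, "CSC": 2.6,
    "ECMWF": 33, "HPCx": 12, "BSC": 40, "SARA": 2.2, "LRZ": 33, "HLRS": 12.7,
}
total = sum(peak_tflops.values())
print(f"~{total:.1f} Tflops peak")
```

The sum comes to ~157.5 Tflops, inside the 145-190 Tflops range quoted for 2006-2007.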
National Research Grid Infrastructure (NAREGI), 2003-2007
Petascale grid infrastructure R&D for future deployment:
–$45 million (US) + $16 million x 5 (2003-2007) = $125 million total
–Hosted by the National Institute of Informatics (NII) and the Institute of Molecular Science (IMS)
–PL: Ken Miura (Fujitsu/NII); Sekiguchi (AIST), Matsuoka (Titech), Shimojo (Osaka-U), Aoyagi (Kyushu-U), ...
–Participation by multiple (>= 3) vendors: Fujitsu, NEC, Hitachi, NTT, etc.
–Follow and contribute to GGF standardization, esp. OGSA
[Diagram: national research grid middleware R&D over SuperSINET, linking a 15-100 Tflops grid R&D infrastructure, the ~10 Tflops "NanoGrid" at IMS, and focused "grand challenge" grid application areas (nanotech, biotech, others) across AIST, Titech, Osaka-U, Kyushu-U, U-Tokyo, RIKEN, Fujitsu, NEC, and Hitachi]
High Energy Physics (LCG)
LCG depends on two major science grid infrastructures (plus regional grids):
–EGEE: Enabling Grids for E-sciencE
–OSG: US Open Science Grid
Scientific Grid Infrastructures
DEISA, EGEE, NAREGI, NorduGrid, OSG, PRAGMA, TeraGrid, GIN
Grids are in Production Use
EGEE and Sustainability
BUT ... how do grid infrastructures compare to other computing infrastructures?
–Number of infrastructure users?
–Number of application domains?
–Number of computing nodes?
–Number of years in service?
What would happen if grid infrastructures were turned off?
What happens after April 2008 (the end of EGEE-II)?
Why Sustainability?
Scientific applications are starting to depend on grid infrastructures:
–e.g. EGEE supports well over 100 VOs running over 50,000 jobs/day
–They require long-term support
New scientific collaborations have been formed thanks to the grid infrastructure:
–e.g. WISDOM (http://wisdom.healthgrid.org)
Business and industry are getting very interested, but need a long-term perspective:
–e.g. over 20 companies were present at the Business Track of the EGEE'06 conference, September 2006
[Chart: growth in virtual organizations and in jobs/day (>50k), Jan. '06 to Sep. '06]
The Future of Grids
Increasing the number of infrastructure users by increasing awareness:
–Dissemination and outreach
–Training and education
–Grids offer new opportunities for collaborative work
Increasing the number of applications by improving application support and middleware functionality:
–Increase stability, scalability, and usability; major efforts are needed particularly on VO management, security infrastructure, data management, and job management
–High-level grid middleware extensions
Increasing the grid infrastructure:
–Increase the manageability of grid services
–Incubate regional grid projects
–Ensure interoperability between projects
Protecting user investments:
–Towards a sustainable grid infrastructure
Sustainability: Beyond EGEE-II
We need to prepare for a permanent grid infrastructure:
–Ensure reliable and adaptive support for all sciences
–Be independent of short project funding cycles
–Manage the infrastructure in collaboration with national grid initiatives
Grids in Europe
Great investment in developing grid technology. A sample of national grid projects:
–Austrian Grid Initiative
–DutchGrid
–France: Grid'5000
–Germany: D-Grid; UNICORE
–Greece: HellasGrid
–Grid Ireland
–Italy: INFN Grid; GRID.IT
–NorduGrid
–Swiss Grid
–UK e-Science: National Grid Service; OMII; GridPP
EGEE provides a framework for national, regional, and thematic grids.
Evolution
[Diagram: evolution of the European e-infrastructure from testbeds, through routine usage, to a utility service, at national and global scale]
Summary
Grids represent a powerful new tool for science. Today we have a window of opportunity to move grids from research prototypes to permanent production systems (as networks did a few years ago).
EGEE offers:
–a mechanism for linking together the people, resources, and data of many scientific communities
–a basic set of middleware for grid-enabling applications, with documentation, training, and support
–regular forums for linking with grid experts, other communities, and industry