EGEE-II INFSO-RI
Enabling Grids for E-sciencE
EGEE and gLite are registered trademarks

Applications Using the EGEE Grid Infrastructure
C. Loomis (CNRS/LAL), V. Floros (GRNET)
EuroVO Conference, Garching, Germany, 7-11 April 2008

Apps. on EGEE – C. Loomis, V. Floros – EuroVO – 9 April 2008

Contents
Introduction
Scientific Disciplines on the Grid
Grid Functionality
Highlighted Applications
Evolution: Project & Middleware
Conclusions

Introduction
Grid: A hardware and software infrastructure…
Federation
  – Sharing and coherent use of distributed resources.
  – Use of diverse resources (CPU, storage, DBs, InfiniBand, …).
Collaboration
  – Platform to bring together people with different skills.
  – Mechanism to analyze, publish, and combine previous results.
Goal: Adoption and use by a large number of users across a wide spectrum of scientific disciplines.

Scientific Disciplines
Growth and diversification of applications.
Reported apps. only: an underestimate!
[Table: reported applications by discipline; columns 6/2006, 2/2007, 1/2008]
Astron. & Astrophysics: 289
Computational Chemistry: 62721
Earth Science: 16 18
Fusion: 234
High-Energy Physics: 9117
Life Sciences:
Others: 41421
Total:
Other disciplines: Condensed Matter Physics, Comp. Fluid Dynamics, Computer Science/Tools, Civil Protection, Finance

Growing Usage
~22K CPUs in continuous use
Usage doubled over the last year

Usage by Scientific Discipline
Wide (natural) differences in total CPU utilization.
  – Continued heavy use in HEP and LS.
  – Average 20x increase in other areas!
Evidence of broad adoption of grid technology
[Chart: CPU usage by discipline: Astron. & Astrophysics, Comp. Chem, Earth Science, Fusion, High-Energy Physics, Life Sciences, Others, TOTAL]

Summary of Use
Large, growing overall utilization
Long-term, habitual use of infrastructure.
Broad adoption
  – Growing number of Virtual Organizations.
  – Uptake in broad spectrum of disciplines.

Functionality
EGEE core middleware = gLite
Job management
  – Gatekeeper, Workload Mgt. System
  – GridWay, GANGA/DIANE
Data management
  – SRM, FTS, LFC
  – SRB, …
Metadata management
  – AMGA
VO management
  – VOMS (user and group management, authorization)

CERN LHC
[Figure: CERN LHC, Geneva; labeled 9 km; © CERN]

ATLAS Experiment
[Figure: ATLAS; labels 40 m, 20 m, 7000 tons. Image: ATL-PHO-GEN]

High-Energy Physics (LCG)
Data Rate:
  – 40 MHz interaction rate
  – 100 Hz of filtered events
  – 1-10 megabytes per filtered event
  – 0.1-1 gigabytes/second
Data Volume:
  – LHC runs 24/7.
  – Generates petabytes of data per year!
  – Plus 1-10 times that in simulated data.
Data management is the real challenge for LHC.
  – Recording and retrieval.
  – Metadata management for locating interesting data.
  – Chaotic analysis and large productions.
Unit prefixes: kilo- (K) 10^3, mega- (M) 10^6, giga- (G) 10^9, tera- (T) 10^12, peta- (P) 10^15, exa- (E) 10^18
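The figures on this slide can be cross-checked with a short calculation. The sketch below takes the filtered event rate and event size from the slide and assumes, as an upper bound, that data are recorded every second of the year; the real duty cycle is lower, so the yearly totals are only indicative. All names in the code are illustrative.

```python
# Back-of-the-envelope check of the LHC data rates quoted on the slide.
MB = 10**6   # mega-
GB = 10**9   # giga-
PB = 10**15  # peta-

INTERACTION_RATE_HZ = 40 * 10**6    # 40 MHz interaction rate (from the slide)
FILTERED_RATE_HZ = 100              # filtered events per second (from the slide)
EVENT_SIZE_MB = (1, 10)             # megabytes per filtered event (from the slide)
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: recording non-stop all year


def throughput_bytes_per_s(rate_hz, event_size_mb):
    """Bytes written per second for a given event rate and event size."""
    return rate_hz * event_size_mb * MB


def yearly_volume_pb(bytes_per_s, seconds=SECONDS_PER_YEAR):
    """Petabytes accumulated over `seconds` at a constant byte rate."""
    return bytes_per_s * seconds / PB


low, high = (throughput_bytes_per_s(FILTERED_RATE_HZ, s) for s in EVENT_SIZE_MB)

print(f"filter keeps 1 event in {INTERACTION_RATE_HZ // FILTERED_RATE_HZ}")
print(f"throughput: {low / GB:.1f} to {high / GB:.1f} GB/s")
print(f"yearly volume: {yearly_volume_pb(low):.1f} to {yearly_volume_pb(high):.1f} PB")
```

Under these assumptions the throughput comes out at 0.1 to 1.0 GB/s, matching the slide, and the yearly volume lands in the petabyte range before any simulated data are added.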

Demanding Data Flow
Jürgen Knobloch
Real data:
  – CERN to others
Analyzed data:
  – All to all
Simulated data:
  – All to all

Virtual Screening Process
Docking:
  – Predict how small molecules bind to a receptor with known 3D structure.
Projects:
  – malariacontrol.net
  – WISDOM

WISDOM
  – Developing new drugs for neglected and emerging diseases with a particular focus on malaria.
  – Reduced R&D costs for neglected diseases
  – Accelerated R&D for emerging diseases
Three large calculations:
  – WISDOM-I (Q3’05), 1M compounds, 1 TB
  – Avian Flu (Q2’06), 300k compounds, 750 GB
  – WISDOM-II (Q4’06), 125M compounds, 2 TB
WISDOM calculations used FlexX from BioSolveIT (3-6k free, floating licenses) in addition to Autodock.
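Virtual screening suits a grid because each compound can be docked independently of the others. The slides do not say how WISDOM partitioned its compound libraries, so the following is only a generic sketch of splitting a library into fixed-size batches, one batch per job. The compound identifiers, the batch size, and the function name are all hypothetical.

```python
from itertools import islice


def batches(items, size):
    """Yield successive lists of at most `size` items from `items`."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


# Hypothetical compound identifiers; a real run would read a compound library.
compounds = [f"compound-{i:07d}" for i in range(10_000)]

# 250 compounds per job is an arbitrary choice for illustration.
jobs = list(batches(compounds, 250))

print(len(jobs), "jobs;", len(jobs[0]), "compounds in the first job")
```

Each batch could then be handed to a separate docking job; the batch size trades per-job overhead against the cost of rerunning a failed job.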

GEOSCOPE
  – Localization of earthquakes
  – Determination of rupture modes of the faults
  – Results within a few hours of major earthquakes
date = 6 Jan
time = 5:14:17
depth = 50.9 km
magnitude = 6.1
latitude = °
longitude = °

Noise Determination
Process complete data set on EGEE
  – 25 years of data
  – 28 seismological stations and data center
Impact on seismological data center design

Evolution
Evolve from R&D project(s) to service infrastructure.
  – Continuous evolution since European DataGrid (2001).
  – EGEE-III starts 1 May 2008 for 2 years.
  – Planned switch to EGI after EGEE-III.
Focus in EGEE-III
  – Support, Community Building, Advanced Functionality

Classic Support
Direct User Support
  – Use the GGUS grid ticketing system.
  – Provide team for handling tickets.
  – Documentation management and generation.
VO Support
  – Registration.
  – Resource allocation.
  – Better tools for VO management (user mgt., communication, …).
Application Porting Support
  – GILDA (INFN): porting/training using t-Infrastructure
  – GASuC (SZTAKI): porting to production grid infrastructure

Community Building
Build strong, self-reliant user communities.
Discipline-specific meetings
  – Techniques to aid each discipline (common data, tools, etc.)
  – Dissemination within that discipline
Topical Meetings
  – Discuss common problems or needs
  – Highlight tools/techniques to address those needs

Community Building
Conferences = Knowledge Transfer
  – Present results from using grid technology.
  – Discuss encountered problems and solutions.
  – Increase interactions between users.
UF1 (CERN), EGEE’06 (Geneva), UF2-OGF20 (Manchester), EGEE’07 (Budapest), UF3 (Clermont-Ferrand)

Advanced Functionality
gLite provides a reliable base for basic grid services.
Core functionality must be augmented with higher-level software/services to provide a complete stack for real applications.
Continue with “cluster” development
  – Proven model to provide tight link with user requirements
  – Focus on “general”, high-level tools that can benefit others
  – AMGA, Dashboard, GANGA/DIANE, …
RESPECT
  – Mechanism to highlight useful products that will work with gLite
  – Ensure “external” software provides support to user community

Working Groups
EGEE supports a series of working groups to develop solutions for common needs or problems.
Low-latency scheduling
Parallel applications (MPI)
Medical Data Management
Job priorities
Application portals
Database Access

Commercial Software
Gaussian
  – Predicts the energies, vibrational freq., … of molecular systems.
  – VO-based licensing model, currently in use in the gaussian VO.
MathWorks
  – Integrate MATLAB & Distributed Computing Engine with EGEE.
  – Both client and server are licensed in this model.
Interactive Supercomputing
  – Similar to DCE; used from multiple clients (MATLAB, Python, R)
  – Server licensed, some clients licensed

Conclusions
Scientists use the EGEE grid:
  – Routinely and heavily to speed up and enhance their analyses,
  – To federate (share) their resources, and
  – To collaborate effectively.
Long-term goals:
  – Provide infrastructure and full range of support services.
  – Have grid technology routinely used in scientific community.
  – Build strong, self-reliant user communities.

Useful Links
NA4 web site:
  – First point of contact for both new and existing users.
gLite documentation:
  – Documentation for core middleware functionality.
UIG “Use Cases”
  – http://egee-uig.web.cern.ch/egee-uig/production_pages/UIGindex.htm
  – Simple HOWTOs for common tasks
NA3 Training material
  – Comprehensive catalog of training materials.