Published by Oscar Parker. Modified over 9 years ago.
1
INFSO-RI-031688
Enabling Grids for E-sciencE
www.eu-egee.org
Grid Applications & Grid Services
C. Loomis (LAL-Orsay)
EMBRACE-3DEM (Madrid)
23 February 2007
2
Grid Apps. – C. Loomis – 11 November 2006 Enabling Grids for E-sciencE INFSO-RI-031688
Contents
Introduction
– EGEE project history
– Usage and users
Grid Application Families
Grid Software & Services
Summary
3
Evolution
EGEE: Enabling Grids for E-sciencE
– Two-year project funded by the European Commission.
– Provides computing infrastructure for e-science.
Evolution of Project (2001–now):
– European DataGrid: R&D
– EGEE: Re-engineering & Infrastructure
– EGEE-II: Infrastructure & Re-engineering
– EGEE-III: Same focus, in preparation
Evolution of Grid Users:
– Focus: Grid technology → Scientific results
– Goal: Grid technology → Grid as a tool
– Experience: IT experts → IT “minimalists”
More apps., larger grid
4
EGEE/LCG Production Service
> 175 sites
> 30 kCPU
> 13 PB
http://goc03.grid-support.ac.uk/googlemaps/lcg.html
5
Grid Virtual Organizations
Routine and large-scale use of EGEE infrastructure.
Virtual Organizations:
– 200+ visible on the grid
– 100+ registered with EGEE
– App. Deploy. Plan (https://edms.cern.ch/document/722131/2)
http://www3.egee.cesga.es/gridsite/accounting/CESGA/tree_vo.php
6
Usage History
[Chart: usage by Virtual Organization, Dec. ’05 to Nov. ’06]
http://www3.egee.cesga.es/gridsite/accounting/CESGA/tree_vo.php
Sharing and federation of resources make sense!
7
Scientific Domains
Astrophysics
– Planck, MAGIC
Computational Chemistry
Earth Science
– Hydrology, Pollution, Climate, Geophysics, …
Fusion
High-Energy Physics
– LHC, Tevatron, HERA, …
Life Sciences
– Medical Images, Bioinformatics, Drug Discovery
Related Projects
– Finance, Digital Libraries, …
And more…
8
Grid Benefits
Science is a balance between competition and cooperation. Grid appeals to both aspects.
Better use of resources:
– Sharing: faster turnaround with lower investment.
– Federation: reach previously unattainable scales.
Better science:
– Faster results: get published first!
– Higher quality: better statistics, more varied data.
Collaboration:
– Platform to bring different people with different skills together.
– Mechanism to publish, reuse, and combine previous data.
9
Job Submission
[Diagram] Components: User Interface, Resource Broker, Information System, Replica Catalog; Site 1 and Site 2, each with a Storage Element and a Computing Element.
Numbered steps: 1. submit, 2. query, 3. query, 4. submit, 5. retrieve, 6. retrieve
Other labels: publish status
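The flow in the diagram can be mimicked with a toy matchmaking model. This is only a sketch, not gLite code: it assumes one plausible reading of the flattened diagram (the broker queries the information system and the replica catalog, then forwards the job to a computing element), and every class, function, file name, and site name below is hypothetical. The ranking rule (prefer a site that already holds the input data, then the one with more free CPUs) is likewise an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ComputingElement:
    site: str
    free_cpus: int


# Hypothetical published state: what an information system might report.
INFORMATION_SYSTEM = [
    ComputingElement(site="site1", free_cpus=4),
    ComputingElement(site="site2", free_cpus=10),
]

# Hypothetical replica catalog: logical file name -> sites holding a copy.
REPLICA_CATALOG = {"lfn:/demo/input.dat": {"site1"}}


def match(input_file):
    """Steps 2-3: query both services and pick a computing element."""
    sites_with_data = REPLICA_CATALOG.get(input_file, set())
    candidates = [ce for ce in INFORMATION_SYSTEM if ce.free_cpus > 0]
    if not candidates:
        raise RuntimeError("no computing element has free CPUs")
    # Assumed ranking: data locality first, then number of free CPUs.
    return max(
        candidates,
        key=lambda ce: (ce.site in sites_with_data, ce.free_cpus),
    )


def submit(input_file):
    """Steps 1 and 4: accept the job and forward it to the chosen site."""
    ce = match(input_file)
    ce.free_cpus -= 1  # one CPU is now occupied by the job
    return ce.site
```

With this toy state, a job that needs `lfn:/demo/input.dat` is sent to `site1` because the data is already there, while a job with no registered replica goes to the site with the most free CPUs.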
10
Comp. Serv. (Gatekeepers)
LCG-CE (production)
– Modified GT2 gatekeeper with VOMS support.
– Not ported to SL4/VDT 1.3; supported until gLite CE is certified.
gLite CE (under test)
– Only direct interface is Condor-G.
– Possible to run pre-WS GRAM too; neither certified nor supported.
– Maybe possible to run WS GRAM.
CREAM CE (development)
– Native, proprietary web-service interface.
– Request to provide WS-GRAM interface in addition.
11
Comp. Serv. (Resource Brokers)
LCG-RB (production)
– Being phased out in favor of WMS.
gLite WMS (test)
– Will talk to old and new CE interfaces.
– Provides higher-level services: DAG, parameterized jobs, etc.
– Version deployed on production service, but not stable.
– Next version has been extensively tested and is much more robust.
GridWay (http://www.gridway.org/)
– Lighter weight, lower latencies than EGEE brokers.
– Standard DRMAA interface.
– Federation of EGEE and non-EGEE resources.
12
Comp. Serv. (Others)
Workflow
– TAVERNA, MOTEUR have been used.
– Need better web-service support for these tools.
Others
– GANGA/DIANE (ARDA): job management framework
– JJS (CC-IN2P3): Java job submission
13
Storage Services
Strategy: Follow SRM (Storage Resource Manager).
– Implementations provide SRMv1+ functionality.
– SRMv2+ will provide better access control possibilities.
DPM (CERN)
– Disk Pool Manager: only supports disk storage.
dCache (DESY)
– Supports tape and other backends.
– Very flexible, but complicated to install and configure.
Storage Resource Broker (SRB)
– Used by many disciplines for data and metadata management.
– Won’t be integrated; probably can be used on EGEE infrastructure.
14
Data Management Services
LCG File Catalog (LFC)
– Actually a general file catalog included as part of gLite.
– Currently has limited access control features.
File Transfer Service (FTS)
– Reliable file transfer service (i.e. batch system for data).
– Used only by LHC VOs now; could be used by others.
Hydra
– Key server for data encryption.
– Client in gLite; server (?).
gLite IO, Fireman (deprecated)
– Provide better ACL management and consistency.
– Functionality to be incorporated into standard services.
15
Transparent Data Access
ELFI
– Uses FUSE kernel module to expose “grid file system”.
– Limited to systems where FUSE is available (easier with SL4).
– Needs to allow users to mount the file system.
Parrot
– Intercepts system calls to provide grid data access.
– Resides completely in user space.
16
Metadata Services
AMGA
– Lightweight metadata catalog developed in ARDA.
– Allows distribution and federation of servers.
– Clients in gLite; server (?).
OGSA-DAI
– Generic, secured interface to databases.
– Works but has scalability and performance problems.
– Integration not likely in the near future.
GDSE (Grid Data Source Engine)
– Developed by INFN.
– Generic interface to data sources (DBs included).
17
Information Systems
Strategy
– Keep BDII-based information system for medium-term.
– Need something faster and more scalable for longer term.
– GLUE schema will evolve with needs of apps. and projects.
  § Version 2 should be completely (?) service-based.
BDII (production)
– LDAP-based information system.
– Contains all published information.
– Used for service discovery and service status.
R-GMA
– Producer-consumer deployment model.
– Specialized uses: accounting and some application monitoring.
18
Security
Security infrastructure is mature; no significant changes in the short to medium-term.
– Certificate Authority services
– VOMS
– LCAS/LCMAPS
– Proxy renewal
Significant work to integrate these with all services!
Potential new services:
– Hydra: Data encryption key server
– G-PBOX: distribution of VO-specific policies
19
Accounting
Two competing/cooperating systems for collecting and presenting accounting information.
– APEL
  § Works only for computing-related usage.
  § Has (partial) usage information since early 2005.
  § Uses R-GMA for collecting the accounting information.
– DGAS
  § General framework for collecting and metering usage.
  § Probably included in next release of gLite.
Developers have agreed to use the same accounting sensors for collecting information.
20
Important Core Changes
Move from SL3 to SL4
– Change from 2.4 to 2.6-series kernel.
  § Provides better support for new hardware.
  § Better performance on multi-CPU systems.
– Minor version change of GCC compiler.
VDT (Virtual Data Toolkit)
– Change from VDT 1.2 to 1.3
  § Compatibility with latest Globus Toolkit™.
  § Should have web service interfaces available.
Decision made to stop integration of new developments until August 2007 to refactor code and rationalize dependencies.
21
Service Integration Policies
EGEE-II users need third-party products:
– “Core” only provides low-level services.
– Third-party products better meet the high-level service needs of applications.
– They allow applications a choice of several high-level services.
RESPECT: Recommended External Software Packages for EGEE Communities
– Registry for useful, external software for EGEE scientists.
– Final stages of approval within EGEE.
– List will appear on the NA4 web site.
– Developers must provide support and binary packages.
22
Application Families
– Simulation
– Bulk Processing
– Responsive Apps.
– Workflow
– Parallel Jobs
– Legacy Applications
23
Simulation
Examples
– LHC Monte Carlo simulation
– Fusion
– WISDOM
Characteristics
– Jobs are CPU-intensive
– Large number of independent jobs
– Run by few (expert) users
– Small input; large output
Needs
– Batch-system services
– Minimal data management for storage of results
[Images: ATLAS, ITER]
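Because the jobs in this family are independent, a large simulation is typically cut into equal-sized pieces before submission. The sketch below only illustrates that partitioning step; the function name and the numbers in the usage note are hypothetical and are not taken from any EGEE tool.

```python
def split_into_jobs(n_items, items_per_job):
    """Return (start, stop) index ranges, one per independent job.

    Each range is half-open: the job processes items start..stop-1.
    The last job may be shorter than the others.
    """
    if n_items < 0 or items_per_job <= 0:
        raise ValueError("n_items must be >= 0 and items_per_job > 0")
    return [
        (start, min(start + items_per_job, n_items))
        for start in range(0, n_items, items_per_job)
    ]
```

For instance, `split_into_jobs(1_000_000, 500)` describes 2000 jobs that can be submitted and retried independently of one another.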
24
Virtual Screening Process
Docking:
– Predict how small molecules bind to receptor with known 3D structure.
Projects:
– Proteins@Home
– Rosetta@home
– Docking@Home
– AFRICA@home
– malariacontrol.net
– WISDOM
25
WISDOM
WISDOM (http://wisdom.healthgrid.org/)
– Developing new drugs for neglected and emerging diseases with a particular focus on malaria.
– Reduced R&D costs for neglected diseases
– Accelerated R&D for emerging diseases
Three large calculations:
– WISDOM-I (Summer 2005)
– Avian Flu (Spring 2006)
– WISDOM-II (Autumn 2006)
WISDOM calculations used FlexX from BioSolveIT (3-6k free, floating licenses) in addition to Autodock.
26
Docking Results
Calculation         Targets              Compounds  CPU-years  Duration (wk)  Max. CPUs  Size of Results (TB)
WISDOM-I (Q3’05)    PBD                  1M         80         6              1700       1
Avian Flu (Q2’06)   H5N1                 300k       105        6              1700       0.750
WISDOM-II (Q4’06)   GST, DHFR, Tubulin   125M       240        8              5000       2
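The table gives total CPU-years and wall-clock duration, from which the average number of CPUs kept busy follows directly: CPU-years × 52 / weeks. The sketch below works that out, assuming the flattened table reads as 80, 105, and 240 CPU-years over 6, 6, and 8 weeks with peaks of 1700, 1700, and 5000 CPUs; the 52-week year is an approximation and the helper name is hypothetical.

```python
WEEKS_PER_YEAR = 52  # approximation, good enough for an estimate

# (name, cpu_years, duration_weeks, max_cpus) as read from the table above.
RUNS = [
    ("WISDOM-I", 80, 6, 1700),
    ("Avian Flu", 105, 6, 1700),
    ("WISDOM-II", 240, 8, 5000),
]


def average_cpus(cpu_years, duration_weeks):
    """Mean number of CPUs kept busy over the whole run."""
    return cpu_years * WEEKS_PER_YEAR / duration_weeks


for name, cpu_years, weeks, max_cpus in RUNS:
    avg = average_cpus(cpu_years, weeks)
    share = avg / max_cpus
    print(f"{name}: ~{avg:.0f} CPUs on average, {share:.0%} of the peak")
```

On these figures the average load sits well below the quoted peak in every run, which is what one would expect from opportunistic use of shared resources.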
27
Benefits from Grid
Computing Resources
– Provided a large number of CPUs that would not have been available had they needed to be purchased.
Storage Resources
– Ability to hook storage for results to grid.
– Ability to make permanent backups of the data.
Tools
– Job management tools to handle millions of jobs.
– Tools for collecting and storing results from calculations.
– Data management tools for collating the data and making it available to others.
Collaboration
– Platform engendered new human collaboration and provides an environment in which to share and analyze data efficiently.
28
Continued Analysis
WISDOM-I: Molecular dynamics
– 5k best plasmepsin docking compounds are being reanalyzed using molecular dynamics codes
– Need more “classic” parallel resources, either MPI on EGEE or use of supercomputers through DEISA
Avian Flu:
– Top 5% of compounds will be refined through other methods
– From top 5% of compounds:
  § structure clustering will be done for wet-lab assay
  § 50+ compounds will be assayed experimentally (GRC, Academia Sinica, Taiwan)
WISDOM-II:
– Post-docking filtering and analysis.
29
Bulk Processing
Examples
– HEP processing of raw data, analysis
– Earth observation data processing
Characteristics
– Widely-distributed input data
– Significant amount of input and output data
Needs
– Job management tools (workload management)
– Meta-data services
– More sophisticated data management
30
Responsive Apps. (I)
Examples
– Prototyping new applications
– Monitoring grid operations
– Direct interactivity
Characteristics
– Small amounts of input and output data
– Not CPU-intensive
– Short response time (few minutes)
Needs
– Configuration which allows “immediate” execution (QoS)
– Services must treat jobs with minimum latency
31
Responsive Apps. (II)
Grid as a backend infrastructure:
– gPTM3D: interactive analysis of medical images
– GPS@: bioinformatics via web portal
– DILIGENT: digital libraries
– Volcano sonification
Characteristics
– Rapid response: a human waiting for the result!
– Many small but CPU-intensive tasks
– User is not aware of “grid”!
Needs
– Interfacing (data & computing) with non-grid application or portal
– User and rights management between front-end and grid
32
gPTM3D
PTM3D:
– Interactive analysis of 3D data for surgery planning and volumetric analysis.
– Requires “guiding” from physician to find initial contours, work around noisy data, …
– Needs unplanned, interactive access to significant computational resources.
33
Results
Speed-up gives response times acceptable to doctors.
Grid overhead doesn’t dominate for short calculations.
Requires application modifications to use with grid.
Case        Dataset (MB)  Input (MB)  Output (MB)  Tasks  1 CPU (s)  EGEE (s)
Sm. body    87            3           6            169    315        37
Med. body   210           9.6         57           378    1980       150
Lg. body    346           15          86           676    1080       123
Lungs       87            0.4         2.3          95     36         24
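The speed-up claimed above is simply the single-CPU time divided by the time on EGEE. The sketch below computes it per case, assuming the flattened table reads as 315/37, 1980/150, 1080/123, and 36/24 seconds; the function and variable names are hypothetical.

```python
# (case, tasks, single_cpu_seconds, egee_seconds) as read from the table above.
RESULTS = [
    ("Sm. body", 169, 315, 37),
    ("Med. body", 378, 1980, 150),
    ("Lg. body", 676, 1080, 123),
    ("Lungs", 95, 36, 24),
]


def speedup(single_cpu_seconds, egee_seconds):
    """Ratio of sequential run time to run time on the grid."""
    return single_cpu_seconds / egee_seconds


for case, tasks, t_single, t_grid in RESULTS:
    ratio = speedup(t_single, t_grid)
    print(f"{case}: {ratio:.1f}x faster using {tasks} tasks")
```

On this reading the shortest calculation (Lungs) gains the least, since a fixed submission overhead weighs more heavily on a 36-second job, yet the grid run still finishes sooner than the single-CPU run.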
34
Workflow
Examples
– “Bronze Standard”: image registration
– Flood prediction
Characteristics
– Use of grid and non-grid services
– Complex set of algorithms for the analysis
– Complex dependencies between individual tasks
Needs
– Tools for managing the workflow itself
– Standard interfaces for services (i.e. web services)
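Dependencies between tasks are naturally described as a directed acyclic graph, and a workflow manager must run each task only after its prerequisites have finished. The sketch below shows that ordering step with Python's standard `graphlib` module (Python 3.9+). It is not how TAVERNA or MOTEUR are implemented, and the task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each task maps to the set of tasks it depends on.
WORKFLOW = {
    "fetch_images": set(),
    "register_a": {"fetch_images"},
    "register_b": {"fetch_images"},
    "compare": {"register_a", "register_b"},
}


def execution_waves(workflow):
    """Group tasks into waves; tasks within a wave can run concurrently."""
    sorter = TopologicalSorter(workflow)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for a stable order
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave as finished
    return waves
```

Here the two registration tasks land in the same wave, so they could be submitted to the grid as independent jobs, while `compare` waits for both.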
35
Parallel Jobs
Examples
– Climate modeling
– Earthquake analysis
– Computational chemistry
Characteristics
– Many interdependent, communicating tasks
– Many CPUs needed simultaneously
– Use of MPI libraries
Needs
– Configuration of resources for flexible use of MPI
– Pre-installation of optimized MPI libraries
36
Legacy Applications
Examples
– Commercial or closed source binaries
– Geocluster: geophysical analysis software
– FlexX: molecular docking software
– Matlab, Mathematica, …
Characteristics
– Licenses: control access to software on the grid
– No recompilation → no direct use of grid APIs!
Needs
– License server and grid deployment model
– Transparent access to data on the grid
37
Summary & Conclusions
We observe routine and large-scale use of the EGEE infrastructure by a large, diverse set of user communities.
Present:
– Grid is a collaborative platform: 10+ domains, 200+ VOs.
– Grid enables sharing of resources and data for better science.
Future:
– Responsiveness: Applications requiring quality-of-service.
– Workflow: Use of different infrastructures, instruments.
– Bigger role for third-party software for applications on grid.