E-Science Technology/Middleware (Grid, Cyberinfrastructure) Gap Analysis e-Science Town Meeting Strand Palace Hotel May 14 2003 Geoffrey Fox, Indiana University.


e-Science Technology/Middleware (Grid, Cyberinfrastructure) Gap Analysis
e-Science Town Meeting, Strand Palace Hotel, May 14 2003
Geoffrey Fox, Indiana University; David Walker, Cardiff University
Note: in this report the terms e-Science Technology/Middleware, Grid, and Cyberinfrastructure are NOT distinguished.

Features of Study
Draft report distributed to TAG in April:
– A: Summary
– B: Technology/Project/Worldwide Service Context
– C: Gaps by Category
– D: Appendix of UK activities of relevance
– E: Action Plan for OMII
Interviewed 80 people, giving reasonably complete coverage within the UK.
Extracted and categorized over 120 comments (gaps).
Developed an action plan that could be used to guide the Core e-Science effort (UK OMII, the Open Middleware Infrastructure Initiative) to produce robust, useable e-Science (Grid) infrastructure by 2006.
The interview part of the project ran from mid-February to early April; we are currently adding TAG comments and completing the worldwide Service context (largely literature/web-based, not interviews).
– Integrating UK and Worldwide Service studies with uniform terminology/classification.
Overall the study is "85% finished".

Features of Gap Analysis
Examined requirements and services already understood/developed for e-Science (reasonably broad coverage) and for e-Business, e-Government and e-Services (inevitably rather spotty coverage).
Gaps divided into four broad areas:
– Near-term Technical
– Education and Support
– Research (not well separated from Near-term Technical)
– Perception and Organization
The Appendix lists over 60 significant UK services (perhaps clustered together) and tools, in the context of a total of some 150 worldwide Grid services.

Categorization of Technical Gaps and Grid Services
[Diagram: layered categorization of technical gaps, from generic through resource-specific to application-specific Grid services built over compute resources and the network. Categories: Architecture and Style (8.1); Basic Technology, Runtime and Hosting Environment (8.2); Security (8.3); Workflow (8.4); Notification (8.5); Meta-data (8.6); Information (8.7); Compute/File (8.8); Other (8.9); Portals and PSEs (8.10); Network (8.11).]

Taxonomy of Grid Functionalities
– Compute/File Grid: run multiple jobs with distributed compute and data resources (a global "UNIX shell")
– Desktop Grid: "Internet computing" and "cycle scavenging" with a secure sandbox on large numbers of untrusted computers
– Information Grid: Grid service access to distributed information, data and knowledge repositories
– Complexity or Hybrid Grid: hybrid combination of Information and Compute/File Grids emphasizing integration of experimental data, filters and simulations
– Campus Grid: Grid supporting university community computing
– Enterprise Grid: Grid supporting a company's enterprise infrastructure
Note: the term "Data Grid" is not used consistently in the community, so it is avoided here.

Complexity Grid Computing Model
[Diagram: distributed data filters, exposed as OGSA-DAI Grid services, massage data for an HPC simulation; other Grid and Web services provide analysis, control and visualization. This type of Grid integrates with parallel computing, e.g. HPC(x).]
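To make the data-flow idea concrete, here is a minimal hedged sketch in Java of a filter chain feeding a simulation. The DataFilter, SimulationService and FilterPipeline names are illustrative assumptions, not OGSA-DAI or Globus interfaces; a real Complexity Grid would bind them to distributed Grid/Web services.

```java
// Minimal sketch (not from the report) of the Complexity Grid data-flow idea:
// a chain of filter services massages raw experimental data before it is
// handed to an HPC simulation service. All interfaces here are hypothetical.
import java.util.List;

interface DataFilter {
    byte[] apply(byte[] input);          // e.g. calibrate, clean, re-grid
}

interface SimulationService {
    String submit(byte[] preparedInput); // returns a job handle, e.g. on HPC(x)
}

class FilterPipeline {
    private final List<DataFilter> filters;
    private final SimulationService simulation;

    FilterPipeline(List<DataFilter> filters, SimulationService simulation) {
        this.filters = filters;
        this.simulation = simulation;
    }

    // Run the distributed filters in sequence, then start the simulation.
    String run(byte[] rawExperimentalData) {
        byte[] data = rawExperimentalData;
        for (DataFilter f : filters) {
            data = f.apply(data);
        }
        return simulation.submit(data);
    }
}
```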

Taxonomy of Grid Operational Style
– Semantic Grid: integration of Grid and Semantic Web meta-data and ontology technologies
– Peer-to-peer Grid: Grid built with peer-to-peer mechanisms
– Lightweight Grid: Grid designed for rapid deployment and minimum life-cycle support costs
– Collaboration Grid: Grid supporting collaborative tools like the Access Grid, whiteboards and shared applications
– R3 or Autonomic Grid: fault-tolerant and self-healing Grid (R3 = Robust, Reliable, Resilient)

“Central” Architecture/Functionality/Style Gaps
Substantial comments on “hosting environments”, OGSI and “permeating principles”.
– Agreement on the Web service model.
[Diagram: layered architecture with 1: Hosting Environment (WS); 2: OGSI Web service Enhancements; 3: Permeating Principles and Policies; 4: Key OGSA Services forming the “Central Services and Architecture” (Central Gaps); and 5: OGSA-compliant System Grid Services and 6: Domain-Specific (Application) Grid Services as “Modular” Services natural for distributed teams (Specific Gaps).]

An OGSA Grid Architecture in detail (from GGF GPA)

Permeating Principles and Policies
Meta-data-rich, message-linked Web Services as the permeating paradigm.
"User" Component Model such as Enterprise JavaBeans (EJB) or .NET.
Service Management framework including a possible Factory mechanism.
High-level Invocation Framework describing how you interact with system components.
– This could for example be used to allow the system to be built from either W3C-style or GGF-style (OGSI) Web Services and to protect the user from changes in their specifications (a minimal sketch follows below).
Security is a service, but the need for fine-grain selective authorization encourages a Policy context that sets the rules for each particular Grid.
– Currently OGSA supports policies for routing, security and resource use.
The Grid Fabric, or set of resources, needs mechanisms to manage it, including automatic recording of meta-data and configuration of software.
Quality of service (QoS) for the Network, which implies performance monitoring and bandwidth reservation services.
– Challenging, as end-to-end and not just backbone QoS is needed.
Messaging systems like MQSeries from IBM provide robustness through asynchronous delivery, can abstract the destination, and allow customization of content such as converting between different interface specifications.
Messaging is built on transport mechanisms which can be used to implement QoS and to virtualize ports.
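As a rough illustration of the invocation-framework principle above, the following Java sketch (all names are hypothetical, not an existing toolkit API) shows how application code can depend only on a logical service name and an abstract invoker, so the underlying W3C or OGSI binding can change without touching user code.

```java
// Minimal sketch (an assumption, not the report's design) of a high-level
// invocation framework that hides whether a component is exposed as a plain
// W3C-style Web Service or as an OGSI Grid Service.
import java.util.Map;

interface ServiceInvoker {
    // Invoke a named operation on a service identified only by a logical name;
    // the framework resolves the binding (WSDL/SOAP, OGSI, ...) behind the scenes.
    Object invoke(String logicalServiceName, String operation, Map<String, Object> arguments)
            throws ServiceInvocationException;
}

class ServiceInvocationException extends Exception {
    ServiceInvocationException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Application code depends only on the invoker, so a change in the underlying
// service specification needs a new invoker implementation, not new user code.
class GridClient {
    private final ServiceInvoker invoker;

    GridClient(ServiceInvoker invoker) { this.invoker = invoker; }

    Object runJob(Map<String, Object> jobDescription) throws ServiceInvocationException {
        return invoker.invoke("job-submission", "submit", jobDescription);
    }
}
```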

World Wide Grid Service Activities I
This was implicit in the original report for TAG and is now being made explicit, based on the interviews plus a survey of major worldwide activities.
Commercial activities, especially those of IBM, Avaki, Platform, Sun, Entropia and United Devices.
The GT2 and GT3 Globus Toolkits. Here we are effectively covering not just the Globus team but also the major projects, such as the NASA Information Power Grid, that have blazed the trail of "productizing" Grids.
– Note that we can "already" see GT3 (Grid Service) like functionality from GT2 wrapped with the various (Java, Perl, Python, CORBA) CoG kits, so GT2 capabilities can be classified as Services.
Trillium (GriPhyN, iVDGL and PPDG) and NEESgrid, the major NSF (DoE for PPDG) projects in the USA.
– Condor from the University of Wisconsin, which is being integrated into Grid services through the Trillium and NMI activities.
The NSF Middleware Initiative (NMI), packaging a suite of Globus, Condor and Internet2 software.
– This has overlaps with the VDT (Virtual Data Toolkit from GriPhyN).

World Wide Grid Service Activities II
Unicore (GRIP), GridLab, the European Data Grid (EDG) and the LCG (LHC Computing Grid).
– There are many other (around 20) EU projects, but these have most of the technology development.
The Storage Resource Broker (SRB-MCAT) from SDSC.
The DoE Science Grid and related activities such as the Common Component Architecture (CCA) project.
Examination of services from a collection of portal projects in the US from Argonne, Indiana, Michigan, NCSA and Texas.
– This includes best-practice discussion on portals from the Global Grid Forum.
Review of contributions to the recent book Grid Computing: Making the Global Infrastructure a Reality, edited by Fran Berman, Geoffrey Fox and Tony Hey, John Wiley & Sons, Chichester, England, March 2003.
– This includes other major projects like Cactus, NetSolve and Ninf.
Some 6 Core and other application-specific UK e-Science projects.

Categories of Worldwide Grid Services
Types of Grid
– R3
– Lightweight
– P2P
– Federation and Interoperability
Core Infrastructure and Hosting Environment
– Service Management
– Component Model
– Service Wrapper/Invocation
– Messaging
Security Services
– Certificate Authority
– Authentication
– Authorization
– Policy
Workflow Services and Programming Model
– Composition/Development
– Languages and Programming
– Compiler
– Enactment Engines (Runtime)
Notification Services
Metadata and Information Services
– Basic, including Registry
– Semantically rich Services and meta-data
– Information Aggregation (events)
– Provenance
Information Grid Services
– OGSA-DAI/DAIT
– Integration with compute resources
– P2P and database models
Compute/File Grid Services
– Job Submission
– Job Planning, Scheduling and Management
– Access to Remote Files, Storage and Computers
– Replica (cache) Management
– Virtual Data
– Parallel Computing
Other Services, including
– Grid Shell
– Accounting
– Fabric Management
– Visualization, Data-mining and Computational Steering
– Collaboration
– Portals and Problem Solving Environments
Network Services
– Performance
– Reservation
– Operations

Features of Worldwide Grid Services
UK activities have a strong Web service and Information Grid emphasis.
– There are important compute/file activities as well (White Rose, RealityGrid, the UK part of EDG, etc.).
Non-UK activities are dominantly focused on compute/file Grids:
– Submit jobs in distributed UNIX-shell (Gridshell) fashion.
– Gather data from instruments (accelerator, satellite, medical device) and process it in batch mode, mapping between filesets.
Little emphasis on lightweight or R3 Grids, but NSF in the USA and EDG have aimed at better support and software quality.
– EDG has a useful "tension" between technology-focused and application-focused working groups.
– NMI and even GT3 have changed packaging and added a service view, but have not changed the "underlying" architecture for robustness.
Coordinated set of portal activities in the USA.
Little work on integrating parallel computing and the Grid, although TeraGrid in the USA could change this.
Gaps are omissions or deficiencies in UK or worldwide Grid services of importance to UK e-Science.

Central Gaps: Gaps in Grid Styles and Execution Environment
Need identified for both robust (fault-tolerant) and lightweight (suitable for small groups) Grid styles.
– The peer-to-peer style supports smaller, decentralized virtual organizations.
Noted opportunities for modern middleware ideas to be used: lightweight, message-based.
Noted that Enterprise JavaBeans are not optimized for Science, which has high-volume dataflow.
A Federated Grid Architecture is natural for integration of heterogeneous functionality, style and security.
Bioinformatics and other fields require integration of Information and Compute/File Grids.

Overlapping Heterogeneous Dynamic Grid Islands
[Diagram: overlapping Grid "islands", including Information, Enterprise, Compute and Campus Grids (R1, R2), together with a dynamic, lightweight peer-to-peer collaboration/training Grid linking a teacher and students.]

[Diagram (a), Layered OGSA Grid: application services sit on top of core services behind a single OGSA interface. Diagram (b), Federated OGSA Grid: two Grids (Grid-1 and Grid-2), each with its own core and application services and its own OGSA or non-OGSA interface, are joined through an OGSA mediation layer.]

Many Gaps in Generic Services
Some gaps, like Workflow and Notification, are about making production versions of current projects.
– The Appendix shows workflow from DAME, DiscoveryNet, EDG, Geodise, ICENI, myGrid and Unicore, plus Cardiff, NEReSC ....
RGMA and the Semantic Grid offer improved meta-data and Information services compared to UDDI and MDS (Globus).
– Need a comprehensive federated Information service.
Security requires an architecture supporting dynamic fine-grain authorization (illustrated in the sketch below).
UK e-Science has pioneered Information Grids, but the gap is continuation of OGSA-DAI, integration with other services, and P2P decentralized models.
The functionality of Compute/File Grids is quite advanced, but the services are probably not robust enough for LCG or Campus Grids.
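A minimal sketch of what dynamic fine-grain authorization means in practice, assuming a simple policy interface of our own invention (not GSI, CAS or any existing authorization service): every operation on every resource is checked against request-time policy rather than a static credential alone.

```java
// Hypothetical fine-grain authorization: each request is checked against
// policy at the level of an individual operation and resource, using
// request-time context, rather than an all-or-nothing certificate check.
import java.util.Map;

interface AuthorizationPolicy {
    // Decide whether a subject may perform an operation on a resource,
    // given request-time context (VO role, time of day, delegation chain, ...).
    boolean permits(String subjectDn, String operation, String resource,
                    Map<String, String> context);
}

class FineGrainAuthorizer {
    private final AuthorizationPolicy policy;

    FineGrainAuthorizer(AuthorizationPolicy policy) { this.policy = policy; }

    void authorize(String subjectDn, String operation, String resource,
                   Map<String, String> context) {
        if (!policy.permits(subjectDn, operation, resource, context)) {
            throw new SecurityException(
                subjectDn + " is not authorized to " + operation + " on " + resource);
        }
    }
}
```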

Gaps in Other Grid Services
Portals and User Interfaces: noted the gap that projects are not using Grid Computing Environment "best practice", with component-based user interfaces matching component-based middleware (see the sketch below).
Programming Models (using the workflow runtime).
Fabric Management (should be integrated with central service management and the Information system), Computational Steering, Visualization, Data-mining, Accounting, Gridmake, Debugging, Semantic Grid tools (consistent with the Information system), Collaboration, Provenance.
Application-specific services.
Note: the new production central infrastructure can support both research and production services of this type.
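The component-based portal idea can be illustrated with a small hypothetical sketch (this is not the Jetspeed API or any portlet standard): each Grid service contributes a user-interface component, and the portal aggregates the fragments into one page, mirroring the component structure of the middleware.

```java
// Hypothetical portal components: one small UI component per Grid service,
// aggregated by a portal page, so the user interface mirrors the
// component-based middleware underneath.
import java.util.ArrayList;
import java.util.List;

interface GridPortlet {
    String title();
    String renderHtmlFragment(String userId); // UI fragment for one Grid service
}

class PortalPage {
    private final List<GridPortlet> portlets = new ArrayList<>();

    void add(GridPortlet portlet) { portlets.add(portlet); }

    String render(String userId) {
        StringBuilder page = new StringBuilder("<html><body>");
        for (GridPortlet p : portlets) {
            page.append("<h2>").append(p.title()).append("</h2>")
                .append(p.renderHtmlFragment(userId));
        }
        return page.append("</body></html>").toString();
    }
}
```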

NCSA Jetspeed Computing Portal

Some Non-Technical Gaps (Sections 9 and 11)
Some confusion as to the "future" of Grid software and how projects should evolve to match the evolution of Globus, OGSA etc.
Correspondingly, special attention is needed to education (training) in rapidly changing technologies.
Need dedicated testbeds and repositories.
Current e-Science projects are typically aimed at "demonstrator" rather than broadly deployable "production" software.
– This was the correct initial strategy, and it supports a new focus for the next phase of core e-Science.
ACTION PLAN: a Technology Repository and Testbed Team, Architecture and Project Coordination, and Distributed Sub-project Teams (detailed below).

Action Plan (OMII) Structure
Technology Repository and Testbed Team
– Compliance testing
– Tracking and training coordination, with pro-active alerting on technology status/directions
– Approximately 6 people
Architecture and Project Coordination
– Agile software engineering and project management
– Central technology architecture and development
– Work with an Advisory Board, meeting about once per month initially
– 6-12 "professional" people in 1-2 sites
– Clear relationship to application requirements
Distributed Sub-project Teams
– "Independent" activities as now, but aiming at deployable production software
Set of focused workshops to refine key services and architecture
– e.g. service management, messaging, workflow, integration of OGSA-DAI with Compute/File Grids (just a representative set)

Central Action Plan Projects
Develop Grid infrastructure supporting:
– Robust Reliable Resilient (R3) style: Essential
– Lightweight style: Desirable
– Peer-to-peer style: Desirable
This could involve asynchronous messaging, federated security (fine-grain authorization), an "e-ScienceBean", notification (as part of service management), and invocation frameworks "virtualizing" the service component structure (a messaging sketch follows below).
Integrate network monitoring/reservation/management, including end-to-end network operations.
Support critical policies like security and provenance.
Powerful service management (research needed here).
Need to either federate and/or interoperate a world of "Grid Islands".
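For the asynchronous-messaging ingredient, a minimal in-process sketch follows (hypothetical, not MQSeries or any real messaging middleware; a production R3 Grid would use persistent, network-spanning queues): producers enqueue requests and consumers drain them later, so a temporarily unavailable service does not lose work.

```java
// Toy in-process message queue illustrating asynchronous, decoupled delivery,
// one ingredient of a robust (R3) Grid. Real deployments would use durable,
// distributed messaging rather than a single JVM queue.
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class MessageQueue {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    void publish(String message) {
        queue.add(message);                 // producer returns immediately
    }

    String receive() throws InterruptedException {
        return queue.take();                // consumer blocks until a message arrives
    }
}

class Demo {
    public static void main(String[] args) throws InterruptedException {
        MessageQueue jobs = new MessageQueue();
        jobs.publish("submit: simulation-42");   // target service may be down right now
        // ... later, when the job-submission service is available again:
        System.out.println("Handling " + jobs.receive());
    }
}
```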

Essential Services in Action Plan (layer 4)
Workflow runtime supporting transactions and high-volume dataflow (see the sketch below).
– Different e-Science programming models/languages can use the same runtime and be developed independently.
Federated Distributed Information System
– From low-level service registration through to high-level semantic metadata (separated or integrated).
– Support for service semantics was the most quoted "gap" (Semantic Grid leadership important).
– Support P2P, central (MDS-style) and service-based (SDE) metadata.
– Here, as elsewhere, we can collaborate with GT3, EDG ....
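A minimal sketch of the shared workflow-runtime idea, with hypothetical interfaces rather than any defined OMII or GGF API: different workflow languages compile to a common sequence of steps, and one enactment engine runs that sequence.

```java
// Hypothetical shared workflow runtime: workflow languages compile down to a
// common list of steps; a single enactment engine runs the list, piping each
// step's outputs into the next step's inputs.
import java.util.List;
import java.util.Map;

interface WorkflowStep {
    String name();
    Map<String, Object> execute(Map<String, Object> inputs) throws Exception;
}

class WorkflowEngine {
    Map<String, Object> enact(List<WorkflowStep> steps, Map<String, Object> initialInputs)
            throws Exception {
        Map<String, Object> data = initialInputs;
        for (WorkflowStep step : steps) {
            data = step.execute(data);   // a production engine would add transactions,
        }                                // retries and high-volume streaming here
        return data;
    }
}
```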

Specific Grid Services (layers 5, 6)
Core Domain Grid Services cover the critical services for the major Grid functionalities:
– Information Grid: OGSA-DAIT.
– Compute/File Grid: work with LCG, EDG (and its follow-on) and Trillium (USA) on robust infrastructure.
  The new central (R3) architecture affects strategy; include Campus Grid support.
– Hybrid Grids (Complexity Grids) integrating computing (filters, transformations), possibly on major parallel computing facilities, with data repository access, for Bioinformatics, Environmental (Earth) Science, Virtual Observatories ....
Other services as identified in the Gap Analysis, with distributed teams working on different services in concert with the central team for software engineering and OGSA interfaces as appropriate.