Managing Computational Activities on the Grid - from Specifications to Implementation: The GLUE 2 information model
  OGF25, 2nd March 2009, Catania
  Balázs Kónya, Lund University/NorduGrid
  Special thanks to Sergio Andreozzi, co-chair of the OGF GLUE WG, for many of the slides
Agenda
  Context and Problem Description
  Pre-GLUE 2 schemas
    – GLUE 1.X ( )
    – NorduGrid schema (2001-)
  GLUE 2 Working Group of Open Grid Forum
    – Birth of a specification
  Insight on GLUE 2 Info model
Context and Problem Description
Grid as a multi-institutional infrastructure
  Intra-site: resource, local credential
  Inter-site: seamless collaboration, Grid-level credential, virtual organization
  Need for a common language!
Problem Statement
  How do we describe resources shared in Grid systems in order to enable:
    – Resource awareness
    – Resource discoverability
    – Resource requirements expression
    – Basic resource monitoring
Use Case 1 (discovery)
  I want to run my job on an execution environment characterized by:
    – OS Linux, Distribution X, version Y
    – CPU Architecture IA64
    – Available software packages: S1, S2
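A minimal sketch of how a discovery query of this kind could be evaluated over published resource descriptions. The record layout, the field names and the example.org endpoints are assumptions made for illustration; they are not attribute names defined by any of the schemas discussed in this talk.

```python
from dataclasses import dataclass, field


@dataclass
class ExecutionEnvironment:
    """Illustrative record; field names are hypothetical, not schema attributes."""
    endpoint: str
    os_family: str
    distribution: str
    version: str
    architecture: str
    software: set = field(default_factory=set)


def discover(environments, os_family, distribution, version,
             architecture, required_software):
    """Return the environments that satisfy every stated requirement."""
    required = set(required_software)
    return [
        env for env in environments
        if env.os_family == os_family
        and env.distribution == distribution
        and env.version == version
        and env.architecture == architecture
        and required <= env.software  # all required packages are installed
    ]


# Hypothetical catalogue, as it might be returned by an information system.
catalogue = [
    ExecutionEnvironment("ce-a.example.org", "Linux", "X", "Y", "IA64", {"S1", "S2", "S3"}),
    ExecutionEnvironment("ce-b.example.org", "Linux", "X", "Y", "IA64", {"S1"}),
    ExecutionEnvironment("ce-c.example.org", "Linux", "X", "Y", "x86_64", {"S1", "S2"}),
]

matches = discover(catalogue, "Linux", "X", "Y", "IA64", ["S1", "S2"])
print([env.endpoint for env in matches])
```

Only the first entry matches: the second lacks package S2 and the third has a different architecture.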
Use Case 2 (monitoring)
  I want to know:
    – how many job slots are used by members of VO A
    – what the global available storage space is for the users of VO B
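A minimal sketch of the monitoring use case: per-VO figures are obtained by summing what each site publishes. The dictionary keys, site names and numbers below are invented for illustration and do not come from a real information system.

```python
from collections import defaultdict


def used_slots_by_vo(computing_shares):
    """Sum the used job slots of every computing share, grouped by VO."""
    totals = defaultdict(int)
    for share in computing_shares:
        totals[share["vo"]] += share["used_slots"]
    return dict(totals)


def free_storage_by_vo(storage_shares):
    """Sum the free space (in TB) of every storage share, grouped by VO."""
    totals = defaultdict(int)
    for share in storage_shares:
        totals[share["vo"]] += share["free_tb"]
    return dict(totals)


# Hypothetical records published by three sites.
computing_shares = [
    {"site": "site-1", "vo": "A", "used_slots": 12},
    {"site": "site-2", "vo": "A", "used_slots": 30},
    {"site": "site-2", "vo": "B", "used_slots": 5},
]
storage_shares = [
    {"site": "site-1", "vo": "B", "free_tb": 10},
    {"site": "site-3", "vo": "B", "free_tb": 4},
]

slots = used_slots_by_vo(computing_shares)
storage = free_storage_by_vo(storage_shares)
print(slots["A"], storage["B"])
```

Such aggregation is only meaningful if every site reports slots and capacity with the same semantics and units, which is the motivation for a common schema.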
Grid Information System
  – Where can I run a job requiring OS Linux, IA64 architecture, with software packages X and Y?
  – As part of VO A, how much storage can I use on the Grid?
  – I can offer IA64 machines with OS Linux using the BES interface to users of the BLUE VO
  – I offer 15 TB of storage; 10 TB are free and usable by the GREEN VO
Modeling Guidelines
  Generalization
    – capture common aspects for different entities providing the same functionality
    – e.g.: uniform view over different batch services
  Abstraction given by the Grid paradigm
    – Virtual pool of resources
    – Grid-related user attributes (e.g., VO, groups, roles)
  Main focus on discovery for brokering, monitoring and inventory
    – concerns those attributes that are meaningful for locating resources on the basis of a set of preferences/constraints
  Avoid publishing unnecessary local information
Pre-GLUE 2 Schemas
Situation Before GLUE 2
  Middleware vendors were forced to define their own information model
    – Globus: MDS schema
    – NorduGrid: NorduGrid schema
    – gLite: GLUE 1.x schema
    – Condor: ClassAds
    – CIM
  Grid infrastructures deploying a middleware extended/mixed some of these schemas
    – E.g. Globus MDS could only describe a computing node
  To bridge the gap, translators were created
  Even OGF specifications created their own embedded information model
    – JSDL
    – BES
  For interoperable Grids, we need to unify the modeling of Grid resources into a community standard
GLUE Schema 1.x
  Collaborative effort focusing on interoperability, started by the EU DataTAG and US iVDGL Grid projects
  Initial contributors: DataGrid, Globus, PPDG, GriPhyN, NorduGrid
  Goal:
    – a common description for Grid resources, designed to support discovery and selection via a Grid information service
  Version 1.3 was released in December 2006
  Still heavily used in production by the EGEE and OSG Grids
GLUE 1.X - concepts
  Core: Site, Service, Element
  Computing: Cluster/SubCluster/Host, Computing Element
  Storage: Storage Element, Storage Area, Access/Control Protocol
NorduGrid Schema
  Used in production since May 2001
  Formulated as an LDAP schema
  Models computing elements by giving a natural representation of
    – Clusters
    – Queues
    – User-specific information
    – Grid jobs
  A basic model of the Storage Element also exists
NorduGrid Schema objects
  cluster
    queue
      jobs: job-01, job-02, job-03
      users: user-01, user-02
    queue
      jobs: job-04, job-05
      users: user-02, user-03, user-01
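The cluster/queue/jobs/users hierarchy can be sketched as nested records. This is only an illustration of the tree shape: the class names, the cluster and queue names, and the assignment of jobs and users to queues are assumptions, and the real schema is expressed as LDAP object classes rather than Python types.

```python
from dataclasses import dataclass, field


@dataclass
class Queue:
    name: str
    jobs: list = field(default_factory=list)   # Grid jobs in this queue
    users: list = field(default_factory=list)  # users authorised on this queue


@dataclass
class Cluster:
    name: str
    queues: list = field(default_factory=list)


def jobs_per_queue(cluster):
    """Count the Grid jobs found under each queue of a cluster."""
    return {queue.name: len(queue.jobs) for queue in cluster.queues}


def all_users(cluster):
    """Collect the distinct users that appear under any queue."""
    return sorted({user for queue in cluster.queues for user in queue.users})


# Hypothetical cluster with two queues.
cluster = Cluster("cluster-1", [
    Queue("queue-1", jobs=["job-01", "job-02", "job-03"], users=["user-01", "user-02"]),
    Queue("queue-2", jobs=["job-04", "job-05"], users=["user-02", "user-03", "user-01"]),
])

print(jobs_per_queue(cluster))
print(all_users(cluster))
```

Note that a user may appear under several queues, so user-specific information is naturally repeated per queue in such a tree.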
GLUE 2 Working Group of OGF: birth of a specification
OGF GLUE WG
  A new OGF Working Group was approved at OGF19 (Jan 2007)
  Previous GLUE activity was moved under the OGF umbrella
  Co-chairs:
    – Sergio Andreozzi (OMII-Europe)
    – Laurence Field (EGEE)
    – Balázs Kónya (NorduGrid)
  Focus:
    – facilitate interoperability between Grid infrastructures via common information models and reference implementations for describing Grid resources in response to use cases
  Goal:
    – define a use case document collecting use cases from different Grid projects/infrastructures
    – define a conceptual model, the abstract schema GLUE 2.0, satisfying the collected use cases
    – develop reference implementations
  Unify modeling approaches and experience in production systems
  Bring information modeling to a common platform
Contributors & Adopters
  The definition of the GLUE 2 Info Model was an open process:
    – End-users (persons using Grid systems)
    – Site administrators
    – Grid operators
    – Virtual Organization managers
    – Middleware developers
  Early adopters: gLite, ARC, UNICORE, TeraGrid, ...
  See the GLUE 2 Implementation session on Wednesday
Timeline: planning vs. reality
  June 2008: GLUE 2 specification entered the public comment period
  August 2008: public comment period ended
  February 2009 (OGF25): GLUE 2 final version submitted to the OGF Editor
  April 2009: rendering documents to be submitted to the OGF Editor
GLUE 2 documents
  GLUE Specification – v.2.0
    – Conceptual model in three sub-models: Main Entities, Computing Entities, Storage Entities
    – Final version submitted to the OGF Editor
  GLUE v. 2.0 – Reference Realizations to Concrete Data Models
    – XSD
    – SQL
    – LDAP
    – Public comment version is available, not yet updated to the final version
  GLUE Use Cases – live document
    – Available on GridForge
Insight on the GLUE 2 Model
What is inside GLUE 2?
  Entities
    – Description
    – Attributes: Type (GLUE types), Multiplicity, Unit
  Relation of Entities
    – UML
    – Associations
  Misc.
    – Extension hooks
    – Defaults
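To make the role of type, multiplicity and unit concrete, the sketch below describes an attribute by these three properties and checks a list of published values against the multiplicity. The attribute names, the type strings and the set of multiplicity notations ("1", "0..1", "*", "1..*") are assumptions chosen for illustration, not definitions quoted from the specification.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed multiplicity notations mapped to (minimum, maximum) value counts;
# None stands for "no upper bound".
BOUNDS = {"1": (1, 1), "0..1": (0, 1), "*": (0, None), "1..*": (1, None)}


@dataclass(frozen=True)
class AttributeDef:
    name: str
    type: str
    multiplicity: str
    unit: Optional[str] = None


def check_multiplicity(attr, values):
    """Return True if the number of values respects the attribute's multiplicity."""
    low, high = BOUNDS[attr.multiplicity]
    return len(values) >= low and (high is None or len(values) <= high)


# Hypothetical attribute definitions.
memory = AttributeDef("memory_size", "integer", "1", unit="MB")
tags = AttributeDef("tags", "string", "*")
comment = AttributeDef("comment", "string", "0..1")

print(check_multiplicity(memory, [4096]))      # exactly one value is required
print(check_multiplicity(memory, []))          # a mandatory value is missing
print(check_multiplicity(tags, []))            # optional, multi-valued
print(check_multiplicity(comment, ["a", "b"])) # at most one value allowed
```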
Main Entities
  Entity
  Extension
  Location
  Contact
  Domain
    – AdminDomain
    – UserDomain
  Service
  Endpoint
  Share
  Manager
  Resource
  Activity
  Policy
    – AccessPolicy
    – MappingPolicy
Computing entities
  ComputingService
  ComputingEndpoint
  ComputingShare
  ComputingManager
  Benchmark
  ExecutionEnvironment
  ApplicationEnvironment
  ApplicationHandle
  ComputingActivity
  ToStorageService
GLUE 2.0 concepts: Complex Computing Service
  [Figure. Labels: AdminDomain; UserDomain (BLUE VO, GREEN VO); ComputingService; ComputingEndpoint (CREAM, CREAM-BES); ComputingShare (blue share, green share); ComputingManager (OpenPBS); ExecutionEnvironment (#50 P4 2 GHz, 1 GB RAM; #50 Xeon GHZ, 4 GB RAM); ApplicationEnvironment]
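The complex computing service on this slide can be sketched as a small object graph. The class names follow the GLUE 2 entity names shown on the slide, but the fields and helper methods are simplifications invented for illustration. Two readings of the figure are assumed: "#50" is taken to mean 50 instances of an execution environment, and each share is taken to serve the VO of the same colour.

```python
from dataclasses import dataclass, field


@dataclass
class ExecutionEnvironment:
    description: str
    instances: int  # assumed meaning of "#50" in the figure


@dataclass
class ComputingManager:
    product: str
    environments: list = field(default_factory=list)


@dataclass
class ComputingEndpoint:
    interface: str


@dataclass
class ComputingShare:
    name: str
    user_domain: str  # assumed: the VO served by this share


@dataclass
class ComputingService:
    endpoints: list
    shares: list
    manager: ComputingManager

    def total_instances(self):
        """Number of execution environment instances behind the service."""
        return sum(env.instances for env in self.manager.environments)

    def shares_for(self, user_domain):
        """Names of the shares available to one user domain."""
        return [s.name for s in self.shares if s.user_domain == user_domain]


service = ComputingService(
    endpoints=[ComputingEndpoint("CREAM"), ComputingEndpoint("CREAM-BES")],
    shares=[ComputingShare("blue share", "BLUE VO"),
            ComputingShare("green share", "GREEN VO")],
    manager=ComputingManager("OpenPBS", [
        ExecutionEnvironment("P4 2 GHz, 1 GB RAM", 50),
        ExecutionEnvironment("Xeon, 4 GB RAM", 50),
    ]),
)

print(service.total_instances())
print(service.shares_for("GREEN VO"))
```

One service exposing two endpoints over the same manager and shares is what makes the example "complex": the same resources are reachable through two different interfaces.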
Storage entities
  StorageService
  StorageServiceCapacity
  StorageAccessProtocol
  StorageEndpoint
  StorageShare
  StorageShareCapacity
  ToComputingService
  StorageManager
  DataStore
Next steps
  We have an OGF-approved specification to represent Grid entities, BUT:
  Renderings are yet to be finalised
  Implementations and production deployments will give lots of feedback
  GLUE is an abstract model, therefore there are
    – no instructions on how to publish/obtain information
    – no instructions on how to consume information
  Profiles are needed to synchronize GLUE with other specifications
    – BES
    – JSDL
    – Production Grid Infrastructure Profile (PGI)
References
  OGF GLUE Working Group
  GLUE 2.0 Documents
    – Specification:
    – Renderings:
    – Use Cases: