GLUE 2.0 Profile for EGI
Stephen Burke
EGI TF Prague, September 20th 2012
GLUE 2.0 Profile - EGI TF Prague

Overview
Why we needed a profile document
Classification of use cases
Classification of importance of publication
Validation
Installed capacity
Document structure
Review process
Implementation

Introduction
The GLUE 2.0 schema is intentionally very flexible
  Many ways to use it, not necessarily interoperable
  Several SRM implementations
Need a profile to specify how it should be used in EGI
  Currently “in EGI” means “in the BDII”; this may change
  Detailed semantics of each attribute, what should and should not be published
  Monitoring tools should enforce the usage
First public draft is available
  Some internal comments already
  Will need detailed review, especially by middleware developers
  Hope to converge by the end of October
  Likely to need updates in the light of experience

Use cases
The schema can potentially satisfy many use cases
  Various implications: importance, need for accuracy, update rate, latency, caching ...
The BDII currently has all information, but we may want to have separate systems for different kinds of information
  EMIR?
The profile document classifies each attribute as potentially useful for one or more of 5 categories of use case
  The fact that something can be used in a particular way doesn’t mean that it will be!

Use case categories
Service Discovery (SD): used as selection parameters in queries to find services, or service attributes returned by such queries
Service Selection (SS): dynamic information used to choose a particular service, e.g. number of queued jobs
Monitoring (M): information useful to monitor the overall state of the Grid, e.g. as used in gstat
Oversight (O): high-level management information, e.g. installed capacity and service versions
Diagnostic (D): information which may help diagnose problems with the services or in the info system itself

Importance
The schema document just classifies attributes as mandatory or optional
For EGI we need more detailed guidance as to which attributes need to be published
Again these are classified in 5 categories

Importance categories
Mandatory (M): information which must be published. Includes all attributes which are mandatory in the schema, but adds some which are important for EGI.
Recommended (R): information which should be published unless there is a good reason, e.g. technically impossible or not meaningful in a particular case.
Desirable (D): information which is likely to be useful, and should be published if reasonably practical, but which may be omitted.
Optional (O): information which is truly optional. Typically these attributes have no use in the existing EGI infrastructure, or are only useful in specialised cases.
Undesirable (U): information which may be damaging in the EGI context, e.g. due to large data volumes or exposure of sensitive information.
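
The five importance categories map naturally onto a small lookup table that a checker could consult. The following is a minimal Python sketch of that idea, not code from the profile document: the attribute names (`ID`, `Name`, `Description`, `Comment`, `DebugDump`) and their classifications are invented for illustration, and `check_importance` is a hypothetical helper.

```python
from enum import Enum


class Importance(Enum):
    MANDATORY = "M"
    RECOMMENDED = "R"
    DESIRABLE = "D"
    OPTIONAL = "O"
    UNDESIRABLE = "U"


# Hypothetical classification for illustration only; the real per-attribute
# classification is given in the profile document, not here.
EXAMPLE_PROFILE = {
    "ID": Importance.MANDATORY,
    "Name": Importance.RECOMMENDED,
    "Description": Importance.DESIRABLE,
    "Comment": Importance.OPTIONAL,
    "DebugDump": Importance.UNDESIRABLE,
}


def check_importance(published, profile):
    """Return (attribute, message) pairs where the presence or absence
    of an attribute conflicts with its importance category."""
    findings = []
    for attr, importance in profile.items():
        present = attr in published
        if importance is Importance.MANDATORY and not present:
            findings.append((attr, "mandatory attribute is missing"))
        elif importance is Importance.RECOMMENDED and not present:
            findings.append((attr, "recommended attribute is missing"))
        elif importance is Importance.UNDESIRABLE and present:
            findings.append((attr, "undesirable attribute is published"))
        # Desirable and Optional attributes may be omitted without comment.
    return findings
```

For example, `check_importance({"ID": "x", "DebugDump": "..."}, EXAMPLE_PROFILE)` would report the missing recommended `Name` and the published undesirable `DebugDump`, and stay silent about the omitted Desirable and Optional attributes.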

Validation
Validation of the published information has traditionally been a weak point
  LDAP has few internal constraints, and gives cryptic errors for the things it does check
  The definition of which values are allowed is fairly poor
  Many things just defined by the code or common practice
  The gstat tool has some checks, but with limited coverage and inflexible for updating the tests
The current information system is full of mistakes!
  Circular problem: hard to persuade people to publish unused attributes correctly, hard to persuade people to use attributes if there are many errors
For GLUE 2 we should try to do better

Validation categories
FATAL: errors which invalidate the structure of the information, e.g. unique IDs which are not unique.
ERROR: values which are definitely incorrect.
WARNING: values which are likely to be incorrect, or which are valid within the schema in general but invalid in the EGI context, e.g. very large numbers of jobs.
INFO: values which are technically valid but may still be wrong, e.g. strings in the wrong case, locations at (0,0), typos, values in the wrong units etc. It is therefore desirable to have heuristics to try to identify such mistakes, e.g. collect a list of commonly-published batch system names and flag names which are not in the list.
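
Two of the examples above, non-unique IDs (FATAL) and the batch-system-name heuristic (INFO), can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the profile's own tests: the `KNOWN_BATCH_SYSTEMS` list is an invented sample rather than a list collected from published data, and the function names are hypothetical.

```python
from collections import Counter
from enum import Enum


class Severity(Enum):
    FATAL = 1
    ERROR = 2
    WARNING = 3
    INFO = 4


# Illustrative sample only; a real list would be collected from what
# sites commonly publish.
KNOWN_BATCH_SYSTEMS = {"torque", "lsf", "sge", "condor", "slurm"}


def check_unique_ids(ids):
    """FATAL: report every ID that is published more than once."""
    counts = Counter(ids)
    return [
        (Severity.FATAL, f"ID {i!r} is published {n} times")
        for i, n in counts.items()
        if n > 1
    ]


def check_batch_system_name(name, known=KNOWN_BATCH_SYSTEMS):
    """INFO: flag names missing from the list of commonly published names."""
    if name in known:
        return []
    if name.lower() in known:
        return [(Severity.INFO, f"batch system name {name!r} may be in the wrong case")]
    return [(Severity.INFO, f"batch system name {name!r} is not in the list of common names")]
```

Because the second check is only a heuristic, an unknown name is reported at INFO level rather than as an error: a site running an unusual batch system is not necessarily publishing wrong data.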

Validation tools
We need tools which validate the information in different situations:
  Interoperability tests in middleware development
  Service-specific tests, e.g. for CE and SE
  During middleware acceptance tests
  Inside the BDII before publication
  Checking the entire published information in a top-level BDII
May be able to have a common set of tests which can be plugged in to different tools?
Need to report errors to the right place
  Middleware bugs to developers, not system managers!
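
One possible shape for a common, pluggable set of tests is a shared registry of test functions that any tool can run, with each finding carrying a hint about who should receive it. The Python sketch below assumes entries are plain dictionaries and uses invented attribute names (`ID`, `Latitude`, `Longitude`); `register`, `run_tests` and `group_by_recipient` are hypothetical helpers, not part of any existing tool.

```python
from typing import Callable, Dict, List, NamedTuple


class Finding(NamedTuple):
    level: str       # FATAL, ERROR, WARNING or INFO
    message: str
    report_to: str   # "developer" or "site" (assumed labels)


Entry = Dict[str, str]
Test = Callable[[Entry], List[Finding]]

REGISTRY: List[Test] = []


def register(test: Test) -> Test:
    """Add a test to the shared registry so that any tool can run it."""
    REGISTRY.append(test)
    return test


@register
def id_is_present(entry: Entry) -> List[Finding]:
    # Assumed here to indicate a middleware bug, so it goes to developers.
    if not entry.get("ID"):
        return [Finding("ERROR", "ID is missing or empty", "developer")]
    return []


@register
def location_not_origin(entry: Entry) -> List[Finding]:
    # Likely a site configuration slip, so it goes to the site.
    if entry.get("Latitude") == "0" and entry.get("Longitude") == "0":
        return [Finding("INFO", "location is (0,0)", "site")]
    return []


def run_tests(entries: List[Entry], tests: List[Test] = REGISTRY) -> List[Finding]:
    """Run every registered test against every entry."""
    findings: List[Finding] = []
    for entry in entries:
        for test in tests:
            findings.extend(test(entry))
    return findings


def group_by_recipient(findings: List[Finding]) -> Dict[str, List[Finding]]:
    """Split findings so that each recipient only sees its own."""
    grouped: Dict[str, List[Finding]] = {}
    for finding in findings:
        grouped.setdefault(finding.report_to, []).append(finding)
    return grouped
```

A tool running inside the BDII and a checker run against a top-level BDII could then import the same registry and differ only in where the entries come from and how the grouped findings are delivered.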

Validation tests
The profile document has many suggested validation tests
  Intended as guidance and not fixed rules
  Some things may be too hard to implement
  New/different tests always welcome
INFO-level tests will probably have many false positives
  Need to be able to switch them off and/or update them

Installed capacity
Useful to know the total installed capacity (CPU power, disk and tape sizes) and the sharing between VOs
WLCG has a document for GLUE 1: https://twiki.cern.ch/twiki/pub/LCG/WLCGCommonComputingReadinessChallenges/WLCG_GlueSchemaUsage-1.8.pdf
  Some things are difficult in GLUE 1, should be easier in GLUE 2
Profile document has guidance
Computing power straightforward
  Logical CPUs, HEPSPEC-06, VO shares
Storage more complex
  Including/excluding unavailable disk servers
  Treatment of cached files, parity/spare disks, T1D0, ...
  Shared space vs space tokens
  Logical vs physical view, double counting
  May need to iterate with developers

Document structure
Introduction explaining concepts
Detailed section for each schema class
  Doesn’t repeat information from schema document unless needed for clarity
  Table giving information for every attribute, plus text notes if necessary
  E.g. for the Contact class (table not reproduced here)

Review process
Long, technical document, needs careful review
  Especially by developers
May take some time to converge
May well need updates after experience
Document is versioned
  Compliance with a given version should be published

Timetable
Internal EGI review
  Some responses already
  Will update the document by the end of September
  So far requested changes seem fairly small
External review
  Especially EMI
  Comments by the end of October
“Final” document by the end of November
  Version 1.0
  Likely to evolve with experience

Implementation
Difficulty will vary; many things may be compliant already
Developers can check compliance while reviewing the document
Need to start validation, if only by hand
  Submit bugs as problems are found
Need people to work on validation tools

References
Draft profile document: https://documents.egi.eu/public/ShowDocument?docid=1324
GLUE 2.0 specification: http://www.ogf.org/documents/GFD.147.pdf
LDAP rendering specification (draft): http://forge.ogf.org/sf/go/doc15518?nav=1
WLCG installed capacity document: https://twiki.cern.ch/twiki/pub/LCG/WLCGCommonComputingReadinessChallenges/WLCG_GlueSchemaUsage-1.8.pdf