EGEE is a project funded by the European Union under contract IST-2003-508833 EGEE Middleware Requirements Are we happy with HEPCAL? Stephen Burke, CCLRC.


EGEE is a project funded by the European Union under contract IST-2003-508833
EGEE Middleware Requirements: Are we happy with HEPCAL?
Stephen Burke, CCLRC
Loose Cannon
GridPP EB-TB Open Meeting, 13th May 2004

EB-TB Open Meeting, 13/5/04 - 2/17
Contents
- HEPCAL History
- HEPCAL Use Cases
- HEPCAL II
- EGEE/NA4
- Summary

HEPCAL History
- In early 2002 the Loose Cannons in EDG WP8 started interviewing representatives of the LHC experiments to collect common Use Cases, and produced a document called HEPCAL (HEP Common Application Layer).
- An LCG RTAG was then formed to extend the document, which was published in May 2002.
- At the beginning of 2003 the members of the HEPCAL RTAG formed a permanent LCG committee called GAG (Grid Application Group) to consider requirements and experiment needs, and to give feedback to LCG and other Grid projects.

HEPCAL History - 2
- In October 2003 the GAG published the HEPCAL II document, which discusses requirements for analysis as opposed to production use.
- In March 2004 the original HEPCAL document was updated with more information on priorities and quantitative requirements. The updated version is known as HEPCAL-prime.
  http://project-lcg-gag.web.cern.ch/project-lcg-gag/LCG_GAG_Docs/HEPCAL-prime.doc

HEPCAL Use Cases
- There are 43 Use Cases. They are generally intended to cover basic operations, and are not in any sense complete.
  - The documents also have some implications for general requirements, but this is not the main focus.
- In practice only about half of the Use Cases were implemented by EDG middleware, and there was fairly little progress between EDG 1.x and 2.x.

USE CASE: DATASET BROWSING
Identifier: UC#dsbrowse
Goals in Context: Browse the LDNs
Actors: User
Triggers: Need to consult the DS list
Included Use Cases: Grid login
Specialised Use Cases:
Pre-conditions: User has a valid Grid login. A VO DMS is accessible by the user and contains the files to be browsed.
Post-conditions:
Basic Flow:
  1. The user connects to her VO DMS, via Web or command line interface.
  2. The user browses the available DS.
Devious Flow(s):
  1. User has no right to browse the DMS database. Operation is aborted.
  2. DMS database is not accessible. Operation is aborted.
Importance and Frequency: As important and probably frequently used as the ls command.
Additional Requirements:
Example:
  $ dsbrowse [parameters to be defined such as date created etc]
  int dsbrowse(char* SQL_query, char* option, char*[] LDNs);
  Call returns the number of LDNs that satisfy the search options.
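The dataset browsing use case can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not part of the HEPCAL document or of any EDG/LCG interface: `ToyDMS`, `BrowseError` and the Python form of `dsbrowse` are hypothetical names, the dataset names are invented, and an in-memory dictionary stands in for the VO DMS. The two devious flows become exceptions, and the count returned by the C-style call corresponds to the length of the returned list.

```python
from dataclasses import dataclass, field


class BrowseError(Exception):
    """Raised when a browse operation is aborted (the 'devious flows')."""


@dataclass
class ToyDMS:
    """Hypothetical in-memory stand-in for a VO DMS."""
    datasets: dict = field(default_factory=dict)  # LDN -> metadata dict
    readers: set = field(default_factory=set)     # users allowed to browse
    online: bool = True                           # False models an unreachable DMS


def dsbrowse(dms, user, predicate=None):
    """Return the sorted LDNs that `user` may see and that satisfy `predicate`."""
    # Devious flow 2: the DMS database is not accessible.
    if not dms.online:
        raise BrowseError("DMS database is not accessible")
    # Devious flow 1: the user has no right to browse.
    if user not in dms.readers:
        raise BrowseError("user has no right to browse the DMS database")
    if predicate is None:
        return sorted(dms.datasets)
    return sorted(ldn for ldn, meta in dms.datasets.items() if predicate(meta))


# Invented example data: two datasets, one authorised user.
dms = ToyDMS(
    datasets={
        "lhcb/mc/2004/a": {"year": 2004},
        "lhcb/mc/2003/b": {"year": 2003},
    },
    readers={"alice"},
)
matches = dsbrowse(dms, "alice", lambda meta: meta["year"] == 2004)
print(len(matches), matches)
```

Passing the search condition as a callable is a simplification; the use case itself leaves the query parameters "to be defined".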

Basic
- There are 19 Use Cases covering basic concepts.
  - These relate to fundamental Grid operations like submitting and controlling jobs, registering and replicating files, and querying the state of the system.
- Of these, 15 are implemented by the EDG middleware, although in some cases there are minor areas where the implementation is not ideal, in particular concerning the detection and treatment of errors and support for file metadata.
- The missing Use Cases relate to querying the state of jobs, detailed job control, and a specific method of file registration (the last of these has now been implemented by LCG).

Security
- Security issues were not considered in detail, but there are 5 security-related Use Cases.
- Two concern joining and leaving a VO, and a third specifies single sign-on.
  - These are implemented in EDG/LCG and will be enhanced with the use of VOMS.
- The other two security Use Cases concern the advance reservation of resources and the allocation of resources between VO members.
  - These are not addressed in the current system.

Metadata and Virtual Data
- Metadata is relevant to several Use Cases, but two of them specifically involve modifying file-related metadata and performing queries to select files based on the metadata.
  - The EDG Replica Metadata Catalogue offers a prototype with partial support for these Use Cases, but more work is needed by both application and middleware developers in this area.
- Two Use Cases are associated with the concept of Virtual Data.
  - This was out of the scope of EDG, and would be likely to require substantial further work to implement. In general it is not a high priority.

Optimisation - Data
- There are four Use Cases related to optimisation.
- One concerns the evaluation of cost functions for data access, to allow the most efficient access method to be chosen.
  - The EDG middleware has a substantial amount of support for this concept, but testing has been limited, and the ROS is not deployed in LCG.
- Another relates to the possibility of using remote access to a small part of a file to avoid the overhead of complete replication.
  - This has not been considered up to now, although GridFTP does support partial file access.
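The idea of a data-access cost function can be illustrated with a toy model. The sketch below is an assumption-laden example, not the EDG optimisation algorithm: the cost is taken to be latency plus transfer time, and the function names, storage names and figures are all invented. It also shows why reading a small part of a file remotely can be cheaper than replicating the whole file.

```python
def access_cost(size_bytes, bandwidth_bytes_per_s, latency_s):
    """Estimated seconds to read `size_bytes`: start-up latency plus transfer time."""
    return latency_s + size_bytes / bandwidth_bytes_per_s


def cheapest_replica(size_bytes, replicas):
    """Return the name of the replica with the lowest estimated access cost.

    `replicas` maps a storage name to (bandwidth in bytes/s, latency in s).
    """
    return min(
        replicas,
        key=lambda name: access_cost(size_bytes, *replicas[name]),
    )


# Invented figures for three copies of the same 1 GB file.
replicas = {
    "local-se": (100e6, 0.01),   # fast local disk
    "remote-se": (10e6, 0.2),    # wide-area link
    "tape": (50e6, 120.0),       # fast once staged, but slow to mount
}
file_size = 1_000_000_000
best = cheapest_replica(file_size, replicas)

# Partial access: reading 1 MB remotely versus copying the full file first.
partial_cost = access_cost(1_000_000, *replicas["remote-se"])
full_copy_cost = access_cost(file_size, *replicas["remote-se"])
print(best, partial_cost < full_copy_cost)
```

A real cost function would also need to account for load, queueing and storage state, which this sketch ignores.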

Optimisation - Job Submission
- The other optimisation Use Cases relate to job submission.
- One concerns the specification of hints, e.g. for CPU time consumption, memory usage or disk space needed, to allow jobs to be scheduled efficiently.
  - This is supported to the extent that jobs can apply their own constraints and ranking criteria based on information stored in the information system, but any optimisation is provided by the user rather than the WMS.
- The final Use Case concerns the automatic splitting of jobs into subjobs.
  - This was one of the goals for the EDG WMS, adapting the Condor DAGMan software, but the functionality is not fully integrated in the deployed system.
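Both job-submission ideas can be shown in a few lines of Python. This is only a sketch under stated assumptions: it is not JDL and does not use the WMS or DAGMan; the function names and resource attributes (`max_cpu_minutes`, `free_cpus`) are hypothetical and not taken from the EDG information schema. The first function applies a user-supplied constraint and ranking, which is the part the user rather than the WMS provides; the second splits a job's input file list into subjobs.

```python
def match_resources(resources, requirements, rank):
    """Keep resources that satisfy the user's constraint, best-ranked first."""
    eligible = [r for r in resources if requirements(r)]
    return sorted(eligible, key=rank, reverse=True)


def split_job(input_files, files_per_subjob):
    """Split one job's input file list into chunks, one chunk per subjob."""
    if files_per_subjob < 1:
        raise ValueError("files_per_subjob must be at least 1")
    return [
        input_files[i:i + files_per_subjob]
        for i in range(0, len(input_files), files_per_subjob)
    ]


# Invented snapshot of what an information system might publish.
resources = [
    {"name": "ce-a", "max_cpu_minutes": 600, "free_cpus": 4},
    {"name": "ce-b", "max_cpu_minutes": 2880, "free_cpus": 10},
    {"name": "ce-c", "max_cpu_minutes": 2880, "free_cpus": 25},
]

# User hint: the job needs at least 24 hours of CPU; prefer the most free CPUs.
ranked = match_resources(
    resources,
    requirements=lambda r: r["max_cpu_minutes"] >= 1440,
    rank=lambda r: r["free_cpus"],
)
subjobs = split_job(["f1", "f2", "f3", "f4", "f5"], files_per_subjob=2)
print([r["name"] for r in ranked], subjobs)
```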

Application Databases
- Four Use Cases relate to databases (referred to as Catalogues in the documents), i.e. read-write entities as opposed to read-only datasets.
  - So far this is not addressed by the middleware.
  - Even the middleware’s own databases (broker LB, R-GMA registry, LRC and RMC) are not distributed or replicated.
- R-GMA provides a different model for a distributed database which may be suitable for some Use Cases.

Application Interfaces
- The final set of seven Use Cases is at a higher level, and relates to interactions between middleware and application software.
  - These can generally be achieved by implementing the functionality at the application level, but have no specific support in the middleware.
- Two relate to the submission and control of large sets of jobs treated as a single production, e.g. to process a large number of files, and a third relates to storing user-defined metadata about jobs in the WMS job database.
- Three concern specialised kinds of jobs: specification of input data via a metadata query, verification of the functionality of application software, and validation of the content of a dataset, either in a standalone job or as the final stage of a data production job.
- Finally, there is the question of the installation and publication of application software. This is a long-standing problem, although LCG has made some progress.

HEPCAL II
- The original HEPCAL document was largely aimed at managed production-style jobs. HEPCAL II was an attempt to consider the needs of chaotic analysis jobs.
- The document has a fairly extensive description of models for analysis jobs, but does not have specifically identified requirements.
- There are also no detailed Use Cases, just three general analysis scenarios (user-level, group-level, and managed production).
- Analysis models could benefit from the experience of running experiments.

Security Requirements
- Always difficult to get particle physicists to care about security!
  - No comprehensive requirements yet, although some documents exist.
- The main HEP requirements are likely to be in the areas of VO management, authorisation, accounting and quotas.
  - Also there is the never-ending battle over outbound IP access from worker nodes.
- The security model places a lot of weight on checking by the VOs: CAs only check identity and will issue certificates to ~anyone.
  - Experiments may not yet have taken this on board.
- The EDG security group said that accounting and quotas weren’t in its area, but someone needs to consider them.

EGEE NA4
- The EGEE NA4 Activity represents all application groups.
  - HEP, biomed, …
- There is an NA4 HEP sub-group, currently led by Frank Harris.
  - So far this is strongly coupled to LCG/GAG/ARDA.
- It’s not entirely clear how non-LHC HEP experiments participate, and the UK is not a partner for NA4.
- NA4 has to produce a requirements document by May/June.
  - In practice, for HEP this is likely to be based on HEPCAL. If non-LHC HEP experiments want to give any input they need to do it quickly.
- Timescales are short; this may be the only opportunity to influence the direction of the EGEE middleware in a significant way.
  - Need to identify major missing items and prioritise.

Summary
- EDG and LCG have developed requirements and Use Cases over several years, but largely with input from the LHC experiments.
- The HEPCAL Use Cases are fairly basic, but even so many are not implemented.
- EGEE is collecting requirements now; this is an opportunity to influence the direction of development.
  - GridPP, particularly the non-LHC experiments, should consider whether it wants to add anything to HEPCAL.
- There is an open NA4 meeting in Catania on July 14-16.