GridPP2: Metadata Management. GridPP2 Workshop, 5 March 2004, Data Management. DataGrid is a project funded by the European Union; GridPP is funded by PPARC.

Presentation transcript:

GridPP2: Metadata Management
Gavin McCance, University of Glasgow
GridPP2 Workshop, UCL, 5 March 2004 (Data Management)
DataGrid is a project funded by the European Union. GridPP is funded by PPARC.

GridPP2 Middleware Metadata Management

Work areas
- Metadata management and UK metadata group
- Storage management
  - See Jens’ talk

Metadata Management
- The focus is upon Grid-enabling metadata services for the experiments
  - Building upon our previous work in this area
  - Building upon experiments’ existing work in this area
- Formation of a UK metadata group within GridPP2
  - 1 generic Grid metadata post (Glasgow)
  - ~1 post per experiment
    - Glasgow, Oxford, Bristol, US expts, others??
    - The UK metadata group will form part of the work of these experiment posts
  - Interaction with the UK data management support teams

GridPP2 Metadata Group
- Purpose will be to
  - Take overall responsibility for common experiment metadata technologies in order to Grid-enable the experiments’ metadata
  - Identify the commonalities and experience across experiments and make sure these are recognized
    - i.e. technologies, schema: data product navigational problem
  - Come to agreement and feed this back into the wider ARDA process
- Work directly with interested groups forming the ARDA
  - EGEE JRA1 Data Management Group
  - LCG Deployment Teams
  - LCG Experiments
  - IT Database group

Metadata Responsibilities
- Generic metadata
  - Concentration on the technologies used to create scalable, manageable and fault-tolerant metadata services
    - The underlying Grid software stack
  - Emphasis upon the service, not just the product
    - 24/7 supportable production metadata services
  - Not prescribing things like the schema, or saying the ‘API must look like Spitfire’: prototype interfaces should be based upon experiments’ existing metadata interfaces
  - Will track, develop and adopt as necessary Grid metadata access standards
    - Feed into standards to make sure we’re in a position to benefit from the future production products that implement these standards
    - Feed PPE use-case and experience back into the wider world

Metadata Responsibilities
- Experiment metadata posts (~1 per experiment):
  - Document existing implementations from the experiments
  - Make sure all the experiments’ use-cases are satisfied by the products and the technologies being proposed by the group
  - Work within the group to ensure that commonalities and experience across experiments are recognized and effort is not wasted
    - At the technology level – e.g. using the same underlying Grid software stack
    - At the interface level – e.g. GANGA
    - Possibly at the schema level…
  - Feed this understanding and agreement back into the wider ARDA process and back into their own experiments
  - ARDA terminology:
    - Dataset metadata -> ARDA Metadata service
    - Data product navigation -> ARDA Job Provenance service
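The distinction between dataset metadata and data product navigation can be illustrated with a minimal sketch. It is not taken from ARDA or from any experiment's software: the `DataProduct` record, its field names, the `find` and `ancestors` helpers and the catalogue entries are all hypothetical, chosen only to show an attribute query next to a provenance walk.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DataProduct:
    """One catalogue entry. All field names are hypothetical."""
    name: str
    attributes: Dict[str, str] = field(default_factory=dict)  # dataset metadata
    produced_by: Optional[str] = None                          # id of the producing job
    inputs: List[str] = field(default_factory=list)            # names of parent products


def find(catalogue, **criteria):
    """Dataset-metadata query: names of products whose attributes match all criteria."""
    return sorted(
        name for name, product in catalogue.items()
        if all(product.attributes.get(k) == v for k, v in criteria.items())
    )


def ancestors(catalogue, name):
    """Data product navigation: upstream products, nearest first (breadth-first)."""
    seen, order, queue = set(), [], list(catalogue[name].inputs)
    while queue:
        parent = queue.pop(0)
        if parent in seen:
            continue
        seen.add(parent)
        order.append(parent)
        queue.extend(catalogue[parent].inputs)
    return order


# Placeholder catalogue: an illustrative RAW -> ESD -> AOD chain.
catalogue = {
    p.name: p for p in [
        DataProduct("run1.raw", {"type": "RAW"}),
        DataProduct("run1.esd", {"type": "ESD"}, "job-001", ["run1.raw"]),
        DataProduct("run1.aod", {"type": "AOD"}, "job-002", ["run1.esd"]),
    ]
}

print(find(catalogue, type="ESD"))        # attribute lookup
print(ancestors(catalogue, "run1.aod"))   # provenance walk
```

The first call answers "which products have these properties?", the second "where did this product come from?"; in the ARDA terminology above these would belong to the Metadata service and the Job Provenance service respectively.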

Short term plans of the group…
- Immediate work:
- Current task of the group is information gathering
- A review of how each experiment uses metadata:
  - What you mean by the term metadata: what does it include?
  - Details on this.. how do you use the metadata?
  - Implementation and deployment details: how is it split into services, the size of metadata, details on the schema, technologies used, etc.
  - Relation to other products, e.g. POOL
  - Future directions already in people’s minds?

…Short term plans of the group
- The results of this review are being made available on a web page and should be pulled into a document
  - Common format to easily compare the different experiments’ uses of metadata
- This document will serve as input to a metadata workshop
  - ~end of April..? Still to be VRVS?
  - Purpose of the workshop will be to identify areas of commonality and work on the future programme for the group
  - Generate ~short-lived sub-tasks within the group with a clear purpose and outcome
  - Continue regular planning meetings to guide these sub-tasks
- Should ensure we have input from other sciences as well..
  - Can request input from the EDG WP9/10 groups and EGEE Biomed groups
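One way to picture the "common format" for the review is a single record per experiment whose fields mirror the review questions, so that commonalities can be computed rather than read off by eye. The sketch below is an assumption for illustration only: the `MetadataSurvey` class, its field names, the `shared` helper and the two sample entries are hypothetical placeholders, not actual survey results.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class MetadataSurvey:
    """One experiment's answers to the review questions (hypothetical field names)."""
    experiment: str
    definition: str = ""                                    # what "metadata" includes
    usage: str = ""                                         # how the metadata is used
    services: List[str] = field(default_factory=list)       # how it is split into services
    approx_size: str = ""                                   # size of the metadata
    schema_notes: str = ""                                  # details on the schema
    technologies: Set[str] = field(default_factory=set)     # technologies used
    related_products: Set[str] = field(default_factory=set) # e.g. POOL
    future_directions: str = ""


def shared(surveys, attr):
    """Items listed under `attr` by more than one experiment, sorted by name."""
    counts = Counter()
    for survey in surveys:
        counts.update(set(getattr(survey, attr)))
    return sorted(item for item, n in counts.items() if n > 1)


# Placeholder entries; they do not describe any real experiment.
surveys = [
    MetadataSurvey("ExperimentA",
                   technologies={"relational database", "web service"},
                   related_products={"POOL"}),
    MetadataSurvey("ExperimentB",
                   technologies={"relational database", "flat files"},
                   related_products={"POOL"}),
]

print(shared(surveys, "technologies"))
print(shared(surveys, "related_products"))
```

Keeping free-text answers as strings and enumerable answers as sets is a design choice of this sketch: only the enumerable fields can be compared mechanically, while the free text is left for discussion at the workshop.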

Links to other projects…
- We can’t do this ourselves…
- EGEE JRA1: The JRA1 data management development cluster of EGEE is based at CERN. We will build upon the relationship formed within EDG (it’s a similar team to EDG)
  - Primary interface to JRA1 will be the generic middleware post at Glasgow
  - Proposal to work directly with JRA1 DM
    - i.e. use the JRA1 CVS repository, use the same development tools and infrastructure, use the experience of the testing and integration teams of EGEE, deliver through this group
  - The large experiment participation in this UK metadata group is seen as very helpful within the JRA1 DM cluster
  - Lack of any formal agreement…

…Links to other projects…
- LCG / EGEE SA1: products delivered to LCG through EGEE JRA1??
  - See UK data management support posts later…
- Experiments: members of the experiments will form part of the metadata group
  - Feed back the work of this group into the experiments and verify that the proposed solutions will work for their experiments
  - Hope is to establish a recognized UK lead in metadata that is recognisably cross-experiment
- ARDA project: Some combination of the above..
  - ARDA is now a real project at CERN, though the details of how we work need to be sorted out

…Links to other projects
- Direct testing of our products and solutions for other sciences
  - Planning to do this through the other EGEE application groups
  - e.g. biomed have very strict security requirements
  - Is there another avenue in the UK for this sort of cross-science activity??
- Various Grid and web-service forums:
  - Global Grid Forum
    - Mainly the DAIS group, with probable participation in the related Data Area groups
    - Due to EDG focus on stability and support, we lost touch with the GGF data area groups over the last year or so – re-establish…
  - W3C, OASIS ?

Review of objectives and timelines
- Multiple experiment posts with different deliverables and focus
  - Not all of the experiment posts’ work will be within the scope of the metadata group, but all work done should be reported there so that commonalities can be identified early
- As an example of how the work will be divided and for the general timelines, I highlight the relevant objectives for:
  - The generic-middleware metadata post
  - The ATLAS post
  - Then discuss the timelines for the development

Generic middleware objectives
- Proforma 2 + 3:
- Development of Grid technologies within a service-focussed architecture (such as WSRF) for use in metadata based applications for the experiments;
- Delivery of fault-tolerant, reliable and manageable software for this purpose. The emphasis from the beginning will be upon developing services that meet the requirements of the experiments;
- Use of this technology for the enabling of existing experiments’ metadata based products in line with the Metadata Catalog service described in the ARDA document (from LCG SC2 RTAG11);
- Participation in the Grid Forum data areas to ensure that particle physics is in a position to benefit from developments here. Promising developments will influence the design of the metadata services and we will feed back our requirements and experience into these forums.

ATLAS middleware objectives
- Proforma 2 + 3:
- Gain a conceptual understanding of the existing ATLAS metadata structures and the ATLAS specific use-cases that drive them;
- Develop, with reference to the use-cases and interactions with other ATLAS developers, the metadata necessary to support the navigational use-cases. Both the schema itself and the optimal location of the metadata require study;
- Understand the analysis use-cases and optimise the event to file granularity for different types of analysis data (ESD, AOD, TAG) depending upon the use-case. Develop automated ways to monitor the best granularity of event data based on analysis access patterns;
- Implement fully working and documented solutions, working with the ATLAS and UK metadata teams to ensure that the developments here are fully integrated with the rest of the ATHENA/GAUDI software, in particular, with the ATLAS Metadata Infrastructure (AMI) product.
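The granularity objective above can be made concrete with a toy cost model. This is only a sketch under stated assumptions, not the ATLAS approach: events are assumed to carry non-negative integer ids and to be packed sequentially into files of a fixed size, a job is assumed to read every file it touches in full, and `open_cost` (a per-file overhead expressed in event-read equivalents) is a hypothetical tuning parameter. The function names and the sample access log are likewise invented for illustration.

```python
def access_cost(accesses, events_per_file, open_cost):
    """Total cost of replaying `accesses` against files of `events_per_file` events.

    Each access is an iterable of event ids. A touched file costs its full
    size in events read plus a fixed `open_cost` overhead.
    """
    total = 0
    for events in accesses:
        touched = {event_id // events_per_file for event_id in events}
        total += len(touched) * (events_per_file + open_cost)
    return total


def best_granularity(accesses, candidates, open_cost):
    """Candidate events-per-file value with the lowest replay cost."""
    return min(candidates, key=lambda n: access_cost(accesses, n, open_cost))


# Placeholder access log: two contiguous scans and one sparse selection.
accesses = [range(0, 100), range(0, 100), [5, 250, 990]]
candidates = [10, 100, 1000]

for n in candidates:
    print(n, access_cost(accesses, n, open_cost=50))
print("best:", best_granularity(accesses, candidates, open_cost=50))
```

Very small files waste effort on per-file overhead and very large ones on reading unwanted events, so the minimum sits in between and moves as the mix of scans and sparse selections changes. Re-running such a replay over recent access logs is one simple form of the automated monitoring the objective describes.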

Timescales for the deliverables…
- Pre – Participate in architecture discussions and prototyping
- PM1 – Architecture and Planning “Report”
  - Placing exercise in response to the EGEE architecture
- PM2 – Understanding of the Experiment Metadata Requirements (process started now…)
- PM3 – Design of Grid Services (Release 1)
- PM7 – Software and Associated Documentation (Release 1)
- PM9 – Participate in LCG TDR Review
- PM10 – Tier 1 and 2 Support “Report”
  - In collaboration with UK data management support posts
- PM11 – Detailed Metadata Requirements “Report”
- PM11 – Architecture and Planning (Release 2)

…Timescales for deliverables
- PM12 – Design and Refactor of Grid Services (Release 2)
- PM16 – Software and Associated Documentation (Release 2)
- PM21 – Tier 1 and Tier 2 Detailed Support Plan
  - In collaboration with UK data management support posts
- PM23 – Architecture and Planning (Release 3)
- PM26 – Design and Refactor of Grid Services (Release 3)
- PM32 – Software and Associated Documentation (Release 3)
- PM36 – Final Report

Support team of GridPP2
- UK data management support posts
  - Aim: to provide first-level support for all DM software
    - first stop for UK system administrators
  - Work directly with the development and deployment teams (GridPP2 Metadata Group and Storage, EGEE and LCG)
  - Provide hands-on deployment help for data challenge support
  - Develop how-to portal to collect deployment experience
  - Feed back sys-admin issues and experience to developers
    - Site policies, quotas, firewalls – survey sysadmins
  - Develop site validation tools
  - Responsible for developing the overall support plan for the data management services beyond GridPP2
  - Need to fit all this in with the rest of the UK Support Plan