Metadata (data about data) GridPP-15, Paul Millar.

Presentation transcript:

Metadata (data about data) GridPP-15, Paul Millar

GridPP-15 Metadata 2: Contents
Monitoring metadata service
Event-level metadata
Work improving AMI
Cataloguing ATLAS metadata
Conclusions

GridPP-15 Metadata 3: Monitoring a metadata service
Requirements document has been released
Why do people want to monitor metadata services?
How to make that happen?
What already exists out there?
What still needs to be done?

GridPP-15 Metadata 4: JMX – servlet monitoring

GridPP-15 Metadata 5: Monitoring Architecture

GridPP-15 Metadata 6: MonAMI architecture

GridPP-15 Metadata 7: MonAMI summary
Follows the UNIX "do one thing well" philosophy.
Takes data from a site and pushes it somewhere.
Plugin architecture: easy to add extra targets
Easy to configure (really!)
Multilingual: currently speaks Ganglia, but will (soon) speak Nagios, LEMON, R-GMA?,...
Monitors: Apache, Tomcat/JMX, MySQL,...
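The "takes data from a site and pushes it somewhere" idea with pluggable targets can be sketched as below. This is only an illustration of the pattern, not MonAMI's code or configuration: the `Agent` class, `fake_mysql` source, and `list_target` are hypothetical names, and the metric values are made up.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical names throughout: this is NOT MonAMI's real API.
Metrics = Dict[str, float]


class Agent:
    """Reads metrics from monitoring sources and pushes them to every target."""

    def __init__(self) -> None:
        self.sources: Dict[str, Callable[[], Metrics]] = {}
        self.targets: List[Callable[[str, Metrics], None]] = []

    def add_source(self, name: str, read: Callable[[], Metrics]) -> None:
        self.sources[name] = read

    def add_target(self, send: Callable[[str, Metrics], None]) -> None:
        # Adding a new reporting system means registering one more callable.
        self.targets.append(send)

    def run_once(self) -> None:
        for name, read in self.sources.items():
            metrics = read()
            for send in self.targets:
                send(name, metrics)


def fake_mysql() -> Metrics:
    # Stand-in values; a real source would query the monitored service.
    return {"connections": 12.0, "queries_per_sec": 340.5}


received: List[Tuple[str, float]] = []


def list_target(source: str, metrics: Metrics) -> None:
    # Stand-in for a real target such as a Ganglia sender.
    for key, value in metrics.items():
        received.append((f"{source}.{key}", value))


agent = Agent()
agent.add_source("mysql", fake_mysql)
agent.add_target(list_target)
agent.run_once()
```

Keeping sources and targets as independent callables is what makes "easy to add extra targets" plausible: the agent itself never needs to know which monitoring system it is talking to.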

GridPP-15 Metadata 8: ATLAS Event Level Metadata
Event tag infrastructure is being developed to:
  – Allow physicists to exclude uninteresting events from data sample
  – Extract samples of specific interest into a smaller fileset, for repeated running
  – Provide a global view of the data, useful for data mining
Event tag infrastructure is currently under review
Content & usage of event tags under discussion with physics groups
Tags produced from Rome data for 2.3 million events, stored at CERN
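A minimal sketch of how event tags support the first two goals: a small per-event record is filtered by a physics cut, and the surviving events are grouped by the file that holds them. The `EventTag` fields, the cut, and the file names are assumptions for illustration; the actual tag content was still under discussion.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class EventTag:
    # Hypothetical tag attributes, chosen only for this example.
    run: int
    event: int
    file_name: str     # file holding the full event data
    n_muons: int
    missing_et: float  # GeV


def select(tags: Iterable[EventTag],
           predicate: Callable[[EventTag], bool]) -> Dict[str, List[int]]:
    """Return the selected event numbers, grouped by the file that holds them."""
    selected: Dict[str, List[int]] = {}
    for tag in tags:
        if predicate(tag):
            selected.setdefault(tag.file_name, []).append(tag.event)
    return selected


tags = [
    EventTag(1, 101, "file1.root", 2, 35.0),
    EventTag(1, 102, "file1.root", 0, 5.0),
    EventTag(1, 103, "file2.root", 0, 12.0),
    EventTag(1, 104, "file3.root", 3, 60.0),
]

# Keep only events with at least two muons and large missing energy.
interesting = select(tags, lambda t: t.n_muons >= 2 and t.missing_et > 20.0)
```

Here `file2.root` never needs to be opened, which is the point of excluding uninteresting events before touching the full data.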

GridPP-15 Metadata 9: Integration with DDM
For event tag infrastructure to be used successfully, it must be able to work with ATLAS Distributed Data Management system (DQ2)
DQ2 uses datasets as basic units of data manipulation
Event tag tools currently refer only to files
Currently working on ways to solve this problem
[Diagram: DQ2 Dataset 1 contains File 1, File 2 and File 3; the tag browser refers to File 1 and File 3]
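The mismatch between file-level tags and dataset-level data management can be illustrated with a small lookup that turns a list of files into the datasets that would have to be requested. This is one possible approach under stated assumptions, not the solution the project adopted; `datasets_for_files` and the dataset contents are hypothetical, and no real DQ2 call is used.

```python
from typing import Dict, Iterable, List, Set


def datasets_for_files(dataset_contents: Dict[str, List[str]],
                       files: Iterable[str]) -> Dict[str, Set[str]]:
    """Map the files picked by a tag query onto the datasets that contain them."""
    # Invert the dataset -> files listing into file -> datasets.
    file_to_datasets: Dict[str, Set[str]] = {}
    for dataset, names in dataset_contents.items():
        for name in names:
            file_to_datasets.setdefault(name, set()).add(dataset)

    # Collect, per dataset, which of the requested files it provides.
    wanted: Dict[str, Set[str]] = {}
    for name in files:
        for dataset in file_to_datasets.get(name, ()):
            wanted.setdefault(dataset, set()).add(name)
    return wanted


# Hypothetical catalogue contents, mirroring the slide's diagram.
contents = {
    "dataset1": ["file1", "file2", "file3"],
    "dataset2": ["file4"],
}
needed = datasets_for_files(contents, ["file1", "file3"])
```

Files unknown to every dataset are silently dropped in this sketch; a real tool would need to report them.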

GridPP-15 Metadata 10: Cataloguing Event Tags
Event tags can now be built and written to a master database.
  – This may be replicated to T1 and T2s
  – Different physics groups may build their own tags
Geographically diverse locations for data. How should these be catalogued?
  – Many open questions (e.g., should AOD and Tag be distributed together, and how?)
  – Aim to build on prototype CollectionCatalog, to design and implement Event Tag Catalogue

GridPP-15 Metadata 11: AMI VOMS
Completed requirements analysis of the impact of VOMS on AMI.
AMI will use ATLAS-wide agreed VOMS groups and roles.
Mapping of VOMS groups to AMI decided.
Problem of getting the proxy certificate into AMI... using your web browser...
Solutions? mod_gridsite or MyProxy.
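As a rough illustration of mapping VOMS groups and roles onto application permissions, the sketch below parses VOMS attribute strings (FQANs such as `/atlas/Role=production/Capability=NULL`) and looks them up in a table. The table contents and the AMI role names are invented for the example; the mapping actually agreed for AMI is not given in these slides.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def parse_fqan(fqan: str) -> Tuple[str, Optional[str]]:
    """Split a VOMS FQAN into (group path, role); Role=NULL means no role."""
    group_parts: List[str] = []
    role: Optional[str] = None
    for part in fqan.strip("/").split("/"):
        if part.startswith("Role="):
            value = part[len("Role="):]
            role = None if value == "NULL" else value
        elif part.startswith("Capability="):
            continue  # capabilities are ignored in this sketch
        else:
            group_parts.append(part)
    return "/" + "/".join(group_parts), role


# Hypothetical mapping table: these AMI role names are assumptions.
AMI_ROLES: Dict[Tuple[str, Optional[str]], str] = {
    ("/atlas", None): "reader",
    ("/atlas", "production"): "writer",
    ("/atlas/higgs", None): "higgs_editor",
}


def ami_roles(fqans: Iterable[str]) -> List[str]:
    """Return the application roles granted by a user's FQANs, without duplicates."""
    roles: List[str] = []
    for fqan in fqans:
        role = AMI_ROLES.get(parse_fqan(fqan))
        if role is not None and role not in roles:
            roles.append(role)
    return roles
```

For example, `ami_roles(["/atlas/Role=production/Capability=NULL"])` yields `["writer"]` with the table above, while attributes from an unlisted VO grant nothing.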

GridPP-15 Metadata 12: AMI and SQLite

GridPP-15 Metadata 13: Cataloguing metadata
Looking at ATLAS, there's metadata in:
  – DDM (several databases)
  – Production system (ProdDB)
  – Event-level (tag) database(s)
  – COOL (used for correlation of DAQ and run numbers)
  – AMI (catalogues datasets by physics metadata)
  – Detector Description (file catalogues, but should be able to ignore these)
Risk of information being duplicated, unknown or impossible to get at.
Implement easy navigation between different metadata

GridPP-15 Metadata 14: Summary
The metadata collaboration's work is ongoing on various aspects of metadata.
Monitoring requirements document released.
Opportunity to work together on developing additional monitoring tools.
AMI will soon support off-line analysis
The implementation of event-level metadata is moving from the discussion stage into prototyping.

GridPP-15 Metadata 15: Questions, comments, thoughts