Markus Oldenburg, GridPP Metadata Workshop, July 4–7 2006, Oxford University: ALICE metadata

1 ALICE metadata

2 Overview
AliEn and aliensh
File Catalogue
  – structure
  – path/file name definition
Additional Run and File Level Metadata
Event Level Metadata
‘Working’ Example
Summary and Outlook

3 AliEn and aliensh
AliEn (Alice Environment)
  – distributed computing environment for ALICE
  – core services in Perl
  – provides
      Database interface (MySQL)
      File Catalogue
      Metadata Catalogue
      other services…
aliensh
  – provides commands to access AliEn Grid computing resources and the AliEn virtual file system
  – bash-like behaviour
  – interactive, single-command, or script execution
      informative + convenience commands (whoami, less, …)
      virtual file catalogue + data management commands (cp, rm, find, …)
      TaskQueue/job management commands (submit, ps, kill, …)

4 Structure of the File Catalogue
File Catalogue
  – acts as and looks like a ‘File System’
  – doesn’t own the files, just associates logical file names (LFN) with physical locations/physical file names (PFN)
  – MySQL database
      each virtual directory is represented by one table
      subdirectories are connected to directories by sub-table entries
      LFNs (base names) are represented as entries in directory tables
These entries hold the name (LFN) and the PFN.
PFN contains
  – protocol: how to access the data
  – host: where to find the data
  – access port
  – directory entry (= file name)
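
The LFN-to-PFN association and the four PFN components listed on this slide can be sketched as follows. This is only an illustration, not AliEn code: the URL-style PFN layout, the `root` scheme, the host name, the port, and all paths are made-up assumptions, and a plain dictionary stands in for the MySQL directory tables.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class PFN:
    protocol: str  # how to access the data
    host: str      # where to find the data
    port: int      # access port
    entry: str     # directory entry (= file name) on the storage host


def parse_pfn(url: str) -> PFN:
    """Split a URL-style PFN (an assumed layout) into the four parts named on the slide."""
    parts = urlsplit(url)
    if parts.hostname is None or parts.port is None:
        raise ValueError(f"PFN needs a host and a port: {url!r}")
    return PFN(parts.scheme, parts.hostname, parts.port, parts.path)


# Hypothetical catalogue: one dict per virtual directory, keyed by LFN base name.
catalogue = {
    "/alice/data/2007/LHC07a/1234/raw/": {
        "0001.raw.root": "root://storage.example.org:1094/pool/a1/f00d",
    }
}

pfn = parse_pfn(catalogue["/alice/data/2007/LHC07a/1234/raw/"]["0001.raw.root"])
print(pfn)
```

The catalogue itself stores only names and locations, so resolving an LFN is a lookup followed by a parse; the file content never passes through it.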

5 Pathname Definitions
for real data
  /alice/data/‹Year›/‹AcceleratorPeriod›/‹RunNumber›/
for simulated data
  /alice/sim/‹Year›/‹ProductionType›/‹RunNumber›/
subdirectories:
  for raw data
    raw/
  for links to calibration and condition files
    reco/‹PassX›/cond/
  for ESD and corresponding tag files
    reco/‹PassX›/ESD
  for AOD files
    reco/‹PassX›/AOD
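
A small sketch of how the path templates on this slide could be filled in. The function names, the argument types, and the example values (run number `1234`, pass name `Pass3`) are assumptions for illustration; the sketch also appends a trailing slash to every directory, which the slide does only for some of them.

```python
def run_directory(real_data: bool, year: int, period_or_production: str, run: str) -> str:
    """Build /alice/data/... for real data or /alice/sim/... for simulated data."""
    top = "data" if real_data else "sim"
    return f"/alice/{top}/{year}/{period_or_production}/{run}/"


def sub_directory(run_dir: str, content: str, pass_name: str = "") -> str:
    """Build one of the subdirectories named on the slide: raw, cond, ESD, or AOD."""
    if content == "raw":
        return f"{run_dir}raw/"
    if content in ("cond", "ESD", "AOD"):
        if not pass_name:
            raise ValueError("reconstruction subdirectories need a pass name")
        return f"{run_dir}reco/{pass_name}/{content}/"
    raise ValueError(f"unknown content type: {content!r}")


run_dir = run_directory(True, 2007, "LHC07a", "1234")
esd_dir = sub_directory(run_dir, "ESD", "Pass3")
print(esd_dir)
```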

6 Filename Definition
for ESD files
  ‹xxxx›.AliESD.root
similar for all other files (except for condition files)
a tool will be provided to generate ‘meaningful’ file names if somebody wants to make a local copy
files to be registered in the file catalogue
  – raw data files,
  – AliESD.root files,
  – AliESDfriends.root files, and
  – ESDtags.root files

7 Run and File Level Metadata
Metadata Catalogue
  – additional tables can be attached to each ‘directory’/table of the MySQL database → metadata
  – directory structure (grouping of ‘similar’ files) allows for
      reduction of (additional) metadata for a given directory
      enhancement of search performance

8 MetaData Overview I
Run
  run comment
  run type
    – physics, laser, pulser, pedestal, simulation
  run start time
  run stop time
  run stop reason
    – Normal, beam loss, detector failure, …
  magnetic field setting
    – FullField, ReversedField, ZeroField, HalfField
  collision system
    – PbPb, pp, pPb, …
  collision energy
  trigger setup name
  detectors present in this run
  # of events in this run
  run sanity
File
  file sanity flag (“online/offline”, “available/not available”)
Event
  event id
  centrality
  multiplicity
    – an array for different detectors?
  luminosity
  magnetic field value
  trigger condition
  detectors with data in this event
  mean pT
  max pT
  # of protons
  # of pions
  # of strange particles
  # of pos. charges
  # of neg. charges
  # of …
  event sanity
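
One way to picture these fields as records is sketched below. Only a subset of the listed fields is shown; the class names, field names, Python types, and all example values are assumptions, since the slide gives field names but no schema or units.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class RunType(Enum):
    PHYSICS = "physics"
    LASER = "laser"
    PULSER = "pulser"
    PEDESTAL = "pedestal"
    SIMULATION = "simulation"


class FieldSetting(Enum):
    FULL = "FullField"
    REVERSED = "ReversedField"
    ZERO = "ZeroField"
    HALF = "HalfField"


@dataclass
class RunMetadata:
    comment: str
    run_type: RunType
    start: datetime
    stop: datetime
    stop_reason: str
    field_setting: FieldSetting
    collision_system: str
    n_events: int


@dataclass
class EventMetadata:
    event_id: int
    multiplicity: int
    mean_pt: float  # units are not given on the slide
    max_pt: float


# Illustrative values only; none of them come from real ALICE data.
run = RunMetadata(
    comment="illustrative entry",
    run_type=RunType.PHYSICS,
    start=datetime(2007, 9, 19, 8, 0, 0),
    stop=datetime(2007, 9, 19, 9, 0, 0),
    stop_reason="Normal",
    field_setting=FieldSetting("FullField"),
    collision_system="pp",
    n_events=1000,
)
event = EventMetadata(event_id=0, multiplicity=12, mean_pt=0.5, max_pt=2.1)
print(run.run_type.value, run.stop - run.start, event)
```

Using enumerations for the closed value lists (run type, field setting) rejects misspelled values at construction time, whereas open lists such as the collision system stay plain strings.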

9 MetaData Overview II
for produced events
  production tag
  production software library version
for simulation
  generator
  generator version
  generator comments
  generator parameters
  detector geometry
  detector configuration
  simulation comments
All this is additional information to what is stored in the path name!

10 Event Level Metadata
raw data is processed right after data taking
some physical quantities will be extracted right away
  – multiplicity
  – vertex position
  – …
each file containing physics events gets an additional file containing this event level metadata ‘attached’ → ESDtags file
  – ROOT file
  – stored in the same directory as the physics data file
content can be extended later (or each user can even create his/her own tag files)

11 Event Level Metadata Creation/Selection
[Diagram. Box labels: RECONSTRUCTION, POST PROCESS, INDEX BUILDER, BITMAP INDICES, ANALYSIS CODE, QUERY, LIST OF EVENTS GROUPED BY GUID, PROOF/AliEn. Credit: P. Christakoglou]
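
The diagram names bitmap indices as the structure that queries run against. The sketch below shows the general bitmap-index technique only, not the ALICE index builder: each event gets one bit position, each condition gets one bitmap, and a query is a bitwise AND. The event records, the GUID strings, and the multiplicity threshold are invented for illustration.

```python
from collections import defaultdict


def build_bitmaps(events, attribute, bucket):
    """Return {bucket value: int bitmap}; bit i is set if event i falls in that bucket."""
    bitmaps = defaultdict(int)
    for i, ev in enumerate(events):
        bitmaps[bucket(ev[attribute])] |= 1 << i
    return dict(bitmaps)


def selected(bitmap):
    """Indices of the set bits, i.e. of the matching events."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# Hypothetical event tags; "guid" identifies the file an event lives in.
events = [
    {"guid": "file-A", "multiplicity": 12, "vertex_ok": True},
    {"guid": "file-A", "multiplicity": 250, "vertex_ok": False},
    {"guid": "file-B", "multiplicity": 300, "vertex_ok": True},
]

high_mult = build_bitmaps(events, "multiplicity", lambda m: m >= 100)
good_vtx = build_bitmaps(events, "vertex_ok", bool)

# Query: high multiplicity AND good vertex, answered by one bitwise AND.
hits = selected(high_mult[True] & good_vtx[True])

# Group the matching events by GUID, as in the diagram's query result.
by_guid = defaultdict(list)
for i in hits:
    by_guid[events[i]["guid"]].append(i)
print(dict(by_guid))
```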

12 Working Example
user wants to analyse
  – AliESDs
  – pp collisions
  – taken on 19. and 20.09.2007, before 10:20:33 h
  – …
$ find -x pp /alice/data/2007/LHC07a/*/reco/Pass3/*AliESDs.root Run:collision_system="pp" and Run:stop "2007-09-19" > pp.xml
the events should meet the following additional specifications
  – properly reconstructed vertex
  – vertex z position in between ±1 cm
  – …
Loop over list of events grouped by GUID/file for the file collection specified by ‘pp.xml’.
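
The two-stage selection on this slide (run-level conditions first, then event-level cuts within the resulting file collection) can be sketched in plain Python. Nothing here calls AliEn or reads `pp.xml`: the records, GUID strings, and field names are invented stand-ins, and the time window is one assumed reading of "taken on 19. and 20.09.2007, before 10:20:33 h".

```python
from datetime import datetime

# Hypothetical stand-ins for catalogue entries: one record per ESD file.
files = [
    {"guid": "guid-1", "collision_system": "pp", "run_stop": datetime(2007, 9, 19, 8, 0, 0)},
    {"guid": "guid-2", "collision_system": "PbPb", "run_stop": datetime(2007, 9, 19, 9, 0, 0)},
    {"guid": "guid-3", "collision_system": "pp", "run_stop": datetime(2007, 9, 21, 9, 0, 0)},
]

# Hypothetical event tags, keyed by the GUID of the file they describe.
tags = {
    "guid-1": [
        {"event": 0, "vertex_ok": True, "vz_cm": 0.4},
        {"event": 1, "vertex_ok": True, "vz_cm": -3.2},
        {"event": 2, "vertex_ok": False, "vz_cm": 0.1},
    ],
    "guid-3": [{"event": 0, "vertex_ok": True, "vz_cm": 0.0}],
}

# Assumed time window; the exact condition is not recoverable from the slide.
WINDOW_START = datetime(2007, 9, 19)
WINDOW_STOP = datetime(2007, 9, 20, 10, 20, 33)


def select_files(files):
    """Stage 1: run-level selection, giving the file collection as a list of GUIDs."""
    return [
        f["guid"]
        for f in files
        if f["collision_system"] == "pp" and WINDOW_START <= f["run_stop"] < WINDOW_STOP
    ]


def select_events(guids, tags, max_abs_vz_cm=1.0):
    """Stage 2: event-level cuts, giving the surviving events grouped by GUID."""
    result = {}
    for guid in guids:
        keep = [
            t["event"]
            for t in tags.get(guid, [])
            if t["vertex_ok"] and abs(t["vz_cm"]) <= max_abs_vz_cm
        ]
        if keep:
            result[guid] = keep
    return result


collection = select_files(files)
selection = select_events(collection, tags)
print(selection)
```

Doing the run-level selection first means that event tags are read only for files that already passed the run conditions.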

13 Summary and Outlook
System is fully set up and functional:
  – File Catalogue (with defined directory structure) exists and works
  – run and file level Metadata Catalogue (data fields) is defined and exists
  – event level metadata is defined, index builder is functional
  – all stages were tested and work properly
But…
  – no large-scale tests yet
  – many tables/catalogues not filled yet (at least not automatically)
  – not enough simulation data to effectively stress-test the system
Currently
  – large test production running
  – we start adding output files automatically to the file catalogue
  – overall system performance to be seen…
