Download presentation
Presentation is loading. Please wait.
Published byStuart Hensley Modified over 9 years ago
1
ATLAS Data Preservation and Access Roger Jones
2
Data Preservation & Access Opening data access Preparatory discussions with “management”, CB chair, authorship and Pubcom chairs Has clear implications for authorship/membership rules Needs CB-level discussion Past experience says these topics provoke long discussion in the CB! Common principles proposed by LHC experiment Data Policy Harmonization Group straw man This has been reviewed by the SIPB and taken to CERN Council to become a “policy suggestion” A draft policy is with the management for discussion & has been seen by the ICB
3
ATLAS DMP Organization Data Preservation now included as part of the upgrade activity planning May increase the funding options – some evidence already Data Management Planning is now required by some funders for upgrade grants Looking at the cost/benefit of various strategies Resource tensioning with other upgrade activities
4
Principles for preservation & access General agreement RAW data is preserved for the experiment and future – open data access is not usually possible even to the collaboration members (level 4 data) and is not proposed for general use Full reconstruction outputs for analysis might be made available after an embargo period – tbd, but clearly embargo of several years. The resource implications to make this useful are high. (Level 3 data) We support limited access of samples in simple formats for outreach and teaching (level 2 data) – but these are best integrated to our presenter tools Techniques like Recast may make data (information) usefully available, although it does not meet all the open access criteria for levels 2 & 3 We already make data from papers and supporting information available through HEPDAT/Inspire (Level 1 data)
5
Data Preservation Policies Data Preservation There are DP policies implied in the Computing TDRs conserve all raw data during the lifetime of the experiment All formats & code used for paper analyses to be archived Tier 0/1s responsible for the physical preservation Some tacit belief that older sets may be ‘retired’ Retired data no longer to be on disk or under active analysis This may need to be revised e.g. if external access is then granted Obvious resource implications First priority to to preserve data for active use by the collaboration
6
ATLAS DP Practical Steps Making sure raw data can be reprocessed long-term (Level 4) Identifying key datasets for ‘unique data’ preservation Setting up regular reprocessing and validation This has been underway as a test case for the 2009 data, but progress is slow Forward/backward compatibility issues illustrated in John Chapman’s talk on simulation release plans14/3/13 Ensure the capability to run old trigger selections offline AODfixing will help (reprocessing at analysis format level) This means level 4 operations can be applied to level 3 AOD format
7
Digesting validation results Must display the results of the validation in a comprehensible way: web based interface The test must determine the nature of the results Could be simple yes/no, plots, ROOT files, text-files with keywords or length,... Need for semi-automated, detailed physics validation David South is on ATLAS and was central to the DESY SP and DPHEP activities Identify the useful common components Identify the ATLAS-specific elements Set up CERN-based instance for ATLAS (and others?)
8
Existing open datasets The CB has authorized various datasets in (level-2) outreach formats for open use in education/outreach Event displays for interactive analysis (MINERVA/HYPATIA/LPPP/CAMELIA) JIVE-XML, root format data Absolutely not intended for any serious analysis, but illustrative
9
ATLAS Zpath Master the invariant mass technique to study and measure the (Z, J/ ) decaying to l + l - to search for new physics (Z’) And Higgs boson in and l + l - l + l - HYPATIA using the ATLANTIS event display Data from 2011 – 13000 events ~2.5 GB (password protected, 100 open) – 13 data groups/directories, 20 subgroups (A-T), and 50 events/mixed sample/2 students – 50% Z, 30% 10% (J/ , 5% Z', 5% l + l - l + l - – Higgs candidate events: – 1 fb -1 and cuts according to ATLAS publication – 125 GeV Higgs MC signals ready to upload (1fb -1, 10fb -1,25fb -1 ) M( )=125 GeV M(ee )=123 GeV 9
10
ATLAS Zpath tests OPloT: M ll and/or M and/or M llll to be discussed locally Moderator: 1 slide with 3 invariant masses; Invariant mass as a tool to identify particles, to discover new particles, and to search for exotic particles Web pages updated and measurement ready http://www.physicsmasterclasses. org/exercises/ATLAS- 2013/en/zpath.htm http://www.physicsmasterclasses. org/exercises/ATLAS- 2013/en/zpath.htm http://www.physicsmasterclasses. org/exercises/ATLAS- 2013/en/zpath.htm Introduced Higgs Described new measurements Prepared material for instructors, moderators, for discussions, … 10
11
OPloT Tests 2013 Higgs comments 4l provided without requiring 2l from Z, with lower cut on other pair provide MC with 125 Higgs and background Upload 125 Higgs MC ((1)&10 & 25 fb -1 ) 11
12
Measurements W l W l W+/W- ratio Angular distribution between leptons in WW events MINERVA program using the ATLANTIS event display 2011 real data: 693 WW/Higgs candidates (from released 1fb -1 ) mixed with 5307 W and other background events Histogram tool spreadsheet and histogram websites connected with database New measurement tested ATLAS W-path with real WW (+H) events 12
13
ATLAS W-path 13 Data from 2011, 1.1fb-1 350 should be WW (w/o Higgs) 160 should be ttbar or single top 120 should be Z+Jets50 should be W+Jets15 should be from H WW
14
or e + e - ? Left: p T >1GeV; right p T >5 GeV 2 apparent tracks pointing to 2 calorimeter objects Zoom reveals 2-pairs e + e - information The conclusion is that the 2 calorimeter objects correspond to 2 photons, which have converted and lead to 4 tracks; the tracks from one pair had less than 3 pixel hits So, to be classified and entered as 14
15
Level-2 observations The applications are all trying to illustrate the analyses and physics in the true context of a detector They use ATLANTIS as a presenter in most cases, which defines the natural common format Other formats would require an additional interface, to what benefit? Use case and resource justification for a common format not clear
16
ATLAS Analysis Roger Jones
17
The Generic Analysis Flow
18
Level 3 ATLAS has no approved level-3 formats for external use, and such release will require such approval We are concerned that anything released be useful, not consume large amounts of collaboration effort (both in production and response) As such, tools like Recast are more attractive The information incorporates the efficiency, acceptances and corrections – so is robust It also helps meet the internal requirement of full documentation of analyses
19
Analysis Practical steps - RECAST Framework developed to extend impact of existing analyses Candidate for within-experiment and long-term analysis archival, encapsulating the full trigger & event selection, data, backgrounds, systematics arXiv:1010.2506 Allow an existing analysis to be reinterpreted under an alternate model hypothesis Complete information from original analysis, including the tacit information, contained in the data Not optimized for the new model, but more reliable than a naïve reanalysis? Recast seen as a very promising solution for preserving analyses and useful, cost effective preservation of information – addresses levels ~1-~3
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.