Slide 1: Preparations for the Software “Review”
Paul Dauncey, 19 July 2007

Slide 2: Aims
- Will hold a software “review” in early autumn
  - Half or full day meeting; maybe at DESY or CERN?
- The review will be of our software “implementation model”, which is not yet fully defined
  - Hence we have to decide what we want beforehand; the review itself should just be a sign-off of the model
  - Not intended to be an a posteriori justification of what exists, although many elements will presumably be preserved
- Need to define the software model in the next few months
  - Must get opinions beforehand to develop the model in time for the review
  - The review itself would be a detailed presentation of the model and an evaluation of the work needed to implement it
- Each subsystem should be expected to contribute
  - Ideas on the model itself before the review
  - Details of how their code fits (or not) into the model, at the review
  - Implementation after the review; don’t expect Roman to do it all
- This is the first attempt to get input
  - It covers only some of the issues for analysis and reconstruction

Slide 3: Definitions
- “Reconstruction” = the process of producing the reco files from the raw data files
  - In bulk, usually done centrally by expert(s)
  - Semi-experts contribute code
  - Some user studies are done on raw data (e.g. calibrations); those are also considered to be reconstruction for this talk, as their output is used there
- “Digitisation” = conversion of the SimXxxHits in Mokka files to something which can be used by reconstruction
  - Usage and comments as for reconstruction
  - Usually run as part of the reconstruction jobs for MC events
- “Analysis” = studies done on reco files
  - Usually done by semi- or non-experts

Slide 4: Assumptions
- LCIO will continue to be used throughout the offline software
  - Significant experience with it
- Analysis work will not normally be CPU limited
  - Most analyses use a subset of the total data sample
  - Can be relatively sophisticated in analysis techniques
- Reconstruction does not take a long time if automated and done centrally on the Grid
  - Don’t need to worry about updating reco files when new reconstruction code is released; just redo from scratch
- We are aiming for:
  - Most of reconstruction and all of analysis to be uniform across DESY/CERN/FNAL
  - Ditto for data/MC

Slide 5: Critical choice #1
- Reconstruction clearly must use the database, but should analysis “usually” not require database access?
  - Experts (and semi-experts) say it is simple to use
  - The issue seems to be getting started; once set up, things seem to keep working
  - But experience shows many users put off the step of learning to use it
- Consider (extreme cases of?) two models here:
  - Maximum database: all users should expect to use the database for all but the most trivial analyses, and should be able to do relatively sophisticated operations
  - Zero database: reco files should contain enough infrastructure that no database access is required for any analysis
  - The optimum may be somewhere in between
- For both cases, all values (e.g. beam energy) should be accessed in the same way for data from all locations and for both data and MC; a sketch of such a uniform accessor follows below
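As a rough illustration (not existing CALICE code), the uniform-access requirement could be captured by a single interface with two backends; the class and method names here are invented:

```cpp
#include "EVENT/LCEvent.h"

// Hypothetical uniform accessor: analysis code asks for e.g. the beam
// energy in the same way for DESY/CERN/FNAL and for data/MC. Only the
// backend differs: a maximum-database implementation queries the
// conditions database for the event's run, while a zero-database
// implementation reads a constants collection stored in the reco file.
class RunInfo {
public:
  virtual double beamEnergy(const EVENT::LCEvent& evt) const = 0;
  virtual ~RunInfo() {}
};
```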

Slide 6: Critical choice #2
- Systematic studies will be needed
  - Dependence on calibration constants, thresholds, non-linearities, noise model, track resolution, etc.
- Very little has been done in this area; an obvious missing part of the LCWS submission
  - Unclear why; lack of time or technical difficulty?
- How should this be done in the future?
  - The cleanest way is to change a parameter and see the effect on the result
  - This implies rerunning parts of the reconstruction code
- Should analysers rerun (parts of) digitisation and reconstruction themselves as part of their analysis?
  - Makes systematic studies much more focussed
  - But we need to be confident that this has been done correctly
- Otherwise, we need centrally produced files with all possible reconstruction variations: ×N, where N is large
  - More efficient overall for CPU
  - But takes a large amount of central diskspace

Slide 7: Critical choice #2 (cont)
- The second major choice is then to:
  - Enable users to do this themselves
  - Do it centrally
  - Not do it at all
- Investigate here whether the user option of rerunning reconstruction can be supported by a software model
  - Firstly, we must be sure the original result can be duplicated if no constants are changed; an essential crosscheck
  - Secondly, we must be able to change values easily and efficiently
- Would this be done from raw or reco files?
  - Raw files will definitely work, but require rerunning the whole reconstruction (and digitisation if MC) every time
  - Reco files would need careful planning to be sure the data needed by all reconstruction modules is included
- The way this could be done depends strongly on whether we assume the maximum or zero database model

Slide 8: Analysis

Slide 9: Maximum database model
- Once database access is made a requirement for analysis, it should be used as much as possible:
  - All run information (beam energy, run type, data/MC flag, location, etc.) from the database
  - Constants used in reconstruction, including those varied for systematics checks
  - Geometry values; some are already in the LCIO hits, but e.g. the front plane of the ECAL for track extrapolation is not trivial to get
- There is a general issue about using a “conditions” database for “configuration” data
  - “Conditions” data is based on time; e.g. what is the temperature at noon?
  - “Configuration” data is based on run structure; e.g. what beam energy was used for this run?
  - One can be shoehorned into the other, but would we be better off with two separate databases for the different data types? Would this be less error-prone for non-experts? A sketch of the two access patterns follows below
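To make the distinction concrete, here is a minimal sketch of the two access patterns; the struct names and signatures are invented for illustration and are not an existing API:

```cpp
#include <string>

// "Conditions" data is keyed by time, e.g. the temperature at noon.
struct ConditionsDB {
  virtual double valueAt(long long unixTime, const std::string& key) = 0;
  virtual ~ConditionsDB() {}
};

// "Configuration" data is keyed by run, e.g. the beam energy of a run.
struct ConfigurationDB {
  virtual double valueFor(int runNumber, const std::string& key) = 0;
  virtual ~ConfigurationDB() {}
};

// Shoehorning configuration into a conditions database means mapping
// each run to its start/stop times and querying by time, which is
// exactly the step where non-experts can go wrong.
```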

Slide 10: Maximum database model (cont)
- Systematics studies require significant database interactions
  - These would need to be done by the users
- To duplicate a reconstruction, we need to be sure to read the database as it was at the time of the original reconstruction; a sketch of such an “as-of” lookup follows below
  - There may have been updates since, which must be ignored
  - Requires reco files to hold the processing-time information, and the database to be easily reset to that state
  - This must work even for “privately” reconstructed files, where the database tag cannot be guaranteed
- To modify constants and rerun reconstruction:
  - Would need to (temporarily) load the changed values into the database beforehand, so these updates must be included
  - However, all other constants must be kept at the values used originally, hence all other updates must be ignored
- Is this level of database manipulation feasible for non-experts?
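One way to picture the “reset to the original state” requirement is an as-of query, sketched below with invented names; every lookup carries the processing timestamp stored in the reco file, so later updates become invisible:

```cpp
#include <string>

// Hypothetical "as-of" lookup for duplicating the original
// reconstruction: the query returns the value as the database stood at
// 'asOfTime' (the processing time stored in the reco file), so any
// updates committed after that time are ignored.
struct AsOfDB {
  virtual double value(int runNumber, const std::string& key,
                       long long asOfTime) = 0;
  virtual ~AsOfDB() {}
};
```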

Slide 11: Zero database model
- Reco files must contain all the information needed
  - Run information, constants, geometry, etc.
  - Effectively, the reco files must be self-contained for analysis
- This means the reconstruction constants must be copied into the reco files
  - Technically easy, as both are in LCIO format
  - Would be included in the next LCEvent after they have changed, and always for at least the first event in the run, so that a single-run analysis job gets the values
- Modification of the constants by users is then easy; a sketch of the analysis side follows below
  - They are LCIO objects and so are handled in normal C++ as the job is running
  - There are no issues with constants having been updated since the reconstruction was done, as the values in the file cannot be modified
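A minimal sketch of that analysis side, using the standard LCIO C++ API; the collection name “EcalConstants” and the meaning of index 0 are assumptions for illustration:

```cpp
#include "EVENT/LCEvent.h"
#include "EVENT/LCCollection.h"
#include "EVENT/LCGenericObject.h"

// Zero-database model: constants live in the reco file as
// LCGenericObjects and are read back like any other collection.
float pedestalFromFile(EVENT::LCEvent* evt) {
  // Throws EVENT::DataNotAvailableException if the collection is absent.
  EVENT::LCCollection* col = evt->getCollection("EcalConstants");
  EVENT::LCGenericObject* constants =
      dynamic_cast<EVENT::LCGenericObject*>(col->getElementAt(0));
  // For a systematics study the user can work with a shifted copy of
  // this value in memory; the file itself is never modified.
  return constants->getFloatVal(0);
}
```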

Slide 12: Reconstruction

Slide 13: General comments
- We want reconstruction to be rerunnable by non-experts
  - It needs to be foolproof, which would require substantial cleaning up:
  - Remove all steering-file parameters; values must come from the database, or they cannot be guaranteed to be reproducible
  - Make the steering files common to DESY/CERN/FNAL and to data/MC; this would also make the original production less error-prone
  - Requires the database folder to be derived from the run number, etc., after the raw file is read, not hardcoded in the steering file (as is almost universally done now); a sketch follows below
  - Must remove/overwrite module output if it already exists, or else downstream modules will not process the changed data; again, technically tricky with LCIO, but it can be done
- Must be made as common for data and MC as possible
  - Merge data and MC into a common format as early in the chain as possible
  - Reduces the probability of artificial differences
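As an illustration of deriving the folder from the data rather than the steering file, a sketch follows; the folder names and run-number ranges below are entirely invented:

```cpp
#include <string>

// Illustrative only: choose the conditions-database folder from the run
// number read out of the raw file, instead of hardcoding it in the
// steering file. The folder names and run ranges are invented.
std::string dbFolderForRun(int runNumber) {
  if (runNumber < 230000) return "/cd_calice/Desy";  // hypothetical range
  if (runNumber < 300000) return "/cd_calice/Cern";  // hypothetical range
  return "/cd_calice/Fnal";                          // hypothetical range
}
```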

Slide 14: Zero database model
- Constants need to be in the files
  - By changing these (for systematics studies), the reconstruction modules must see the changes
  - Therefore, all reco modules which do processing work must use these constants and not access the database
- Need a “database handler” module for each subsystem (or one overall?); a sketch follows below
  - Pulls constants from the database and puts them into the file
  - Only done when there has been a change; this generally implies the constants go into the LCEvent, not the LCRunHeader
- All constants (including selection or reconstruction cuts) must be included, and not specified through steering files
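A sketch of what such a handler might look like as a Marlin processor; the change detection and database lookup are placeholders, and the collection name is an assumption:

```cpp
#include "marlin/Processor.h"
#include "EVENT/LCIO.h"
#include "IMPL/LCCollectionVec.h"
#include "IMPL/LCGenericObjectImpl.h"

// Sketch of a per-subsystem "database handler": the only module that
// talks to the database. It copies constants into the LCEvent whenever
// they change, so downstream reco modules never need database access.
class EcalDbHandler : public marlin::Processor {
public:
  EcalDbHandler() : marlin::Processor("EcalDbHandler"), _lastRun(-1) {}
  marlin::Processor* newProcessor() { return new EcalDbHandler; }

  void processEvent(EVENT::LCEvent* evt) {
    if (!constantsChanged(evt->getRunNumber())) return;
    IMPL::LCCollectionVec* col =
        new IMPL::LCCollectionVec(EVENT::LCIO::LCGENERICOBJECT);
    IMPL::LCGenericObjectImpl* obj = new IMPL::LCGenericObjectImpl();
    obj->setFloatVal(0, lookupPedestal(evt->getRunNumber()));
    col->addElement(obj);
    evt->addCollection(col, "EcalConstants");  // assumed collection name
    _lastRun = evt->getRunNumber();
  }

private:
  // Placeholders standing in for real conditions-database calls.
  bool constantsChanged(int run) { return run != _lastRun; }
  float lookupPedestal(int /*run*/) { return 0.f; }
  int _lastRun;
};
```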

Slide 15: Reconstruction comparison
[Diagram contrasting the “Maximum Database” and “Zero Database” reconstruction chains: in the zero-database chain a DbHandler module passes constants through the LCEvent to the reco modules, and only the reco modules need to be rerun for systematics.]

Slide 16: Digitisation

Slide 17: How tied to reconstruction?
- How does the database know which reco constants to apply to an MC file?
  - Does each simulation file need a unique association to a real run number?
  - Where and how should this be defined?
  - Can it work for user-generated MC events?
- How do run-dependent parameters in Mokka get set?
  - E.g. beam spot size and divergence
  - What happens for MC runs which do not correspond to real data?
  - What about runs where layers are missing in the real data but not in the MC?
- For systematics, we do not want random-number fluctuations to mask shifts
  - Must seed the random generator reproducibly
  - As previous modules may not be run, each module must seed independently
  - If results are really to be reproducible, modules must reseed for every event; a sketch follows below
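A minimal sketch of per-event, per-module reseeding; the hashing scheme and module-name tag are invented, and std::mt19937 is just an example generator:

```cpp
#include <cstdint>
#include <random>
#include <string>

// Hypothetical helper: build a reproducible seed from the run number,
// event number, and a per-module tag, so every digitisation module can
// reseed independently at the start of each event.
inline std::uint32_t digiSeed(int runNumber, int eventNumber,
                              const std::string& moduleName) {
  // FNV-1a hash of the module name decorrelates different modules.
  std::uint32_t h = 2166136261u;
  for (std::string::size_type i = 0; i < moduleName.size(); ++i) {
    h ^= static_cast<unsigned char>(moduleName[i]);
    h *= 16777619u;
  }
  // Mix in the run and event numbers with two odd multipliers.
  h ^= static_cast<std::uint32_t>(runNumber) * 2654435761u;
  h ^= static_cast<std::uint32_t>(eventNumber) * 2246822519u;
  return h;
}

// Usage at the start of each event in a digitisation module:
//   std::mt19937 rng(digiSeed(run, event, "EcalDigitiser"));
```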

Slide 18: Global detector studies
- How do we optimise our contribution to the concept groups? We need to move more in this direction
- Use the common (Mokka) simulation?
  - This is fine for LDC (and hence probably GLDC)
  - For SiD, there is an implementation of their detector, but using this may have less impact in concept meetings
- Use the concept groups’ “native” simulations?
  - Requires two independent implementations; are they guaranteed to be equivalent to each other and to the beam test results?
- This may become more critical as detector concepts move to collaborations…

Slide 19: Summary
- This is just a first look at some of the issues
- Need input on what people expect/want to do with the data
- I am not sure of the process for agreeing on a model at the end…