Luminosity, detector status and trigger - conditions database and meta-data issues
Richard Hawkings, 3rd November 2006

- How we might apply the conditions DB to some of the requirements in:
  - Luminosity task force report
  - Run structure report
  - Meta-data task force report (draft)
  - Data preparation/data quality discussions
- This talk:
  - Reminder of conditions DB concepts relevant here
  - Proposal for storage of luminosity, status and trigger information in CondDB
  - Relation to TAG database, data flow through system
  - Other meta-data related comments
- For more in-depth discussion, see the document attached to the agenda page
ATLAS luminosity TF workshop, 3/11/06

Conditions DB - basic concepts
- COOL-based database architecture - data with an interval of validity (IOV)
  [Diagram: Application -> COOL (C++, python APIs, specific data model) -> Relational Abstraction Layer (CORAL; SQL-like C++ API) -> back ends: Oracle DB (online, Tier 0/1), MySQL DB (small DB replicas), SQLite file (file-based subsets), Frontier web (http-based proxy/cache)]
  [Table sketch, marked ‘Indexed’: IOV start | IOV stop | channel1 (tag1) | payload1, and likewise for channel2 (tag2), payload2]
- COOL IOV (63 bit) can be interpreted as:
  - Absolute timestamp (e.g. for DCS)
  - Run/event number (e.g. for calibration)
  - Run/LB number (possible to implement)
- COOL payload defined per ‘folder’
  - Tuple of simple types, i.e. 1 DB table row
  - Can also be a reference to external data
- Use channels (int, soon string) for multiple instances of data in 1 folder
- COOL tags allow multiple data versions
- COOL folders organised in a hierarchy
- Athena interfaces, replication, …
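To make the IOV idea concrete, here is a minimal sketch in plain Python. It does not use the COOL API; `ToyFolder`, `pack_key` and `unpack_key` are hypothetical names, and the bit layout (run in the upper bits, event or LB number in the lower 32) is an assumption chosen so that the packed key fits in 63 bits. Tags and overlap checks are left out for brevity.

```python
import bisect

RUN_SHIFT = 32                       # assumption: run in upper bits, event/LB in lower 32
SUB_MASK = (1 << RUN_SHIFT) - 1


def pack_key(run, sub):
    """Pack (run, event-or-LB number) into a single 63-bit validity key."""
    if not (0 <= run < (1 << 31) and 0 <= sub <= SUB_MASK):
        raise ValueError("run must fit in 31 bits and sub in 32 bits")
    return (run << RUN_SHIFT) | sub


def unpack_key(key):
    """Inverse of pack_key: return (run, sub)."""
    return key >> RUN_SHIFT, key & SUB_MASK


class ToyFolder:
    """Toy stand-in for one folder: payloads stored per channel with [since, until)."""

    def __init__(self):
        self._rows = {}              # channel -> list of (since, until, payload)

    def store(self, since, until, channel, payload):
        if since >= until:
            raise ValueError("IOV must have since < until")
        rows = self._rows.setdefault(channel, [])
        rows.append((since, until, payload))
        rows.sort(key=lambda row: row[0])   # keep rows ordered by IOV start

    def find(self, key, channel):
        """Return the payload whose IOV contains key, or None if there is none."""
        rows = self._rows.get(channel, [])
        starts = [row[0] for row in rows]
        i = bisect.bisect_right(starts, key) - 1
        if i >= 0 and rows[i][0] <= key < rows[i][1]:
            return rows[i][2]
        return None
```

The same lookup works unchanged whether the key is read as a timestamp, a run/event pair or a run/LB pair, which is the point of keeping the IOV a single integer.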

Storage of luminosity block information in COOL
- Luminosity block information from the online system
  - Start/end event number and timestamps per LB, {livetimes, prescales}/trigger chain
- How might this look in COOL - an example structure (RE = run/event, T = timestamp, RLB = run/LB):
  - /TDAQ/LUMI/LBRUN - LB indexed by run/event
      RE start | RE stop | LB value
  - /TDAQ/LUMI/LBTIME - LB indexed by timestamp
      T start | T stop | LB value
  - /TDAQ/LUMI/LBLB - LB information (start/stop event, time span) indexed by RLB
      RLB start | RLB stop | event start | event stop | T start | T stop | other data …
  - /TDAQ/LUMI/TRIGGERCHAIN - trigger chain info identified by channel, indexed by RLB
      RLB start | RLB stop | channel = trigger chain | livetime | L1 prescale | HLT prescale | other data …
  - /TDAQ/LUMI/ESTIMATES - luminosity estimates versioned and indexed by RLB
      RLB start | RLB stop | channel = lumi estimate | tag = version | lumi value | uncertainty | other data …
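As an illustration of how such tables might be used together, the sketch below looks up the LB of an event from an LBRUN-like table and sums a prescale- and livetime-corrected luminosity over a set of LBs. All names and numbers are hypothetical. The correction formula (luminosity times live fraction, divided by the product of L1 and HLT prescales) is an assumption made for the example, not something the proposal specifies.

```python
from bisect import bisect_right

# Hypothetical rows of an LBRUN-like table for one run: (first_event, last_event, lb)
LB_BY_EVENT = [(0, 999, 1), (1000, 2499, 2), (2500, 3999, 3)]


def lb_for_event(event, table=LB_BY_EVENT):
    """Return the LB whose event range contains event, or None if none does."""
    firsts = [row[0] for row in table]
    i = bisect_right(firsts, event) - 1
    if i >= 0 and table[i][0] <= event <= table[i][1]:
        return table[i][2]
    return None


def effective_luminosity(lbs, lumi_by_lb, chain_by_lb):
    """Sum per-LB luminosity corrected for live fraction and prescales.

    lumi_by_lb  : {lb: luminosity estimate}         (ESTIMATES-like folder)
    chain_by_lb : {lb: {"live_fraction", "l1_prescale", "hlt_prescale"}}
                  for one trigger chain             (TRIGGERCHAIN-like folder)
    The correction used here is an assumption for illustration.
    """
    total = 0.0
    for lb in lbs:
        info = chain_by_lb[lb]
        prescale = info["l1_prescale"] * info["hlt_prescale"]
        total += lumi_by_lb[lb] * info["live_fraction"] / prescale
    return total
```

Keeping the per-LB estimate and the per-chain corrections in separate folders means either can be updated (or re-tagged) without touching the other.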

Storage of detector status information in COOL
- Detector status from DCS - many channels, many folders; to be merged:
    T start | T stop | Chan1 | TRT HV chan1
    T start | T stop | Chan2 | TRT HV chan2
    T start | T stop | Chan1 | Temp, gas property
    T start | T stop | Chan2 | Temp, gas property
  - Merge process combines folders, channels, derives set of IOVs for summary
  - Involves ‘ANDing’ status over all channels, splitting/merging IOVs -> tool?
- Similar activity for data indexed by run-event … have to correlate this somehow
- Final summary derived first as function of run-event (combining all information)
- Then map status changes to luminosity block boundaries (using LB tables)
  - Status in an LB is defined as the status of the ‘worst’ event in the LB
- Summary folders (one channel per detector/physics), each with columns
  start | stop | Chan=TRT | Tag=pass1 | Traffic light | Efficiency | Thrust | Bad-list:
  - /GLOBAL/STATUS/TISUMM - summary info indexed by timestamp (T start, T stop)
  - /GLOBAL/STATUS/RESUMM - summary info indexed by run/evt (RE start, RE stop)
  - /GLOBAL/STATUS/LBSUMM - summary info indexed by RLB (RLB start, RLB stop)
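The merge step could be sketched as follows. This is one possible reading of ‘ANDing’ with IOV splitting and merging, not an existing tool: statuses are integer traffic-light levels, the combined status is the minimum over channels, and any stretch where a channel has no data is treated as RED. The function names, the status encoding and the missing-data rule are all assumptions.

```python
RED, YELLOW, GREEN = 0, 1, 2         # assumed encoding: lower value is worse


def merge_channels(channels):
    """Combine per-channel IOV lists into one summary list.

    channels: list of lists of (start, stop, status), intervals half-open [start, stop).
    Splits at every IOV boundary, takes the worst status in each piece,
    then merges neighbouring pieces that share a status.
    """
    edges = sorted({t for iovs in channels for (a, b, _) in iovs for t in (a, b)})
    merged = []
    for lo, hi in zip(edges, edges[1:]):
        statuses = []
        for iovs in channels:
            found = next((s for (a, b, s) in iovs if a <= lo and hi <= b), None)
            statuses.append(RED if found is None else found)  # assumption: no data = RED
        worst = min(statuses)
        if merged and merged[-1][1] == lo and merged[-1][2] == worst:
            merged[-1] = (merged[-1][0], hi, worst)           # extend previous IOV
        else:
            merged.append((lo, hi, worst))
    return merged


def status_per_lb(summary, lb_bounds):
    """Map summary IOVs onto LBs: an LB gets the worst status overlapping it.

    lb_bounds: list of (lb, start, stop), half-open, in the same units as summary.
    """
    result = {}
    for lb, start, stop in lb_bounds:
        overlapping = [s for (a, b, s) in summary if a < stop and start < b]
        result[lb] = min(overlapping) if overlapping else RED
    return result
```

For example, with an HV channel that is RED between 10 and 20 and a gas channel that turns YELLOW at 15, the summary becomes GREEN on [0, 10), RED on [10, 20) and YELLOW on [20, 30); any LB touching [10, 20) is then flagged RED.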

Storage of trigger information in COOL
- Source for trigger setup information is the trigger configuration database
  - Complex relational database - complete trigger configuration accessed by key
  - Store trigger configuration used for each run
      /TDAQ/TRIGGER/CONFIG - trigger configuration (run/event key, really spanning complete runs)
      RE start | RE stop | Trigger config key | other data …
  - LVL1 prescales may change per LB - stored in /TDAQ/LUMI/TRIGGERCHAIN
- In principle this is enough, but it is hard to access the trigger config DB ‘everywhere’
  - Copy basic information needed for analysis/selection to condDB: ‘configured triggers’ (one channel per chain), indexed by run(/event)
      RE start | RE stop | channel = trigger chain | Enabled? | other data (chain ctr?)
- Other information needed offline: efficiencies
  - Filled in offline, probably valid for IOVs much longer than a run
      /TDAQ/TRIGGEREFI - efficiency info (one channel per chain, versioned), indexed by run(/event)
      RE start | RE stop | channel = trigger chain | tag = version | Efficiency | other data …

Relations to the TAG database
- TAG database contains event level ‘summary’ quantities
  - For quickly evaluating selections, producing event collections (lists) for detailed analysis of a subsample of AOD, ESD, etc.
- Need luminosity block and detector status information to make useful queries
  - ‘Give me list of events with 2 electrons, 3 jets, from lumiblocks with good calo and tracking status and where the e25i and 2e15i triggers were active’
- Various ways to make this information available in TAGs
  1. Put all LB, status and trigger information in every event: make it a TAG attribute
     - Wasteful of space, makes it difficult to update e.g. status information afterwards
     - Hard to answer non-event-oriented questions (‘give me list of LBs satisfying condition’)
  2. Store just the (run,LB) number of each event in TAGs, have auxiliary table(s) containing LB and run-level information
     - TAG database does internal joins to answer a query
     - Need to regularly ‘publish’ new (versioned) status information from COOL to TAGs
  3. Have TAG queries get LB/status/trigger info from COOL on each query
     - Technically tricky, would have to go ‘underneath’ COOL API (or not use COOL at all)
- Solution 2 seems to be the best … try it?
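A minimal sketch of solution 2, using an in-memory SQLite database as a stand-in for the TAG database. The table and column names (`events`, `lb_info`, `status_tag`, `calo_ok`, and so on) are hypothetical, and only one trigger (e25i) is modelled to keep the example short. Each event row carries just (run, lb); status and trigger flags live in an auxiliary per-LB table that is versioned by a status tag.

```python
import sqlite3


def build_demo_db():
    """Create a toy TAG-like database with an event table and a per-LB table."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE events (run INT, lb INT, event INT, n_ele INT, n_jet INT);
        CREATE TABLE lb_info (run INT, lb INT, status_tag TEXT,
                              calo_ok INT, tracking_ok INT, e25i_active INT,
                              PRIMARY KEY (run, lb, status_tag));
    """)
    con.executemany("INSERT INTO events VALUES (?,?,?,?,?)", [
        (100, 1, 11, 2, 3), (100, 1, 12, 1, 3),
        (100, 2, 21, 2, 3), (100, 3, 31, 2, 3),
    ])
    con.executemany("INSERT INTO lb_info VALUES (?,?,?,?,?,?)", [
        (100, 1, "pass1", 1, 1, 1), (100, 2, "pass1", 0, 1, 1), (100, 3, "pass1", 1, 1, 1),
        # a later 'publish' of refined status: LB 2 recovered, LB 3 now bad
        (100, 1, "pass1a", 1, 1, 1), (100, 2, "pass1a", 1, 1, 1), (100, 3, "pass1a", 1, 0, 1),
    ])
    return con


def select_events(con, status_tag):
    """Events with 2 electrons and 3 jets from LBs that are good under status_tag."""
    query = """
        SELECT e.run, e.lb, e.event
        FROM events AS e
        JOIN lb_info AS s ON s.run = e.run AND s.lb = e.lb
        WHERE s.status_tag = ?
          AND s.calo_ok = 1 AND s.tracking_ok = 1 AND s.e25i_active = 1
          AND e.n_ele = 2 AND e.n_jet = 3
        ORDER BY e.run, e.lb, e.event
    """
    return con.execute(query, (status_tag,)).fetchall()


def good_lbs(con, status_tag):
    """Non-event-oriented question: which LBs satisfy the status condition?"""
    query = """
        SELECT run, lb FROM lb_info
        WHERE status_tag = ? AND calo_ok = 1 AND tracking_ok = 1
        ORDER BY run, lb
    """
    return con.execute(query, (status_tag,)).fetchall()
```

Publishing a new status version only adds rows to the per-LB table; the event rows are untouched, and the same query with a different tag can return a different event list.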

Data flow aspects
- Walk through the information flow from online to analysis
- Online data-taking: luminosity, trigger, and ‘primary’ data quality written in COOL
- Calibration processing: detector status information is processed to produce first summary status information
  - Put this in COOL summary folders (tagged ‘pass1’); map to LB boundaries
- Bulk reconstruction: process data, produce TAGs
  - Detector quality information (‘pass1’) could be written to AODs and TAGs (per event)
  - Upload LB/run level information from COOL to TAG DB at same time as TAG event data upload … users can now make ‘quality/LB aware’ queries on TAGs
- Refining data quality: subdetector experts look at pass1 reconstructed data, reflect, refine data quality information, enter it into COOL (‘pass1a’ tag)
  - At some point, intermediate quality information can be ‘published’ to TAG DB
  - Users can do new ‘pass1a’ TAG queries (LBs/events may come or go from selection)
  - This can be done before a new processing of the ESD or AOD is done
- Estimating luminosity: lumi experts estimate luminosities, fill in COOL
  - Export this info to TAGs, allow luminosity calculations directly from TAG queries?
- Re-reconstruction: new data quality info ‘pass2’ in COOL, new AOD, new TAGs

A few comments
- Not all analyses will start from TAG DB and resulting event collection
  - Maybe just a list of files/datasets - need access to status/LB/trigger chain information in Athena
  - Make Athena IOVSvc match conditions info on RLB as well as run/event & timestamp
- AOD (and even TAG) can have detector status stored event-by-event
  - Allows vetoing of bad-quality/bad-lumi block events even without Cond DB access
  - With Cond DB access, can make use of updated (e.g. pass1a) status
    - Overriding detector status stored in AOD files
    - But Cond DB access may be slow for sparse events - no caching (need to test)
- Hybrid data selection scheme could also be supported:
  - Use TAG database to make a ‘data quality/trigger chain selection’ and output a list of good luminosity blocks
  - Feed this into Athena jobs running over a list of files - veto any event from an LB not in the list
- Maintaining ability to do detector quality selection without LBs implies:
  - Correlation of event numbers with timestamps for each event (event index files?)
  - Storing detector status info per event in TAG DB (difficult to do ‘pass1a’ update)
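The hybrid scheme and the status override could look roughly like this in plain Python. Events are represented as dictionaries with hypothetical keys (`run`, `lb`, `stored_status`), and the function names are invented for the example; this is not Athena code.

```python
def select_events(events, good_lbs):
    """Keep only events whose (run, lb) appears in the good-LB list.

    good_lbs would come from a TAG database query; events from a list of files.
    """
    good = set(good_lbs)
    return [ev for ev in events if (ev["run"], ev["lb"]) in good]


def effective_status(event, updated_status=None):
    """Status to use for an event.

    Falls back to the status stored with the event (as in an AOD) when no
    updated status (e.g. a 'pass1a' lookup keyed by (run, lb)) is available.
    """
    key = (event["run"], event["lb"])
    if updated_status is not None and key in updated_status:
        return updated_status[key]      # updated status overrides the stored one
    return event["stored_status"]
```

Because the veto needs only the (run, LB) pair of each event, the job itself does not have to contact the conditions database when a good-LB list is supplied.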

Comments on other meta-data issues
- Luminosity TF requires the ability to know which LBs are in a file without access to the file
  - In case we lose / are unable to access the file in our analysis
  - Implies need for file level metadata - on a scale of millions of files…
  - Who does this - DDM? AMI? New database? Should not be conditions DB?
- Definition of datasets
  - The process by which files make it from online at SFOs to offline in catalogued datasets needs more definition
  - What datasets are made for the RAW data?
    - By run, by stream, by SFO? What metadata will be stored?
    - Datasets defined in AMI and DDM? Files catalogued in DDM?
  - What role would AMI play in selection of real data for an analysis? C.f. TAG DB?
  - What about ESD and AOD datasets - per run? per stream?
  - What about datasets defined for the RAW/ESD sent to each Tier-1?
    - The RAW/ESD dataset for each run will never exist on a single site?

Possible next steps
- If this looks like a good direction to go in … some possible steps
- Set up the suggested structures in COOL
  - Look at filling them, e.g. with data from the streaming tests
  - Explore size and scalability issues
- In Athena …
  - Set up some access service and data structures to use the data
    - E.g. for status information, stored in condDB and/or AOD, accessible from either with the same interface
  - Make Athena IOVSvc ‘LB aware’
  - Look at speed issues - e.g. penalties for accessing status information from CondDB in every event in sparse data
- Work closely with efforts on luminosity / detector status in TAG database
  - First discussions on that (in context of streaming tests) have taken place this week

