L2 Upgrade review 19th June 2007Alison Lister, UC Davis1 XFT Monitoring + Error Rates Alison Lister Robin Erbacher, Rob Forrest, Andrew Ivanov, Aron Soha.

Slides:



Advertisements
Similar presentations
NorthGrid status Alessandra Forti Gridpp12 Brunel, 1 February 2005.
Advertisements

Status of the CTP O.Villalobos Baillie University of Birmingham April 23rd 2009.
GCT Software Jim Brooke GCT ESR, 7 th November 2006.
Final Commissioning Plan and Schedule Cheng-Ju Lin Fermilab Installation Readiness Review 09/27/2004 Outline: - Work to be done during shutdown - Work.
In this presentation you will:
June 19, 2002 A Software Skeleton for the Full Front-End Crate Test at BNL Goal: to provide a working data acquisition (DAQ) system for the coming full.
Some answers to the questions/comments at the Heavy/Light presentation of March 17, Outline : 1) Addition of the id quadrant cut (slides 2-3); 2)
MICE Tracker Front End Progress Tracker Data Readout Basics Progress in Increasing Fraction of Muons Tracker Can Record Determination of Recordable Muons.
Summary Ted Liu, FNAL Feb. 9 th, 2005 L2 Pulsar 2rd IRR Review, ICB-2E, video: 82Pulsar
EE694v-Verification-Lect5-1- Lecture 5 - Verification Tools Automation improves the efficiency and reliability of the verification process Some tools,
Trigger and online software Simon George & Reiner Hauser T/DAQ Phase 1 IDR.
CH 13 Server and Network Monitoring. Hands-On Microsoft Windows Server Objectives Understand the importance of server monitoring Monitor server.
The Project AH Computing. Functional Requirements  What the product must do!  Examples attractive welcome screen all options available as clickable.
Presenter: Shant Mandossian EFFECTIVE TESTING OF HEALTHCARE SIMULATION SOFTWARE.
P51UST: Unix and Software Tools Unix and Software Tools (P51UST) Compilers, Interpreters and Debuggers Ruibin Bai (Room AB326) Division of Computer Science.
Chocolate Bar! luqili. Milestone 3 Speed 11% of final mark 7%: path quality and speed –Some cleverness required for full marks –Implement some A* techniques.
Simulation II IE 2030 Lecture 18. Outline: Simulation II Advanced simulation demo Review of concepts from Simulation I How to perform a simulation –concepts:
Joint Commissioning (2) Commissioning plan Gene Flanagan Purdue University.
What to do for a Financial year end And When to do it.
M. Adinolfi – University of Oxford – MAPMT Workshop – Imperial College 27 June Rad-Hard qualification for the LHCb RICH L0 electronics M. Adinolfi.
Lesson13. JavaScript JavaScript is an interpreted language, designed to function within a web browser. It can also be used on the server.
Cluster Finder Report Laura Sartori (INFN Pisa) For the L2Cal Team Chicago, Fermilab, Madrid, Padova, Penn, Pisa, Purdue.
The Functions of Operating Systems Interrupts. Learning Objectives Explain how interrupts are used to obtain processor time. Explain how processing of.
South SG1 Trigger High Rate Sanghwa Park, Katsuro Nakamura, Yoshimitsu Imazu, and Itaru Nakagawa 1.
Introduction Advantages/ disadvantages Code examples Speed Summary Running on the AOD Analysis Platforms 1/11/2007 Andrew Mehta.
RPC DQA but also Monitoring for the DCS group: status and prospective for Marcello Bindi RPC L1MU Barrel DQM - 08/05/2013.
Svtsim status Bill Ashmanskas, CDF simulation meeting, Main authors: Ashmanskas, Belforte, Cerri, Nakaya, Punzi Design goals/features: –main.
17-Aug-00 L.RistoriCDF Trigger Workshop1 SVT: current hardware status CRNowFinal Hit Finders64242 Mergers31616 Sequencers2312 AMboards4624 Hit Buffers21212.
Simple ideas on how to integrate L2CAL and L2XFT ---> food for thoughts Ted May 25th, 2007.
Tracker Timing and ISIS RF Edward Overton 1. At CM32… 2 Had done some preliminary checks on the ISIS RF. Was beginning to think about how to handle the.
1 October 2003Paul Dauncey1 Mechanics components will be complete by end of year To assemble ECAL, they need the VFE boards VFE boards require VFE chips.
Simulation Commissioning, Validation, Data Quality A brain dump to prompt discussion Many points applicable to any of LHCb software but some simulation.
Healthcare Quality Improvement Dr. Nishan Sharma University of Calgary, Canada March
Database Projects in Visual Studio Improving Reliability & Productivity.
Section 06 (a)RDBMS (a) Supplement RDBMS Issues 2 HSQ - DATABASES & SQL And Franchise Colleges By MANSHA NAWAZ.
9/12/99R. Moore1 Level 2 Trigger Software Interface R. Moore, Michigan State University.
BIC Issues Alan Fisher PEP-II Run-4 Post-Mortem Workshop 2004 August 4–5.
Trigger Commissioning Workshop, Fermilab Monica Tecchio Aug. 17, 2000 Level 2 Calorimeter and Level 2 Isolation Trigger Status Report Monica Tecchio University.
MICE CM28 Oct 2010Jean-Sebastien GraulichSlide 1 Detector DAQ o Achievements Since CM27 o DAQ Upgrade o CAM/DAQ integration o Online Software o Trigger.
Online monitor for L2 CAL upgrade Giorgio Cortiana Outline: Hardware Monitoring New Clusters Monitoring
Nov 1, 2002D0 DB Taking Stock1 Trigger Database Status and Plans Elizabeth Gallas – FNAL CD (with recent help from Jeremy Simmons, John Weigand, and Adam.
Pulsar Status For Peter. L2 decision crate L1L1 TRACKTRACK SVTSVT CLUSTERCLUSTER PHOTONPHOTON MUONMUON Magic Bus α CPU Technical requirement: need a FAST.
Controls Zheqiao Geng Oct. 12, Autosave Additions/Upgrades and Experiences at SLAC Zheqiao Geng Controls Department SLAC National Accelerator Laboratory.
“Building Journaling Databases with PostgreSQL” Cybertec Geschwinde & Schoenig Hans-Juergen Schoenig
October Test Beam DAQ. Framework sketch Only DAQs subprograms works during spills Each subprogram produces an output each spill Each dependant subprogram.
L2TS and Plan for Integration Cheng-Ju Lin Fermilab Pulsar Meeting 06/11/2004.
L2toTS Status and Phase-1 Plan and Pulsar S-LINK Data Format Cheng-Ju Lin Fermilab L2 Trigger Upgrade Meeting 03/12/2004.
L2 CAL Status Vadim Rusu For the magnificent L2CAL team.
Joint Commissioning (1) Data size issues Gene Flanagan Purdue University.
LHC CMS Detector Upgrade Project RCT/CTP7 Readout Isobel Ojalvo, U. Wisconsin Level-1 Trigger Meeting June 4, June 2015, Isobel Ojalvo Trigger Meeting:
1 Status of Validation Board, Selection Board and L0DU Patrick Robbe, LAL Orsay, 19 Dec 2006.
Calorimeter global commissioning: progress and plans Patrick Robbe, LAL Orsay & CERN, 25 jun 2008.
Piquet report Pascal, Yuri, Valentin, Tengiz, Miriam Calorimeter meeting 16 March 2011.
News and Related Issues Ted & Kirsten May 27, 2005 TDWG News since last meeting (April 29th) Organization Issues: how to improve communication Future Plans:
November 15, 2005M Jones1 Track Lists in Level 2 Outputs from SLAM provide lists of tracks faster than the current path to SVT and Level 2 via XTRP Sparsify.
1M. Ellis - MICE Tracker PC - 1st October 2007 Station QA Analysis (G4MICE)  Looking at the same data as Hideyuki, but using G4MICE.  Have not yet had.
Evelyn Thomson Ohio State University Page 1 XFT Status CDF Trigger Workshop, 17 August 2000 l XFT Hardware status l XFT Integration tests at B0, including:
Piquet report (07/Aug/ /Aug/2012) D. Pinci.
Calorimeter Cosmics Patrick Robbe, LAL Orsay & CERN, 20 Feb 2008 Olivier, Stephane, Regis, Herve, Anatoly, Stephane, Valentin, Eric, Patrick.
MAUS Status A. Dobbs CM43 29 th October Contents MAUS Overview Infrastructure Geometry and CDB Detector Updates CKOV EMR KL TOF Tracker Global Tracking.
35t Readout Software John Freeman Dune Collaboration Meeting September 3, 2015.
ST Analysis: Introduction M. Needham EPFL. Outline Aims of the meeting Releasing code Performance Monitoring Results (IT and TT): Active fraction, noise.
The Troubleshooting Process. Hardware Maintenance Make sure that the hardware is operating properly.  Check the condition of parts.  Repair or replace.
Debugging Intermittent Issues
Debugging Intermittent Issues
A451 Theory – 7 Programming 7A, B - Algorithms.
L2 CPUs and DAQ Interface: Progress and Timeline
The Troubleshooting theory
XFT2B: Plans and Tasks XFT Workshop FNAL 19 December 2003; p.1.
Presentation transcript:

L2 Upgrade review 19th June 2007Alison Lister, UC Davis1 XFT Monitoring + Error Rates Alison Lister Robin Erbacher, Rob Forrest, Andrew Ivanov, Aron Soha

L2 Upgrade review 19th June 2007Alison Lister, UC Davis2 Monitoring: TrigMon Timeline of TrigMon releases for XFT L2 8th January: First official XFT L2 version of TrigMon running in Control room  Basic bank comparisons between data and data or simulation as well as data integrity checks (e.g. bunch counters) at each stage: Finders, Pulsars 3rd March: Updated monitoring  Better checks for errors in bank comparisons 16th April: Latest version running  Includes timing information for various components of the system  Includes better diagnostics for the “out-of-sync” problems  Includes plot for CO (but not in checklist yet) Further update will probably be required to finalise the checks, add canvas colour when bad, solve minor bugs  To be done once all hardware is finalised

L2 Upgrade review 19th June 2007Alison Lister, UC Davis3 Monitoring: Hardware TrigMon code looks at bank comparisons at different stages  XSFD (L1 path in Finders)  T2FD (L2 path in Finders)  TP2D (Pulsar bank)  XSFD Sim where the simulation starts from the XTCD bank  T2FD Sim where the simulation starts from the XTCD bank TP2D (data) vs T2FD Sim CO plot: Summary of all errors in whole system Other plots used to diagnose what the problem is Finder-space Pulsar-space SlotCrate Board Input Fiber

L2 Upgrade review 19th June 2007Alison Lister, UC Davis4 Monitoring: Hardware TP2D vs T2FD Sim TP2D vs T2FD XSFD vs XSFD Sim: SL7 Run Lumi e30 Error rate: Max 3 errors / finder / 20’000 events 2 event with XSFD error 1 event with T2FD error Bunch Counter mismatches 3 events where TP2D bank structure is corrupt (!= 12 finders in XFT01) Expert canvases used to diagnose problem

L2 Upgrade review 19th June 2007Alison Lister, UC Davis5 Monitoring: Timing Timing information needed for commissioning and debugging of system (“expert only” plots) Timing information is available for most components of the system  Arrival time for each finder packet to the Pulsar Available only when DATA I/O enabled  CPU timing Main one used for tests  L2TS Signal out to TS  Signal back from TS  Unpacking of different inputs  TL2D: Algorithm timing

L2 Upgrade review 19th June 2007Alison Lister, UC Davis6 Monitoring: Timing Lots of work put in by L2XFT-team to reduce timing delays in new PC CPU Timing for parasitic PC with XFT (no algos) vs “old” PC Most events new PC ~1us faster than old one For SVT fast abort ~6us slower Findings independent of which inputs are being read out i.e. same for L2CAL only and XFT+L2CAL

L2 Upgrade review 19th June 2007Alison Lister, UC Davis7 Error Rates When no obvious hardware problems are present (“SEU”) rate  ~ to have an error somewhere in the system  ~ to have an error in a given “finder”package  ~ 1/4 of them are related to problems upstream (also seen in L1)  These problems don’t cause “operational” problems with L2 PC (e.g. data corrupt for 3 errors in a row that causes an HRR)  Could affect outcome of trigger decision for single events  Additional ~1-2 errors per 10’000 events where the bunch counters for the first finders in each crate have a bunch counter mismatch (off by 1) Data agrees with simulation so not an operational problem Hardware failures  “out-of-sync” problem from March- April  Finder problem with L2 (data corrupt)

L2 Upgrade review 19th June 2007Alison Lister, UC Davis8 “Out of Sync” Problem March-April problem: Rate ~once per store on average 1 chip in a Pulsar board gets out of sync wrt other chip and rest of event Stays out of sync until an HRR is issued  HRR now issued from parasitic PC after 3 consecutive problems Not related to pulsar board Problem solved after swapping input fibers Think that 1 of the SL3 fibers might be slightly flakey In current configuration the problem hasn’t returned for ~2 months

L2 Upgrade review 19th June 2007Alison Lister, UC Davis9 Finder L2 Problem Problem occurs when 1 finder contains corrupted data L1 path agrees with simulation L2 path does not agree with L1 Causes operational problems (L2TO) as data gets out of sync (i.e. corrupted data in PC) and stays out of sync until problem is solved or XTC is masked on Seen in TrigMon plots (e.g T2FD vs XSFD) Happened once before in December when not running parasitically  Related to the finder board  Masking on XTC stopped the problem  Swapped out the boards but was never investigated further Happened again end of May (?)  Masking on XTCs allowed us to keep taking data  Unmasked XTCs to narrow down problem and problem did not re-appear  “Flakey” problem :-) Rate seems to be once every ~5 months

L2 Upgrade review 19th June 2007Alison Lister, UC Davis10 Conclusions Monitoring of all stages has been running in TrigMon for almost 6 months Still minor code changes to be implemented for “final” version Error rate for “normal” data taking is manageable Diagnostic tools for hardware failures have shown to be useful in quickly finding and solving problems

L2 Upgrade review 19th June 2007Alison Lister, UC Davis11 Backup

L2 Upgrade review 19th June 2007Alison Lister, UC Davis12 Finder L2 Problem My personal suggestion for “what to do if this problem happens again” Diagnostics  Many L2TO causing data to no longer be taken L2 will be paged  TrigMon plots should show some sign of which Finder is cause Not necessarily the case as TrigMon only runs on selected trigger path and if no data is being taken won’t appear recorded data..  L2 PC logfile should show which XFT Pulsar is corrupted  Dump Pulsar banks to narrow it down to 1 finder XFT should be paged Solving the problem  Trial and error of masking on XTCs (as can’t mask all of them at once) until problem goes away Need to understand where the problem is  Problem needs to happen again before can investigate further  If finder hardware problem we will need help diagnosing Current statistics tell us this might happen every ~ 5 months