CDF Offline Operations


Status:
  Rerun zee validation sample for 5.1.1. No discrepancies found, as expected.
  Checked farm crashes. Reproduced 3 crashes:

CdfTrack.cc (in prewrite):

    if (_siHits.size() > 0) {
      CdfTrackHits* storedSvxHits;
      storedSvxHits = new CdfTrackHits;
      for (SiHitIterator ihit = beginSIHits(); ihit != endSIHits(); ++ihit) {
        int packed = ((*ihit)->id() & 0x1FFFFFFF) | (((*ihit)->getAmbIndex() & 0x7) << 29);   ← crash (ihit != 0x0)
        storedSvxHits->accumulate(packed);
      }

  → Matt and Chris
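For reference, the packing expression in the CdfTrack.cc excerpt puts the hit id in the low 29 bits of a word and the ambiguity index in the top 3 bits. A minimal standalone sketch of that layout follows. `Hit`, `packHit`, `unpackId` and `unpackAmbIndex` are hypothetical names, not CDF code, and the sketch uses unsigned 32-bit arithmetic (the slide uses `int`) so that the shift into bit 31 is well defined. It illustrates the bit layout only and says nothing about the cause of the crash.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the hit type: only the two accessors
// that appear in the slide's expression.
struct Hit {
    std::uint32_t rawId;
    std::uint32_t ambIndex;
    std::uint32_t id() const { return rawId; }
    std::uint32_t getAmbIndex() const { return ambIndex; }
};

// Low 29 bits: hit id. Top 3 bits: ambiguity index.
std::uint32_t packHit(const Hit* hit) {
    assert(hit != nullptr);  // precondition of this sketch
    return (hit->id() & 0x1FFFFFFFu) | ((hit->getAmbIndex() & 0x7u) << 29);
}

// Inverse operations, to make the layout explicit.
std::uint32_t unpackId(std::uint32_t packed) { return packed & 0x1FFFFFFFu; }
std::uint32_t unpackAmbIndex(std::uint32_t packed) { return packed >> 29; }
```

Ids wider than 29 bits and ambiguity indices above 7 are silently truncated by the masks, as in the original expression.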

Crashes

Mark Fischler (needs help with debugging)

The location in ELextendedID is

    basic_string& operator=(const basic_string& str);
    basic_string& operator=(const charT* s) {return assign( s,traits::length(s) );}   ← crash
    basic_string& operator=(charT c) {return assign( size_type(1), c );}

and in ErrorObj::clear() is

    mySerial = 0;
    myXid.clear();   ← crash
    myIdOverflow = "";

Mark Fischler (needs help with debugging)

    0x8fa13bd in SiStripCorrectorManager::correctStripSet (this=0xcd5b338,stripSet=0xe392094)
      at /home/cdfsoft/dist/packages/SvxDaqObjects/V00-00-74/src/SiStripCorrectorManager.cc:62

  → Matt (fixed)

Valgrind

Run valgrind over the other crashes:

Other: (Matt & Jason)

    ==18449== Conditional jump or move depends on uninitialised value(s)
    ==18449==    at 0x420A6879: __mktime_internal (in /lib/i686/libc-2.2.5.so)
    ==18449==    by 0x420A6EBE: timelocal (in /lib/i686/libc-2.2.5.so)
    ==18449==    by 0x9B0D0C1: DateUtil::time_from_string(char const *) (/home/cdfsoft/dist/packages/DBObjects/V00-00-72/src/TimeStamp.cc:264)
    ==18449==    by 0x904C794: ChipStatus::__ct(std::basic_string<char,std::char_traits<char>,std::allocator<char>>, int) (/home/cdfsoft/dist/packages/TrackingObjects/V00-01-73/src/ChipStatus.cc:54)
    ==18449==    by 0x8F94AE5: PedestalUpdator::changed(void) (/home/cdfsoft/dist/packages/SvxDaqObjects/V00-00-74/src/PedestalUpdator.cc:226)

Other: (Matt & Jason)

    ==18449==    at 0x904EFBB: ChipStatus::putBit(char *, int, int) (/home/cdfsoft/dist/packages/TrackingObjects/V00-01-73/src/ChipStatus.cc:133)
    ==18449==    by 0x904F372: ChipStatus::sortBitString(int, int, char *) (/home/cdfsoft/dist/packages/TrackingObjects/V00-01-73/src/ChipStatus.cc:252)
    ==18449==    by 0x904EC15: ChipStatus::makeMap(int) (/home/cdfsoft/dist/packages/TrackingObjects/V00-01-73/src/ChipStatus.cc:212)
    ==18449==    by 0x904C8CC: ChipStatus::__ct(std::basic_string<char,std::char_traits<char>,std::allocator<char>>, int ) (/home/cdfsoft/dist/packages/TrackingObjects/V00-01-73/src/ChipStatus.cc:67)
    ==18449==    by 0x8F94AE5: PedestalUpdator::changed(void) (/home/cdfsoft/dist/packages/SvxDaqObjects/V00-00-74/src/PedestalUpdator.cc:226)
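A report of an uninitialised value inside `__mktime_internal` often means the `tm` structure handed to `mktime`/`timelocal` had fields that were never written. Whether that is what happens at TimeStamp.cc:264 is not known from the trace. The sketch below only shows the usual defensive pattern: zero the structure before parsing and set `tm_isdst` explicitly. `parseTimestamp` and `localTimeFromString` are hypothetical helpers, not the `DateUtil` code, and the "YYYY-MM-DD HH:MM:SS" input format is an assumption.

```cpp
#include <cassert>
#include <ctime>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: parses "YYYY-MM-DD HH:MM:SS" (assumed format)
// into a tm whose every field has a defined value.
bool parseTimestamp(const std::string& text, std::tm* out) {
    assert(out != nullptr);
    std::tm parsed{};       // value-initialise: all fields start at zero
    parsed.tm_isdst = -1;   // ask mktime to work out whether DST applies
    std::istringstream in(text);
    in >> std::get_time(&parsed, "%Y-%m-%d %H:%M:%S");
    if (in.fail()) return false;
    *out = parsed;
    return true;
}

// Converts the string to local time; returns -1 if parsing fails.
std::time_t localTimeFromString(const std::string& text) {
    std::tm parsed;
    if (!parseTimestamp(text, &parsed)) return static_cast<std::time_t>(-1);
    return std::mktime(&parsed);
}
```

`std::get_time` writes only the fields named in the format string, which is why the zero-initialisation matters here.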

Valgrind

Still there (1X) (Aseet)

    ==6977== Conditional jump or move depends on uninitialised value(s)
    ==6977==    at 0x914484D: PadSqz::Huffman_T::operator<<( (PadSqz::BitStream_T &)) (/home/cdfsoft/dist/packages/PADSObjects/V00-00-23/src/Huffman.cc:368)
    ==6977==    by 0x9145E4C: PadSqz::PadRawBank::Fluff( (int)) (/home/cdfsoft/dist/packages/PADSObjects/V00-00-23/src/PadRawBank.cc:173)
    ==6977==    by 0x84CF42C: PadRawModule<PadSqz::COTQ>::event(EventRecord *) (/home/cdfsoft/dist/releases/5.1.1/include/PADSMods/PadRawModule.icc:57)

Nodes

Check crash rate per node:
  Node 171 (take out)
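A per-node crash-rate check of this kind can be sketched as below. It is an illustration under stated assumptions, not the farm's actual bookkeeping: `JobRecord`, `NodeStats`, `crashStatsPerNode` and `nodesAboveRate` are hypothetical names, and the rate threshold and minimum job count are arbitrary parameters chosen by the caller.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical record: one finished job on one farm node.
struct JobRecord {
    int node;
    bool crashed;
};

struct NodeStats {
    int jobs = 0;
    int crashes = 0;
    double crashRate() const {
        return jobs == 0 ? 0.0 : static_cast<double>(crashes) / jobs;
    }
};

// Tally jobs and crashes for each node.
std::map<int, NodeStats> crashStatsPerNode(const std::vector<JobRecord>& records) {
    std::map<int, NodeStats> stats;
    for (const JobRecord& r : records) {
        NodeStats& s = stats[r.node];
        ++s.jobs;
        if (r.crashed) ++s.crashes;
    }
    return stats;
}

// Nodes whose crash rate exceeds maxRate, ignoring nodes with fewer
// than minJobs jobs so that a single crash does not flag a node.
std::vector<int> nodesAboveRate(const std::map<int, NodeStats>& stats,
                                double maxRate, int minJobs) {
    assert(maxRate >= 0.0 && maxRate <= 1.0);
    std::vector<int> flagged;
    for (const auto& entry : stats) {
        const NodeStats& s = entry.second;
        if (s.jobs >= minJobs && s.crashRate() > maxRate) {
            flagged.push_back(entry.first);
        }
    }
    return flagged;
}
```

The flagged list comes back in ascending node order because `std::map` iterates by key.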

Memory usage

Memory usage per Run

Large memory usage

Memory increase

Daily checking

New cron job → checks the log files for severe errors.

Found yesterday:

    %ERLOG-s : *Fluffed bank(s) != original(s) PadRawBanks

    %ERLOG-s CalDataMaker: /home/cdfsoft/dist/packages/Calor/V00-01-52/src/CalDataMaker.cc : 754
    unpack HATD bank : more than 8 hits in PHA
    GlobalLibraryLogger vxfit0() 28-Oct-2003 10:26:23 CST run = 163956 event = 262325

    /home/cdfsoft/dist/packages/Calor/V00-01-52/src/CalDataMaker.cc: 745
    unpack HATD bank : more than 8 hits in WHA
    GlobalLibraryLogger chi2wrtVertex() 28-Oct-2003 10:07:22 CST run = 163955 event = 191711
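The scanning step of such a daily check could look roughly like the sketch below. It assumes, as the messages above suggest, that severe entries carry the `%ERLOG-s` marker somewhere on the line. `severeErrorLines` and `countSevereErrors` are hypothetical names, and the real cron job may well be a shell script rather than C++.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Collect every line of a log stream that contains the marker.
// The default marker is the one seen in the messages above.
std::vector<std::string> severeErrorLines(std::istream& log,
                                          const std::string& marker = "%ERLOG-s") {
    assert(!marker.empty());
    std::vector<std::string> hits;
    std::string line;
    while (std::getline(log, line)) {
        if (line.find(marker) != std::string::npos) {
            hits.push_back(line);
        }
    }
    return hits;
}

// Convenience wrapper for log text already held in memory.
std::size_t countSevereErrors(const std::string& logText) {
    std::istringstream in(logText);
    return severeErrorLines(in).size();
}
```

A job run from cron would open each log file with `std::ifstream`, pass it to `severeErrorLines`, and report any non-empty result.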

fcdflnx3

Problems with disk space:
  Take more scratch space
  Get a new disk