CDF Offline Operations

1 CDF Offline Operations
Status:
Rerun zee validation sample for
No discrepancies found as expected.
Checked farm crashes. Reproduced 3 crashes:
CdfTrack.cc (in prewrite). (Matt and Chris)
    if (_siHits.size() > 0) {
      CdfTrackHits* storedSvxHits;
      storedSvxHits = new CdfTrackHits;
      for (SiHitIterator ihit = beginSIHits(); ihit != endSIHits(); ++ihit) {
        int packed = ((*ihit)->id() & 0x1FFFFFFF) | (((*ihit)->getAmbIndex() & 0x7 ) << 29);   <-- crash (ihit != 0x0)
        storedSvxHits->accumulate(packed);
      }
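The packing step in the loop above keeps the low 29 bits of the hit id and places a 3-bit ambiguity index in bits 29 to 31 of one 32-bit word. The following is a minimal sketch of that layout with a guard on the hit pointer itself. The `Hit` struct and the `packHit`, `unpackId`, `unpackAmbIndex`, and `packHits` helpers are hypothetical stand-ins, not the CDF classes, and an unsigned 32-bit type replaces the original `int` so that shifting into bit 31 is well defined. The slide only notes that the iterator was non-null at the crash; whether a null or stale hit pointer is the real cause is an assumption here.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a silicon hit; not the CDF class.
struct Hit {
    std::uint32_t id;
    std::uint32_t ambIndex;
};

// Pack the low 29 bits of the id with a 3-bit ambiguity index in bits 29-31.
inline std::uint32_t packHit(std::uint32_t id, std::uint32_t ambIndex) {
    return (id & 0x1FFFFFFFu) | ((ambIndex & 0x7u) << 29);
}

// Recover the two fields from a packed word.
inline std::uint32_t unpackId(std::uint32_t packed) { return packed & 0x1FFFFFFFu; }
inline std::uint32_t unpackAmbIndex(std::uint32_t packed) { return packed >> 29; }

// Pack every non-null hit; null entries are skipped instead of dereferenced.
inline std::vector<std::uint32_t> packHits(const std::vector<const Hit*>& hits) {
    std::vector<std::uint32_t> out;
    out.reserve(hits.size());
    for (const Hit* hit : hits) {
        if (hit == nullptr) continue;  // check the element, not only the iterator
        out.push_back(packHit(hit->id, hit->ambIndex));
    }
    return out;
}
```

Skipping a null entry hides the symptom only; in a real fix one would also want to log the skipped hit so its origin can be traced.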

2 Crashes
The location in ELextendedID is
    basic_string& operator=(const basic_string& str);
    basic_string& operator=(const charT* s) {return assign( s,traits::length(s) );}   <-- crash
    basic_string& operator=(charT c) {return assign( size_type(1), c );}
and in ErrorObj::clear() is
    mySerial = 0;
    myXid.clear();   <-- crash
    myIdOverflow = "";
Mark Fischler (needs help with debugging)
0x8fa13bd in SiStripCorrectorManager::correctStripSet (this=0xcd5b338,stripSet=0xe392094) at /home/cdfsoft/dist/packages/SvxDaqObjects/V /src/SiStripCorrectorManager.cc:62
Matt (fixed)

3 Valgrind
Run valgrind over the other crashes:
==18449== Conditional jump or move depends on uninitialised value(s)
==18449==    at 0x420A6879: __mktime_internal (in /lib/i686/libc so)
==18449==    by 0x420A6EBE: timelocal (in /lib/i686/libc so)
==18449==    by 0x9B0D0C1: DateUtil::time_from_string(char const *) (/home/cdfsoft/dist/packages/DBObjects/V /src/TimeStamp.cc:264)
==18449==    by 0x904C794: ChipStatus::__ct(std::basic_string<char,std::char_traits<char>,std::allocator<char>>, int) (/home/cdfsoft/dist/packages/TrackingObjects/V /src/ChipStatus.cc:54)
==18449==    by 0x8F94AE5: PedestalUpdator::changed(void) (/home/cdfsoft/dist/packages/SvxDaqObjects/V /src/PedestalUpdator.cc:226)
Other: (Matt & Jason)
==18449==    at 0x904EFBB: ChipStatus::putBit(char *, int, int) (/home/cdfsoft/dist/packages/TrackingObjects/V /src/ChipStatus.cc:133)
==18449==    by 0x904F372: ChipStatus::sortBitString(int, int, char *) (/home/cdfsoft/dist/packages/TrackingObjects/V /src/ChipStatus.cc:252)
==18449==    by 0x904EC15: ChipStatus::makeMap(int) (/home/cdfsoft/dist/packages/TrackingObjects/V /src/ChipStatus.cc:212)
==18449==    by 0x904C8CC: ChipStatus::__ct(std::basic_string<char,std::char_traits<char>,std::allocator<char>>, int ) (/home/cdfsoft/dist/packages/TrackingObjects/V /src/ChipStatus.cc:67)
==18449==    by 0x8F94AE5: PedestalUpdator::changed(void) (/home/cdfsoft/dist/packages/SvxDaqObjects/V /src/PedestalUpdator.cc:226)
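A frequent cause of an uninitialised-value report inside `__mktime_internal` is a `struct tm` that was only partly filled in before being passed to `mktime` or `timelocal`. Whether that is what happens at TimeStamp.cc:264 is an assumption, since that source is not shown. The sketch below zero-initialises the structure before parsing; the function name `timeFromString` and the date format are hypothetical and do not describe `DateUtil::time_from_string`.

```cpp
#include <ctime>
#include <time.h>  // strptime (POSIX, not part of standard C++)

// Hypothetical helper: parse "YYYY-MM-DD HH:MM:SS" as local time.
// Returns (time_t)-1 if the text is null or does not match the format.
inline std::time_t timeFromString(const char* text) {
    if (text == nullptr) {
        return static_cast<std::time_t>(-1);
    }
    std::tm parts = {};    // zero every field so mktime never reads garbage
    parts.tm_isdst = -1;   // let the C library decide whether DST applies
    const char* end = strptime(text, "%Y-%m-%d %H:%M:%S", &parts);
    if (end == nullptr || *end != '\0') {
        return static_cast<std::time_t>(-1);  // parse failure or trailing text
    }
    return std::mktime(&parts);
}
```

The key lines are the `= {}` initialiser and the explicit `tm_isdst`: without them, any field the parser does not set is read by `mktime` in an indeterminate state, which is the kind of access valgrind flags.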

4 Valgrind
Still there (1X) (Aseet)
==6977== Conditional jump or move depends on uninitialised value(s)
==6977==    at 0x914484D: PadSqz::Huffman_T::operator<<( (PadSqz::BitStream_T &)) (/home/cdfsoft/dist/packages/PADSObjects/V /src/Huffman.cc:368)
==6977==    by 0x9145E4C: PadSqz::PadRawBank::Fluff( (int)) (/home/cdfsoft/dist/packages/PADSObjects/V /src/PadRawBank.cc:173)
==6977==    by 0x84CF42C: PadRawModule<PadSqz::COTQ>::event(EventRecord *) (/home/cdfsoft/dist/releases/5.1.1/include/PADSMods/PadRawModule.icc:57)

5 Nodes
Check crash rate per node:
Node 171 (Take out)
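The per-node check can be sketched as a simple tally of crashed jobs over total jobs for each node. The `JobResult` record and the `crashRatePerNode` function are hypothetical; the slide does not say how the farm bookkeeping is stored or how the rate was actually computed.

```cpp
#include <map>
#include <utility>
#include <vector>

// Hypothetical record of one farm job: which node ran it, and whether it crashed.
struct JobResult {
    int node;
    bool crashed;
};

// Return, for each node, the fraction of its jobs that crashed.
inline std::map<int, double> crashRatePerNode(const std::vector<JobResult>& jobs) {
    std::map<int, std::pair<int, int> > counts;  // node -> (crashes, total jobs)
    for (const JobResult& job : jobs) {
        std::pair<int, int>& c = counts[job.node];
        if (job.crashed) ++c.first;
        ++c.second;
    }
    std::map<int, double> rates;
    for (const auto& entry : counts) {
        // entry.second.second is at least 1 for every node present in the map.
        rates[entry.first] = static_cast<double>(entry.second.first) / entry.second.second;
    }
    return rates;
}
```

A node whose rate stands well above the rest, as node 171 apparently did, is then a candidate for removal from the farm.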

6 Memory usage

7 Memory usage per Run
Large memory usage

8 Memory increase

9 Daily checking
New cron job checks the log files for severe errors.
Found yesterday:
%ERLOG-s : *Fluffed bank(s) != original(s) PadRawBanks
%ERLOG-s CalDataMaker: /home/cdfsoft/dist/packages/Calor/V /src/CalDataMaker.cc : 754 unpack HATD bank : more than 8 hits in PHA
GlobalLibraryLogger vxfit0() 28-Oct :26:23 CST run = event =
/home/cdfsoft/dist/packages/Calor/V /src/CalDataMaker.cc: 745 unpack HATD bank : more than 8 hits in WHA
GlobalLibraryLogger chi2wrtVertex() 28-Oct :07:22 CST run = event =191711
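The daily check amounts to scanning each log for lines carrying the severe-error tag. A minimal sketch follows, assuming that the `%ERLOG-s` prefix seen in the messages above is what marks a severe error. The function names `findSevereLines` and `findSevereLinesInText` are hypothetical, and the actual cron job is not shown in the slides.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Collect every line of a log stream that contains the severe-error marker.
// The default marker is an assumption based on the messages quoted in the slide.
inline std::vector<std::string> findSevereLines(std::istream& log,
                                                const std::string& marker = "%ERLOG-s") {
    std::vector<std::string> hits;
    std::string line;
    while (std::getline(log, line)) {
        if (line.find(marker) != std::string::npos) {
            hits.push_back(line);
        }
    }
    return hits;
}

// Convenience wrapper for log text that is already in memory.
inline std::vector<std::string> findSevereLinesInText(const std::string& text) {
    std::istringstream stream(text);
    return findSevereLines(stream);
}
```

For a file on disk, a `std::ifstream` can be passed to `findSevereLines` directly; a cron wrapper would then only need to mail or print the returned lines when the result is non-empty.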

10 fcdflnx3
Problems with disk space
Take more scratch space
Get a new disk

