On Valgrind
if you are so unlucky as to have need of it
Mike Kordosky, UCL
Ely
What is it?
● Valgrind is a tool that finds:
– uses of uninitialized memory
– memory leaks (e.g. orphaned pointers)
– cases of illegal memory access
● Runs on any code without recompilation... fabulous
– Caveat: slow and uses lots of memory... run on minos?.fnal.gov
● Actually, this is just the Memcheck tool. Valgrind has some other features too.
When to use it?
● Your program seems to leak memory
● Your program crashes in a bizarre way
– inside an obviously safe function/class
– depending on recompilation, time of day, and other voodoo
● Your program appears to be non-deterministic
How to use it
valgrind --tool=memcheck \
  --db-attach=yes \
  --gen-suppressions=yes \
  --suppressions=root.supp \
  --error-limit=no \
  --leak-check=yes \
  loon -bq my_script.C date_file
* marked options: --db-attach, --gen-suppressions, --suppressions
--leak-check=yes (or full)
● Then wait a while...
● If you have db-attach set you need to monitor the job
www-numi.fnal.gov/offline_software/srt_public_context/WebDocs/valgrind.html
Example Output
● This is the source of the crash-on-exit problem Rustem was having
● TTree apparatus overwriting memory as a result of unallocated pointer
==2536== Invalid write of size 4
==2536==    at 0x1D350A4F: frombuf(char*&, unsigned*) (Bytes.h:319)
==2536==    by 0x1D3501A6: TBuffer::operator>>(int&) (TBuffer.h:439)
==2536==    by 0x1D43DFD8: int TStreamerInfo::ReadBuffer(TBuffer&, char** const&, int, int, int, int) (TStreamerInfoReadBuffer.cxx:589)
==2536==    by 0x1E180306: TBranchElement::ReadLeaves(TBuffer&) (TBranchElement.cxx:1829)
==2536==    by 0x1E171CAA: TBranch::GetEntry(long long, int) (TBranch.cxx:763)
==2536==    by 0x1E17DB8A: TBranchElement::GetEntry(long long, int) (TBranchElement.cxx:1227)
==2536==    by 0x1E61A6FD: TTreeIndex::GetEntry(long long) (TTreeIndex.cxx:182)
==2536==    by 0x1E619EFA: TTreeIndex::TTreeIndex(TTree const*, char const*, char const*) (TTreeIndex.cxx:139)
==2536==    by 0x1E61BF51: TTreePlayer::BuildIndex(TTree const*, char const*, char const*) (TTreePlayer.cxx:312)
==2536==    by 0x1E1A757E: TTree::BuildIndex(char const*, char const*) (TTree.cxx:1482)
==2536==    by 0x1CD5EAC2: PerOutputStream::BuildTreeIndex() (PerOutputStream.cxx:656)
==2536==    by 0x1CD5E7F0: PerOutputStream::Write() (PerOutputStream.cxx:628)
==2536== Address 0x22815B40 is not stack'd, malloc'd or (recently) free'd
==2536==
Example II
● Valgrind's summary output
● The program consumed lots of memory but didn't actually “leak” (much)
● Valgrind didn't find the error.... or did it?
● Why?
==21775== LEAK SUMMARY:
==21775==    definitely lost: bytes in 203 blocks.
==21775==    indirectly lost: bytes in 968 blocks.
==21775==    possibly lost: bytes in blocks.
==21775==    still reachable: bytes in blocks.
==21775==    suppressed: 0 bytes in 0 blocks.
==21775== Reachable blocks (those to which a pointer was found) are not shown.
==21775== To see them, rerun with: --show-reachable=yes
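To read the categories in this summary, it can help to see a toy case of each. The sketch below is only an illustration and is not taken from the MINOS code: g_cache, hoard, drop, cached_blocks and release_cache are hypothetical names. A block whose only pointer is discarded would typically be counted as "definitely lost", while a block still referenced from a live global would typically appear only under "still reachable", which Memcheck does not list unless run with --show-reachable=yes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical global cache: everything stored here stays reachable
// until it is explicitly released, so a leak checker would normally
// classify these blocks as "still reachable" rather than lost.
static std::vector<int*> g_cache;

// Allocates a block and keeps a pointer to it indefinitely.
// Memory use grows, but no pointer is ever lost.
void hoard(std::size_t n) {
    g_cache.push_back(new int[n]());
}

// Allocates a block and discards the only pointer to it.
// This is the classic leak usually reported as "definitely lost".
void drop(std::size_t n) {
    int* p = new int[n]();
    p[0] = 1;  // touch the block so the allocation is actually used
}              // p goes out of scope here; the block is orphaned

// Number of blocks currently held by the cache.
std::size_t cached_blocks() {
    return g_cache.size();
}

// Frees everything the cache holds (the step the hoarding code skipped).
void release_cache() {
    for (int* p : g_cache) {
        delete[] p;
    }
    g_cache.clear();
}
```

A program that calls hoard() in a loop can exhaust memory while its leak summary stays small, which is one way a report like the one above can look nearly clean.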
What was it?
● The problem was in the tracker
● Isn't a real memory “leak”... just keeping hits in memory long after they were necessary
● Found using google-perftools
– but only after identifying tracker as the problem
● Lesson: Valgrind useful but not appropriate for all leaks... try multiple tools if possible
CandSliceHandle created:
CandSliceHandle * dupSlice=slice->DupHandle();
but never deleted
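The "created but never deleted" pattern can be sketched in isolation. This is a minimal illustration under stated assumptions, not the tracker code: Handle is a hypothetical stand-in that only counts live instances, its DupHandle() merely imitates the idiom of returning a heap copy the caller must delete, and whether the real CandSliceHandle behaves this way is an assumption. The fix shown (std::unique_ptr, which needs C++11) is one possible approach, not necessarily the one applied in the tracker.

```cpp
#include <memory>

// Hypothetical stand-in for a framework handle class. It only counts
// how many instances are alive so the effect of ownership is visible.
class Handle {
public:
    Handle() { ++live_; }
    Handle(const Handle&) { ++live_; }
    ~Handle() { --live_; }

    // Imitates the DupHandle() idiom: returns a heap copy owned by the caller.
    Handle* DupHandle() const { return new Handle(*this); }

    // Number of Handle instances currently alive.
    static int Live() { return live_; }

private:
    static int live_;
};

int Handle::live_ = 0;

// Pattern from the slide: the duplicate is created but never deleted,
// so one extra Handle stays alive after every call.
void process_leaky(const Handle& slice) {
    Handle* dupSlice = slice.DupHandle();
    (void)dupSlice;  // ... work with dupSlice, then forget about it
}

// One possible fix: give the raw pointer to std::unique_ptr so the
// duplicate is deleted automatically when the function returns.
void process_owned(const Handle& slice) {
    std::unique_ptr<Handle> dupSlice(slice.DupHandle());
    // ... work with dupSlice
}
```

Comparing Handle::Live() before and after each call shows the difference: process_owned() leaves the count unchanged, while process_leaky() raises it by one per call.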
Example: google-perftools
[Figure: google-perftools output showing AlgTrackSRList::RunAlg(), CandSliceHandle::DupHandle(), new TObject; annotated "leak"]