Published by Brooke Elliott. Modified over 8 years ago.
1
Data Analysis with CMSSW
● Running a simple analysis:
    Within the framework: EDAnalyzer
    Interactive: FWLite + PyROOT
● Finding the data with DBS/DLS
● Running CMSSW with CRAB
Most of the files used in the tutorial can be found in /afs/cern.ch/user/g/gpetrucc/public/Tutorial151206
2
Initialize the environment
First time only:
    scramv1 project CMSSW CMSSW_1_2_0_pre9
    cd CMSSW_1_2_0_pre9/src
    eval `scramv1 runtime -sh`    (use -csh instead of -sh under tcsh)
    cmscvsroot CMSSW
    cvs login    (use “98passwd” as password)
All the other times:
    cd CMSSW_1_2_0_pre9/src
    eval `scramv1 runtime -sh`    (use -csh instead of -sh under tcsh)
    cmscvsroot CMSSW
3
Create an EDAnalyzer skeleton
● Create your working directory under CMSSW_xxx/src
    mkdir Tutorial151206; cd Tutorial151206
● Create an EDAnalyzer named “Simple”
    mkedanlzr Simple
This will create the following structure:
    Simple/ (contains “BuildFile”)
    Simple/src (contains “Simple.cc”)
    Simple/interface, doc, test (all empty)
4
“Simple.cc” structure:
    #include ...
    class Simple : public EDAnalyzer {
      public: ...
      private: ...
    };
    void Simple::analyze(...) { ... }
    void Simple::beginJob(...) { ... }
    void Simple::endJob(...) { ... }
5
Simple analysis task
Count the number of tracks with pT > 5 GeV
We need to:
● At the beginning: create an empty histogram.
● For every event:
    Get the tracks
    Loop over the tracks, cut on pT and count
    Fill the histogram
● At the end: write the histogram to a ROOT file
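The per-event logic of this task can be sketched in plain Python before any CMSSW code is written. This is only an illustration: each event is represented as a list of track pT values in GeV (made-up numbers, not real data), and `count_high_pt` is a hypothetical helper name.

```python
PT_CUT = 5.0  # GeV, the threshold used in this tutorial

def count_high_pt(track_pts, cut=PT_CUT):
    """Return how many tracks in one event have pT strictly above the cut."""
    return sum(1 for pt in track_pts if pt > cut)

# Hypothetical events: each inner list holds the track pT values of one event.
events = [
    [0.8, 6.2, 12.5],   # two tracks above 5 GeV
    [1.1, 2.3],         # none above the cut
    [5.0, 7.4],         # 5.0 is not strictly above the cut, so one track
]

counts = [count_high_pt(ev) for ev in events]
print(counts)  # [2, 0, 1]
```

One such count per event is the value that gets filled into the histogram.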
6
How are tracks stored?
● Go to the documentation page for RECO data: http://cmsdoc.cern.ch/Releases/CMSSW/latest_nightly/doc/html/RecoData.html
We have found out that tracks are of type reco::Track, stored in a reco::TrackCollection with name “ctfWithMaterialTracks”
7
What's a “Track” for CMSSW?
Click on the reco::Track link and find out:
● Include file
● Package: DataFormats/TrackReco
Then click on “List all members” to get the info. You will find a member function “pt()”. Click on it.
Now we can start writing C++ code
9
Create the histogram
    class Simple : public EDAnalyzer {
      ...
      private:
        ...
        // --------- member data ----------------
        TH1F *m_Tracks;
    };
    void Simple::beginJob(...) {
        m_Tracks = new TH1F("tracks", "Tracks (Pt > 5 GeV)", 10, 0, 10);
    }
10
Get track collection
    void Simple::analyze(const edm::Event& iEvent, const edm::EventSetup& iSetup) {
        using namespace edm;
        using namespace reco;
        Handle<TrackCollection> tracks;
        iEvent.getByLabel("ctfWithMaterialTracks", tracks);
        [...]
    }
11
Loop over the tracks
    Handle<TrackCollection> tracks;
    iEvent.getByLabel([...]);
    TrackCollection::const_iterator trk;
    for (trk = tracks->begin(); trk != tracks->end(); ++trk) {
        [...]
    }
12
Cut on track pT and count
    int count = 0;
    TrackCollection::const_iterator trk;
    for (trk = tracks->begin(); [...]) {
        if (trk->pt() > 5.0) {
            count++;
        }
    }
    m_Tracks->Fill(count);
13
Save the histogram
    void Simple::endJob(...) {
        TFile *f = new TFile("histo.root", "RECREATE");
        f->WriteTObject(m_Tracks);
        f->Close();
        delete m_Tracks;
        delete f;
    }
14
Now some technicalities:
● Adding the required include files (at the beginning of Simple.cc)
    #include ...
    #include "DataFormats/TrackReco/interface/Track.h"
    #include ...
15
Adding libraries in BuildFile......
16
Compile your EDAnalyzer
● Go into the main folder of your project (CMSSW_xxx/src/Tutorial151206/Simple)
● scramv1 build (and cross your fingers)
    Parsing BuildFiles
    Entering Package Tutorial151206/Simple
    [...]
    >> Compiling [...]/Simple/src/Simple.cc
    >> Building shared library [...]/libTutorial151206Simple.so
    [...]
    @@@@ Checking shared library for missing symbols:
    [...]
    --- Registered SEAL plugin Tutorial151206Simple
    [...]
    >> Package Simple built
17
Create test/Simple.cfg
    process Demo = {
        source = PoolSource {
            untracked vstring fileNames = {
                "/afs/cern.ch/user/g/gpetrucc/public/Tutorial151206/PhysVal-DiElectron-Ene10.root"
            }
        }
        module demo = Simple { }
        path p = {demo}
    }
18
Run the EDAnalyzer
● Go to the Simple/test directory
    cmsRun Simple.cfg
    Using the site default catalog
    [...]
    %MSG-i FwkReport: [...] BeforeEvents
    Begin processing the 1th record. Run 1, Event 1
    %MSG-i FwkReport: [...] Run: 1 Event: 1
    Begin processing the 2th record. Run 1, Event 2
    [...]
    [...] 10
    %MSG-i FwkJob: PostSource [...] Run: 1 Event: 10
    [...]
● Open “histo.root” and enjoy the plot
19
Links to more details:
Core CMSSW Documentation:
    https://twiki.cern.ch/twiki/bin/view/CMS/WorkBook
    http://cmsdoc.cern.ch/Releases/CMSSW/latest_nightly/doc/html/ (some days the link is broken)
    http://cmslxr.fnal.gov/lxr/
    http://cmssw.cvs.cern.ch/cgi-bin/cmssw.cgi/CMSSW/
Setting up CMSSW Environment:
    https://twiki.cern.ch/twiki/bin/view/CMS/WorkBookSetComputerNode
Writing a framework module:
    https://twiki.cern.ch/twiki/bin/view/CMS/WorkBookWriteFrameworkModule
Tutorials from last CMSWeek:
    https://twiki.cern.ch/twiki/bin/view/CMS/December06CMSweekTutorials
20
Same thing, interactive
Install the python tools (only once):
    cd CMSSW_xxxx/src
    cmscvsroot CMSSW
    cvs co -r HEAD PhysicsTools/PythonAnalysis
Set up the python environment (every time):
    (bash:) export PYTHONPATH=${PYTHONPATH}:$CMSSW_BASE/src/PhysicsTools/PythonAnalysis/python
    (tcsh:) setenv PYTHONPATH ${PYTHONPATH}:$CMSSW_BASE/src/PhysicsTools/PythonAnalysis/python
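If the PYTHONPATH step is forgotten, roughly the same effect can be obtained from inside a script. The sketch below assumes that CMSSW_BASE has been set by the scramv1 runtime command; the fallback directory and the two helper names are hypothetical and only keep the example runnable outside a CMSSW area.

```python
import os
import sys

def python_analysis_dir(cmssw_base):
    """Build the PhysicsTools/PythonAnalysis/python path under a CMSSW area."""
    return os.path.join(cmssw_base, "src", "PhysicsTools", "PythonAnalysis", "python")

def add_to_sys_path(directory):
    """Append directory to sys.path once, mirroring the PYTHONPATH export."""
    if directory not in sys.path:
        sys.path.append(directory)

# CMSSW_BASE is set by `scramv1 runtime`; the fallback is a made-up path.
base = os.environ.get("CMSSW_BASE", "/hypothetical/CMSSW_1_2_0_pre9")
add_to_sys_path(python_analysis_dir(base))
```

This has to run before the line that imports cmstools.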
21
Interactive: startup
● Create a new file simple.py
● Start with the lines to initialize FWLite/PyROOT
    from ROOT import *
    from cmstools import *
    gSystem.Load("libFWCoreFWLite.so")
    AutoLibraryLoader.enable()
22
Interactive: read the data
    data = TFile("/afs/cern.ch/user/g/gpetrucc/public/Tutorial151206/PhysVal-DiElectron-Ene10.root")
    events = EventTree(data.Get("Events"))
    trackBranch = events.branch("ctfWithMaterialTracks")
23
Interactive: event loop
    for event in events:
        tracks = trackBranch()            # read tracks
        count = 0                         # init counter
        for trk in tracks:                # loop over tracks
            if trk.pt() > 5.0:            # cut on pT
                count += 1                # increment
        print "Found ",count," tracks"    # print
24
Interactive: running
    python simple.py
    Preparing CMS tab completer tool...
    Loading FWLite dictionary...
    Warning in [...]
    Found 0 tracks
    [...]
    Found 1 tracks
25
Histograms in Python
    [..]
    histo = TH1F("tracks", "Tracks (Pt > 5 GeV)", 10, 0, 10)
    for event in events:
        [...]
        print "Found ",count," tracks"    # print
        histo.Fill(count)
    f = TFile("histo.root", "RECREATE")
    f.WriteTObject(histo)
    f.Close()
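To make explicit what a histogram booked with 10 bins from 0 to 10 does with each Fill(count), here is a pure-Python sketch of fixed-width binning. It follows ROOT's numbering convention (bin 0 is underflow, bins 1 to N are the regular bins, bin N+1 is overflow, and the lower edge of each bin is included); `bin_index` and `fill` are hypothetical names, not ROOT calls, and the input counts are made up.

```python
def bin_index(x, nbins=10, low=0.0, high=10.0):
    """Return the ROOT-style bin number for x: 0 = underflow, nbins + 1 = overflow."""
    if x < low:
        return 0
    if x >= high:
        return nbins + 1
    width = (high - low) / nbins
    return int((x - low) / width) + 1

def fill(contents, x, nbins=10, low=0.0, high=10.0):
    """Increment the bin that x falls into; contents has nbins + 2 entries."""
    contents[bin_index(x, nbins, low, high)] += 1

contents = [0] * 12              # underflow + 10 bins + overflow
for count in [0, 1, 1, 3, 12]:   # hypothetical per-event track counts
    fill(contents, count)
```

With these settings an event with 10 or more selected tracks lands in the overflow bin, so a wider range may be worth considering for busier samples.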
26
Pros and cons of Python/FWLite
PRO
● No need to recompile
● No need to include headers, BuildFile,...
● Shorter code
● Can be used interactively (check also ipython)
● Untyped functions allow greater code reuse
CON
● Can use only some CMSSW packages
● Currently there are problems with:
    Refs (e.g. B-tagging)
    AssociationMaps
    TChains [there are workarounds]
● Can just read events...
● Can't run on CRAB
27
Finding data with DBS/DLS
● Go to the DBS/DLS page: http://cmsdbs.cern.ch/discovery/expert (“expert” is needed to get 1_2_x samples)
28
Finding data
● DBS Instance: RelVal/Writer (for 1_2_0_pre9)
● Application: anything with 1_2_0_pre9 (those with FEVT or Merged should work fine)
● Primary dataset: RelVal120pre9
29
Search results (summary)
You can read from the summary view:
A) The collection name (for CRAB): /RelVal120pre9Higgs-ZZ-4Mu/FEVT/CMSSW_1_2_0_pre9-FEVT-1165234098-unmerged
B) The site at which it is stored (cern, fnal)
C) The number of events available (2k, 1.2k)
30
Search results (Block details)
Clicking on “Blocks” gives more information. To see the logical file names (LFNs) for the data, click on “plain” under “LFN list”. You should get a list of files like /store/unmerged/RelVal/...
The physical location on castor is (usually) /castor/cern.ch/cms/store/unmerged/...
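The relation between a logical file name and its usual castor location can be written as a one-line mapping. The sketch below simply prepends the prefix quoted on the slide; since the slide says the location is only "usually" of this form, the result is a guess to be checked, not a guaranteed path. `lfn_to_castor` and the example file name are hypothetical.

```python
CASTOR_PREFIX = "/castor/cern.ch/cms"  # prefix quoted on the slide, "usually" valid

def lfn_to_castor(lfn):
    """Prepend the castor prefix to a logical file name starting with /store/."""
    if not lfn.startswith("/store/"):
        raise ValueError("not a logical file name: %r" % lfn)
    return CASTOR_PREFIX + lfn

# Hypothetical LFN, for illustration only.
print(lfn_to_castor("/store/unmerged/RelVal/example.root"))
# /castor/cern.ch/cms/store/unmerged/RelVal/example.root
```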
31
Reading that data with CMSSW
● Write LFNs in the .cfg file
    source = PoolSource {
        untracked int32 maxEvents = 3
        untracked vstring fileNames = { "/store/unmerged/...", [...] }
    }
  (write just the LFN, no “file:” and no “/castor”!)
● Remember to set maxEvents unless you want to read all the events in the file...
● Check if the sample is really in /castor before...
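When the LFN list is long, the PoolSource block can be generated instead of typed. The following is a minimal sketch, assuming the block layout shown on this slide; `pool_source` is a hypothetical helper and the file names are made up. It also enforces the rule that names must be bare LFNs.

```python
def pool_source(lfns, max_events):
    """Render a PoolSource block, rejecting names that are not bare LFNs."""
    for name in lfns:
        if name.startswith("file:") or name.startswith("/castor"):
            raise ValueError("use the bare LFN, got %r" % name)
    quoted = ",\n        ".join('"%s"' % name for name in lfns)
    return (
        "source = PoolSource {\n"
        "    untracked int32 maxEvents = %d\n"
        "    untracked vstring fileNames = {\n"
        "        %s\n"
        "    }\n"
        "}\n" % (max_events, quoted)
    )

# Hypothetical LFNs, for illustration only.
print(pool_source(["/store/unmerged/a.root", "/store/unmerged/b.root"], 3))
```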
32
Running on remote samples: CRAB
Before using CRAB you need:
● A working CMSSW
● A working EDAnalyzer (with its cfg file)
● Access to the Grid: certificate, VO membership
● The name of a data sample you want to access
33
Set up CRAB
Set up your environment (every time):
    source /afs/cern.ch/cms/LCG/LCG-2/UI/cms_ui_env.sh
    source /afs/cern.ch/cms/ccs/wm/scripts/Crab/crab.sh
    (on lxplus) (source xxx.csh if you use tcsh)
Additional tasks (first time only):
● Execute $CRABDIR/configureBoss
● Copy the default crab.cfg file from /afs/cern.ch/cms/ccs/wm/scripts/Crab/crab.cfg
34
Configure CRAB (crab.cfg)
● Read the comments in the cfg file!
● [CRAB] section: main configuration
    jobtype = cmssw (always)
    scheduler = glitecoll (edg should also work)
● [CMSSW] section: your job configuration (important!)
    datasetpath = (“None” if you use Pythia...)
    pset =
    total_number_of_events =
    events_per_job =
    output_file =
35
Configure CRAB (crab.cfg)
● [USER] section: common info
    return_data = 1 (get your output back with crab)
    copy_data = 0 (=1 to save the output on castor... more tricky)
● [EDG] section: GRID configuration (optional)
    ce_white_list, se_white_list: use only the CE/SE with names in the list (you can try “cern”, “infn”)
    ce_black_list, se_black_list: never use a CE/SE with the specified name (e.g. “tw”, “fnal”, “cern”)
    rb = CERN (try CNAF if CERN does not work)
36
Configure CRAB for RelVal
● By default, CRAB looks for samples in the MCGlobal/Writer DBS
● To read the RelVal samples, crab.cfg needs some more tweaking: the following parameters must be added under the [CMSSW] section
    dbs_instance=RelVal/Writer
    dls_endpoint=prod-lfc-cms-central.cern.ch/grid/cms/DLS/RelVal
● This allows datasetpath to be set to RelVal samples
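The crab.cfg settings from the last three slides can be collected in one place with Python's standard configparser module. This is a sketch under assumptions, not a replacement for the default crab.cfg and its comments: the event numbers are hypothetical, the pset and output file names assume the Simple.cfg and histo.root used earlier in this tutorial, and the exact layout CRAB accepts should be checked against the file copied from the CRAB area.

```python
import configparser
import io

cfg = configparser.ConfigParser()
cfg["CRAB"] = {"jobtype": "cmssw", "scheduler": "glitecoll"}
cfg["CMSSW"] = {
    # Dataset name as shown in the DBS summary view earlier in the tutorial.
    "datasetpath": "/RelVal120pre9Higgs-ZZ-4Mu/FEVT/"
                   "CMSSW_1_2_0_pre9-FEVT-1165234098-unmerged",
    "pset": "Simple.cfg",               # assumed: the cfg written earlier
    "total_number_of_events": "100",    # hypothetical value
    "events_per_job": "50",             # hypothetical value
    "output_file": "histo.root",        # assumed: file written in endJob
    # Extra parameters needed for RelVal samples.
    "dbs_instance": "RelVal/Writer",
    "dls_endpoint": "prod-lfc-cms-central.cern.ch/grid/cms/DLS/RelVal",
}
cfg["USER"] = {"return_data": "1", "copy_data": "0"}

buffer = io.StringIO()
cfg.write(buffer)          # serialize in "key = value" form, one section at a time
text = buffer.getvalue()
print(text)
```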
37
Set up your EDAnalyzer.cfg
● The normal cfg file used for your job works fine.
● CRAB takes care of setting up the options of the PoolSource (maxEvents, fileNames)
● Check the name of the output files! CRAB takes care of adding “_ ” to each file name when retrieving the job output.
38
Running CRAB
● Create and submit the jobs:
    crab -create -submit
● See the status of your jobs:
    crab -status (hint: watch -n 120 "crab -status")
● Get the output of the completed jobs:
    crab -getoutput
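Where the watch utility is not available, the status hint can be imitated with a small Python loop. The sketch below only reruns a command at a fixed interval and collects its exit codes; it does not parse the output of crab -status, whose format is not described here. `watch` is a hypothetical helper, and the `run` and `sleep` parameters exist only so the loop can be tried without CRAB installed.

```python
import subprocess
import time

def watch(command, interval_s, repeats, run=subprocess.call, sleep=time.sleep):
    """Run command `repeats` times, sleeping interval_s seconds between runs."""
    codes = []
    for i in range(repeats):
        codes.append(run(command))   # exit code of this invocation
        if i < repeats - 1:
            sleep(interval_s)        # no sleep after the last run
    return codes

# Dry run with stand-ins, so nothing is executed and nothing sleeps.
seen = []
codes = watch(["crab", "-status"], 120, 3,
              run=lambda cmd: seen.append(cmd) or 0,
              sleep=lambda seconds: None)
```

In a real session the call would be watch(["crab", "-status"], 120, n) with the default `run` and `sleep`.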
39
Further information
    https://twiki.cern.ch/twiki/bin/view/CMS/WorkBookRunningGrid
    http://cmsdoc.cern.ch/cms/ccs/wm/www/Crab/
    https://twiki.cern.ch/twiki/pub/CMS/December06CMSweekTutorials/CRAB_tutorial.pdf
    http://arda-dashboard.cern.ch/cms/
40
Backup slides ● Python crash course (4 slides)
41
Python crash course (1)
● Python is a scripting language. Scripts are executed just by typing “python” followed by the script file name
● You can also open a python interactive prompt:
    gpetrucc@lxplus$ python
    [...]
    >>> 2+2
    4
    >>>
● Writing is done with print
    print "Hello world. I = ",i
● There is no “;” at the end of line
42
Python crash course (2)
● Comments start with “#” and finish at the end of the line:
    # this will be ignored
● Variable types are not declared.
    i = 37 (and not int i = 37 as in C++)
● Blocks are done with indentation, not “{”, “}”:
    if x > 3:
        print "x is large (x=", x, ")"
    else:
        print "x is negligible"
    for i in range(5):    # 0,1,2,3,4
        print i
43
Python crash course (3)
● Python is object oriented
● There is no “new” keyword for creating objects:
    file = TFile("ciao")
● Members are accessed with “.” (dot)
    file.Close() (and not file->Close() )
● Memory management is automatic: there is no need to call “delete”, “free()” as in C++
● No pointers (objects are always “references”)
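The points of this crash course can be tried out together in one short script. The sketch below assumes a Python 3 interpreter, where print is a function called with parentheses, unlike the print statements on these slides; `Counter` and `describe` are hypothetical names used only for illustration.

```python
class Counter:
    """Tiny class used only to illustrate object creation and member access."""
    def __init__(self):
        self.value = 0

    def increment(self):
        self.value += 1    # Python has no ++ operator

def describe(x):
    # Blocks are delimited by indentation, not braces.
    if x > 3:
        return "x is large (x=%d)" % x
    else:
        return "x is negligible"

c = Counter()              # no "new" keyword, no type declaration
for i in range(5):         # i takes the values 0, 1, 2, 3, 4
    c.increment()          # members are accessed with "."

print(describe(c.value))   # x is large (x=5)
```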
44
Further info on Python
Tutorials and guides:
    http://docs.python.org/tut/tut.html
    http://hetland.org/python/instant-python.php
    http://www.wag.caltech.edu/home/rpm/python_course/
    http://wiki.python.org/moin/BeginnersGuide
PyROOT (use ROOT from Python):
    ftp://root.cern.ch/root/doc/chapter20.pdf
Python within CMSSW (twiki):
    https://twiki.cern.ch/twiki/bin/view/CMS/WorkBookMakeAnalysis
    https://twiki.cern.ch/twiki/bin/view/CMS/UserManualPythonAnalysis