Hands-on Session RooStats and TMVA Exercises
Getting Started A few advice words: Install version 5.25.02 locally setting up . root_Installaton_directory/bin/thisroot.sh (or .csh) recommend to compile your macros and run with: .x macro.C+; or .L macro.C+ then macro();, include: using namespace RooFit; using namespace RooStats;, avoid running a macro multiple times in the same ROOT session If not working locally: setting ROOT 5.25.02 using the CERN AFS installation: on SLC4: /afs/cern.ch/sw/lcg/app/releases/ROOT/5.25.02/slc4_amd64_gcc34/root/bin/thisroot.csh or .sh
RooStats Exercises (1) Generate a Poisson Model (use RooFit factory) Use RooStats calculators class to find limits, significances and produce plots // Use a RooWorkspace to store the PDF models, prior // informations, list of parameters, ... myWS.factory("S[2,0,10]"); // default value 2 and range [0,10] // Number of signal and background events RooWorkspace myWS("myWS"); // Observable (dummy observable used) myWS.factory("B[1]"); // value fixed to 1 // Signal and background distribution of the observable myWS.var("x")->setBins(1); myWS.factory("x[0,1]"); // arbitrary range [0,1] myWS.factory("Uniform::bkgPdf(x)"); myWS.factory("Uniform::sigPdf(x)"); myWS.factory("ExtendPdf::modelBkg(bkgPdf,B)"); myWS.factory("SUM::model(S*sigPdf,B*bkgPdf"); // S+B and B-only models (both extended PDFs) RooAbsData* data = myWS.pdf("model")->generateBinned(*myWS.var("x"),myWS.var("S")->getVal()+myWS.var("B")->getVal(),Name("data")); // generate binned data with fixed number of events //plot the data
ProfileLikelihood Exercise ProfileLikelihoodCalculator get interval: get significance: RooRealVar * S = myWS.var("S");RooArgSet POI(*S);RooAbsPdf * model = myWS.pdf("model");ProfileLikelihoodCalculator plc(*data,*model,POI); //set thest sizeplc.SetTestSize(0.10); model->fitTo(*data,SumW2Error(kFALSE));LikelihoodInterval * interval = plc.GetInterval();const double lowerLimit = interval->LowerLimit(*S);const double upperLimit = interval->UpperLimit(*S);LikelihoodIntervalPlot lplot(interval);lplot.Draw() // Create a copy of the POI parameters to set the values to zeroRooArgSet nullparams;nullparams.addClone(*myWS.var("S"));((RooRealVar *) (nullparams.first()))->setVal(0);plc.SetNullParameters(nullparams); HypoTestResult* plcResult = plc.GetHypoTest();const double significance = plcResult->Significance(); //get significance
Result for S=2 you should get: Significance = 1.60987
Gaussian Model Generate Gaussian signal over flat background systematics in sigma of mass and background RooWorkspace myWS("myWS"); // ObservablemyWS.factory("mass[0,500]"); // range [0,500]// Signal and background distribution of the observable myWS.factory("Gaussian::sigPdf(mass,200,sigSigma[0,100])") ;myWS.factory("Uniform::bkgPdf(mass)") ;myWS.factory("SUM::model(S[5,0,30]*sigPdf,B[10,0,100]*bkgPdf") ;// Background only pdfmyWS.factory("ExtendPdf::modelBkg(bkgPdf,B)") ; // Prior for signalmyWS.factory("Uniform::priorPOI(S)") ;// Priors for nuisance parameters (signal + backg) myWS.factory("Gaussian::prior_sigSigma(sigSigma,50,5)") ;myWS.factory("Gaussian::prior_B(B,10,3)") ;myWS.factory("PROD::priorNuisance(prior_sigSigma,prior_B)") ; RooAbsData * data = myWS.pdf("model")->generate(*myWS.set("observables"),Extended(),Name("data")); myWS.defineSet("observables","mass"); // generate unbinned data
RooStats Exercise Compute: 68% CL 2-sided confidence interval and significance from the (profiled-) likelihood ratio plot profile log-likelihood ratio Frequentist p-value in the S+B and B-only hypotheses, signal significance, CL_S ratio (HybridCalculator) Repeat including systematic uncertainty in the background and signal width Use also Bayesian calculator , MCMCCalculator and Neyman construction for finding limits (with Poisson or Gaussian example) More examples and code available in https://twiki.cern.ch/twiki/bin/view/RooStats/TutorialsOctober2009
HybridCalculator //Run and retrieve the results HybridCalculator hc("hc","HybridCalculator",*data,*modelSB,*modelB); hc.SetNumberOfToys(5000); hs.SetTestStatistics(1); //Run and retrieve the results HybridResult* hcResult = hc.GetHypoTest(); double p_value_sb = hcResult->AlternatePValue(); double p_value_b = hcResult->NullPValue(); double cl_s = hcResult->CLs(); double significance = hcResult->Significance(); //Making a plot of the results HybridPlot* hcPlot = hcResult->GetPlot("hcPlot","p-Values plot",100); hcPlot->Draw();
Bayesian Calculator //Compute the credibility interval BayesianCalculator bc(data,*model,RooArgSet(*POI),*priorPOI,&nuisanceParameters); //Compute the credibility interval //Set the confidence level of the credibility interval and compute it. Returns a SimpleInterval. bc.SetTestSize(0.05); SimpleInterval* interval = bc.GetInterval(); double lowerLimit = interval->LowerLimit(); double upperLimit = interval->UpperLimit(); // The code below produce a plot: RooAbsPdf* fPosteriorPdf = bcalc.GetPosteriorPdf(); RooPlot* plot = POI->frame(); plot->SetTitle(TString("Posterior probability of parameter \"")+TString(POI->GetName())+TString("\"")); fPosteriorPdf->plotOn(plot,RooFit::Range(interval->LowerLimit(),interval->UpperLimit(),kFALSE),RooFit::VLines(),RooFit::DrawOption("F"),RooFit::MoveToBack(),RooFit::FillColor(kGray)); fPosteriorPdf->plotOn(plot); plot->GetYaxis()->SetTitle("posterior probability"); plot->Draw();
RooStats Exercises (2) Generate a multi-dimensional model have both S and mass as parameter of interest Compare results of Profile likelihood, Neyman construction and MCMC calculator examples are to be downloaded from: http://www.cern.ch/moneta/temp/roostats2.tar run to generate data: rs500e_PrepareWorkspace_GaussOverFlat_withSystematics_floatingMass.C to run exercise: rs501_ThreeTypesOfLimits.C
TMVA exercises Download macro code from http://www.cern.ch/moneta/temp/tmva_exercises.tar Try first generating different data-sets (use RooFit for doing it ) macro makesample.C usage: makesample(int sampleID) ID=0,1,2,3,4 depending on type of data to generate Try and play with the various methods macro driver.C driver(Int_t sampleID = 0, TString obsList = "x,y,z", TString myMethodList = "Fisher,Likelihood,MLP,BDT") Obtaine TMVA GUI from macro produce plots of the signal and background distribution look at the result, classifier outputs and performances (ROC curve )