8. Hypotheses 8.4 Two more things K. Desch – Statistical methods of data analysis SS10

Inclusion of systematic errors

The LHR method needs a prediction (from MC simulation) for the expected numbers of signal s and background b in each bin ("channel"). The statistical p.d.f.s for these numbers are Poissonian (or Gaussian, if the numbers are large).

The predictions of s and b also carry systematic uncertainties:
- finite MC statistics
- theoretical uncertainties in the production cross section
- uncertainties from detector efficiencies and acceptances
- uncertainty in the integrated luminosity
- …
Some of these uncertainties can be correlated between channels.

Tough job: determine these systematic uncertainties.
Statistical procedure: convolute the (estimated) p.d.f.s for the systematics (usually assumed Gaussian) with the Poissonian statistical p.d.f.s; a sketch of this convolution by pseudo-experiments follows below.
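Since the slide only states the procedure, here is a minimal sketch of the convolution, assuming a single channel, a 10% Gaussian systematic uncertainty on the background prediction, and illustrative event counts (none of these numbers come from the lecture). It estimates 1 − CL_b by pseudo-experiments in which the Poisson mean is fluctuated within its systematic error:

// Sketch: smear the Poisson p.d.f. of the background prediction with a
// Gaussian systematic uncertainty by averaging over pseudo-experiments.
// The prediction, its 10% uncertainty and the observed count are illustrative.
#include "TRandom3.h"
#include <iostream>

void smeared_poisson()
{
   const double b      = 7.2;     // predicted background (from MC)
   const double sig_b  = 0.10;    // relative Gaussian systematic uncertainty on b
   const int    n_obs  = 12;      // observed number of events
   const int    n_toys = 100000;  // number of pseudo-experiments

   TRandom3 rnd(0);
   int n_exceed = 0;
   for (int i = 0; i < n_toys; ++i) {
      // fluctuate the prediction within its (Gaussian) systematic error
      double b_i = b * (1.0 + sig_b * rnd.Gaus(0.0, 1.0));
      if (b_i < 0.0) b_i = 0.0;
      // draw a toy observation from the Poissonian with the fluctuated mean
      int n_i = rnd.Poisson(b_i);
      if (n_i >= n_obs) ++n_exceed;
   }
   // fraction of background-only toys at least as signal-like as the data
   double p = double(n_exceed) / n_toys;
   std::cout << "estimated 1 - CL_b (with systematics) = " << p << std::endl;
}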
8. Hypotheses 8.4 Two more things K. Desch – Statistical methods of data analysis SS10

A (rather) easy-to-use ROOT class: TLimit

public:
   TLimit()
   TLimit(const TLimit&)
   virtual ~TLimit()
   static TClass* Class()
   static TConfidenceLevel* ComputeLimit(TLimitDataSource* data, Int_t nmc = 50000, bool stat = false, TRandom* generator = 0)
   static TConfidenceLevel* ComputeLimit(Double_t s, Double_t b, Int_t d, Int_t nmc = 50000, bool stat = false, TRandom* generator = 0)
   static TConfidenceLevel* ComputeLimit(TH1* s, TH1* b, TH1* d, Int_t nmc = 50000, bool stat = false, TRandom* generator = 0)
   static TConfidenceLevel* ComputeLimit(Double_t s, Double_t b, Int_t d, TVectorD* se, TVectorD* be, TObjArray*, Int_t nmc = 50000, bool stat = false, TRandom* generator = 0)
   static TConfidenceLevel* ComputeLimit(TH1* s, TH1* b, TH1* d, TVectorD* se, TVectorD* be, TObjArray*, Int_t nmc = 50000, bool stat = false, TRandom* generator = 0)

It only needs vectors (or histograms) of signal, background and observed data (and their errors) and computes e.g. CL_b, CL_s+b, CL_s, the expected CL_b, CL_s+b, CL_s, and much more. A usage sketch follows below.
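A minimal single-channel usage sketch, built on the ComputeLimit(Double_t s, Double_t b, Int_t d, ...) overload quoted above. The TConfidenceLevel accessor names (CLb(), CLsb(), CLs()) and the input numbers are assumptions to be checked against the ROOT documentation:

// Minimal single-channel limit calculation with ROOT's TLimit (sketch).
// Event counts are illustrative; accessor names assumed, see ROOT docs.
#include "TLimit.h"
#include "TConfidenceLevel.h"
#include <iostream>

void limit_example()
{
   Double_t s = 3.5;   // expected signal events (from MC)
   Double_t b = 7.2;   // expected background events (from MC)
   Int_t    d = 8;     // observed events in data

   // 50000 pseudo-experiments, no statistical fluctuation of s and b
   TConfidenceLevel* cl = TLimit::ComputeLimit(s, b, d, 50000, false);

   std::cout << "CL_b   = " << cl->CLb()  << std::endl;
   std::cout << "CL_s+b = " << cl->CLsb() << std::endl;
   std::cout << "CL_s   = " << cl->CLs()  << std::endl;

   delete cl;
}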
8. Hypotheses 8.4 Two more things K. Desch – Statistical methods of data analysis SS10

The "look elsewhere" effect

The LHR test is an "either-or" test of two hypotheses (e.g. "Higgs at 114 GeV" vs. "no Higgs at 114 GeV"). When the question of a discovery of a new particle is asked, many "signal" hypotheses are often tested against the background hypothesis simultaneously (e.g. m = 105, 108, 111, 114, … GeV).

The probability that any of these hypotheses yields a false-positive result is larger than the probability for a single hypothesis to be false-positive. This is the "look elsewhere" effect.

If the probabilities are small, 1 − CL_b can simply be multiplied by the number of different hypotheses that are tested simultaneously. If the test mass is continuous, in principle infinitely many hypotheses are tested, but they are correlated (an excess for m_test = 114 GeV will also cause an excess for m_test = 114.5 GeV).
8. Hypotheses 8.4 Two more things K. Desch – Statistical methods of data analysis SS10

The "look elsewhere" effect (ctd.)

One needs an "effective" number of tested hypotheses, which is hard to quantify exactly.
Ansatz: two hypotheses are uncorrelated if their reconstructed mass distributions do not overlap.
Estimate: effective number of hypotheses ≈ range of test masses / average mass resolution. A numerical sketch of this trial-factor correction follows below.
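A small worked sketch of the correction described on the last two slides; the mass range, the resolution and the local p-value are illustrative assumptions, not numbers from the lecture:

// Sketch of the "look elsewhere" trial-factor correction.
// All numbers (mass range, resolution, local p-value) are illustrative.
#include <algorithm>
#include <iostream>

int main()
{
   const double mass_range = 100.0;  // GeV, scanned range of test masses
   const double resolution = 3.0;    // GeV, average reconstructed-mass resolution
   const double p_local    = 1e-4;   // local false-positive probability, 1 - CL_b

   // effective number of (approximately) uncorrelated hypotheses
   const double n_eff = mass_range / resolution;

   // for small probabilities the global p-value is approximately
   // the local one multiplied by the number of independent tests
   const double p_global = std::min(1.0, n_eff * p_local);

   std::cout << "N_eff    ~ " << n_eff    << "\n"
             << "p_global ~ " << p_global << std::endl;
   return 0;
}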
9. Classification K. Desch – Statistical methods of data analysis SS10

Task: how to find needles in haystacks? How to enrich a sample of events with "signal-like" events?

Why is it important? Only one out of 10^11 LHC events contains a Higgs (decay), if it exists. Its existence ("discovery") can only be shown if a statistically significant excess can be extracted from the data. Most particle physics analyses require the separation of a signal from background(s) based on a set of discriminating variables.
9. Classification K. Desch – Statistical methods of data analysis SS10

Different types of input information (discriminating variables):
- Kinematic variables (masses, momenta, decay angles, …)
- Event properties (jet/lepton multiplicity, sum of charges, …)
- Event shape (sphericity, …)
- Detector response (silicon hits, dE/dx, Cherenkov angle, shower profiles, …)
- etc.

Multivariate analysis: combine these variables to extract the maximum discriminating power between signal and background.
9. Classification K. Desch – Statistical methods of data analysis SS10

Suppose a data sample with two types of events: H_0 and H_1. We have found discriminating input variables x_1, x_2, … What decision boundary should we use to select events of type H_1?
- Rectangular cuts
- Linear boundary
- Nonlinear boundary(ies)
A toy sketch of the first two options is shown below.
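A toy illustration of rectangular cuts versus a linear decision boundary for two input variables; the cut values and linear coefficients are arbitrary assumptions, not values from the lecture:

// Toy classifiers for two input variables x1, x2: rectangular cuts vs. a
// linear decision boundary. All thresholds and coefficients are arbitrary.
#include <iostream>

// select H1 if both variables pass fixed one-dimensional cuts
bool passRectangularCuts(double x1, double x2)
{
   return (x1 > 0.5) && (x2 > 1.2);
}

// select H1 if a linear combination of the variables exceeds a threshold
bool passLinearBoundary(double x1, double x2)
{
   const double w1 = 0.8, w2 = 0.6, threshold = 1.0;
   return (w1 * x1 + w2 * x2) > threshold;
}

int main()
{
   const double x1 = 0.7, x2 = 1.5;   // one example event
   std::cout << "rectangular cuts: " << passRectangularCuts(x1, x2) << "\n"
             << "linear boundary:  " << passLinearBoundary(x1, x2) << std::endl;
   return 0;
}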
9. Classification K. Desch – Statistical methods of data analysis SS10 [H.Voss]
(seven figure-only slides; no transcribed text)
9. Classification MVA algorithms K. Desch – Statistical methods of data analysis SS10

Finding the optimal Multivariate Analysis (MVA) algorithm is not trivial; a large variety of different algorithms exists.
9. Classification (Projective) Likelihood-Selection K. Desch – Statistical methods of data analysis SS10 [H.Voss]
(two figure-only slides on the projective likelihood method; no transcribed text. A sketch of the method is given below.)
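Since only the slide titles survive, here is a minimal sketch of the projective likelihood (naive Bayes) classifier they refer to: each input variable enters through its one-dimensional signal and background p.d.f.s, the per-variable likelihoods are multiplied (ignoring correlations), and the ratio y = L_S / (L_S + L_B) is used as the classifier output. The Gaussian p.d.f. shapes and parameters below are purely illustrative assumptions.

// Sketch of a projective likelihood classifier: multiply 1D p.d.f.s of the
// input variables for signal and background and form y = L_S / (L_S + L_B).
// The Gaussian shapes and their parameters are illustrative only; in practice
// the 1D p.d.f.s are estimated from MC reference samples.
#include <cmath>
#include <iostream>

double gaussianPdf(double x, double mean, double sigma)
{
   const double kPi = 3.14159265358979323846;
   const double z = (x - mean) / sigma;
   return std::exp(-0.5 * z * z) / (sigma * std::sqrt(2.0 * kPi));
}

// per-event classifier output for two input variables x1, x2
double projectiveLikelihood(double x1, double x2)
{
   const double ls = gaussianPdf(x1, 1.0, 0.5) * gaussianPdf(x2, 2.0, 0.8); // signal
   const double lb = gaussianPdf(x1, 0.0, 1.0) * gaussianPdf(x2, 0.5, 1.2); // background
   return ls / (ls + lb);   // close to 1: signal-like, close to 0: background-like
}

int main()
{
   std::cout << "y(1.1, 1.9) = " << projectiveLikelihood(1.1, 1.9) << std::endl;
   return 0;
}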