8. Hypotheses 8.4 Two more things K. Desch – Statistical methods of data analysis SS10

1 8. Hypotheses 8.4 Two more things K. Desch – Statistical methods of data analysis SS10
Inclusion of systematic errors
LHR methods need a prediction (from MC simulation) for the expected numbers of signal (s) and background (b) events in each bin ("channel").
The statistical p.d.f.s for these numbers are Poissonian (or Gaussian, if the numbers are large).
The predictions of s and b also have systematic uncertainties:
- finite MC statistics
- theoretical uncertainties in the production cross section
- uncertainties from detector efficiencies and acceptances
- uncertainty in the integrated luminosity
- …
Some of these uncertainties can be correlated between channels.
Tough job: determine these systematic uncertainties.
Statistical procedure: convolve the (estimated) p.d.f.s for the systematics (usually assumed Gaussian) with the Poissonian statistical p.d.f.s.
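
The convolution of a Gaussian systematic uncertainty with a Poissonian p.d.f. can be sketched numerically for a single channel. The following is a minimal illustration, not the procedure used in the lecture or in any particular tool: the function names, the midpoint integration and the truncation of the Gaussian at zero and at five standard deviations are all assumptions made for this sketch.

```python
import math


def poisson_pmf(n, mu):
    """Poisson probability of observing n events for expectation mu."""
    if mu <= 0.0:
        return 1.0 if n == 0 else 0.0
    # Work in log space to stay stable for large n.
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))


def smeared_poisson_pmf(n, mu, sigma, steps=2000, width=5.0):
    """Poisson pmf averaged over a Gaussian uncertainty sigma on mu.

    Assumption of this sketch: the Gaussian is truncated at zero and at
    +-width standard deviations, and renormalised over that range.
    """
    if sigma <= 0.0:
        return poisson_pmf(n, mu)
    lo = max(0.0, mu - width * sigma)
    hi = mu + width * sigma
    h = (hi - lo) / steps
    weighted_sum = 0.0
    norm = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h  # midpoint of the i-th integration cell
        w = math.exp(-0.5 * ((x - mu) / sigma) ** 2)
        weighted_sum += w * poisson_pmf(n, x)
        norm += w
    return weighted_sum / norm
```

The smeared distribution is wider than the pure Poissonian, so the probability of a large upward fluctuation of the background grows once the systematic uncertainty is included.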

2 8. Hypotheses 8.4 Two more things
(Rather) easy-to-use ROOT class TLimit
    public:
        TLimit()
        TLimit(const TLimit&)
        virtual ~TLimit()
        static TClass* Class()
        static TConfidenceLevel* ComputeLimit(TLimitDataSource* data, Int_t nmc = 50000, bool stat = false, TRandom* generator = 0)
        static TConfidenceLevel* ComputeLimit(Double_t s, Double_t b, Int_t d, Int_t nmc = 50000, bool stat = false, TRandom* generator = 0)
        static TConfidenceLevel* ComputeLimit(TH1* s, TH1* b, TH1* d, Int_t nmc = 50000, bool stat = false, TRandom* generator = 0)
        static TConfidenceLevel* ComputeLimit(Double_t s, Double_t b, Int_t d, TVectorD* se, TVectorD* be, TObjArray*, Int_t nmc = 50000, bool stat = false, TRandom* generator = 0)
        static TConfidenceLevel* ComputeLimit(TH1* s, TH1* b, TH1* d, TVectorD* se, TVectorD* be, TObjArray*, Int_t nmc = 50000, bool stat = false, TRandom* generator = 0)
It only needs vectors of signal, background and observed data (and their errors) and computes, e.g., CL_b, CL_{s+b}, CL_s, the expected CL_b, CL_{s+b}, CL_s, and much more.
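
To make the quantities CL_b, CL_{s+b} and CL_s concrete, here is a minimal sketch for one counting channel without systematic errors. It does not call ROOT and is not TLimit's implementation. It assumes that the observed event count d itself is used as the test statistic, so that CL_{s+b} = P(n <= d | s+b), CL_b = P(n <= d | b) and CL_s = CL_{s+b} / CL_b. The function names are hypothetical.

```python
import math


def poisson_cdf(d, mu):
    """P(n <= d) for a Poisson distribution with expectation mu (d >= 0)."""
    term = math.exp(-mu)  # P(n = 0)
    total = term
    for k in range(1, d + 1):
        term *= mu / k    # P(n = k) from P(n = k - 1)
        total += term
    return total


def confidence_levels(s, b, d):
    """Return (CL_sb, CL_b, CL_s) for expected signal s, background b, observed d."""
    cl_sb = poisson_cdf(d, s + b)
    cl_b = poisson_cdf(d, b)
    return cl_sb, cl_b, cl_sb / cl_b


# Illustrative numbers only: 3 expected signal events, 1 expected background
# event, 0 events observed.
cl_sb, cl_b, cl_s = confidence_levels(3.0, 1.0, 0)
```

A small CL_s (conventionally below 0.05) is read as an exclusion of the signal hypothesis at the corresponding confidence level, while 1 - CL_b measures how incompatible the data are with background alone.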

3 8. Hypotheses 8.4 Two more things
The "look elsewhere" effect
The LHR test is an "either-or" test of two hypotheses (e.g. "Higgs at 114 GeV" or "no Higgs at 114 GeV").
When the question of a discovery of a new particle is asked, often many "signal" hypotheses are tested against the background hypothesis simultaneously (e.g. m=105, m=108, m=111, m=114, …).
The probability that any of these hypotheses yields a "false-positive" result is larger than the probability for a single hypothesis to be false-positive. This is the "look elsewhere" effect.
If the probabilities are small, 1 - CL_b can simply be multiplied by the number of different hypotheses that are tested simultaneously.
If the "test mass" is continuous, in principle infinitely many hypotheses are tested, but they are correlated (an excess for m_test = 114 will also cause an excess for m_test = 114.5).
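
The multiplication rule can be checked against the exact expression for independent tests. This is a small sketch under the assumption that the N hypotheses are statistically independent, in which case the probability of at least one false positive is 1 - (1 - p)^N. The function names are hypothetical.

```python
def global_p_value(p_local, n_hypotheses):
    """Probability of at least one false positive among independent tests."""
    return 1.0 - (1.0 - p_local) ** n_hypotheses


def global_p_value_approx(p_local, n_hypotheses):
    """Small-probability approximation: multiply by the number of hypotheses."""
    return min(1.0, p_local * n_hypotheses)


# Illustrative numbers only: local 1 - CL_b of 0.001, four tested masses.
exact = global_p_value(0.001, 4)
approx = global_p_value_approx(0.001, 4)
```

For small local probabilities the two values agree closely, and the simple product is always the slightly more conservative (larger) of the two.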

4 8. Hypotheses 8.4 Two more things
The "look elsewhere" effect (ctd.)
An "effective" number of tested hypotheses is needed, which is hard to quantify exactly.
Ansatz: two hypotheses are uncorrelated if their reconstructed mass distributions do not overlap.
Estimate: effective number of hypotheses = range of test masses / average mass resolution
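
The estimate above amounts to a one-line calculation. The sketch below uses hypothetical function names and purely illustrative numbers. The lower bound of one hypothesis is an assumption added here so that a scan range narrower than the resolution still counts as a single test.

```python
def effective_number_of_hypotheses(mass_min, mass_max, mass_resolution):
    """Range of test masses divided by the average mass resolution."""
    # Assumption of this sketch: at least one hypothesis is always tested.
    return max(1.0, (mass_max - mass_min) / mass_resolution)


def corrected_p_value(p_local, mass_min, mass_max, mass_resolution):
    """Local 1 - CL_b scaled by the effective number of hypotheses (small p)."""
    n_eff = effective_number_of_hypotheses(mass_min, mass_max, mass_resolution)
    return min(1.0, p_local * n_eff)


# Illustrative numbers only: scan from 100 to 120 with a resolution of 2.5
# (same mass units), local 1 - CL_b of 0.0001.
n_eff = effective_number_of_hypotheses(100.0, 120.0, 2.5)
p_corrected = corrected_p_value(0.0001, 100.0, 120.0, 2.5)
```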

5 9. Classification
Task: how to find needles in haystacks? How to enrich a sample of events with "signal-like" events?
Why is it important? Only one out of 10^11 LHC events contains a Higgs (decay), if it exists.
Existence ("discovery") can only be shown if a statistically significant excess can be extracted from the data.
Most particle physics analyses require separation of a signal from background(s) based on a set of discriminating variables.

6 9. Classification
Different types of input information (discriminating variables) → multivariate analysis
Combine these variables to extract the maximum of discriminating power between signal and background.
- Kinematic variables (masses, momenta, decay angles, …)
- Event properties (jet/lepton multiplicity, sum of charges, …)
- Event shape (sphericity, …)
- Detector response (silicon hits, dE/dx, Cherenkov angle, shower profiles, …)
- etc.

7 9. Classification
Suppose a data sample with two types of events: H_0, H_1.
We have found discriminating input variables x_1, x_2, …
What decision boundary should we use to select events of type H_1?
- Rectangular cuts
- Linear boundary
- Nonlinear boundary(ies)
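
The three kinds of decision boundary can be written as simple selection functions on two input variables. This is only a sketch: all thresholds, weights and the circular shape chosen for the nonlinear case are hypothetical values for illustration, not taken from the lecture.

```python
def rectangular_cuts(x1, x2, x1_min=0.5, x2_min=0.5):
    """Select the event if each variable passes its own threshold."""
    return x1 > x1_min and x2 > x2_min


def linear_boundary(x1, x2, w1=1.0, w2=1.0, threshold=1.2):
    """Select the event if a weighted sum of the variables exceeds a threshold."""
    return w1 * x1 + w2 * x2 > threshold


def nonlinear_boundary(x1, x2, centre=(1.0, 1.0), radius=0.6):
    """Select the event if it lies inside a circle (one possible nonlinear boundary)."""
    return (x1 - centre[0]) ** 2 + (x2 - centre[1]) ** 2 < radius ** 2


# An event with a large x1 but a small x2 fails the rectangular cuts,
# yet is accepted by the linear boundary.
event = (0.9, 0.4)
passes_rect = rectangular_cuts(*event)
passes_linear = linear_boundary(*event)
```

The example event shows why the choice matters: rectangular cuts cannot trade a weak value in one variable against a strong value in another, while a linear or nonlinear boundary can.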

8-14 9. Classification
[figures only; credit: H.Voss]

15 9. Classification
MVA algorithms
Finding the optimal multivariate analysis (MVA) algorithm is not trivial.
A large variety of different algorithms exists.

16-17 9. Classification
(Projective) Likelihood-Selection
[figures only; credit: H.Voss]

