
1 Probability and Statistics
- What is probability?
- What is statistics?

2 Probability and Statistics
- Probability
  - Formally defined using a set of axioms
  - Seeks to determine the likelihood that a given event, observation, or measurement will happen or has happened
  - What is the probability of throwing a 7 using two dice?
- Statistics
  - Used to analyze the frequency of past events
  - Uses a given sample of data to assess a probabilistic model's validity or to determine values of its parameters
  - After observing several throws of two dice, can I determine whether or not they are loaded?
  - Also depends on what we mean by probability

3 Probability and Statistics
- We perform an experiment to collect a number of top quarks
  - How do we extract the best value for its mass?
  - What is the uncertainty of our best value?
  - Is our experiment internally consistent?
  - Is this value consistent with a given theory, which itself may contain uncertainties?
  - Is this value consistent with other measurements of the top quark mass?

4 Probability and Statistics
- CDF "discovery" announced 4/11/2011

5 Probability and Statistics

6 Pentaquark search - how can this occur?
- 2003 – 6.8σ effect
- 2005 – no effect

7 Probability
- Let the sample space S be the space of all possible outcomes of an experiment
- Let x be a possible outcome
  - Then P(x found in [x, x+dx]) = f(x)dx
  - f(x) is called the probability density function (pdf)
  - It may be written f(x; θ) since the pdf could depend on one or more parameters θ
  - Often we will want to determine θ from a set of measurements
  - Of course x must be somewhere, so the pdf must integrate to 1 over S
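
In standard notation, the normalization requirement referred to here is:

```latex
\[
\int_{S} f(x;\theta)\,dx = 1
\]
```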

8 Probability
- Definitions of mean and variance are given in terms of expectation values
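
The standard definitions in terms of expectation values are:

```latex
\[
\mu \equiv E[x] = \int x\,f(x)\,dx,
\qquad
\sigma^{2} \equiv V[x] = E\!\left[(x-\mu)^{2}\right] = E[x^{2}] - \mu^{2}
\]
```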

9 Probability
- Definitions of covariance and correlation coefficient
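
In the usual notation these are:

```latex
\[
\operatorname{cov}[x,y] = E\!\left[(x-\mu_x)(y-\mu_y)\right] = E[xy] - \mu_x\,\mu_y,
\qquad
\rho_{xy} = \frac{\operatorname{cov}[x,y]}{\sigma_x\,\sigma_y}
\]
```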

10 Probability
- Error propagation
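
For a function y(x1, …, xn) of measured quantities with covariance matrix Vij, the standard first-order propagation formula is:

```latex
\[
\sigma_y^{2} \approx \sum_{i,j}
\frac{\partial y}{\partial x_i}\,\frac{\partial y}{\partial x_j}\,V_{ij}
= \sum_i \left(\frac{\partial y}{\partial x_i}\right)^{2}\sigma_i^{2}
+ \sum_{i\neq j}\frac{\partial y}{\partial x_i}\,\frac{\partial y}{\partial x_j}\,
\operatorname{cov}[x_i,x_j]
\]
```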

11 Probability
- This gives the familiar error propagation formulas for sums (or differences) and products (or ratios)
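
For uncorrelated x1 and x2 these special cases read:

```latex
\[
y = x_1 \pm x_2:\quad \sigma_y^{2} = \sigma_1^{2} + \sigma_2^{2},
\qquad
y = x_1 x_2 \ \text{or}\ y = x_1/x_2:\quad
\left(\frac{\sigma_y}{y}\right)^{2}
= \left(\frac{\sigma_1}{x_1}\right)^{2} + \left(\frac{\sigma_2}{x_2}\right)^{2}
\]
```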

12 Uniform Distribution
- Let x be uniformly distributed
- What is the position resolution of a silicon or multiwire proportional chamber with detection elements of width Δx?
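
Assuming x is uniform over one detection element of width Δx, the standard result is:

```latex
\[
f(x) = \frac{1}{\Delta x}\ \ \text{for}\ x \in \left[-\tfrac{\Delta x}{2},\,\tfrac{\Delta x}{2}\right]
\quad\Longrightarrow\quad
V[x] = \int_{-\Delta x/2}^{\Delta x/2}\frac{x^{2}}{\Delta x}\,dx
= \frac{(\Delta x)^{2}}{12},
\qquad
\sigma_x = \frac{\Delta x}{\sqrt{12}}
\]
```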

13 Binomial Distribution
- Consider N independent experiments (Bernoulli trials)
- Let the outcome of each be pass or fail
- Let the probability of pass = p

14 Permutations
- Quick review
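
The standard counting formulas being reviewed:

```latex
\[
\text{permutations of } n \text{ objects chosen from } N:\ \frac{N!}{(N-n)!},
\qquad
\text{combinations (order irrelevant)}:\ \binom{N}{n} = \frac{N!}{n!\,(N-n)!}
\]
```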

15 Binomial Distribution
- For the mean and variance we obtain (using small tricks)
- And note with the binomial theorem that the distribution is properly normalized
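
In standard notation the results are:

```latex
\[
E[n] = Np,
\qquad
V[n] = Np(1-p),
\qquad
\sum_{n=0}^{N}\binom{N}{n}p^{n}(1-p)^{N-n} = \bigl(p + (1-p)\bigr)^{N} = 1
\]
```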

16 Binomial Distribution
- Binomial pdf
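
The binomial pdf for n passes in N trials:

```latex
\[
f(n;N,p) = \binom{N}{n}\,p^{n}\,(1-p)^{N-n},
\qquad n = 0,1,\ldots,N
\]
```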

17 Binomial Distribution
- Examples
  - Coin flip (p = 1/2)
  - Dice throw (p = 1/6)
  - Branching ratio of nuclear and particle decays (p = BR)
  - Detector or trigger efficiencies (pass or not pass)
  - Blood group B or not blood group B

18 Binomial Distribution
- It's baseball season! What is the probability of a 0.300 hitter getting 4 hits in one game?
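
A worked answer under the usual assumption of exactly 4 at-bats in the game (an assumption, not stated on the slide):

```latex
\[
P(4\ \text{hits in }4\ \text{at-bats}) = \binom{4}{4}(0.300)^{4}(0.700)^{0} = 0.0081 \approx 0.8\%
\]
```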

19 Poisson Distribution
- Consider the binomial distribution in the limit N → ∞, p → 0 with ν = Np held fixed
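
A sketch of the limit, with ν = Np held fixed:

```latex
\[
\binom{N}{n}p^{n}(1-p)^{N-n}
= \frac{N!}{n!\,(N-n)!}\left(\frac{\nu}{N}\right)^{n}\left(1-\frac{\nu}{N}\right)^{N-n}
\;\xrightarrow{\;N\to\infty\;}\;
\frac{\nu^{n}e^{-\nu}}{n!}
\]
```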

20 Poisson Distribution

21 Poisson Distribution
- Poisson pdf
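
The Poisson pdf for observing n events when ν are expected:

```latex
\[
f(n;\nu) = \frac{\nu^{n}e^{-\nu}}{n!},
\qquad n = 0,1,2,\ldots,
\qquad
E[n] = \nu,\quad V[n] = \nu
\]
```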

22 Poisson Distribution
- Examples
  - Particles detected from radioactive decays
  - Particles detected from scattering of a beam on a target with cross section σ
  - Cosmic rays observed in a time interval t
  - Number of entries in a histogram bin when data are accumulated over a fixed time interval
  - Number of Prussian soldiers kicked to death by horses
  - Infant mortality
  - QC/failure rate predictions
- The sum of two Poisson processes is a Poisson process

23 Poisson Distribution
- Let

24 Gaussian Distribution
- Important because of the central limit theorem
  - For N independent variables x1, x2, …, xN distributed according to any pdf, the sum y = ∑ xi has a pdf that approaches a Gaussian for large N
- Examples are almost any measurement error (energy resolution, position resolution, …)

25 Gaussian Distribution
- The familiar Gaussian pdf is
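
In standard notation:

```latex
\[
f(x;\mu,\sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\,
\exp\!\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right),
\qquad
E[x] = \mu,\quad V[x] = \sigma^{2}
\]
```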

26 Gaussian Distribution
- Some useful properties of the Gaussian distribution are
  - P(x in range μ ± σ) = 0.683
  - P(x in range μ ± 2σ) = 0.9545
  - P(x in range μ ± 3σ) = 0.9973
  - P(x outside range μ ± 3σ) = 0.0027
  - P(x outside range μ ± 5σ) = 5.7×10⁻⁷
  - P(x in range μ ± 0.6745σ) = 0.5

27 χ² Distribution
- Chi-square distribution
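
The χ² pdf for n degrees of freedom:

```latex
\[
f(z;n) = \frac{1}{2^{n/2}\,\Gamma(n/2)}\,z^{\,n/2-1}\,e^{-z/2},
\qquad z \ge 0,
\qquad
E[z] = n,\quad V[z] = 2n
\]
```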

28 χ² Distribution

29 Probability

30 Probability
- Probability can be defined in terms of the Kolmogorov axioms
  - The probability is a real-valued function defined on subsets A, B, … of the sample space S
  - This means the probability is a measure in which the measure of the entire sample space is 1
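
The three axioms in standard form:

```latex
\[
1)\ \ P(A) \ge 0 \ \text{for every } A \subseteq S,
\qquad
2)\ \ P(S) = 1,
\qquad
3)\ \ P(A \cup B) = P(A) + P(B) \ \text{if } A \cap B = \varnothing
\]
```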

31 Probability
- We further define the conditional probability P(A|B), read "the probability of A given B"
- Bayes' theorem
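
In standard notation:

```latex
\[
P(A|B) = \frac{P(A \cap B)}{P(B)},
\qquad
P(A|B) = \frac{P(B|A)\,P(A)}{P(B)}
\quad\text{(Bayes' theorem)}
\]
```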

32 Probability
- For disjoint Ai
- Usually one treats the Ai as outcomes of a repeatable experiment
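
Presumably the formula here is the law of total probability for disjoint Ai covering S, together with the resulting form of Bayes' theorem:

```latex
\[
P(B) = \sum_i P(B|A_i)\,P(A_i)
\qquad\Longrightarrow\qquad
P(A_i|B) = \frac{P(B|A_i)\,P(A_i)}{\sum_j P(B|A_j)\,P(A_j)}
\]
```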

33 Probability
- Usually one treats the Ai as outcomes of a repeatable experiment
  - Then P(A) is usually assigned a value equal to the limiting frequency of occurrence of A
  - Called frequentist statistics
- But the Ai could also be interpreted as hypotheses, each of which is true or false
  - Then P(A) represents the degree of belief that hypothesis A is true
  - Called Bayesian statistics

34 Bayes' Theorem
- Suppose in the general population
  - P(disease) = 0.001
  - P(no disease) = 0.999
- Suppose there is a test to check for the disease
  - P(+ | disease) = 0.98
  - P(- | disease) = 0.02
- But also
  - P(+ | no disease) = 0.03
  - P(- | no disease) = 0.97
- You are tested for the disease and it comes back +. Should you be worried?

35 Bayes' Theorem
- Apply Bayes' theorem
- 3.2% of people testing positive have the disease
- Your degree of belief about having the disease is 3.2%
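
Written out, the calculation behind the 3.2% is:

```latex
\[
P(\text{disease}\mid +)
= \frac{P(+\mid\text{disease})\,P(\text{disease})}
       {P(+\mid\text{disease})\,P(\text{disease}) + P(+\mid\text{no disease})\,P(\text{no disease})}
= \frac{0.98 \times 0.001}{0.98 \times 0.001 + 0.03 \times 0.999}
\approx 0.032
\]
```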

36 Bayes' Theorem
- Is athlete A guilty of drug doping?
- Assume a population of athletes in this sport
  - P(drug) = 0.005
  - P(no drug) = 0.995
- Suppose there is a test to check for the drug
  - P(+ | drug) = 0.99
  - P(- | drug) = 0.01
- But also
  - P(+ | no drug) = 0.004
  - P(- | no drug) = 0.996
- The athlete tests positive. Is he/she involved in drug doping?

37 Bayes' Theorem
- Apply Bayes' theorem
- ???
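
For reference, the same calculation with the numbers from the previous slide gives roughly a 55% degree of belief:

```latex
\[
P(\text{drug}\mid +)
= \frac{0.99 \times 0.005}{0.99 \times 0.005 + 0.004 \times 0.995}
= \frac{0.00495}{0.00893}
\approx 0.55
\]
```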

38 Binomial Distribution
- Calculating efficiencies
  - Usually use ε instead of p
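
The naive estimate and its binomial uncertainty, which the next slide's objection refers to:

```latex
\[
\hat{\varepsilon} = \frac{n}{N},
\qquad
\sigma(\hat{\varepsilon}) = \sqrt{\frac{\hat{\varepsilon}\,(1-\hat{\varepsilon})}{N}}
\]
```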

39 Binomial Distribution
- But there is a problem
  - If n = 0, σ(ε′) = 0
  - If n = N, σ(ε′) = 0
- Actually we went wrong in assuming the best estimate for ε is n/N
  - We should really have used the most probable value of ε given n and N
- A proper treatment uses Bayes' theorem, but lucky for us (in HEP) the solution is implemented in ROOT:
  h_num->Sumw2();
  h_den->Sumw2();
  h_eff->Divide(h_num, h_den, 1.0, 1.0, "B");
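
A self-contained toy ROOT macro sketching this recipe end to end; the histogram names, binning, macro name, and the toy 80% efficiency are illustrative assumptions, not from the slide:

```cpp
#include "TH1F.h"
#include "TRandom3.h"

void efficiency_example() {
   // Denominator: all candidates; numerator: candidates passing the selection
   TH1F *h_den = new TH1F("h_den", "all candidates;p_{T};counts", 20, 0., 100.);
   TH1F *h_num = new TH1F("h_num", "passing candidates;p_{T};counts", 20, 0., 100.);

   TRandom3 rng(0);
   for (int i = 0; i < 10000; ++i) {
      double pt = rng.Uniform(0., 100.);
      h_den->Fill(pt);
      if (rng.Uniform() < 0.8) h_num->Fill(pt);   // toy 80% efficiency
   }

   // Store per-bin sums of squared weights so Divide can propagate errors
   h_num->Sumw2();
   h_den->Sumw2();

   // "B" requests binomial errors on the per-bin ratio
   TH1F *h_eff = (TH1F*)h_num->Clone("h_eff");
   h_eff->Divide(h_num, h_den, 1.0, 1.0, "B");
   h_eff->Draw();
}
```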

