Probability and Statistics for Particle Physics
Javier Magnin, CBPF – Brazilian Center for Research in Physics, Rio de Janeiro, Brazil
Outline
Course: three one-hour lectures
1st lecture:
- General ideas / preliminary concepts
- Probability and statistics
- Distributions
2nd lecture:
- Error matrix
- Combining errors / results
- Parameter fitting and hypothesis testing
3rd lecture:
- Parameter fitting and hypothesis testing (cont.)
- Examples of fitting procedures
1st lecture
Preliminary concepts
Two types of experimental results:
- Determination of the numerical value of some physical quantity (parameter determination)
- Testing whether a particular theory is consistent with data (hypothesis testing)
In real life there is some overlap between the two types. We will go through both along these lectures.
Why estimate errors?
Example: consider the accepted value of the speed of light, c = 2.998 x 10^8 m/s. Assume that a new measurement gives c' = (3.09 ± x) x 10^8 m/s.
Question: are these two numbers consistent?
Why estimate errors?
Example: consider the accepted value of the speed of light, c = 2.998 x 10^8 m/s. Assume that a new measurement gives c' = (3.09 ± x) x 10^8 m/s.
- If x = 0.15, the new determination is consistent with the accepted value.
- If x = 0.01, the two values are inconsistent: there is evidence for a change in the speed of light!
- If x = 2, the two values are consistent, but the accuracy is so low that it is impossible to detect a change in c!
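A quick numerical check of this example. This is a minimal sketch: the function name and the two-sigma criterion used to label a result "consistent" are illustrative assumptions, not part of the lecture.

```python
# Sketch: how many standard deviations separate the new measurement of c
# from the accepted value, for the three quoted uncertainties x.
# The 2-sigma cut used below is an illustrative convention.

C_ACCEPTED = 2.998e8   # accepted speed of light, m/s
C_NEW = 3.09e8         # new measurement, m/s

def n_sigma(measured, accepted, sigma):
    """Number of standard deviations between measurement and accepted value."""
    return abs(measured - accepted) / sigma

for x in (0.15e8, 0.01e8, 2e8):
    pull = n_sigma(C_NEW, C_ACCEPTED, x)
    verdict = "consistent" if pull < 2 else "inconsistent"
    print(f"x = {x:.2e} m/s  ->  {pull:5.1f} sigma  ({verdict})")
```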
Random and systematic errors
Consider the experiment of determining the decay constant of a radioactive source:
- count how many decays are observed in a time interval t;
- determine the decay rate from the observed number of decays and the number of decaying nuclei in the sample.
Random errors:
- inherent statistical error in counting events
- uncertainty in the mass of the sample
- timing of the period over which the decays are observed
Systematic errors:
- efficiency of the counter used to detect the decays
- background (i.e. particles coming from other sources)
- purity of the radioactive sample
- calibration errors
Probability
Suppose you repeat an experiment several times. Even if you are able to keep the essential conditions constant, the repetitions will produce different results. The result of an individual measurement is unpredictable, even though the possible results of a series of measurements have a well-defined distribution.
Definition: the probability p of obtaining a specific result when performing one measurement or trial is defined as
p = (number of times that result occurs) / (total number of measurements or trials)
Rules of probability
1. If P(A) is the probability of a given event A, then 0 ≤ P(A) ≤ 1.
2. The probability P(A+B) that at least A or B occurs satisfies P(A+B) ≤ P(A) + P(B). The equality holds only if A and B are exclusive events.
3. The probability P(AB) of obtaining both A and B is P(AB) = P(A|B)P(B) = P(B|A)P(A), where P(A|B) is the probability of obtaining A given that B has occurred (P(A|B) is known as the conditional probability of A given B).
Rule 3 defines the conditional probability as P(A|B) = P(AB)/P(B).
Comments (about the rules!)
- P(A+B) = P(A) + P(B) – P(AB), to avoid double counting!
- P(A|B) = P(AB)/P(B) = (N_C/N)/(N_B/N) = N_C/N_B, where N is the total number of trials, N_B the number in which B occurs, and N_C the number in which both A and B occur.
- In general P(A|B) ≠ P(B|A).
- If P(A|B) = P(A), then A and B are independent, which is equivalent to saying that P(AB) = P(A)P(B).
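The counting relation P(A|B) = N_C/N_B can be checked on simulated data. A minimal sketch, assuming invented events A and B defined on dice throws (the events and sample size are illustrative, not from the lecture):

```python
# Sketch: estimate P(A|B) = N_C / N_B by counting on simulated dice throws.
# A = "throw is even", B = "throw is greater than 3" (illustrative choices).
import random

random.seed(1)
N = 100_000
throws = [random.randint(1, 6) for _ in range(N)]

N_B = sum(1 for t in throws if t > 3)                 # B occurs
N_C = sum(1 for t in throws if t > 3 and t % 2 == 0)  # both A and B occur

print("P(A|B) from counts :", N_C / N_B)              # should approach 2/3
print("P(AB)/P(B)         :", (N_C / N) / (N_B / N))  # identical by construction
```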
Example: use of conditional probability
Measurement of the mass difference Δm = m(K_L) – m(K_S) from the K0 p and anti-K0 p cross sections:
- K+ + p → K0 + pi+ + p (production)
- K0 + p → K0 + p (scattering)
- K0 → pi+ + pi- (decay)
The K0 is detected through its decay (event B). We want to measure K0 p → K0 p (event A). We are therefore interested in P(AB) = P(B|A)P(A).
Bayes' theorem
Let the sample space Ω be spanned by n mutually exclusive and exhaustive sets B_i, with Σ_i P(B_i) = 1. If A is also a set belonging to Ω, then
P(B_i|A) = P(A|B_i) P(B_i) / Σ_j P(A|B_j) P(B_j)
Example (Bayes' theorem)
Consider three drawers B1, B2, B3, each containing two coins: B1 has two gold coins, B2 has one gold and one silver, and B3 has two silver coins. Select a drawer at random and pick a coin from it. Given that the coin is gold, what is the probability that the second coin in the same drawer is also gold (equivalently, what is the probability that drawer B1 was selected, given that a gold coin was picked)?
If A is the event of first picking a gold coin, then
P(A|B1) = 1;  P(A|B2) = 1/2;  P(A|B3) = 0
Since the drawer is selected at random, P(B1) = P(B2) = P(B3) = 1/3.
Hence, from Bayes' theorem,
P(B1|A) = P(A|B1) P(B1) / Σ_j P(A|B_j) P(B_j) = (1 x 1/3) / (1 x 1/3 + 1/2 x 1/3 + 0 x 1/3) = 2/3
Thus, although the probability of selecting drawer B1 is 1/3, the observation that the first coin drawn was gold doubles the probability that drawer B1 had been selected.
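The same arithmetic written out so the normalization sum is explicit. A minimal sketch of the drawer example; the dictionary layout and variable names are assumptions of this note.

```python
# Bayes' theorem applied to the three-drawer example:
# P(B1|A) = P(A|B1) P(B1) / sum_j P(A|Bj) P(Bj)
p_gold_given_drawer = {"B1": 1.0, "B2": 0.5, "B3": 0.0}  # P(A|Bj)
p_drawer = {"B1": 1/3, "B2": 1/3, "B3": 1/3}             # P(Bj)

norm = sum(p_gold_given_drawer[b] * p_drawer[b] for b in p_drawer)
posterior = {b: p_gold_given_drawer[b] * p_drawer[b] / norm for b in p_drawer}

print(posterior)  # {'B1': 0.666..., 'B2': 0.333..., 'B3': 0.0}
```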
Statistics
Probability: we start with a well-defined problem and calculate from it the possible outcomes of a specific experiment.
Statistics: we use data from an experiment to deduce the rules or laws relevant to the experiment.
Statistics: two different problems (with some overlap)
- Hypothesis testing: use data to test the hypothesis of a model
- Parameter determination: use data to measure the parameters of a given model or theory
Probability vs. statistics
Probability (from theory to data):
- Toss a coin. If P(heads) = 0.5, how many heads do you get in 50 tosses?
- Given a sample of K* mesons of known polarization, what is the forward/backward asymmetry?
Statistics (from data to theory):
- If you observe 27 heads in 50 tosses, what is the value of P(heads)? (*)
- Of 1000 K* mesons, 600 are observed to decay forward. What is the K* polarization? (**)
(*) Parameter determination: deduce a quantity and its error
(**) Hypothesis testing: check a theory
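For the coin example marked (*), the estimate and its statistical error follow from the binomial variance Np(1–p) introduced later in this lecture. A minimal sketch; treating the error formula sqrt(p(1–p)/N) as the statistical uncertainty is the standard approximation, used here for illustration.

```python
# Parameter determination for the coin example: 27 heads in 50 tosses.
# Estimate p = P(heads) and its statistical uncertainty from the
# binomial variance N p (1 - p).
import math

heads, tosses = 27, 50
p_hat = heads / tosses
sigma_p = math.sqrt(p_hat * (1 - p_hat) / tosses)

print(f"P(heads) = {p_hat:.2f} +/- {sigma_p:.2f}")   # 0.54 +/- 0.07
```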
Distributions
In general, repeating the same measurement many times does not lead to the same result each time.
Experiment: measure the length of one side of your table 10 times and display the results in a histogram. What happens if you repeat the measurement 50 times?
Measuring the table...
- Distribution n(x): describes how often a value of the variable x occurs in a sample
- Sample: the total number of measurements of the side of the table, N
- x: a discrete or continuous variable distributed over a finite or infinite range
[Figure: histogram of n versus x with a given bin size]
Mean and variance: center and width of a distribution
Assume a set of N separate measurements x_i of the same physical quantity. Then
mean: x̄ = (1/N) Σ_i x_i
variance: σ² = (1/N) Σ_i (x_i – x̄)²
Convention: μ and σ² denote the true values of the mean and the variance, respectively.
Then σ² will not change by increasing N, but the variance of the mean, σ²/N, decreases with N.
- The mean x̄ is known to an accuracy of σ/√N (more about 1/√N later)
- σ² is a measure of how the distribution spreads out, not a measure of the error on x̄!
- x̄ is better determined as N increases; σ² is a property of the distribution
[Figure: histogram of the measurements, showing the entries in each x bin (66 in total)]
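A minimal sketch of this point: the spread σ of simulated table measurements stays roughly constant as N grows, while the error on the mean σ/√N shrinks. The true length (1.000 m) and resolution (2 mm) are invented numbers for illustration.

```python
# Sketch: sigma is a property of the distribution, while the error on the
# mean, sigma/sqrt(N), decreases as the sample size N grows.
import math
import random

random.seed(2)

def measure_table(n, true_length=1.000, resolution=0.002):
    """Simulate n measurements of the table side (invented numbers)."""
    return [random.gauss(true_length, resolution) for _ in range(n)]

for n in (10, 50, 1000):
    xs = measure_table(n)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    sigma = math.sqrt(var)
    print(f"N={n:5d}  mean={mean:.4f} m  sigma={sigma:.4f} m  "
          f"sigma/sqrt(N)={sigma / math.sqrt(n):.5f} m")
```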
Continuous distributions
For N large, the number of events as a function of x approaches a continuous distribution f(x), with
mean: μ = ∫ x f(x) dx
variance: σ² = ∫ (x – μ)² f(x) dx
Special distributions
Binomial distribution: N independent trials, each of which has only two possible outcomes, success (probability p) or failure (probability 1–p). The probability of obtaining exactly r successes in N trials is
P(r) = [N! / (r!(N–r)!)] p^r (1–p)^(N–r)
where p^r is the probability of success in the r successful attempts, (1–p)^(N–r) the probability of failure in the remaining N–r trials, and the binomial coefficient accounts for the possible orderings of the successes and failures.
Mean: Np.  Variance: Np(1–p).
r is a non-negative integer (0 ≤ r ≤ N); p is real, with 0 < p < 1. The distribution is symmetric if p = 1–p, i.e. p = 1/2.
Where does the binomial distribution apply?
- Throw a die N times. What is the probability of obtaining a 6 on exactly r occasions? (See the sketch below.)
- The angles that the decay products from a given source make with some fixed axis are measured. If the expected angular distribution is known, what is the probability of observing r decays in the forward hemisphere (θ < π/2) out of a total sample of N decays?
Properties and limiting cases: the Poisson and Gaussian distributions (next slides).
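The first question answered directly from the binomial formula. A minimal sketch; N = 10 throws is an arbitrary illustrative choice.

```python
# Binomial distribution: probability of obtaining a 6 exactly r times in
# N throws of a die, P(r) = C(N, r) p^r (1-p)^(N-r), with p = 1/6.
from math import comb

def binomial_pmf(r, N, p):
    return comb(N, r) * p**r * (1 - p)**(N - r)

N, p = 10, 1/6
for r in range(N + 1):
    print(f"P({r} sixes in {N} throws) = {binomial_pmf(r, N, p):.4f}")
print("mean =", N * p, " variance =", N * p * (1 - p))
```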
Poisson distribution
Probability of observing r independent events in a time interval t, when the counting rate is μ and the expected number of events in the interval is λ = μt:
P(r) = e^(–λ) λ^r / r!
It is the limiting case of the binomial distribution when N → ∞ and p → 0 with Np = λ fixed. Here r is a non-negative integer and λ a positive real number.
Mean = Variance = λ
Where does the Poisson distribution apply?
Particle emission from a radioactive source: if particles are emitted from the source at an average rate λ (number of particles emitted per unit time), the number of particles r emitted in a time interval Δt follows a Poisson distribution with mean λΔt.
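A minimal sketch of this radioactive-source example; the rate (2.5 decays/s) and time interval (4 s) are invented values.

```python
# Poisson distribution: probability of observing r decays in a time
# interval dt when the source emits at an average rate "rate".
# P(r) = exp(-lam) lam^r / r!, with lam = rate * dt.
from math import exp, factorial

def poisson_pmf(r, lam):
    return exp(-lam) * lam**r / factorial(r)

rate, dt = 2.5, 4.0        # invented illustrative numbers
lam = rate * dt            # expected number of decays in the interval
for r in (0, 5, 10, 15):
    print(f"P({r} decays) = {poisson_pmf(r, lam):.4f}")
print("mean = variance =", lam)
```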
Additive property of independent Poisson variables
Assume that you have a radioactive emitting source in a medium where there are also background emissions, at a rate μ_b. The radioactive source itself emits at a rate μ_x. What is the probability distribution for the emission of source + background?
Let r be the total number of particles emitted by source + background in the time interval, and let λ_x and λ_b be the corresponding expected numbers of source and background events. Since the two contributions are independent Poisson variables,
P(r) = Σ_(k=0..r) P_λx(k) P_λb(r–k) = e^(–(λx+λb)) (λx+λb)^r / r!
where the sum is evaluated using the binomial formula for the expansion of (λx+λb)^r. Hence source + background is again Poisson distributed, with mean λx + λb.
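The additive property can also be verified numerically: convolving the source and background Poisson distributions reproduces a single Poisson with mean λx + λb. A minimal sketch; the two means are invented values.

```python
# Numerical check of the additive property: the distribution of
# r = (source counts) + (background counts) equals Poisson(lam_x + lam_b).
from math import exp, factorial

def poisson_pmf(r, lam):
    return exp(-lam) * lam**r / factorial(r)

lam_x, lam_b = 3.0, 1.5    # invented illustrative means
for r in range(6):
    convolution = sum(poisson_pmf(k, lam_x) * poisson_pmf(r - k, lam_b)
                      for k in range(r + 1))
    direct = poisson_pmf(r, lam_x + lam_b)
    print(f"r={r}:  convolution={convolution:.6f}  "
          f"Poisson(lam_x+lam_b)={direct:.6f}")
```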
Gaussian distribution
General form, for a continuous variable x:
f(x) = (1 / (σ√(2π))) exp(–(x – μ)² / (2σ²))
where μ is the mean, σ² the variance, and the factor 1/(σ√(2π)) ensures the normalization ∫ f(x) dx = 1.
Standard form (μ = 0, σ = 1):
f(z) = (1/√(2π)) exp(–z²/2),  with z = (x – μ)/σ
Properties
- Symmetric with respect to μ
- I[μ–σ, μ+σ] ≈ 0.68 I[–∞, +∞], i.e. about 68% of the area lies within one σ of the mean
- If the x_i are Gaussian with mean μ and variance σ², then their average x̄ is Gaussian with mean μ and variance σ²/n
Where does the Gaussian distribution apply?
The result of repeating an experiment many times produces a spread of answers whose distribution is approximately Gaussian. If the individual errors contributing to the final answer are small, the approximation to a Gaussian is especially good.
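The ~68% figure quoted in the properties above follows from integrating the standard Gaussian between –1 and +1. A minimal sketch using the error function from the standard library.

```python
# Fraction of a Gaussian within n standard deviations of the mean:
# P(|x - mu| < n*sigma) = erf(n / sqrt(2)).
from math import erf, sqrt

for n in (1, 2, 3):
    frac = erf(n / sqrt(2))
    print(f"within {n} sigma: {frac:.4f}")   # 0.6827, 0.9545, 0.9973
```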
Limiting cases:
- Binomial → Poisson when N → ∞ and p → 0 with Np = μ fixed
- Poisson → Gaussian when μ → ∞
- Binomial → Gaussian when N → ∞
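These limits can be seen numerically: for large N with Np fixed the binomial probabilities approach the Poisson ones, and for large mean the Poisson approaches a Gaussian of equal mean and variance. A minimal sketch; the chosen values of N, p and μ are arbitrary.

```python
# Numerical illustration of the limiting cases.
# (1) Binomial(N, p) -> Poisson(mu) for N large, p small, Np = mu fixed.
# (2) Poisson(mu) -> Gaussian(mu, sqrt(mu)) for mu large.
from math import comb, exp, factorial, pi, sqrt

def binomial_pmf(r, N, p):
    return comb(N, r) * p**r * (1 - p)**(N - r)

def poisson_pmf(r, lam):
    return exp(-lam) * lam**r / factorial(r)

def gaussian_pdf(x, mu, sigma):
    return exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

N, p = 1000, 0.005          # arbitrary illustrative choices
mu = N * p
print(f"Binomial({N},{p}) vs Poisson({mu}) at r=5: "
      f"{binomial_pmf(5, N, p):.5f} vs {poisson_pmf(5, mu):.5f}")

mu = 100                    # arbitrary illustrative choice
print(f"Poisson({mu}) vs Gaussian({mu}, sqrt({mu})) at r=100: "
      f"{poisson_pmf(100, mu):.5f} vs {gaussian_pdf(100, mu, sqrt(mu)):.5f}")
```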