Probability and Statistics for Particle Physics

Probability and Statistics for Particle Physics
Javier Magnin, CBPF – Brazilian Center for Research in Physics, Rio de Janeiro, Brazil

Outline
Course: three one-hour lectures
1st lecture: General ideas / preliminary concepts; probability and statistics; distributions
2nd lecture: Error matrix; combining errors / results; parameter fitting and hypothesis testing
3rd lecture: Parameter fitting and hypothesis testing (cont.); examples of fitting procedures

1st lecture

Preliminary concepts
Two types of experimental results:
Determination of the numerical value of some physical quantity (parameter determination)
Testing whether a particular theory is consistent with data (hypothesis testing)
In real life there is a degree of overlap between the two types
We will go through both types of results in these lectures

Why estimate errors?
Example: consider the accepted value of the speed of light, c = 2.998 × 10^8 m/s
Assume that a new measurement gives c′ = (3.09 ± x) × 10^8 m/s
Question: are these two numbers consistent?

Why estimate errors?
Example: consider the accepted value of the speed of light, c = 2.998 × 10^8 m/s
Assume that a new measurement gives c′ = (3.09 ± x) × 10^8 m/s
If x = 0.15, the new determination is consistent with the accepted value.
If x = 0.01, the two values are inconsistent: there is evidence for a change in the speed of light!
If x = 2, the two values are consistent, but the accuracy is so low that it is impossible to detect a change in c!

Random and systematic errors
Consider the experiment of determining the decay constant λ of a radioactive source:
Count how many decays are observed in a time interval Δt
Determine the decay rate from the number of decaying nuclei

Random errors: inherent statistical error in counting events; uncertainty in the mass of the sample; timing of the period for which the decays are observed
Systematic errors: efficiency of the counter used to detect the decays; background (i.e. particles coming from other sources); purity of the radioactive sample; calibration errors

Probability
Suppose you repeat an experiment several times. Even if you are able to keep the essential conditions constant, the repetitions will produce different results
The result of an individual measurement is unpredictable, even when the possible results of a series of measurements have a well-defined distribution

Definition: The probability p of obtaining a specific result when performing one measurement or trial is defined as

p = (number of times that result occurs) / (total number of measurements or trials)

Rules of probability
1. If P(A) is the probability of a given event A, then 0 ≤ P(A) ≤ 1.
2. The probability P(A+B) that at least one of A or B occurs satisfies P(A+B) ≤ P(A) + P(B). The equality holds only if A and B are exclusive events.
3. The probability P(AB) of obtaining both A and B is P(AB) = P(A|B)P(B) = P(B|A)P(A), where P(A|B) is the probability of obtaining A given that B has occurred (P(A|B) is known as the conditional probability of A given B).
4. Rule 3 defines the conditional probability as P(A|B) = P(AB)/P(B).

Comments (about the rules!)
P(A+B) = P(A) + P(B) – P(AB), to avoid double counting!
P(A|B) = P(AB)/P(B) = (N_C/N)/(N_B/N) = N_C/N_B, where N_C counts the occurrences of both A and B in N trials, and N_B the occurrences of B
In general P(A|B) ≠ P(B|A)
If P(A|B) = P(A), then A and B are independent, which is equivalent to saying that P(AB) = P(A)P(B)
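
The conditional-probability rule can be sanity-checked with a quick Monte Carlo sketch. The die events used here (A = "roll is even", B = "roll is greater than 3") are my own illustrative choice, not from the lecture; for them P(A|B) = P(AB)/P(B) = (1/3)/(1/2) = 2/3:

```python
import random

random.seed(42)

N = 100_000
n_B = 0    # rolls greater than 3            (event B)
n_AB = 0   # rolls greater than 3 AND even   (event AB)

for _ in range(N):
    roll = random.randint(1, 6)
    if roll > 3:
        n_B += 1
        if roll % 2 == 0:
            n_AB += 1

# Estimate P(A|B) via the rule P(A|B) = P(AB)/P(B)
p_B = n_B / N
p_AB = n_AB / N
print(p_AB / p_B)   # should be close to 2/3
```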

Example: use of conditional probability
Measurement of the mass difference Δm = m(K_L) – m(K_S) from the K⁰p cross sections:
K⁺ + p → K⁰ + p + π⁺ (production)
K⁰ + p → K⁰ + p (scattering)
K⁰ → π⁺ + π⁻ (decay)
The K⁰ are detected in the decay (event B)
We want to measure K⁰p → K⁰p (event A)
We are interested in P(AB) = P(B|A)P(A)

Bayes theorem
Let the sample space Ω be spanned by n mutually exclusive and exhaustive sets B_i, with Σ_i P(B_i) = 1. If A is also a set belonging to Ω, then

P(B_i|A) = P(A|B_i) P(B_i) / Σ_j P(A|B_j) P(B_j)

Example (of the Bayes theorem)
Consider three drawers B1, B2, B3, each one with two coins. B1 has two gold coins, B2 has one gold and one silver, and B3 has two silver coins. Now select a drawer at random and pick a coin from it. Supposing that the coin is gold, what is the probability of having a second gold coin in the same drawer (or, what is the probability of having selected drawer B1, given that a gold coin was picked)?

If A is the event of first picking a gold coin, then
P(A|B1) = 1; P(A|B2) = 1/2; P(A|B3) = 0
Since the drawer is selected at random, P(B1) = P(B2) = P(B3) = 1/3
Hence, from Bayes theorem it follows that

P(B1|A) = P(A|B1) P(B1) / Σ_j P(A|B_j) P(B_j) = (1 × 1/3) / (1 × 1/3 + 1/2 × 1/3 + 0 × 1/3) = 2/3

Thus, although the prior probability of selecting drawer B1 is 1/3, the observation that the first coin drawn was gold doubles the probability that drawer B1 had been selected
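
The 2/3 result can be reproduced by directly simulating the drawer experiment; a minimal sketch:

```python
import random

random.seed(1)

# Drawers: B1 = two gold, B2 = one gold + one silver, B3 = two silver
drawers = [["gold", "gold"], ["gold", "silver"], ["silver", "silver"]]

gold_first = 0     # trials where the first coin drawn is gold
gold_from_B1 = 0   # ... and the drawer was B1

for _ in range(100_000):
    i = random.randrange(3)            # pick a drawer at random
    coin = random.choice(drawers[i])   # pick a coin from it
    if coin == "gold":
        gold_first += 1
        if i == 0:
            gold_from_B1 += 1

# Fraction of gold-first trials that came from B1: estimates P(B1|A)
print(gold_from_B1 / gold_first)   # should be close to 2/3
```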

Statistics
Probability: we start with a well-defined problem and calculate from it the possible outcomes of a specific experiment
Statistics: we use data from an experiment to deduce the rules or laws relevant to the experiment

Statistics
Statistics → two different problems (with some overlap):
Hypothesis testing: use data to test the hypothesis of a model
Parameter determination: use data to measure the parameters of a given model or theory

Probability: from theory to data
Toss a coin. If P(heads) = 0.5, how many heads do you get in 50 tosses?
Given a sample of K* mesons of known polarization, what is the forward/backward asymmetry?
Statistics: from data to theory
If you observe 27 heads in 50 tosses, what is the value of P(heads)? (*)
Of 1000 K* mesons, 600 were observed to decay forward. What is the K* polarization? (**)
(*) Parameter determination: deduce a quantity and its error
(**) Hypothesis testing: check a theory

Distributions In general, the result of repeating the same measurement many times does not lead to the same result Experiment: measure the length of one side of your table 10 times and display the results in a histogram. What happens if you repeat the measurement 50 times ?

Measuring the table...
Distribution n(x): describes how often a value of the variable x occurs in a sample
Sample: the total number of measurements of the side of the table, N
x may be a discrete or continuous variable distributed in a finite or infinite range
(histogram: n vs. x, with a chosen bin size)

Mean and variance: center and width of a distribution
Assume a set of N separate measurements x_i of the same physical quantity
Mean: x̄ = (1/N) Σ_i x_i
Variance: s² = (1/N) Σ_i (x_i – x̄)²
Convention: μ and σ² denote the true values of the mean and the variance, respectively

s² is a property of the distribution: it will not change by increasing N, but the variance of the mean, s²/N, decreases with N
The mean x̄ is known to an accuracy of s/√N (more about 1/√N later)
s² is a measure of how the distribution spreads out, not a measure of the error in x̄!
x̄ is better determined as N increases
(figure: histogram of the entries in each x bin, 66 entries in total)
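
The distinction between the spread s² and the error on the mean can be checked numerically. A minimal sketch using only the Python standard library (the "table length" of 100.0 and spread of 0.5 are invented values for illustration):

```python
import math
import random

random.seed(7)

def mean_and_variance(xs):
    """Sample mean and variance (1/N convention, as in the slides)."""
    n = len(xs)
    m = sum(xs) / n
    s2 = sum((x - m) ** 2 for x in xs) / n
    return m, s2

# Simulated table-length measurements: true length 100.0, spread 0.5
for n in (10, 1000):
    xs = [random.gauss(100.0, 0.5) for _ in range(n)]
    m, s2 = mean_and_variance(xs)
    # s2 stays near 0.25 as N grows; the error on the mean, s/sqrt(N), shrinks
    print(n, round(m, 3), round(s2, 3), round(math.sqrt(s2 / n), 3))
```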

Continuous distributions
For N large, the histogram of the number of events approaches a continuous distribution f(x)
Mean: μ = ∫ x f(x) dx
Variance: σ² = ∫ (x – μ)² f(x) dx

Special distributions
Binomial distribution: N independent trials, each of which has only two possible outcomes, success (probability p) or failure (probability 1 – p)

P(r) = [N! / (r!(N – r)!)] p^r (1 – p)^(N – r)

p^r: probability of obtaining successes on r attempts
(1 – p)^(N – r): probability of failure in the remaining N – r trials
N!/(r!(N – r)!): the number of orderings of the successes and failures

Mean: Np
Variance: Np(1 – p)
r is a non-negative integer; 0 < p < 1, p real
Symmetric if p = 1 – p (i.e. p = 1/2)

Where does the binomial distribution apply?
Throw a die N times. What is the probability of obtaining a 6 on exactly r occasions?
The angles that the decay products from a given source make with some fixed axis are measured. If the expected distribution is known, what is the probability of observing r decays in the forward hemisphere (θ < π/2) from a total sample of N decays?
Properties and limiting cases: → Poisson, → Gaussian
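
The die example above can be sketched directly from the formula. This is a small illustrative script (10 throws of a fair die is an assumed setup, not from the lecture); it also checks the stated mean Np and variance Np(1 – p) against the distribution itself:

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(r successes in n independent trials with success probability p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

# Probability of exactly r sixes in 10 throws of a fair die (p = 1/6)
n, p = 10, 1/6
for r in range(4):
    print(r, round(binomial_pmf(r, n, p), 4))

# Check mean and variance against Np and Np(1-p)
mean = sum(r * binomial_pmf(r, n, p) for r in range(n + 1))
var = sum((r - mean) ** 2 * binomial_pmf(r, n, p) for r in range(n + 1))
print(round(mean, 6), round(var, 6))   # 1.666667 1.388889, i.e. Np and Np(1-p)
```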

Poisson distribution
Probability of observing r independent events in a time interval t, when the counting rate is μ and the expected number of events in the time interval is λ = μt:

P(r) = e^(–λ) λ^r / r!

Limiting case of the binomial distribution (N → ∞ with Np → λ)
r is a non-negative integer; λ is a positive real number
Mean = Variance = λ

Where does the Poisson distribution apply?
Particle emission from a radioactive source: if particles are emitted from the source at an average rate λ (= number of particles emitted per unit time), the number of particles r emitted in a time interval Δt follows a Poisson distribution.
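
A short numerical sketch of the Poisson formula, verifying the mean = variance = λ property (the rate of 2.5 events per interval is an invented example value):

```python
from math import exp, factorial

def poisson_pmf(r, lam):
    """P(r events) when the expected number of events is lam."""
    return exp(-lam) * lam**r / factorial(r)

# e.g. a source with an expected 2.5 events in the observation interval
lam = 2.5
probs = [poisson_pmf(r, lam) for r in range(50)]   # tail beyond 50 is negligible
mean = sum(r * p for r, p in enumerate(probs))
var = sum((r - mean) ** 2 * p for r, p in enumerate(probs))
print(round(mean, 6), round(var, 6))   # both ~2.5: mean = variance = lam
```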

Additive property of independent Poisson variables
Assume that you have a radioactive emitting source in a medium where there are background emissions at a rate μ_b. The radioactive source emits at a rate μ_x.
What is the probability distribution for the emission of source + background?

Let r = number of particles emitted by the source + background. Summing over the ways of splitting r between the two sources,

P(r) = Σ_{k=0}^{r} P_x(k) P_b(r – k) = e^(–(μ_x + μ_b)) (μ_x + μ_b)^r / r!

where the sum collapses via the binomial formula. The sum of two independent Poisson variables is therefore itself Poisson, with rate μ_x + μ_b.
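
The additive property can be verified numerically by convolving the two Poisson distributions term by term (the rates 1.2 and 0.7 are illustrative values):

```python
from math import exp, factorial

def poisson_pmf(r, lam):
    return exp(-lam) * lam**r / factorial(r)

mu_x, mu_b = 1.2, 0.7   # source and background rates (illustrative)

# Convolve source and background, and compare with a single Poisson
# of rate mu_x + mu_b: they agree term by term
for r in range(6):
    conv = sum(poisson_pmf(k, mu_x) * poisson_pmf(r - k, mu_b)
               for k in range(r + 1))
    direct = poisson_pmf(r, mu_x + mu_b)
    assert abs(conv - direct) < 1e-12
    print(r, round(conv, 6))
```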

Gaussian distribution
General form (continuous variable x):

f(x) = (1/(σ√(2π))) exp(–(x – μ)²/(2σ²))

Mean: μ; Variance: σ²; Normalization: ∫ f(x) dx = 1
Standard form (μ = 0, σ = 1): f(z) = (1/√(2π)) e^(–z²/2)

Properties
Symmetric w.r.t. μ
I[μ–σ, μ+σ] ≈ 0.68 I[–∞, +∞] (about 68% of the total area lies within one σ of the mean)
If the x_i are Gaussian with mean μ and variance σ², then the sample mean x̄ is Gaussian with mean μ and variance σ²/n

Where does the Gaussian distribution apply?
The result of repeating an experiment many times produces a spread of answers whose distribution is approximately Gaussian. If the individual errors contributing to the final answer are small, the approximation to a Gaussian is especially good.
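
This can be illustrated numerically: summing many small independent errors (uniform here) yields an approximately Gaussian spread, with about 68% of the results within one standard deviation of the mean. The error size and sample counts below are invented for the sketch:

```python
import random

random.seed(3)

# Each "measurement" is the sum of 100 small independent uniform errors;
# by the central limit theorem the results are approximately Gaussian
def measurement():
    return sum(random.uniform(-0.01, 0.01) for _ in range(100))

xs = [measurement() for _ in range(20_000)]
mean = sum(xs) / len(xs)
sigma = (sum((x - mean) ** 2 for x in xs) / len(xs)) ** 0.5

# Fraction of results within one sigma of the mean
within_1sigma = sum(abs(x - mean) < sigma for x in xs) / len(xs)
print(round(within_1sigma, 2))   # close to 0.68, as for a Gaussian
```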

Limiting cases
Binomial → Poisson: N → ∞ with Np = λ fixed
Poisson → Gaussian: λ → ∞
Binomial → Gaussian: N → ∞