
Selected Topics in Particle Physics, Abner Soffer, Spring 2007, Lecture 6


1 Selected Topics in Particle Physics, Abner Soffer, Spring 2007, Lecture 6

2 Administrative stuff
Projects status
Other homework problems:
– Open questions in HW #1 (questions about the Quantum Universe) and HW #3 (difference between D mixing and Bs mixing analyses): we will go over them when we return from break
The plan for the next few weeks:
– Statistics (with as many real examples as possible)
– Root and RooFit
– Practicing statistics and analysis techniques
Lecture on Tuesday, April 10 (Mimouna) instead of Monday (Passover break)?

3 Why do we use statistics in EPP?
Scientific claims need to be based on solid mathematics:
– How confident are we of the result? What is the probability that we are wrong?
– Especially important when working at the frontier of knowledge: extraordinary claims require extraordinary proof
Proving something with high certainty is usually expensive:
– Many first measurements are made with marginal certainty
Statistical standards:
– “Evidence” (conventionally a 3σ effect)
– “Observation” (conventionally 5σ)

4 Probability
Set S (sample space), subset A ⊆ S.
The probability P(A) is a real number that satisfies the axioms:
1. P(A) ≥ 0
2. If A and B are disjoint subsets, i.e., A ∩ B = ∅, then P(A ∪ B) = P(A) + P(B)
3. P(S) = 1

5 Derived properties
P(!A) = 1 − P(A), where !A = S − A
P(A ∪ !A) = 1
0 ≤ P(A) ≤ 1
P(∅) = 0
If A ⊆ B, then P(B) ≥ P(A)
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

6 More definitions
Subsets A and B are independent if P(A ∩ B) = P(A) P(B)
A random variable x is a variable that has a specific value for each element of the set
An element may have more than one random variable: x = {x_1, …, x_n}

7 Interpretation of probability in data analysis
Limiting relative frequency:
– Elements of the sample space S = possible outcomes of a repeatable measurement
– The probability of a particular outcome e (= element of S) is P(e) = lim_{n→∞} (number of occurrences of e in n measurements) / n (note that the single element e belongs to a subset with one element = an elementary subset)
– A non-elementary subset A corresponds to an occurrence of any of the outcomes in the subset, with probability P(A) = Σ_{e ∈ A} P(e)

8 Example 1
Element e = D mixing parameter y′ measured to be 0.01
Subset A = y′ measured to be in range [0.005, 0.015]
P(A) = fraction of experiments in which y′ is measured in [0.005, 0.015], given that its true value is 0.002

9 Example 2
e = (x′², y′) measured to be (−0.0002, 0.01)
A = (x′², y′) measured to be anywhere outside the brown (“4σ”) contour
P(A) = fraction of experiments in which (x′², y′) are measured outside the contour, given that their true values are the measured ones

10 Example 3
e = error on CP-violating parameter γ measured to be 42°
A = γ error measured to be 42° or greater
P(A) = fraction of experiments in which the γ error is measured to be 42° or greater

11 About the relative frequency interpretation
Straightforward when measurements are repeatable:
– Particle collisions in an experiment
– Radioactive decays of identical nuclei
Also works when measurements are repeatable only in principle:
– Measurement of the D mixing parameters using all the data we will ever have
– Measurement of the average height of all humans
(Physical laws don’t change)

12 Probability density functions
Outcome of an experiment is a continuous random variable x:
– Applies to most measurements in particle physics
Define the probability density function (PDF) f(x), such that
f(x) dx = probability to observe x in [x, x+dx] = fraction of experiments in which x will be measured in [x, x+dx]
To satisfy axiom 3, P(S) = 1, normalize the PDF: ∫_S f(x) dx = 1
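A minimal numerical sketch of these two statements (not from the lecture; assuming Python with numpy, and an exponential PDF chosen purely for illustration):

```python
import numpy as np

# Illustrative PDF: f(x) = exp(-x) for x >= 0 (already normalized).
x = np.linspace(0.0, 50.0, 100_001)
f = np.exp(-x)
dx = x[1] - x[0]

print(np.sum(f) * dx)               # P(S): ~1.0 (axiom 3)
in_range = (x >= 1.0) & (x <= 2.0)
print(np.sum(f[in_range]) * dx)     # P(1 <= x <= 2) ~ e^-1 - e^-2 ~ 0.23
```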

13 The PDF and a finite number of observations
A set of n_meas measurements x_m (m = 1…n_meas) can be presented as a histogram:
n_b (b = 1…n_bins) = number of measurements for which x falls in bin b
n_b / n_meas = probability for a measurement to be in bin b, with Σ_b n_b / n_meas = 1
n_b / (n_meas Δx_b) = (discrete) probability density function, where Δx_b is the width of bin b
Continuum limit (infinite number of observations, infinitely fine binning): f(x) = lim n_b / (n_meas Δx_b)
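A sketch of the same bookkeeping in code (assumption: Python with numpy; the Gaussian sample and binning are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
x_meas = rng.normal(size=100_000)      # n_meas measurements

n_b, edges = np.histogram(x_meas, bins=50, range=(-5.0, 5.0))
n_meas = x_meas.size
dx_b = np.diff(edges)                  # bin widths Δx_b

p_b = n_b / n_meas                     # probability per bin
f_b = n_b / (n_meas * dx_b)            # discrete PDF estimate
print(p_b.sum())                       # ~1.0
print((f_b * dx_b).sum())              # ~1.0 (PDF normalization)
```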

14 Cumulative distribution
The cumulative distribution of f(x) is F(x) = ∫_{−∞}^{x} f(x′) dx′
Alternatively: F(x) = probability to obtain a measurement whose value is < x
f(x) = dF(x)/dx (for differentiable F(x))
The α-point x_α is the value of x such that F(x_α) = α, where 0 ≤ α ≤ 1. Or: x_α = F⁻¹(α)
Median = x_½ = value of x such that F(x_½) = ½
Mode = x_mode such that f(x_mode) > f(all other values of x)
– May not be useful or unique if f(x) has multiple local maxima
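An illustrative way to get α-points from a sample via the empirical CDF (assumption: Python with numpy; the exponential sample and the helper alpha_point are mine, not the lecture's):

```python
import numpy as np

rng = np.random.default_rng(2)
sample = np.sort(rng.exponential(scale=1.0, size=100_000))

# Empirical CDF: F(sample[i]) = fraction of measurements <= sample[i]
F = np.arange(1, sample.size + 1) / sample.size

def alpha_point(alpha):
    """x_alpha = F^-1(alpha), by inverse interpolation of the CDF."""
    return np.interp(alpha, F, sample)

print(alpha_point(0.5))   # median: ~ln 2 ~ 0.693 for this PDF
print(alpha_point(0.9))   # 90% point: ~ln 10 ~ 2.303
```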

15 Extension to multi-variable PDFs
For f(x), x = {x_1, …, x_n}, the α-point turns into an α-contour of dimension n−1
Marginal PDFs:
– f_x(x) = ∫ f(x,y) dy
– f_y(y) = ∫ f(x,y) dx
x and y are independent variables if f(x,y) = f_x(x) f_y(y)
– Independent variables are also uncorrelated (the converse does not always hold; see below)
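A gridded sketch of marginalization and the factorization test (assumption: Python with numpy; the 2D Gaussian is deliberately built from independent factors so the check succeeds):

```python
import numpy as np

x = np.linspace(-5, 5, 201)
y = np.linspace(-5, 5, 201)
dx, dy = x[1] - x[0], y[1] - y[0]
X, Y = np.meshgrid(x, y, indexing="ij")
f = np.exp(-0.5 * (X**2 + Y**2)) / (2 * np.pi)   # f(x, y)

f_x = f.sum(axis=1) * dy     # marginal: integrate over y
f_y = f.sum(axis=0) * dx     # marginal: integrate over x

# Independence: f(x,y) ~ f_x(x) f_y(y) everywhere on the grid
print(np.max(np.abs(f - np.outer(f_x, f_y))))    # ~0
```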

16 Functions of random variables
a(x) is a continuous function of a random variable x, which has PDF f(x)
– E.g., a = x², a = log(x), etc.
What is the PDF g(a)?
Require equal probabilities in corresponding infinitesimal regions: g(a) da = f(x) dx, so g(a) = f(x(a)) |dx/da|
– Assumes a(x) can be inverted
– The absolute value keeps the PDF positive
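A Monte Carlo check of the Jacobian rule (assumption: Python with numpy; the choice a = −ln x for uniform x is illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

# x uniform on (0,1), so f(x) = 1.  Take a(x) = -ln(x): the inverse is
# x(a) = exp(-a) with |dx/da| = exp(-a), so the rule predicts g(a) = exp(-a).
x = rng.uniform(size=200_000)
a = -np.log(x)

n, edges = np.histogram(a, bins=40, range=(0.0, 5.0))
centers = 0.5 * (edges[:-1] + edges[1:])
g_mc = n / (x.size * np.diff(edges))      # Monte Carlo estimate of g(a)
g_pred = np.exp(-centers)                 # prediction from g = f(x(a))|dx/da|
print(np.max(np.abs(g_mc - g_pred)))      # small, up to statistical noise
```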

17 Example
The CP-violation phases are not measured directly: we measure their trigonometric functions (the cosine or sine of a phase, or of twice a phase), then transform to the phases.

18 Multiple-valued x(a)
If a(x) is not uniquely invertible, need to add up the different contributions: g(a) = Σ_k f(x_k(a)) |dx_k/da|, summed over the branches x_k(a) (e.g., the two regions dx_1 and dx_2 that map into the same da).
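A sketch of the two-branch case (assumption: Python with numpy; a = x² for standard normal x is the textbook example, with branches x = ±√a and |dx/da| = 1/(2√a)):

```python
import numpy as np

rng = np.random.default_rng(4)

x = rng.normal(size=500_000)
a = x**2                                  # two x-branches per value of a

n, edges = np.histogram(a, bins=40, range=(0.2, 4.0))
centers = 0.5 * (edges[:-1] + edges[1:])
g_mc = n / (x.size * np.diff(edges))

f = lambda t: np.exp(-0.5 * t**2) / np.sqrt(2 * np.pi)   # PDF of x
g_pred = (f(np.sqrt(centers)) + f(-np.sqrt(centers))) / (2 * np.sqrt(centers))
print(np.max(np.abs(g_mc - g_pred)))      # small: both branches included
```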

19 Functions of multiple random variables
What is g(a) for a(x), x = {x_1, …, x_n}? In general g(a) da = ∫_dS f(x) dⁿx, where dS is the hypersurface in x-space that encloses [a, a+da]
Example: z = xy. What is f(z) given g(x) and h(y)?
f(z) = ∫ g(x) h(z/x) dx/|x|: f(z) is the Mellin convolution of g(x) and h(y)
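A Monte Carlo check of the product case (assumption: Python with numpy; for two uniforms on (0,1) the Mellin convolution evaluates to f(z) = −ln z, a fact used only to validate the sketch):

```python
import numpy as np

rng = np.random.default_rng(5)

x = rng.uniform(size=500_000)
y = rng.uniform(size=500_000)
z = x * y                                  # product of two uniform variables

n, edges = np.histogram(z, bins=40, range=(0.02, 1.0))
centers = 0.5 * (edges[:-1] + edges[1:])
f_mc = n / (z.size * np.diff(edges))
f_pred = -np.log(centers)                  # Mellin convolution result
print(np.max(np.abs(f_mc - f_pred)))       # small (binning + statistics)
```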

20 Another example: z = x + y
f(z) = ∫ g(x) h(z − x) dx: the familiar Fourier convolution of g(x) and h(y)
Recall from the D mixing analysis: the measured decay time t is the true decay time t′ (distribution P(t′)) plus a random detector error δt (distribution R(δt)):
P_meas(t) = ∫ P(t′) R(t − t′) dt′
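A quick numerical illustration of the smearing (assumption: Python with numpy; Gaussian stand-ins for both the true distribution and the resolution, so the convolution's variance is just the sum of the two):

```python
import numpy as np

rng = np.random.default_rng(6)

t_true = rng.normal(0.0, 1.0, size=500_000)   # stand-in for P(t')
dt = rng.normal(0.0, 1.0, size=500_000)       # stand-in for R(δt)
t_meas = t_true + dt                          # measured time = convolution

print(t_meas.std())                           # ~sqrt(1 + 1) ~ 1.414
```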

21 Multiple functions of multiple random variables
g(a_1, …, a_n) = f(x_1, …, x_n) |J|, where the Jacobian is J = det(∂x_i/∂a_j)
To determine the marginal distribution g_i(a_i), need to integrate g(a_1, …, a_n) over the a_{j≠i} variables
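A sketch for a 2D change of variables (assumption: Python with numpy; Cartesian-to-polar for two standard Gaussians, where |J| = r and the marginal of r is the Rayleigh density r·exp(−r²/2)):

```python
import numpy as np

rng = np.random.default_rng(7)

x = rng.normal(size=500_000)
y = rng.normal(size=500_000)
r = np.hypot(x, y)                         # radial coordinate

n, edges = np.histogram(r, bins=40, range=(0.0, 4.0))
centers = 0.5 * (edges[:-1] + edges[1:])
g_mc = n / (r.size * np.diff(edges))
g_pred = centers * np.exp(-0.5 * centers**2)   # f(x,y)|J|, θ integrated out
print(np.max(np.abs(g_mc - g_pred)))       # small
```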

22 Expectation values
The expectation value of a random variable x distributed according to the PDF f(x): E[x] = ∫ x f(x) dx
Also called the population mean
E[x] is the most commonly used location parameter (others are the α-point x_α and the mode)
The expectation value of a function a(x) is E[a] = ∫ a(x) f(x) dx

23 Moments
The nth algebraic moment of f(x): μ′_n = E[x^n]
– Note that the population mean μ is the special case μ′_1
The nth central moment: μ_n = E[(x − μ)^n]
In particular, μ_2 = E[(x − μ)²] ≡ σ² is the population variance of f(x)
The standard deviation σ gives an idea of the spread of f(x)
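Sample estimates of the first few moments (assumption: Python with numpy; the exponential with scale 2 is illustrative, chosen because its moments are known: mean 2, variance 4, third central moment 16):

```python
import numpy as np

rng = np.random.default_rng(8)
x = rng.exponential(scale=2.0, size=1_000_000)

mu = x.mean()                      # population mean estimate (~2)
mu2 = ((x - mu) ** 2).mean()       # 2nd central moment = variance (~4)
sigma = np.sqrt(mu2)               # standard deviation (~2)
mu3 = ((x - mu) ** 3).mean()       # 3rd central moment (~16 for this PDF)
print(mu, mu2, sigma, mu3)
```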

24 Mean and variance of functions
Take a function of many random variables, a(x). Then
E[a] = ∫ a(x) f(x) dⁿx
V[a] = E[a²] − (E[a])² ≡ σ_a²

25 Covariance
For 2 random variables x, y, the covariance cov[x,y] or V_xy is V_xy = E[(x − μ_x)(y − μ_y)] = E[xy] − μ_x μ_y
For 2 functions a(x), b(x), the covariance is V_ab = E[ab] − μ_a μ_b
Note that V_ab = V_ba and V_aa = σ_a²
The dimensionless correlation coefficient is ρ_ab = V_ab / (σ_a σ_b)
– Note that −1 ≤ ρ_ab ≤ 1
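A sketch of estimating V_xy and ρ from a sample (assumption: Python with numpy; y is built to share part of x so the true correlation is 0.6):

```python
import numpy as np

rng = np.random.default_rng(9)

x = rng.normal(size=200_000)
y = 0.6 * x + 0.8 * rng.normal(size=200_000)   # var(y) = 0.36 + 0.64 = 1

V = np.cov(x, y)                   # 2x2 matrix [[Vxx, Vxy], [Vxy, Vyy]]
rho = V[0, 1] / np.sqrt(V[0, 0] * V[1, 1])     # dimensionless correlation
print(V[0, 1], rho)                # Vxy ~ 0.6, rho ~ 0.6
```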

26 Understanding covariance and correlation
V_xy = E[(x − μ_x)(y − μ_y)] is the expectation value of the product of the deviations from the means
If having x > μ_x increases the probability of having y > μ_y, then V_xy > 0: x and y are positively correlated
If having x > μ_x increases the probability of having y < μ_y, then V_xy < 0: x and y are negatively correlated, or anti-correlated
For independent variables (defined as f(x,y) = f_x(x) f_y(y)), we find E[xy] = E[x] E[y] = μ_x μ_y, so V_xy = 0
Does V_xy = 0 necessarily mean that the variables are independent?...

27 Covariance and correlation
…No. E.g., take x distributed symmetrically about 0 and y = x²: then E[xy] = E[x³] = 0 = E[x] E[y], so V_xy = 0, even though y is completely determined by x.
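The counterexample in two lines of code (assumption: Python with numpy; a standard normal x plays the symmetric variable):

```python
import numpy as np

rng = np.random.default_rng(10)

x = rng.normal(size=1_000_000)   # symmetric about 0
y = x**2                         # fully determined by x
print(np.cov(x, y)[0, 1])        # ~0: uncorrelated, yet not independent
```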

28 Propagation of errors
Take n random variables x with unknown PDF f(x), but with E[x] = μ and V_ij known (or estimated)
Take the function y(x). What are E[y] and V[y]?
– Remember: we don’t know f(x)
Expand y to first order: y(x) ≈ y(μ) + Σ_i [∂y/∂x_i]_{x=μ} (x_i − μ_i)
Then E[y] ≈ y(μ) and σ_y² ≈ Σ_{i,j} [∂y/∂x_i ∂y/∂x_j]_{x=μ} V_ij
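A sketch comparing linearized propagation to a Monte Carlo answer (assumption: Python with numpy; the function y = x_1·x_2, the means, and the covariance matrix are all made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(11)

mu = np.array([3.0, 4.0])                 # means of (x1, x2)
V = np.array([[0.04, 0.01],
              [0.01, 0.09]])              # covariance matrix V_ij

# Linearized propagation for y = x1*x2: sigma_y^2 = grad^T V grad at x = mu.
grad = np.array([mu[1], mu[0]])           # dy/dx1 = x2, dy/dx2 = x1
var_lin = grad @ V @ grad                 # = 1.69 here

# Monte Carlo cross-check of the same variance.
x = rng.multivariate_normal(mu, V, size=1_000_000)
var_mc = (x[:, 0] * x[:, 1]).var()
print(var_lin, var_mc)   # close, since y is nearly linear within ~sigma_i
```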

29 Why is this “error propagation”? Because we often estimate errors from covariances

30 Special cases
y = x_1 + x_2: σ_y² = σ_1² + σ_2² + 2V_12
y = x_1 x_2: (σ_y/y)² = (σ_1/x_1)² + (σ_2/x_2)² + 2V_12/(x_1 x_2)
Note: these formulae don’t work if y is significantly non-linear within a distance σ_i around the mean μ

31 Orthogonal transformation of variables
It is often useful to work in variables in which the covariance matrix is diagonal: cov[y_i, y_j] = σ_i² δ_ij
This can always be achieved with a linear transformation y_i = Σ_j A_ij x_j, where the rows of the transformation matrix A_ij are the eigenvectors of cov[x_i, x_j]
Then σ_i² are the eigenvalues of cov[x_i, x_j]
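A sketch of the diagonalization (assumption: Python with numpy; the 2x2 covariance matrix is illustrative):

```python
import numpy as np

# Covariance matrix of correlated variables x1, x2.
V = np.array([[1.0, 0.8],
              [0.8, 1.0]])

# Eigen-decomposition of the symmetric covariance matrix.
eigvals, eigvecs = np.linalg.eigh(V)

# Rows of A = eigenvectors; y = A x then has diagonal covariance A V A^T.
A = eigvecs.T
print(np.round(A @ V @ A.T, 12))   # diag of eigenvalues: [[0.2, 0], [0, 1.8]]
print(eigvals)                     # sigma_i^2 of the uncorrelated y_i
```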

32 Visualize for 2 dimensions
Recall the definition of the correlation coefficient ρ_ab = V_ab / (σ_a σ_b). So we can write
V = ( σ_x²       ρ σ_x σ_y )
    ( ρ σ_x σ_y  σ_y²      )
The two eigenvectors of V are the principal axes of the covariance ellipse (eigenvector 1 and eigenvector 2 in the figure).

33 More on linear variable transformations
The uncorrelated variables y_i have a simpler covariance matrix, but may not always correspond to physically interesting quantities
E.g., in D mixing, x′² and y′ have a very high correlation coefficient of ρ = −0.94
But they are the physically interesting variables…

