
4th IMPRS Astronomy Summer School
Drawing Astrophysical Inferences from Data Sets
William H. Press, The University of Texas at Austin
Lecture 2
(The University of Texas at Austin, CS 395T, Spring 2008; IMPRS Summer School 2009)

Many, though not all, common distributions look sort of like this: [figure: a generic single-peaked density p(x)]. We already saw the beta distribution, with α, β > 0, as an example on the interval [0,1]; we'll see more examples soon. Suppose we want to summarize p(x) by a single number a, its "center". Let's find the value a that minimizes the mean-square distance from the "typical" value x.

Expectation notation: $\langle \text{anything} \rangle \equiv \int (\text{anything})\, p(x)\, dx$. Expectation is linear, etc.

Minimize:
$$\Delta^2 \equiv \langle (x-a)^2 \rangle = \langle x^2 - 2ax + a^2 \rangle = \left(\langle x^2 \rangle - \langle x \rangle^2\right) + \left(\langle x \rangle - a\right)^2$$

The first term is the variance Var(x), but all we care about here is that it doesn't depend on a. The minimum is obviously $a = \langle x \rangle$. (Take the derivative with respect to a and set it to zero if you like mechanical calculations.) In physics this decomposition is called the "parallel axis theorem".
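The decomposition above is easy to check numerically. The sketch below is my own addition, not part of the lecture: it draws a sample and verifies the parallel-axis identity ⟨(x − a)²⟩ = Var(x) + (⟨x⟩ − a)² for several trial centers a, so the minimum sits visibly at a = ⟨x⟩. The function names are illustrative.

```python
import random

# Draw a sample from an arbitrary single-peaked distribution.
rng = random.Random(0)
xs = [rng.gauss(3.0, 1.5) for _ in range(100000)]

mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / len(xs)

def mean_square_dev(a):
    """Mean-square distance of the sample from a trial center a."""
    return sum((x - a) ** 2 for x in xs) / len(xs)

# Parallel-axis identity: <(x - a)^2> = Var(x) + (<x> - a)^2,
# so any a other than the mean only adds a nonnegative penalty term.
for a in (0.0, mean, 5.0):
    assert abs(mean_square_dev(a) - (var + (mean - a) ** 2)) < 1e-8
```

Because the penalty term (⟨x⟩ − a)² vanishes only at a = ⟨x⟩, the mean is the minimizer, and the minimum value is Var(x).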

Higher moments and centered moments are defined by
$$\mu_i \equiv \langle x^i \rangle = \int x^i\, p(x)\, dx, \qquad M_i \equiv \langle (x - \langle x \rangle)^i \rangle = \int (x - \langle x \rangle)^i\, p(x)\, dx$$

The centered second moment $M_2$, the variance, is by far the most useful:
$$M_2 \equiv \mathrm{Var}(x) \equiv \langle (x - \langle x \rangle)^2 \rangle = \langle x^2 \rangle - \langle x \rangle^2, \qquad \sigma(x) \equiv \sqrt{\mathrm{Var}(x)}$$
The "standard deviation" σ(x) summarizes a distribution's half-width (r.m.s. deviation from the mean).

The third and fourth moments also have "names". But it is generally wise to be cautious about using high moments: otherwise perfectly good distributions don't have them at all (the integrals diverge), and (relatedly) it can take a lot of data to measure them accurately.

Mean and variance are additive over independent random variables: with $S = X + Y$,
$$\bar{S} = \bar{X} + \bar{Y}, \qquad \mathrm{Var}(S) = \mathrm{Var}(X) + \mathrm{Var}(Y)$$
(note the "bar" notation, equivalent to $\bar{x} \equiv \langle x \rangle$).

Certain combinations of higher moments are also additive. These are called semi-invariants. Skew and kurtosis are dimensionless combinations of semi-invariants:
$$\mathrm{Skew} = \frac{I_3}{I_2^{3/2}}, \qquad \mathrm{Kurt} = \frac{I_4}{I_2^{2}}$$
A Gaussian has all of its semi-invariants higher than $I_2$ equal to zero. A Poisson distribution has all of its semi-invariants equal to its mean.
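A quick simulation, my own addition rather than the lecture's, illustrating the additivity claim for variance: draw independent samples (here Gaussian plus exponential) and check that Var(X + Y) comes out near Var(X) + Var(Y).

```python
import random

rng = random.Random(1)
n = 200000
xs = [rng.gauss(0.0, 2.0) for _ in range(n)]   # Var(X) = 4
ys = [rng.expovariate(1.0) for _ in range(n)]  # Var(Y) = 1

def var(v):
    """Population-style sample variance."""
    m = sum(v) / len(v)
    return sum((u - m) ** 2 for u in v) / len(v)

var_sum = var([x + y for x, y in zip(xs, ys)])
# Independent draws, so Var(X + Y) should come out near 4 + 1 = 5.
```

Higher semi-invariants are additive in exactly the same way, which is what makes them convenient summaries.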

This is a good time to review some standard (i.e., frequently occurring) distributions.

Normal (Gaussian) — tails fall off "as fast as possible":
$$p(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/2\sigma^2}$$

Cauchy (Lorentzian) — tails fall off "as slowly as possible":
$$p(x) = \frac{1}{\pi\sigma}\,\frac{1}{1 + \left(\frac{x-\mu}{\sigma}\right)^2}$$

Student: "bell shaped", but you get to specify the power with which the tails fall off. Normal and Cauchy are limiting cases. (It also occurs in some statistical tests.)
$$p(x) = \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\Gamma\!\left(\frac{\nu}{2}\right)\sqrt{\pi\nu}\,\sigma}\left[1 + \frac{1}{\nu}\left(\frac{x-\mu}{\sigma}\right)^2\right]^{-\frac{\nu+1}{2}}$$
We'll see uses for "heavy-tailed" distributions later. Note that σ is not (quite) the standard deviation! "Student" was actually William Sealy Gosset (1876–1937), who spent his entire career at the Guinness brewery in Dublin, where he rose to become the company's Master Brewer.

Common distributions on the positive real line:

Exponential:
$$p(x) = \beta\, e^{-\beta x}, \qquad x > 0$$

Lognormal:
$$p(x) = \frac{1}{\sigma x \sqrt{2\pi}}\, \exp\!\left(-\frac{(\ln x - \mu)^2}{2\sigma^2}\right), \qquad x > 0$$

Gamma distribution:
$$p(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\beta x}, \qquad x > 0$$
Gamma and Lognormal are both commonly used as convenient 2-parameter fitting functions for "peak with tail" positive distributions. Both have parameters for peak location and width. Neither has a separate parameter for how the tail decays:
–Gamma: exponential decay
–Lognormal: long-tailed (exponential of the square of the log)

Chi-square distribution (we'll use this a lot!):
$$p(\chi^2) = \frac{1}{2^{\nu/2}\,\Gamma(\nu/2)}\, (\chi^2)^{\nu/2 - 1}\, e^{-\chi^2/2}$$
It has only one parameter ν, which determines both the peak location and the width. ν is often an integer, called the "number of degrees of freedom" or "DF". Note that the independent variable is χ², not χ. It's actually just a special case of Gamma, namely Gamma(ν/2, 1/2).

A deviate from N(0,1), the zero-mean unit-variance normal, is called a t-value. χ²(ν) is exactly the distribution of the sum of the squares of ν t-values.

Let's prove the case ν = 1. With $Y = X^2$ and X a standard normal deviate,
$$p_Y(y)\, dy = 2\, p_X(x)\, dx$$
(the factor 2 because both ±x map to the same y), so
$$p_Y(y) = y^{-1/2}\, p_X(y^{1/2}) = \frac{1}{\sqrt{2\pi y}}\, e^{-y/2}$$

Why this will be important: we will know the distribution of any "statistic" that is the sum of a known number of t²-values.
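To make the ν = 1 result concrete, here is a small Monte Carlo check of my own (not from the slides): square standard normal deviates and compare the fraction landing in a bin against the density 1/√(2πy) · e^(−y/2) derived above.

```python
import math
import random

rng = random.Random(2)
n = 400000
ys = [rng.gauss(0.0, 1.0) ** 2 for _ in range(n)]  # squared t-values

def chi2_nu1_pdf(y):
    """The nu = 1 chi-square density derived above."""
    return math.exp(-y / 2) / math.sqrt(2 * math.pi * y)

# Compare the empirical bin probability with pdf * bin width,
# choosing a bin away from the integrable singularity at y = 0.
lo, hi = 0.9, 1.1
frac = sum(lo < y < hi for y in ys) / n
approx = chi2_nu1_pdf(1.0) * (hi - lo)
```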

Characteristic function of a distribution

The characteristic function of a distribution is its Fourier transform:
$$\phi_X(t) \equiv \int_{-\infty}^{\infty} e^{itx}\, p_X(x)\, dx$$
(Statisticians often use the notational convention that X is a random variable, x its value, and p_X(x) its distribution.)
$$\phi_X(0) = 1, \qquad \phi'_X(0) = \int i x\, p_X(x)\, dx = i\mu, \qquad -\phi''_X(0) = \int x^2\, p_X(x)\, dx = \sigma^2 + \mu^2$$
So the coefficients of the Taylor series expansion of the characteristic function are the (uncentered) moments.

Properties of characteristic functions:

Addition of independent r.v.'s (the Fourier convolution theorem): let $S = X + Y$. Then
$$p_S(s) = \int p_X(u)\, p_Y(s-u)\, du, \qquad \phi_S(t) = \phi_X(t)\, \phi_Y(t)$$

Scaling law for r.v.'s, hence for characteristic functions:
$$\phi_{aX}(t) = \int e^{itx}\, p_{aX}(x)\, dx = \int e^{itx}\, \frac{1}{a}\, p_X\!\left(\frac{x}{a}\right) dx = \int e^{i(at)(x/a)}\, p_X\!\left(\frac{x}{a}\right) d\!\left(\frac{x}{a}\right) = \phi_X(at)$$
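The convolution-theorem property can be checked empirically; this sketch is mine, not the lecture's. It compares the empirical characteristic function of S = X + Y (X standard normal, Y uniform on [−1, 1], independent) against both the product of the individual empirical CFs and the exact closed forms e^(−t²/2) and sin(t)/t.

```python
import cmath
import random

rng = random.Random(3)
n = 200000
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
ys = [rng.uniform(-1.0, 1.0) for _ in range(n)]

def empirical_cf(samples, t):
    """Sample average of e^{its}, an estimate of phi(t)."""
    return sum(cmath.exp(1j * t * s) for s in samples) / len(samples)

t = 0.8
lhs = empirical_cf([x + y for x, y in zip(xs, ys)], t)   # phi_S(t)
rhs = empirical_cf(xs, t) * empirical_cf(ys, t)          # phi_X(t) * phi_Y(t)
exact = cmath.exp(-t * t / 2) * cmath.sin(t) / t         # closed-form product
```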

Proof of the convolution theorem, using the Fourier transform pair
$$\phi_X(t) \equiv \int_{-\infty}^{\infty} e^{itx}\, p_X(x)\, dx$$
and substituting $v = s - u$:
$$\phi_S(t) = \int e^{its}\, p_S(s)\, ds = \iint e^{its}\, p_X(u)\, p_Y(s-u)\, du\, ds = \int e^{itu}\, p_X(u)\, du \int e^{itv}\, p_Y(v)\, dv = \phi_X(t)\, \phi_Y(t)$$

What's the characteristic function of a Gaussian?
$$\phi(t) = \int_{-\infty}^{\infty} e^{itx}\, \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/2\sigma^2}\, dx = e^{i\mu t - \sigma^2 t^2/2}$$
When doing this integral in Mathematica, tell it that sig is positive; otherwise it gives "cases" when taking the square root of sig^2.
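The original slide's symbolic computation was a Mathematica cell that didn't survive transcription. As a stand-in of my own, here is a pure-Python numerical check that the integral really does come out to e^(iμt − σ²t²/2):

```python
import cmath
import math

def gaussian_cf_numeric(mu, sigma, t, n=200000, span=12.0):
    """Trapezoidal estimate of phi(t) = E[e^{itx}] for N(mu, sigma^2)."""
    a, b = mu - span * sigma, mu + span * sigma   # tails beyond are negligible
    h = (b - a) / n
    total = 0j
    for k in range(n + 1):
        x = a + k * h
        w = 0.5 if k in (0, n) else 1.0           # trapezoid endpoint weights
        p = math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))
        total += w * cmath.exp(1j * t * x) * p
    return total * h

def gaussian_cf_exact(mu, sigma, t):
    """The closed form: exp(i mu t - sigma^2 t^2 / 2)."""
    return cmath.exp(1j * mu * t - sigma ** 2 * t ** 2 / 2)

num = gaussian_cf_numeric(1.0, 2.0, 0.7)
ref = gaussian_cf_exact(1.0, 2.0, 0.7)
```

In Mathematica itself, the point of the slide stands: declare sig > 0 before integrating, or the result comes back as a pile of conditional cases.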

What's the characteristic function of χ²(ν)?
$$\phi_{\chi^2(\nu)}(t) = (1 - 2it)^{-\nu/2}$$
Since we already proved that the ν = 1 case is the distribution of a single t²-value, and characteristic functions of independent summands multiply, $(1-2it)^{-\nu/2} = \left[(1-2it)^{-1/2}\right]^{\nu}$ proves that the general case is the sum of ν t²-values.
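A Monte Carlo cross-check of my own construction: draw sums of ν squared standard normal deviates and compare their empirical characteristic function with (1 − 2it)^(−ν/2).

```python
import cmath
import random

def chi2_cf_exact(nu, t):
    """Characteristic function of chi-square: (1 - 2it)^(-nu/2)."""
    return (1 - 2j * t) ** (-nu / 2)

def empirical_cf_sum_sq_normals(nu, t, n=200000, seed=4):
    """Empirical CF of the sum of nu squared standard normal deviates."""
    rng = random.Random(seed)
    total = 0j
    for _ in range(n):
        s = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(nu))
        total += cmath.exp(1j * t * s)
    return total / n

emp = empirical_cf_sum_sq_normals(3, 0.3)
ref = chi2_cf_exact(3, 0.3)
```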

The Cauchy distribution has an ill-defined mean and infinite variance, but it has a perfectly good characteristic function. Matlab and Mathematica both sadly fail at computing the characteristic function of the Cauchy distribution, but you can use fancier methods* and get:
$$\phi_{\mathrm{Cauchy}}(t) = e^{i\mu t - \sigma |t|}$$
Note that it is non-analytic at t = 0.

*If t > 0, close the contour in the upper half-plane with a big semicircle, which adds nothing. So the integral is just the residue at the pole (x − μ)/σ = i, which gives exp(−σt). Similarly, close the contour in the lower half-plane for t < 0, giving exp(σt). So the answer is exp(−σ|t|). The factor exp(iμt) comes from the change of variable from x to x − μ.
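Even though the Cauchy distribution has no mean or variance, its empirical characteristic function is perfectly well behaved, because e^{itx} is bounded. This numerical check is my own addition: sample Cauchy deviates by the inverse-CDF tangent trick and compare with e^(iμt − σ|t|).

```python
import cmath
import math
import random

rng = random.Random(5)
mu, sigma, n = 1.0, 2.0, 400000
# Inverse-CDF sampling: x = mu + sigma * tan(pi * (u - 1/2)) for u ~ U(0,1).
xs = [mu + sigma * math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]

t = 0.5
emp = sum(cmath.exp(1j * t * x) for x in xs) / n
ref = cmath.exp(1j * mu * t - sigma * abs(t))   # the contour-integral result
```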