Statistical Theory; Why is the Gaussian Distribution so popular? Rob Nicholls MRC LMB Statistics Course 2014.


Contents: Continuous Random Variables; Expectation and Variance; Moments; The Law of Large Numbers (LLN); The Central Limit Theorem (CLT)

Continuous Random Variables
A random variable is an object whose value is determined by chance, i.e. by random events.
Probability that the random variable X adopts a particular value x:
$P(X = x) = p(x)$, with $0 \le p(x) \le 1$: discrete
$P(X = x) = 0$: continuous

Continuous Uniform Distribution: $X \sim U(a, b)$
Probability Density Function: $f(x) = \frac{1}{b - a}$ for $a \le x \le b$, and $f(x) = 0$ otherwise.
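The uniform density is simple enough to sketch directly. The block below is a minimal illustration, assuming the usual parametrisation U(a, b) with a < b; the function names are hypothetical and not taken from the slides.

```python
def uniform_pdf(x, a, b):
    """Density of U(a, b): 1/(b - a) on [a, b], zero elsewhere."""
    if b <= a:
        raise ValueError("require a < b")
    return 1.0 / (b - a) if a <= x <= b else 0.0


def uniform_interval_prob(lo, hi, a, b):
    """P(lo <= X <= hi) for X ~ U(a, b): overlap length divided by (b - a)."""
    if b <= a:
        raise ValueError("require a < b")
    overlap = max(0.0, min(hi, b) - max(lo, a))
    return overlap / (b - a)


print(uniform_pdf(0.3, 0.0, 2.0))                 # density inside the support
print(uniform_interval_prob(0.5, 1.0, 0.0, 2.0))  # probability of an interval
print(uniform_interval_prob(1.0, 1.0, 0.0, 2.0))  # a single point has probability 0
```

Note that the density is a height, not a probability: probabilities only arise as areas under it, which is why an interval of zero width has probability zero.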

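Continuous Random Variables
Example:
In general, for any continuous random variable X with density $f(x)$: $P(a \le X \le b) = \int_a^b f(x)\,dx$, and hence $P(X = x) = 0$ for every single value x.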

Continuous Random Variables
$P(X = x) = p(x)$: discrete
$P(X = x) = 0$: continuous
"Why do I observe a value if there's no probability of observing it?!"
Answers:
- Data are discrete.
- You don't actually observe the value: every measurement carries precision error.
- Some value must occur, even though the probability of observing any particular value is infinitely small.

Continuous Random Variables
For a random variable X, the Cumulative Distribution Function (CDF) is defined as: $F(x) = P(X \le x)$ (discrete or continuous)
Properties: non-decreasing; $\lim_{x \to -\infty} F(x) = 0$; $\lim_{x \to \infty} F(x) = 1$

Continuous Random Variables
[Figure: probability density function, cumulative distribution function and boxplot of a distribution]

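Continuous Random Variables
Cumulative Distribution Function (CDF):
Discrete: $F(x) = \sum_{x_i \le x} p(x_i)$
Continuous: $F(x) = \int_{-\infty}^{x} f(t)\,dt$
Probability Density Function (PDF):
Discrete: $p(x) = P(X = x)$
Continuous: $f(x) = \frac{dF(x)}{dx}$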
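In the continuous case the CDF is the integral of the PDF and the PDF is the derivative of the CDF. The sketch below checks this numerically; the standard Normal is used purely as an illustrative assumption, and the helper names are hypothetical.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Closed-form CDF of N(mu, sigma^2) via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def cdf_by_integration(pdf, x, lower=-10.0, steps=20000):
    """Trapezoidal approximation of the integral of pdf from lower to x."""
    h = (x - lower) / steps
    total = 0.5 * (pdf(lower) + pdf(x))
    for k in range(1, steps):
        total += pdf(lower + k * h)
    return total * h


def pdf_by_differentiation(cdf, x, h=1e-5):
    """Central-difference approximation of dF/dx."""
    return (cdf(x + h) - cdf(x - h)) / (2.0 * h)


print(cdf_by_integration(normal_pdf, 1.0), normal_cdf(1.0))
print(pdf_by_differentiation(normal_cdf, 1.0), normal_pdf(1.0))
```

The lower integration limit of -10 stands in for minus infinity; for the standard Normal the mass below that point is negligible.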

Expectation and Variance
Motivational Example: Experiment on Plant Growth (inbuilt R dataset), which compares yields obtained under different conditions.
- Compare means to test for differences.
- Consider the variance (and shape) of the distributions, to help choose an appropriate prior/protocol.
- Assess the uncertainty of parameter estimates, to allow hypothesis testing.
In order to do any of this, we need to know how to describe distributions, i.e. we need to know how to work with descriptive statistics.

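Expectation and Variance
Discrete RV: $E(X) = \sum_i x_i\,p(x_i)$
Sample (empirical): $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ (explicit weighting not required)
Continuous RV: $E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx$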
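The three forms of the expectation can be compared side by side. This is a minimal sketch: the fair die, the uniform density on [0, 2] and the sample values are all assumptions chosen for illustration, not data from the course.

```python
from fractions import Fraction
from statistics import fmean


def expectation_discrete(values, probs):
    """E(X) = sum of x * p(x) over all possible values x."""
    if sum(probs) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in zip(values, probs))


def expectation_continuous(pdf, lo, hi, steps=10000):
    """Midpoint-rule approximation of the integral of x * f(x) over [lo, hi]."""
    h = (hi - lo) / steps
    return sum((lo + (k + 0.5) * h) * pdf(lo + (k + 0.5) * h) for k in range(steps)) * h


# Discrete: fair six-sided die, each face has probability 1/6 (exact arithmetic).
die_mean = expectation_discrete([1, 2, 3, 4, 5, 6], [Fraction(1, 6)] * 6)

# Sample: every observation implicitly carries weight 1/n.
sample = [4.2, 5.6, 5.2, 6.1, 4.5]  # hypothetical measurements
sample_mean = fmean(sample)

# Continuous: U(0, 2) has density 1/2 on [0, 2].
uniform_mean = expectation_continuous(lambda x: 0.5, 0.0, 2.0)

print(die_mean, sample_mean, uniform_mean)
```

The sample mean needs no explicit weights because each observation already appears in the data with its empirical frequency.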

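Expectation and Variance
Normal Distribution: $X \sim N(\mu, \sigma^2)$, with density $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$ and expectation $E(X) = \mu$.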

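Expectation and Variance
Standard Cauchy Distribution (also called Lorentz): $f(x) = \frac{1}{\pi(1 + x^2)}$. The integral $\int x\,f(x)\,dx$ does not converge, so $E(X)$ is undefined.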

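Expectation and Variance
Expectation of a function of random variables: $E(g(X)) = \int_{-\infty}^{\infty} g(x)\,f(x)\,dx$
Linearity: $E(aX + b) = a\,E(X) + b$ and $E(X + Y) = E(X) + E(Y)$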

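Expectation and Variance
Variance: $\mathrm{Var}(X) = E[(X - \mu)^2] = E(X^2) - E(X)^2$, where $\mu = E(X)$. Illustrated for X ~ N(0,1) and X ~ N(0,2).
Population Variance: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2$
Unbiased Sample Variance: $s^2 = \frac{1}{n - 1}\sum_{i=1}^{n}(x_i - \bar{x})^2$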

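Expectation and Variance
Variance: $\mathrm{Var}(X) = E[(X - \mu)^2]$
Non-linearity: $\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$
Standard deviation (s.d.): $\sigma = \sqrt{\mathrm{Var}(X)}$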
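The difference between the population and the unbiased sample variance, and the effect of a linear transformation, can be checked on a small dataset. This is a sketch with hypothetical data and function names; Python's `statistics.pvariance` and `statistics.variance` compute the same two quantities.

```python
import math
from statistics import fmean


def population_variance(xs):
    """Mean squared deviation from the mean, dividing by N."""
    mu = fmean(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)


def sample_variance(xs):
    """Unbiased estimator, dividing by n - 1."""
    if len(xs) < 2:
        raise ValueError("need at least two observations")
    xbar = fmean(xs)
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations
a, b = 3, 1
transformed = [a * x + b for x in data]

print(population_variance(data))             # divides by N
print(sample_variance(data))                 # divides by n - 1, slightly larger
print(population_variance(transformed))      # a**2 times the original; b has no effect
print(math.sqrt(population_variance(data)))  # standard deviation, same units as the data
```

Adding the constant b shifts every value and the mean equally, so deviations from the mean are unchanged; multiplying by a scales each deviation by a and each squared deviation by a squared.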

Expectation and Variance
Often data are standardised/normalised.
Z-score/value: $z = \frac{x - \mu}{\sigma}$
Example:
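Standardisation can be sketched in a few lines. The example assumes the sample mean and the sample standard deviation (n - 1 denominator) stand in for the population values; the data and the function name are hypothetical.

```python
from statistics import fmean, stdev


def z_scores(xs):
    """Subtract the sample mean and divide by the sample standard deviation."""
    xbar = fmean(xs)
    s = stdev(xs)
    if s == 0:
        raise ValueError("cannot standardise constant data")
    return [(x - xbar) / s for x in xs]


heights_cm = [158.0, 163.5, 170.2, 175.8, 181.0]  # hypothetical measurements
z = z_scores(heights_cm)

print([round(v, 3) for v in z])
print(fmean(z), stdev(z))  # approximately 0 and 1 after standardisation
```

After standardisation the values are unitless and measure distance from the mean in standard deviations, which makes variables on different scales comparable.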

Moments
Shape descriptors [Figures: Li and Hartley (2006), Computer Vision; Saupe and Vranic (2001), Springer]

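Moments
Moments provide a description of the shape of a distribution.
Raw moments: $\mu'_n = E(X^n)$
Central moments: $\mu_n = E[(X - \mu)^n]$
Standardised moments: $\tilde{\mu}_n = E\left[\left(\frac{X - \mu}{\sigma}\right)^n\right]$
$\mu'_1$: mean; $\mu_2$: variance; $\tilde{\mu}_3$: skewness; $\tilde{\mu}_4$: kurtosis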
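Sample versions of these moments can be computed directly from the definitions. This is a minimal sketch using plug-in estimators (dividing by n throughout), which is an assumption; bias-corrected skewness and kurtosis estimators also exist. The data are hypothetical.

```python
from statistics import fmean


def raw_moment(xs, k):
    """Sample estimate of E(X^k)."""
    return fmean([x ** k for x in xs])


def central_moment(xs, k):
    """Sample estimate of E[(X - mu)^k]."""
    mu = fmean(xs)
    return fmean([(x - mu) ** k for x in xs])


def standardised_moment(xs, k):
    """Central moment divided by sigma^k, so the result is unitless."""
    return central_moment(xs, k) / central_moment(xs, 2) ** (k / 2)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations, skewed to the right

print(raw_moment(data, 1))           # mean
print(central_moment(data, 2))       # variance
print(standardised_moment(data, 3))  # skewness, positive for a right-hand tail
print(standardised_moment(data, 4))  # kurtosis (3 for a Normal distribution)
```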

Moments Standard Normal: Standard Log-Normal:

Moments
Moment generating function (MGF): $M_X(t) = E(e^{tX})$, with $E(X^n) = \left.\frac{d^n M_X(t)}{dt^n}\right|_{t=0}$
Alternative representation of a probability distribution.
Example:

Moments
However, the MGF only exists if the moments $E(X^n)$ exist.
Characteristic function always exists: $\varphi_X(t) = E(e^{itX})$
Related to the probability density function via the Fourier transform.
Example:
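As an illustration (not necessarily the example shown on the original slide), the MGF and characteristic function of a standard Normal variable can be worked out by completing the square:

```latex
\begin{align*}
M_X(t) &= E\!\left(e^{tX}\right)
        = \int_{-\infty}^{\infty} e^{tx}\,\frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,dx
        = e^{t^2/2}\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\,e^{-(x-t)^2/2}\,dx
        = e^{t^2/2}, \\
M_X'(t) &= t\,e^{t^2/2} \;\Rightarrow\; E(X) = M_X'(0) = 0, \\
M_X''(t) &= (1 + t^2)\,e^{t^2/2} \;\Rightarrow\; E(X^2) = M_X''(0) = 1, \\
\varphi_X(t) &= E\!\left(e^{itX}\right) = e^{-t^2/2}.
\end{align*}
```

By contrast, the standard Cauchy distribution has no MGF for any $t \neq 0$, yet its characteristic function $\varphi_X(t) = e^{-|t|}$ is well defined, which is why characteristic functions are the safer general tool.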

The Law of Large Numbers (LLN) Motivational Example: Experiment on Plant Growth (inbuilt R dataset) - compare yields obtained under different conditions Want to estimate the population mean using the sample mean. How can we be sure that the sample mean reliably estimates the population mean?

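The Law of Large Numbers (LLN)
Does the sample mean reliably estimate the population mean?
The Law of Large Numbers: $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i \to \mu$ as $n \to \infty$
Provided the $X_i$ are i.i.d. with finite mean $\mu$.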
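The Law of Large Numbers can be watched in action with a short simulation. This is a sketch under stated assumptions: draws are i.i.d. U(0, 1), so the population mean is 0.5, and the seed and function name are arbitrary choices for reproducibility.

```python
import random


def running_means(draw, n):
    """Return the sample mean after each of the first n draws."""
    total = 0.0
    means = []
    for i in range(1, n + 1):
        total += draw()
        means.append(total / i)
    return means


rng = random.Random(2014)  # fixed seed so the run is reproducible
means = running_means(rng.random, 100_000)

# The running mean settles towards the population mean 0.5 as n grows.
for n in (10, 1_000, 100_000):
    print(n, means[n - 1])
```

Swapping in draws from a distribution without a finite mean, such as the standard Cauchy, breaks the condition of the theorem, and the running mean then fails to settle.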

The Central Limit Theorem (CLT)
Question: given a particular sample, and thus a known sample mean, how reliable is the sample mean as an estimator of the population mean? Furthermore, how much will getting more data improve the estimate of the population mean?
Related question: given that we want the estimate of the mean to have a certain degree of reliability (i.e. a sufficiently low standard error, S.E.), how many observations do we need to collect?
The Central Limit Theorem helps answer these questions by looking at the distribution of stochastic fluctuations about the mean as $n \to \infty$.
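The sample-size question has a simple closed-form answer once the standard error of the mean is written as σ/√n. The sketch below assumes the population standard deviation σ is known (in practice it would be estimated); the function names are hypothetical.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean of n i.i.d. observations with s.d. sigma."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return sigma / math.sqrt(n)


def required_sample_size(sigma, target_se):
    """Smallest n for which sigma / sqrt(n) does not exceed target_se."""
    if sigma <= 0 or target_se <= 0:
        raise ValueError("sigma and target_se must be positive")
    return math.ceil((sigma / target_se) ** 2)


print(required_sample_size(2.0, 0.5))  # observations needed for S.E. <= 0.5
print(standard_error(2.0, 16))         # S.E. achieved with 16 observations
print(standard_error(2.0, 64))         # four times the data halves the S.E.
```

Because the standard error falls with the square root of n, each halving of the uncertainty costs four times as many observations.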

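The Central Limit Theorem (CLT)
The Central Limit Theorem states:
For large n: $\bar{X}_n \sim N\left(\mu, \frac{\sigma^2}{n}\right)$ approximately
Or equivalently: $\frac{\bar{X}_n - \mu}{\sigma / \sqrt{n}} \sim N(0, 1)$ approximately
More formally: $\sqrt{n}\,(\bar{X}_n - \mu) \xrightarrow{d} N(0, \sigma^2)$ as $n \to \infty$
Conditions: $X_1, \dots, X_n$: i.i.d. RVs (any distribution with finite mean $\mu$ and finite variance $\sigma^2$)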
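The statement can be checked empirically by simulating many sample means from a clearly non-Normal distribution. This is a sketch under assumptions: draws are Exponential with rate 1 (so μ = σ = 1), and n, the number of replicates and the seed are arbitrary choices.

```python
import math
import random
from statistics import fmean, pstdev


def simulate_sample_means(draw, n, reps):
    """Draw reps independent samples of size n and return their means."""
    return [fmean([draw() for _ in range(n)]) for _ in range(reps)]


rng = random.Random(42)  # fixed seed so the run is reproducible
n, reps = 50, 5000
mu, sigma = 1.0, 1.0     # mean and s.d. of the Exponential(1) distribution

means = simulate_sample_means(lambda: rng.expovariate(1.0), n, reps)
se = sigma / math.sqrt(n)

# Fraction of sample means within 1.96 standard errors of mu;
# about 0.95 is expected if the Normal approximation holds.
coverage = sum(abs(m - mu) <= 1.96 * se for m in means) / reps

print(fmean(means), mu)   # centre of the sample means vs population mean
print(pstdev(means), se)  # spread of the sample means vs sigma / sqrt(n)
print(coverage)
```

The exponential distribution is strongly skewed, yet the means of samples of 50 already behave much like draws from a Normal distribution with the predicted centre and spread.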

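The Central Limit Theorem (CLT)
Proof of the Central Limit Theorem: the characteristic function of the standardised sample mean converges to $e^{-t^2/2}$ = characteristic function of a standard Normal, N(0, 1), distribution.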
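The individual steps of the proof did not survive in this transcript. The following is a sketch of the standard characteristic-function argument, which may differ in detail from the one presented; the symbols $Y_i$ and $Z_n$ are introduced here for illustration.

```latex
\begin{align*}
&\text{Let } Y_i = \frac{X_i - \mu}{\sigma}, \text{ so that } E(Y_i) = 0,\; \mathrm{Var}(Y_i) = 1,
 \text{ and let } Z_n = \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} = \frac{1}{\sqrt{n}}\sum_{i=1}^{n} Y_i. \\
&\text{By independence: } \varphi_{Z_n}(t) = \left[\varphi_Y\!\left(\frac{t}{\sqrt{n}}\right)\right]^{n}. \\
&\text{Taylor expansion about } 0: \quad \varphi_Y(s) = 1 + i s\,E(Y) - \frac{s^2}{2}E(Y^2) + o(s^2) = 1 - \frac{s^2}{2} + o(s^2). \\
&\text{Hence } \varphi_{Z_n}(t) = \left[1 - \frac{t^2}{2n} + o\!\left(\frac{1}{n}\right)\right]^{n} \longrightarrow e^{-t^2/2} \quad \text{as } n \to \infty.
\end{align*}
```

Since $e^{-t^2/2}$ is the characteristic function of N(0, 1), Lévy's continuity theorem gives convergence in distribution of $Z_n$ to a standard Normal variable.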

Summary
Considered how:
- Probability Density Functions (PDFs) and Cumulative Distribution Functions (CDFs) are related, and how they differ in the discrete and continuous cases
- Expectation is at the core of statistical theory, and moments can be used to describe distributions
- The Central Limit Theorem identifies how/why the Normal distribution is fundamental
The Normal distribution is also popular for other reasons:
- Maximum entropy distribution (given mean and variance)
- Intrinsically related to other distributions (t, F, χ², Cauchy, …)
- Also, it is easy to work with

References
Countless books + online resources!
Probability and statistical theory: Grimmett and Stirzaker (2001) Probability and Random Processes. Oxford University Press.
General comprehensive introduction to (almost) everything in mathematics: Garrity (2002) All the Mathematics You Missed: But Need to Know for Graduate School. Cambridge University Press.