Sampling and estimation Petter Mostad 2005.09.26.


Sampling and estimation Petter Mostad

The normal distribution
The most used continuous probability distribution:
– Many observations tend to approximately follow this distribution
– It is easy and convenient to do computations with
– BUT: using it can result in wrong conclusions when it is not appropriate

[Figure: the normal density curve, centred at the mean μ, with the points μ−2σ and μ+2σ marked on the axis]

The normal distribution
The probability density function is
f(x) = 1/(σ√(2π)) · exp(−(x−μ)²/(2σ²))
where μ is the expectation and σ² the variance.
Notation: X ~ N(μ, σ²); the standard normal distribution is N(0, 1).
Using the normal density is often OK unless the actual distribution is very skewed

Normal probability plots
Plot the quantiles of the data against the quantiles of the normal distribution. If the data are approximately normally distributed, the plot will show an approximately straight line

The Normal versus the Binomial distribution
When n is large and π is not too close to 0 or 1, the Binomial distribution becomes very similar to the Normal distribution with the same expectation and variance. This happens for all distributions that can be seen as a sum of independent observations. It can be used to make approximate computations for the Binomial distribution.
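A quick numeric check of this approximation (the values n = 50, π = 0.3, and the cutoff 18 are arbitrary illustrations), using a continuity correction of 0.5:

```python
from math import comb
from statistics import NormalDist

n, p = 50, 0.3   # illustrative values: n large, p not too close to 0 or 1

# Exact binomial probability P(X <= 18).
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(19))

# Normal approximation with the same mean and variance,
# evaluated at 18.5 (continuity correction).
approx = NormalDist(mu=n * p, sigma=(n * p * (1 - p)) ** 0.5).cdf(18.5)

print(round(exact, 3), round(approx, 3))
```

The two numbers agree to within about 0.01 here; the agreement worsens when π approaches 0 or 1 or when n is small.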

The Exponential distribution
The exponential distribution is a distribution for positive numbers, with parameter λ and density
f(x) = λ e^(−λx), x > 0
It can be used to model the time until an event, when events arrive randomly at a constant rate

Sampling
We need to connect the probability models we have introduced with the data we want to analyze. We (usually) want to regard our data as a simple random sample from a probability model:
– Each observation is sampled independently of the others
– Each observation is sampled from the same probability model
Thus we go on to study the properties of simple random samples.

Example: The mean of a random sample
If X1, X2, …, Xn is a random sample, then their sample mean is defined as
X̄ = (1/n)(X1 + X2 + … + Xn)
As a function of random variables, it is itself a random variable.
If E(Xi) = μ, then E(X̄) = μ.
If Var(Xi) = σ², then Var(X̄) = σ²/n.

Example
Assume X1, X2, …, X10 is a random sample from the binomial distribution Bin(20, 0.2). We get
E(X̄) = 20 · 0.2 = 4
Var(X̄) = 20 · 0.2 · 0.8 / 10 = 0.32
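For Bin(20, 0.2), E(X̄) = 20·0.2 = 4 and Var(X̄) = 20·0.2·0.8/10 = 0.32. A simulation sketch checking this, in the spirit of the slides that follow (the seed and the number of repetitions are arbitrary choices):

```python
import random
import statistics

random.seed(0)

def binom(n, p):
    # One Bin(n, p) outcome, simulated as a sum of n Bernoulli trials.
    return sum(random.random() < p for _ in range(n))

# Sample mean of 10 draws from Bin(20, 0.2), repeated many times.
means = [statistics.mean(binom(20, 0.2) for _ in range(10))
         for _ in range(10000)]

print(round(statistics.mean(means), 2))      # theory: E = 20 * 0.2 = 4
print(round(statistics.variance(means), 2))  # theory: 20 * 0.2 * 0.8 / 10 = 0.32
```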

Simulation
Simulation: to generate outcomes by computer, on the basis of pseudo-random numbers.
Pseudo-random number: generated by an algorithm completely unrelated to the way the numbers are used, so that they appear random. Usually generated to be uniformly distributed between 0 and 1.
There is a correspondence between random variables and algorithms to simulate outcomes.

Examples To simulate outcomes 1,2,…,6 each with probability 1/6: Simulate pseudo-random u in [0,1), and let the outcome be i if u is between (i-1)/6 and i/6. To simulate exponentially distributed X with parameter λ: Simulate pseudo-random u in [0,1), and compute x=-log(u)/λ
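Both recipes can be written out directly (the parameter values, sample counts, and seed are illustrative; 1 − u is used inside the logarithm to avoid log(0), which gives the same distribution as the slide's −log(u)/λ):

```python
import math
import random
import statistics

random.seed(42)

def roll_die():
    # Outcome i when u falls in [(i - 1)/6, i/6).
    u = random.random()
    return int(u * 6) + 1

def exponential(lam):
    # Inverse transform: x = -log(u)/lambda; 1 - u avoids log(0)
    # and is uniform on (0, 1] when u is uniform on [0, 1).
    u = random.random()
    return -math.log(1 - u) / lam

rolls = [roll_die() for _ in range(60000)]
xs = [exponential(2.0) for _ in range(60000)]

print(round(rolls.count(3) / len(rolls), 2))  # each face has probability 1/6
print(round(statistics.mean(xs), 2))          # Exponential(2) has mean 1/2
```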

Stochastic variables and simulation of outcomes
The histogram of n simulated values will approach the probability distribution simulated from, as n increases.
[Figure: histograms of simulated values for sample sizes n = 100, n = 1000, and larger, converging toward the density simulated from]

Using simulation to study properties of samples
We saw how to find, theoretically, the expectation and variance of some functions of a sample. Instead, we can simulate the function of the sample a large number of times and study the distribution of the resulting numbers. This gives approximate results.

Example
X1, X2, …, X10 is a random sample from the binomial distribution Bin(20, 0.2). Simulating this sample a large number of times, and computing the sample mean each time, we get a collection of simulated means. The average of these numbers is 4.001, close to the theoretical value E(X̄) = 4, and their variance is close to the theoretical Var(X̄) = 0.32.

Studying the properties of averages
If X1, X2, …, Xn is a random sample from some distribution, it is very common to want to study the sample mean. In the following example, we have sampled from the Exponential distribution with parameter λ = 1:
– First (a large number of times) taken the average of 3 samples
– Then (a large number of times) taken the average of 30 samples
– Then (a large number of times) taken the average of 300 samples

[Figure: histograms of the Exponential(1) distribution itself, and of averages of 3, 30, and 300 samples; the shape approaches a normal distribution as the number of averaged samples grows]

The Central Limit Theorem
It is a very important fact that the above happens no matter what distribution you start with. The theorem states: If X1, X2, …, Xn is a random sample from a distribution with expectation μ and variance σ², then the distribution of
Zn = (X̄ − μ) / (σ/√n)
approaches a standard normal distribution when n gets large.
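A sketch of the theorem in action, standardizing means of Exponential(1) samples (the sample size n = 30, the repetition count, and the seed are arbitrary choices):

```python
import math
import random
import statistics

random.seed(7)

lam, n, reps = 1.0, 30, 20000
mu, sd = 1 / lam, 1 / lam  # Exponential(1) has mean 1 and standard deviation 1

# Standardize the mean of n exponential samples; by the CLT the
# result should be approximately N(0, 1) when n is large.
zs = []
for _ in range(reps):
    xbar = statistics.mean(random.expovariate(lam) for _ in range(n))
    zs.append((xbar - mu) / (sd / math.sqrt(n)))

print(round(statistics.mean(zs), 2))   # should be near 0
print(round(statistics.stdev(zs), 2))  # should be near 1
```

A histogram of zs would look close to the standard normal bell curve, even though the underlying exponential distribution is strongly skewed.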

Example
Let X be from Bin(n, π). X/n can be seen as the average of n Bernoulli variables, so we can apply the theory above. We get that when n grows, the expression
(X/n − π) / √(π(1−π)/n)
gets an approximate standard normal distribution N(0, 1). A common rule of thumb for when to accept the approximation is to require nπ(1−π) to be at least about 5.

The sampling distribution of the sample variance
Recall: the sample variance is
s² = (1/(n−1)) Σᵢ (Xᵢ − X̄)²
We can show theoretically that its expectation is equal to the variance σ² of the original distribution. Its distribution is approximately normal if the sample is large. If the underlying distribution is normal N(μ, σ²):
– (n−1)s²/σ² is distributed as the χ² distribution with n−1 degrees of freedom
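The chi-square claim can be checked by simulation: for normal data, (n − 1)s²/σ² should have mean n − 1 and variance 2(n − 1), the moments of the chi-square distribution with n − 1 degrees of freedom (σ = 2, n = 5, the seed, and the repetition count are arbitrary choices):

```python
import random
import statistics

random.seed(3)

sigma, n = 2.0, 5

# For normal data, (n - 1) s^2 / sigma^2 follows a chi-square
# distribution with n - 1 degrees of freedom, whose mean is n - 1
# and whose variance is 2(n - 1).
ws = []
for _ in range(40000):
    xs = [random.gauss(0, sigma) for _ in range(n)]
    ws.append((n - 1) * statistics.variance(xs) / sigma**2)

print(round(statistics.mean(ws), 1))       # should be near n - 1 = 4
print(round(statistics.variance(ws), 1))   # should be near 2(n - 1) = 8
```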

The Chi-square distribution
The Chi-square distribution with n degrees of freedom is denoted χ²ₙ. It is the distribution of the sum of the squares of n independent random variables with standard normal distributions.

Estimation We have previously looked at –Probability models (with parameters) –Properties of samples from such probability models We now turn this around and start with a dataset, and try to find a probability model fitting the data. A (point) estimator is a function of the data, meant to estimate a parameter of the model A (point) estimate is a value of the estimator, computed from the data

Properties of estimators An estimator is unbiased if its expectation is equal to the parameter it is estimating The bias of an estimator is its expectation minus the parameter it is estimating The efficiency of an unbiased estimator is measured by its variance: One would like to have estimators with high efficiency (i.e., low variance)
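As a concrete illustration of bias, compare the sample variance with divisor n − 1 (unbiased for σ²) against the divisor-n version, whose expectation is (n − 1)/n · σ² (the parameter values, seed, and repetition count below are arbitrary):

```python
import random
import statistics

random.seed(11)

sigma2, n = 4.0, 5

biased, unbiased = [], []
for _ in range(30000):
    xs = [random.gauss(0, sigma2 ** 0.5) for _ in range(n)]
    unbiased.append(statistics.variance(xs))   # divisor n - 1: unbiased for sigma^2
    biased.append(statistics.pvariance(xs))    # divisor n: biased downward

print(round(statistics.mean(unbiased), 2))  # should be near sigma^2 = 4
print(round(statistics.mean(biased), 2))    # should be near (n-1)/n * 4 = 3.2
```

The bias of the divisor-n estimator is therefore (n − 1)/n · σ² − σ² = −σ²/n, which vanishes as n grows.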

Confidence intervals: Example
Assume μ and σ² are some real numbers, and assume the data X1, X2, …, Xn are a random sample from N(μ, σ²).
– Then X̄ ~ N(μ, σ²/n)
– thus P(−1.96 ≤ (X̄ − μ)/(σ/√n) ≤ 1.96) = 0.95
– so P(X̄ − 1.96σ/√n ≤ μ ≤ X̄ + 1.96σ/√n) = 0.95
and we say that (X̄ − 1.96σ/√n, X̄ + 1.96σ/√n) is a confidence interval for μ with 95% confidence, based on the statistic X̄.
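The interval X̄ ± 1.96σ/√n can be checked by repeated simulation: it should contain μ in roughly 95% of the repetitions (μ, σ, n, the seed, and the repetition count are arbitrary illustrative choices):

```python
import math
import random
import statistics

random.seed(5)

mu, sigma, n = 10.0, 3.0, 25   # illustrative parameter choices
reps, hits = 10000, 0

for _ in range(reps):
    xbar = statistics.mean(random.gauss(mu, sigma) for _ in range(n))
    half = 1.96 * sigma / math.sqrt(n)
    if xbar - half <= mu <= xbar + half:
        hits += 1

print(round(hits / reps, 2))   # coverage should be close to 0.95
```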

Confidence intervals: interpretation Interpretation: If we do the following a large number of times: –We pick μ (and σ 2 ) –We generate data and the statistic –We compute the confidence interval then the confidence interval will contain μ roughly 95% of the time. Note: The confidence interval pertains to μ (and σ 2 ), and to the particular statistic. If a different statistic is used, a different confidence interval could result.

Example: a different statistic
Assume that in the example above we use X1 instead of X̄. Since X1 ~ N(μ, σ²), we get, as before, the confidence interval
(X1 − 1.96σ, X1 + 1.96σ)
Note how this is different from before, as we have used a different statistic.

Alternative concept: Credibility interval The knowledge about μ can be formulated as a probability distribution If an interval I has 95% probability under this distribution, then I is called a credibility interval for μ, with credibility 95% It is very common, but wrong, to interpret confidence intervals as credibility intervals

Example: Finding credibility intervals
We must always start with a probability distribution π(μ) describing our knowledge about μ before looking at data. As above, the probability distribution g for Z | μ, where Z = X̄, is the normal distribution N(μ, σ²/n). Using Bayes' formula, we get a probability distribution f for μ | Z:
f(μ | z) = g(z | μ)π(μ) / ∫ g(z | μ)π(μ) dμ

Finding credibility intervals (cont.)
IF we assume "flat" knowledge about μ before observing data, i.e., that π(μ) = 1, then f(μ | z) is the density of N(z, σ²/n), and a 95% credibility interval becomes
(z − 1.96σ/√n, z + 1.96σ/√n)
Similarly, if we assume π(μ) = 1 and only observe X1, then a 95% credibility interval becomes
(X1 − 1.96σ, X1 + 1.96σ)
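A numeric sketch of the flat-prior case (the data set is hypothetical and σ is assumed known): evaluating the posterior for μ on a grid and reading off the central 95% credibility interval reproduces x̄ ± 1.96σ/√n.

```python
import math
import statistics

# Hypothetical data; sigma is assumed known.
data = [9.2, 10.1, 9.8, 10.6, 9.5]
sigma = 1.0
n = len(data)
xbar = statistics.mean(data)

# Posterior for mu under a flat prior pi(mu) = 1, evaluated on a grid
# and normalized; the likelihood depends on the data through xbar.
grid = [xbar - 4 + 0.001 * i for i in range(8001)]
post = [math.exp(-n * (xbar - m) ** 2 / (2 * sigma**2)) for m in grid]
total = sum(post)
post = [v / total for v in post]

# Central 95% credibility interval from the cumulative posterior.
cum, lo, hi = 0.0, None, None
for m, p in zip(grid, post):
    cum += p
    if lo is None and cum >= 0.025:
        lo = m
    if hi is None and cum >= 0.975:
        hi = m

# With a flat prior this agrees with xbar +/- 1.96 * sigma / sqrt(n).
print(round(lo, 2), round(hi, 2))
```

With an informative prior (π(μ) not flat), the same grid computation gives an interval that is generally different from the confidence interval, which is the point of the summary slide.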

Summary on confidence and credibility intervals
Confidence and credibility intervals are NOT the same. A confidence interval says something about a parameter AND a random variable (or statistic) based on it. A credibility interval describes the knowledge about the parameter; it must always be based both on a specification of the knowledge held before making the observations and on the observations themselves. In many cases, computed confidence intervals correspond to credibility intervals under a certain assumed prior knowledge.