Statistical Inferences Jake Blanchard Spring 2010 Uncertainty Analysis for Engineers1.

Slides:



Advertisements
Similar presentations
Modeling of Data. Basic Bayes theorem Bayes theorem relates the conditional probabilities of two events A, and B: A might be a hypothesis and B might.
Advertisements

Tests of Hypotheses Based on a Single Sample
Chapter 4 Inference About Process Quality
Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. Chapter 9 Inferences Based on Two Samples.
Lecture (11,12) Parameter Estimation of PDF and Fitting a Distribution Function.
CmpE 104 SOFTWARE STATISTICAL TOOLS & METHODS MEASURING & ESTIMATING SOFTWARE SIZE AND RESOURCE & SCHEDULE ESTIMATING.
Chapter 7. Statistical Estimation and Sampling Distributions
Estimation  Samples are collected to estimate characteristics of the population of particular interest. Parameter – numerical characteristic of the population.
Chap 9: Testing Hypotheses & Assessing Goodness of Fit Section 9.1: INTRODUCTION In section 8.2, we fitted a Poisson dist’n to counts. This chapter will.
1. Estimation ESTIMATION.
1 MF-852 Financial Econometrics Lecture 4 Probability Distributions and Intro. to Hypothesis Tests Roy J. Epstein Fall 2003.
Chapter 7 Sampling and Sampling Distributions
9-1 Hypothesis Testing Statistical Hypotheses Statistical hypothesis testing and confidence interval estimation of parameters are the fundamental.
Topic 2: Statistical Concepts and Market Returns
Lec 6, Ch.5, pp90-105: Statistics (Objectives) Understand basic principles of statistics through reading these pages, especially… Know well about the normal.
4-1 Statistical Inference The field of statistical inference consists of those methods used to make decisions or draw conclusions about a population.
Inference about a Mean Part II
Chapter 2 Simple Comparative Experiments
Lecture 7 1 Statistics Statistics: 1. Model 2. Estimation 3. Hypothesis test.
Inferences About Process Quality
BCOR 1020 Business Statistics Lecture 18 – March 20, 2008.
Chapter 9 Hypothesis Testing.
“There are three types of lies: Lies, Damn Lies and Statistics” - Mark Twain.
INFERENTIAL STATISTICS – Samples are only estimates of the population – Sample statistics will be slightly off from the true values of its population’s.
Chapter 9 Title and Outline 1 9 Tests of Hypotheses for a Single Sample 9-1 Hypothesis Testing Statistical Hypotheses Tests of Statistical.
Statistical Inference for Two Samples
Aaker, Kumar, Day Ninth Edition Instructor’s Presentation Slides
McGraw-Hill/IrwinCopyright © 2009 by The McGraw-Hill Companies, Inc. All Rights Reserved. Chapter 9 Hypothesis Testing.
8 - 1 © 2003 Pearson Prentice Hall Chi-Square (  2 ) Test of Variance.
1/2555 สมศักดิ์ ศิวดำรงพงศ์
Review of Statistical Inference Prepared by Vera Tabakova, East Carolina University ECON 4550 Econometrics Memorial University of Newfoundland.
Statistical Intervals for a Single Sample
Copyright © 2013, 2010 and 2007 Pearson Education, Inc. Chapter Inference on the Least-Squares Regression Model and Multiple Regression 14.
Statistics for Data Miners: Part I (continued) S.T. Balke.
Topics: Statistics & Experimental Design The Human Visual System Color Science Light Sources: Radiometry/Photometry Geometric Optics Tone-transfer Function.
Random Sampling, Point Estimation and Maximum Likelihood.
Lecture 12 Statistical Inference (Estimation) Point and Interval estimation By Aziza Munir.
9-1 Hypothesis Testing Statistical Hypotheses Definition Statistical hypothesis testing and confidence interval estimation of parameters are.
Statistical Decision Making. Almost all problems in statistics can be formulated as a problem of making a decision. That is given some data observed from.
Maximum Likelihood Estimator of Proportion Let {s 1,s 2,…,s n } be a set of independent outcomes from a Bernoulli experiment with unknown probability.
1 SMU EMIS 7364 NTU TO-570-N Inferences About Process Quality Updated: 2/3/04 Statistical Quality Control Dr. Jerrell T. Stracener, SAE Fellow.
MEGN 537 – Probabilistic Biomechanics Ch.5 – Determining Distributions and Parameters from Observed Data Anthony J Petrella, PhD.
Chapter 7 Inferences Based on a Single Sample: Tests of Hypotheses.
McGraw-Hill/Irwin Copyright © 2007 by The McGraw-Hill Companies, Inc. All rights reserved. Chapter 8 Hypothesis Testing.
Lecture 4: Statistics Review II Date: 9/5/02  Hypothesis tests: power  Estimation: likelihood, moment estimation, least square  Statistical properties.
1 9 Tests of Hypotheses for a Single Sample. © John Wiley & Sons, Inc. Applied Statistics and Probability for Engineers, by Montgomery and Runger. 9-1.
Probability = Relative Frequency. Typical Distribution for a Discrete Variable.
STATISTICAL INFERENCE PART IV CONFIDENCE INTERVALS AND HYPOTHESIS TESTING 1.
Chapter 7 Point Estimation of Parameters. Learning Objectives Explain the general concepts of estimating Explain important properties of point estimators.
Inferences from sample data Confidence Intervals Hypothesis Testing Regression Model.
Inen 460 Lecture 2. Estimation (ch. 6,7) and Hypothesis Testing (ch.8) Two Important Aspects of Statistical Inference Point Estimation – Estimate an unknown.
Review of Statistical Inference Prepared by Vera Tabakova, East Carolina University ECON 4550 Econometrics Memorial University of Newfoundland.
Brief Review Probability and Statistics. Probability distributions Continuous distributions.
© Copyright McGraw-Hill 2004
Sampling and Statistical Analysis for Decision Making A. A. Elimam College of Business San Francisco State University.
Statistical Inference Statistical inference is concerned with the use of sample data to make inferences about unknown population parameters. For example,
G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate.
Statistics Sampling Distributions and Point Estimation of Parameters Contents, figures, and exercises come from the textbook: Applied Statistics and Probability.
Review of Statistical Inference Prepared by Vera Tabakova, East Carolina University.
Chapter 8 Estimation ©. Estimator and Estimate estimator estimate An estimator of a population parameter is a random variable that depends on the sample.
MEGN 537 – Probabilistic Biomechanics Ch.5 – Determining Distributions and Parameters from Observed Data Anthony J Petrella, PhD.
Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. Chapter 7 Inferences Concerning Means.
Statistical Decision Making. Almost all problems in statistics can be formulated as a problem of making a decision. That is given some data observed from.
4-1 Statistical Inference Statistical inference is to make decisions or draw conclusions about a population using the information contained in a sample.
Introduction For inference on the difference between the means of two populations, we need samples from both populations. The basic assumptions.
Chapter 2 Simple Comparative Experiments
Inference on Mean, Var Unknown
CONCEPTS OF HYPOTHESIS TESTING
9 Tests of Hypotheses for a Single Sample CHAPTER OUTLINE
STATISTICAL INFERENCE PART IV
Presentation transcript:

Statistical Inferences Jake Blanchard Spring 2010 Uncertainty Analysis for Engineers1

Introduction Statistical inference=process of drawing conclusions from random data Conclusions of this process are “propositions,” for example ◦ Estimates ◦ Confidence intervals ◦ Credible intervals ◦ Rejecting a hypothesis ◦ Clustering data points Part of this is the estimation of model parameters Uncertainty Analysis for Engineers2

Parameter Estimation Point Estimation ◦ Calculate single number from a set of observational data Interval Estimation ◦ Determine interval within which true parameter lies (along with confidence level) Uncertainty Analysis for Engineers3

Properties Bias=expected value of estimator does not necessarily equal parameter Consistency=estimator approaches parameter as n approaches infinity Efficiency=smaller variance of parameter implies higher efficiency Sufficient=utilizes all pertinent information in a sample Uncertainty Analysis for Engineers4

Point Estimation Start with data sample of size N Example: estimate fraction of voters who will vote for particular candidate (estimate is based on random sample of voters) Other examples: quality control, clinical trials, software engineering, orbit prediction Assume successive samples are statistically independent Uncertainty Analysis for Engineers5

Estimators Maximum likelihood Method of moments Minimum mean squared error Bayes estimators Cramer-Rao bound Maximum a posteriori Minimum variance unbiased estimator Best linear unbiased estimator etc Uncertainty Analysis for Engineers6

Maximum Likelihood Suppose we have a random variable x with pdf f(x;  ) Take n samples of x What is value of  that will maximize the likelihood of obtaining these n observations? Let L=likelihood of observing this set of values for x Then maximize L with respect to  Uncertainty Analysis for Engineers7

Maximum Likelihood Uncertainty Analysis for Engineers8

Example Time between successive arrivals of vehicles at an intersection are 1.2, 3, 6.3, 10.1, 5.2, 2.4, and 7.2 seconds Assume exponential distribution Find MLE for Uncertainty Analysis for Engineers9

Solution 10

2-Parameter Example Measure cycles to failure of saturated sand (25, 20, 28, 33, 26 cycles) Assume lognormal distribution Uncertainty Analysis for Engineers11

Solution Uncertainty Analysis for Engineers12

Method of Moments Use sample moments (mean, variance, etc.) to set distribution parameters Uncertainty Analysis for Engineers13

Example Time between successive arrivals of vehicles at an intersection are 1.2, 3, 6.3, 10.1, 5.2, 2.4, and 7.2 seconds Assume exponential distribution Mean=5.05 Uncertainty Analysis for Engineers14

2-Parameter Example Measure cycles to failure of saturated sand (25, 20, 28, 33, 26 cycles) Assume lognormal distribution Mean=26.4 Standard Deviation=4.72 Solve for and  =3.26  =0.177 Uncertainty Analysis for Engineers15

Solution Uncertainty Analysis for Engineers16

Minimum Mean Square Error Choose parameters to minimize mean squared error between measured data and continuous distribution Essentially a curve fit Uncertainty Analysis for Engineers17

Approach Excel ◦ Guess parameters ◦ Calculate sum of squares of errors ◦ Vary guessed parameters to minimize error (use the Solver) Matlab ◦ Use fminsearch function Uncertainty Analysis for Engineers18

Example Solar insolation data ◦ Gather data ◦ Form histogram ◦ Normalize histogram by number of samples and width of bins Uncertainty Analysis for Engineers19

Scatter Plot and Histogram Uncertainty Analysis for Engineers20

Normal and Weibull Fits Uncertainty Analysis for Engineers21 Mean=3980 (fit) Mean=3915 (data)

Excel Screen Shot Uncertainty Analysis for Engineers22

Excel Screen Shot Uncertainty Analysis for Engineers23

Solver Set Up Uncertainty Analysis for Engineers24

Matlab Script y=xlsread('matlabfit.xlsx','normal') [s,t]=hist(y,8); s=s/((max(t)-min(t))/8)/numel(y); numpts=numel(t); zin(1)=mean(t); zin(2)=std(t); sumoferrs(zin,t,s) sumoferrs(z,t,s), zin) sumoferrs(zout,t,s) xplot=t(1):(t(end)-t(1))/(10*numel(t)):t(end); yplot=curve(xplot,zout); plot(t,s,'+',xplot,yplot) Uncertainty Analysis for Engineers25

Matlab Script function f=curve(x,z) mu=z(1); sig=z(2); f=normpdf(x,mu,sig); function f=sumoferrs(z, x, y) f=sum((curve(x,z)-y).^2); Uncertainty Analysis for Engineers26

Sampling Distributions How do we assess inaccuracy in using sample mean to estimate population mean? Uncertainty Analysis for Engineers27

Conclusions Expected value of mean is equal to population mean Mean of sample is unbiased estimator of mean of population Variance of sample mean is sampling error By CLT, sample mean is Gaussian for large n Mean of x is N( ,  /  n) Estimator for  improves as n increases Uncertainty Analysis for Engineers28

Sample Mean with Unknown  In previous derivation,  is the population mean This is generally not known All we have is the sample variance (s 2 ) If sample size is small, distribution will not be Gaussian We can use a “student’s t-distribution” Uncertainty Analysis for Engineers29 f=number of degrees of freedom

Distribution of Sample Variance Uncertainty Analysis for Engineers30

Conclusions Sample variance is unbiased estimator of population variance For normal variates Uncertainty Analysis for Engineers31 Chi-Square Distribution with n-1 dof This approaches normal distribution for large n

Testing Hypotheses Used to make decisions about population based on sample Steps ◦ Define null and alternative hypotheses ◦ Identify test statistic ◦ Estimate test statistic, based on sample ◦ Specify level of significance  Type I error: rejecting null hypothesis when it is true  Type II error: accepting null hypothesis when it is false ◦ Define region of rejection (one tail or two?) Uncertainty Analysis for Engineers32

Level of Significance Type I error ◦ Level of significance (  ) ◦ Typically 1-5% Type II error (  ) is seldom used Uncertainty Analysis for Engineers33

Example We need yield strength of rebar to be at least 38 psi We order sample of 25 rebars Sample mean from 25 tests is 37.5 psi Standard deviation of rebar strength =3 psi Use one-sided test Hypotheses: null-  =38; alt.-  <38 Uncertainty Analysis for Engineers34

Solution Uncertainty Analysis for Engineers35 So we cannot reject the null hypothesis and the supplier is considered acceptable

Variation of This Example Suppose standard deviation is not known Use student’s t-distribution Sample stand. dev. = 3.5 psi Uncertainty Analysis for Engineers36 So we cannot reject the null hypothesis and the supplier is considered acceptable

Third Variation Sample size increased to 41 Sample mean=37.6 psi Sample standard deviation = 3.75 psi Null-variance=9 Alternative-variance>9 Use Chi-Square distribution Uncertainty Analysis for Engineers37

Solution Uncertainty Analysis for Engineers38 So we reject the null hypothesis and the supplier is not acceptable

Confidence Intervals In addition to mean, standard deviation, etc., confidence intervals can help us characterize populations For example, the mean gives us a best estimate of the expected value of the population, but confidence intervals can help indicate the accuracy of the mean Confidence interval is defined as the range within which a parameter will lie – within a prescribed probability Uncertainty Analysis for Engineers39

CI of the Mean First, we’ll assume the variance is known The central limit theorem states that the pdf of the mean of n individual observations from any distribution with finite mean and variance approaches a normal distribution as n approaches infinity Uncertainty Analysis for Engineers40

CI of the Mean Uncertainty Analysis for Engineers41  Is CDF of standard normal variate

Example Measure strength of rebar 25 samples Mean=37.5 psi Standard deviation=3 psi Find 95% confidence interval for mean Uncertainty Analysis for Engineers42

Solution Uncertainty Analysis for Engineers43 So the mean of the strength falls between 36.3 and 38.7 with a 95% confidence level

The Script mu=37.5 sig=3 n=25 alpha=0.05 ka=-norminv(1-alpha/2) k1ma=-ka cil=mu+ka*sig/sqrt(n) ciu=mu-ka*sig/sqrt(n) Uncertainty Analysis for Engineers44

Variance Not Known What if the variance of the population (  ) is not known? That is, we only know variance of sample. Let s=standard deviation of sample We can show that does not conform to a normal distribution, especially for small n Uncertainty Analysis for Engineers45

Variance Not Known We can show that this quantity follows a Student’s t-distribution with n-1 degrees of freedom (f) Uncertainty Analysis for Engineers46

Example Measure strength of rebar 25 samples Mean=37.5 psi s=3.5 psi Find 95% confidence interval for mean Uncertainty Analysis for Engineers47

Script Result is 36.06, xbar=37.5; s=3.5; n=25; alpha=0.05; ka=-tinv(1-alpha/2,n-1); kb=-tinv(alpha/2,n-1); cil=xbar+ka*s/sqrt(n) ciu=xbar+kb*s/sqrt(n) Uncertainty Analysis for Engineers48

One-Sided Confidence Limit Sometimes we only care about the upper or lower bounds Lower Upper Uncertainty Analysis for Engineers49

Example 100 steel specimens – measure strength Mean=2200 kgf; s=220 kgf Specify 95% confidence limit of mean Assume  =s=220 kgf 1-  =0.95;  =0.05 Uncertainty Analysis for Engineers50 Manufacturer has 95% confidence that yield strength is at least 2164 kgf

Example Now only 15 steel specimen Mean=2200 kgf; s=220 kgf Specify 95% confidence limit of mean Uncertainty Analysis for Engineers51 Manufacturer has 95% confidence that yield strength is at least 2100 kgf

Confidence Interval of Variance Uncertainty Analysis for Engineers52

Example 25 storms, sample variance for measured runoff is 0.36 in 2 Find upper 95% confidence limit for variance So, we can say, with 95% confidence, that the upper bound of the variance of the runoff is in 2 and the upper bound of the standard deviation is 0.79 in Uncertainty Analysis for Engineers53

Script var=0.36 n=25 alpha=0.05 c=chi2inv(alpha,n-1) ci=1/c*var*(n-1) si=sqrt(ci) Uncertainty Analysis for Engineers54

Measurement Theory Suppose we are measuring distances d 1, d 2, …, d n are measured distances Distance estimate is Standard error is ◦ s=standard deviation of sample ◦ d is the expected value of the mean Uncertainty Analysis for Engineers55