Chapter 4. Elements of Statistics


# Brief introduction to some concepts of statistics
# Descriptive statistics vs. inductive statistics (statistical inference)
# Classification of the field of statistics: i) Sampling theory ii) Estimation theory iii) Hypothesis testing iv) Curve fitting or regression v) Analysis of variance

4.2 Sampling Theory – The Sample Mean
How many samples are required for a given degree of confidence in the result?
# Terminology
- population: N (size of the population), very large or infinite
- (random) sample: n (size of the sample)
# One of the most important quantities is the sample mean. How close might the sample mean be to the average value of the population?

Let the sample have the numerical values $x_1, x_2, \ldots, x_n$. Then the sample mean is given by $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$. Note that we are interested in the statistical properties of arbitrary random samples rather than any particular sample. That is, the sample mean becomes a random variable. Therefore, it is appropriate to denote the sample mean as $\hat{\bar{X}} = \frac{1}{n}\sum_{i=1}^{n} X_i$, where the $X_i$ are random variables.

We want the mean value of the sample mean to be close to the true mean value of the population: $E[\hat{\bar{X}}] = \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \bar{X}$. That is, the mean value of the sample mean equals the true mean value of the population, so the sample mean is an unbiased estimate of the true mean. But this alone is not sufficient to indicate whether the sample mean is a good estimator of the true population mean.

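The variance of the sample mean? Assume $N \gg n$ (the population is much larger than the sample). $\mathrm{Var}(\hat{\bar{X}}) = E[(\hat{\bar{X}})^2] - (E[\hat{\bar{X}}])^2$ (the mean square minus the square of the mean) $= \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} E[X_i X_j] - \bar{X}^2$

$X_i, X_j$ ($i \ne j$): statistically independent, so $E[X_i X_j] = \bar{X}^2$ for $i \ne j$ and $E[X_i^2] = \overline{X^2}$. Hence $\mathrm{Var}(\hat{\bar{X}}) = \frac{1}{n^2}\left[n\,\overline{X^2} + (n^2 - n)\,\bar{X}^2\right] - \bar{X}^2 = \frac{\overline{X^2} - \bar{X}^2}{n} = \frac{\sigma^2}{n}$ (4-4) (important!)

where $\sigma^2$ is the true variance of the population. As $n \to \infty$, the variance $\to 0$, which means that large sample sizes lead to a better estimate. * Note: 1) This result holds for $N \to \infty$, or for a population of size N sampled with replacement.

2) When N is not large and sampling is done without replacement: $\mathrm{Var}(\hat{\bar{X}}) = \frac{\sigma^2}{n}\cdot\frac{N-n}{N-1}$. As $N \to \infty$ this reduces to $\sigma^2/n$; for $N = n$ it is 0 (the sample is the whole population!). Two examples: pp. 163-165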
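The two variance results for the sample mean can be checked numerically. The sketch below is only an illustration: the five-element population and the helper name `var_of_sample_mean` are assumptions made for the example, not part of the slides. It enumerates every sample of size n drawn without replacement and compares the brute-force variance of the sample mean with the finite-population formula.

```python
from itertools import combinations

def var_of_sample_mean(sigma2, n, N=None):
    """Variance of the sample mean for sample size n.

    N=None models a very large population (or sampling with replacement);
    otherwise N is the finite population size, sampled without replacement.
    """
    if N is None:
        return sigma2 / n
    return (sigma2 / n) * (N - n) / (N - 1)

# Hypothetical toy population, chosen only to keep the enumeration small.
population = [1, 2, 3, 4, 5]
N, n = len(population), 2
mu = sum(population) / N
sigma2 = sum((x - mu) ** 2 for x in population) / N  # true population variance

# Enumerate every possible sample of size n drawn without replacement.
means = [sum(s) / n for s in combinations(population, n)]
brute_force = sum((m - mu) ** 2 for m in means) / len(means)

formula = var_of_sample_mean(sigma2, n, N)
print(brute_force, formula)
```

Calling `var_of_sample_mean(sigma2, n)` without `N` gives the large-population value, and setting `N = n` gives zero, matching the two limiting cases noted above.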

4.3 Sampling Theory – The Sample Variance
The population variance is needed for determining the sample size required to achieve a desired variance of the sample mean (see eq. 4-4). Definition (sample variance): $S^2 = \frac{1}{n}\sum_{i=1}^{n}\left(X_i - \hat{\bar{X}}\right)^2$. The expected value of the sample variance can be derived easily: $E[S^2] = \frac{n-1}{n}\sigma^2$, which is not the true variance. That is, $S^2$ is a biased estimate rather than an unbiased one.

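Now we redefine the sample variance to obtain an unbiased estimate of the population variance: $\tilde{S}^2 = \frac{n}{n-1}S^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \hat{\bar{X}}\right)^2$, so that $E[\tilde{S}^2] = \sigma^2$. Note that these results hold for very large N, that is, $N = \infty$. What happens when the population size is not large?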

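# When N is not large, the expected value of $S^2$ is given by $E[S^2] = \frac{N}{N-1}\cdot\frac{n-1}{n}\,\sigma^2$. To obtain an unbiased estimate, we redefine $\tilde{S}^2 = \frac{N-1}{N}\cdot\frac{n}{n-1}\,S^2$.
# The variance of the estimates of the variance (large N): the variance of $S^2$ is $\mathrm{Var}(S^2) = \frac{\mu_4 - \sigma^4}{n}$, and the variance of $\tilde{S}^2$ is $\mathrm{Var}(\tilde{S}^2) = \frac{n(\mu_4 - \sigma^4)}{(n-1)^2}$, where $\mu_4$ is the 4th central moment of the population. (These expressions are accurate for large n.)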
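The bias of the sample variance, and the correction factors for both the large-population and the finite-population case, can be verified by exhaustive enumeration. This is a sketch under stated assumptions: the toy population and the function name `biased_var` are hypothetical, and sampling with replacement is used here to stand in for a very large population.

```python
from itertools import combinations, product

def biased_var(sample):
    """Sample variance with the 1/n normalization (the biased estimate S^2)."""
    n = len(sample)
    m = sum(sample) / n
    return sum((x - m) ** 2 for x in sample) / n

# Hypothetical toy population used only for the enumeration.
population = [1, 2, 3, 4, 5]
N, n = len(population), 2
mu = sum(population) / N
sigma2 = sum((x - mu) ** 2 for x in population) / N  # true variance

# Sampling with replacement behaves like an infinite population.
with_repl = [biased_var(s) for s in product(population, repeat=n)]
mean_S2_inf = sum(with_repl) / len(with_repl)        # equals (n-1)/n * sigma2

# Sampling without replacement from the finite population.
without_repl = [biased_var(s) for s in combinations(population, n)]
mean_S2_fin = sum(without_repl) / len(without_repl)  # equals N/(N-1) * (n-1)/n * sigma2

# Apply the two correction factors to recover the true variance.
unbiased_inf = n / (n - 1) * mean_S2_inf
unbiased_fin = (N - 1) / N * n / (n - 1) * mean_S2_fin
print(sigma2, mean_S2_inf, unbiased_inf, mean_S2_fin, unbiased_fin)
```

Both corrected averages agree with the true population variance, while the uncorrected averages do not.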

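4.4 Sampling Distributions & Confidence Intervals
What is the probability that the estimates are within specified bounds? To answer this we need the p.d.f. of the estimate; here we consider the sample mean. Normalized sample mean: $Z = \frac{\hat{\bar{X}} - \bar{X}}{\sigma/\sqrt{n}}$. If the $X_i$ are Gaussian and independent => Z is Gaussian (0, 1).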

If the $X_i$ are not Gaussian, then as $n \to \infty$, Z is asymptotically Gaussian by the central limit theorem ($n \ge 30$ is a rule of thumb). H.W.) Solve the problems in Chap. 4: 4-2.1, 4-2.5, 4-3.1, 4-4.1, 4-5.1, 4-6.1

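When the population variance is unknown and $\sigma$ is replaced by its sample estimate, the normalized sample mean becomes $T = \frac{\hat{\bar{X}} - \bar{X}}{\tilde{S}/\sqrt{n}} = \frac{\hat{\bar{X}} - \bar{X}}{S/\sqrt{n-1}}$. No longer Gaussian => Student's t distribution with n-1 degrees of freedom.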

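p.d.f. of the Student's t distribution with $\nu$ degrees of freedom: $f_T(t) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\pi\nu}\,\Gamma\left(\frac{\nu}{2}\right)}\left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}}$, where $\Gamma(\cdot)$ is the gamma function. The t distribution has heavier tails than the Gaussian, and for $n \ge 30$ it is close to the Gaussian. For any k, $\Gamma(k+1) = k\,\Gamma(k)$; for integer k, $\Gamma(k+1) = k!$.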
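The heavier tails of the Student's t density, and its approach to the Gaussian for many degrees of freedom, can be seen by evaluating both densities directly. The following is a minimal sketch using the standard form of the t density; the function names are illustrative assumptions.

```python
from math import exp, gamma, pi, sqrt

def student_t_pdf(t, nu):
    """Student's t density with nu degrees of freedom (standard form)."""
    coeff = gamma((nu + 1) / 2) / (sqrt(pi * nu) * gamma(nu / 2))
    return coeff * (1 + t * t / nu) ** (-(nu + 1) / 2)

def gaussian_pdf(z):
    """Zero-mean, unit-variance Gaussian density."""
    return exp(-z * z / 2) / sqrt(2 * pi)

# Compare the tails at |t| = 3 for a small sample (nu = 8) and a large one.
tail_t_small = student_t_pdf(3.0, 8)
tail_t_large = student_t_pdf(3.0, 100)
tail_gauss = gaussian_pdf(3.0)
print(tail_t_small, tail_t_large, tail_gauss)
```

With 8 degrees of freedom the density at 3 is noticeably larger than the Gaussian value; with 100 degrees of freedom the two are nearly the same.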

Confidence interval? An interval estimate (as opposed to a point estimate). The q-percent confidence interval is the interval within which the estimate lies with probability q/100.

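$\bar{X} - \frac{k\sigma}{\sqrt{n}} \le \hat{\bar{X}} \le \bar{X} + \frac{k\sigma}{\sqrt{n}}$, where k is determined by q and the p.d.f. of the estimate (a larger q requires a larger k).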

e.g.) For a Gaussian sample mean, q = 95% -> k = 1.96 (q = 99% -> k = 2.58, a wider interval!)

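k is obtained from q through the probability distribution function (PDF) F of the normalized estimate: $\frac{q}{100} = F(k) - F(-k)$. For small samples, use the probability distribution function of the Student's t distribution (see Appendix F, or Table 4-2 on page 172 for $\nu = 8$).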
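For the Gaussian case, the constant k can be computed rather than read from a table. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper names `gaussian_k` and `confidence_interval`, and the numbers passed to them, are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

def gaussian_k(q_percent):
    """Half-width multiplier k such that P(-k <= Z <= k) = q/100 for Gaussian Z."""
    p = q_percent / 100
    # F(k) - F(-k) = p  and  F(-k) = 1 - F(k)  give  F(k) = (1 + p) / 2.
    return NormalDist().inv_cdf((1 + p) / 2)

def confidence_interval(center, sigma, n, q_percent):
    """Interval center -/+ k*sigma/sqrt(n) for a Gaussian sample mean."""
    half_width = gaussian_k(q_percent) * sigma / sqrt(n)
    return center - half_width, center + half_width

k95 = gaussian_k(95)
k99 = gaussian_k(99)
# Hypothetical numbers: center 10, sigma 2, n = 100.
low, high = confidence_interval(10.0, 2.0, 100, 95)
print(k95, k99, low, high)
```

The standard library has no Student's t distribution, so for small samples the value of k still has to come from a table such as the one cited above.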

4.5 Hypothesis Testing
The question arises: how does one decide to accept or reject a given hypothesis when the sample size and the confidence level are specified?

Two steps: i) make some hypothesis about the population; ii) determine whether the observed sample confirms or rejects this hypothesis.

Two tests: one-sided or two-sided. e.g.) One-sided: the average lifetime of a light bulb is >= 1000 hours. Two-sided: 100-ohm resistors, whose values may be too high or too low.

One-sided test. e.g.) A capacitor manufacturer claims a mean breakdown voltage >= 300 V. A sample of 100 capacitors is tested, and a 99% confidence level is used. Q) Is the manufacturer's claim valid? A) We would reject the hypothesis!

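Normalized r.v.: $Z = \frac{\hat{\bar{X}} - 300}{\tilde{S}/\sqrt{n}}$. At the 99% confidence level, the hypothesis is rejected if Z falls below the one-sided critical value $z_c = -2.33$.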

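At the 99.5% confidence level ($z_c = -2.58$): accept the hypothesis. With a higher confidence level, rejection is less likely; the lower confidence level is the more severe requirement.

Level of significance = 100% - confidence level. A larger level of significance is a more severe test!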

e.g.) Sample size = 9: the normalized sample mean is no longer Gaussian -> Student's t distribution with $\nu = n - 1 = 8$ d.o.f. At the 99% level ($t_c = -2.896$): accept the hypothesis.

With a small sample size, the t distribution has heavier tails, so T is more likely to exceed the critical value and the hypothesis is more likely to be accepted. Small-sample tests are therefore less reliable (less severe) than large-sample tests.
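The one-sided test above can be sketched in a few lines. The slides' measured values were lost in extraction, so the sample mean (290 V) and sample standard deviation (40 V) below are illustrative assumptions, chosen only because they reproduce the accept/reject pattern described above; the t critical value for 8 degrees of freedom is the standard table entry.

```python
from math import sqrt
from statistics import NormalDist

def one_sided_lower_test(sample_mean, claimed_mean, s, n, crit):
    """Test the claim 'true mean >= claimed_mean'.

    Returns the normalized statistic and True if the claim is accepted,
    i.e. the statistic does not fall below the (negative) critical value.
    """
    stat = (sample_mean - claimed_mean) / (s / sqrt(n))
    return stat, stat >= crit

# Assumed measurements (not from the slides): mean 290 V, std 40 V.
mean_v, std_v, claim_v = 290.0, 40.0, 300.0

z_crit_99 = NormalDist().inv_cdf(0.01)    # about -2.33
z_crit_995 = NormalDist().inv_cdf(0.005)  # about -2.58
T_CRIT_99_NU8 = -2.896                    # t table, 8 d.o.f., 99% one-sided

z, accept_99 = one_sided_lower_test(mean_v, claim_v, std_v, 100, z_crit_99)
_, accept_995 = one_sided_lower_test(mean_v, claim_v, std_v, 100, z_crit_995)
t, accept_small = one_sided_lower_test(mean_v, claim_v, std_v, 9, T_CRIT_99_NU8)
print(z, accept_99, accept_995, t, accept_small)
```

With these assumed numbers the claim is rejected at 99% for n = 100, accepted at 99.5%, and accepted at 99% for n = 9, which illustrates why the small-sample test is the less severe one.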

Two-sided test. e.g.) A manufacturer of Zener diodes claims that the true mean breakdown voltage = 10 V. Hypothesis: the true mean is 10 V. Accept or reject? 100 samples -> 95% confidence level.

A) Rejected! Z is outside the interval $(-1.96, 1.96)$.

e.g.) 9 samples: T is inside the interval $(-2.306, 2.306)$ for $\nu = 8$, so the hypothesis is accepted! This is less severe than a large-sample test.
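A two-sided test differs from the one-sided case only in comparing the magnitude of the statistic with the critical value. In the sketch below the sample mean (10.3 V) and sample standard deviation (1.2 V) are illustrative assumptions, since the slides' measured values did not survive extraction; the t critical value for 8 degrees of freedom is the standard table entry.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_test(sample_mean, claimed_mean, s, n, crit):
    """Test the claim 'true mean == claimed_mean'.

    Returns the normalized statistic and True if it lies inside
    the acceptance interval (-crit, +crit).
    """
    stat = (sample_mean - claimed_mean) / (s / sqrt(n))
    return stat, abs(stat) <= crit

# Assumed measurements (not from the slides): mean 10.3 V, std 1.2 V.
mean_v, std_v, claim_v = 10.3, 1.2, 10.0

z_crit_95 = NormalDist().inv_cdf(0.975)  # about 1.96 (2.5% in each tail)
T_CRIT_95_NU8 = 2.306                    # t table, 8 d.o.f., 95% two-sided

z, accept_large = two_sided_test(mean_v, claim_v, std_v, 100, z_crit_95)
t, accept_small = two_sided_test(mean_v, claim_v, std_v, 9, T_CRIT_95_NU8)
print(z, accept_large, t, accept_small)
```

With these assumed numbers the large-sample test rejects the claim while the nine-sample test accepts it.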

4.6 Curve Fitting and Linear Regression
Given paired data (x, y): 1) find a (linear) relationship between x and y, or 2) determine whether x and y are related at all (correlation analysis).

- Scatter diagram: a plot of the data points $(x_i, y_i)$, n samples

- Curve fitting: finding a mathematical relationship between x and y. Regression curve (equation): the resulting curve.

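- What is the best fit? In a least-squares sense. Let $e_i = y_i - \hat{y}(x_i)$ be the errors between the regression curve and the scatter diagram; choose the curve so that $\sum_{i=1}^{n} e_i^2$ is a minimum. The type of equation to be fitted to the data (for example, its order) must be chosen; the fitted curve smooths the data.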

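Linear regression: $y = a + bx$. a, b = ?

Minimizing $\sum_{i=1}^{n}(y_i - a - b x_i)^2$ with respect to a and b gives $b = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}$ and $a = \frac{\sum y_i - b\sum x_i}{n}$.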

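In MATLAB, use the built-in function p = polyfit(x, y, n), which returns the coefficients (highest power first) of the order-n polynomial that fits the data in a least-squares sense.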
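For the first-order case, the least-squares coefficients of $y = a + bx$ have a closed form, $b = \frac{n\sum x_i y_i - \sum x_i\sum y_i}{n\sum x_i^2 - (\sum x_i)^2}$ and $a = \frac{\sum y_i - b\sum x_i}{n}$, which can be sketched in plain Python. The function name `linear_fit` and the data points are hypothetical, chosen only for illustration.

```python
def linear_fit(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)  # slope
    a = (sy - b * sx) / n                          # intercept
    return a, b

# Hypothetical scatter data lying roughly along y = 1 + 2x.
xs = [0, 1, 2, 3, 4]
ys = [1.0, 3.1, 4.9, 7.2, 8.8]
a, b = linear_fit(xs, ys)

# Sum of squared errors between the data and the fitted line.
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
print(a, b, sse)
```

Note the ordering difference from the MATLAB call: `linear_fit` returns the intercept first, whereas `polyfit` returns the highest-power coefficient first.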

A second-order regression ( p.180, 4-3, 4-6)

4.7 Correlation between Two Sets of Data
Are two data sets correlated or not?

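Linear correlation coefficient (Pearson's r): $r = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{\sqrt{\left[n\sum x_i^2 - \left(\sum x_i\right)^2\right]\left[n\sum y_i^2 - \left(\sum y_i\right)^2\right]}}$, with $-1 \le r \le 1$. Usage: useful in determining the sources of errors. e.g.) A point-to-point digital communication link: the BER (bit error rate) measures link quality, and the BER may fluctuate randomly due to wind. Q) Is wind the error source? Take 20 measurements of wind velocity and the resulting BER and apply the correlation test: r = 0.891, so yes!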
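Pearson's r can be computed directly from its usual definition in terms of sums of products. The sketch below is illustrative only: the function name `pearson_r` and the five data pairs are assumptions, and the data are not the wind/BER measurements from the example (which are not reproduced in the slides).

```python
from math import sqrt

def pearson_r(xs, ys):
    """Linear correlation coefficient between two equal-length data sets."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    num = n * sxy - sx * sy
    den = sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
    return num / den

# Hypothetical paired measurements (not the slides' data).
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
r = pearson_r(xs, ys)
print(r)
```

A value of r near +1 or -1 indicates a strong linear relationship, while a value near 0 indicates little or no linear relationship.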