STATISTICAL INFERENCE PART IV


STATISTICAL INFERENCE PART IV LOCATION AND SCALE PARAMETERS, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

LOCATION PARAMETER Let f(x) be any pdf. The family of pdfs f(x−θ), indexed by the parameter θ, is called the location family with standard pdf f(x), and θ is the location parameter for the family. Equivalently, θ is a location parameter for f(x) iff the distribution of X−θ does not depend on θ.

Example If X~N(θ,1), then X−θ~N(0,1) ⇒ the distribution is independent of θ ⇒ θ is a location parameter. If X~N(0,θ), then X−θ~N(−θ,θ) ⇒ the distribution is NOT independent of θ ⇒ θ is NOT a location parameter.

LOCATION PARAMETER Let X1,X2,…,Xn be a r.s. from a distribution with pdf (or pmf) f(x;θ), θ∈Ω. An estimator t(x1,…,xn) is defined to be location equivariant iff t(x1+c,…,xn+c) = t(x1,…,xn)+c for all values of x1,…,xn and any constant c. t(x1,…,xn) is location invariant iff t(x1+c,…,xn+c) = t(x1,…,xn) for all values of x1,…,xn and any constant c. Invariant = does not change.

Example Is the sample mean a location invariant or equivariant estimator? Let t(x1,…,xn) = x̄ = (x1+…+xn)/n. Then, t(x1+c,…,xn+c) = (x1+c+…+xn+c)/n = (x1+…+xn+nc)/n = x̄+c = t(x1,…,xn)+c ⇒ location equivariant.

Example Is s² a location invariant or equivariant estimator? Let t(x1,…,xn) = s² = Σ(xi−x̄)²/(n−1). Then, t(x1+c,…,xn+c) = Σ((xi+c)−(x̄+c))²/(n−1) = Σ(xi−x̄)²/(n−1) = t(x1,…,xn) ⇒ location invariant. (x1,…,xn) and (x1+c,…,xn+c) are located at different points on the real line, but the spread among the sample values is the same for both samples.
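Both facts are easy to check numerically. A minimal sketch in Python (the sample and the shift c are arbitrary illustrations):

```python
import numpy as np

x = np.array([2.0, 5.0, 7.0, 10.0])  # an arbitrary sample
c = 3.0                              # an arbitrary shift constant

# The sample mean is location equivariant: mean(x + c) = mean(x) + c
print(np.isclose(np.mean(x + c), np.mean(x) + c))            # True

# s^2 (divisor n-1) is location invariant: var(x + c) = var(x)
print(np.isclose(np.var(x + c, ddof=1), np.var(x, ddof=1)))  # True
```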

SCALE PARAMETER Let f(x) be any pdf. The family of pdfs f(x/σ)/σ for σ>0, indexed by the parameter σ, is called the scale family with standard pdf f(x), and σ is the scale parameter for the family. Equivalently, σ is a scale parameter for f(x) iff the distribution of X/σ does not depend on σ.

Example Let X~Exp(θ) and let Y=X/θ. You can show that f(y)=exp(−y) for y>0 ⇒ the distribution is free of θ ⇒ θ is a scale parameter.

SCALE PARAMETER Let X1,X2,…,Xn be a r.s. from a distribution with pdf (or pmf) f(x;θ), θ∈Ω. An estimator t(x1,…,xn) is defined to be scale equivariant iff t(cx1,…,cxn) = c·t(x1,…,xn) for all values of x1,…,xn and any constant c>0. t(x1,…,xn) is scale invariant iff t(cx1,…,cxn) = t(x1,…,xn).

Example Is the sample mean a scale invariant or equivariant estimator? Let t(x1,…,xn) = x̄. Then, t(cx1,…,cxn) = c(x1+…+xn)/n = c·x̄ = c·t(x1,…,xn) ⇒ scale equivariant.
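Again checkable numerically; a minimal sketch (the check on the sample standard deviation, which is also scale equivariant, is added here for contrast and is not on the slide):

```python
import numpy as np

x = np.array([2.0, 5.0, 7.0, 10.0])
c = 4.0  # any constant c > 0

# The sample mean is scale equivariant: mean(c*x) = c * mean(x)
print(np.isclose(np.mean(c * x), c * np.mean(x)))                # True

# The sample standard deviation is scale equivariant as well
print(np.isclose(np.std(c * x, ddof=1), c * np.std(x, ddof=1)))  # True
```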

LOCATION-SCALE PARAMETER Let f(x) be any pdf. The family of pdfs f((x−μ)/σ)/σ for σ>0, indexed by the parameter pair (μ,σ), is called the location-scale family with standard pdf f(x); μ is a location parameter and σ is the scale parameter for the family. Equivalently, μ is a location parameter and σ is a scale parameter for f(x) iff the distribution of (X−μ)/σ does not depend on μ and σ.

Example 1. X~N(μ,σ²). Then Y=(X−μ)/σ ~ N(0,1) ⇒ the distribution is independent of μ and σ ⇒ μ and σ are location and scale parameters. 2. X~Cauchy(θ,β). You can show that the p.d.f. of Y=(X−β)/θ is f(y) = 1/(π(1+y²)) ⇒ β and θ are location and scale parameters.

LOCATION-SCALE PARAMETER Let X1,X2,…,Xn be a r.s. from a distribution with pdf (or pmf) f(x;θ), θ∈Ω. An estimator t(x1,…,xn) is defined to be location-scale equivariant iff t(cx1+d,…,cxn+d) = c·t(x1,…,xn)+d for all values of x1,…,xn, any constant c>0 and any constant d. t(x1,…,xn) is location-scale invariant iff t(cx1+d,…,cxn+d) = t(x1,…,xn).

INTERVAL ESTIMATION Point estimation of θ: the inference is a guess of a single value as the value of θ; no measure of accuracy is attached to it. Interval estimation for θ: specify an interval in which the unknown parameter θ is likely to lie; the interval carries a measure of accuracy through its width (via the variance of the estimator).

INTERVAL ESTIMATION An interval with random end points is called a random interval. E.g., (X̄ − 1.96σ/√n, X̄ + 1.96σ/√n) is a random interval that contains the true value of μ with probability 0.95 (for a r.s. from N(μ,σ²) with σ known).

INTERVAL ESTIMATION An interval (l(x1,x2,…,xn), u(x1,x2,…,xn)) is called a 100γ% confidence interval (CI) for θ if P(l(X1,…,Xn) < θ < u(X1,…,Xn)) = γ, where 0<γ<1. The observed value l(x1,x2,…,xn) is a lower confidence limit and u(x1,x2,…,xn) is an upper confidence limit. The probability γ is called the confidence coefficient or the confidence level.

INTERVAL ESTIMATION If Pr(l(X1,…,Xn) ≤ θ) = γ, then l(x1,x2,…,xn) is called a one-sided lower 100γ% confidence limit for θ. If Pr(θ ≤ u(X1,…,Xn)) = γ, then u(x1,x2,…,xn) is called a one-sided upper 100γ% confidence limit for θ.

METHODS OF FINDING PIVOTAL QUANTITIES PIVOTAL QUANTITY METHOD: If Q = q(X1,…,Xn;θ) is a r.v. that is a function of only X1,…,Xn and θ, then Q is called a pivotal quantity if its distribution does not depend on θ or any other unknown parameters (nuisance parameters). Nuisance parameters: parameters that are not of direct interest.

PIVOTAL QUANTITY METHOD Theorem: Let X1,X2,…,Xn be a r.s. from a distribution with pdf f(x;θ) for θ∈Ω, and assume that an MLE (or sufficient statistic) θ̂ of θ exists: If θ is a location parameter, then Q = θ̂ − θ is a pivotal quantity. If θ is a scale parameter, then Q = θ̂/θ is a pivotal quantity. If θ1 and θ2 are location and scale parameters respectively, then (θ̂1 − θ1)/θ̂2 and θ̂2/θ2 are PQs for θ1 and θ2.

Note Example: If θ1 and θ2 are location and scale parameters respectively, then θ̂1 − θ1 is NOT a pivotal quantity for θ1 because its distribution is a function of θ2. A pivotal quantity for θ1 should be a function of only θ1 and the X's, and its distribution should be free of θ1 and θ2.

Example Let X1,…,Xn be a r.s. from Exp(θ). Then S = ΣXi is a sufficient statistic for θ, and θ is a scale parameter ⇒ S/θ is a pivotal quantity. So is 2S/θ, and using 2S/θ may be more convenient since 2S/θ ~ χ²(2n), whose percentiles are tabulated.
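Inverting Pr{q1 ≤ 2S/θ ≤ q2} = γ with χ²(2n) percentiles q1, q2 gives the interval (2S/q2, 2S/q1) for θ. A sketch of this computation (SciPy assumed available; the function name is illustrative):

```python
import numpy as np
from scipy import stats

def exp_scale_ci(x, gamma=0.95):
    """CI for the scale parameter theta of Exp(theta), via 2S/theta ~ chi2(2n)."""
    n, s = len(x), float(np.sum(x))
    q1 = stats.chi2.ppf((1 - gamma) / 2, df=2 * n)  # lower chi-square percentile
    q2 = stats.chi2.ppf((1 + gamma) / 2, df=2 * n)  # upper chi-square percentile
    return 2 * s / q2, 2 * s / q1                   # invert q1 <= 2S/theta <= q2

x = np.random.default_rng(0).exponential(scale=4.0, size=30)  # true theta = 4
print(exp_scale_ci(x))
```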

CONSTRUCTION OF CI USING PIVOTAL QUANTITIES If Q is a PQ for a parameter θ and if percentiles of Q, say q1 and q2, are available such that Pr{q1 ≤ Q ≤ q2} = γ, then for an observed sample x1,x2,…,xn, a 100γ% confidence region for θ is the set of θ that satisfy q1 ≤ q(x1,x2,…,xn;θ) ≤ q2.

EXAMPLE Let X1,X2,…,Xn be a r.s. from Exp(θ), θ>0. Find a 100γ% CI for θ. Interpret the result.

EXAMPLE Let X1,X2,…,Xn be a r.s. from N(μ,σ²). Find a 100γ% CI for μ and σ². Interpret the results.
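For reference, a sketch of the intervals these exercises produce, using the standard pivots (X̄−μ)/(s/√n) ~ t(n−1) for μ and (n−1)s²/σ² ~ χ²(n−1) for σ² (the function name is illustrative):

```python
import numpy as np
from scipy import stats

def normal_cis(x, gamma=0.95):
    n = len(x)
    xbar, s2 = np.mean(x), np.var(x, ddof=1)
    a = (1 - gamma) / 2
    t = stats.t.ppf(1 - a, df=n - 1)                            # t percentile for mu
    mu_ci = (xbar - t * np.sqrt(s2 / n), xbar + t * np.sqrt(s2 / n))
    var_ci = ((n - 1) * s2 / stats.chi2.ppf(1 - a, df=n - 1),   # chi-square
              (n - 1) * s2 / stats.chi2.ppf(a, df=n - 1))       # percentiles for sigma^2
    return mu_ci, var_ci

x = np.random.default_rng(1).normal(loc=10, scale=2, size=25)
print(normal_cis(x))
```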

APPROXIMATE CI USING CLT Let X1,X2,…,Xn be a r.s. from a non-normal population. By the CLT, (X̄−μ)/(σ/√n) is approximately N(0,1) for large n. The approximate 100(1−α)% random interval for μ: (X̄ − zα/2·σ/√n, X̄ + zα/2·σ/√n). The approximate 100(1−α)% CI for μ: (x̄ − zα/2·σ/√n, x̄ + zα/2·σ/√n).

APPROXIMATE CI USING CLT Usually σ is unknown. So the approximate 100(1−α)% CI for μ of a non-normal random sample uses s in place of σ: (x̄ − tα/2,n−1·s/√n, x̄ + tα/2,n−1·s/√n). When the sample size n ≥ 30, tα/2,n−1 ≈ zα/2, i.e. the t percentile is approximately that of N(0,1).
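A sketch of this large-sample interval (it assumes only a finite population variance; names and numbers are illustrative):

```python
import numpy as np
from scipy import stats

def clt_ci(x, alpha=0.05):
    """Approximate 100(1-alpha)% CI for the mean; no normality assumed."""
    n = len(x)
    half = stats.norm.ppf(1 - alpha / 2) * np.std(x, ddof=1) / np.sqrt(n)
    return np.mean(x) - half, np.mean(x) + half

x = np.random.default_rng(2).exponential(scale=3.0, size=200)  # skewed data, true mean 3
print(clt_ci(x))
```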

Interpretation [figure: many confidence intervals from repeated samples plotted against the unknown true value μ; most cover it, a few miss] 90% CI ⇒ expect 9 out of 10 intervals to cover the true μ.

Graphical Demonstration of the Confidence Interval for μ [figure: interval between the lower confidence limit and the upper confidence limit, with confidence level 1−α]

The Confidence Interval for μ (σ known) Example: Suppose that σ = 1.71 and n = 100. Use a 90% confidence level. Solution: The confidence interval is x̄ ± z0.05·σ/√n = x̄ ± 1.645(1.71/√100) = x̄ ± .28.

The Confidence Interval for μ (σ known) Recalculate the confidence interval for a 95% confidence level. Solution: x̄ ± z0.025·σ/√n = x̄ ± 1.96(1.71/√100) = x̄ ± .34. The .95 interval is wider than the .90 interval.

The Confidence Interval for μ (σ known) The width of the 90% confidence interval = 2(.28) = .56. The width of the 95% confidence interval = 2(.34) = .68. Because the 95% confidence interval is wider, it is more likely to include the value of μ. With a 95% confidence interval we allow ourselves a 5% error; with a 90% CI we allow for 10%.
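A quick numerical check of these widths (half-width = zα/2·σ/√n; note the slide's .68 doubles the half-width after rounding it to .34):

```python
import numpy as np
from scipy import stats

sigma, n = 1.71, 100
for conf in (0.90, 0.95):
    z = stats.norm.ppf(1 - (1 - conf) / 2)     # z0.05 = 1.645, z0.025 = 1.960
    half = z * sigma / np.sqrt(n)
    print(f"{conf:.0%}: half-width {half:.2f}, width {2 * half:.2f}")
# 90%: half-width 0.28, width 0.56
# 95%: half-width 0.34, width 0.67
```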

The Width of the Confidence Interval The width of the confidence interval is affected by: the population standard deviation (σ), the confidence level (1−α), and the sample size (n).

The Effect of σ on the Interval Width 90% confidence level. Suppose the standard deviation increases by 50%. To maintain a given level of confidence, a larger standard deviation requires a wider confidence interval.

The Effect of Changing the Confidence Level 90% → 95%: let us increase the confidence level from 90% to 95%. A larger confidence level produces a wider confidence interval.

The Effect of Changing the Sample Size 90% confidence level. Increasing the sample size decreases the width of the confidence interval while the confidence level remains unchanged.

Inference About the Population Mean when σ is Unknown The Student t Distribution [figure: standard normal density vs. Student t density]

Effect of the Degrees of Freedom on the t Density Function [figure: Student t densities with 2, 10, and 30 DF] The degrees of freedom (a function of the sample size) determine how spread out the distribution is compared to the normal distribution.

Finding t-scores Under a t-Distribution (t-tables)

DF   t.10    t.05    t.025    t.01     t.005
 1   3.078   6.314   12.706   31.821   63.657
 2   1.886   2.920    4.303    6.965    9.925
 3   1.638   2.353    3.182    4.541    5.841
 4   1.533   2.132    2.776    3.747    4.604
 5   1.476   2.015    2.571    3.365    4.032
 6   1.440   1.943    2.447    3.143    3.707
 7   1.415   1.895    2.365    2.998    3.499
 8   1.397   1.860    2.306    2.896    3.355
 9   1.383   1.833    2.262    2.821    3.250
10   1.372   1.812    2.228    2.764    3.169
11   1.363   1.796    2.201    2.718    3.106
12   1.356   1.782    2.179    2.681    3.055

E.g., t0.05, 10 = 1.812: the area to the right of 1.812 under the t(10) density is .05.
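These table entries can be reproduced with SciPy (a sketch):

```python
from scipy import stats

# t0.05, 10: the point with upper-tail area 0.05 under t with 10 DF
print(round(stats.t.ppf(1 - 0.05, df=10), 3))                          # 1.812

# Reproduce the t.025 column for DF = 1..12
print([round(stats.t.ppf(1 - 0.025, df=v), 3) for v in range(1, 13)])
```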

EXAMPLE A new breakfast cereal is test-marketed for 1 month at stores of a large supermarket chain. The results for a sample of 16 stores indicate average sales of $1200 with a sample standard deviation of $180. Set up a 99% confidence interval estimate of the true average sales of this new breakfast cereal. Assume normality.

ANSWER 99% CI for μ: x̄ ± t0.005, 15·s/√n = 1200 ± 2.9467(180/√16) = (1067.3985, 1332.6015). With 99% confidence, the limits 1067.3985 and 1332.6015 cover the true average sales of the new breakfast cereal.
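Verifying the arithmetic (a sketch; SciPy assumed):

```python
import numpy as np
from scipy import stats

n, xbar, s = 16, 1200.0, 180.0
t = stats.t.ppf(1 - 0.005, df=n - 1)   # t0.005, 15 ~ 2.9467
half = t * s / np.sqrt(n)
print(xbar - half, xbar + half)        # ~ 1067.40 1332.60
```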

Checking the required conditions We need to check that the population is normally distributed, or at least not extremely non-normal. Look at sample histograms, Q-Q plots, etc. There are also formal statistical tests for normality.

TESTS OF HYPOTHESIS A hypothesis is a statement about a population parameter. The goal of a hypothesis test is to decide which of two complementary hypotheses is true, based on a sample from the population.

TESTS OF HYPOTHESIS STATISTICAL TEST: the statistical procedure to draw an appropriate conclusion from sample data about a population parameter. HYPOTHESIS: any statement concerning an unknown population parameter. Aim of a statistical test: to test a hypothesis concerning the values of one or more population parameters.

NULL AND ALTERNATIVE HYPOTHESES NULL HYPOTHESIS = H0: e.g., a treatment has no effect, or there is no change compared with the previous situation. ALTERNATIVE HYPOTHESIS = HA: e.g., a treatment has a significant effect, or there is an improvement compared with the previous situation.

TESTS OF HYPOTHESIS Sample space, A: the set of all possible values of the sample (x1,x2,…,xn); (x1,x2,…,xn) ∈ A. Parameter space, Ω: the set of all possible values of the parameters. Ω = (parameter space of the null hypothesis) ∪ (parameter space of the alternative hypothesis) = Ω0 ∪ Ω1. H0: θ∈Ω0; H1: θ∈Ω1.

TESTS OF HYPOTHESIS Critical region, C: a subset of A which leads to rejection of H0. Reject H0 if (x1,x2,…,xn) ∈ C; do not reject H0 if (x1,x2,…,xn) ∈ C′. A test defines a critical region. A test is a rule which leads to a decision to reject or fail to reject H0 on the basis of the sample information.

TEST STATISTIC AND REJECTION REGION TEST STATISTIC: the sample statistic on which we base our decision to reject or not reject the null hypothesis. REJECTION REGION: the range of values such that, if the test statistic falls in that range, we will decide to reject the null hypothesis; otherwise, we will not reject the null hypothesis.

TESTS OF HYPOTHESIS If the hypothesis completely specifies the distribution, it is called a simple hypothesis; otherwise, it is a composite hypothesis. E.g., θ=(θ1,θ2): H0: θ1=3 → f(x;3,θ2); H1: θ1=5 → f(x;5,θ2). These are composite hypotheses if θ2 is unknown; if θ2 is known, they are simple hypotheses.

TESTS OF HYPOTHESIS

Decision            H0 is True               H0 is False
Reject H0           Type I error             Correct decision
                    P(Type I error) = α      (power = 1 − β)
Do not reject H0    Correct decision         Type II error
                    (probability 1 − α)      P(Type II error) = β

Tests are based on the following principle: fix α, minimize β. π(θ) = power function of the test for all θ∈Ω = P(Reject H0) = P((x1,x2,…,xn) ∈ C).
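To make α, β, and power concrete, a sketch for the one-sided z-test of H0: μ = μ0 against HA: μ > μ0 with σ known (all numbers are illustrative):

```python
import numpy as np
from scipy import stats

mu0, sigma, n, alpha = 50.0, 10.0, 25, 0.05
z_crit = stats.norm.ppf(1 - alpha)          # reject H0 when z > z_crit

def power(mu_true):
    """pi(mu_true) = P(reject H0) when the true mean is mu_true."""
    shift = (mu_true - mu0) / (sigma / np.sqrt(n))
    return 1 - stats.norm.cdf(z_crit - shift)

print(power(50.0))   # 0.05 = alpha: the Type I error rate at the null
print(power(55.0))   # ~0.80: power at mu = 55, so beta ~ 0.20 there
```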

TESTS OF HYPOTHESIS Type I error = rejecting H0 when H0 is true. Type II error = not rejecting H0 when H0 is false.

PROCEDURE OF STATISTICAL TEST 1. Determine H0 and HA. 2. Choose the best test statistic. 3. Decide on the rejection region (decision rule). 4. Conclusion.

HYPOTHESIS TEST FOR POPULATION MEAN, μ (σ KNOWN AND X~N(μ,σ²), OR LARGE SAMPLE CASE) Two-sided test: H0: μ = μ0 vs HA: μ ≠ μ0. Test statistic: z = (x̄ − μ0)/(σ/√n). Rejection region: reject H0 if z < −zα/2 or z > zα/2. [figure: N(0,1) density; rejection regions of area α/2 in each tail beyond ±zα/2, do-not-reject region of probability 1−α between them]
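A sketch of the two-sided test as a function (the numbers are illustrative):

```python
import numpy as np
from scipy import stats

def z_test_two_sided(xbar, mu0, sigma, n, alpha=0.05):
    z = (xbar - mu0) / (sigma / np.sqrt(n))           # test statistic
    return z, abs(z) > stats.norm.ppf(1 - alpha / 2)  # (statistic, reject H0?)

print(z_test_two_sided(xbar=52.1, mu0=50, sigma=10, n=64))
# (1.68, False): |1.68| < 1.96, so do not reject H0 at the 5% level
```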

HYPOTHESIS TEST FOR POPULATION MEAN, μ One-sided tests: 1. H0: μ = μ0 vs HA: μ > μ0. Test statistic: z = (x̄ − μ0)/(σ/√n). Reject H0 if z > zα. 2. H0: μ = μ0 vs HA: μ < μ0. Reject H0 if z < −zα. [figure: N(0,1) density with a single rejection region of area α in the upper tail (test 1) or the lower tail (test 2)]
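The one-sided versions only change the critical value and the direction of the comparison; a sketch, reusing the numbers above to show how the conclusion can differ from the two-sided test:

```python
import numpy as np
from scipy import stats

def z_test_upper(xbar, mu0, sigma, n, alpha=0.05):
    """H0: mu = mu0 vs HA: mu > mu0 -- reject for large z."""
    z = (xbar - mu0) / (sigma / np.sqrt(n))
    return z, z > stats.norm.ppf(1 - alpha)

print(z_test_upper(xbar=52.1, mu0=50, sigma=10, n=64))
# (1.68, True): 1.68 > z0.05 = 1.645, so reject H0 at the 5% level
```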