The two sample problem.


Univariate Inference Let $x_1, x_2, \dots, x_n$ denote a sample of $n$ from the normal distribution with mean $\mu_x$ and variance $\sigma^2$. Let $y_1, y_2, \dots, y_m$ denote a sample of $m$ from the normal distribution with mean $\mu_y$ and the same variance $\sigma^2$. Suppose we want to test $H_0: \mu_x = \mu_y$ vs $H_A: \mu_x \ne \mu_y$.

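The appropriate test is the t test. The test statistic:

$$t = \frac{\bar{x} - \bar{y}}{s_{Pooled}\sqrt{\frac{1}{n} + \frac{1}{m}}}, \qquad s_{Pooled}^2 = \frac{(n-1)s_x^2 + (m-1)s_y^2}{n + m - 2}$$

Reject $H_0$ if $|t| > t_{\alpha/2}$, with d.f. $= n + m - 2$.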
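The pooled t statistic can be sketched in a few lines of standard-library Python. This is only an illustration: the function name `pooled_t` and the sample values are made up, and the critical value $t_{\alpha/2}$ still has to be looked up in a table or taken from a statistics package.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(x, y):
    """Return (t, df) for the equal-variance two-sample t test."""
    n, m = len(x), len(y)
    # Pooled variance: the two sample variances weighted by their d.f.
    sp2 = ((n - 1) * variance(x) + (m - 1) * variance(y)) / (n + m - 2)
    t = (mean(x) - mean(y)) / sqrt(sp2 * (1 / n + 1 / m))
    return t, n + m - 2

# Made-up illustrative samples (not data from the slides).
x = [5.1, 4.9, 5.6, 5.8, 6.0]
y = [4.2, 4.8, 4.4, 5.0]
t, df = pooled_t(x, y)
print(f"t = {t:.3f} on {df} d.f.")
```

Reject $H_0$ when the printed |t| exceeds the tabulated $t_{\alpha/2}$ for the printed degrees of freedom.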

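The multivariate test Let $\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_n$ denote a sample of $n$ from the p-variate normal distribution with mean vector $\boldsymbol{\mu}_x$ and covariance matrix $\Sigma$. Let $\mathbf{y}_1, \mathbf{y}_2, \dots, \mathbf{y}_m$ denote a sample of $m$ from the p-variate normal distribution with mean vector $\boldsymbol{\mu}_y$ and the same covariance matrix $\Sigma$. Suppose we want to test $H_0: \boldsymbol{\mu}_x = \boldsymbol{\mu}_y$ vs $H_A: \boldsymbol{\mu}_x \ne \boldsymbol{\mu}_y$.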

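Hotelling's $T^2$ statistic for the two sample problem

$$T^2 = \frac{nm}{n+m}\,(\bar{\mathbf{x}} - \bar{\mathbf{y}})'\, S_{Pooled}^{-1}\, (\bar{\mathbf{x}} - \bar{\mathbf{y}}), \qquad S_{Pooled} = \frac{(n-1)S_x + (m-1)S_y}{n + m - 2}$$

If $H_0$ is true then

$$F = \frac{n + m - p - 1}{p\,(n + m - 2)}\, T^2$$

has an F distribution with $\nu_1 = p$ and $\nu_2 = n + m - p - 1$ degrees of freedom.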

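Thus Hotelling's $T^2$ test: we reject $H_0: \boldsymbol{\mu}_x = \boldsymbol{\mu}_y$ if

$$F = \frac{n + m - p - 1}{p\,(n + m - 2)}\, T^2 > F_\alpha(p,\ n + m - p - 1)$$

where $T^2$ is Hotelling's statistic for the two sample problem.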
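A minimal sketch of the two-sample Hotelling computation, restricted to the bivariate case ($p = 2$) so that the pooled covariance matrix can be inverted in closed form without external libraries. The helper names `cov2` and `hotelling_t2_bivariate` and the data are illustrative assumptions; the critical value $F_\alpha(p, n+m-p-1)$ is not computed here and must come from a table or a statistics package.

```python
from statistics import mean

def cov2(rows):
    """Mean vector and covariance entries (divisor n-1) of bivariate rows."""
    n = len(rows)
    m1 = mean(r[0] for r in rows)
    m2 = mean(r[1] for r in rows)
    s11 = sum((r[0] - m1) ** 2 for r in rows) / (n - 1)
    s22 = sum((r[1] - m2) ** 2 for r in rows) / (n - 1)
    s12 = sum((r[0] - m1) * (r[1] - m2) for r in rows) / (n - 1)
    return (m1, m2), (s11, s12, s22)

def hotelling_t2_bivariate(xs, ys):
    """Return (T2, F, (df1, df2)) for two bivariate samples."""
    n, m, p = len(xs), len(ys), 2
    (x1, x2), (a11, a12, a22) = cov2(xs)
    (y1, y2), (b11, b12, b22) = cov2(ys)
    w = n + m - 2
    # Pooled covariance matrix entries.
    s11 = ((n - 1) * a11 + (m - 1) * b11) / w
    s12 = ((n - 1) * a12 + (m - 1) * b12) / w
    s22 = ((n - 1) * a22 + (m - 1) * b22) / w
    d1, d2 = x1 - y1, x2 - y2
    det = s11 * s22 - s12 ** 2
    # Quadratic form d' S^{-1} d using the closed-form 2x2 inverse.
    quad = (s22 * d1 * d1 - 2 * s12 * d1 * d2 + s11 * d2 * d2) / det
    t2 = n * m / (n + m) * quad
    f = (n + m - p - 1) / (p * (n + m - 2)) * t2
    return t2, f, (p, n + m - p - 1)

# Made-up illustrative samples (not data from the slides).
xs = [(0, 0), (2, 0), (0, 2), (2, 2)]
ys = [(3, 1), (5, 1), (3, 3), (5, 3)]
t2, f, df = hotelling_t2_bivariate(xs, ys)
print(f"T2 = {t2:.3f}, F = {f:.3f}, d.f. = {df}")
```

For general $p$ one would normally solve the linear system with a numerical library rather than write out the inverse by hand.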

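Simultaneous inference for the two-sample problem Hotelling's $T^2$ statistic can be derived using Roy's union-intersection principle:

$$T^2 = \max_{\mathbf{a} \ne \mathbf{0}} t^2(\mathbf{a}), \qquad t(\mathbf{a}) = \frac{\mathbf{a}'(\bar{\mathbf{x}} - \bar{\mathbf{y}})}{\sqrt{\left(\frac{1}{n} + \frac{1}{m}\right)\mathbf{a}' S_{Pooled}\, \mathbf{a}}}$$

where $t(\mathbf{a})$ is the univariate two-sample t statistic computed from the linear combinations $\mathbf{a}'\mathbf{x}_i$ and $\mathbf{a}'\mathbf{y}_j$, and $S_{Pooled}$ is the pooled sample covariance matrix.

Thus, writing $\boldsymbol{\delta} = \boldsymbol{\mu}_x - \boldsymbol{\mu}_y$ and $\mathbf{d} = \bar{\mathbf{x}} - \bar{\mathbf{y}}$,

$$P\left[\frac{nm}{n+m}\,(\mathbf{d} - \boldsymbol{\delta})'\, S_{Pooled}^{-1}\, (\mathbf{d} - \boldsymbol{\delta}) \le T_\alpha^2\right] = 1 - \alpha, \qquad T_\alpha^2 = \frac{p\,(n + m - 2)}{n + m - p - 1}\, F_\alpha(p,\ n + m - p - 1)$$

Thus

$$P\left[\max_{\mathbf{a} \ne \mathbf{0}} \frac{\left[\mathbf{a}'(\mathbf{d} - \boldsymbol{\delta})\right]^2}{\left(\frac{1}{n} + \frac{1}{m}\right)\mathbf{a}' S_{Pooled}\, \mathbf{a}} \le T_\alpha^2\right] = 1 - \alpha$$

Thus

$$P\left[\left|\mathbf{a}'\mathbf{d} - \mathbf{a}'\boldsymbol{\delta}\right| \le \sqrt{T_\alpha^2 \left(\tfrac{1}{n} + \tfrac{1}{m}\right)\mathbf{a}' S_{Pooled}\, \mathbf{a}}\ \text{ for all } \mathbf{a}\right] = 1 - \alpha$$

Hence the intervals

$$\mathbf{a}'(\bar{\mathbf{x}} - \bar{\mathbf{y}}) \pm \sqrt{T_\alpha^2 \left(\tfrac{1}{n} + \tfrac{1}{m}\right)\mathbf{a}' S_{Pooled}\, \mathbf{a}}$$

form $1 - \alpha$ simultaneous confidence intervals for $\mathbf{a}'(\boldsymbol{\mu}_x - \boldsymbol{\mu}_y)$.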
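As a rough sketch, a simultaneous interval for one linear combination $\mathbf{a}'(\boldsymbol{\mu}_x - \boldsymbol{\mu}_y)$ takes the usual form: centre $\mathbf{a}'(\bar{\mathbf{x}} - \bar{\mathbf{y}})$ plus or minus $\sqrt{T_\alpha^2 (1/n + 1/m)\, \mathbf{a}' S_{Pooled}\, \mathbf{a}}$, with $T_\alpha^2 = \frac{p(n+m-2)}{n+m-p-1} F_\alpha(p, n+m-p-1)$. The function name `simultaneous_ci` and the inputs are hypothetical, and the caller supplies the F critical value (`f_crit`) from a table, because the standard library has no F quantile function.

```python
from math import sqrt

def simultaneous_ci(a, d, s_pooled, n, m, f_crit):
    """Interval for a'(mu_x - mu_y).

    a        : coefficients of the linear combination
    d        : difference of sample mean vectors (xbar - ybar)
    s_pooled : pooled covariance matrix as a list of rows
    f_crit   : upper-alpha point of F(p, n+m-p-1), supplied by the caller
    """
    p = len(a)
    t2_crit = p * (n + m - 2) / (n + m - p - 1) * f_crit
    center = sum(ai * di for ai, di in zip(a, d))
    # Quadratic form a' S a.
    quad = sum(a[i] * s_pooled[i][j] * a[j]
               for i in range(p) for j in range(p))
    half = sqrt(t2_crit * (1 / n + 1 / m) * quad)
    return center - half, center + half

# Made-up inputs; 5.79 is roughly the tabulated F_0.05(2, 5).
d = (-3.0, -1.0)
s = [[4 / 3, 0.0], [0.0, 4 / 3]]
for a in [(1, 0), (0, 1), (1, -1)]:
    lo, hi = simultaneous_ci(a, d, s, n=4, m=4, f_crit=5.79)
    print(a, f"({lo:.3f}, {hi:.3f})")
```

Because the same critical constant covers every choice of `a`, the intervals printed in the loop hold jointly at level $1 - \alpha$, at the cost of being wider than individual t intervals.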

Example Annual financial data are collected for firms approximately 2 years prior to bankruptcy and for financially sound firms at about the same point in time. Four variables are recorded: $x_1$ = CF/TD = (cash flow)/(total debt), $x_2$ = NI/TA = (net income)/(total assets), $x_3$ = CA/CL = (current assets)/(current liabilities), and $x_4$ = CA/NS = (current assets)/(net sales).

The data are given in the following table: [table not reproduced in this transcript]

Hotelling's $T^2$ test: a graphical explanation

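Hotelling's $T^2$ statistic for the two sample problem

$$T^2 = \frac{nm}{n+m}\,(\bar{\mathbf{x}} - \bar{\mathbf{y}})'\, S_{Pooled}^{-1}\, (\bar{\mathbf{x}} - \bar{\mathbf{y}})$$

is the test statistic for testing $H_0: \boldsymbol{\mu}_x = \boldsymbol{\mu}_y$ vs $H_A: \boldsymbol{\mu}_x \ne \boldsymbol{\mu}_y$.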

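[Figure: Hotelling's $T^2$ test. Populations A and B plotted on axes $X_1$ and $X_2$.]

[Figure: Univariate test for $X_1$. Populations A and B plotted on axes $X_1$ and $X_2$.]

[Figure: Univariate test for $X_2$. Populations A and B plotted on axes $X_1$ and $X_2$.]

[Figure: Univariate test for $a_1X_1 + a_2X_2$. Populations A and B plotted on axes $X_1$ and $X_2$.]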
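The relationship between the univariate tests and Hotelling's test can be checked numerically. For any direction $(a_1, a_2)$ the squared t statistic of $a_1X_1 + a_2X_2$ is at most $T^2$, with equality when the direction is proportional to $S_{Pooled}^{-1}(\bar{\mathbf{x}} - \bar{\mathbf{y}})$, a standard consequence of the Cauchy-Schwarz inequality. The sketch below assumes $p = 2$; the function names and the summary statistics are made up for illustration.

```python
def t2_along(a, d, s, n, m):
    """Squared two-sample t statistic for the combination a1*X1 + a2*X2."""
    num = (a[0] * d[0] + a[1] * d[1]) ** 2
    var = (a[0] * a[0] * s[0][0] + 2 * a[0] * a[1] * s[0][1]
           + a[1] * a[1] * s[1][1])
    return num / ((1 / n + 1 / m) * var)

def best_direction(d, s):
    """Direction proportional to S^{-1} d (closed-form 2x2 inverse)."""
    det = s[0][0] * s[1][1] - s[0][1] ** 2
    return ((s[1][1] * d[0] - s[0][1] * d[1]) / det,
            (s[0][0] * d[1] - s[0][1] * d[0]) / det)

# Made-up summary statistics: strongly correlated variables whose
# means differ in opposite directions.
d = (1.0, -1.0)                      # xbar - ybar
s = [[1.0, 0.9], [0.9, 1.0]]         # pooled covariance matrix
n = m = 10

print("X1 alone:", t2_along((1, 0), d, s, n, m))
print("X2 alone:", t2_along((0, 1), d, s, n, m))
a_star = best_direction(d, s)
print("best combination:", t2_along(a_star, d, s, n, m))
```

With these inputs each variable on its own gives a modest squared t value, while the best linear combination gives a much larger one, which is the situation the figures illustrate.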

Mahalanobis distance: a graphical explanation

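Euclidean distance

$$d_E(\mathbf{x}, \mathbf{y}) = \sqrt{(\mathbf{x} - \mathbf{y})'(\mathbf{x} - \mathbf{y})}$$

Mahalanobis distance ($\Sigma$, a covariance matrix)

$$d_M(\mathbf{x}, \mathbf{y}) = \sqrt{(\mathbf{x} - \mathbf{y})'\, \Sigma^{-1}\, (\mathbf{x} - \mathbf{y})}$$

Hotelling's $T^2$ statistic for the two sample problem

$$T^2 = \frac{nm}{n+m}\,(\bar{\mathbf{x}} - \bar{\mathbf{y}})'\, S_{Pooled}^{-1}\, (\bar{\mathbf{x}} - \bar{\mathbf{y}})$$

That is, $T^2$ is $\frac{nm}{n+m}$ times the squared Mahalanobis distance between the two sample mean vectors, computed with the pooled sample covariance matrix.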

[Figure: Case I. Populations A and B plotted on axes $X_1$ and $X_2$.]

[Figure: Case II. Populations A and B plotted on axes $X_1$ and $X_2$.]

In Case I the Mahalanobis distance between the mean vectors is larger than in Case II, even though the Euclidean distance is smaller. In Case I there is therefore more separation between the two bivariate normal distributions.
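The contrast between the two cases can be reproduced with a small numerical sketch. The covariance matrix and mean differences below are invented for illustration and are not the values behind the figures; the functions assume the bivariate case so the inverse can be written in closed form.

```python
from math import sqrt

def euclidean(d):
    """Euclidean length of a mean-difference vector d = (d1, d2)."""
    return sqrt(d[0] ** 2 + d[1] ** 2)

def mahalanobis(d, s):
    """Mahalanobis length of d with respect to a 2x2 covariance matrix s."""
    det = s[0][0] * s[1][1] - s[0][1] ** 2
    quad = (s[1][1] * d[0] ** 2 - 2 * s[0][1] * d[0] * d[1]
            + s[0][0] * d[1] ** 2) / det
    return sqrt(quad)

# Hypothetical common covariance matrix with strong positive correlation.
s = [[1.0, 0.9], [0.9, 1.0]]

case_1 = (1.0, -1.0)   # means differ across the direction of correlation
case_2 = (1.5, 1.5)    # means differ along the direction of correlation

for name, d in [("Case I", case_1), ("Case II", case_2)]:
    print(f"{name}: Euclidean = {euclidean(d):.3f}, "
          f"Mahalanobis = {mahalanobis(d, s):.3f}")
```

In this sketch the first difference is shorter in Euclidean terms but much longer in Mahalanobis terms, because it points in a direction in which the populations have little variability.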