Unit 7 Today we will look at: Normal distributions


Unit 7 Today we will look at:
Normal distributions
Sampling distributions of the mean
Hypothesis testing again: the z-test
One sample z-test
Two sample z-test

Normal Distribution The normal distribution is bell shaped. The mean is the central value about which the data values are clustered. The standard deviation is a measure of the spread of the data. The larger the standard deviation, the more spread out the data are.

Normal Distribution So the normal distribution has its location specified by the population mean, μ, and its spread specified by the population standard deviation, σ.

Normal Distribution Normal distribution with different locations

Normal Distribution Normal distribution with different spreads

Which of the following variables do you think are distributed normally? The earnings of male employees in a company The heights of M140 students The measurement error of repeated measurements of the same quantity The number of heads obtained when tossing a coin a certain number of times The age of individuals in a human population The mean age of (randomly selected) groups of individuals in a human population

Sampling distribution of the sample mean What does it refer to? Choose one of the following options: the distribution of the various sample sizes the distribution of the different possible values of the sample mean the distribution of the values of the individuals in the population the distribution of the data values in a given sample none of the above

Sampling distribution of the sample mean Is this distribution a normal distribution? Let us find out!

Sampling Distribution of the mean of a sample size 2

Sampling Distribution of the mean of a sample size 3

Sampling distributions of the mean with different sample sizes. What do you observe happens to the distribution as the sample size increases?

Sampling distributions of the mean with different sample sizes. The sampling distributions rise more and more sharply to a mode, but the mean stays at a similar value. The distributions become more and more compressed, i.e. the standard deviation decreases.
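
The effect of sample size on the sampling distribution can be checked with a small simulation. This is only a sketch: the right-skewed population below is made up for illustration (it is not the data behind the slides), and the helper name `sample_means` and the sample sizes 2, 10 and 50 are assumptions.

```python
import random
import statistics

def sample_means(population, n, trials=2000, seed=0):
    """Draw `trials` random samples of size n and return the mean of each."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(trials)]

# Hypothetical right-skewed population (exponential values), for illustration only.
rng = random.Random(1)
population = [rng.expovariate(1.0) for _ in range(10_000)]

spread = {}
for n in (2, 10, 50):
    means = sample_means(population, n)
    spread[n] = statistics.stdev(means)
    # The centre stays close to the population mean; the spread shrinks as n grows.
    print(n, round(statistics.fmean(means), 2), round(spread[n], 2))
```

The printed means stay close to each other while the standard deviations fall as n increases, which is the compression described above.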

Example: Sampling distribution of means based on earnings data in UK in 2011. Is it normal? No. How would you describe this distribution? Right-skewed.

Sampling distribution of the sample mean

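Terminology Used
Mean: sample $\bar{x} = \frac{\sum x}{n}$, population $\mu = \frac{\sum x}{N}$
Standard deviation: sample $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$, population $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N - 1}}$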

Normal Distribution (recap)

Standard Normal Distribution

Transforming to Standard Normal Distribution Consider the normal distribution of a variable x with mean μ = 10 and standard deviation σ = 2

Transforming to Standard Normal Distribution Shift the whole distribution to the left so that the mode occurs at zero: $v = x - 10$

Transforming to Standard Normal Distribution Then we have the distribution of v with standard deviation 2. Divide v by 2 to get a distribution with standard deviation 1: $z = \frac{v}{2} = \frac{x - 10}{2}$

Transforming to Standard Normal Distribution

Question 1 For the normal distributions with the following values of μ and σ, write down the formula to transform x to z, using $z = \frac{x - \mu}{\sigma}$
a) μ = 25, σ = 4: $z = \frac{x - 25}{4}$
b) μ = 75, σ = 20: $z = \frac{x - 75}{20}$
c) μ = 2, σ = 0.5: $z = \frac{x - 2}{0.5}$

Question 2 The mean number of seeds in a packet of beetroot seeds of a certain variety is 80, and the standard deviation is 4. a) What is the z value corresponding to a packet of seeds containing 85 seeds? Here x = 85, μ = 80 and σ = 4, so using the equation $z = \frac{x - \mu}{\sigma}$: $z = \frac{85 - 80}{4} = \frac{5}{4} = 1.25$. Where is this on the normal distribution? To the right of μ.

Question 2 The mean number of seeds in a packet of beetroot seeds of a certain variety is 80, and the standard deviation is 4. b) What is the z value corresponding to a packet of seeds containing 70 seeds? Here x = 70, μ = 80 and σ = 4, so using the equation $z = \frac{x - \mu}{\sigma}$: $z = \frac{70 - 80}{4} = \frac{-10}{4} = -2.5$. Where is this on the normal distribution? To the left of μ.

Question 3 The distribution of the wingspan of house sparrows, wp, is normal with mean μ = 23 cm and standard deviation σ = 0.67 cm. a) Transform this to a standard normal distribution. $z = \frac{x - \mu}{\sigma} = \frac{wp - \mu}{\sigma} = \frac{wp - 23}{0.67}$

Question 3 b) Complete the sentence: A wingspan of 24 cm is ___ standard deviations ___ the mean wingspan of ___. $z = \frac{x - \mu}{\sigma} = \frac{24 - 23}{0.67} \approx 1.5$. So a wingspan of 24 cm is about one and a half standard deviations above the mean wingspan of 23 cm.
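
The transformation used in Questions 1 to 3 can be sketched in a few lines. The function name `z_score` is an assumption made for illustration; the numbers are the ones from the questions.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean mu."""
    return (x - mu) / sigma

print(z_score(85, 80, 4))                # Question 2 a): 1.25, to the right of the mean
print(z_score(70, 80, 4))                # Question 2 b): -2.5, to the left of the mean
print(round(z_score(24, 23, 0.67), 2))   # Question 3 b): 1.49, roughly 1.5
```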

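Mean & Standard deviation of Sampling distribution of the sample mean The sampling distribution of the sample mean has mean μ and standard deviation $\frac{\sigma}{\sqrt{n}}$, known as the standard error (SE) of the mean. This result holds generally for all sampling distributions, no matter what the population distribution and no matter what sample size is involved.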

Central Limit Theorem

Question 4 Find the standard deviation of the mean for a sample with the following values: μ = 80, σ = 4, for the following sample sizes, using $SE = \frac{\sigma}{\sqrt{n}}$
a) n = 10: $SE = \frac{\sigma}{\sqrt{n}} = \frac{4}{\sqrt{10}} \approx 1.26$ (to 2 d.p.)
b) n = 20: $SE = \frac{\sigma}{\sqrt{n}} = \frac{4}{\sqrt{20}} \approx 0.89$ (to 2 d.p.)
c) n = 100: $SE = \frac{\sigma}{\sqrt{n}} = \frac{4}{\sqrt{100}} = 0.4$

Question 4 What can we deduce from these results?
n      SE
10     1.26
20     0.89
100    0.4
As the sample size increases, the standard error of the mean decreases, i.e. the distribution becomes more compressed. So with a larger sample we expect the sample mean to be a more accurate estimate of μ.
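
The Question 4 values can be reproduced with a short sketch. The function name `standard_error` is an assumption; σ = 4 and the three sample sizes come from the question.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the sample mean for samples of size n."""
    return sigma / math.sqrt(n)

for n in (10, 20, 100):
    # Prints 1.26, 0.89 and 0.4 for n = 10, 20 and 100.
    print(n, round(standard_error(4, n), 2))
```

Because n sits under a square root, halving the standard error takes four times the sample size.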

Question 5 Suppose that a given set of M140 final results have a mean of 72 marks and a standard deviation of 15 marks. a) Find the standard deviation of the sample mean results of samples of 30 students. Here n = 30 and σ = 15, so the standard deviation of the sample mean score is $SE = \frac{\sigma}{\sqrt{n}} = \frac{15}{\sqrt{30}} = 2.738\ldots \approx 2.74$ (to 2 d.p.)

Question 5 Suppose that a given set of M140 final results have a mean of 72 marks and a standard deviation of 15 marks. b) Given that the standard error of the mean is $\frac{\sigma}{\sqrt{n}} = 2.74$, what is the approximate distribution of the mean for samples of size 30? Recall the Central Limit Theorem: the sampling distribution of the mean score is approximately Normal with mean μ = 72 and standard deviation $\frac{\sigma}{\sqrt{n}} = 2.74$.

One Sample z – test Four stages of hypothesis testing
1. Set up the hypotheses that we wish to test
2. Determine the sampling distribution of a test statistic, under the assumption that the null hypothesis is true
3. Ascertain how unlikely the observed value of the test statistic is
4. Interpret the results
If the test statistic turns out to have a very unlikely value, then either a very unusual event has happened, or the sample has provided evidence against the correctness of the null hypothesis.

One Sample z – test: hypothesis The hypothesis that the population mean is equal to A is known as the null hypothesis, denoted by H0.
Null hypothesis H0: μ = A
Alternative hypothesis H1: μ ≠ A

One Sample z – test: hypothesis Example: Let’s go back to M140 students in Question 5: A given set of M140 final results have a mean of 72 marks and a standard deviation of 15 marks. Write down the null and alternative hypotheses.
H0: The mean mark for students on M140 is equal to 72, i.e. H0: μ = 72
H1: The mean mark for students on M140 is not equal to 72, i.e. H1: μ ≠ 72

One Sample z – test: Test Statistic If σ is known: $z = \frac{\bar{x} - A}{SE}$ where $SE = \frac{\sigma}{\sqrt{n}}$. If σ is unknown but the sample size is 25 or more: $z = \frac{\bar{x} - A}{ESE}$ where $ESE = \frac{s}{\sqrt{n}}$. Here $\bar{x}$ is the sample mean and s is the sample standard deviation.

Question 6 Suppose there are 600 students on the Feb presentation and their mean score is $\bar{x} = 70$, and assume their standard deviation is σ = 15. Calculate the test statistic. σ is known, so $z = \frac{\bar{x} - A}{SE}$ where $SE = \frac{\sigma}{\sqrt{n}} = \frac{15}{\sqrt{600}}$. Therefore $z = \frac{\bar{x} - A}{SE} = \frac{70 - 72}{15/\sqrt{600}} = -3.265\ldots \approx -3.27$. Interpret the result.

One Sample z – test: Critical Values & Critical Region We know that if the null hypothesis, H0, is true, then z should follow the standard normal distribution, so this distribution has a mean of zero. If the value of z given by our data is too extreme, i.e. very large in size (+ or -), it would suggest that H0 is false. We perform the test at the 5% and 1% significance levels.

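One Sample z – test: Critical Values & Critical Region At the 5% significance level the critical values are z = -1.96 and z = 1.96. At the 1% significance level the critical values are z = -2.58 and z = 2.58. The values defining the inner ends of the critical region are the critical values.

One Sample z – test: Analysis & Conclusion
z ≤ -2.58 or z ≥ 2.58: H0 is rejected at the 1% significance level. Strong evidence against H0
-2.58 < z ≤ -1.96 or 1.96 ≤ z < 2.58: H0 is rejected at the 5% significance level but not at the 1% level. Moderate evidence against H0
-1.96 < z < 1.96: H0 is not rejected at the 5% significance level. Either little or weak evidence against H0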

Let’s go back to M140 students in Q5 & Q6. A given set of M140 final results have a mean of 72 marks and a standard deviation of 15 marks. Suppose there are 600 students on the Feb presentation and their mean score is $\bar{x} = 70$. Null & alternative hypotheses: H0: μ = 72, H1: μ ≠ 72. The test statistic is z ≈ -3.27. Interpret the result: since -3.27 ≤ -2.58, H0 is rejected at the 1% significance level, so there is strong evidence against the null hypothesis. We conclude that the mean score for this group of M140 students does appear to be different from 72 marks, and indeed seems to be lower.
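
The whole one sample procedure for this example can be sketched as below. The function names are assumptions made for illustration; the cut-offs 1.96 and 2.58 are the 5% and 1% critical values used in this unit.

```python
import math

def one_sample_z(xbar, A, sigma, n):
    """z statistic for H0: mu = A when the population sd sigma is known."""
    return (xbar - A) / (sigma / math.sqrt(n))

def evidence_against_h0(z):
    """Classify the evidence using the two-sided 5% and 1% critical values."""
    size = abs(z)
    if size >= 2.58:
        return "strong"          # rejected at the 1% level
    if size >= 1.96:
        return "moderate"        # rejected at 5% but not at 1%
    return "little or weak"      # not rejected at the 5% level

z = one_sample_z(70, 72, 15, 600)
print(round(z, 2), evidence_against_h0(z))
```

For the M140 figures this gives z of about -3.27 and the verdict "strong", matching the conclusion above.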

Let Minitab do the work!

Minitab Results! Remember the p-value?
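
As a reminder of what a p-value is, here is a minimal sketch that computes a two-sided p-value from a z statistic with Python's standard library. It is not Minitab output, and the function name `two_sided_p_value` is an assumption.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """Probability, under H0, of a z value at least as extreme as the one observed."""
    return 2 * NormalDist().cdf(-abs(z))

p = two_sided_p_value(-3.27)   # z from the M140 example in Question 6
print(p)                       # well below 0.01, so H0 is rejected at the 1% level
print(two_sided_p_value(1.96)) # close to 0.05, the 5% critical value
```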

One Sample z – test: Recap

One Sample z – test: Recap Procedure Pg 140 Unit 7

Question 7 A sample of 55 cuckoo eggs found in the nests of wrens had a sample mean of 21.13 mm and a sample standard deviation of 2.4. a) Investigate whether the mean of this sample differs from the population mean, believed to be 22 mm. σ is unknown, so $z = \frac{\bar{x} - A}{ESE}$ where $ESE = \frac{s}{\sqrt{n}}$. The null and alternative hypotheses are: H0: μ = 22, H1: μ ≠ 22. Sample mean and standard deviation are: $\bar{x} = 21.13$, s = 2.4. So $ESE = \frac{s}{\sqrt{n}} = \frac{2.4}{\sqrt{55}}$. Therefore $z = \frac{\bar{x} - A}{ESE} = \frac{21.13 - 22}{2.4/\sqrt{55}} = -2.688\ldots \approx -2.69$

Question 7 z = -2.69. What evidence does this provide? Since -2.69 ≤ -2.58, H0 is rejected at the 1% significance level. Therefore there is strong evidence against H0, i.e. strong evidence that the mean length of cuckoo eggs is not 22 mm. Since z is negative, the sample suggests that the mean length is less than 22 mm.

Question 7 A sample of 55 cuckoo eggs found in the nests of wrens had a sample mean of 21.13 mm and a sample standard deviation of 2.4. Investigate whether the mean of this sample differs from the population mean, believed to be 22 mm. How would this result differ if the sample standard deviation had been 8.7? σ is unknown, so $z = \frac{\bar{x} - A}{ESE}$ where $ESE = \frac{s}{\sqrt{n}}$. The null and alternative hypotheses are: H0: μ = 22, H1: μ ≠ 22. Sample mean and standard deviation are: $\bar{x} = 21.13$, s = 8.7. So $ESE = \frac{s}{\sqrt{n}} = \frac{8.7}{\sqrt{55}}$. Therefore $z = \frac{\bar{x} - A}{ESE} = \frac{21.13 - 22}{8.7/\sqrt{55}} = -0.741\ldots \approx -0.74$

Question 7 z = -0.74. What evidence does this provide? Since -1.96 < -0.74 < 1.96, we do not reject H0 at the 5% significance level. Therefore there is little evidence against H0. The data are consistent with a mean length of 22 mm, the value given for the population mean, although failing to reject H0 does not prove that the mean is exactly 22 mm.
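
The two versions of Question 7 differ only in the sample standard deviation, so they can be compared side by side. This is a sketch with an assumed function name, using the estimated standard error because σ is unknown.

```python
import math

def z_with_estimated_se(xbar, A, s, n):
    """z statistic for H0: mu = A using ESE = s / sqrt(n) (sigma unknown, n >= 25)."""
    ese = s / math.sqrt(n)
    return (xbar - A) / ese

for s in (2.4, 8.7):
    z = z_with_estimated_se(21.13, 22, s, 55)
    verdict = "reject H0 at 5%" if abs(z) >= 1.96 else "do not reject H0 at 5%"
    print(s, round(z, 2), verdict)
```

The same difference of 0.87 mm between sample mean and hypothesised mean gives z of about -2.69 when s = 2.4 but only about -0.74 when s = 8.7: a noisier sample makes the same gap much less convincing.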

Two Sample z – test The two sample z-test is used to analyse the difference in locations between two populations, say G and B. The null hypothesis is H0: μG = μB, or equivalently H0: μG - μB = 0. The alternative hypothesis is H1: μG ≠ μB, or equivalently H1: μG - μB ≠ 0.

Sampling distribution of the difference between two means:

Sampling distribution of the difference between two means: Mean and Standard Deviation

Two Sample z – test: Test Statistic Note: If σG and σB are not available then the sample standard deviation values sG and sB are used

Two Sample z – test: Procedure Pg 149 Unit 7

Two Sample z – test: Recap The test statistic is $z = \frac{\bar{x}_A - \bar{x}_B}{SE}$ where $\bar{x}_A$ and $\bar{x}_B$ are the sample means. Here the value of SE is calculated as $SE = \sqrt{\frac{\sigma_A^2}{n_A} + \frac{\sigma_B^2}{n_B}}$ where $n_A$ and $n_B$ are the sample sizes and $\sigma_A$ and $\sigma_B$ are the population standard deviations. Again, if $\sigma_A$ and $\sigma_B$ are not available then the sample standard deviation values $s_A$ and $s_B$ are used: $ESE = \sqrt{\frac{s_A^2}{n_A} + \frac{s_B^2}{n_B}}$ and $z = \frac{\bar{x}_A - \bar{x}_B}{ESE}$

Question 8 Suppose we wanted to find out if males or females performed better on iCMA41 in M140. Suggest possible hypotheses.
Null hypothesis H0: μf = μm (mean mark for females is equal to mean mark for males)
Alternative hypothesis H1: μf ≠ μm (mean mark for females is not equal to mean mark for males)

Question 9 Suppose there are 700 females, 500 males and the sample mean score for the females is 74 and the sample mean score for the males is 72. The sample standard deviation for the females is 16 and the sample standard deviation for the males is 18. Calculate the test statistic and interpret the result. $ESE = \sqrt{\frac{s_f^2}{n_f} + \frac{s_m^2}{n_m}} = \sqrt{\frac{16^2}{700} + \frac{18^2}{500}} \approx 1.0068$. So $z = \frac{\bar{x}_f - \bar{x}_m}{ESE} = \frac{74 - 72}{1.0068} \approx 1.99$

Question 9 z = 1.99. What evidence does this provide? Since 1.96 ≤ 1.99 < 2.58, we reject H0 at the 5% significance level but not at the 1% significance level. So there is moderate evidence against the null hypothesis. Since z is positive, it seems that females did indeed do better on iCMA41!

Question 10
             Mean, x̄    St dev, s    Sample size, n
Species A    128.4      6.99         32
Species B    122.8      10.72        29
Carry out a hypothesis test to investigate whether the joint sizes are the same for both species. Null hypothesis H0: μA = μB. Alternative hypothesis H1: μA ≠ μB. $ESE = \sqrt{\frac{s_A^2}{n_A} + \frac{s_B^2}{n_B}} = \sqrt{\frac{6.99^2}{32} + \frac{10.72^2}{29}} \approx 2.343$. So $z = \frac{\bar{x}_A - \bar{x}_B}{ESE} = \frac{128.4 - 122.8}{2.343} \approx 2.39$

Question 10 z = 2.39. What does this result mean? Since 1.96 ≤ 2.39 < 2.58, H0 is rejected at the 5% significance level but not at the 1% level, so there is evidence against the null hypothesis, but not very strong evidence. There is moderate evidence that the joint sizes are different. Since z is positive, the joints of species A may be larger than those of species B.
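
The calculations in Questions 9 and 10 follow the same pattern, sketched below. The function name `two_sample_z` is an assumption; the inputs are the figures given in the questions, with sample standard deviations standing in for the unknown σ values.

```python
import math

def two_sample_z(mean_a, s_a, n_a, mean_b, s_b, n_b):
    """Return (z, ESE) for H0: mu_A = mu_B using sample standard deviations."""
    ese = math.sqrt(s_a**2 / n_a + s_b**2 / n_b)
    return (mean_a - mean_b) / ese, ese

# Question 9: females (A) versus males (B) on iCMA41.
z9, ese9 = two_sample_z(74, 16, 700, 72, 18, 500)
# Question 10: joint sizes of species A versus species B.
z10, ese10 = two_sample_z(128.4, 6.99, 32, 122.8, 10.72, 29)

print(round(ese9, 4), round(z9, 2))    # 1.0068 1.99
print(round(ese10, 3), round(z10, 2))  # 2.343 2.39
```

Both z values fall between 1.96 and 2.58, so in each case H0 is rejected at the 5% level but not at the 1% level.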

True or false?
The sampling distribution of the sample mean is always a normal distribution
99.7% of the population values for a normally distributed variable fall within three standard deviations of the mean
The z value represents the number of standard deviations from the mean
The mean and median of a normally distributed variable may be different
Any normal distribution is symmetric about the mean
The curve touches the x-axis at approximately 4 standard deviations below and above the mean

True or false?  

Creating a sampling distribution of the mean Let the variable x be the number shown when a six sided die is thrown What is the population mean and standard deviation

Creating a sampling distribution of the mean A six sided die is rolled. What is the population mean and standard deviation? There are six possible outcomes: 1, 2, 3, 4, 5, 6. So the mean is $\mu = \frac{1 + 2 + 3 + 4 + 5 + 6}{6} = 3.5$

Creating a sampling distribution of the mean We know the mean μ = 3.5. We use it to calculate the deviations and squared deviations from the mean:
x    Deviation, x - μ    Squared deviation, (x - μ)²
1    -2.5                6.25
2    -1.5                2.25
3    -0.5                0.25
4    0.5                 0.25
5    1.5                 2.25
6    2.5                 6.25

Creating a sampling distribution of the mean Let the variable x be the number shown when a six sided die is thrown. How many possible samples of size 2 are there? The possible samples and the resulting sample means are shown in the table below:
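
The samples of size 2 can also be listed by a short program. This sketch assumes a sample means two throws in order (so 1 then 2 and 2 then 1 count as different samples); the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)
samples = list(product(faces, repeat=2))           # every ordered pair of throws
means = Counter(Fraction(a + b, 2) for a, b in samples)

print(len(samples))                                # number of possible samples
for m in sorted(means):
    print(float(m), means[m])                      # sample mean and how often it occurs

# Mean of the sampling distribution, weighting each sample mean by its count.
overall = sum(m * count for m, count in means.items()) / len(samples)
print(float(overall))
```

Under this assumption there are 36 samples, the sample mean 3.5 occurs most often, and the mean of the sampling distribution equals the population mean of 3.5.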

Resulting sampling distribution of the mean

Creating a sampling distribution of the mean

Creating a sampling distribution of the mean We observe that the distribution becomes more and more symmetric and bell shaped, and more peaked and compressed about this peak. This symmetric bell-shaped distribution is called the Normal Distribution.

Where would z = 2.39 be on the normal distribution diagram?