Normal Distribution The shaded area is the probability of z > 1.

The normal distribution is actually a family of distributions, all with the same shape and parameterised by the mean µ and standard deviation σ. It is usually defined via a reference member of the family, which is used to define the other members. This reference member has µ = 0 and σ = 1.

Definition: A random variable Z has a normal (or Gaussian) distribution with mean 0 and standard deviation 1 if and only if its distribution function Φ(z) (defined by P(Z ≤ z)) is given by

Φ(z) = ∫ from −∞ to z of (1/√(2π)) e^(−t²/2) dt

We write Z ~ N(0, 1) and say that Z has a standard normal distribution.
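As a quick check (not part of the original slides), this definition can be verified in R by integrating the density dnorm numerically and comparing with the built-in distribution function pnorm:

```r
# Numerically integrate the standard normal density from -Inf to z
# and compare the result with pnorm(z).
z <- 1
numeric_phi <- integrate(dnorm, lower = -Inf, upper = z)$value
cat("integrated:", numeric_phi, " pnorm:", pnorm(z), "\n")
```

The two values agree to the accuracy of the numerical integration.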

Definition: A random variable X has a normal (or Gaussian) distribution with mean µ and standard deviation σ if and only if Z = (X − µ)/σ ~ N(0, 1). We write X ~ N(µ, σ²) and say that X has a normal distribution.

The normal distribution is symmetric about its mean µ. In particular, if Z ~ N(0, 1), then P(Z ≤ −z) = P(Z ≥ z), i.e. Φ(−z) + Φ(z) = 1 for all z.

Whatever the values of µ and σ, the area between µ − 2σ and µ + 2σ is always approximately 0.95 (95%).

Similarly, whatever the values of µ and σ, the area between µ − σ and µ + σ is always approximately 0.68 (68%).
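These two areas can be confirmed with pnorm (a quick check, not in the original slides); the result is the same whatever µ and σ we pick, illustrated here with an arbitrary choice of N(100, 15²):

```r
# Area within one and two standard deviations of the mean,
# computed for an arbitrary normal distribution N(100, 15^2).
mu <- 100; sigma <- 15
within1 <- pnorm(mu + sigma, mu, sigma) - pnorm(mu - sigma, mu, sigma)
within2 <- pnorm(mu + 2 * sigma, mu, sigma) - pnorm(mu - 2 * sigma, mu, sigma)
cat("within 1 sd:", round(within1, 4), "\n")  # about 0.6827
cat("within 2 sd:", round(within2, 4), "\n")  # about 0.9545
```

Changing mu and sigma to any other values leaves both areas unchanged.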

Example: It has been suggested that IQ scores follow a normal distribution with mean 100 and standard deviation 15. Find the probability that a person chosen at random will have (a) an IQ less than 70, (b) an IQ greater than 110, (c) an IQ between 70 and 110.

In R, the function dnorm gives the density of the normal distribution. Generally more useful, though, is pnorm, which gives the cumulative distribution function.

So in the IQ example, the probability of an IQ less than 70 is:

> pnorm(70, 100, 15)
[1] 0.02275013

i.e. approximately 0.023.

And the probability of an IQ less than 110 is:

> pnorm(110, 100, 15)
[1] 0.7475075

Thus, the probability of an IQ more than 110 is:

> t = pnorm(110, 100, 15)
> 1 - t
[1] 0.2524925

i.e. approximately 0.25.

Finally, for the probability of an IQ between 70 and 110, carry out a subtraction:

> pnorm(110, 100, 15) - pnorm(70, 100, 15)
[1] 0.7247574

i.e. approximately 0.72.

Alternatively,

> pnorm(0.6667) - pnorm(-2)
[1] 0.724768

These are the converted values on the standardised normal (z) scale. The answer is, of course, the same.

[Figure: standard normal curve with the area between z = −2 and z = 0.6667 shaded]

The Central Limit Theorem

Let X1, X2, …, Xn be independent identically distributed random variables with mean µ and variance σ². Let S = X1 + X2 + … + Xn. Then elementary probability theory tells us that E(S) = nµ and var(S) = nσ². The Central Limit Theorem (CLT) further states that, provided n is not too small, S has an approximately normal distribution with the above mean nµ and variance nσ².

In other words, S ~ N(nµ, nσ²) approximately. The approximation improves as n increases. We will use R to demonstrate the CLT.

Let X1, X2, …, X6 come from the uniform distribution U(0, 1).

[Figure: density of U(0, 1), equal to 1 on the interval from 0 to 1]

For any uniform distribution on [A, B], µ is equal to (A + B)/2 and the variance σ² is equal to (B − A)²/12. So for our distribution, µ = 1/2 and σ² = 1/12.
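These formulae can be checked by simulation (a quick sketch, not in the original slides):

```r
# Simulate a large sample from U(0,1) and compare the sample
# mean and variance with the theoretical values 1/2 and 1/12.
set.seed(1)               # for reproducibility
x <- runif(100000)
cat("sample mean:", mean(x), " (theory: 0.5)\n")
cat("sample variance:", var(x), " (theory:", 1 / 12, ")\n")
```

Both sample statistics should land very close to the theoretical values for a sample this large.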

The Central Limit Theorem therefore states that S should have an approximately normal distribution with mean nµ (i.e. 6 × 0.5 = 3) and variance nσ² (i.e. 6 × 1/12 = 0.5). This gives standard deviation √0.5 ≈ 0.707. In other words, S ~ N(3, 0.5) approximately.

Generate results in each of six vectors for the uniform distribution on [0, 1] in R:

> x1 = runif(10000)
> x2 = runif(10000)
> x3 = runif(10000)
> x4 = runif(10000)
> x5 = runif(10000)
> x6 = runif(10000)

Let S = X1 + X2 + … + X6:

> s = x1 + x2 + x3 + x4 + x5 + x6
> hist(s, nclass = 20)

Consider the mean and standard deviation of S:

> mean(s)
> sd(s)

The results (close to 3 and 0.707 respectively) agree with our earlier calculations.

A method of examining whether the distribution is approximately normal is to produce a normal Q-Q plot. This is a plot of the sorted values of the vector S (the "data") against what is in effect an idealised sample of the same size from the N(0, 1) distribution.

If the CLT holds good, i.e. if S is approximately normal, then the plot should show an approximate straight line with intercept equal to the mean of S (here 3) and slope equal to the standard deviation of S (here 0.707).

> qqnorm(s) >

From these plots it seems that agreement with the normal distribution is very good, despite the fact that we have only taken n = 6, i.e. the convergence is very rapid!
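To make the comparison easier to judge, a reference line with the predicted intercept and slope can be overlaid on the Q-Q plot (a small extension of the code above, not in the original slides):

```r
# Reproduce the simulation compactly, then overlay the line the
# CLT predicts: intercept = mean of S (3), slope = sd of S (sqrt(0.5)).
set.seed(1)                                # for reproducibility
s <- rowSums(matrix(runif(6 * 10000), ncol = 6))
qqnorm(s)
abline(a = 3, b = sqrt(0.5), col = "red")  # theoretical line
```

If the CLT approximation is good, the plotted points should hug the red line, with any departures confined to the extreme tails.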

Application: Confidence Intervals for the Mean

Suppose that the random variables Y1, Y2, …, Yn model independent observations from a distribution with mean µ and variance σ². Then Ȳ = (Y1 + Y2 + … + Yn)/n is the sample mean.

Now by the CLT, Ȳ ~ N(µ, σ²/n) approximately. This is because E(Ȳ) = E(S)/n = µ and var(Ȳ) = var(S)/n² = σ²/n, so the standard deviation of the mean is σ/√n.

Recall from Statistics 2 that, if σ² is estimated by the sample variance s², an approximate confidence interval for µ is given by:

ȳ ± z s/√n

Here ȳ is the observed sample mean, and z is determined by the level of confidence required.

So for 95% confidence an approximate interval for µ is given by:

ȳ ± 2 s/√n

The 2 is approximate: a more accurate value (1.96) can be obtained from tables or by using the qnorm function in R.

> qnorm(0.975)
[1] 1.959964
> qnorm(0.995)
[1] 2.575829
> qnorm(0.025)
[1] -1.959964

Thus in R, an approximate 95% confidence interval for the mean µ is given by

> mean(y) + c(-1, 1) * qnorm(0.975) * sqrt(var(y) / length(y))

where y is the vector of observations. A more accurate confidence interval, allowing for the fact that s² is only an estimate of σ², is given by use of the function t.test.
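As an illustration (using a simulated sample of assumed data, not from the original slides), the two intervals can be compared directly:

```r
# Compare the normal-approximation interval with t.test's interval
# on a simulated sample of n = 50 observations from N(10, 2^2).
set.seed(2)                    # for reproducibility
y <- rnorm(50, mean = 10, sd = 2)
approx_ci <- mean(y) + c(-1, 1) * qnorm(0.975) * sqrt(var(y) / length(y))
t_ci <- t.test(y)$conf.int
cat("normal approximation:", approx_ci, "\n")
cat("t.test interval:     ", t_ci, "\n")
# The t interval is slightly wider, reflecting the extra
# uncertainty from estimating sigma^2 by s^2.
```

For any sample, the t interval is wider than the normal-approximation interval, since qt(0.975, n − 1) > qnorm(0.975); the difference shrinks as n grows.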