Normal Distributions (aka Bell Curves, Gaussians) Spring 2010.

Slides:



Advertisements
Similar presentations
For Explaining Psychological Statistics, 4th ed. by B. Cohen
Advertisements

The Normal Distribution
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 6 The Standard Deviation as a Ruler and the Normal Model.
Normal distribution. An example from class HEIGHTS OF MOTHERS CLASS LIMITS(in.)FREQUENCY
Chapter 6: Standard Scores and the Normal Curve
Theoretical Probability Distributions We have talked about the idea of frequency distributions as a way to see what is happening with our data. We have.
Standard Normal Table Area Under the Curve
Chapter 9: The Normal Distribution
2-5 : Normal Distribution
The Normal Distribution
PSY 307 – Statistics for the Behavioral Sciences
1.2: Describing Distributions
Did you know ACT and SAT Score are normally distributed?
14.4 The Normal Distribution
The Normal Distributions
Chapter 11: Random Sampling and Sampling Distributions
STANDARD SCORES AND THE NORMAL DISTRIBUTION
Objectives (BPS 3) The Normal distributions Density curves
Measures of Central Tendency
Normal Distributions.
Looking at Data - Distributions Density Curves and Normal Distributions IPS Chapter 1.3 © 2009 W.H. Freeman and Company.
1.3 Psychology Statistics AP Psychology Mr. Loomis.
Z Scores and The Standard Normal Curve
The Normal Distribution The “Bell Curve” The “Normal Curve”
Section 7.1 The STANDARD NORMAL CURVE
Copyright © 2013, 2009, and 2007, Pearson Education, Inc. 1 PROBABILITIES FOR CONTINUOUS RANDOM VARIABLES THE NORMAL DISTRIBUTION CHAPTER 8_B.
The Normal distributions BPS chapter 3 © 2006 W.H. Freeman and Company.
Chapter 6 Probability. Introduction We usually start a study asking questions about the population. But we conduct the research using a sample. The role.
Chapter 9 – 1 Chapter 6: The Normal Distribution Properties of the Normal Distribution Shapes of Normal Distributions Standard (Z) Scores The Standard.
Copyright © 2011 Pearson Education, Inc. Putting Statistics to Work.
Stat 1510: Statistical Thinking and Concepts 1 Density Curves and Normal Distribution.
Copyright © 2010 Pearson Education, Inc. Chapter 6 The Standard Deviation as a Ruler and the Normal Model.
NOTES The Normal Distribution. In earlier courses, you have explored data in the following ways: By plotting data (histogram, stemplot, bar graph, etc.)
Chapter 5 The Normal Curve. Histogram of Unemployment rates, States database.
Chapter 5 The Normal Curve. In This Presentation  This presentation will introduce The Normal Curve Z scores The use of the Normal Curve table (Appendix.
Copyright © 2012 by Nelson Education Limited. Chapter 4 The Normal Curve 4-1.
Chapter 6 The Normal Curve. A Density Curve is a curve that: *is always on or above the horizontal axis *has an area of exactly 1 underneath it *describes.
Some probability distribution The Normal Distribution
Copyright © 2013, 2009, and 2007, Pearson Education, Inc. Chapter 6 Probability Distributions Section 6.2 Probabilities for Bell-Shaped Distributions.
AP Review #3: Continuous Probability (The Normal Distribution)
© 2010 Pearson Prentice Hall. All rights reserved. CHAPTER 12 Statistics.
Thursday August 29, 2013 The Z Transformation. Today: Z-Scores First--Upper and lower real limits: Boundaries of intervals for scores that are represented.
Chapter 6 The Normal Distribution. 2 Chapter 6 The Normal Distribution Major Points Distributions and area Distributions and area The normal distribution.
The Normal distributions BPS chapter 3 © 2006 W.H. Freeman and Company.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc. Chapter The Normal Probability Distribution 7.
Central Tendency & Dispersion
IPS Chapter 1 © 2012 W.H. Freeman and Company  1.1: Displaying distributions with graphs  1.2: Describing distributions with numbers  1.3: Density Curves.
The Normal Curve & Z Scores. Example: Comparing 2 Distributions Using SPSS Output Number of siblings of students taking Soc 3155 with RW: 1. What is the.
THE NORMAL DISTRIBUTION AND Z- SCORES Areas Under the Curve.
Modeling Distributions
Copyright ©2005 Brooks/Cole, a division of Thomson Learning, Inc. Bell-Shaped Curves and Other Shapes Chapter 8.
The Normal distribution and z-scores
STA Lecture 151 STA 291 Lecture 15 – Normal Distributions (Bell curve)
Lecture 9 Dustin Lueker. 2  Perfectly symmetric and bell-shaped  Characterized by two parameters ◦ Mean = μ ◦ Standard Deviation = σ  Standard Normal.
Outline of Today’s Discussion 1.Displaying the Order in a Group of Numbers: 2.The Mean, Variance, Standard Deviation, & Z-Scores 3.SPSS: Data Entry, Definition,
Psych 230 Psychological Measurement and Statistics Pedro Wolf September 16, 2009.
Normal Distribution S-ID.4 Use the mean and standard deviation of a data set to fit it to a normal distribution and to estimate population percentages.
Chapter 131 Normal Distributions. Chapter 132 Thought Question 2 What does it mean if a person’s SAT score falls at the 20th percentile for all people.
Z-scores, normal distribution, and more.  The bell curve is a symmetric curve, with the center of the graph being the high point, and the two sides on.
1 Copyright © 2014, 2012, 2009 Pearson Education, Inc. Chapter 5 The Standard Deviation as a Ruler and the Normal Model.
Density Curves & Normal Distributions Textbook Section 2.2.
Copyright © 2009 Pearson Education, Inc. Chapter 6 The Standard Deviation as a Ruler and the Normal Model.
THE NORMAL DISTRIBUTION
The Normal Distributions.  1. Always plot your data ◦ Usually a histogram or stemplot  2. Look for the overall pattern ◦ Shape, center, spread, deviations.
Continuous random variables
Chapter 4: The Normal Distribution
Chapter 6 The Normal Curve.
The Normal Probability Distribution
Measuring location: percentiles
STA 291 Spring 2008 Lecture 9 Dustin Lueker.
Presentation transcript:

Normal Distributions (aka Bell Curves, Gaussians) Spring 2010

You’ve probably all seen a bell curve…

The Normal distribution is common Lots of real data follows a normal shape. For example 1) Many/most biometric measurements (heights, femur lengths, skull diameters, etc.) 2) Scores on many standardized exams (IQ tests) are forced into a normal shape before reporting 3) Many quality control measurements, if you take the log first, have a normal shape.

When sampling from a normal Normal distributions are typically characterized by two numbers, their mean or “expected value” which corresponds to the peak, and their “standard deviation” which is the distance from the mean to the inflection point. Large standard deviations result in “spread out” normals. Small standard deviations result in “strongly peaked” distributions.

Mean (µ) and Standard deviation (σ) for a normal distribution µ (here 25) this distance is σ height is about 60% of peak here

Two normals, corresponding to different standard deviations. Mean=100, std.dev = 16 Mean=100, std.dev = 4

The EDA of Normal distributions The central location of a normal distribution is given by the mean μ. (The median is also μ). The spread is given by the standard deviation σ. (The interquartile range is 1.35σ, but you do NOT need to know that) Normal distributions are symmetric and typically have few, if any, outliers. If your data has a lot of outliers, but is otherwise symmetric and unimodal, it may have a “t” distribution (discussed later in class).

Probabilities from a Normal distribution Normal distributions have a nice property that, knowing the mean (μ) and standard deviation (σ), we can tell how much data will fall in any region. Examples – the normal distribution is symmetric, so 50% of the data is smaller than μ and 50% is larger than μ.

More Normal Probabilities It is always true that about 68% of the data appears within 1 standard deviation of the mean (so about 68% of the data appears in the region μ±σ)

Yet more normal probabilities It is also true about 95% of the appears within 2 standard deviation of the mean, and about 99.7% of the data appear within 3 standard deviations of the mean (so it’s VERY rare to go beyond 3 standard deviations. In quality control applications, one often is interested in “6-sigma”. Note 6 standard deviations includes % of the data.

95% within 2 standard deviations, 99.7% within 3 standard deviations

Computing more general probabilities Suppose you want to know how much data appears within 1.5 standard deviations of the mean, or how much data appears between 1.3 and 1.7 standard deviations of the mean.

Another way There is another way of computing normal probabilities that is 1) the way it used to be done, back in pre-handy-computer days, 2) useful for understanding more about the normal distribution The number of standard deviations an observation is from the mean is called the Z- score for that observation.

Z-score examples If μ=100 and σ=16 (this is true of IQ scores in the U.S.), then an observation X=125 is 25 points above the mean, which corresponds to 25/16 = standard deviations above the mean. If general, a Z-score for an observation X is Z=(X-μ)/σ Observations above the mean get positive Z- scores, observations below the mean get negative Z-scores.

Computing probabilities with Z-scores Fortunately, the Z-score is all you need to know to compute probabilities from a normal distribution. The reason is that Z-scores map directly to percentiles. For each Z-score Stata can provide the percentile. For example, if the Z-score is 1, the percentile is 84.13%. If the Z-score is 2.3, then the percentile is 98.93%

Using a Z-table Z-table shown separately in class and illustrated.

Probabilities between Z-scores Again, IQ scores are normally distributed with mean 100 and standard deviation 16. How many people have IQ scores between 90 and 120? Compute the corresponding Z-scores. For 90, the Z- score is (90-100)/16 = For 120, the Z-score is ( )/16 = Find the corresponding percentiles (SAS). The percentile for Z=1.25 is 89.43%. The percentile for Z=(-0.625) is 26.6%. The amount between these is – = 62.83%

Comparing observations from different normal distributions The central idea is that a Z-score corresponds to a percentile for the observations. If you have observations from multiple normal distributions, you can compute the Z- score for each observations and compare which has the “better” score.

Example Suppose you have two students, one with a 23 on the ACT (mean 22 and standard deviation 3) and another with a 1220 on the SAT (mean 900 and standard deviation 250). The Z-score for the student with the ACT is (23-22)/3 = 0.33 while the Z-score for the student with the SAT is ( )/250 = The student with the SAT performed much better (relative to peers on the exam).

Review The mean and standard deviation of a normal is all you need to know to compute any percentage. Z-scores map directly to percentiles. A “Z-table” provides these percentiles. To find the probability below a value, compute the Z- score and look up the percentile. To find the probability above a value, find the Z- score and percentile, then take 1 minus that percentile (e.g. if the probability of being below is 0.24, the probability above is 0.76) To find the probability between two values, find the Z-scores and percentiles for each value, then take the difference.

Example Suppose you have data which is normally distribution with mean 70 and standard deviation 4 (this describes U.S. male heights in inches). – What proportion of data is less than 70? Half the data is always below the mean, so this is 50%. – What proportion of data is between 68 and 74? The Z-scores for 68 and 74 are (68-70)/4 = (-0.50) and (74-70)/4 = 1.00, corresponding to percentiles from the table of 84.13% and 30.85%. The difference between these percentiles is 84.13% % = 53.28% – What proportion of data is greater than 76? The Z-score for 76 is (76-70)/4 = 1.5 which corresponds to a percentile of 93.32%. Remember that percentiles are always in terms of “less than the value”. To get greater than, we subtract from 100% to get 6.68% as our answer.

Example One manufacturing plant (plant A) produces bolts whose lengths are normally distribution with mean 2 inches and standard deviation inches. Another plant (plant B) produces bolts whose lengths are normally distributed with mean 1.99 inches and standard deviation inches. For the bolts to be usable, their length must be between 1.98 and 2.02 inches. Which plant has the higher percentage of usable bolts? (this gets to the issue that spread may be more important than central location in some quality control examples).

Example continued Plant A is N(2.00,0.009) Plant B is N(1.99,0.003) For each plant, we want the percentage of bolts between 1.98 and 2.02 inches. Thus, we need the Z-scores for 1.98 and 2.02 from each plant.

Example continued Plant A is N(2.00,0.009). Z-scores for 1.98 and 2.02 are Z=( )/0.009=(-2.22) and Z=( )/0.009)=2.22. These correspond to percentiles of and Thus, the probability is = Plant B is N(1.99,0.003). Z-scores for 1.98 and 2.02 are Z=( )/0.003=(-3.33) and Z=( )/0.003=10. These correspond to percentiles of and (for values off the chart, use 0 or 1). Thus, the probability is = Thus, even though Plant A averages a “perfect bolt”, the increased spread makes them worse overall.

Percentiles back to Z-scores We can go both directions between Z-scores and percentiles. Each Z-score corresponds to a percentile. Similarly, each percentile has a corresponding Z-score. For example, what IQ score has 80% of the people below it?

Going from percents to Z-scores In the U.S., IQ scores are normally distributed with mean 100 and standard deviation 16. What is the 80 th percentile of IQs? In other words, for what IQ do 80% of the people fall below it? The first step is to find the Z-score corresponding to 80% (look for the number closest to in the BODY of the table, then find the corresponding Z- score) That Z-score is Z=0.84

Z-scores back to IQ scores The Z-score corresponding to 80% is 0.84 Remember the Z-score is Z=(X-μ)/σ This formula can be reversed to find X=σZ + μ Our Z=0.84, μ=100, and σ=16, so X = (0.84)(16) = So 80% of people have an IQ score below

New Example What about the middle 50% of IQ scores? What percentiles does this correspond to? To get the middle 50%, we need to stretch from the 25 th percentile to the 75 th percentile. The 25 th percentile corresponds to Z-score of (-0.67), while the 75 th percentile corresponds to a Z-score of 0.67 These correspond to IQ’s of (-0.67)(16)+100 = and (0.67)(16) =

Related example What about the middle 95% of values (removing 2.5% from each tail) Quick answer is that we’ve already said going 2 standard deviations in either direction contains approximately 95% of the values, which would correspond to IQs between 68 and 132. Exact answer, however, is that we need the 2.5% and 97.5% percentiles. These correspond to and Going 1.96 standard deviations in either direction from the mean 100 means the middle 95% of IQ scores fall between and

Another example What about the middle 99% of values? We need to find the 0.5 th and 99.5 th percentiles (removing one half a percent from each tail, leaving 99% in the middle) These are and 2.576, which would correspond to IQ scores between (-2.576)(16) = and (2.576) =