The Normal Distribution: Comparing Apples and Oranges


1 The Normal Distribution: Comparing Apples and Oranges
Topics: Essentials, Normal Curves, Standard Normal, Non-Standard to Standard, z-score, Empirical Rule, Assessing Normality.

Since the beginning of our “visit” with probability, we’ve discussed only discrete variables and their corresponding probability distributions. We now want to take a look at a very special continuous probability distribution called the Normal Distribution. You may be more familiar with the term bell curve. Recall that a continuous variable (height, weight) is one where another value can always “fit” between any two values.

The Normal Distribution is probably the most important distribution in statistics. Variables whose distributions have approximately this shape are called normally distributed variables. In practice, it is very unusual for data to have exactly the shape of a normal curve, but if a variable’s distribution is shaped roughly like a normal curve, we say that the variable is approximately normally distributed.

A normal distribution is completely determined by two parameters: the mean, mu, and the standard deviation, sigma. It is symmetric about mu, and its spread or variability depends on the value of sigma. In particular, sigma determines the shape of the curve: a larger sigma gives a flatter, wider curve, and a smaller sigma gives a more peaked one. Since there are infinitely many possible values for mu and sigma, there are infinitely many normal curves. Let’s look:
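To make the role of mu and sigma concrete, here is a minimal Python sketch of the normal curve's height formula. The function name `normal_pdf` and the three sigma values are illustrative choices, not taken from the slides.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the normal curve with mean mu and standard deviation sigma at x."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)

# Hypothetical parameter choices: the same mean with three different spreads.
mu = 0.0
peak_heights = {sigma: normal_pdf(mu, mu, sigma) for sigma in (0.5, 1.0, 2.0)}

for sigma, peak in peak_heights.items():
    print(f"sigma = {sigma}: peak height = {peak:.4f}")

# Symmetry about mu: points equally far on either side have the same height.
left = normal_pdf(mu - 1.3, mu, 1.0)
right = normal_pdf(mu + 1.3, mu, 1.0)
print(f"height at mu - 1.3: {left:.4f}, height at mu + 1.3: {right:.4f}")
```

The peak gets lower as sigma grows, which is the "flatter vs. more peaked" behavior described in the notes.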

2 Essentials: Normal Distribution (I’m normal...or am I?)
Be able to identify normal and approximately normal distributions. Know the characteristics of the Standard Normal. Be able to use the Standard Normal table. Empirical Rule and the Standard Normal. Transforming Non-Standard distributions to the Standard Normal.

3 Three Normal Distributions
Point out symmetry about mu, and the shape change that occurs as the value of sigma changes. Now let’s look at an example of a set of data that is normally distributed.

4 Frequency and Relative Frequency Distributions for Heights
This is the distribution of heights for a group of 3264 female students at a Midwestern college. Pay attention in particular to the 67 – 68 inch class, and its corresponding relative frequency. Remember that we can express relative frequency as a percent. Here, 7.35% of the 3264 students are between 67 and 68 inches tall.

5 Relative Frequency Histogram for a Normally Distributed Variable
We are now looking at the same data in the form of a relative frequency histogram. It is shown with a superimposed normal curve, and as we can see, the distribution has roughly the shape of a normal curve; therefore we say the distribution is approximately normal. The mean of this distribution is mu = 64.4, and the standard deviation is sigma = 2.4.

Looking a little closer at this figure, we see that the percentage of female students whose height lies within any specified range can be approximated by the corresponding area under the curve. For example, the percentage of students who are between 67 and 68 inches tall is 7.35% (0.0735). This also equals the area of the shaded bar (rectangle) in this slide, since the bar has a height of 0.0735 and a width of 1. Recall that the area of a rectangle is height times width. This brings us to a key fact: percentages for a normally distributed variable are approximately equal to the areas under its associated normal curve. For example, 50% of the outcomes will occupy 50% of the area under the curve.

We now want to recall two things. First, we said that there can be infinitely many normal distributions. Second, from earlier work, we said that given a variable with an approximately bell-shaped distribution, we can standardize a value using the formula z = (x – mu)/sigma. The result, called a z-score, measures how many standard deviations the value lies from the mean. The distribution we obtain when we standardize a normally distributed variable is called the Standard Normal Distribution, and its graph is called the Standard Normal Curve. The Standard Normal Distribution is a special normal distribution: it has a mean of 0 and a standard deviation of 1.
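The key fact above can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist` with the slide's mu = 64.4 and sigma = 2.4, and compares the area under the curve between 67 and 68 inches with the observed relative frequency of 0.0735. The variable names are illustrative.

```python
from statistics import NormalDist

# Normal curve fitted to the heights data (values from the slide).
heights = NormalDist(mu=64.4, sigma=2.4)

# Area under the curve between 67 and 68 inches:
# area to the left of 68 minus area to the left of 67.
area_67_to_68 = heights.cdf(68) - heights.cdf(67)

observed_relative_frequency = 0.0735  # shaded bar: height 0.0735, width 1

print(f"area under the normal curve: {area_67_to_68:.4f}")
print(f"observed relative frequency: {observed_relative_frequency:.4f}")
```

The two numbers agree closely but not exactly, which is what "approximately normally distributed" means in practice.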

6 The Standard Normal Curve

7 Properties of the Standard Normal Curve
1. The Standard Normal Distribution has a mean of 0 and a standard deviation of 1.
2. The total area under the curve is equal to 1.
3. The Standard Normal Curve extends indefinitely in both directions, approaching but never touching the horizontal axis.
4. The Standard Normal Curve is symmetric about 0; that is, the part of the curve to the left of 0 is a mirror image of the part of the curve to the right of it.
5. Most of the area under the curve lies between -3 and 3 (about 99.7%; 99.74% when read from a four-decimal table).

To work with any normally distributed variable with a given mean and standard deviation, we standardize its values as we learned to do earlier, and then use the Standard Normal Curve. Let’s look.

8 Normal Curve to Standard Normal Curve
Once we have standardized a value (or values), we use a table called the Standard Normal Table to obtain the corresponding area under the curve. As we will see, this area can also be expressed as a percentage or a probability. For now, let’s take a look at the Standard Normal Table in our text and learn how to interpret it.
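The standardize-then-look-up procedure can be sketched in Python, with `statistics.NormalDist` standing in for the printed table. This assumes the table lists cumulative area to the left of z, rounded to two decimals in z; some textbooks tabulate the area between 0 and z instead. The helper names `z_score` and `area_to_left` are hypothetical.

```python
from statistics import NormalDist

MU, SIGMA = 64.4, 2.4           # heights example from the earlier slide
standard_normal = NormalDist()  # mean 0, standard deviation 1

def z_score(x, mu, sigma):
    """Standardize x: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma

def area_to_left(z):
    """Area under the standard normal curve to the left of z
    (assumed here to be what the table lists)."""
    return standard_normal.cdf(z)

# Proportion of students shorter than 67 inches.
z = z_score(67, MU, SIGMA)
table_style_area = area_to_left(round(z, 2))  # tables index z to two decimals
direct_area = NormalDist(MU, SIGMA).cdf(67)   # same question, no standardizing

print(f"z = {z:.2f}")
print(f"area via standard normal: {table_style_area:.4f}")
print(f"area via original curve:  {direct_area:.4f}")
```

The small difference between the two areas comes only from rounding z to two decimals, as a printed table requires.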

9 Standardizing Normal Distributions

10 The Empirical Rule Revisited
Recall our discussion of the Empirical Rule, which states that almost all possible values of a bell-shaped distribution (which we now have a name for: the Normal Distribution) lie within 3 standard deviations of the mean mu. Theoretically, however, the curve extends indefinitely in both directions and never touches the axis. An example of a normally distributed variable is shown on pg. 332 and 333 in your text.
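As a rough check of the Empirical Rule, the following sketch computes the area within 1, 2, and 3 standard deviations of the mean under the standard normal curve, and shows the matching height ranges for the earlier example (mu = 64.4, sigma = 2.4). The function name `area_within` is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1
MU, SIGMA = 64.4, 2.4           # heights example from the earlier slide

def area_within(k):
    """Area under the standard normal curve between -k and +k."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    low, high = MU - k * SIGMA, MU + k * SIGMA
    print(f"within {k} sd ({low:.1f} to {high:.1f} inches): {area_within(k):.4f}")
```

The areas come out near 68%, 95%, and 99.7%, matching the rule as usually stated.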

11 Assessing Normality
Pearson’s Index of Skewness (I): the closer I is to zero, the less skewed, and the closer to normal, the data set. Recall that if I lies between -1 and +1, the distribution is considered to be approximately normally distributed.

So far in this class, I’ve told you that the variables we’ve been working with are normally distributed. But suppose you have your own set of data, you want to know whether it is normally distributed, and there is not a statistician to be found. How can you determine this? You could construct a histogram of the observations and observe the shape of the distribution (remember, it should be roughly bell-shaped if it is normally distributed). A serious drawback is that a relatively large sample is needed to judge the shape with reasonable accuracy. For a small sample, it is difficult to decide from a histogram whether a bell shape is present. The same goes for boxplots, dotplots, and stem-and-leaf plots. For small samples, we need a more sensitive technique.
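The slide names Pearson’s Index of Skewness without giving its formula. A common textbook form, assumed here, is I = 3(mean – median)/s, where s is the sample standard deviation. The sketch below applies it to two small made-up data sets; the data and the function name `pearson_skewness_index` are hypothetical.

```python
from statistics import mean, median, stdev

def pearson_skewness_index(data):
    """Pearson's index of skewness, assuming the form I = 3 * (mean - median) / s."""
    return 3 * (mean(data) - median(data)) / stdev(data)

# Hypothetical samples, invented for illustration only.
symmetric_sample = [60, 62, 63, 64, 64, 65, 66, 68]
skewed_sample = [1, 1, 2, 2, 3, 4, 20]

index_symmetric = pearson_skewness_index(symmetric_sample)
index_skewed = pearson_skewness_index(skewed_sample)

for name, index in (("symmetric", index_symmetric), ("skewed", index_skewed)):
    verdict = "approximately normal" if -1 < index < 1 else "significantly skewed"
    print(f"{name} sample: I = {index:.2f} ({verdict} by the -1 to +1 guideline)")
```

The guideline is only a screening check on symmetry; an index near zero does not by itself establish a bell shape.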

