The normal distribution

1 The normal distribution
The normal curve: shape defined by the standard deviation and the mean. Most frequency distributions have most of their values or observations situated around the mean, with fewer and fewer observations towards the extremes of the range. If n is large, the frequency polygons of many biological data distributions are “bell shaped” and look like the figure here. Two parameters define the curve: the standard deviation and the mean. The mathematician Gauss found that distributions of this shape, estimated from large samples of measurements drawn from a single population, often agree well with a model of this form. This is the most important distribution in statistics for two main reasons. First, we see it when a variable is measured for a large number of nominally identical objects and the variation may be assumed to be caused by a number of factors, each exerting a small positive or negative random influence on an individual object.
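The idea that many small random influences add up to a bell shape can be checked with a short simulation. This is only a sketch under stated assumptions: the function name `simulate_measurement`, the choice of 100 influences per object, the uniform range of -1 to 1 and the seed are all illustrative and do not come from the presentation.

```python
import random
import statistics

def simulate_measurement(n_factors, rng):
    """One object's value: the sum of many small positive or negative influences."""
    return sum(rng.uniform(-1.0, 1.0) for _ in range(n_factors))

rng = random.Random(42)  # fixed seed so the run is repeatable
values = [simulate_measurement(100, rng) for _ in range(5000)]

mean = statistics.fmean(values)
sd = statistics.pstdev(values)

# Share of simulated values lying within one standard deviation of the mean.
within_one_sd = sum(abs(v - mean) <= sd for v in values) / len(values)

print(f"mean={mean:.2f}  sd={sd:.2f}  within one sd={within_one_sd:.3f}")
```

If the simulated values are roughly normal, the printed share within one standard deviation should come out close to two thirds.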

2 The normal distribution Various weights of a group of female students
[Figure: frequency against weight (kg).] As an example, we can look at the weights of a group of female students, where the variation is caused by many factors such as age, diet, exercise, heights of parents, bone structure and so on. The second reason is that the properties of the normal distribution have very important applications in the statistical theory of drawing conclusions from sample data about the populations from which the samples were drawn. Not all continuous variables are normally distributed, however; some follow other distributions, such as the rectangular (continuous uniform) distribution.

3 The normal distribution
The equation of the normal curve allows us to compute the height of the curve (y) for a given value of x (a data point): $y = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-(x-\mu)^2 / (2\sigma^2)}$, where μ is the population mean and σ is the population standard deviation.
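The height calculation can be sketched in a few lines of Python. The function name `normal_curve_height` is an illustrative choice, and the formula used is the standard normal density with mean `mu` and standard deviation `sigma`.

```python
import math

def normal_curve_height(x, mu, sigma):
    """Height y of the normal curve at x, for mean mu and standard deviation sigma."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)

# Height at the mean of a curve with mu = 0 and sigma = 1.
print(round(normal_curve_height(0.0, 0.0, 1.0), 4))  # 0.3989
```

The height depends on x only through the squared distance from the mean, which is why the curve is symmetrical about the mean.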

4 The normal distribution
As the term (x-) becomes smaller, y becomes larger and is at a maximum when x = . X PROBABILITY 0.0 0.1 0.2 0.3 0.4 -5 -4 -3 -2 -1 1 2 3 4 5 s = 1 s = 1.5 s = 2 This results in a symmetrical curve in which the apex occurs when x is equal to the population mean. In reality, these parameters may vary, continuously generating an infinite variety of normal curves, because there are two parameters in the equation, mu and sigma. So for any given sigma there are an infinite number of normal curves, possibly depending on mu. And therefore, for any given mean (u), an infinity of normal curves is possible each with a different value of sigma. Here the graph shows normal curves for u = 0 and sigma = 1, 1.5 and 2

5 The normal distribution
As the term (x-) becomes smaller, y becomes larger and is at a maximum when x = . X PROBABILITY 0.0 0.1 0.2 0.3 0.4 -5 -4 -3 -2 -1 1 2 3 4 5 m = 0 m = 1 m = 2 Likewise, for any given sigma there are an infinite number of normal curves, possibly depending on mu. Here the graph shows normal curves for sigma = 1 and mu = 0, 1, and 2.

6 The normal distribution
A normal curve is symmetrical, with the axis of symmetry passing through the baseline where x = μ, in other words, through one of the parameters of the curve. The standard normal distribution is a normal distribution with a mean of zero and a standard deviation of 1. Theoretically, the two tails never touch the horizontal axis.

7 The normal distribution
If the vertical axis of the distribution is re-scaled by dividing by the number of observations, it effectively becomes a probability distribution or, strictly, a probability density. The total probability encompassed by the density is then 1.
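The re-scaling can be illustrated with a small frequency table. The counts and the 10 kg class width below are hypothetical values made up for this sketch. The second step, dividing by the class width as well, is an added assumption about what "strictly a density" means for grouped data; the slide itself only mentions dividing by the number of observations.

```python
# Hypothetical class frequencies for 10 kg wide weight classes (illustrative only).
class_width = 10.0
frequencies = [2, 8, 20, 30, 22, 12, 6]

n = sum(frequencies)  # total number of observations

# Dividing each frequency by n gives relative frequencies (probabilities).
relative = [f / n for f in frequencies]

# Dividing again by the class width gives a density: bar area, not bar height,
# now represents probability.
density = [p / class_width for p in relative]

total_probability = sum(relative)
total_area = sum(d * class_width for d in density)

print(total_probability, total_area)
```

Both totals come out at 1 up to floating-point rounding, in line with the statement that the total probability under the density is 1.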

8 The normal distribution
Total area under the curve is 100%. If we say that the total area under the curve is 100%, then one of the mathematical properties of the normal curve is that the area bounded by one standard deviation on either side of the central axis is approximately 68.26% of the total area. Or we can say that, if the total area is equal to 1, the area within one standard deviation on each side of the central axis (the mean) is 0.3413, and 0.3413 times 2 is equal to 0.6826.
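The area within a given number of standard deviations of the mean can also be computed directly rather than read from a table. The sketch below uses the error function from Python's standard library; the function name `area_within` is an illustrative choice.

```python
import math

def area_within(k):
    """Proportion of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"within {k} sd: {area_within(k):.4f}")
```

For one standard deviation this gives about 0.6827. The figure of 68.26% quoted above comes from doubling a table value rounded to four decimal places (0.3413), so the two agree up to rounding.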

9 Normal distribution Probability determination example
A normal distribution of values: mean = 50, standard deviation = 15. What is the probability of finding a value greater than 75? Because we have the exact mathematical equation describing the normal distribution, we can determine probabilities (or proportions) of the normal distribution quite easily. To do so, we need a value for x, the mean and the standard deviation of the normal distribution, and a table of “proportions of the normal curve”, which you have in your textbook by Zar: table B.2, “proportions of the curve (one-tailed)”, on page 483 of the second edition. We will use the following example: we have a normal distribution of values, with a mean of 50 and a standard deviation of 15. We are going to ask “what is the probability of finding a value greater than 75?”.

10 Normal distribution Z (standard) distribution
To convert a data point (Xi) to a Z score (Zi): $Z_i = (X_i - \bar{X}) / S$, where $\bar{X}$ is the mean and $S$ is the standard deviation. Since there is an infinite family of normal distributions, we need a way to find the probability without having to have a different table for each possible distribution. So what we do is convert our data point (which is 75) into what is called a “Z score” or a “standard score”. To convert a data point to a Z score, we use the formula shown here. We call the Z score a standard score because it has no units. Applying the formula to every data point will always produce a transformed distribution with a mean of zero and a standard deviation of 1. However, the shape of the distribution will not be affected by the transformation.
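The conversion is a one-line calculation. In the sketch below the function name `z_score` and the small data set `sample` are illustrative assumptions; only the example values 75, 50 and 15 come from the presentation.

```python
import statistics

def z_score(x, mean, sd):
    """Convert a data point to a standard (Z) score."""
    return (x - mean) / sd

# The example from the presentation: X = 75, mean = 50, standard deviation = 15.
print(round(z_score(75, 50, 15), 2))  # 1.67

# Hypothetical sample, used only to show the effect of transforming every point.
sample = [35, 50, 65, 50, 80, 20]
m = statistics.mean(sample)
s = statistics.stdev(sample)
z_values = [z_score(x, m, s) for x in sample]

print(statistics.mean(z_values), statistics.stdev(z_values))
```

The transformed values have a mean of zero and a standard deviation of 1 (up to floating-point rounding), whatever units the original data were in.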

11 Normal distribution Z (standard) distribution
Xi = 75, mean = 50, standard deviation = 15. Z score in the example is 1.67. Table value = 0.0475. Probability = 0.0475. In the example, Xi = 75, the mean is 50 and the standard deviation is 15. Our Z score is therefore 75 minus 50, divided by 15, which = 1.67. Now we go to table B.2 in Zar, where we have the normal distribution of Z scores. Go down the column in the table labelled “Z” to 1.6, then go across to the column under “7”. This gives the proportion of the normal distribution lying beyond Z = 1.67 in one tail. The table value is 0.0475. The proportion of the normal distribution greater than 1.67 is therefore 0.0475. The chance (or probability) of finding a value greater than 75 is therefore 0.0475.
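The table lookup can be cross-checked by computing the one-tailed proportion directly. This sketch uses the complementary error function from Python's standard library instead of table B.2; the function name `upper_tail` is an illustrative choice. The Z score is rounded to two decimals first, as the table lookup requires.

```python
import math

def upper_tail(z):
    """Proportion of the standard normal distribution lying above z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

z = round((75 - 50) / 15, 2)   # 1.67, rounded as for the table lookup
p = upper_tail(z)

print(z, round(p, 4))  # 1.67 0.0475
```

Using the unrounded Z score (1.666...) instead gives a slightly larger proportion, about 0.0478, which is why a computed answer can differ from a table answer in the last digit or two.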

12 Normal distribution Z (standard) distribution
Area under the two tails: (0.0475 x 2) = 0.095; 1 - 0.095 = 0.905. The proportion greater than 1.67 is 0.0475, but the proportion less than minus 1.67 is also 0.0475. The proportion between minus 1.67 and 1.67 can then be easily calculated if you remember that the area under the graph (total proportion) always has to be 1. The two tails together are (0.0475 x 2) = 0.095, which we subtract from 1 to get an answer of 0.905.
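The two-tail arithmetic can be sketched the same way. The function names `upper_tail` and `central_area` are illustrative, and the calculation relies only on the symmetry of the curve and the total area of 1.

```python
import math

def upper_tail(z):
    """Proportion of the standard normal distribution lying above z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def central_area(z):
    """Proportion lying between -z and +z: total area 1 minus both tails."""
    return 1.0 - 2.0 * upper_tail(z)

both_tails = 2.0 * upper_tail(1.67)
print(round(both_tails, 3), round(central_area(1.67), 3))  # 0.095 0.905
```

Because the curve is symmetrical, the lower tail never has to be looked up separately; doubling the upper tail is enough.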

