Published by Jessica Banks. Modified over 6 years ago.
1
Normal Distribution Prepared by: Ameer Sameer Hamood
University of Babylon Information technology - information networks
2
Normal Distributions
The most important probability distribution in statistics is the normal distribution. A normal distribution is a continuous probability distribution for a random variable, x. The graph of a normal distribution is called the normal curve.
3
Normal Distributions
Many things closely follow a Normal Distribution:
* heights of people
* size of things produced by machines
* errors in measurements
* blood pressure
* marks on a test
* Social Network
* Cryptography
* Telecommunications
* Internet Of Things
4
Properties of Normal Distributions
* The mean, median, and mode are equal.
* The normal curve is bell-shaped and symmetric about the mean.
* The total area under the curve is equal to one.
* The normal curve approaches, but never touches, the x-axis as it extends farther and farther away from the mean.
5
Properties of Normal Distributions
* symmetry about the center
* 50% of values less than the mean and 50% greater than the mean
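The properties listed above can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the parameters μ = 3.5 and σ = 1.3 are illustrative assumptions, and the helper `area_under_curve` is a hypothetical name introduced only for this example.

```python
from statistics import NormalDist

# Illustrative parameters (an assumption); any mean and positive
# standard deviation would behave the same way.
dist = NormalDist(mu=3.5, sigma=1.3)

# Mean, median, and mode coincide for a normal distribution.
same_center = dist.mean == dist.median == dist.mode

# Symmetry: half of the probability lies below the mean.
below_mean = dist.cdf(dist.mean)


def area_under_curve(d, half_width_sds=8.0, steps=10_000):
    """Approximate the total area under the density with the trapezoidal
    rule over mean +/- half_width_sds standard deviations."""
    lo = d.mean - half_width_sds * d.stdev
    hi = d.mean + half_width_sds * d.stdev
    h = (hi - lo) / steps
    total = 0.5 * (d.pdf(lo) + d.pdf(hi))
    for i in range(1, steps):
        total += d.pdf(lo + i * h)
    return total * h


total_area = area_under_curve(dist)
print(f"centers equal: {same_center}")
print(f"P(X < mean) = {below_mean:.3f}, total area = {total_area:.6f}")
```

The integration range is finite because the curve never reaches the x-axis; the part of the area left outside ±8 standard deviations is negligible.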
6
The normal (Gaussian) distribution
“µ” is the Greek letter “mu,” which denotes the mean.
“σ” is the Greek letter “sigma,” which denotes the standard deviation.
Note: In a normal distribution, only 2 parameters are needed, namely μ and σ² (the variance, which is the square of the standard deviation σ).
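For reference, the density curve determined by these two parameters is the standard Gaussian formula. It is supplied here as a well-known result and is not written out in the slide text itself:

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),
\qquad -\infty < x < \infty,\ \sigma > 0
```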
7
PARAMETER
The normal distribution can be completely specified by two parameters:
1. Mean
2. Standard deviation
If the mean and standard deviation are known, the whole distribution is determined, so for data that really are normal one essentially knows as much as if one had access to every point in the data set.
8
PARAMETER
A normal distribution can have any mean and any positive standard deviation. The mean gives the location of the line of symmetry. The standard deviation describes the spread of the data.
[Figure: two normal curves with their inflection points marked. First curve: mean μ = 3.5, standard deviation σ ≈ 1.3. Second curve: mean μ = 6, standard deviation σ ≈ 1.9.]
9
Which curve has the greater mean?
Example: Which curve has the greater mean? Which curve has the greater standard deviation?
[Figure: two normal curves, A and B, on an x-axis marked from 1 to 13.]
Answer:
1. The line of symmetry of curve A occurs at x = 5. The line of symmetry of curve B occurs at x = 9. Curve B has the greater mean.
2. Curve B is more spread out than curve A, so curve B has the greater standard deviation.
10
Curves with different means, different standard deviations
PARAMETER
[Figure: curves with different means, same standard deviation]
[Figure: curves with different means, different standard deviations]
11
The Standard Normal Distribution
It makes life a lot easier if we standardize our normal curve so that it has a mean of zero and a standard deviation of 1 unit. In this standardized situation, μ = 0 and σ = 1, and the curve is called the standard normal curve.
[Figure: standard normal curve, μ = 0, σ = 1]
12
The Standard Normal Distribution
The normal random variable of a standard normal distribution is called a standard score or a z-score. Every normal random variable X can be transformed into a z-score via the following equation:
z = (X - μ) / σ
where:
* z is the "z-score" (standard score),
* X is a normal random variable (the value to be standardized),
* μ is the mean of X, and
* σ is the standard deviation of X.
13
Example Say μ=2 and σ =1/3 in a normal distribution.
The graph of the normal distribution is as follows:
14
The following graph represents the same information, but it has been standardized so that μ = 0 and σ = 1
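Since the two graphs are not reproduced in this text, the sketch below shows what "the same information" means in practice: a value x under the original curve (μ = 2, σ = 1/3) and its z-score under the standard curve have the same cumulative probability. It uses Python's standard-library `statistics.NormalDist`; the sample points and the helper name `z_score` are assumptions chosen for illustration.

```python
from statistics import NormalDist

mu, sigma = 2.0, 1.0 / 3.0        # parameters from the example
X = NormalDist(mu, sigma)         # the original normal distribution
Z = NormalDist()                  # standard normal: mean 0, sd 1


def z_score(x, mu, sigma):
    """Standardize x using z = (x - mu) / sigma."""
    return (x - mu) / sigma


# Illustrative points (an assumption, not taken from the slides).
for x in (1.5, 2.0, 2.5):
    z = z_score(x, mu, sigma)
    print(f"x = {x}: z = {z:+.2f}, "
          f"P(X <= x) = {X.cdf(x):.4f}, P(Z <= z) = {Z.cdf(z):.4f}")
```

The two probability columns agree for every point, which is why a single standard normal table is enough for any normal distribution.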
15
Standard normal vs Normal Distribution
16
Normal Distributions empirical rule
Because of its unique bell shape, probabilities for the normal distribution follow the empirical rule, also called the 68-95-99.7 rule. Clearly, given a normal distribution, most outcomes will be within 3 standard deviations of the mean.
The figure illustrates all three components of the Empirical Rule: about 68% of the values lie within 1 standard deviation of the mean, about 95% within 2 standard deviations, and about 99.7% within 3 standard deviations.
So many of the values (about 68%) lie within 1 standard deviation of the mean because, when the data are bell-shaped, the majority of the values are mounded up in the middle, close to the mean. Adding another standard deviation on either side of the mean increases the percentage from 68 to 95, which is a big jump and gives a good idea of where “most” of the data are located. Most researchers stay with the 95% range (rather than 99.7%) for reporting their results, because increasing the range to 3 standard deviations on either side of the mean (rather than just 2) doesn’t seem worthwhile, just to pick up another 4.7% of the values.
The Empirical Rule tells you approximately what percentage of values are within a certain range of the mean. These results are approximations only, and they only apply if the data follow a normal distribution. However, the Empirical Rule is an important result in statistics because the concept of “going out about two standard deviations to get about 95% of the values” is one that you see mentioned often with confidence intervals and hypothesis tests.
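The three percentages can be reproduced from the standard normal cumulative distribution function. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `prob_within` is an assumption made for the example.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def prob_within(k):
    """Probability that a normal value lies within k standard
    deviations of its mean."""
    return Z.cdf(k) - Z.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {prob_within(k):.4f}")
```

The printed values are close to 0.68, 0.95, and 0.997, and the gap between the last two is the roughly 4.7% mentioned above.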
17
Mean and Standard Deviation Example
Example At the New Age Information Corporation, the ages of all new employees hired during the last 5 years are normally distributed. Within this curve, 95.4% of the ages, centered about the mean, are between 24.6 and 37.4 years. Find the mean age and the standard deviation of the data.
18
Mean and Standard Deviation Example
Solution: As seen earlier, 95.4% implies a span of 2 standard deviations on either side of the mean. The mean age is symmetrically located between -2 standard deviations (24.6) and +2 standard deviations (37.4), so the mean is (24.6 + 37.4)/2 = 31 years. From 31 to 37.4 (a distance of 6.4 years) is 2 standard deviations. Therefore, 1 standard deviation is (6.4)/2 = 3.2 years.
19
Mean and Standard Deviation Example
Questions
95% of students at school weigh between 62 kg and 90 kg. Assuming this data is normally distributed, what are the mean and standard deviation?
Answer: The mean is halfway between 62 kg and 90 kg:
Mean = (62 kg + 90 kg)/2 = 76 kg
95% is 2 standard deviations either side of the mean (a total of 4 standard deviations), so:
1 standard deviation = (90 kg - 62 kg)/4 = 28 kg/4 = 7 kg
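Both worked examples use the same arithmetic: the mean is the midpoint of the interval, and the interval's width equals 2k standard deviations, where k is the number of standard deviations on each side. A minimal sketch follows; the function name `mean_and_sd_from_interval` is a hypothetical helper, and k = 2 is the empirical-rule approximation used in the examples.

```python
def mean_and_sd_from_interval(low, high, k):
    """Recover (mean, sd) from a symmetric interval that spans
    mean - k*sd to mean + k*sd."""
    mean = (low + high) / 2
    sd = (high - low) / (2 * k)
    return mean, sd


# Ages of new employees: 95.4% between 24.6 and 37.4 years (k = 2).
age_mean, age_sd = mean_and_sd_from_interval(24.6, 37.4, k=2)

# Student weights: 95% between 62 kg and 90 kg (k = 2).
weight_mean, weight_sd = mean_and_sd_from_interval(62, 90, k=2)

print(f"ages:    mean = {age_mean:.1f} years, sd = {age_sd:.1f} years")
print(f"weights: mean = {weight_mean:.1f} kg, sd = {weight_sd:.1f} kg")
```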
20
Questions
A machine produces electrical components. 99.7% of the components have lengths between 1.176 cm and 1.224 cm. Assuming this data is normally distributed, what are the mean and standard deviation?
Answer: The mean is halfway between 1.176 cm and 1.224 cm:
Mean = (1.176 cm + 1.224 cm)/2 = 1.200 cm
99.7% is 3 standard deviations either side of the mean (a total of 6 standard deviations), so:
1 standard deviation = (1.224 cm - 1.176 cm)/6 = 0.048 cm/6 = 0.008 cm
21
Examples
We can take any Normal Distribution and convert it to The Standard Normal Distribution.
Example: Travel Time
A survey of daily travel time had these results (in minutes): 26, 33, 65, 28, 34, 55, 25, 44, 50, 36, 26, 37, 43, 62, 35, 38, 45, 32, 28, 34
The Mean is 38.8 minutes, and the Standard Deviation is 11.4 minutes. Convert the values to z-scores ("standard scores").
To convert 26: first subtract the mean: 26 - 38.8 = -12.8, then divide by the Standard Deviation: -12.8/11.4 = -1.12. So 26 is -1.12 Standard Deviations from the Mean.
Here are the first three conversions:
Original Value    Calculation              Standard Score (z-score)
26                (26 - 38.8) / 11.4 =     -1.12
33                (33 - 38.8) / 11.4 =     -0.51
65                (65 - 38.8) / 11.4 =     +2.30
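The travel-time conversion can be reproduced with Python's standard library. This sketch assumes the population standard deviation (`statistics.pstdev`, dividing by n), which is the choice that matches the stated 11.4 minutes; the sample standard deviation (dividing by n - 1) would come out at about 11.7 instead.

```python
from statistics import fmean, pstdev

# Daily travel times in minutes, as listed in the example.
times = [26, 33, 65, 28, 34, 55, 25, 44, 50, 36,
         26, 37, 43, 62, 35, 38, 45, 32, 28, 34]

mean = fmean(times)   # arithmetic mean
sd = pstdev(times)    # population standard deviation (assumption, see above)

# Standardize every value: z = (x - mean) / sd.
z_scores = [(t - mean) / sd for t in times]

print(f"mean = {mean:.1f} min, sd = {sd:.1f} min")
for t, z in list(zip(times, z_scores))[:3]:
    print(f"{t}: z = {z:+.2f}")
```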
22
Examples Answer
23
Examples
24
Normal Probabilities
We are often interested in the probability that z takes on values between z0 and z1.
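Such a probability is the difference of two cumulative areas, P(z0 < Z < z1) = Φ(z1) - Φ(z0), and an upper-tail probability is P(Z > z0) = 1 - Φ(z0). A minimal sketch with Python's standard-library `statistics.NormalDist` follows; the helper names and the numeric arguments are assumptions chosen for illustration, not values from the slides.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def prob_between(z0, z1):
    """P(z0 < Z < z1) as a difference of cumulative areas."""
    return Z.cdf(z1) - Z.cdf(z0)


def prob_above(z0):
    """Upper-tail probability P(Z > z0)."""
    return 1.0 - Z.cdf(z0)


# Illustrative inputs only.
print(f"P(-1 < Z < 2) = {prob_between(-1, 2):.4f}")
print(f"P(Z > 1.5)    = {prob_above(1.5):.4f}")
```

For a non-standard normal variable, the same functions apply after converting the endpoints to z-scores with z = (X - μ) / σ.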
25
What is the probability of an infant weighing more than 5000g?
26
Finding z-Scores Example:
Find the z-score that corresponds to a cumulative area of 0.9973.
Find the z-score by locating 0.9973 in the body of the Standard Normal Table (Appendix B). The values at the beginning of the corresponding row and at the top of the column give the z-score. The z-score is 2.78.
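The table lookup is the inverse of the cumulative distribution function. As a rough cross-check, the sketch below uses `NormalDist.inv_cdf` from Python's standard library with a cumulative area of 0.9973, which is the standard-table entry for z = 2.78.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

area = 0.9973           # cumulative area to the left of z
z = Z.inv_cdf(area)     # inverse CDF replaces the table lookup

print(f"z = {z:.2f}")
print(f"check: area to the left of 2.78 = {Z.cdf(2.78):.4f}")
```

The computed z carries more decimal places than the table provides, so it is rounded to two decimals for comparison.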
27
Examples Edge Perspectives in Social Network
28
Examples Discovery Method in the Internet of Things
29
Examples Discovery Method in the Internet of Things
30
Examples Anomaly detection and a simple algorithm with probabilistic approach.
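The slide's own algorithm is not preserved in this text. As one possible illustration of a simple probabilistic approach, the sketch below fits a normal distribution to baseline readings and flags a new value when its two-sided tail probability falls below a threshold. The data, the threshold `eps`, and the function name `is_anomaly` are all assumptions invented for this example.

```python
from statistics import NormalDist

# Hypothetical baseline readings assumed to be "normal" behaviour.
baseline = [10.1, 9.8, 10.0, 10.2, 9.9, 10.0, 10.1, 9.9, 10.0, 10.0]

# Estimate mu and sigma from the baseline (uses the sample standard deviation).
model = NormalDist.from_samples(baseline)


def is_anomaly(x, model, eps=0.001):
    """Flag x if a value at least this far from the mean would occur
    with probability below eps under the fitted normal model."""
    left = model.cdf(x)
    two_sided_tail = 2.0 * min(left, 1.0 - left)
    return two_sided_tail < eps


for reading in (10.1, 10.5, 15.0):
    print(reading, "anomaly" if is_anomaly(reading, model) else "ok")
```

The threshold trades false alarms against missed anomalies, and the approach is only as good as the assumption that the baseline is roughly normal.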