HW solutions are on the web. See the website for how to calculate probabilities with Minitab, Excel, and TI calculators.
Determining normal probabilities: Suppose X has a normal distribution with mean 5 and std dev 2. Notation: X~N(5,4) [the notation is N(mean, variance), so the second argument is 2^2 = 4]. What’s the probability that X is less than 4?
[Figure: normal density of X. Pr(X<4) = area under the curve to the left of x=4.]
What’s Pr(X < 4)? Draw (previous page). Center and scale: Pr(X<4) = Pr( (X-5)/2 < (4-5)/2 ) = Pr( Z < -1/2 ). Look up (appendix 1): Pr(Z<-1/2) = 0.3085
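The center-and-scale calculation can be checked numerically. The following is a minimal sketch, assuming Python 3.8 or later and its standard-library `statistics.NormalDist` (not one of the tools named in these notes); the variable names are illustrative only.

```python
from statistics import NormalDist

# X ~ N(5, 4): NormalDist takes the standard deviation (2), not the variance (4).
X = NormalDist(mu=5, sigma=2)
Z = NormalDist()  # standard normal, N(0, 1)

p_direct = X.cdf(4)                   # Pr(X < 4) computed directly
p_standardized = Z.cdf((4 - 5) / 2)   # Pr(Z < -1/2) after centering and scaling

print(round(p_direct, 4), round(p_standardized, 4))
```

Both numbers should agree, at roughly 0.3085, which is the value a standard normal table gives for z = -0.50.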
“Centering and Scaling?” Suppose X~N(mu,sigma^2). Why does (X-mu)/sigma have a N(0,1) distribution? (X-mu) part is “centering” and /sigma part is “scaling”. Idea: All normal distributions have the same shape. Centering and scaling just relabels the x and y axes. The area under the curve (and the probabilities) remains the same.
[Figure: two panels. X density: Pr(X<4) = area under the curve to the left of x=4. Centered and scaled X density: Pr(Z<-0.5) = area under the curve to the left of -0.5, the same area as in the first panel.]
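The claim that centering and scaling leaves the area unchanged can be spot-checked on a few normal distributions. This is only an illustrative sketch using Python's standard-library `statistics.NormalDist`; the three (mean, std dev, cutoff) cases are made up for the demonstration, each with a cutoff half a standard deviation below its mean.

```python
from statistics import NormalDist

Z = NormalDist()  # N(0, 1)

# Hypothetical (mean, std dev, cutoff x) triples; every cutoff is mu - sigma/2.
cases = [(5, 2, 4), (100, 15, 92.5), (-3, 0.5, -3.25)]

areas = []
for mu, sigma, x in cases:
    area_x = NormalDist(mu, sigma).cdf(x)   # area left of x under N(mu, sigma^2)
    area_z = Z.cdf((x - mu) / sigma)        # area left of the centered, scaled cutoff
    areas.append((area_x, area_z))
    print(mu, sigma, x, round(area_x, 4), round(area_z, 4))
```

Every row should show the same area, because each cutoff maps to z = -0.5 after centering and scaling.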
Example 2: X ~ N(2,9). What’s Pr(1<X<4)? [Figure: want the area between the bars at x=1 and x=4.] First let’s do this with the tables.
Pr(1<X<4) = Pr[(1-2)/3 < Z < (4-2)/3] (where Z~N(0,1)) = Pr[Z < (4-2)/3] - Pr[Z < (1-2)/3] = Pr(Z < 2/3) - Pr(Z < -1/3) ≈ 0.7475 - 0.3694 ≈ 0.378. EVEN IF YOU’LL ALWAYS USE CALCULATORS, MINITAB, EXCEL, OR MATLAB TO DO THESE PROBLEMS, YOU’LL NEED TO KNOW HOW TO DO CALCULATIONS LIKE THE ONES ABOVE… The purpose of all this is to get to an expression that only uses Pr(Z<a) where Z~N(0,1), because that is what the tables give.
Using Excel or Minitab, the only step that is necessary is to get the probability in terms of CDFs (i.e. Pr[X <= k]). Pr(1<X<4) = Pr(X<4) - Pr(X<1), where X~N(2,3^2), ≈ 0.7475 - 0.3694 ≈ 0.378 (Do demo in class). Three probabilities to memorize: Pr(Z < 2.33) = 99%, Pr(Z < 1.96) = 97.5%, Pr(Z < 1.28) = 90%. Remember: Z~N(0,1), “the standard normal”.
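The same CDF-difference step works in any software that exposes a normal CDF. As a hedged illustration (Python's standard-library `statistics.NormalDist` stands in here for Excel or Minitab, which are the tools these notes actually use), the interval probability and the three values to memorize can be checked like this:

```python
from statistics import NormalDist

X = NormalDist(mu=2, sigma=3)      # X ~ N(2, 3^2); sigma is the std dev
p_between = X.cdf(4) - X.cdf(1)    # Pr(1 < X < 4) = Pr(X < 4) - Pr(X < 1)
print(round(p_between, 3))

Z = NormalDist()                   # standard normal
memorized = {z: Z.cdf(z) for z in (2.33, 1.96, 1.28)}
for z, p in memorized.items():
    print(z, round(p, 3))
```

The interval probability should come out near 0.378, and the three standard normal values should round to about 0.99, 0.975, and 0.90.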
Later in the course, we will need to be able to do things like the following: Let X~N(10,16). Find an a such that Pr(X < a) = 0.80. [Figure: plot of x versus Pr(X<x) when X~N(10,4^2). Probabilities are on one axis; a is the value of x at which Pr(X<x) = 0.80.]
Let X~N(10,16). Find an a such that Pr(X < a) = 0.80. Pr(X < a) = Pr[(X-10)/4 < (a-10)/4] = Pr[Z < (a-10)/4] = 0.80. Using the table “backwards” we find that Pr(Z < 0.84) = 0.80. As a result, (a-10)/4 = 0.84. So, a = 10 + 4(0.84) = 13.36. This is called an inverse probability problem.
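Software usually provides the "backwards" table lookup as an inverse CDF. A minimal sketch, assuming Python's standard-library `statistics.NormalDist` and its `inv_cdf` method (variable names are illustrative):

```python
from statistics import NormalDist

X = NormalDist(mu=10, sigma=4)       # X ~ N(10, 16)

a_direct = X.inv_cdf(0.80)           # solve Pr(X < a) = 0.80 in one step

z80 = NormalDist().inv_cdf(0.80)     # table "backwards": Pr(Z < z80) = 0.80
a_via_z = 10 + 4 * z80               # undo the centering and scaling

print(round(z80, 2), round(a_direct, 2), round(a_via_z, 2))
```

The z value should be about 0.84 and both routes should give a of roughly 13.37; the hand calculation with the rounded table value 0.84 gives 13.36.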
The Normal Distribution is Pervasive Examples of things that are normally distributed: –Heights, weights, abilities, many, many other measurements –In general, when a quantity is the result of a combination of many factors and influences, samples of that quantity are very likely to be approximately normally distributed. –Why?
A store keeps track of the average amount spent by people each day. Let X_i = average amount spent on day i. It turns out that there is a good reason to believe that X_i has a normal distribution!!! This reason is the CENTRAL LIMIT THEOREM
Central Limit Theorem: Let X_1,…,X_n be n independent random variables each with constant mean mu and constant variance sigma^2. Then, as n gets large, (X_1+…+X_n)/n ~ N(mu, sigma^2/n) and (X_1+…+X_n) ~ N(n*mu, n*sigma^2)
What does “large n” mean? What “large” is depends on the distribution of the X_i. If the X_i’s are already normal, then the result is true for any n. If the X_i’s have a symmetric distribution, then n of at least 3 is probably large enough. If the X_i’s have a skewed distribution, then n of 20 or 30 is probably large enough.
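A small simulation can make the "n of 20 or 30" guideline concrete for a skewed distribution. The sketch below is an illustration only: the choice of an exponential distribution with mean 1, the cutoff 1.2, the seed, and the number of repetitions are all assumptions made for the demo, not part of these notes.

```python
import random
from statistics import NormalDist, fmean, stdev

random.seed(0)   # fixed seed so the run is repeatable

n = 30           # observations per average
reps = 5000      # number of simulated averages

# Each X_i is exponential with mean 1 and variance 1 (a right-skewed distribution).
means = [fmean(random.expovariate(1.0) for _ in range(n)) for _ in range(reps)]

# CLT approximation for the average: N(mu, sigma^2 / n) with mu = 1, sigma^2 = 1.
clt = NormalDist(mu=1.0, sigma=(1.0 / n) ** 0.5)

sim_mean = fmean(means)
sim_sd = stdev(means)
cutoff = 1.2
sim_p = sum(m < cutoff for m in means) / reps   # simulated Pr(average < 1.2)
clt_p = clt.cdf(cutoff)                         # normal approximation to the same probability

print(round(sim_mean, 3), round(sim_sd, 3), round(clt.stdev, 3))
print(round(sim_p, 3), round(clt_p, 3))
```

The simulated mean and standard deviation of the averages should land close to 1 and 1/sqrt(30), and the simulated probability should be close to the normal approximation. Rerunning with a much smaller n (say 3) should show a visibly worse match for this skewed distribution.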
Example: Suppose the amount of potassium in a banana is normally distributed with mean 630mg and standard deviation 40mg. You eat 3 bananas a day. Let T = total amount of potassium you eat in a day. What is the probability that T < 1800mg? By the central limit theorem (treating the three bananas as independent), T~N[3*630, 3*(40^2)] = N(1890, 4800). Want Pr(T < 1800) = Pr[ (T-1890)/(sqrt(3)*40) < (1800-1890)/(sqrt(3)*40) ] = Pr[ Z < -1.30 ] ≈ 0.097 (see p672-3 in book for table or use Calculator or Excel or Minitab)
Area under curve to left of line is Pr(T < 1800)
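The banana calculation can be reproduced in a few lines. This is a sketch under the assumption that the three bananas are independent, using Python's standard-library `statistics.NormalDist`; the variable names are illustrative. Evaluating (1800 - 1890)/(sqrt(3)*40) numerically gives about -1.30.

```python
from math import sqrt
from statistics import NormalDist

mu_one, sd_one, k = 630, 40, 3     # per-banana mean, per-banana std dev, bananas per day

# Sum of k independent N(mu, sigma^2) variables: mean k*mu, variance k*sigma^2.
T = NormalDist(mu=k * mu_one, sigma=sqrt(k) * sd_one)

z = (1800 - T.mean) / T.stdev      # standardized cutoff
p = T.cdf(1800)                    # Pr(T < 1800)

print(round(z, 2), round(p, 3))
```

The standardized cutoff should be about -1.30 and the probability about 0.097.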