Revision ‘ Do students in my class play too much computer game on lecture nights?’ This was a question that Dr Goy wondered with respect to her statistics.


1 Revision ‘Do students in my class play too many computer games on lecture nights?’ This was a question that Dr Goy asked about her statistics students. How many students did she pick in her study? What % of the class does not play computer games on lecture nights? What is the probability of a student playing at most 2 hours of computer games on lecture nights? What is the probability of a student playing at least 4 hours of computer games on lecture nights?
Hours:  0 1 2 3 4 5 6
Number: 2 3 2 0 3 2 1

2 Weeks 6 and 7 Continuous Random Variables and Probability Distributions Statistics for Social Sciences

3 Chapter Goals After completing this chapter, you should be able to: explain what a continuous random variable is; translate normal distribution problems into standardized normal distribution problems; find probabilities using a normal distribution table; find a value given the probability for an event. Note: This topic forms the foundation for subsequent topics in relation to statistical inference and hypothesis testing.

4 Continuous Probability Distributions A continuous random variable is a variable that can assume any value in an interval. Its characteristics are exactly like those of continuous quantitative data. Examples: thickness of an item; time required to complete a task; price of a house; height, in inches. These can potentially take on any value, depending only on the ability to measure accurately. Probability is assigned to an interval and not to a particular value, for instance P(a < x) or P(a ≤ x). Note that for a continuous random variable, P(x = a) = 0.

5 Probability Density Function The probability density function, f(x), of random variable X has the following properties: 1. f(x) ≥ 0 for all values of x. 2. The area under the probability density function f(x) over all values of the random variable X is equal to 1.0. 3. The probability that X lies between two values is the area under the density function graph between the two values.

6 Probability Density Function 4. The cumulative distribution function F(x_0) is the area under the probability density function f(x) from the minimum x value up to x_0: $F(x_0) = \int_{x_m}^{x_0} f(x)\,dx$, where $x_m$ is the minimum value of the random variable x.

7 Probability as an Area The shaded area under the curve of f(x) between a and b is the probability that X is between a and b: P(a ≤ X ≤ b) = P(a < X < b). (Note that the probability of any individual value is zero.) NOTE: P(x = 56) = 0; P(x = 18) = 0

8 The Normal Distribution ‘Bell shaped’. Symmetrical. Mean, median and mode are equal. Location is determined by the mean, μ. Spread is determined by the standard deviation, σ. The random variable has an infinite theoretical range: −∞ to +∞. [Figure: bell curve of f(x) against x, centred at μ with spread σ; Mean = Median = Mode]

9 Relationship between dispersion and standard deviation – Empirical Rule When a distribution is normal with mean μ = 50 and standard deviation σ = 4: about 68% of the area under the normal curve lies within 1 s.d. of the mean (X between 46 and 54); about 95.5% lies within 2 s.d. of the mean (42 to 58); about 99.7% lies within 3 s.d. of the mean (38 to 62).
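The empirical rule figures can be checked numerically. The following is a minimal sketch, assuming Python 3.8+ and the standard-library `statistics.NormalDist`; the helper name `area_within` is invented for this illustration and does not come from the slides.

```python
from statistics import NormalDist

# Example distribution from the slide: mean 50, standard deviation 4.
dist = NormalDist(mu=50, sigma=4)


def area_within(k: int) -> float:
    """Area under the normal curve within k standard deviations of the mean."""
    lower = dist.mean - k * dist.stdev
    upper = dist.mean + k * dist.stdev
    return dist.cdf(upper) - dist.cdf(lower)


for k in (1, 2, 3):
    lo, hi = 50 - 4 * k, 50 + 4 * k
    print(f"within {k} s.d. ({lo} to {hi}): {area_within(k):.4f}")
```

The printed areas should be close to the 68%, 95.5% and 99.7% quoted on the slide.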

10 Many Normal Distributions There are an unlimited number of normal distributions. The shape of the probability density function is determined by μ and σ. By varying the parameters μ and σ, we obtain different normal distributions. The area (probability) covered under the normal curve over a given interval will change as well.

11 The Normal Probability Density Function The area under the normal curve can be determined from the formula for the normal probability density function: $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / (2\sigma^2)}$ where e = the mathematical constant approximated by 2.71828; π = the mathematical constant approximated by 3.14159; μ = the population mean; σ = the population standard deviation; x = any value of the continuous variable, −∞ < x < ∞.
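As a hedged illustration, the density formula can be written out by hand and compared with the standard library. This sketch assumes Python 3.8+; `normal_pdf` is a name made up here, and the values μ = 50, σ = 4 are borrowed from the empirical-rule slide purely as an example.

```python
import math
from statistics import NormalDist


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Height of the normal curve at x for mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)


# Compare the hand-written formula with the library implementation.
for x in (42, 46, 50, 54, 58):
    by_hand = normal_pdf(x, mu=50, sigma=4)
    by_library = NormalDist(mu=50, sigma=4).pdf(x)
    print(f"x={x}: formula={by_hand:.6f} library={by_library:.6f}")
```

Note that the formula gives the height of the curve; probabilities are areas under it, which is why the later slides work with the cumulative table rather than with f(x) directly.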

12 The Standard Normal Distribution It is time consuming and impractical to tabulate probabilities for every conceivable combination of values of μ and σ for a continuous random variable. OUR TASK: Transform the distribution of the continuous random variable (x) into a single normal distribution, i.e. the standard normal distribution with mean μ = 0 and σ = 1. The standard normal distribution is the normal distribution of the standard variable z (called “standard score” or “z-score”).

13 The Normal Distribution Shape Changing μ shifts the distribution left or right. Changing σ increases or decreases the spread. Given the mean μ and variance σ², we define the normal distribution using the notation $X \sim N(\mu, \sigma^2)$.

14 The Standardized Normal Any normal distribution (with any mean and variance combination) can be transformed into the standardized normal distribution (Z), with mean 0 and variance 1. We need to transform X units into Z units by subtracting the mean of X and dividing by the standard deviation of X: $Z = \frac{X - \mu}{\sigma}$, so that $Z \sim N(0, 1)$.

15 Example If X is distributed normally with mean of 100 and standard deviation of 50, the Z value for X = 200 is $Z = \frac{X - \mu}{\sigma} = \frac{200 - 100}{50} = 2.0$. This says that X = 200 is two standard deviations (2 increments of 50 units) above the mean of 100. If X = 80, Z = (80 − 100)/50 = −0.4. If X = 250, Z = (250 − 100)/50 = 3. If X = 100, Z = (100 − 100)/50 = 0.
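The standardization step is a one-line calculation. A minimal sketch in Python, where the function name `z_score` is an assumption of this example rather than anything defined in the slides:

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma


# Values of X from a normal distribution with mean 100 and standard deviation 50.
for x in (200, 80, 250, 100):
    print(f"X = {x}: Z = {z_score(x, mu=100, sigma=50)}")
```

A positive result means the value is above the mean, a negative result means it is below, and zero means it equals the mean.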

16 Probability as Area Under the Curve The total area under the curve is 1.0, and the curve is symmetric about μ, so half (0.5) is above the mean and half is below.

17 Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall Ch. 5-17

18 The Standardized Normal Table Example: P(Z < −2.00) = 0.0228. P(Z < −1.36) = 0.0869; this value is from the standard normal distribution table. Remember: P(Z > 1.36) = 0.0869. WHY? Further examples: P(Z < −0.46) = 0.3228; P(Z < −3.18) = 0.0007.

19 Let the random variable Z follow a standard normal distribution. Find (i) P(Z < −2.06) (ii) P(Z < −0.70) (iii) P(Z > 0.70) (iv) P(Z > 2.38). If the probability is 0.70, Z must be less than what number? (In other words, P(Z < ?) = 0.70.) P(Z < a) = 0.25. What is the value of a?

20 Appendix Table IV – cumulative normal distribution The Standardized Normal table in the textbook (Appendix Table IV) shows values of the cumulative normal distribution function. For a given Z-value a, the table shows F(a) (the area under the curve from negative infinity to a).

21 Using cumulative standard normal distribution table P(Z < 1.84) = 0.9671, the total probability to the left of 1.84. P(Z < 0.93) = 0.8238; P(Z < 3.07) = 0.9989; P(Z < 1.65) = 0.9505. How about P(Z ≥ 1.65)?
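Table look-ups of this kind can be reproduced in software. The sketch below assumes Python 3.8+ and uses `statistics.NormalDist` in place of the printed table; `table_value` is a hypothetical helper that rounds to four decimals the way a typical table does.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def table_value(z: float) -> float:
    """Cumulative probability P(Z < z), rounded to four decimals like a printed table."""
    return round(STANDARD_NORMAL.cdf(z), 4)


for z in (1.84, 0.93, 1.65, -2.00, -1.36):
    print(f"P(Z < {z:+.2f}) = {table_value(z):.4f}")

# Symmetry of the curve: the upper tail beyond +z equals the lower tail below -z.
upper_tail = 1.0 - STANDARD_NORMAL.cdf(1.36)
lower_tail = STANDARD_NORMAL.cdf(-1.36)
print(f"P(Z > 1.36) = {upper_tail:.4f}, P(Z < -1.36) = {lower_tail:.4f}")
```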

22 The Standardized Normal Table What is P(Z > 2.00)? Example: if P(Z < 2.00) = 0.9772, then P(Z > 2.00) = 1 − 0.9772 = 0.0228. Conversely, knowing the area for P(Z > 2.00), we can work out P(Z ≤ 2.00) = 1 − 0.0228 = 0.9772. Alternatively, by symmetry, P(Z > 2.00) = P(Z < −2.00) = 0.0228. Find P(Z ≥ 3.12).

23 FINDING PROBABILITY BETWEEN ANY 2 POINTS For instance, find the area for P(−1.75 < Z < 0.85). Worked example: P(−1.36 < Z < 2.14) = P(Z < 2.14) − P(Z < −1.36) = 0.9838 − 0.0869 = 0.8969

24 P(−2.10 < Z < 2.15) = P(Z < 2.15) − P(Z < −2.10) = 0.9842 − 0.0179 = 0.9663. P(1.42 < Z < 3.00) = P(Z < 3.00) − P(Z < 1.42) = 0.9987 − 0.9222 = 0.0765. P(−1.95 < Z < −1.00) = P(Z < −1.00) − P(Z < −1.95) = 0.1587 − 0.0256 = 0.1331. Now find: P(1.64 < Z < 2.75); P(−0.91 < Z < 0.73); P(Z < −1.45).
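The subtraction of two cumulative probabilities can be sketched as follows, assuming Python 3.8+ and `statistics.NormalDist`; `prob_between` is a name chosen for this illustration. Results may differ from the table-based answers in the fourth decimal place because the table values are rounded before subtracting.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def prob_between(z1: float, z2: float) -> float:
    """P(z1 < Z < z2) computed as P(Z < z2) - P(Z < z1)."""
    return STANDARD_NORMAL.cdf(z2) - STANDARD_NORMAL.cdf(z1)


for z1, z2 in [(-1.36, 2.14), (-2.10, 2.15), (1.42, 3.00), (-1.95, -1.00)]:
    print(f"P({z1} < Z < {z2}) = {prob_between(z1, z2):.4f}")
```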

25 Continuous random variable and the probability between 2 points If X ~ N(μ, σ²), to find the probability that the continuous random variable lies in a range between values X₁ and X₂, i.e. P(X₁ < X < X₂), transform each value of X into the standard normal Z ~ N(0, 1); this works for any values of μ and σ. From X₁ find z₁ using the given formula, and from X₂ get z₂. Then P(X₁ < X < X₂) = P(z₁ < Z < z₂) = P(Z < z₂) − P(Z < z₁)

26 Finding Normal Probabilities between 2 points [Figure: three normal curves over x, each marking a, μ and b]

27 Finding Normal Probabilities [Figure: normal curve f(x) with a and b marked on the x axis, and the corresponding Z axis with µ mapped to 0]

28 Application of standardised Normal distribution Two types of question will be asked in the exam: (1) given the interval of x, find the probability; (2) given the probability, find the value of x.

29 Consider the intelligence quotient (IQ) scores for people. IQ scores are normally distributed, with a mean of 100 and a standard deviation of 16. If a person is picked at random, what is the probability that his or her IQ is between 100 and 115? That is, what is P(100 < x < 115)? When x = 100, Z = (100 − 100)/16 = 0. When x = 115, Z = (115 − 100)/16 = 0.94 (to two decimal places).

30 P(100 < x < 115) = P(0.00 < z < 0.94) = 0.8264 – 0.5000 = 0.3264 Thus, the probability is 0.3264 that a person picked at random has an IQ between 100 and 115.
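The IQ calculation can be reproduced in two ways: mimicking the table method (z rounded to two decimals) and computing directly on the X scale. This is a sketch under the assumption of Python 3.8+ with `statistics.NormalDist`; the variable names are illustrative only.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=16)
standard_normal = NormalDist()

# Table method: standardize, round z to two decimals, then subtract areas.
z_low = round((100 - iq.mean) / iq.stdev, 2)
z_high = round((115 - iq.mean) / iq.stdev, 2)
table_style = standard_normal.cdf(z_high) - standard_normal.cdf(z_low)

# Direct method: no rounding of z, work on the original X scale.
direct = iq.cdf(115) - iq.cdf(100)

print(f"z values: {z_low}, {z_high}")
print(f"table-style answer: {table_style:.4f}")
print(f"direct answer:      {direct:.4f}")
```

The two answers differ slightly because the table method rounds z before looking up the area.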

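31 Finding Normal Probabilities Suppose X is normal with mean 8.0 and standard deviation 5.0. Find P(X < 8.6). Standardizing, Z = (8.6 − 8.0)/5.0 = 0.12, so P(X < 8.6) = P(Z < 0.12). [Figure: the X curve with μ = 8, σ = 5 and the point 8.6, alongside the Z curve with μ = 0, σ = 1 and the point 0.12]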

32 Solution: Finding P(Z < 0.12) Standardized Normal Probability Table (Portion):
z     F(z)
0.10  0.5398
0.11  0.5438
0.12  0.5478
0.13  0.5517
F(0.12) = 0.5478, so P(X < 8.6) = P(Z < 0.12) = 0.5478

33 Upper Tail Probabilities Suppose X is normal with mean 8.0 and standard deviation 5.0. Now find P(X > 8.6).

34 Upper Tail Probabilities Now find P(X > 8.6): P(X > 8.6) = P(Z > 0.12) = 1.0 − P(Z ≤ 0.12) = 1.0 − 0.5478 = 0.4522
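The lower-tail and upper-tail answers for this example can be checked with a short sketch, assuming Python 3.8+ and `statistics.NormalDist`; the variable names are chosen for illustration.

```python
from statistics import NormalDist

x_dist = NormalDist(mu=8.0, sigma=5.0)

# Lower tail: area to the left of 8.6.
lower_tail = x_dist.cdf(8.6)

# Upper tail: total area is 1, so subtract the lower tail.
upper_tail = 1.0 - lower_tail

print(f"P(X < 8.6) = {lower_tail:.4f}")
print(f"P(X > 8.6) = {upper_tail:.4f}")
```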

35 If X is normally distributed with mean 100 and standard deviation 16, find a. P(X>127) b. P(X<95) c. P(86<X<112) d. P(X  k)= 0.0018

36 Let x be a continuous random variable that has a normal distribution with a mean of 50 and standard deviation of 10. Convert the following (a and b) x values into z values. a. P(50<x<55) b. P(35<x<55)

37 Find the areas under a normal distribution curve with μ = 20 and σ = 4: the area between x = 20 and x = 27; the area between x = 10 and x = 17. Suppose the amount spent by students on textbooks has approximately a bell-shaped distribution. The mean (μ) amount spent was RM300 and the standard deviation (σ) is RM100. Calculate the percentage (probability) of students who spent more than RM350. (Hint: you need to determine the z-score first)

38 Let x be a normally distributed random variable with mean = 10 and standard deviation = 2. Find the probability that x will lie between 11 and 13.6. The salaries of MBA graduates who joined the marketing services averaged RM45000, with a standard deviation of RM22500. If these salaries were normally distributed, what proportion of MBA graduates had salaries exceeding RM48000?

39 Finding the X value for a Known Probability Steps to find the X value for a known probability: 1. Find the Z value for the known probability. 2. Convert to X units using the formula: $X = \mu + Z\sigma$

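40 Finding the X value for a Known Probability Example: Suppose X is normal with mean 8.0 and standard deviation 5.0. Now find the X value so that only 20% of all values are below this X. [Figure: normal curve with an area of 0.2000 to the left of the unknown X (mean 8.0), and the corresponding unknown Z below 0]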

41 Find the Z value for 20% in the Lower Tail 1. Find the Z value for the known probability. Standardized Normal Probability Table (Portion):
z     F(z)
0.82  0.7939
0.83  0.7967
0.84  0.7995
0.85  0.8023
F(0.84) = 0.7995 ≈ 0.80, so by symmetry a 20% area in the lower tail is consistent with a Z value of −0.84.

42 Finding the X value 2. Convert to X units using the formula: $X = \mu + Z\sigma = 8.0 + (-0.84)(5.0) = 3.80$. So 20% of the values from a distribution with mean 8.0 and standard deviation 5.0 are less than 3.80
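The two-step procedure (find Z for the known probability, then convert to X units) can be sketched as below. This assumes Python 3.8+ and `statistics.NormalDist.inv_cdf`, which plays the role of reading the table backwards; `x_for_lower_tail` is a hypothetical helper name.

```python
from statistics import NormalDist


def x_for_lower_tail(p: float, mu: float, sigma: float) -> float:
    """Value of X below which a proportion p of a normal(mu, sigma) population lies."""
    z = NormalDist().inv_cdf(p)  # step 1: Z value for the known probability
    return mu + z * sigma        # step 2: convert Z units back to X units


z_20 = NormalDist().inv_cdf(0.20)
x_20 = x_for_lower_tail(0.20, mu=8.0, sigma=5.0)
print(f"Z for a 20% lower tail: {z_20:.4f}")
print(f"X value: {x_20:.2f}")
```

The software value of Z is not rounded to two decimals, so the resulting X is slightly below the table-based 3.80.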

43 The weights of ripe watermelons grown at Aminah’s farm are normally distributed with a standard deviation of 2.8 lb. Find the mean weight of Aminah’s ripe watermelons if only 3% weigh less than 15 lb. The mass of a mango taken from Ali’s estate is known to be normally distributed with mean 820 g and standard deviation 100 g. Find the probability that a mango chosen at random will have a mass of at least 700 g.

44 In a sample of 120 workers in a factory, the mean daily income is RM11.35, with a standard deviation of RM3.03. Find the percentage of workers earning wages between RM9 and RM17 in the whole factory, assuming that wages are normally distributed. How many workers in the sample earn between RM9 and RM17? A company that sells 5000 batteries in a year guarantees them for a life of 24 months. The life of a battery is normally distributed with mean 34 months and standard deviation 5 months. Find the number of batteries that have to be replaced under guarantee.

45 In a large class, suppose that your instructor tells you that you need to obtain a grade in the top 10% of your class to get an A on a particular exam. From past experience she is able to estimate that the mean and standard deviation on this exam will be 72 and 13, respectively. What will be the minimum grade needed to obtain an A? (Assume that the grades will be approximately normally distributed) Determine the minimum mark to pass if you were told 25% of students fail the paper.

46 Melons are sold in three sizes: small, medium and large. The weights follow a normal distribution with mean 450 grams and standard deviation 120 grams. Melons weighing less than 350 grams are classified as small. (i) Find the proportion of melons which are classified as small. (ii) The rest of the melons are divided in equal proportions between medium and large. Find the weight above which melons are classified as large.

47 Assume Ali received 45 marks for SSF1063 while Mary received 72 marks for SSF1073. Given that the mean mark for SSF1063 is 38 and for SSF1073 is 65, while the standard deviations are 8 and 14 respectively, does it imply that Mary had a better grade than Ali? Explain.

48 Assessing Normality Not all continuous random variables are normally distributed. It is important to evaluate how well the data are approximated by a normal distribution. There are a few ways to assess the normality assumption: graphical presentation; numerical methods.

49 Look for the skewness value. Analyse → Descriptive statistics → Frequencies → choose a variable → Statistics: pick skewness under Distribution; then click on Charts, pick Histogram and show the normal curve on the histogram. The skewness of a normal distribution is zero. If the skewness for a data distribution is within the range of −1 to +1, it is considered to be approximately normal.
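Outside SPSS, the skewness rule of thumb can be tried with a few lines of code. The sketch below assumes the adjusted sample skewness formula that statistics packages commonly report, n / ((n − 1)(n − 2)) · Σ((x − mean) / s)³; the function name and both data sets are hypothetical and are not taken from the slides.

```python
from statistics import mean, stdev


def sample_skewness(data):
    """Adjusted sample skewness (assumed formula; see the lead-in)."""
    n = len(data)
    if n < 3:
        raise ValueError("need at least 3 observations")
    m = mean(data)
    s = stdev(data)  # sample standard deviation (n - 1 divisor)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in data)


symmetric = [1, 2, 3, 4, 5]       # hypothetical, perfectly symmetric data
right_skewed = [1, 1, 1, 2, 10]   # hypothetical data with a long right tail

for name, data in (("symmetric", symmetric), ("right_skewed", right_skewed)):
    g = sample_skewness(data)
    verdict = "within -1 to +1" if -1 <= g <= 1 else "outside -1 to +1"
    print(f"{name}: skewness = {g:.3f} ({verdict})")
```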

50 Using the Explore command under descriptive statistics: Analyse → Descriptive statistics → Explore → click the button marked 'Plots...' and select the option 'Normality plots with tests'. This runs the Kolmogorov-Smirnov test. If this is significant (p < 0.05), there is evidence that the distribution is not normal. Choose histogram as well. The Explore command also reports the skewness.

53 The Normal Probability Plot Steps: arrange the data from low to high values; find cumulative normal probabilities for all values; examine a plot of the observed values vs. cumulative probabilities (with the cumulative normal probability on the vertical axis and the observed data values on the horizontal axis); evaluate the plot for evidence of linearity.

54 The Normal Probability Plot A normal probability plot for data from a normal distribution will be approximately linear. [Figure: normal probability plot with Data on the horizontal axis and Percent (0 to 100) on the vertical axis]
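The plot coordinates can also be computed without a charting tool. The following is a rough sketch, assuming Python 3.8+, plotting positions of (i − 0.5)/n (one common convention, not stated in the slides), and a correlation coefficient as a crude numeric stand-in for "looks linear". On probability-scaled paper, linearity is equivalent to the sorted data being linear in the normal quantiles, which is what the code checks. All names and data here are hypothetical.

```python
from statistics import NormalDist, mean


def normal_plot_points(data):
    """Return (observed value, cumulative probability, normal quantile) triples."""
    ordered = sorted(data)                    # arrange data from low to high
    n = len(ordered)
    standard_normal = NormalDist()
    points = []
    for i, x in enumerate(ordered, start=1):
        p = (i - 0.5) / n                     # assumed plotting position
        points.append((x, p, standard_normal.inv_cdf(p)))
    return points


def pearson_r(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def linearity(data):
    """Correlation between the sorted data and the matching normal quantiles."""
    points = normal_plot_points(data)
    return pearson_r([x for x, _, _ in points], [z for _, _, z in points])


# Hypothetical data: one set built from normal quantiles, one strongly skewed.
bell_like = [NormalDist(50, 4).inv_cdf((i - 0.5) / 20) for i in range(1, 21)]
skewed = [2 ** i for i in range(20)]

print(f"bell-like data: r = {linearity(bell_like):.4f}")
print(f"skewed data:    r = {linearity(skewed):.4f}")
```

A value of r close to 1 corresponds to a nearly straight plot; a noticeably lower value suggests curvature and hence a departure from normality.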

