V. Katch Movement Science 250 1 Review Application of the Normal Distribution.

1 V. Katch Movement Science 250 1 Review Application of the Normal Distribution

2 V. Katch Movement Science 250 2 Overview. Continuous random variable: infinitely many values, and those values are often associated with measurements on a continuous scale with no gaps or interruptions. Normal distribution: if a continuous random variable has a distribution that is symmetric and bell-shaped, we call it a normal distribution.

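3 V. Katch Movement Science 250 3 Overview. Continuous random variable. Normal distribution: the curve is bell-shaped and symmetric about the mean µ. [Figure: normal curve; horizontal axis: score, centered at µ] Normal curve formula: \(y = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}\)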
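The normal curve formula (height y of the curve at score x, for mean µ and standard deviation σ) can be evaluated directly. The sketch below is illustrative only; the function name `normal_pdf` is an assumption and does not come from the slides.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the normal curve at x for mean mu and standard deviation sigma."""
    z = (x - mu) / sigma                      # distance from the mean in sd units
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# The curve is highest at the mean and symmetric around it.
peak = normal_pdf(0.0)
left, right = normal_pdf(-1.0), normal_pdf(1.0)
print(peak, left, right)
```

The peak of the standard normal curve is 1/√(2π), roughly 0.399, and the heights at −1 and +1 are equal because the curve is symmetric.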

4 V. Katch Movement Science 250 4 Definition: Standard Normal Distribution. A normal probability distribution that has a mean of 0 and a standard deviation of 1.

5 V. Katch Movement Science 250 5 Characteristics of the Normal Curve The curve is bell-shaped and symmetrical. The mean, median, and mode are all equal. The highest frequency is in the middle of the curve. The frequency gradually tapers off as the scores approach the ends of the curve. The curve approaches, but never meets, the abscissa at both high and low ends.

6 V. Katch Movement Science 250 6 Standard Normal Distribution: µ = 0, σ = 1 [Figure: standard normal curve; horizontal axis: z, centered at 0]

7 V. Katch Movement Science 250 7 To find: z score: the distance along the horizontal scale of the standard normal distribution; refer to the leftmost column and top row of the table. Area: the region under the curve; refer to the values in the body of the table.

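8 V. Katch Movement Science 250 8 The Empirical Rule (standard normal distribution: µ = 0 and σ = 1). About 68% of data are within 1 standard deviation of the mean, 95% are within 2 standard deviations, and 99.7% are within 3 standard deviations. [Figure: bell curve with horizontal axis marked at \(\bar{x} - 3s\), \(\bar{x} - 2s\), \(\bar{x} - s\), \(\bar{x}\), \(\bar{x} + s\), \(\bar{x} + 2s\), \(\bar{x} + 3s\); areas on each side of the mean: 34%, 13.5%, 2.4%, 0.1%]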
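The 68%, 95%, and 99.7% figures of the Empirical Rule can be checked numerically. This is a minimal sketch, assuming the standard identity P(−k < z < k) = erf(k/√2) for a standard normal z; the helper name `area_within` is hypothetical.

```python
import math

def area_within(k):
    """P(-k < z < k) for a standard normal z, via the error function."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    # Fraction of a normal population within k standard deviations of the mean.
    print(k, round(area_within(k), 4))
```

The three areas come out close to 0.68, 0.95, and 0.997, which is where the rounded percentages in the rule come from.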

9 V. Katch Movement Science 250 9 Notation. P(a < z < b) denotes the probability that the z score is between a and b. P(z > a) denotes the probability that the z score is greater than a. P(z < a) denotes the probability that the z score is less than a.
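Each of the three notations corresponds to an area under the standard normal curve, so each can be computed from the cumulative distribution instead of a printed table. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper names are assumptions made for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def p_less(a):
    """P(z < a): area to the left of a."""
    return Z.cdf(a)

def p_greater(a):
    """P(z > a): area to the right of a."""
    return 1 - Z.cdf(a)

def p_between(a, b):
    """P(a < z < b): area between a and b."""
    return Z.cdf(b) - Z.cdf(a)

print(p_less(0), p_greater(0.24), p_between(-1.96, 1.96))
```

For any a, `p_less(a) + p_greater(a)` equals 1, which mirrors how left-tail and right-tail table lookups relate.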

10 V. Katch Movement Science 250 Means and Proportions as Random Variables Chapter 9

11 V. Katch Movement Science 250 What happens if we regard a summary statistic for an entire random sample as a random variable?

12 V. Katch Movement Science 250 If we take a random sample and find the proportion who have a certain trait, that proportion is the numerical outcome of a random event: the sample proportion is itself a random variable.

13 V. Katch Movement Science 250 By understanding how a sample mean behaves as a random variable, we can begin to understand the population from which it came.

14 V. Katch Movement Science 250 Work backwards (from sample to population): ask a question about the population → collect a sample → measure/test → answer the question about the sample → based on the statistics, infer about the population.

15 V. Katch Movement Science 250 15 Understanding Dissimilarity Among Samples Suppose most samples were likely to provide an answer that is within ±10% of the population value. Then we also know the population value should be within ±10% of our specific sample value. => Have a good guess about the population value based on the sample value. Key: Need to understand what kind of dissimilarity we should expect to see in various samples from the same population.

16 V. Katch Movement Science 250 16 Statistics and Parameters A statistic is a numerical value computed from a sample. Its value may differ for different samples. e.g. sample mean \(\bar{x}\), sample standard deviation s, and sample proportion \(\hat{p}\). A parameter is a numerical value associated with a population. Considered fixed and unchanging. e.g. population mean µ, population standard deviation σ, and population proportion p.

17 V. Katch Movement Science 250 17 Statistics and Parameters For categorical variables, statistics associated with a sample include the number or proportion of the sample who fall into various categories Both the frequency and percentage are statistics associated with samples

18 V. Katch Movement Science 250 18 Sampling Distributions The distribution of possible values of a statistic for repeated samples of the same size from a population is called the sampling distribution of the statistic. Each new sample taken => sample statistic will change. Many statistics of interest have sampling distributions that are approximately normal distributions

19 V. Katch Movement Science 250 19 Example Sampling Dist - Mean Hours Sleep Survey of n = 190 college students. “How many hours of sleep did you get last night?” Sample mean = 7.1 hours. If we repeatedly took samples of 190 and each time computed the sample mean, the histogram of the resulting sample mean values would look like this histogram.

20 V. Katch Movement Science 250 20 Sampling Distributions for Sample Proportions Suppose (unknown to us) 40% of a population carry the gene for a disease, (p = 0.40). We will take a random sample of 25 people from this population and count X = number with gene. Although we expect (on average) to find 10 people (40%) with the gene, we know the number will vary for different samples of n = 25. In this case, X is a binomial random variable with n = 25 and p = 0.4.

21 V. Katch Movement Science 250 21 Many Possible Samples Four possible random samples of 25 people: Sample 1: X =12, proportion with gene =12/25 = 0.48 or 48%. Sample 2: X = 9, proportion with gene = 9/25 = 0.36 or 36%. Sample 3: X = 10, proportion with gene = 10/25 = 0.40 or 40%. Sample 4: X = 7, proportion with gene = 7/25 = 0.28 or 28%. Note: Each sample gave a different answer, which did not always match the population value of 40%. Although we cannot determine whether one sample will accurately reflect the population, statisticians have determined what to expect for most possible samples.
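The sample-to-sample variation described here can be simulated. The sketch below draws many samples of n = 25 from a population in which p = 0.40 carry the gene and records each sample proportion; the seed, the number of repetitions, and the function name are arbitrary choices made for illustration.

```python
import random
import statistics

def sample_proportion(n, p, rng):
    """Proportion with the trait in one random sample of size n."""
    x = sum(1 for _ in range(n) if rng.random() < p)  # X = number with the trait
    return x / n

rng = random.Random(0)                      # fixed seed so the run is repeatable
props = [sample_proportion(25, 0.40, rng) for _ in range(10_000)]

mean_of_props = statistics.mean(props)      # should sit near p = 0.40
sd_of_props = statistics.pstdev(props)      # should sit near sqrt(0.4 * 0.6 / 25)
print(mean_of_props, sd_of_props)
```

Individual sample proportions vary (0.48, 0.36, 0.28, and so on), but over many samples they center on 0.40 with a spread of roughly 0.1.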

22 V. Katch Movement Science 250 22 The Normal Curve Approximation Rule for Sample Proportions Let p = population proportion of interest or binomial probability of success. Let \(\hat{p}\) = sample proportion or proportion of successes. If numerous random samples or repetitions of the same size n are taken, the distribution of possible values of \(\hat{p}\) is approximately a normal curve distribution with mean = p and standard deviation = \(\text{s.d.}(\hat{p}) = \sqrt{\frac{p(1-p)}{n}}\). This approximate distribution is the sampling distribution of \(\hat{p}\).

23 V. Katch Movement Science 250 23 The Normal Curve Approximation Rule for Sample Proportions Normal Approximation Rule can be applied in two situations: Situation 1: A random sample is taken from a population. Situation 2: A binomial experiment is repeated numerous times. In each situation, three conditions must be met: Condition 1: The Physical Situation. There is an actual population or repeatable situation. Condition 2: Data Collection. A random sample is obtained or the situation is repeated many times. Condition 3: The Size of the Sample or Number of Trials. The size of the sample or number of repetitions is relatively large: np and n(1 − p) must be at least 5 and preferably at least 10.

24 V. Katch Movement Science 250 24 Examples for which Rule Applies Election Polls: to estimate proportion who favor a candidate; units = all voters. Television Ratings: to estimate proportion of households watching TV program; units = all households with TV. Consumer Preferences: to estimate proportion of consumers who prefer new recipe compared with old; units = all consumers. Testing ESP: to estimate probability a person can successfully guess which of 5 symbols on a hidden card; repeatable situation = a guess.

25 V. Katch Movement Science 250 25 Example: Possible Sample Proportions Favoring a Candidate Suppose 40% of all voters favor Candidate X. Pollsters take a sample of n = 2400 voters. Rule states the sample proportion who favor X will have approximately a normal distribution with mean = p = 0.4 and \(\text{s.d.}(\hat{p}) = \sqrt{\frac{0.4(1-0.4)}{2400}} = 0.01\). Histogram at right shows sample proportions resulting from simulating this situation 400 times.

26 V. Katch Movement Science 250 26 Estimating the Population Proportion from a Single Sample Proportion In practice, we don’t know the true population proportion p, so we cannot compute the standard deviation of \(\hat{p}\), \(\text{s.d.}(\hat{p}) = \sqrt{\frac{p(1-p)}{n}}\). In practice, we only take one random sample, so we only have one sample proportion \(\hat{p}\). Replacing p with \(\hat{p}\) in the standard deviation expression gives us an estimate that is called the standard error of \(\hat{p}\): \(\text{s.e.}(\hat{p}) = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\). If \(\hat{p}\) = 0.39 and n = 2400, then the standard error is 0.01. So the true proportion who support the candidate is almost surely between 0.39 – 3(0.01) = 0.36 and 0.39 + 3(0.01) = 0.42.
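The polling arithmetic can be reproduced in a few lines. This sketch assumes the usual formula for the standard deviation of a sample proportion, √(p(1 − p)/n), and applies the same formula with the sample proportion 0.39 in place of p to get the standard error; the function and variable names are illustrative.

```python
import math

def sd_of_proportion(p, n):
    """sqrt(p(1 - p) / n): s.d. if p is the true proportion, s.e. if p is a sample estimate."""
    return math.sqrt(p * (1 - p) / n)

sd = sd_of_proportion(0.40, 2400)   # true p known (hypothetically): s.d. of p-hat
se = sd_of_proportion(0.39, 2400)   # only the sample proportion known: s.e. of p-hat

# "Almost surely" interval: sample proportion plus or minus 3 standard errors.
low, high = 0.39 - 3 * se, 0.39 + 3 * se
print(round(sd, 4), round(se, 4), round(low, 2), round(high, 2))
```

Both the standard deviation and the standard error round to 0.01, and the interval rounds to 0.36 through 0.42, matching the slide.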

27 V. Katch Movement Science 250 27 Estimating the Population Proportion from a Single Sample Proportion Review Value of interest is proportion falling into one category of a categorical variable or the probability of success in a binomial experiment. If we know the size of the sample and the magnitude of the true proportion, we can determine an interval of values that was likely to cover the sample proportion.

28 V. Katch Movement Science 250 28 What to Expect of Sample Means Suppose we are interested in estimating the mean of a quantitative variable in a population of millions of people. If we sample 25 people and compute the mean of the sample, how close will that sample mean be to the population mean? Each time we take a sample we will get a different sample mean. Can we say anything about what we expect those means to be?

29 V. Katch Movement Science 250 29 What to Expect of Sample Means Suppose we want to estimate the mean weight loss for all who attend a clinic for 10 weeks. Suppose (unknown to us) the distribution of weight loss is approximately normal (mean = 8 pounds, SD = 5 pounds). We will take a random sample of 25 people from this population and record for each X = weight loss. We know the value of the sample mean will vary for different samples of n = 25. What do we expect those means to be?

30 V. Katch Movement Science 250 30 Many Possible Samples Four possible random samples of 25 people: Sample 1: Mean = 8.32 pounds, standard deviation = 4.74 lbs. Sample 2: Mean = 6.76 pounds, standard deviation = 4.73 lbs. Sample 3: Mean = 8.48 pounds, standard deviation = 5.27 lbs. Sample 4: Mean = 7.16 pounds, standard deviation = 5.93 lbs. Note: Each sample gave a different answer, which did not always match the population mean of 8 pounds. Although we cannot determine whether one sample mean will accurately reflect the population mean, statisticians have determined what to expect for most possible sample means.

31 V. Katch Movement Science 250 31 The Normal Curve Approximation Rule for Sample Means Let µ = mean for population of interest. Let σ = standard deviation for population of interest. Let \(\bar{x}\) = sample mean. If numerous random samples of the same size n are taken, the distribution of possible values of \(\bar{x}\) is approximately a normal curve distribution with mean = µ and standard deviation = \(\text{s.d.}(\bar{x}) = \frac{\sigma}{\sqrt{n}}\). This approximate distribution is the sampling distribution of \(\bar{x}\).

32 V. Katch Movement Science 250 32 The Normal Curve Approximation Rule for Sample Means Normal Approximation Rule can be applied in two situations: Situation 1: The population of measurements of interest is bell-shaped and a random sample of any size is measured. Situation 2: The population of measurements of interest is not bell-shaped but a large random sample is measured. Note: Difficult to get a Random Sample? Researchers usually willing to use Rule as long as they have a representative sample with no obvious sources of confounding or bias.

33 V. Katch Movement Science 250 33 Examples for which Rule Applies Average Weight Loss: to estimate average weight loss; weight assumed bell-shaped; population = all current and potential clients. Average Age At Death: to estimate average age at which left-handed adults (over 50) die; ages at death not bell-shaped so need n ≥ 30; population = all left-handed people who live to be at least 50. Average Student Income: to estimate mean monthly income of students at university who work; incomes not bell-shaped and outliers likely, so need large random sample of students; population = all students at university who work.

34 V. Katch Movement Science 250 34 Example: Hypothetical Mean Weight Loss Suppose the distribution of weight loss is approximately normal (µ = 8 pounds, s.d. = 5 lbs) and we will take a random sample of n = 25 clients. Rule states the sample mean weight loss will have a normal distribution with mean = µ = 8 pounds and \(\text{s.d.}(\bar{x}) = 5/\sqrt{25} = 1\) pound. Histogram at right shows sample means resulting from simulating this situation 400 times. Empirical Rule: It is almost certain that the sample mean will be between 5 and 11 pounds (−3 sd to +3 sd).

35 V. Katch Movement Science 250 35 Standard Error of the Mean In practice, the population standard deviation σ is rarely known, so we cannot compute the standard deviation of \(\bar{x}\), \(\text{s.d.}(\bar{x}) = \sigma/\sqrt{n}\); instead we use the sd of the sample. In practice, we only take one random sample, so we only have the sample mean \(\bar{x}\) and the sample standard deviation s. Replacing σ with s in the standard deviation expression gives us an estimate that is called the standard error of \(\bar{x}\): \(\text{s.e.}(\bar{x}) = s/\sqrt{n}\). For a sample of n = 25 weight losses, the standard deviation is s = 4.74 pounds. So the standard error of the mean is \(4.74/\sqrt{25}\) = 0.948 pounds.

36 V. Katch Movement Science 250 36 Increasing the Size of the Sample Suppose we take n = 100 people instead of just 25. The standard deviation of the mean would be \(\text{s.d.}(\bar{x}) = 5/\sqrt{100} = 0.5\) pounds. For samples of n = 25, sample means are likely to range between 8 ± 3 pounds => 5 to 11 pounds. For samples of n = 100, sample means are likely to range only between 8 ± 1.5 pounds => 6.5 to 9.5 pounds. Larger samples result in more accurate estimates of population values than smaller samples.
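The effect of sample size on the spread of sample means follows from dividing the population standard deviation by √n. The sketch below recomputes the two ranges for the weight-loss example (mean 8 pounds, SD 5 pounds); the function names are assumptions for illustration.

```python
import math

def sd_of_mean(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

def three_sd_range(mu, sigma, n):
    """Range mu +/- 3 s.d. in which almost all sample means fall (Empirical Rule)."""
    sd = sd_of_mean(sigma, n)
    return (mu - 3 * sd, mu + 3 * sd)

range_25 = three_sd_range(8, 5, 25)    # n = 25
range_100 = three_sd_range(8, 5, 100)  # n = 100
print(range_25, range_100)
```

Quadrupling the sample size from 25 to 100 halves the standard deviation of the mean (1 pound to 0.5 pounds), so the range shrinks from 5 to 11 pounds down to 6.5 to 9.5 pounds.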

37 V. Katch Movement Science 250 37 Sampling for a Long, Long Time: The Law of Large Numbers LLN: the sample mean will eventually get “close” to the population mean µ no matter how small a difference you use to define “close.” LLN = peace of mind to casinos, insurance companies. Eventually, after enough gamblers or customers, the mean net profit will be close to the theoretical mean. Price to pay = must have enough $ on hand to pay the occasional winner or claimant.
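The Law of Large Numbers can be watched in a simulation. This sketch reuses the hypothetical weight-loss population (normal with mean 8 and SD 5) as an assumed data source and tracks the running sample mean at a few checkpoints; the seed and checkpoint sizes are arbitrary.

```python
import random

def running_means(mu, sigma, checkpoints, seed=0):
    """Sample mean of normal draws after each checkpoint number of observations."""
    rng = random.Random(seed)
    total, count, results = 0.0, 0, {}
    for n in sorted(checkpoints):
        while count < n:                 # keep drawing until n observations are in
            total += rng.gauss(mu, sigma)
            count += 1
        results[n] = total / count       # running mean after n draws
    return results

means = running_means(8.0, 5.0, [10, 1_000, 100_000])
for n, m in means.items():
    print(n, round(m, 3))
```

The mean after 10 draws can be noticeably off, while the mean after 100,000 draws should sit very close to 8.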

38 V. Katch Movement Science 250 38 Central Limit Theorem The Central Limit Theorem states that if n is sufficiently large, the sample means of random samples from a population with mean µ and finite standard deviation σ are approximately normally distributed with mean µ and standard deviation \(\sigma/\sqrt{n}\). Technical Note: The mean and standard deviation given in the CLT hold for any sample size; it is only the “approximately normal” shape that requires n to be sufficiently large.

39 V. Katch Movement Science 250 39 Central Limit Theorem Conclusions: 1. The distribution of sample means \(\bar{x}\) will, as the sample size increases, approach a normal distribution. 2. The mean of the sample means will be the population mean µ. 3. The standard deviation of the sample means will approach \(\sigma/\sqrt{n}\).

40 V. Katch Movement Science 250 40 Practical Rules Commonly Used: For samples of size n larger than 30, the distribution of the sample means can be approximated reasonably well by a normal distribution. The approximation gets better as the sample size n becomes larger. If the original population is itself normally distributed, then the sample means will be normally distributed for any sample size n (not just the values of n larger than 30).

41 V. Katch Movement Science 250 41 Notation The mean of the sample means: \(\mu_{\bar{x}} = \mu\). The standard deviation of the sample means (often called the standard error of the mean): \(\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\).

42 V. Katch Movement Science 250 42 [Figure: histogram] Distribution of 200 digits from Social Security Numbers (last 4 digits from 50 students). Horizontal axis: digit (0 to 9); vertical axis: frequency.

43 V. Katch Movement Science 250 43 [Table: SSN digits for each student and the sample mean \(\bar{x}\) of each student's four digits] Sample means listed: 4.75, 4.25, 8.25, 3.25, 5.00, 3.50, 5.25, 4.75, 5.00, 4.00, 5.25, 4.25, 4.50, 4.75, 3.75, 5.25, 3.75, 4.50, 6.00

44 V. Katch Movement Science 250 44 [Figure: histogram] Distribution of 50 Sample Means for 50 Students. Horizontal axis: sample mean (0 to 9); vertical axis: frequency.

45 V. Katch Movement Science 250 45 Sampling Distribution Rule As the sample size increases, the sampling distribution of sample means approaches a normal distribution.
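The Social Security digits demonstration can be imitated with simulated digits. The sketch below assumes each digit is equally likely to be 0 through 9 (an assumption, not something the slides state), takes the mean of four digits per "student", and repeats this many more times than the 50 students in the class so the pattern is stable.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

def mean_of_four_digits(rng):
    """Mean of four random digits, standing in for one student's last four SSN digits."""
    digits = [rng.randrange(10) for _ in range(4)]
    return sum(digits) / 4

sample_means = [mean_of_four_digits(rng) for _ in range(5_000)]

center = statistics.mean(sample_means)     # near the digit mean of 4.5
spread = statistics.pstdev(sample_means)   # near sigma / sqrt(4)

# For equally likely digits 0..9 the variance is (10**2 - 1) / 12 = 8.25.
expected_spread = (8.25 ** 0.5) / (4 ** 0.5)
print(center, spread, expected_spread)
```

Single digits are spread evenly from 0 to 9, but the means of four digits pile up around 4.5 with a spread of about 1.4, which is the bell-shaped pattern the histogram of sample means illustrates.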

46 V. Katch Movement Science 250 46 Example: Given the population of women has normally distributed weights with a mean of 143 lb and a standard deviation of 29 lb, a.) if one woman is randomly selected, the probability that her weight is greater than 150 lb is 0.4052. \(z = \frac{150 - 143}{29} = 0.24\). The area between z = 0 and z = 0.24 is 0.0948, so P(x > 150) = 0.5 − 0.0948 = 0.4052 (Katch table); equivalently, 1 − 0.5948 = 0.4052 (Utts table). [Figure: normal curve with µ = 143 and σ = 29; area to the right of x = 150 (z = 0.24) shaded]

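47 V. Katch Movement Science 250 47 Example: Given the population of women has normally distributed weights with a mean of 143 lb and a standard deviation of 29 lb, if 36 different women are randomly selected, the probability that their mean weight is greater than 150 lb is 0.0735. \(\mu_{\bar{x}} = 143\), \(\sigma_{\bar{x}} = \frac{29}{\sqrt{36}} = 4.83333\), \(z = \frac{150 - 143}{29/\sqrt{36}} = 1.45\). The area between z = 0 and z = 1.45 is 0.4265, so \(P(\bar{x} > 150)\) = 0.5 − 0.4265 = 0.0735. [Figure: normal curve with \(\mu_{\bar{x}} = 143\) and \(\sigma_{\bar{x}} = 4.83333\); area to the right of \(\bar{x} = 150\) (z = 1.45) shaded]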

48 V. Katch Movement Science 250 48 Example: Given the population of women has normally distributed weights with a mean of 143 lb and a standard deviation of 29 lb, a.) if one woman is randomly selected, find the probability that her weight is greater than 150 lb: P(x > 150) = 0.4052. b.) if 36 different women are randomly selected, find the probability that their mean weight is greater than 150 lb: \(P(\bar{x} > 150)\) = 0.0735. It is much easier for an individual to deviate from the mean than it is for a group of 36 to deviate from the mean.
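Both parts of the weight example can be checked without a table. This sketch uses `statistics.NormalDist` from the Python standard library; because it does not round z to two decimals the way a table lookup does, its answers differ slightly from the slide values in the third or fourth decimal place. Variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma = 143, 29          # population mean and standard deviation (lb)

# (a) One woman: her weight follows the population distribution.
p_one = 1 - NormalDist(mu, sigma).cdf(150)

# (b) Mean of 36 women: same center, but standard deviation sigma / sqrt(n).
n = 36
p_mean36 = 1 - NormalDist(mu, sigma / math.sqrt(n)).cdf(150)

print(round(p_one, 3), round(p_mean36, 3))
```

The probability for a single woman is roughly 0.40, while for the mean of 36 women it is roughly 0.07, which is the contrast the slide draws between individuals and groups.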

49 V. Katch Movement Science 250 49 Example: California Decco Winnings California Decco lottery game: mean amount lost per ticket over millions of tickets sold is µ = $0.35; standard deviation σ = $29.67 => large variability in possible amounts won/lost, from net win of $4999 to net loss of $1. Suppose store sells 100,000 tickets in a year. CLT => distribution of possible sample mean loss per ticket is approximately normal with mean (loss) = µ = $0.35 and s.d.(\(\bar{x}\)) = \(29.67/\sqrt{100000}\) = $0.09. Empirical Rule: The mean loss is almost surely between $0.08 and $0.62 => total loss for the 100,000 tickets is likely between $8,000 and $62,000! There are better ways to invest $100,000.
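The lottery figures follow from the same σ/√n calculation. The sketch below rounds the standard deviation of the mean to the nearest cent before applying the ±3 s.d. range, which is an assumption about how the slide arrived at $0.08 and $0.62; variable names are illustrative.

```python
import math

mu_loss = 0.35        # mean loss per ticket, dollars
sigma_loss = 29.67    # standard deviation of loss per ticket, dollars
n_tickets = 100_000

sd_mean = sigma_loss / math.sqrt(n_tickets)   # s.d. of the mean loss per ticket
sd_rounded = round(sd_mean, 2)                # rounded to cents

# Empirical Rule: mean loss per ticket almost surely within 3 s.d. of mu.
low = mu_loss - 3 * sd_rounded
high = mu_loss + 3 * sd_rounded

# Scale the per-ticket range up to the whole year's tickets.
total_low, total_high = low * n_tickets, high * n_tickets
print(sd_rounded, round(low, 2), round(high, 2), round(total_low), round(total_high))
```

Even though a single ticket's outcome is highly variable, the mean over 100,000 tickets is pinned within about 27 cents of $0.35.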

50 V. Katch Movement Science 250 50 Mini Test Assume that the population of human body temperatures has a mean of 98.6 (SD = 0.62). If a sample of size n = 106 is randomly selected, find the probability of getting a mean of 98.2 or lower.

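51 V. Katch Movement Science 250 51 Solution: \(\mu_{\bar{x}} = \mu = 98.6\) (by assumption). \(\sigma_{\bar{x}} = \sigma/\sqrt{n} = 0.62/\sqrt{106} = 0.0602197\). \(z = (\bar{x} - \mu_{\bar{x}})/\sigma_{\bar{x}} = (98.20 - 98.6)/0.0602197 = -6.64\). Look up on table to find P(z < −6.64) = 0.0001. We conclude: a sample mean of 98.2 or lower would be extremely unlikely if the population mean were really 98.6.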
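The Mini Test solution can be recomputed step by step. This sketch uses `statistics.NormalDist` for the tail area; a printed table typically stops at 0.0001, so the computed probability is expected to be far smaller than the table value rather than equal to it. Variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 98.6, 0.62, 106
x_bar = 98.2

sigma_x_bar = sigma / math.sqrt(n)      # standard deviation of the sample mean
z = (x_bar - mu) / sigma_x_bar          # how many s.d. the sample mean lies below mu
p = NormalDist().cdf(z)                 # P(sample mean <= 98.2) = P(Z < z)

print(round(sigma_x_bar, 7), round(z, 2), p < 0.0001)
```

A z score below −6 is well beyond the range of most tables, which is why the lookup reports the smallest tabled area.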

52 V. Katch Movement Science 250 52 Sampling Distribution for Any Statistic Every statistic has a sampling distribution, but the appropriate distribution may not always be normal, or even approximately bell-shaped. Construct an approximate sampling distribution for a statistic by actually taking repeated samples of the same size from a population and constructing a relative frequency histogram for the values of the statistic over the many samples. This is not always possible.

53 V. Katch Movement Science 250 53 Student’s t-Distribution For small sample sizes the CLT approximation does not hold: the standardized statistics do not exactly conform to the standard normal distribution, so we use a different standard distribution to approximate the sampling distribution. We use the t-distribution in place of the normal distribution. The t-distribution has a bell shape with mean = 0; its sd is slightly different from 1.0, but close.

54 V. Katch Movement Science 250 54 Student t Distributions for n = 3 and n = 12

55 V. Katch Movement Science 250 55 Student t Distribution If the distribution of a population is essentially normal, then the distribution of \(t = \frac{\bar{x} - \mu}{s/\sqrt{n}}\) is essentially a Student t distribution for all samples of size n, and is used to find critical values denoted by \(t_{\alpha/2}\).

56 V. Katch Movement Science 250 56 Student’s t-Distribution: Replacing σ with s Dilemma: we generally don’t know σ (pop SD). Using s we have: \(t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{\bar{x} - \mu}{\text{s.e.}(\bar{x})}\). If the sample size n is small, this standardized statistic will not have a N(0,1) distribution but rather a t-distribution with n – 1 degrees of freedom (df).

57 V. Katch Movement Science 250 57 Degrees of Freedom (df): corresponds to the number of sample values that can vary after certain restrictions have been imposed on all data values. df = n – 1

58 V. Katch Movement Science 250 58 Using the Normal and t Dist

59 V. Katch Movement Science 250 59 Assumptions (σ not known): 1. The sample is a simple random sample. 2. Either the sample is from a normally distributed population, or n > 30. Use the Student t distribution.

60 V. Katch Movement Science 250 60 Important Properties of the Student t Dist 1. The Student t distribution is different for different sample sizes. 2. The Student t distribution has the same general symmetric bell shape as the normal distribution but it reflects the greater variability (with wider distributions) that is expected with small samples. 3. The Student t distribution has a mean of t = 0 (just as the standard normal distribution has a mean of z = 0). 4. The standard deviation of the Student t distribution varies with the sample size and is greater than 1 (unlike the standard normal distribution, which has σ = 1). 5. As the sample size n gets larger, the Student t distribution gets closer to the normal distribution.
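Properties 4 and 5 can be made concrete with the standard result (not stated on the slides) that a t distribution with df > 2 degrees of freedom has standard deviation √(df/(df − 2)). The sketch below applies it with df = n − 1; the function name `t_sd` is hypothetical.

```python
import math

def t_sd(n):
    """Standard deviation of the Student t distribution for sample size n (df = n - 1)."""
    df = n - 1
    if df <= 2:
        return math.inf   # the variance is infinite or undefined when df <= 2
    return math.sqrt(df / (df - 2))

for n in (3, 12, 31, 1001):
    print(n, t_sd(n))
```

The standard deviation is always above 1 and shrinks toward 1 as n grows, which is the sense in which the t distribution approaches the standard normal.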

61 V. Katch Movement Science 250 61 Example. Standardized Mean Weights Claim: mean weight loss is µ = 8 pounds. Sample of n = 25 people gave a sample mean weight loss of \(\bar{x}\) = 8.32 pounds and a sample standard deviation of s = 4.74 pounds. Is the sample mean weight loss of 8.32 pounds reasonable to expect if µ = 8 pounds? Standardized: \(t = (8.32 - 8)/(4.74/\sqrt{25}) = 0.32/0.948 \approx 0.34\). The sample mean of 8.32 is only about one-third of a standard error above 8, which is consistent with a population mean weight loss of 8 pounds.
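The "about one-third of a standard error" statement can be verified by standardizing the sample mean with the sample standard deviation. This is a small sketch with illustrative variable names.

```python
import math

mu_claimed = 8.0   # claimed population mean weight loss (pounds)
x_bar = 8.32       # sample mean
s = 4.74           # sample standard deviation
n = 25

se = s / math.sqrt(n)            # standard error of the mean
t = (x_bar - mu_claimed) / se    # standardized sample mean, df = n - 1
df = n - 1

print(round(se, 3), round(t, 2), df)
```

A t value this close to 0, with 24 degrees of freedom, is well inside the range expected by chance, so the sample is consistent with the claim.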

62 V. Katch Movement Science 250 62 Statistical Inference [ make conclusions about populations based on samples] Confidence Intervals: uses sample data to provide an interval of values that the researcher is confident covers the true value for the population. Hypothesis Testing or Significance Testing: uses sample data to attempt to reject the hypothesis that nothing interesting is happening, i.e. to reject the notion that chance alone can explain the sample results.

