BNAD 276: Statistical Inference in Management Winter, 2015

1 BNAD 276: Statistical Inference in Management Winter, 2015
Welcome Green sheet Seating Chart

2 Daily group portfolios
Beginning of each lecture (first 5 minutes):
Meet in groups of 3 or 4
Quiz one another on class material
Discuss the questions and determine the correct answer for each question
Bring five copies (one for each group member, typed) of multiple choice questions based on lecture
Include 4 options (a, b, c, and d)
Include a name and describe a person in a certain situation
Example: Margaret was interested in taking a Statistics course. It is likely she was interested in studying which of the following?
a. economic theories of communism
b. theological perspectives of life after death
c. musical compositions of the 12th century
d. statistical techniques and inference
Questions can be funny or serious, and must be clear and have only one correct answer.

3 Please start portfolios

4 . Homework Assignment Go to D2L - Click on “Content”
Click on “Interactive Online Homework Assignments”

5 What is probability
1. Empirical probability: relative frequency approach
Probability = Number of observed outcomes / Number of observations
Probability of getting into an educational program: Number of people they let in / Number of applicants = 400 / 600 ≈ 66% chance of getting admitted
Probability of getting a rotten apple: Number of rotten apples / Number of apples = 5 / 100 = 5% chance of getting a rotten apple

6 What is probability
1. Empirical probability: relative frequency approach
“There is a 20% chance that a new stock offered in an initial public offering (IPO) will reach or exceed its target price on the first day.”
“More than 30% of the results from major search engines for the keyword phrase “ring tone” are fake pages created by spammers.”
10% of people who buy a house with no pool build one. What is the likelihood that Bob will?
Probability = Number of observed outcomes / Number of observations
Probability of hitting the Corvette: Number of carts that hit Corvette / Number of carts rolled = 182 / 200 = .91, a 91% chance of hitting a Corvette
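
The relative frequency calculation on these slides can be sketched in a few lines of Python. This is only an illustration: the function name `empirical_probability` is made up for this example, and the counts are the ones quoted on the slides.

```python
def empirical_probability(observed_outcomes: int, observations: int) -> float:
    """Relative frequency: observed outcomes divided by total observations."""
    if observations <= 0:
        raise ValueError("observations must be a positive count")
    return observed_outcomes / observations


# Counts taken from the slides.
admitted = empirical_probability(400, 600)      # applicants let in
rotten_apple = empirical_probability(5, 100)    # rotten apples
hit_corvette = empirical_probability(182, 200)  # carts that hit the Corvette

print(f"admitted: {admitted:.3f}")
print(f"rotten apple: {rotten_apple:.2f}")
print(f"hit Corvette: {hit_corvette:.2f}")
```

Note that 400 / 600 is .667 to three decimals, so the slide's 66% is a rounded-down figure.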

7 2. Classic probability: a priori probabilities based on logic
rather than on data or experience. All options are equally likely (deductive rather than inductive).
Examples: likelihood of getting a question right on a multiple choice test; being chosen at random to be team captain; lottery
Probability = Number of outcomes of specific event / Number of all possible events
In throwing a die, what is the probability of getting a “2”? Number of sides with a 2 / Number of sides = 1 / 6 ≈ .167, about a 17% chance of getting a two
In tossing a coin, what is the probability of getting a tail? Number of sides with a tail / Number of sides = 1 / 2 = .50, a 50% chance of getting a tail
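
As a small sketch of the classic (a priori) approach, the same ratio can be computed exactly with Python's `fractions` module. The helper name `classical_probability` is hypothetical, chosen for this example only.

```python
from fractions import Fraction


def classical_probability(favorable: int, possible: int) -> Fraction:
    """Equally likely outcomes: favorable outcomes over all possible outcomes."""
    return Fraction(favorable, possible)


die_two = classical_probability(1, 6)    # one side shows a 2, six sides total
coin_tail = classical_probability(1, 2)  # one side is a tail, two sides total

print(die_two, f"= {float(die_two):.4f}")
print(coin_tail, f"= {float(coin_tail):.4f}")
```

Keeping the result as a fraction avoids the rounding question entirely; converting to a decimal gives roughly .1667 for the die.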

8 3. Subjective probability: based on someone’s personal
judgment (often an expert), and often used when empirical and classic approaches are not available.
Likelihood that a company will invent a new type of battery
Likelihood of getting a “B” in the class
60% chance that Patriots will play at Super Bowl
There is a 5% chance that Verizon will merge with Sprint
Bob says he is 90% sure he could swim across the river

9 Approach and Example
Empirical: There is a 2 percent chance of twins in a randomly-chosen birth
Classical: There is a 50% probability of heads on a coin flip.
Subjective: There is a 5% chance that Verizon will merge with Sprint

10 The probability of an event
The probability of an event is the relative likelihood that the event will occur. The probability of event A [denoted P(A)] must lie within the interval from 0 to 1, endpoints included: 0 ≤ P(A) ≤ 1. If P(A) = 0, then the event cannot occur. If P(A) = 1, then the event is certain to occur.

11 The probabilities of all simple events must sum to 1
P(S) = P(E1) + P(E2) + … + P(En) = 1
For example, if purchases were made by the following methods: credit card: 32%, debit card: 20%, cash: 35%, check: 13% (Sum = 100%)
P(credit card) = .32, P(debit card) = .20, P(cash) = .35, P(check) = .13 (Sum = 1.0)

12 What is the complement of the probability of an event
The probability of event A = P(A). The probability of the complement of the event A’ = P(A’). A’ is called “A prime”. Complement of A just means probability of “not A”.
P(A) + P(A’) = 100%
P(A) = 100% - P(A’)
P(A’) = 100% - P(A)
Probability of getting a rotten apple: 5% chance of “rotten apple”, 95% chance of “not rotten apple”, 100% chance of rotten or not
Probability of getting into an educational program: 66% chance of “admitted”, 34% chance of “not admitted”, 100% chance of admitted or not
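
The two rules above (simple events sum to 1, and P(A’) = 1 - P(A)) can be checked with a short sketch. The dictionary and the `complement` helper are illustrative names, not part of the course material; the numbers are the payment-method and rotten-apple figures from the slides.

```python
import math

# Payment-method probabilities from the slide on simple events.
payment_probs = {"credit card": 0.32, "debit card": 0.20, "cash": 0.35, "check": 0.13}

total = sum(payment_probs.values())
print(f"sum of simple-event probabilities: {total:.2f}")


def complement(p: float) -> float:
    """Probability of 'not A' given P(A)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("a probability must lie between 0 and 1 inclusive")
    return 1.0 - p


not_rotten = complement(0.05)  # rotten apple example
print(f"P(not rotten apple) = {not_rotten:.2f}")
```

The comparison against 1 is done with `math.isclose` in practice, because decimal fractions such as .32 are not stored exactly in floating point.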

13 Two mutually exclusive characteristics: if the occurrence of any one of them automatically implies the non-occurrence of the remaining characteristic. Two events are mutually exclusive if they cannot occur at the same time (i.e. they have no outcomes in common). Two propositions that logically cannot both be true. For example, a car repair is either covered by the warranty (A) or not (B). [Figure: Warranty versus No Warranty]

14 Collectively Exhaustive Events
Events are collectively exhaustive if their union is the entire sample space S. Two mutually exclusive, collectively exhaustive events are dichotomous (or binary) events. For example, a car repair is either covered by the warranty (A) or not (B). [Figure: Warranty versus No Warranty]

15 Satirical take on being “mutually exclusive”
No Warranty Warranty Recently a public figure in the heat of the moment inadvertently made a statement that reflected extreme stereotyping that many would find highly offensive. It is within this context that comical satirists have used the concept of being “mutually exclusive” to have fun with the statement. Decent , family man Arab Transcript: Speaker 1: “He’s an Arab” Speaker 2: “No ma’am, no ma’am. He’s a decent, family man, citizen…”

16 Union versus Intersection
Union of two events means Event A or Event B will happen: P(A ∪ B)
Intersection of two events means Event A and Event B will happen, also called a “joint probability”: P(A ∩ B)

17 The union of two events: all outcomes in the
sample space S that are contained either in event A or in event B or both (denoted A ∪ B or “A or B”). ∪ may be read as “or” since one or the other or both events may occur.

18 The union of two events: all outcomes contained either in event A or in event B or both (denoted A ∪ B or “A or B”). What is the probability of drawing a red card or a queen? What is Q ∪ R? It is the possibility of drawing either a queen (4 ways) or a red card (26 ways) or both (2 ways).

19 Probability of picking a Queen, a Red, or both
P(Q) = 4/52 (4 queens in a deck)
P(R) = 26/52 (26 red cards in a deck)
P(Q ∩ R) = 2/52 (2 red queens in a deck): probability of picking both R and Q
When you add P(A) and P(B) together, you count P(A and B) twice. So, you have to subtract P(A ∩ B) to avoid over-stating the probability.
P(Q ∪ R) = P(Q) + P(R) – P(Q ∩ R) = 4/52 + 26/52 – 2/52 = 28/52 = .5385 or 53.85%
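
One way to convince yourself of the addition rule is to enumerate a deck and count. The sketch below builds a standard 52-card deck (the rank and suit labels are arbitrary strings chosen for this example) and compares a direct count of "queen or red" with P(Q) + P(R) – P(Q ∩ R).

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

queens = {card for card in deck if card[0] == "Q"}
reds = {card for card in deck if card[1] in ("hearts", "diamonds")}


def prob(event: set) -> Fraction:
    """Classical probability of an event (a set of cards) in one random draw."""
    return Fraction(len(event), len(deck))


p_union_counted = prob(queens | reds)  # count every card that is a queen or red
p_union_by_rule = prob(queens) + prob(reds) - prob(queens & reds)

print(p_union_counted, p_union_by_rule, f"{float(p_union_counted):.4f}")
```

Both routes give 28/52 (printed in lowest terms as 7/13), because the two red queens are counted once in the union but twice in P(Q) + P(R).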

20 Union versus Intersection
Union of two events means Event A or Event B will happen: P(A ∪ B)
Intersection of two events means Event A and Event B will happen, also called a “joint probability”: P(A ∩ B)

21 The intersection of two events: all outcomes contained in both event A and event B (denoted A ∩ B or “A and B”). What is the probability of drawing a red queen? What is Q ∩ R? It is the possibility of drawing both a queen and a red card (2 ways).

22 Poodles and Labs: Mutually Exclusive
If two events are mutually exclusive (or disjoint) their intersection is a null set (and we can use the “Special Law of Addition”) P(A ∩ B) = 0 Intersection of two events means Event A and Event B will happen Examples: mutually exclusive If A = Poodles If B = Labradors Poodles and Labs: Mutually Exclusive (assuming purebred)

23 Poodles and Labs: Mutually Exclusive
If two events are mutually exclusive (or disjoint) their intersection is a null set (and we can use the “Special Law of Addition”): P(A ∩ B) = 0
Intersection of two events means Event A and Event B will happen
Examples: If A = Poodles (let’s say 10% of dogs are poodles) and B = Labradors (let’s say 15% of dogs are labs), what’s the probability of picking a poodle or a lab at random from the dog pound?
P(A ∪ B) = P(A) + P(B)
P(poodle or lab) = P(poodle) + P(lab)
P(poodle or lab) = (.10) + (.15) = (.25)
Poodles and Labs: Mutually Exclusive (assuming purebred)

24 Practice
2 / 5 = .40
Based on a priori probability (all options equally likely, not based on previous experience or data)
Based on expert opinion (don’t have previous data for these two companies merging together)
Based on frequency data (percent of rockets that successfully launched)

25 Practice
Based on a priori probability (all options equally likely, not based on previous experience or data)
30 / 100 = .30
Based on frequency data (percent of times at bat that successfully resulted in hits)
Based on frequency data (percent of pages that are “fake”)

26 Practice
5 / 50 = .10
Based on frequency data (percent of students who chose to be Economics majors)

27 . Homework Assignment

28 Raw scores, z scores & probabilities
Raw scores (actual data): We care about this! What is the actual number on this scale? “height” vs “weight”, “pounds” vs “test score”
Distance from the mean (z scores): used to convert between raw scores and proportion of curve
Proportion of curve (area from mean): We care about this! “percentiles”, “percent of people”, “proportion of curve”, “relative position”
Example: 68% of the curve lies between z = -1 and z = 1

29 Normal distribution: raw scores, z-scores, probabilities
Have raw score, find z: use the formula
Have z, find raw score: use the formula
Have z, find area (probability): use the z table
Have area (probability), find z: use the z table

30 Raw scores, z scores & probabilities
Notice: 3 types of numbers raw scores z scores probabilities Mean = 50 Standard deviation = 10 z = -2 z = +2 If we go up two standard deviations z score = +2.0 and raw score = 70 If we go down two standard deviations z score = -2.0 and raw score = 30
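
The conversion between raw scores and z scores is a one-line formula in each direction. The sketch below uses hypothetical helper names (`z_from_raw`, `raw_from_z`) and the slide's distribution with mean 50 and standard deviation 10.

```python
def z_from_raw(x: float, mean: float, sd: float) -> float:
    """How many standard deviations the raw score x lies from the mean."""
    return (x - mean) / sd


def raw_from_z(z: float, mean: float, sd: float) -> float:
    """Raw score that sits z standard deviations from the mean."""
    return mean + z * sd


mean, sd = 50, 10
print(raw_from_z(2.0, mean, sd))   # two standard deviations above the mean
print(raw_from_z(-2.0, mean, sd))  # two standard deviations below the mean
print(z_from_raw(70, mean, sd))    # back from raw score to z score
```

The third type of number, the probability (area under the curve), needs the z table or a normal-distribution function rather than this formula.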

31 Always draw a picture! Homework worksheet

32 Homework worksheet, problem 1
Area within 1 sd of the mean (28 to 32, z = -1 to z = 1): .6800 (also fine: 68%)

33 Homework worksheet, problem 2
Area within 2 sd of the mean (26 to 34, z = -2 to z = 2): .9500 (also fine: 95%)

34 Homework worksheet, problem 3
Area within 3 sd of the mean (24 to 36, z = -3 to z = 3): .9970 (also fine: 99.7%)

35 Homework worksheet, problem 4
Area on one side of the mean (z = 0): .5000 (also fine: 50%)

36 Homework worksheet, problem 5
z = (33 - 30) / 2 = 1.5. Go to table: .4332 (also fine: 43.32%)

37 Homework worksheet, problem 6
z = (33 - 30) / 2 = 1.5. Go to table: .4332. Add area of lower half: .4332 + .5000 = .9332 (also fine: 93.32%)

38 Homework worksheet, problem 7
z = (33 - 30) / 2 = 1.5. Go to table: .4332. Subtract from .5000: .5000 - .4332 = .0668 (also fine: 6.68%)

39 Homework worksheet, problem 8
z = (29 - 30) / 2 = -.5. Go to table: .1915. Add to upper half of curve: .1915 + .5000 = .6915 (also fine: 69.15%)

40 Homework worksheet, problem 9
z = (25 - 30) / 2 = -2.5. Go to table: .4938. z = (31 - 30) / 2 = .5. Go to table: .1915. Add the two areas: .4938 + .1915 = .6853 (also fine: 68.53%)

41 Homework worksheet, problem 10
z = (27 - 30) / 2 = -1.5. Go to table: .4332. Subtract from .5000: .5000 - .4332 = .0668 (also fine: 6.68%)

42 Homework worksheet, problem 11
z = (27 - 30) / 2 = -1.5. Go to table: .4332. Add the other half of the curve: .4332 + .5000 = .9332 (also fine: 93.32%)

43 Homework worksheet, problem 12
z = (32 - 30) / 2 = 1.0. Go to table: .3413. Subtract from .5000: .5000 - .3413 = .1587 (also fine: 15.87%)
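
The table lookups in these worksheet problems can be cross-checked in code. The sketch below swaps the printed z table for Python's `statistics.NormalDist` (standard library, Python 3.8+); the helper names are hypothetical, and the mean of 30 and standard deviation of 2 are the worksheet's values.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def area_mean_to_z(z: float) -> float:
    """Area between the mean and z, which is what the course's z table lists."""
    return abs(STANDARD_NORMAL.cdf(z) - 0.5)


def area_below(x: float, mean: float, sd: float) -> float:
    """Proportion of the curve below raw score x."""
    return NormalDist(mean, sd).cdf(x)


def area_above(x: float, mean: float, sd: float) -> float:
    """Proportion of the curve above raw score x."""
    return 1.0 - area_below(x, mean, sd)


def area_between(low: float, high: float, mean: float, sd: float) -> float:
    """Proportion of the curve between two raw scores."""
    return area_below(high, mean, sd) - area_below(low, mean, sd)


mean, sd = 30, 2
print(f"mean to z = 1.5:   {area_mean_to_z(1.5):.4f}")
print(f"below 33:          {area_below(33, mean, sd):.4f}")
print(f"above 33:          {area_above(33, mean, sd):.4f}")
print(f"between 25 and 31: {area_between(25, 31, mean, sd):.4f}")
print(f"above 32:          {area_above(32, mean, sd):.4f}")
```

Working from the cumulative area removes the "add half" or "subtract from .5000" step, but drawing the picture first is still the easiest way to decide which of these functions a problem calls for.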

44 Homework worksheet, problem 13
50th percentile = median = 30 (z = 0). In a normal curve Median = Mean = Mode.

45 Homework worksheet, problem 14
Scores of 28 and 32 (z = -1 and z = 1) border the middle .6800 of the curve (1 sd on each side of the mean)

46 z table provides area from mean to score
Problem 15: 77th percentile. Find area of interest: .7700 - .5000 = .2700. Find nearest z = .74. x = mean + z σ = 30 + (.74)(2) = 31.48

47 z table provides area from mean to score
Problem 16: 13th percentile. Find area of interest: .5000 - .1300 = .3700 (note: .3700 + .1300 = .50). Find nearest z = -1.13. x = mean + z σ = 30 + (-1.13)(2) = 27.74
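
Going from a percentile back to a raw score is the reverse lookup. As a sketch, `statistics.NormalDist.inv_cdf` can stand in for "find the nearest z in the table"; the function name `raw_score_at_percentile` is an assumption made for this example.

```python
from statistics import NormalDist


def raw_score_at_percentile(percentile: float, mean: float, sd: float) -> float:
    """Raw score below which the given percent of the curve falls."""
    if not 0 < percentile < 100:
        raise ValueError("percentile must be strictly between 0 and 100")
    return NormalDist(mean, sd).inv_cdf(percentile / 100)


mean, sd = 30, 2
for pct in (77, 13, 50):
    x = raw_score_at_percentile(pct, mean, sd)
    print(f"{pct}th percentile: raw score = {x:.2f}")
```

Because the function uses an unrounded z, its answers can differ from the table-based ones in the second decimal place (the table rounds z to two decimals first).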

48 Please use the following distribution with
a mean of 200 and a standard deviation of 40. 80 120 160 200 240 280 320

49 Problem 17
Area within 1 sd of the mean (160 to 240, z = -1 to z = 1): .6800 (also fine: 68%; also fine: .6826)

50 Problem 18
Area within 2 sd of the mean (120 to 280, z = -2 to z = 2): .9500 (also fine: 95%; also fine: .9544)

51 Problem 19
Area within 3 sd of the mean (80 to 320, z = -3 to z = 3): .9970 (also fine: 99.7%; also fine: .9974)

52 Problem 20
z = (raw score - 200) / 40 = .75. Go to table: .2734 (also fine: 27.34%)

53 Problem 22
z = (raw score - 200) / 40 = -.5. Go to table: .1915. Add to upper half of curve: .1915 + .5000 = .6915 (also fine: 69.15%)

54 Problem 23
z = (raw score - 200) / 40 = 0.9. Go to table: .3159. Subtract from .5000: .5000 - .3159 = .1841 (also fine: 18.41%)

55 Problem 24
z = (raw score - 200) / 40 = -.2. Go to table: .0793. z = (raw score - 200) / 40 = .55. Go to table: .2088. Add the two areas: .0793 + .2088 = .2881 (also fine: 28.81%)

56 Problem 25
z = (raw score - 200) / 40 = 1.875. Go to table: .4693 or .4699. Add area of lower half: .4693 + .5000 = .9693 (or .9699) (also fine: 96.93%)
Please note: If z-score rounded to 1.88, then percentile = 96.99%

57 Problem 26
z = (raw score - 200) / 40 = 2.375. Go to table: .4911 or .4913. Subtract from .5000: .5000 - .4911 = .0089 (or .0087) (also fine: 0.89%)
Please note: If z-score rounded to 2.38, then area = .0087

58 Problem 27
z = (raw score - 200) / 40 = -1.75. Go to table: .4599. Add to upper half of curve: .4599 + .5000 = .9599 (also fine: 95.99%)

59 Problem 28
z = (raw score - 200) / 40 = -1.75. Go to table: .4599. Subtract from .5000: .5000 - .4599 = .0401 (also fine: 4.01%)

60 z table provides area from mean to score
Problem 29: 99th percentile. Find area of interest: .9900 - .5000 = .4900. Find nearest z = 2.33. x = mean + z σ = 200 + (2.33)(40) = 293.2

61 z table provides area from mean to score
Problem 30: 33rd percentile. Find area of interest: .5000 - .3300 = .1700 (note: .3300 + .1700 = .50). Find nearest z = -.44. x = mean + z σ = 200 + (-.44)(40) = 182.4

62 z table provides area from mean to score
Problem 31: 40th percentile. Find area of interest: .5000 - .4000 = .1000 (note: .4000 + .1000 = .50). Find nearest z = -.25. x = mean + z σ = 200 + (-.25)(40) = 190

63 z table provides area from mean to score
Problem 32: 67th percentile. Find area of interest: .6700 - .5000 = .1700. Find nearest z = .44. x = mean + z σ = 200 + (.44)(40) = 217.6

64 Mean of 30 and standard deviation of 2
Try this one: Please find the (2) raw scores that border exactly the middle 95% of the curve (.9500 = .4750 + .4750)
Go to table: area .4750, nearest z = 1.96. mean + z σ = 30 + (1.96)(2) = 33.92
Go to table: area .4750, nearest z = -1.96. mean + z σ = 30 + (-1.96)(2) = 26.08
Additional practice

65 Mean of 30 and standard deviation of 2
Try this one: Please find the (2) raw scores that border exactly the middle 99% of the curve (.9900 = .4950 + .4950)
Go to table: area .4950, nearest z = 2.58. mean + z σ = 30 + (2.58)(2) = 35.16
Go to table: area .4950, nearest z = -2.58. mean + z σ = 30 + (-2.58)(2) = 24.84
Additional practice

66 Finding the interval that borders the middle of the curve
Which is wider?
Please find the raw scores that border the middle 95% of the curve
Please find the raw scores that border the middle 99% of the curve
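
To compare the two intervals directly, here is a minimal sketch that computes the raw scores bordering the middle of the curve for any coverage. The function name `middle_interval` is hypothetical, and `statistics.NormalDist` replaces the table lookup, so the 99% borders differ slightly from the table-based 24.84 and 35.16.

```python
from statistics import NormalDist


def middle_interval(coverage: float, mean: float, sd: float) -> tuple:
    """Raw scores that border the middle `coverage` proportion of a normal curve."""
    if not 0 < coverage < 1:
        raise ValueError("coverage must be strictly between 0 and 1")
    tail = (1 - coverage) / 2                # area left in each tail
    z = NormalDist().inv_cdf(1 - tail)       # positive z that cuts off the upper tail
    return mean - z * sd, mean + z * sd


low95, high95 = middle_interval(0.95, 30, 2)
low99, high99 = middle_interval(0.99, 30, 2)
print(f"middle 95%: {low95:.2f} to {high95:.2f} (width {high95 - low95:.2f})")
print(f"middle 99%: {low99:.2f} to {high99:.2f} (width {high99 - low99:.2f})")
```

Capturing more of the curve requires moving the borders farther from the mean, so the 99% interval is the wider one.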

67 Normal distribution: raw scores, z-scores, probabilities
Have raw score, find z: use the formula
Have z, find raw score: use the formula
Have z, find area (probability): use the z table
Have area (probability), find z: use the z table

68 Comments on Dan Gilbert Reading

69 Law of large numbers: As the number of measurements
increases the data becomes more stable and a better approximation of the true (theoretical) probability As the number of observations (n) increases or the number of times the experiment is performed, the estimate will become more accurate.

70 Law of large numbers: As the number of measurements
increases the data becomes more stable and a better approximation of the true signal (e.g. mean) As the number of observations (n) increases or the number of times the experiment is performed, the signal will become more clear (static cancels out) With only a few people any little error is noticed (becomes exaggerated when we look at whole group) With many people any little error is corrected (becomes minimized when we look at whole group)
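
A quick simulation shows the law of large numbers at work. This is only a sketch: it assumes a fair coin (true probability of heads = .50), and the function name and the seed value are arbitrary choices for the example.

```python
import random


def proportion_of_heads(n_flips: int, rng: random.Random) -> float:
    """Flip a simulated fair coin n_flips times; return the observed share of heads."""
    heads = sum(1 for _ in range(n_flips) if rng.random() < 0.5)
    return heads / n_flips


rng = random.Random(276)  # fixed, arbitrary seed so the run is repeatable
for n in (10, 100, 10_000, 100_000):
    estimate = proportion_of_heads(n, rng)
    print(f"n = {n:>7}: proportion of heads = {estimate:.4f}")
```

With only 10 flips the proportion can land far from .50; with 100,000 flips it is typically within a fraction of a percentage point, which is the "static cancels out" idea on the slide.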

71 Sampling distributions of sample means
versus frequency distributions of individual scores
Distribution of raw scores: an empirical probability distribution of the values from a sample of raw scores from a population.
Frequency distributions of individual scores: derived empirically; we are plotting raw data; this is a single sample.
Take a single score (for example Eugene, Melvin, or Preston) from the population. Repeat over and over.

72 Sampling distributions of sample means
Sampling distribution: a theoretical probability distribution of the possible values of some sample statistic that would occur if we were to draw an infinite number of same-sized samples from a population. Important note: “fixed n”.
Sampling distributions of sample means: theoretical distribution; we are plotting means of samples.
Take sample, get mean (for example, the mean for the 1st sample). Repeat over and over.

73 Sampling distributions of sample means
Sampling distribution: a theoretical probability distribution of the possible values of some sample statistic that would occur if we were to draw an infinite number of same-sized samples from a population. Important note: “fixed n”.
Take sample, get mean. Repeat over and over. The result is the distribution of means of samples.

74 Sampling distribution versus frequency distribution
Sampling distribution: a theoretical probability distribution of the possible values of some sample statistic that would occur if we were to draw an infinite number of same-sized samples from a population.
Frequency distributions of individual scores: derived empirically; we are plotting raw data (individuals such as Eugene and Melvin); this is a single sample.
Sampling distributions of sample means: theoretical distribution; we are plotting means of samples (the 2nd sample, the 23rd sample, and so on).

75 Sampling distribution for continuous distributions
Central Limit Theorem: If random samples of a fixed N are drawn from any population (regardless of the shape of the population distribution), as N becomes larger, the distribution of sample means approaches normality, with the overall mean approaching the theoretical population mean.
[Figure: distribution of raw scores (individuals such as Melvin and Eugene) beside the sampling distribution of sample means (2nd sample, 23rd sample)]

76 An example of a sampling distribution of sample means
Sampling distribution: a theoretical probability distribution of the possible values of some sample statistic that would occur if we were to draw an infinite number of same-sized samples from a population.
Notice: Standard Error of the Mean (SEM) is smaller than SD, especially as n increases.
Distribution of raw scores (individuals such as Eugene and Melvin): µ = 100, σ = 3 (Mean = 100, Standard Deviation = 3)
Sampling distribution of sample means (2nd sample, 23rd sample): µ = 100 (Mean = 100), Standard Error of the Mean = 1

77 Central Limit Theorem
Proposition 1: If sample size (n) is large enough (e.g. 100), the mean of the sampling distribution will approach the mean of the population. As n ↑, x̄ will approach µ.
Proposition 2: If sample size (n) is large enough (e.g. 100), the sampling distribution of means will be approximately normal, regardless of the shape of the population. As n ↑, curve will approach normal shape.
Proposition 3: The standard deviation of the sampling distribution equals the standard deviation of the population divided by the square root of the sample size. As n increases SEM decreases. As n ↑, curve variability gets smaller.
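
Propositions 1 and 3 can be checked with a small simulation. The sketch below is an illustration under stated assumptions: the population is a deliberately skewed exponential distribution with mean 1 and standard deviation 1 (chosen for this example, not taken from the slides), samples have a fixed n of 25, and the function name and seed are arbitrary.

```python
import math
import random
import statistics


def sample_means(n: int, n_samples: int, rng: random.Random) -> list:
    """Draw n_samples samples of fixed size n from a skewed population; return each sample's mean."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(n_samples)
    ]


population_mean = 1.0  # mean of an exponential distribution with rate 1
population_sd = 1.0    # its standard deviation
n = 25

rng = random.Random(2015)  # fixed, arbitrary seed so the run is repeatable
means = sample_means(n, 5_000, rng)

mean_of_means = statistics.fmean(means)
sd_of_means = statistics.pstdev(means)
predicted_sem = population_sd / math.sqrt(n)

print(f"mean of sample means:     {mean_of_means:.3f} (population mean {population_mean})")
print(f"sd of sample means (SEM): {sd_of_means:.3f} (predicted {predicted_sem:.3f})")
```

The simulated values land close to the population mean and to σ / √n. A histogram of `means` would also look far more bell-shaped than the skewed population, which is Proposition 2.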

78 Central Limit Theorem: If random samples of a fixed N are drawn from any population (regardless of the shape of the population distribution), as N becomes larger, the distribution of sample means approaches normality, with the overall mean approaching the theoretical population mean.
Proposition 1: If sample size (n) is large enough (e.g. ~30) the mean of the sampling distribution will approach the mean of the population.
Law of large numbers: As the number of measurements increases the data becomes more stable and a better approximation of the true (theoretical) probability. Larger sample sizes tend to be associated with stability. As the number of observations (n) increases or the number of times the experiment is performed, the estimate will become more accurate.
These ideas are related but not identical.

79 Central Limit Theorem
Take sample (n = 5), get mean. Repeat over and over.
Proposition 2: If sample size (n) is large enough (e.g. 100), the sampling distribution of means will be approximately normal, regardless of the shape of the population.
[Figure: three populations with their sampling distributions for n = 2, n = 5, n = 4, n = 30, n = 5, and n = 25]

80 Central Limit Theorem Proposition 2: If sample size (n) is large enough (e.g. 100) The sampling distribution of means will be approximately normal, regardless of the shape of the population

83 Central Limit Theorem Proposition 3: The standard deviation of the sampling distribution equals the standard deviation of the population divided by the square root of the sample size (SEM = σ / √n). As n increases SEM decreases.

87 Animation for creating sampling distribution of sample means
Central Limit Theorem: If random samples of a fixed N are drawn from any population (regardless of the shape of the population distribution), as N becomes larger, the distribution of sample means approaches normality, with the overall mean approaching the theoretical population mean.
[Animation: distribution of raw scores and distribution of a single sample (individuals such as Eugene and Melvin) feeding the sampling distribution of sample means (mean for sample 7, mean for sample 12)]

90 Writing Assignment: Writing a letter to a friend
Imagine you have a good friend (pick one). This is a good friend whom you consider to be smart and interested in stuff generally. They are teaching themselves stats (hoping to test out of the class) but need your help on a couple of ideas. For this assignment please write your friend/mom/dad/favorite cousin a letter answering these two questions (feel free to use diagrams and drawings if you think that can help):
Dear Friend,
1. I’m struggling with this whole Central Limit Theorem idea. Could you describe for me the difference between a distribution of raw scores, and a distribution of sample means?
2. I also don’t get the “three propositions of the Central Limit Theorem”. They all seem to address sample size, but I don’t get how sample size could affect these three things. If you could help explain it, that would be really helpful.

93 Comments on Dan Gilbert Reading

94 Standard Error of the Mean (SEM)
Building towards Confidence Intervals
Reviewing the components for calculating confidence intervals:
How to calculate confidence intervals: simply find the raw score that is a certain distance from the mean that is associated with an area under the curve.
When to use confidence intervals: when you are estimating (guessing) a single number by providing a likely range that the number appears in.
The relevance of the Central Limit Theorem: when we are predicting a value we will use the standard error of the mean (rather than the standard deviation).

95 Building towards confidence intervals, Part 1
Normal distribution: raw scores, z-scores, probabilities
Have raw score, find z: use the formula
Have z, find raw score: use the formula
Have z, find area (probability): use the z table
Have area (probability), find z: use the z table

96 Writing Assignment - Try this one: Please find the (2) raw scores that border exactly the middle 95% of the curve (.9500 = .4750 + .4750). Mean of 100 and standard deviation of 5.
Go to table: area .4750, nearest z = 1.96. mean + z σ = 100 + (1.96)(5) = 109.80
Go to table: area .4750, nearest z = -1.96. mean + z σ = 100 + (-1.96)(5) = 90.20
Building towards confidence intervals, Part 1

97 Try this one: Please find the (2) raw scores that border exactly the middle 95% of the curve (.9500 = .4750 + .4750). Mean of 30 and standard deviation of 2.
Go to table: area .4750, nearest z = 1.96. mean + z σ = 30 + (1.96)(2) = 33.92
Go to table: area .4750, nearest z = -1.96. mean + z σ = 30 + (-1.96)(2) = 26.08
Building towards confidence intervals, Part 1

98 Try this one: Please find the (2) raw scores that border exactly the middle 99% of the curve (.9900 = .4950 + .4950). Mean of 30 and standard deviation of 2.
Go to table: area .4950, nearest z = 2.58. mean + z σ = 30 + (2.58)(2) = 35.16
Go to table: area .4950, nearest z = -2.58. mean + z σ = 30 + (-2.58)(2) = 24.84
Building towards confidence intervals, Part 1

99 Finding the interval that borders the middle of the curve
Which is wider?
Please find the raw scores that border the middle 95% of the curve
Please find the raw scores that border the middle 99% of the curve
Building towards confidence intervals, Part 1

100 Remember Confidence Intervals
95% Confidence Interval: We can be 95% confident that the estimated score really does fall between these two scores (the raw scores that border the middle 95% of the curve).
99% Confidence Interval: We can be 99% confident that the estimated score really does fall between these two scores (the raw scores that border the middle 99% of the curve).
Building towards confidence intervals, Part 1

101 Building towards confidence intervals
Part 2

102 Remember Confidence Intervals
Standard Error of the Mean (SEM)
Confidence Intervals (based on z): We use a range of values to estimate a value, such as a population mean, with a known degree of certainty. The interval refers to possible values of the population mean.
We can be reasonably confident that the population mean falls in this range (90%, 95%, or 99% confident). In the long run, a series of intervals like the one we figured out will capture the population mean about 95% of the time.
We can actually generate a CI for any confidence level we want; these are the most common.
Subjective vs Empirical
Building towards confidence intervals, Part 3

103 Building towards confidence intervals, Part 3
Confidence Intervals (based on z): A range of values that, with a known degree of certainty, includes an unknown population characteristic, such as a population mean.
How can we make our confidence interval smaller?
Decrease variability by increasing sample size
Decrease variability through more careful assessment and measurement practices (minimize noise)
Decrease level of confidence

104 Standard Error of the Mean (SEM)
Building towards Confidence Intervals
We now know all components of actually calculating confidence intervals:
When to use confidence intervals: when you are estimating (guessing) a single number by providing a likely range that the number appears in.
How to calculate confidence intervals: simply find the raw score that is a certain distance from the mean that is associated with an area under the curve.
The relevance of the Central Limit Theorem: when we are predicting a value we will use the standard error of the mean (rather than the standard deviation).

105 We will be using this same logic for “confidence intervals”
Mean = 50, Standard deviation = 10. Find the scores for the middle 95% (.9500 = .4750 + .4750).
x = mean ± (z)(standard deviation)
1) Go to z table: find z score for area .4750; z = 1.96
2) x = mean + (z)(standard deviation): x = 50 + (-1.96)(10) = 30.4
3) x = mean + (z)(standard deviation): x = 50 + (1.96)(10) = 69.6
Scores 30.4 and 69.6 capture the middle 95% of the curve.

106 Confidence intervals
Mean = 50, Standard deviation = 10, n = 100, s.e.m. = 1. Find the scores for the middle 95% (.9500 = .4750 + .4750).
For “confidence intervals”: same logic, same z-score. But we’ll replace standard deviation with the standard error of the mean.
standard error of the mean = σ / √n = 10 / √100 = 1
x = mean ± (z)(s.e.m.)
x = 50 + (1.96)(1) = 51.96
x = 50 + (-1.96)(1) = 48.04
95% Confidence Interval is captured by the scores 48.04 to 51.96

107 Find a 95% Confidence Interval for this distribution
mean = 121, standard deviation = 15, n = 25
raw score = mean ± (z score)(standard error)
standard error of the mean = σ / √n = 15 / √25 = 3
raw score = mean ± (z score)(sem), that is, x = x̄ ± (z)(σx̄)
x = 121 ± (1.96)(3) = 121 ± 5.88
Confidence interval: (115.12, 126.88)
Please notice: We know the standard deviation and can calculate the standard error of the mean from it
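
The steps above can be collected into one small function. This is a sketch under the slide's assumption that the population standard deviation is known; the name `z_confidence_interval` is hypothetical, and `statistics.NormalDist` supplies the critical z instead of the printed table.

```python
import math
from statistics import NormalDist


def z_confidence_interval(sample_mean: float, population_sd: float, n: int,
                          confidence: float = 0.95) -> tuple:
    """z-based confidence interval for a mean when the population sd is known."""
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    sem = population_sd / math.sqrt(n)                    # standard error of the mean
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)    # about 1.96 for 95%
    margin = z * sem
    return sample_mean - margin, sample_mean + margin


low, high = z_confidence_interval(121, 15, 25)
print(f"95% CI: ({low:.2f}, {high:.2f})")

low2, high2 = z_confidence_interval(50, 10, 100)
print(f"95% CI: ({low2:.2f}, {high2:.2f})")
```

The only change from the "middle 95% of the curve" calculation is that the standard deviation is replaced by σ / √n, so larger samples give narrower intervals.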

108 Confidence intervals
Tell me the scores that border exactly the middle 95% of the curve. We know this: raw score = mean ± (z score)(s.d.)
Construct a 95 percent confidence interval around the mean. Similar, but uses the standard error of the mean based on population s.d.: raw score = mean ± (z score)(s.e.m.)

109 Confidence intervals
Construct a 95 percent confidence interval around the mean: tell me the scores that border exactly the middle 95% of the curve (use z score of 1.96).
How do we know which z score to use? z scores for different levels of confidence:
z score   Level of Alpha   Confidence
1.64      .10              90%
1.96      .05              95%
2.58      .01              99%

110 Confidence intervals and alpha
Confidence Interval of 99% has an alpha of 1% (α = .01): critical z = -2.58 and 2.58. Area outside the confidence interval is alpha.
Confidence Interval of 95% has an alpha of 5% (α = .05): critical z = -1.96 and 1.96. Area in the tails is called alpha.
Confidence Interval of 90% has an alpha of 10% (α = .10): critical z = -1.64 and 1.64. Area associated with the most extreme scores is called alpha.
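
The link between confidence level, alpha, and critical z can be sketched as follows. The function name `critical_z` is an assumption for this example; it splits alpha evenly between the two tails, as on the slide.

```python
from statistics import NormalDist


def critical_z(confidence: float) -> float:
    """Positive critical z for a two-tailed interval with the given confidence level."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    alpha = 1 - confidence                       # total area in both tails
    return NormalDist().inv_cdf(1 - alpha / 2)   # alpha / 2 sits in each tail


for confidence in (0.90, 0.95, 0.99):
    z = critical_z(confidence)
    print(f"{confidence:.0%} confidence, alpha = {1 - confidence:.2f}: critical z = ±{z:.2f}")
```

The unrounded 90% value is about 1.645, which is why some tables list 1.64 and others 1.65.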

111 Thank you! See you next time!!

