Agenda: Probability, Sampling error, Hypothesis Testing, Significance level

1 Agenda
- Probability
- Sampling error
- Hypothesis Testing
- Significance level

2 Tossing a Coin
If you toss a fair coin, what are the chances of getting heads?

3 Probability
The likelihood of getting Heads is 1 in 2.
Probability (p) = Frequency of occurrence (# of times getting Heads) / Number of trials (# of tosses)
Probability of getting Heads = .5
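The frequency-over-trials definition can be checked with a small simulation. This is only a sketch: the function name `estimate_heads_probability` and the fixed seed are illustrative choices, not part of the slides.

```python
import random

def estimate_heads_probability(num_tosses, seed=0):
    """Estimate P(Heads) as (# of Heads) / (# of tosses) for a simulated fair coin."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(1 for _ in range(num_tosses) if rng.random() < 0.5)
    return heads / num_tosses

# The estimate tends to settle near .5 as the number of tosses grows.
for n in (10, 100, 10_000):
    print(n, estimate_heads_probability(n))
```

With only a few tosses the estimate can be far from .5; the definition describes what happens over a large number of trials.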

4 Probability
- Probability (p) ranges between 0 and 1
- p = 1 means that the event would occur in every trial
- p = 0 means that the event would never occur in any trial
- The closer the probability is to 1, the more likely the event will occur
- The closer the probability is to 0, the less likely the event will occur

5 Probability
Probability (p) | %    | Meaning
1               | 100% | Every time
.5              | 50%  | Half the time
.1              | 10%  | 1 in 10
.05             | 5%   | 5 in 100
.01             | 1%   | 1 in 100

6 Tossing a Coin
If a large number of people each toss a fair coin 10 times, the number of heads per person follows a bell-shaped distribution with a mean of 5.
[Figure: histogram; x-axis: # of times getting heads (0 to 10); y-axis: # of people]

7 Tossing a Coin
Normal Distribution (Bell Curve)
- Symmetrical
- Largest # of cases at the mean
- Few extreme cases
[Figure: histogram; x-axis: # of times getting heads (0 to 10); y-axis: # of people]
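The shape described on the coin-toss slides can be computed exactly. The exact distribution of the number of heads in 10 tosses is binomial, which the bell curve approximates; the sketch below (with the illustrative name `heads_distribution`) prints it as a rough text histogram.

```python
from math import comb

def heads_distribution(num_tosses=10, p_heads=0.5):
    """Exact binomial probability of getting k heads, for k = 0..num_tosses."""
    return [
        comb(num_tosses, k) * p_heads**k * (1 - p_heads) ** (num_tosses - k)
        for k in range(num_tosses + 1)
    ]

dist = heads_distribution()
mean_heads = sum(k * p for k, p in enumerate(dist))

# One '#' per percentage point: symmetrical, peak at the mean, few extreme cases.
for k, p in enumerate(dist):
    print(f"{k:2d} heads: {p:.4f} {'#' * round(p * 100)}")
print("mean number of heads:", mean_heads)
```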

8 Simple Classical Probability
- Probability of getting a particular outcome (e.g., Heads on a coin) when each outcome has an equal chance of occurring
- Probability in a whole group
- "Probability" usually refers to simple classical probability

9 Conditional Probability
- Conditional probability refers to the probability of one event given that another event has occurred.
- If knowledge of one event helps to predict the outcome of another event, the two events are dependent.
- If knowledge of one event does not help to predict the outcome of another event, the two events are independent.

10 Probability of purchasing a Gucci bag
- Only 5 in 100 shoppers actually buy a bag
- Probability (p) = .05

11 Example: Conditional Probability
Event 1: Purchase of a Gucci bag
Event 2: Possession of Ferragamo shoes

12 Example: Conditional Probability
- Of shoppers who have Ferragamo shoes, 20 in 100 buy a bag
- Conditional probability = .2
- Of shoppers who do not have Ferragamo shoes, 1 in 100 buy a bag
- Conditional probability = .01

13 Conditional Probability
Ferragamo Shoes | Buy Gucci Bag | Not Buy
Own             | 20%           | 80%
Not Own         | 1%            | 99%
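The three probabilities in the Gucci example can be tied together with the law of total probability. The sketch below assumes the slide figures are mutually consistent; the share of shoppers who own Ferragamo shoes is not stated on the slides and is derived here only for illustration.

```python
# Figures taken from the slides.
p_buy = 0.05                # 5 in 100 shoppers buy a bag
p_buy_given_own = 0.20      # 20 in 100 shoe owners buy a bag
p_buy_given_not_own = 0.01  # 1 in 100 non-owners buy a bag

# Law of total probability:
#   p_buy = p_own * p_buy_given_own + (1 - p_own) * p_buy_given_not_own
# Solving for p_own (a derived value, not given on the slides):
p_own = (p_buy - p_buy_given_not_own) / (p_buy_given_own - p_buy_given_not_own)

# Bayes' rule: probability that a buyer owns the shoes.
p_own_given_buy = p_buy_given_own * p_own / p_buy

# The events are dependent because knowing about the shoes changes the probability of buying.
events_are_dependent = p_buy_given_own != p_buy

print(f"implied share of shoe owners: {p_own:.3f}")
print(f"P(own shoes | bought a bag):  {p_own_given_buy:.3f}")
print("dependent events:", events_are_dependent)
```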

14 Example: Heights
Population: all students at School; mean height = 5.8 ft.
Sample:
- Draw groups of 100 students
- If your selection is random, what is the expected mean height of the group?

15 Example: Heights
[Figure: distribution of sample means; x-axis: sample mean (5.4 to 6.1); y-axis: # of samples; the center value 5.8 = Expected Value]
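The heights example can be simulated by drawing many random groups of 100 and recording each group's mean. The population mean of 5.8 comes from the slides; the population standard deviation (0.3), the normal shape of the population, and the number of repeated samples are assumptions made only for this sketch.

```python
import random
import statistics

POP_MEAN = 5.8      # from the slides
POP_SD = 0.3        # assumption: the slides do not give a standard deviation
SAMPLE_SIZE = 100   # from the slides
NUM_SAMPLES = 2000  # assumption: how many groups we draw

rng = random.Random(1)  # fixed seed so the run is repeatable
sample_means = [
    statistics.fmean([rng.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE)])
    for _ in range(NUM_SAMPLES)
]

# The sample means cluster around the population mean (the expected value).
print("mean of sample means:", round(statistics.fmean(sample_means), 3))
print("spread of sample means:", round(statistics.stdev(sample_means), 3))
```

Individual sample means differ from 5.8 purely by chance, which is the random sampling error discussed later in the deck.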

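16 Example: Height at two Schools
- Draw a sample from School A and School B
- Measure heights of students in a sample from each school
- Question: do the populations of School A and School B have the same mean height?
[Figure: Sample A drawn from School A; Sample B drawn from School B]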

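17 Example: Height at two Schools
- Mean height of Sample A is 5.9 and mean height of Sample B is 5.7
- The chances of getting a mean of 5.7 and a mean of 5.9 from the same type of population are high
- Question: do the populations of School A and School B have the same mean height?
[Figure: Sample A (mean 5.9) drawn from School A; Sample B (mean 5.7) drawn from School B]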

18 Example: Height at two Schools
[Figure: distribution of sample means centered at 5.8; y-axis: # of samples; Mean 5.7 and Mean 5.9 marked]

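19 What if…
- Mean height of Sample A is 5.2 and mean height of Sample B is 6.1
- The chances of getting a mean of 5.2 and a mean of 6.1 from the same type of population are low
- Question: do the populations of School A and School B have the same mean height?
[Figure: Sample A (mean 5.2) drawn from School A; Sample B (mean 6.1) drawn from School B]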

20 Example: Height at two Schools
[Figure: distribution of sample means centered at 5.8; y-axis: # of samples; Mean 5.2 and Mean 6.1 marked]

21 Example: Height at two Schools
[Figure: two distributions of sample means, one for School A and one for School B; y-axis: # of samples; Mean 5.2 and Mean 6.1 marked]

22 P > .05 means that …
- The means of the two groups fall in the central 95% area of the normal distribution around one population mean
[Figure: one normal curve with the central 95% marked; Mean 1 and Mean 2 marked]

23 P < .05 means that …
- The means of the two groups do NOT fall in the central 95% area of the normal distribution around one population mean, so it is more reasonable to assume that they belong to different populations
[Figure: two normal curves with means μ1 and μ2]

24 SIMPLE RANDOM SAMPLE
Population of 40; sample of 4: each person has a 1/10 chance of being selected.
[Figure: Samples A, B, C, and D drawn from the population; labels 25% and 50%]

25 Random sampling error
- Random sampling error: difference between sample characteristics and population characteristics caused by chance
- Sampling bias: difference between sample characteristics and population characteristics caused by biased (non-random) sampling

26 SYSTEMATIC SAMPLE
Population of 40; for a sample of 4, take every 10th one.
[Figure: Samples A and B drawn from the population; labels 25% and 50%]
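The two sampling schemes (a simple random sample of 4 from 40, and a systematic sample taking every 10th person) can be sketched as follows. Labelling people 1 to 40 and choosing a random starting point for the systematic sample are assumptions for illustration; the slides only say to take every 10th one.

```python
import random

population = list(range(1, 41))  # 40 people, labelled 1..40 for illustration
sample_size = 4
rng = random.Random(42)          # fixed seed so the run is repeatable

# Simple random sample: every person has the same 4/40 = 1/10 chance of selection.
simple_random_sample = rng.sample(population, sample_size)

# Systematic sample: pick a random start, then take every 10th person.
interval = len(population) // sample_size  # 40 / 4 = 10
start = rng.randrange(interval)            # assumed random start in 0..9
systematic_sample = population[start::interval]

print("simple random:", sorted(simple_random_sample))
print("systematic:   ", systematic_sample)
```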

27
[Figure: three groups with compositions 67% orange / 33% white, 67% orange / 33% white, and 83% orange / 17% white]

28
                   | Sample Statistics | Population Parameters
Mean               | X̄                 | μ
Standard deviation | SD                | σ
Size               | n                 | N

29
1. Infer characteristics of a population from the characteristics of the samples.
2. Hypothesis Testing
3. Statistical Significance
4. The Decision Matrix

30 Inferential Statistics
- Assess: are the sample statistics indicators of the population parameters?
- Differences between 2 groups: did they happen by chance?
- What effect do random sampling errors have on our results?

31 Null Hypothesis
- Says the independent variable (IV) has no influence on the dependent variable (DV).
- There is no difference between the two groups.
- There is no relationship between the two variables.

32 Logic of Inferential Statistics
- You are cautious!
- Default assumption = null hypothesis (no difference)
- Assume any differences in your data are due to chance variation (sampling error)
- What are the chances I would get these results if the null hypothesis is true?
- Only if the pattern is highly unlikely (p < .05) do you reject the null hypothesis

33
Your decision                                                        | True state: data results are by chance (Null is true) | True state: data indicates something is happening (Null is false)
There is nothing happening except chance variation (accept the null) | Correct                                                | Type II error
Data indicates something significant is happening (reject null)      | Type I error                                           | Correct

34 Null Hypothesis
- States there is NO true difference between the groups
- If sample statistics show any difference, it is due to random sampling error
- Referred to as H0
- (Research Hypothesis = Ha)
- If you can reject H0, you can support Ha
- If you fail to reject H0, you cannot support Ha

35 Two Possible Errors
YOU…              | IN FACT, H0 is true | IN FACT, H0 is false
Reject H0         | Type I Error        | Correct
Fail to reject H0 | Correct             | Type II Error

36
         | No fire      | Fire
No Alarm | Correct      | Type II error
Alarm    | Type I error | Correct

37
H0 = null hypothesis = there is NO fire
Ha = alternative hypothesis = there IS a fire
                    | True State: H0 (no fire) | True State: Ha (fire)
Accept H0 (no fire) | Correct                  | Type II error
Reject H0 (alarm)   | Type I error             | Correct

38
- What you want to know is what is going on in the population
- All you have is sample data
- Your hypothesis states there is a difference between groups
- The null hypothesis states there is NO difference between groups
- Even though your sample data show some difference between groups, there is a chance that there is no difference in the population

39
- Be conservative about your conclusion. Unless you are highly confident, don't support your hypothesis over the null hypothesis
- Since you cannot be 100% sure whether your conclusion is correct, you accept up to a 5% risk (a 5% chance of making a TYPE I Error)
- Your p-value tells you how likely results at least as extreme as yours would be if the null hypothesis were true; rejecting only when p < .05 keeps the risk of TYPE I Error at or below 5%

40 Significance Test
- A significance test controls the probability of TYPE I error (falsely rejecting H0)
- It examines how probable it is that a difference at least as large as the observed one would be caused by random sampling error alone
- Reject the null hypothesis if this probability is < .05 (this keeps the probability of TYPE I error at or below .05)
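The link between the .05 cutoff and TYPE I error can be illustrated by simulation: draw two groups from the same population many times, so the null hypothesis is true by construction, and count how often the test rejects anyway. This sketch uses a z-test with a known population standard deviation for simplicity; the population values, group size, and number of trials are assumptions, not figures from the slides.

```python
import random
from statistics import NormalDist, fmean

def two_sided_p_value(group_a, group_b, sigma):
    """z-test p-value for the difference in means, assuming a known population SD."""
    standard_error = sigma * (1 / len(group_a) + 1 / len(group_b)) ** 0.5
    z = (fmean(group_a) - fmean(group_b)) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

ALPHA = 0.05             # significance level
TRIALS = 4000            # assumption: number of repeated experiments
MU, SIGMA = 5.8, 0.3     # assumption: one shared population for both groups
GROUP_SIZE = 30          # assumption

rng = random.Random(7)   # fixed seed so the run is repeatable
false_alarms = 0
for _ in range(TRIALS):
    a = [rng.gauss(MU, SIGMA) for _ in range(GROUP_SIZE)]
    b = [rng.gauss(MU, SIGMA) for _ in range(GROUP_SIZE)]
    if two_sided_p_value(a, b, SIGMA) < ALPHA:
        false_alarms += 1  # rejected H0 although it is true: a TYPE I error

type_one_rate = false_alarms / TRIALS
print("share of false rejections:", type_one_rate)
```

The share of false rejections should land close to the chosen significance level, which is the sense in which the 5% cutoff limits TYPE I error.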

41
P < .05: Reject Null Hypothesis (H0); Support Your Hypothesis (Ha)

42 Logic of Hypothesis Testing
- Statistical tests used in hypothesis testing deal with the probability of a particular event occurring by chance.
- Are the results a common or a rare occurrence if only chance is operating?
- A score (or result of a statistical test) is "Significant" if it is unlikely to occur on the basis of chance alone.

43 Level of Significance
- The "Level of Significance" is a cutoff point for determining significantly rare or unusual scores.
- Scores outside the middle 95% of a distribution are considered "Rare" when we adopt the standard "5% Level of Significance".
- This level of significance can be written as: p = .05
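For a normal distribution, the boundaries of the "middle 95%" can be computed directly. The sketch below uses the standard normal distribution (mean 0, standard deviation 1), so the cutoffs are expressed in standard-deviation units; the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05                  # 5% level of significance
standard_normal = NormalDist()  # mean 0, standard deviation 1

# Split the 5% equally between the two tails.
upper_cutoff = standard_normal.inv_cdf(1 - alpha / 2)
lower_cutoff = -upper_cutoff

central_area = standard_normal.cdf(upper_cutoff) - standard_normal.cdf(lower_cutoff)

print(f"scores beyond {lower_cutoff:.2f} or {upper_cutoff:.2f} SDs count as rare")
print(f"area between the cutoffs: {central_area:.2f}")
```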

44 Decision Rules
Reject Ho (accept Ha) when the sample statistic is statistically significant at the chosen p level; otherwise accept Ho (reject Ha).
Possible errors:
A. You reject the Null Hypothesis when in fact it is true: Type I Error, or Error of Rashness.
B. You accept the Null Hypothesis when in fact it is false: Type II Error, or Error of Caution.

45
Sample (N = 100): Male GPA = 3.3, Female GPA = 3.6
Population: parameter unknown (?), to be inferred from the sample
Ha: Female students have higher GPA than Male students at UH

46 Three Possibilities
Male GPA = 3.3, Female GPA = 3.6
- Females really have higher GPA than Males
- Females with higher GPA are disproportionately selected because of sampling bias
- Females' GPA happened to be higher in this particular sample due to random sampling error

49 What Statistics CANNOT do
- Statistics can NOT think or reason. It's only you who can think.
- Statistics can NOT show causality; it can show co-occurrence, which may suggest causality but does not establish it.
- Statistics is about probability, thus can NOT prove your argument. It can only support it.
- We reject the null hypothesis if probability is < .05 (this keeps the probability of TYPE I error at or below .05)

50 What Statistics CAN Do
- Allow you to grasp a large picture
- Examine the level of co-occurrence of different events (correlation/association)
- Support your argument by providing empirical evidence

52
Question: How can I accurately tell if there is a meaningful difference between the subgroups?
Answer: Use Inferential Statistics techniques to help you decide if the differences you found could be due to chance, or if they are likely to reflect a true difference between the groups.
Footnote: Statistical jargon: if the differences are too large to be due to chance, we say there is a Significant Difference between the groups. We also know the probability that our conclusions may be incorrect.

53 When comparing two groups on MEAN SCORES use the t-test. When you are comparing more than two groups on MEAN SCORES, you use a more complicated version of the t-test, called Analysis of Variance.
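A comparison of two group means from summary statistics can be sketched as below. This is a large-sample normal approximation, not a full t-test: a real t-test would look up the statistic in Student's t distribution, which the Python standard library does not provide. The function name `compare_means` and the example scores are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def compare_means(mean1, sd1, n1, mean2, sd2, n2):
    """Return (z, two-sided p) for the difference between two group means.

    Uses the unpooled standard error and a normal approximation,
    which is reasonable only when both groups are fairly large.
    """
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
    z = (mean1 - mean2) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical test scores for two groups of 50 students each.
z, p = compare_means(72, 10, 50, 75, 12, 50)
print(f"z = {z:.2f}, p = {p:.3f}")
print("reject H0" if p < 0.05 else "fail to reject H0")
```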

54
Males: Mean = 11.3, SD = 2.8, n = 135
Females: Mean = 12.6, SD = 3.4, n = 165
p = .02
** Accept Ha: Mean scores reflect a real difference between genders.
Accept Ho: Mean scores are just chance differences from a single distribution.

55
Married: Mean = 11.9, SD = 3.8, n = 96
Single: Mean = 12.1, SD = 4.3, n = 204
p = .91
Accept Ha: Mean scores reflect a real difference between groups.
** Accept Ho: Mean scores are just chance differences from a single distribution.

56
- To compare two groups on Mean Scores, use the t-test. For more than 2 groups, use Analysis of Variance (ANOVA).
- Survey data from Nominal or Ordinal Scales have no Mean Score, so use Nonparametric Tests to compare them.
- Chi Square tests the difference in Frequency Distributions of two or more groups.
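A Chi Square comparison of two frequency distributions can be sketched for the simplest case, a 2x2 table. The counts below are hypothetical, the function name `chi_square_2x2` is illustrative, no continuity correction is applied, and the p-value shortcut used here is valid only for one degree of freedom (which is what a 2x2 table has).

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2x2 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two groups had the same distribution.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    p_value = erfc(sqrt(chi2 / 2))  # upper tail of chi-square with 1 degree of freedom
    return chi2, p_value

# Hypothetical counts: rows are two groups, columns are "yes" / "no" answers.
observed_counts = [[20, 80],
                   [10, 90]]
chi2, p = chi_square_2x2(observed_counts)
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")
```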

57 When to use various statistics
- Parametric: use with interval or ratio data
- Non-parametric: use with ordinal and nominal data

58 Parametric Tests
- Used with data that have a mean score and standard deviation.
- t-test, ANOVA, and Pearson's Correlation r.
- Use a t-test to compare mean differences between two groups (e.g., male/female and married/single).

59 Parametric Tests
- Use ANalysis Of VAriance (ANOVA) to compare more than two groups (such as age groups or family income levels) to get probability scores for the overall group differences.
- Use Post Hoc Tests to identify which subgroups differ significantly from each other.
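The core quantity behind a one-way ANOVA is the F statistic, the ratio of variation between group means to variation within groups. The sketch below computes only F; turning F into a probability score requires the F distribution, which a statistics package would supply. The function name and the three small groups of scores are hypothetical.

```python
from statistics import fmean

def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA over a list of groups of scores."""
    all_values = [x for group in groups for x in group]
    grand_mean = fmean(all_values)
    k = len(groups)       # number of groups
    n = len(all_values)   # total number of scores

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

# Hypothetical scores for three groups.
scores_by_group = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_statistic = one_way_anova_f(scores_by_group)
print("F =", f_statistic)
```

A large F says the group means differ more than chance variation within groups would suggest; it does not say which groups differ, which is why post hoc tests follow.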

60 T-test
- If p < .05, we conclude that the two groups are drawn from populations with different means (reject H0) at the .05 significance level (95% confidence level)

