1 Fitting probability models to frequency data

2 Review - proportions Data: discrete nominal variable with two states (“success” and “failure”) You can do two things: –Estimate a parameter with confidence interval –Test a hypothesis

3 Estimating a proportion

4 Confidence interval for a proportion*, where Z = 1.96 for a 95% confidence interval. *The Agresti-Coull method
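The interval formula on this slide was an image and is not in the transcript. The following is a minimal sketch, assuming the common "plus four" form of the Agresti-Coull method: p' = (X + 2)/(n + 4), with interval p' ± Z·sqrt(p'(1 − p')/(n + 4)). The function name and the example counts are hypothetical, not taken from the slides.

```python
import math

def agresti_coull_ci(successes, n, z=1.96):
    """Approximate CI for a proportion ('plus four' Agresti-Coull form, assumed)."""
    # Adjusted proportion: add 2 successes and 2 failures.
    p_adj = (successes + 2) / (n + 4)
    half_width = z * math.sqrt(p_adj * (1 - p_adj) / (n + 4))
    return p_adj - half_width, p_adj + half_width

# Hypothetical data: 10 successes in 20 trials.
lower, upper = agresti_coull_ci(10, 20)
print(f"95% CI: {lower:.3f} to {upper:.3f}")
```

Z is left as a parameter so that other confidence levels can be used with the matching normal quantile.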

5 Hypothesis testing: –Want to know something about a population –Take a sample from that population –Measure the sample –What would you expect the sample to look like under the null hypothesis? –Compare the actual sample to this expectation

6 weird not so weird

7 Sample → Test statistic. Null hypothesis → Null distribution. Compare: how unusual is this test statistic? If P < 0.05, reject H₀; if P > 0.05, fail to reject H₀.

8 Binomial test

9 Test statistic For the binomial test, the test statistic is the number of successes

10 Binomial test

11 The binomial distribution

12 Binomial distribution, n = 20, p = 0.5 (figure; horizontal axis: x)

13 Test statistic (figure; test statistic marked on the x axis)

14 P-value: the probability of obtaining the data* if the null hypothesis were true. *Or data showing as great or greater a difference from the null hypothesis

15 P-value: –Add up the probabilities from the null distribution –Start at the test statistic, and go towards the tail –Multiply by 2 for a two-tailed test

16 Binomial distribution, n = 20, p = 0.5: P = 2 × (Pr[16] + Pr[17] + Pr[18] + Pr[19] + Pr[20])
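The sum on this slide can be evaluated directly. This is a minimal sketch, assuming the observed number of successes is 16 (implied by where the sum starts) and using the doubling rule from the previous slide. The helper name and the cap at 1.0 are additions for illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Pr[X = x] for a binomial distribution with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p0 = 20, 0.5
observed = 16  # assumed test statistic: the slide's sum starts at Pr[16]

# Start at the test statistic and add probabilities towards the upper tail.
upper_tail = sum(binomial_pmf(x, n, p0) for x in range(observed, n + 1))

# Multiply by 2 for a two-tailed test; a probability cannot exceed 1.
p_value = min(1.0, 2 * upper_tail)
print(f"P = {p_value:.4f}")  # P = 0.0118
```

Since P < 0.05, this sample would lead to rejecting H₀ under the decision rule on the flowchart slides.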

17 Sample → Test statistic. Null hypothesis → Null distribution. Compare: how unusual is this test statistic? If P < 0.05, reject H₀; if P > 0.05, fail to reject H₀.

18 n = 20, p₀ = 0.5. This is a pain…

19 Calculating P-values: –By hand –Use computer software like JMP or Excel –Use tables

20 Sample → Test statistic. Null hypothesis → Null distribution. Compare: how unusual is this test statistic? If P < 0.05, reject H₀; if P > 0.05, fail to reject H₀.

21 Discrete distribution A probability distribution describing a discrete numerical random variable

22 Discrete distribution A probability distribution describing a discrete numerical random variable Examples: –Number of heads from 10 flips of a coin –Number of flowers in a square meter –Number of disease outbreaks in a year

23 χ² Goodness-of-fit test: compares counts to a discrete probability distribution

24 Hypotheses for χ² test

25 Test statistic for χ² test

26

27

28 Hypotheses for day of birth example

29
Day    Sun.  Mon.  Tues.  Wed.  Thu.  Fri.  Sat.  Total
Obs.   33    41    63     63    47    56    47    350
Exp.   50    50    50     50    50    50    50    350

30 The calculation for Sunday
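The per-day contribution (Observed − Expected)² / Expected and the total χ² can be sketched as below. Two assumptions are made: the Wednesday count of 63 is inferred from the stated total of 350, and every day has an expected count of 350/7 = 50 under H₀.

```python
days = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
# Wednesday (63) is inferred so that the counts sum to the stated total of 350.
observed = [33, 41, 63, 63, 47, 56, 47]
# Under H0 births are equally likely on every day: 350 / 7 = 50 per day.
expected = [sum(observed) / len(days)] * len(days)

# Each category contributes (O - E)^2 / E to the test statistic.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi2 = sum(contributions)

for day, c in zip(days, contributions):
    print(f"{day}: {c:.2f}")           # Sunday: (33 - 50)^2 / 50 = 5.78
print(f"chi-squared = {chi2:.2f}")     # 15.24
```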

31

32 The sampling distribution of χ² by simulation (histogram: frequency against χ²)

33 Sampling distribution of χ² by the χ² distribution
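One way to build such a sampling distribution is sketched below, under assumptions of my own: 350 births are assigned to 7 equally likely days, χ² is computed for each simulated sample, and this is repeated many times. The function names, replicate count and seed are hypothetical. The observed value of 15.24 is the statistic computed from the day-of-birth counts, with the Wednesday count inferred from the total.

```python
import random

def chi2_stat(observed, expected):
    """Chi-squared goodness-of-fit statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def simulate_null(n_births=350, n_days=7, n_reps=5000, seed=1):
    """Simulate chi-squared values when H0 (equal probability per day) is true."""
    rng = random.Random(seed)
    expected = [n_births / n_days] * n_days
    stats = []
    for _ in range(n_reps):
        counts = [0] * n_days
        # Assign each birth to a day uniformly at random.
        for day in rng.choices(range(n_days), k=n_births):
            counts[day] += 1
        stats.append(chi2_stat(counts, expected))
    return stats

null_stats = simulate_null()
observed_chi2 = 15.24  # assumed: statistic for the day-of-birth data
# Simulated P-value: fraction of null samples at least as extreme as observed.
p_sim = sum(s >= observed_chi2 for s in null_stats) / len(null_stats)
print(f"simulated P = {p_sim:.3f}")
```

The simulated P-value varies a little with the seed and the number of replicates. With enough replicates it should approach the tail area of the χ² distribution with 6 degrees of freedom.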

34 Degrees of freedom The number of degrees of freedom specifies which of a family of distributions to use as the sampling distribution

35 Degrees of freedom for χ² test: df = Number of categories - 1 - (Number of parameters estimated from the data)

36 Degrees of freedom for day of birth df = 7 - 1 - 0 = 6

37 Finding the P-value

38 Critical value: the value of the test statistic where P = α.

39 12.59
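To check the critical value, the tail probability of a χ² distribution can be computed without tables. This is a sketch using the closed form that holds only when df is even, which covers df = 6 here. In practice a table or statistical software would normally be used. The statistic 15.24 is an assumption: it is the value obtained from the day-of-birth counts with the Wednesday count inferred from the total.

```python
import math

def chi2_sf_even_df(x, df):
    """P(chi-squared >= x) for a positive even df, via the closed-form sum."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2
    # exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
    return math.exp(-half) * sum(half**i / math.factorial(i) for i in range(df // 2))

df = 7 - 1 - 0  # categories - 1 - parameters estimated from the data
p_at_critical = chi2_sf_even_df(12.59, df)   # should be close to alpha = 0.05
p_observed = chi2_sf_even_df(15.24, df)      # assumed observed statistic

print(f"P at 12.59: {p_at_critical:.3f}")
print(f"P at 15.24: {p_observed:.3f}")
```

A statistic above 12.59 gives P < 0.05, which is the decision reached on the later slide.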

40

41

42 P<0.05, so we can reject the null hypothesis Babies in the US are not born randomly with respect to the day of the week.

43

44 Assumptions of χ² test: –No more than 20% of categories have Expected < 5 –No category with Expected ≤ 1

45 χ² test as approximation of binomial test: if the number of data points is large, then a χ² goodness-of-fit test can be used in place of a binomial test. See text for an example.

46 The Poisson distribution Another discrete probability distribution Describes the number of successes in blocks of time or space, when successes happen independently of each other and occur with equal probability at every point in time or space

47

48 Poisson distribution

49 Example: Number of goals per side in World Cup Soccer Q: Is the outcome of a soccer game (at this level) random? In other words, is the number of goals per team distributed as expected by pure chance?

50 World Cup 2002 scores

51 Number of goals for a team (World Cup 2002)

52 What’s the mean, μ?

53 Poisson with μ = 1.26
X     Pr[X]
0     0.284
1     0.357
2     0.225
3     0.095
4     0.030
5     0.008
6     0.002
7     0
≥8    0
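The probabilities in this table follow from the Poisson formula Pr[X] = e^(−μ) μ^X / X!. A minimal sketch that reproduces them for μ = 1.26 (the helper name is hypothetical):

```python
import math

def poisson_pmf(x, mu):
    """Pr[X = x] for a Poisson distribution with mean mu."""
    return math.exp(-mu) * mu**x / math.factorial(x)

mu = 1.26
probs = [poisson_pmf(x, mu) for x in range(8)]
for x, p in enumerate(probs):
    print(x, f"{p:.3f}")
# Everything not covered by 0..7 falls in the ">= 8" class.
print(">=8", f"{1 - sum(probs):.3f}")
```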

54 Finding the Expected
X     Pr[X]   Expected
0     0.284   36.3
1     0.357   45.7
2     0.225   28.8
3     0.095   12.1
4     0.030   3.8
5     0.008   1.0
6     0.002   0.2
7     0       0.04
≥8    0       0.007
Expected counts for X = 4 and above: too small!
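Expected counts are the Poisson probabilities multiplied by the number of observations, and classes with small expected counts are pooled. This is a sketch with two assumptions: n = 128 is inferred from the observed counts in the χ² table (it is not stated on this slide), and all classes with 4 or more goals are pooled into one.

```python
import math

def poisson_pmf(x, mu):
    """Pr[X = x] for a Poisson distribution with mean mu."""
    return math.exp(-mu) * mu**x / math.factorial(x)

mu = 1.26
n_teams = 128  # assumed: inferred from the observed counts, not stated here

# Classes 0, 1, 2, 3 keep their own probability; ">=4" takes the remainder.
labels = ["0", "1", "2", "3", ">=4"]
probs = [poisson_pmf(x, mu) for x in range(4)]
probs.append(1 - sum(probs))
expected = [n_teams * p for p in probs]

# Check the chi-squared assumptions on the pooled classes:
# at most 20% of classes with Expected < 5, and none with Expected <= 1.
n_small = sum(e < 5 for e in expected)
assumptions_ok = n_small <= 0.2 * len(expected) and min(expected) > 1

for label, e in zip(labels, expected):
    print(label, f"{e:.1f}")
print("assumptions met:", assumptions_ok)
```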

55 Calculating χ²
X     Expected   Observed   (Observed − Expected)² / Expected
0     36.3       37         0.013
1     45.7       47         0.037
2     28.8       27         0.113
3     12.1       13         0.067
≥4    5.0        4          0.200
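Summing the last column gives the test statistic. A short sketch using the expected and observed counts from this slide, with the last class read as "4 or more goals":

```python
labels = ["0", "1", "2", "3", ">=4"]
expected = [36.3, 45.7, 28.8, 12.1, 5.0]
observed = [37, 47, 27, 13, 4]

# Contribution of each class: (O - E)^2 / E.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi2 = sum(contributions)

for label, c in zip(labels, contributions):
    print(label, f"{c:.3f}")
print(f"chi-squared = {chi2:.2f}")  # 0.43
```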

56 Degrees of freedom for Poisson: df = Number of categories - 1 - (Number of parameters estimated from the data)

57 Degrees of freedom for Poisson: df = Number of categories - 1 - (Number of parameters estimated from the data). Estimated one parameter, μ

58 Degrees of freedom for Poisson: df = Number of categories - 1 - (Number of parameters estimated from the data) = 5 - 1 - 1 = 3

59 Critical value

60 Comparing χ² to the critical value: so we cannot reject the null hypothesis. There is no evidence that the score of a World Cup Soccer game is not Poisson distributed.
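The numbers behind this comparison are not in the transcript, so the sketch below rests on assumptions. It takes χ² ≈ 0.43 (the sum of the contributions in the calculation table) and the standard tabulated 5% critical value for df = 3, which is 7.81. The P-value uses a closed form that is specific to 3 degrees of freedom.

```python
import math

def chi2_sf_df3(x):
    """P(chi-squared >= x) for df = 3 only (closed form specific to 3 df)."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

chi2 = 0.43        # assumed: sum of the contributions in the calculation table
critical = 7.81    # standard 5% critical value for df = 3 (not in the transcript)

reject_h0 = chi2 > critical
p_value = chi2_sf_df3(chi2)

print("reject H0:", reject_h0)
print(f"P = {p_value:.2f}")
```

A χ² this far below the critical value means the observed goal counts are close to the Poisson expectation, which matches the slide's conclusion.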

