Refresher on tests that we know Several examples Fatal errors


1 Which statistical test is best for my data? or: What do you graph, dear? What do you test, dear?
Refresher on tests that we know
Several examples
Fatal errors
Data for you to analyze

2 Which tests and graphs fit which situations?
Independent variable (X)   Dependent variable (Y)   Graph                          Test
Continuous                 Continuous               Scatterplot                    Linear regression
Class                      Continuous               Bar graph                      t-test, ANOVA
Class                      Class                    Bar graph, contingency table   Chi-square
Continuous                 Class                    Scatterplot, bar chart         (Logistic regression)
David Strayer, Cary Institute
All of these tests assume that the data are independent and normally distributed (more on this later!)

3 Claim 1: Money can’t buy you love, but it can buy you a good ball team
Specifically, the claim is that baseball teams with bigger salaries win more games than those with smaller salaries.
Data are average (mean) salaries and winning percentages for the 2012 baseball season.

4 The data
TEAM                    AVG SALARY    Winning percentage
Arizona Diamondbacks    $ 2,653,029   0.5
Atlanta Braves          $ 2,776,998   0.58
Baltimore Orioles       $ 2,807,896   0.574
Boston Red Sox          $ 5,093,724   0.426
Chicago Cubs            $ 3,392,193   0.377
Chicago White Sox       $ 3,876,780   0.525
Cincinnati Reds         $ 2,935,843   0.599
Cleveland Indians       $ 2,704,493   0.42
Colorado Rockies        $ 2,692,054   0.395
Detroit Tigers          $ 4,562,068   0.543
Houston Astros          $ 2,332,730   0.34
Kansas City Royals      $ 2,030,540   0.444
Los Angeles Angels      $ 5,327,074   0.549
Los Angeles Dodgers     $ 3,171,452   0.531
Miami Marlins           $ 4,373,259
Milwaukee Brewers       $ 3,755,920   0.512
Minnesota Twins         $ 3,484,629   0.407
New York Mets           $ 3,457,554   0.457
New York Yankees        $ 6,186,321   0.586
Oakland Athletics       $ 1,845,750
Philadelphia Phillies   $ 5,817,964
Pittsburgh Pirates      $ 2,187,310   0.488
San Diego Padres        $ 1,973,025   0.469
San Francisco Giants    $ 3,920,689
Seattle Mariners        $ 2,927,789   0.463
St. Louis Cardinals     $ 3,939,316
Tampa Bay Rays          $ 2,291,910   0.556
Texas Rangers           $ 4,635,037
Toronto Blue Jays       $ 2,696,042   0.451
Washington Nationals    $ 2,623,746   0.605

5 How is this claim best evaluated? -graph and statistical analysis

6 How is this claim best evaluated? -graph and statistical analysis
Scatter plot

7 How is this claim best evaluated? -graph and statistical analysis
Scatter plot, Linear regression
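
A linear regression of winning percentage on salary can be sketched in a few lines of plain Python. This is only an illustration, not the analysis behind the slide: it uses the 24 teams whose winning percentage survives in this transcript (six values are missing), salaries are expressed in millions of dollars, and the function name `least_squares` is made up for the example.

```python
# (mean salary in millions of dollars, winning percentage), 2012 season.
# Assumption: only the 24 teams with both values present in the transcript.
teams = [
    (2.653029, 0.500), (2.776998, 0.580), (2.807896, 0.574),
    (5.093724, 0.426), (3.392193, 0.377), (3.876780, 0.525),
    (2.935843, 0.599), (2.704493, 0.420), (2.692054, 0.395),
    (4.562068, 0.543), (2.332730, 0.340), (2.030540, 0.444),
    (5.327074, 0.549), (3.171452, 0.531), (3.755920, 0.512),
    (3.484629, 0.407), (3.457554, 0.457), (6.186321, 0.586),
    (2.187310, 0.488), (1.973025, 0.469), (2.927789, 0.463),
    (2.291910, 0.556), (2.696042, 0.451), (2.623746, 0.605),
]


def least_squares(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


salaries = [s for s, _ in teams]
win_pct = [w for _, w in teams]
slope, intercept, r_squared = least_squares(salaries, win_pct)

print(f"slope per $1M of mean salary: {slope:.4f}")
print(f"intercept: {intercept:.3f}")
print(f"r squared: {r_squared:.3f}")
```

A small r squared means salary explains little of the variation in winning percentage. A p-value for the slope would need a t distribution, which a statistics package supplies.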

8 Conclusion Money can’t buy you a winning ball team, either

9 Claim 2: Eels control crayfish populations
Specifically, the claim is that crayfish population densities are lower in streams where eels are present.
Background: dietary studies show that eels eat a lot of crayfish, and old Swedish stories suggest that eels eliminate crayfish.
Data are crayfish densities (counts along transects, by snorkelling) in local streams with and without eels.

10 The data
River   Site   Crayfish (no./m^2)   Eels
Croton Green Chimneys 3.225 PEP 0.119 Delaware Buckingham 0.25 1 Callicoon Hankins 0.109 Mongaup Pond Eddy 0.067 Neversink Bridgeville 0.233 TNC Shawangunk Mount Hope 4.53 Ulsterville 1.1 Webatuck Levin 0.812 Shope 1.719 Still Point 1.4

11 How is this claim best evaluated? -graph and statistical analysis

12 How is this claim best evaluated? -graph and statistical analysis
Bar graph

13 How is this claim best evaluated? -graph and statistical analysis
Bar graph, t-test p = 0.02

14 Conclusion Looks like streams containing eels have fewer crayfish

15 Claim 3: Human life expectancy varies among continents
Data are mean life expectancy for women in different countries

16 The data
Africa                 Asia                 Americas           Europe
Algeria         75     Bangladesh    70.2   Argentina   79.9   Austria      83.6
Cameroon        53.6   China         75.6   Brazil      77.4   Belgium      82.8
Cote d'Ivoire   57.7   India         67.6   Canada      85.3   Bulgaria     77.1
Egypt           75.5   Indonesia     71.8   Chile       82.4   Czech Rep.   81
Kenya           59.2   Iran          75.3   Colombia    77.7   Denmark      87.4
Morocco         74.9   Japan         87.1   Mexico      79.6   Estonia      80
Nigeria         53.4   Malaysia      76.9   Peru               Finland      83.3
South Africa    54.1   Pakistan      66.9   USA         81.3   France       84.9
Zimbabwe        52.7   Philippines   72.6   Venezuela          Germany      83
                       Singapore     83.7                      Greece       82.6

17 How is this claim best evaluated? -graph and statistical analysis

18 How is this claim best evaluated? -graph and statistical analysis
Bar graph. Note that the y-axis doesn’t start at 0.

19 How is this claim best evaluated? -graph and statistical analysis
Bar graph, 1-way ANOVA, p =

20 Anova: Single Factor
SUMMARY
Groups     Count   Sum     Average   Variance
Africa     9       556.1
Asia       10      747.7   74.77
Americas           718.2   79.8      7.7875
Europe             825.7   82.57

ANOVA
Source of Variation   SS       df   MS   F   P-value    F crit
Between Groups        2351.6   3             1.42E-07
Within Groups                  34
Total                          37
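
The between-groups sum of squares reported above (2351.6) can be reproduced from the slide's own numbers, and an F ratio can be sketched alongside it. This is a rough check rather than the original spreadsheet calculation, and it rests on three assumptions: the Americas count of 9 is inferred from sum / average (718.2 / 79.8), the reported Americas variance (7.7875) is taken to be a sample variance with an n - 1 denominator, and raw values are used for the other three continents because their rows are complete. The helper name `summarize` is made up for the example.

```python
# Women's life expectancy (years), transcribed from the data slide.
africa = [75, 53.6, 57.7, 75.5, 59.2, 74.9, 53.4, 54.1, 52.7]
asia = [70.2, 75.6, 67.6, 71.8, 75.3, 87.1, 76.9, 66.9, 72.6, 83.7]
europe = [83.6, 82.8, 77.1, 81, 87.4, 80, 83.3, 84.9, 83, 82.6]


def summarize(values):
    """Return (count, sum, within-group sum of squares) for one group."""
    n = len(values)
    total = sum(values)
    mean = total / n
    ss = sum((v - mean) ** 2 for v in values)
    return n, total, ss


# Americas: two raw values are missing, so use the reported summary instead.
# Assumptions: count 9 inferred from sum / average; variance uses n - 1.
americas_summary = (9, 718.2, (9 - 1) * 7.7875)

groups = [summarize(africa), summarize(asia), americas_summary, summarize(europe)]

n_total = sum(n for n, _, _ in groups)
grand_mean = sum(total for _, total, _ in groups) / n_total

# Between groups: group size times squared distance of group mean from grand mean.
ss_between = sum(n * (total / n - grand_mean) ** 2 for n, total, _ in groups)
ss_within = sum(ss for _, _, ss in groups)

df_between = len(groups) - 1
df_within = n_total - len(groups)
f_stat = (ss_between / df_between) / (ss_within / df_within)

print(f"SS between = {ss_between:.1f} (df = {df_between})")
print(f"SS within  = {ss_within:.1f} (df = {df_within})")
print(f"F = {f_stat:.2f}")
```

The degrees of freedom come out as 3 and 34, matching the table. Turning F into a p-value requires the F distribution from a statistics package.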

21 Conclusion Life expectancy of women appears to differ among continents
(The ANOVA doesn’t tell us which continents are different; further tests would be necessary to test claims about specific continents)

22 Claim 4: predators with experience eat more invasive prey
The specific claim is that sunfish from bodies of water that were invaded a long time ago will eat more zebra mussels than sunfish from recently invaded waters or waters without zebra mussels.
Data are from an aquarium experiment using sunfishes from rivers invaded 20 years ago, a lake that was invaded 9 years ago, and streams without zebra mussels.
Each aquarium contained 15 zebra mussels; the number of mussels eaten in 3 days was recorded.

23 The data
old invasions   recent invasions   uninvaded
15 5 2 14 1 12 8 3 11 6

24 How is this claim best evaluated? -graph and statistical analysis

25 How is this claim best evaluated? -graph and statistical analysis
Bar graph

26 How is this claim best evaluated? -graph and statistical analysis
Bar graph, p =

27 Anova: Single Factor
SUMMARY
Groups             Count   Sum   Average   Variance
old invasions      12      160
recent invasions   6       64
uninvaded          9       13

ANOVA
Source of Variation   SS   df   MS   F   P-value    F crit
Between Groups             2             8.67E-08
Within Groups              24
Total                      26

28 Conclusion Fish living in places that have had zebra mussels for a long time eat more zebra mussels

29 Claim 5: Zebra mussels reduce phytoplankton biomass in the Hudson
Data are growing-season (May-Sept) means for zebra mussel population filtration rate and phytoplankton biomass in the freshwater tidal Hudson River

30 The data
year   ZMFR   chl a
1987   0.00   17.45
1988          28.95
1989          17.25
1990          17.52
1991   0.03   25.48
1992   0.36   12.18
1993   7.10   5.04
1994   3.96   4.91
1995   4.44   5.34
1996   2.60   3.74
1997   6.50   6.89
1998   5.06   7.50
1999   2.79   6.56
2000   3.59   4.40
2001   3.02   11.47
2002   5.70   5.44
2003   2.34   4.81
2004   3.16   4.64
2005   1.51   8.84
2006   0.06   5.90
2007   4.09   4.10
2008   0.26   4.96
2009   4.08   6.71

31 How is this claim best evaluated? -graph and statistical analysis

32 How is this claim best evaluated? -graph and statistical analysis
scatterplot

33 How is this claim best evaluated? -graph and statistical analysis
Scatterplot, linear regression, … but clearly not linear

34 How is this claim best evaluated? -graph and statistical analysis
Non-linear regression (available in many statistical packages).
It is not really fair to choose a non-linear model after looking at the data, so think about whether your claim suggests a linear model or a non-linear one before analyzing the data.
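
One simple non-linear model is an exponential decline, chl a = A * exp(b * ZMFR), which can be fitted as a straight line on the log scale. The sketch below is only an illustration under that assumed model form; the slides do not say which non-linear model was actually used. It includes only the 20 years for which a ZMFR value survives in this transcript (1987 and 1991 to 2009).

```python
import math

# (ZMFR, chl a) growing-season means for the freshwater tidal Hudson.
# Assumption: 1988-1990 are left out because their ZMFR values are missing.
data = [
    (0.00, 17.45), (0.03, 25.48), (0.36, 12.18), (7.10, 5.04),
    (3.96, 4.91), (4.44, 5.34), (2.60, 3.74), (6.50, 6.89),
    (5.06, 7.50), (2.79, 6.56), (3.59, 4.40), (3.02, 11.47),
    (5.70, 5.44), (2.34, 4.81), (3.16, 4.64), (1.51, 8.84),
    (0.06, 5.90), (4.09, 4.10), (0.26, 4.96), (4.08, 6.71),
]

xs = [zmfr for zmfr, _ in data]
log_ys = [math.log(chl) for _, chl in data]

# Least-squares line through (ZMFR, ln chl a): ln(chl a) = a + b * ZMFR.
n = len(xs)
mean_x = sum(xs) / n
mean_log_y = sum(log_ys) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
sxy = sum((x - mean_x) * (ly - mean_log_y) for x, ly in zip(xs, log_ys))
b = sxy / sxx
a = mean_log_y - b * mean_x


def predicted_chl(zmfr):
    """Back-transform the fitted line to the original chl a scale."""
    return math.exp(a + b * zmfr)


print(f"fitted model: chl a = {math.exp(a):.2f} * exp({b:.3f} * ZMFR)")
print(f"predicted chl a at ZMFR = 0: {predicted_chl(0.0):.2f}")
print(f"predicted chl a at ZMFR = 5: {predicted_chl(5.0):.2f}")
```

A negative b means phytoplankton biomass falls as filtration rate rises. Fitting on the log scale weights the errors differently from a direct non-linear fit, so a statistics package may give somewhat different coefficients.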

35 Conclusion Yes, it looks like zebra mussel feeding reduces phytoplankton population in the Hudson.
The relationship is nonlinear.

36 What to do if both variables are class variables?
Status                      birds + mammals   FW fish   FW shellfish   FW insects
Extinct (GX, GH)            1.65              2.13      6.56           1.34
Critically imperiled (G1)   3.71              11.39     20.85          2.2
Imperiled (G2)              4.89              10.89     15.6           9.09
Vulnerable (G3)             7.95              13.14     17.24          19.98
Secure (G4, G5)             81.57             62.33     39.73          67.4
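
When both variables are class variables, the usual tool is a chi-square test on a contingency table of counts. The values in the table above look like percentages (each column sums to roughly 100), so they would first have to be converted back to counts of species, which this transcript does not give. The sketch below therefore runs on a small, purely hypothetical 2 x 2 table of counts; the function name `chi_square` and the numbers are invented for illustration.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts.

    `table` is a list of rows, each a list of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column classes were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# HYPOTHETICAL counts, not data from the slides:
# rows = two status classes, columns = two taxonomic groups.
hypothetical_counts = [
    [10, 20],
    [30, 40],
]

chi2, dof = chi_square(hypothetical_counts)
print(f"chi-square = {chi2:.3f}, df = {dof}")
```

The statistic is then compared with a chi-square distribution with the given degrees of freedom, which a statistics package or a printed table provides.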

37 What to do if the predictor variable is continuous but the response variable is a class variable?
Baby mussels present Baby mussels absent

38 Common fatal errors: non-independence

39 Common fatal errors: undue influence of a single point

40 Claims for you to test
Large, mobile predators (i.e., crabs) reduce zebra mussel populations in the Hudson
Cell phone ownership increases with income among countries
Levels of dissolved oxygen affect behavior of baby mussels

