Choosing the correct analysis
Some research questions
How many times each semester do Penn State students go “home”?
What percentage of Penn State students smoke cigars?
Does a higher percentage of Alaskans than non-Alaskans commit suicide?
How much heavier are male students than female students?
Choosing the correct analysis
Depends on the type of data
– continuous or categorical
Depends on the number of groups
– 1, 2, or more
Depends on the research question
– Testing hypotheses: is it this?
– Estimation: what is it?
Depends on the assumptions made
One Group, Categorical (Binary) Data
One- or two-sided hypothesis: Z-test for one proportion
Two-sided hypothesis: Chi-square test
Estimation: Z-interval for one proportion
In Minitab:
– Stat >> Basic Stat >> 1 proportion...
Examples: One Group, Binary Data
Estimation (Z-interval): What proportion of students have an E in their last name?
Hypothesis (Z-test): Do a majority of students work during the semester?
– H0: p = 0.5 versus HA: p > 0.5
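The one-proportion Z-test and Z-interval can be sketched in a few lines of Python. This is a minimal illustration using the normal approximation, not the Minitab procedure itself. The function names and the sample counts (60 of 100 students working) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0):
    """Upper-tailed Z-test of H0: p = p0 versus HA: p > p0."""
    p_hat = successes / n
    se0 = sqrt(p0 * (1 - p0) / n)      # standard error computed under H0
    z = (p_hat - p0) / se0
    p_value = 1 - NormalDist().cdf(z)  # area to the right of z
    return z, p_value


def one_proportion_z_interval(successes, n, confidence=0.95):
    """Z-interval for one proportion (normal approximation)."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical data: 60 of 100 sampled students work during the semester.
z, p_value = one_proportion_z_test(60, 100, 0.5)
low, high = one_proportion_z_interval(60, 100)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print(f"95% interval for p: ({low:.3f}, {high:.3f})")
```

The test uses the hypothesized proportion in its standard error, while the interval uses the sample proportion, which is why the two functions compute the standard error differently.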
Two Groups, Categorical (Binary) Data
One- or two-sided hypothesis: Z-test for two proportions
Two-sided hypothesis: Chi-square test
Estimation: Z-interval for two proportions
In Minitab:
– Stat >> Basic Stat >> 2 proportions…
– Stat >> Tables >> Chi-Square Test...
Examples: Two Groups, Binary Data
Do the proportions of male students and female students who snore differ?
– Two groups: Males, Females
– Binary data: Snore or not
– Determine the proportion of male snorers and the proportion of female snorers.
Hypothesis testing: Tells us whether the proportions are different.
Estimation: Tells us by how much the proportions differ.
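A rough sketch of the two-proportion Z-test and Z-interval follows. It assumes large samples and uses the pooled proportion for the test, which is one common convention. The function names and the snoring counts (40 of 100 males, 25 of 100 females) are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided Z-test of H0: p1 = p2 using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # both tails
    return z, p_value


def two_proportion_z_interval(x1, n1, x2, n2, confidence=0.95):
    """Z-interval for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se


# Hypothetical data: 40 of 100 males snore, 25 of 100 females snore.
z, p_value = two_proportion_z_test(40, 100, 25, 100)
low, high = two_proportion_z_interval(40, 100, 25, 100)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print(f"95% interval for p1 - p2: ({low:.3f}, {high:.3f})")
```

The test answers whether the proportions differ; the interval gives a range for how much they differ.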
More than Two Groups, Categorical Data
Two-sided hypothesis: Chi-square test for more than two proportions
In Minitab:
– Stat >> Tables >> Chi-Square Test...
Examples: > 2 Groups, Categorical Data
Is the rate of cigarette smoking independent of semester standing?
– Four groups: Freshmen, sophomores, juniors, seniors
– Categorical (binary) data: Smoker or not
– Determine the proportion of freshman smokers, sophomore smokers, junior smokers, senior smokers.
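The chi-square statistic for a question like this compares observed counts with the counts expected if smoking were independent of standing. The sketch below computes only the statistic and its degrees of freedom; the p-value would come from a chi-square table or from software. The function name and all counts are hypothetical.

```python
def chi_square_statistic(table):
    """Chi-square statistic and degrees of freedom for a two-way table.

    `table` is a list of rows of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns are independent
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical counts: rows are freshmen, sophomores, juniors, seniors;
# columns are smokers, non-smokers.
counts = [[10, 40], [12, 38], [15, 35], [20, 30]]
stat, df = chi_square_statistic(counts)
print(f"chi-square = {stat:.2f} with {df} degrees of freedom")
```

With 3 degrees of freedom, the 5% critical value is about 7.815, so these made-up counts would not lead to rejecting independence.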
One Group, Continuous Data
Hypotheses:
– t-test for one mean
– sign test or signed rank test for one median
Estimation: t-interval for one mean
In Minitab:
– Stat >> Basic Stat >> 1-sample t…
– Stat >> Nonparametrics >> 1-sample sign…
– Stat >> Nonparametrics >> 1-sample Wilcoxon…
Examples: One Group, Continuous Data
Estimation (t-interval): What is the mean length of students’ middle fingers?
Hypothesis (t-test): Is mean IQ larger than 100?
– H0: μ = 100 versus HA: μ > 100
Hypothesis (sign test): Is median income greater than 40,000?
– H0: m = 40,000 versus HA: m > 40,000
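The two hypothesis tests in these examples can be sketched as follows. The t-test part computes only the t statistic and degrees of freedom, since the p-value needs a t table or software. The sign test part computes an exact binomial p-value. Function names and both data sets are hypothetical.

```python
from math import comb, sqrt
from statistics import mean, stdev


def one_sample_t_statistic(data, mu0):
    """t statistic and degrees of freedom for H0: mu = mu0."""
    n = len(data)
    t = (mean(data) - mu0) / (stdev(data) / sqrt(n))
    return t, n - 1


def sign_test_upper(data, m0):
    """Exact p-value for H0: median = m0 versus HA: median > m0."""
    above = sum(1 for x in data if x > m0)
    below = sum(1 for x in data if x < m0)  # values equal to m0 are dropped
    n = above + below
    # P(X >= above) when X ~ Binomial(n, 0.5)
    return sum(comb(n, k) for k in range(above, n + 1)) / 2 ** n


# Hypothetical IQ scores for the t-test example.
iq = [98, 105, 110, 102, 115, 108, 95, 112, 104, 111]
t, df = one_sample_t_statistic(iq, 100)
print(f"t = {t:.2f} with {df} degrees of freedom")

# Hypothetical incomes for the sign test example.
incomes = [42000, 38000, 45000, 51000, 39500,
           47000, 44000, 40000, 43000, 46000]
p_value = sign_test_upper(incomes, 40000)
print(f"sign test p-value = {p_value:.4f}")
```

The sign test only counts how many observations fall above or below the hypothesized median, so it does not rely on the data being roughly normal.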
Two Paired Groups, Continuous Data
Hypotheses:
– Paired t-test for mean difference
– Sign test or signed rank test for median difference
Estimation: Paired t-interval for mean difference
In Minitab:
– Stat >> Basic Stat >> Paired t…
– Stat >> Nonparametrics >> 1-sample sign…
– Stat >> Nonparametrics >> 1-sample Wilcoxon…
Examples: Two Paired Groups, Continuous Data
Do people’s pulse rates increase after exercise?
– Two paired groups: People before, same people after
– Continuous data: Pulse rates
– Determine average difference in pulse rates.
Hypothesis testing: Tells us whether the mean difference is 0.
Estimation: Tells us by how much the mean difference differs from 0.
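A paired analysis reduces to a one-sample analysis of the within-person differences. The sketch below computes the mean difference and the paired t statistic; the p-value or interval would then use a t distribution with n - 1 degrees of freedom. The function name and the pulse readings are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(before, after):
    """Mean difference, t statistic, and df for H0: mean difference = 0."""
    # One difference per person: after minus before
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    d_bar = mean(diffs)
    se = stdev(diffs) / sqrt(n)  # standard error of the mean difference
    return d_bar, d_bar / se, n - 1


# Hypothetical pulse rates (beats per minute) for five people.
before = [70, 65, 80, 72, 68]
after = [85, 78, 90, 88, 80]
d_bar, t, df = paired_t_statistic(before, after)
print(f"mean difference = {d_bar:.1f}, t = {t:.2f}, df = {df}")
```

Working with differences removes person-to-person variation, which is the reason for treating the groups as paired rather than independent.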
Two Independent Groups, Continuous Data
Hypotheses: Two-sample t-test for difference in means.
Estimation: Two-sample t-interval for difference in means.
In Minitab:
– Stat >> Basic Stat >> 2-sample t...
Examples: Two Independent Groups, Continuous Data
Do male and female pulse rates differ?
– Two independent groups: Males, Females
– Continuous data: Pulse rates
– Determine difference in average pulse rates.
Hypothesis testing: Tells us whether the difference in means is 0.
Estimation: Tells us by how much the means differ.
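One way to sketch the two-sample t statistic is the unpooled (Welch) form, which does not assume equal variances; a pooled version is another option and is not shown here. The function name and the pulse readings are hypothetical, and the p-value would come from a t table or software.

```python
from math import sqrt
from statistics import mean, variance


def welch_t_statistic(x, y):
    """Unpooled two-sample t statistic and approximate degrees of freedom."""
    vx = variance(x) / len(x)  # squared standard error of mean(x)
    vy = variance(y) / len(y)  # squared standard error of mean(y)
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


# Hypothetical pulse rates (beats per minute).
males = [68, 72, 70, 74, 66]
females = [75, 78, 72, 80, 70]
t, df = welch_t_statistic(males, females)
print(f"t = {t:.2f} with about {df:.1f} degrees of freedom")
```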
More than Two Independent Groups, Continuous Data
Hypotheses: Analysis of variance
In Minitab:
– Stat >> ANOVA >> One-way...
– Stat >> ANOVA >> One-way (unstacked)...
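One-way analysis of variance compares the variation between group means with the variation within groups. The sketch below computes the F statistic and its two degrees of freedom; the p-value would come from an F table or software. The function name and the three small groups are hypothetical.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic with numerator and denominator degrees of freedom."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    f = (ss_between / (k - 1)) / (ss_within / (n - k))
    return f, k - 1, n - k


# Hypothetical measurements for three independent groups.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f, df_between, df_within = one_way_anova_f(groups)
print(f"F = {f:.2f} on {df_between} and {df_within} degrees of freedom")
```

A large F suggests that at least one group mean differs from the others; it does not say which one.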
One Group, Two Continuous Variables
Correlation: Does a linear relationship exist?
Linear regression: What is the linear relationship?
Example: One Group, Two Measurement Variables
Correlation: Does a relationship exist between number of nights out and GPA?
Linear regression: If someone goes out 10 times each month, what kind of a GPA can they expect?
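Both questions can be illustrated with the usual least-squares formulas. This is a minimal sketch: the function name and the five (nights out, GPA) pairs are invented for illustration, and the fitted line only describes these made-up numbers.

```python
from math import sqrt
from statistics import mean


def correlation_and_line(x, y):
    """Pearson correlation plus least-squares slope and intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    r = sxy / sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return r, slope, intercept


# Hypothetical data: nights out per month and GPA for five students.
nights_out = [2, 4, 6, 8, 12]
gpa = [3.8, 3.5, 3.3, 3.0, 2.6]
r, slope, intercept = correlation_and_line(nights_out, gpa)

# Predicted GPA for someone who goes out 10 times each month
predicted = intercept + slope * 10
print(f"r = {r:.3f}")
print(f"predicted GPA at 10 nights out: {predicted:.2f}")
```

The correlation addresses whether a linear relationship exists; the fitted line describes what that relationship is and supports the prediction.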
Choosing the correct analysis
First ask: how many groups?
Then: what type of data? Summarized by a proportion (percentage) or average (mean) or median?
Then: hypothesis testing (“is there a difference”) or estimation (“how much”)?
Then: don’t forget assumptions…
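The decision steps can be written as a small lookup. This is only a sketch of the cases listed on the earlier slides: the function name and argument values are hypothetical, the two-variable case (correlation and regression) is left out, and checking assumptions is still up to the analyst.

```python
def choose_analysis(groups, data_type, paired=False):
    """Return the analyses the slides list for a given study design.

    groups: number of groups (1, 2, or more)
    data_type: "categorical" or "continuous"
    paired: only matters for two groups of continuous data
    """
    if data_type == "categorical":
        if groups == 1:
            return ["Z-test for one proportion", "Chi-square test",
                    "Z-interval for one proportion"]
        if groups == 2:
            return ["Z-test for two proportions", "Chi-square test",
                    "Z-interval for two proportions"]
        return ["Chi-square test"]
    if data_type == "continuous":
        if groups == 1:
            return ["t-test for one mean", "sign test or signed rank test",
                    "t-interval for one mean"]
        if groups == 2 and paired:
            return ["paired t-test", "sign test or signed rank test",
                    "paired t-interval"]
        if groups == 2:
            return ["two-sample t-test", "two-sample t-interval"]
        return ["analysis of variance"]
    raise ValueError(f"unknown data type: {data_type!r}")


# Pulse rates before and after exercise: two paired groups, continuous data.
print(choose_analysis(2, "continuous", paired=True))
```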
Some more research questions
Do seniors earn higher semester grade point averages than freshmen?
What is the relationship between amount of alcohol consumed (in ounces) and level of coordination (on a scale from 1 to 10)?
Are number of skipped classes and student’s course grade linearly related?
Some more research questions
Is there a difference in the percentage of NCAA football players who graduate and NCAA basketball players who graduate?
How many hours per week do PSU students study outside of class?
How much more prevalent is lupus in women than in men?
Some more research questions
Do PSU students drink, on average, more than 1 cup of coffee per day during finals week? (During finals week, a sample of students will record how many cups of coffee they drink each day.)
Is recovery time from migraine related to treatment (A, B, C)?
Some more research questions
Is there a relationship between political affiliation and income?
How much heavier (in pounds) are 15-year-old boys than 13-year-old boys?
A random sample of 64 students were asked: “Do you study regularly at Pattee?”
One last research question
Do Goodyear tires have better tread wear than Firestone tires?
Tread wear is measured in millimeters of tread remaining after 30,000 miles.
Thirty cars are selected for the experiment.
On each car, one Goodyear tire and one Firestone tire are randomly assigned to the two front positions.