Published by Christian Norman. Modified over 9 years ago.
1
Data Analysis and Surveying 101: Basic research methods and biostatistics as they apply to the
  Theresa Jackson Hughes, MPH
  American College Health Association
  December 2006
2
What we will cover today
  Research Methods
    Sampling Frame and Sampling
    Generalizability
    Bias
    Reliability and Validity
    Levels of measurement
  Biostatistics
    Statistical significance
    Other key terms
    Appropriate statistical tests
  Fun examples from the Spring 2005 dataset!
  Get excited! It’s data time!!!
3
Research Methods
4
“To do successful research, you don't need to know everything, you just need to know of one thing that isn't known.” Arthur Schawlow
“That's the nature of research - you don't know what in hell you're doing.” Harold "Doc" Edgerton
“If we knew what it was we were doing, it would not be called research, would it?” Albert Einstein
5
What exactly is research?
  “Scientific research is systematic, controlled, empirical, and critical investigation of natural phenomena guided by theory and hypotheses about the presumed relations among such phenomena.” (Kerlinger, 1986)
  Research is an organized and systematic way of finding answers to questions.
6
Important Components of Empirical Research
  Problem statement, research questions, purposes, benefits
  Theory, assumptions, background literature
  Variables and hypotheses
  Operational definitions and measurement
  Research design and methodology
  Instrumentation, sampling
  Data analysis
  Conclusions, interpretations, recommendations
7
Sampling
  What is your population of interest? To whom do you want to generalize your results?
    All students (18 and over)
    Undergraduates only
    Greeks
    Athletes
    Other
  Can you sample the entire population?
8
Sampling
  A sample is “a smaller (but hopefully representative) collection of units from a population used to determine truths about that population” (Field, 2005)
  Why sample?
    Resources (time, money) and workload
    Gives results with known accuracy that can be calculated mathematically
  The sampling frame is the list from which the potential respondents are drawn
    Registrar’s office
    Class rosters
  Must assess sampling frame errors
10
Types of Samples
  Probability (Random) Samples
    Simple random sample
    Systematic random sample
    Stratified random sample
      Proportionate
      Disproportionate
    Cluster sample
  Non-Probability Samples
    Convenience sample
    Purposive sample
    Quota sample
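The three most common probability designs named on this slide can be sketched in a few lines. This is only an illustration: the function names, the frame of 1,000 student IDs, and the undergraduate/graduate split are hypothetical and do not come from the presentation.

```python
import random

def simple_random_sample(frame, n, seed=0):
    """Every unit in the frame has the same chance of selection."""
    return random.Random(seed).sample(frame, n)

def systematic_sample(frame, n, seed=0):
    """Pick a random start, then take every k-th unit from the frame."""
    k = len(frame) // n                      # sampling interval
    start = random.Random(seed).randrange(k)
    return [frame[start + i * k] for i in range(n)]

def proportionate_stratified_sample(strata, n, seed=0):
    """Sample each stratum in proportion to its share of the frame."""
    rng = random.Random(seed)
    total = sum(len(units) for units in strata.values())
    return {
        name: rng.sample(units, round(n * len(units) / total))
        for name, units in strata.items()
    }

# Hypothetical sampling frame: student IDs 1..1000 (80% undergraduate).
frame = list(range(1, 1001))
strata = {"undergraduate": frame[:800], "graduate": frame[800:]}

srs = simple_random_sample(frame, 50)
systematic = systematic_sample(frame, 50)
stratified = proportionate_stratified_sample(strata, 50)
```

A disproportionate stratified sample would replace the proportional share with a fixed count per stratum, which then calls for weighting at the analysis stage.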
11
Sample Size

  Size of Campus     Final Desired N
  <600               All students
  600-2,999          600
  3,000-9,999        700
  10,000-19,999      800
  20,000-29,999      900
  ≥30,000            1,000

  Depends on expected response rate
    Average 85% for paper: FINAL SAMPLE DESIRED / .85 = SAMPLE
    Average 25% for web: FINAL SAMPLE DESIRED / .25 = SAMPLE
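The arithmetic on this slide can be written as a small helper. The function names are hypothetical, the result is rounded up to a whole student, and every campus below 30,000 that is not in a smaller band is assumed to fall in the 900 band.

```python
import math

def desired_final_n(campus_size):
    """Final desired N from the slide's table; None means survey all students."""
    if campus_size < 600:
        return None
    if campus_size < 3000:
        return 600
    if campus_size < 10000:
        return 700
    if campus_size < 20000:
        return 800
    if campus_size < 30000:   # assumption: covers everything up to 29,999
        return 900
    return 1000

def sample_to_draw(final_n, response_rate):
    """Number of students to invite so that about final_n respond."""
    return math.ceil(final_n / response_rate)

paper_sample = sample_to_draw(desired_final_n(2500), 0.85)   # 600 / .85
web_sample = sample_to_draw(desired_final_n(2500), 0.25)     # 600 / .25
```

For a campus of 2,500 students this gives 706 invitations for a paper survey and 2,400 for a web survey.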
12
Bias and Error
13
Systematic Error or Bias
  Unknown or unacknowledged error created during the design, measurement, sampling, procedure, or choice of problem studied
  Error tends to go in one direction
  Examples: Selection, Recall, Social desirability
Random Error
  Unrelated to true measures
  Example: Momentary fatigue
14
Reliability and Validity
  Reliability
    The extent to which a test is repeatable and yields consistent scores
    Affected by random error/bias
  Validity
    The extent to which a test measures what it is supposed to measure
    A subjective judgment made on the basis of experience and empirical indicators
    Asks “Is the test measuring what you think it’s measuring?”
    Affected by systematic error/bias
15
Reliability vs. Validity In order to be valid, a test must be reliable; but reliability does not guarantee validity.
16
Levels of Measurement
17
Nominal
  Gender: Male, Female
  Vaccinations: Yes, No, Unsure
Ordinal
  Personal health status: Excellent, Very good, Good, Fair, Poor
  Last 30 days: Never used, Not in last 30 days, 1-2 days, 3-5 days, 6-9 days, 10-19 days, 20-29 days, All 30 days
Interval
  Body Mass Index (BMI)
Ratio
  Number of drinks
  Number of sexual partners
  Perception percentages
  Blood alcohol concentration (BAC)
18
Biostatistics
19
“It is commonly believed that anyone who tabulates numbers is a statistician. This is like believing that anyone who owns a scalpel is a surgeon.” R. Hooke
“Torture numbers, and they'll confess to anything.” Gregg Easterbrook
“98% of all statistics are made up.” Author Unknown
20
Types of Statistics
  Descriptive statistics
    Describe the basic features of data in a study
    Provide summaries about the sample and measures
  Inferential statistics
    Investigate questions, models, and hypotheses
    Infer population characteristics based on sample
    Make judgments about what we observe
21
Descriptive Statistics
  Central Tendency
    Mode
    Median
    Mean
  Variation
    Range
    Variance
    Standard Deviation
  Frequency
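Each of the measures listed on this slide is available in Python's standard library. The ten "number of drinks" responses below are made up for illustration and are not taken from the Spring 2005 dataset.

```python
import statistics
from collections import Counter

drinks = [0, 2, 3, 3, 4, 5, 5, 5, 7, 10]   # hypothetical responses

# Central tendency
mode = statistics.mode(drinks)
median = statistics.median(drinks)
mean = statistics.mean(drinks)

# Variation
value_range = max(drinks) - min(drinks)
variance = statistics.variance(drinks)      # sample variance (divides by n - 1)
std_dev = statistics.stdev(drinks)          # square root of the sample variance

# Frequency: how many respondents gave each answer
frequency = Counter(drinks)
```

Here the mode is 5, the median is 4.5, and the mean is 4.4. Mode and frequency suit categorical variables, while the mean and standard deviation assume interval or ratio data.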
22
Descriptive Statistics Examples Categorical Variables (Nominal/Ordinal)
23
Descriptive Statistics Examples Categorical Variables (Nominal/Ordinal)
24
Descriptive Statistics Examples Continuous Variables (Interval/Ratio)
25
Hypotheses
  Null hypotheses
    Presumed true until statistical evidence in the form of a hypothesis test indicates otherwise
    There is no effect/relationship
    There is no difference in means
  Alternative hypotheses
    Tested using inferential statistics
    There is an effect/relationship
    There is a difference in means
26
Alpha, Beta, Power, Effect Size
  Alpha – probability of making a Type I error
    Reject null when null is true
    Level of significance (the threshold the p value is compared against)
  Beta – probability of making a Type II error
    Fail to reject null when null is false
  Power – probability of correctly rejecting null
    1 – Beta
  Effect Size
    Measure of the strength of the relationship between two variables

                        Null is true                        Null is false
  Reject null           Alpha (Type I error)                1 – Beta (Power, CORRECT REJECTION)
  Fail to reject null   1 – Alpha (CORRECT NON-REJECTION)   Beta (Type II error)
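How alpha, beta, power, and effect size fit together can be seen in the simplest textbook case, a two-sided one-sample z-test with a known standard deviation. The slide does not name a test, so that choice, the function name, and the numbers are all assumptions made for illustration.

```python
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a two-sided one-sample z-test (sigma assumed known).

    effect is the true difference between the population mean and the
    null value; power is the chance of rejecting the null given that effect.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)        # rejection threshold set by alpha
    shift = effect * n ** 0.5 / sigma        # effect measured in standard errors
    return z.cdf(-z_crit + shift) + z.cdf(-z_crit - shift)

power = z_test_power(effect=0.5, sigma=1.0, n=32)
beta = 1 - power                             # probability of a Type II error

# With no true effect, the rejection rate is just alpha (the Type I error rate).
false_alarm_rate = z_test_power(effect=0.0, sigma=1.0, n=32)
```

Raising the sample size or the effect size increases power, while lowering alpha reduces it.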
27
Let’s test some hypotheses!!!
28
Test of the mean of one continuous variable
  College students report drinking an average of 5 drinks the last time they “partied”/socialized
  Hypotheses
    H0: µ = 5
    HA: µ ≠ 5
  Test: Two-tailed t-test
  Result: Reject null
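A minimal sketch of the t statistic behind this test, using only the standard library. The ten responses are invented for illustration, so the outcome here says nothing about the survey result reported on the slide. The critical value is read from a standard t table.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return the t statistic and degrees of freedom for H0: mean == mu0."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    t = (statistics.mean(sample) - mu0) / standard_error
    return t, n - 1

drinks = [3, 4, 4, 5, 6, 2, 3, 4, 5, 4]   # hypothetical responses, not survey data
t_stat, df = one_sample_t(drinks, mu0=5)

# Two-tailed critical value for alpha = .05 and df = 9, from a standard t table.
T_CRITICAL = 2.262
reject_null = abs(t_stat) > T_CRITICAL
```

In practice a statistics package reports the exact p value instead of a table lookup; `scipy.stats.ttest_1samp` is one option if SciPy is available.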
29
Test of a single proportion of one categorical variable
  20% of college students report their health is excellent
  Hypotheses
    H0: p = 0.20
    HA: p ≠ 0.20 (two-tailed)
  Test: Z-test for a single proportion
  Result: Reject null
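The z-test for a single proportion can be sketched as follows. It expresses the 20% claim as the proportion 0.20 and uses a two-sided alternative to match the "≠" hypothesis. The counts (260 of 1,000 students) are hypothetical.

```python
import math
from statistics import NormalDist

def one_proportion_z(successes, n, p0):
    """Two-sided z-test of H0: p == p0 using the normal approximation."""
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)   # computed under the null
    z = (p_hat - p0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical data: 260 of 1,000 students rate their health as excellent.
z_stat, p_value = one_proportion_z(successes=260, n=1000, p0=0.20)
reject_null = p_value < 0.05
```

The normal approximation is usually considered reasonable when both n·p0 and n·(1 − p0) are comfortably large, which holds for samples of this size.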
30
Test of a relationship between two continuous variables
  There is a relationship between the number of drinks students report drinking the last time they drank and the number of sex partners they have had within the last school year
  Hypotheses
    H0: ρ = 0
    HA: ρ ≠ 0
  Test: Pearson Product Moment Correlation
  Result: Reject null
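The Pearson coefficient and the t statistic used to test ρ = 0 can be computed by hand as below. The six paired observations are invented for illustration and the function names are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def r_to_t(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

drinks = [1, 2, 3, 4, 5, 6]       # hypothetical: drinks last time they drank
partners = [0, 1, 1, 2, 1, 3]     # hypothetical: partners in the last school year

r = pearson_r(drinks, partners)
t_stat = r_to_t(r, len(drinks))
```

The coefficient r runs from -1 to 1 and measures only linear association, so a scatterplot is worth checking before relying on it.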
31
Test of the difference between two means
  Men and women report significantly different numbers of sexual partners over the past 12 months
  Hypotheses
    H0: µ1 = µ2
    HA: µ1 ≠ µ2
  Test: Independent Samples t-test OR One-way ANOVA
  Result: Reject null
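A sketch of the independent samples t statistic, assuming equal variances in the two groups (the pooled form). The slide does not say which form was used, so that is an assumption, and the two small samples are made up.

```python
import math
import statistics

def pooled_t(group_a, group_b):
    """Equal-variance independent samples t statistic and its df."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, df

men = [1, 2, 2, 3, 4, 0]      # hypothetical partner counts
women = [1, 1, 0, 2, 1, 1]    # hypothetical partner counts

t_stat, df = pooled_t(men, women)
```

With exactly two groups, the one-way ANOVA F statistic equals the square of this pooled t, which is why the slide can list either test.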
32
Test of the difference between two or more means
  Mean BAC reported differs across student residences
  Hypotheses
    H0: µ1 = µ2 = µ3 = µ4 = µ5 = µ6
    HA: µi ≠ µj for at least one pair i, j
  Test: One-way ANOVA
  Result: Reject null
33
Test of the difference between two or more means
34
Test for a relationship between two categorical variables
  Is there an association between being a member of a fraternity/sorority and ever being diagnosed with depression?
  Hypotheses
    H0: There is no association between being a member of a fraternity/sorority and ever being diagnosed with depression.
    HA: There is an association between being a member of a fraternity/sorority and ever being diagnosed with depression.
  Test: Chi-square test for independence
  Result: Fail to reject null
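For a 2x2 table the chi-square statistic compares observed counts with the counts expected if the two variables were independent. The sketch below uses invented counts and no continuity correction. Because a chi-square variable with 1 degree of freedom is a squared standard normal, the p value can be taken from the normal distribution.

```python
import math
from statistics import NormalDist

def chi_square_2x2(table):
    """Chi-square test of independence for a 2x2 table (df = 1)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With df = 1, P(chi-square > x) equals P(|Z| > sqrt(x)).
    p_value = 2 * (1 - NormalDist().cdf(math.sqrt(chi2)))
    return chi2, p_value

# Hypothetical counts: rows are Greek members / non-members,
# columns are ever diagnosed with depression / never diagnosed.
table = [[12, 88], [110, 790]]
chi2, p_value = chi_square_2x2(table)
reject_null = p_value < 0.05
```

With these made-up counts the two groups have nearly identical diagnosis rates, so the test fails to reject the null.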
35
Test for relationship between two categorical variables
36
Important Points to Remember
  A significant association does not indicate causation
  Statistical significance is not always the same as practical significance
  Multiple factors contribute to whether your results are significant
  It gets easier and easier as you practice!
37
Questions???