Download presentation
Presentation is loading. Please wait.
Published byAlicia Campbell Modified over 8 years ago
2
Statistics: Unlocking the Power of Data Lock 5 STAT 250 Dr. Kari Lock Morgan Synthesis and Review for Exam 1
3
Statistics: Unlocking the Power of Data Lock 5 DISCLAIMER This review session will NOT cover everything you need to know for the exam; it will just hit on the main points
4
Statistics: Unlocking the Power of Data Lock 5 The Big Picture Population Sample Sampling Statistical Inference Interval estimation Descriptive statistics
5
Statistics: Unlocking the Power of Data Lock 5 Sampling Population Sample GOAL: Select a sample that is similar to the population HOW? RANDOM SAMPLE!
6
Statistics: Unlocking the Power of Data Lock 5 Observational Studies A third variable that is associated with both the explanatory variable and the response variable is called a confounding variable There are almost always confounding variables in observational studies Observational studies can almost never be used to establish causation
7
Statistics: Unlocking the Power of Data Lock 5 Randomized Experiments In a randomized experiment the explanatory variable for each unit is determined randomly, before the response variable is measured Because the explanatory variable is randomly assigned, it is not associated with any other variables. Confounding variables are eliminated!!! Randomized experiments make it possible to infer causation!
8
Statistics: Unlocking the Power of Data Lock 5 Was the sample randomly selected? Possible to generalize to the population Yes Should not generalize to the population No Was the explanatory variable randomly assigned? Possible to make conclusions about causality Yes Can not make conclusions about causality No Chapter 1: Data Collection
9
Statistics: Unlocking the Power of Data Lock 5 Chapter 2: Descriptive Statistics In order to make sense of data, we need ways to summarize and visualize it Summarizing and visualizing variables and relationships between two variables is often known as descriptive statistics (also known as exploratory data analysis) Type of summary statistics and visualization methods depend on the type of variable(s) being analyzed (categorical or quantitative)
10
Statistics: Unlocking the Power of Data Lock 5 Variable(s)VisualizationSummary Statistics Categoricalbar chart, pie chart frequency table, relative frequency table, proportion Quantitativedotplot, histogram, boxplot mean, median, max, min, standard deviation, z-score, range, IQR, five number summary Categorical vs Categorical side-by-side bar chart, segmented bar chart two-way table, difference in proportions, odds ratio Quantitative vs Categorical side-by-side boxplotsstatistics by group, difference in means Quantitative vs Quantitative scatterplotcorrelation
11
Statistics: Unlocking the Power of Data Lock 5 Chapter 3: Confidence Intervals A confidence interval for a parameter is an interval computed from sample data by a method that will capture the parameter for a specified proportion of all samples A 95% confidence interval will contain the true parameter for 95% of all samples
12
Statistics: Unlocking the Power of Data Lock 5 Confidence Intervals The parameter is fixed The statistic is random (depends on the sample) The interval is random (depends on the statistic) 95% of 95% confidence intervals will capture the truth
13
Statistics: Unlocking the Power of Data Lock 5 Confidence Intervals Sample Bootstrap Sample... Calculate statistic for each bootstrap sample Bootstrap Distribution Standard Error (SE): standard deviation of bootstrap distribution Margin of Error (ME) (95% CI: ME = 2×SE) statistic ± ME Bootstrap Sample
14
Statistics: Unlocking the Power of Data Lock 5 Percentile Method
15
Statistics: Unlocking the Power of Data Lock 5 Question of the Day What happens when you switch to organic food?
16
Statistics: Unlocking the Power of Data Lock 5 The Organic Effect The Organic Effect: What Happens When you Switch to Organic FoodThe Organic Effect: What Happens When you Switch to Organic Food. 9/15/15.
17
Statistics: Unlocking the Power of Data Lock 5 The Organic Effect A study took a Swedish family who ate conventionally (non-organic) and had them eat only organic food for 2 weeks The resulting video publicizing the study has had over 30 million views between Youtube, Facebook, and other sites Here we look at data from the original study Magner, J., Wallberg, P., Sandberg, J., and Cousins, A.P. (2015). Human exposure to pesticides from food: A pilot study. IVL Swedish Environmental Research Institute. January 2015. Human exposure to pesticides from food: A pilot study
18
Statistics: Unlocking the Power of Data Lock 5 Data Measurements on pesticide levels taken the week before switching to organic (while on regular non-organic diet) Ate only organic for two weeks Measurements on pesticide levels taken again the second of those two weeks Explanatory variable: before or after switching to organic food Response variables: Pesticide detected or not Pesticide concentration (μg / g crt)
19
Statistics: Unlocking the Power of Data Lock 5 Generalize? Can we generalize to a population of all people? a) Yes b) No
20
Statistics: Unlocking the Power of Data Lock 5 Study Design Is this an experiment or an observational study? a) Experiment b) Observational study
21
Statistics: Unlocking the Power of Data Lock 5 Causality? Can we make conclusions about causality? a) Yes b) No
22
Statistics: Unlocking the Power of Data Lock 5 Variables We first want to analyze detection of pesticides (pesticide detected or not) before and after eating organic. This is a question involving a) One categorical variable b) Two categorical variables c) One quantitative variable d) Two quantitative variables e) One quantitative and one categorical variable
23
Statistics: Unlocking the Power of Data Lock 5 Visualization In analyzing detection of pesticides (pesticide detected or not) before and after eating organic, how might we visualize this data? a) Bar chart b) Histogram c) Side-by-side boxplots d) Segmented bar charts e) Scatterplot
24
Statistics: Unlocking the Power of Data Lock 5 Detection of Pesticides There were 12 pesticides measured 20 times each, before and after eating organic, yielding 240 measurements in each group. Before eating organic, 111 measurements detected pesticide. After eating organic, only 24 measurements detected pesticide. Create a two-way table summarizing this data.
25
Statistics: Unlocking the Power of Data Lock 5 Detection of Pesticides
26
Statistics: Unlocking the Power of Data Lock 5 Parameter and Statistic In analyzing detection of pesticides (pesticide detected or not) before and after eating organic, what would be an appropriate parameter and statistic for inference? a) Single proportion b) Difference in proportions c) Single mean d) Difference in means e) Correlation
27
Statistics: Unlocking the Power of Data Lock 5 Detection of Pesticides p 1 – p 2 : Proportion detecting pesticides before eating organic – proportion detecting pesticides after eating organic. Calculate the relevant statistic. DetectedNot DetectedTOTAL Before eating organic111129240 After eating organic24216240 TOTAL135345480
28
Statistics: Unlocking the Power of Data Lock 5 Bootstrap Distribution This is a bootstrap distribution based on 1000 simulations. Approximate a 99% confidence interval. a) (0.33, 0.4) b) (0.3, 0.42) c) (0.27, 0.45) d) (0.25, 0.5)
29
Statistics: Unlocking the Power of Data Lock 5 Interpretation Interpret the confidence interval in context.
30
Statistics: Unlocking the Power of Data Lock 5 Variables We’ll next analyze the response variable concentration of pesticide (measured in μg / g creatinine) to see if it differs before and after eating organic. This is a question involving a) One categorical variable b) Two categorical variables c) One quantitative variable d) Two quantitative variables e) One quantitative and one categorical variable
31
Statistics: Unlocking the Power of Data Lock 5 Visualization In analyzing concentration of pesticides before and after eating organic, how might we visualize this data? a) Bar chart b) Histogram c) Side-by-side boxplots d) Segmented bar charts e) Scatterplot
32
Statistics: Unlocking the Power of Data Lock 5 Pesticides
33
Statistics: Unlocking the Power of Data Lock 5
34
Parameter and Statistic In analyzing pesticide concentration (for any one pesticide) before and after eating organic, what would be an appropriate parameter and statistic for inference? a) Single proportion b) Difference in proportions c) Single mean d) Difference in means e) Correlation
35
Statistics: Unlocking the Power of Data Lock 5 Difference in Means Pesticide 24D6.460.945.52 3-PBA27.052.5424.52 35DC19.120.9718.15 CCC179.433.68175.75 ETU2.901.201.70 Mepiquat34.205.5828.62 Propamocarb2.100.241.86 TCP38.837.4831.35 Which one do you want a confidence interval for?
36
Statistics: Unlocking the Power of Data Lock 5 95% Confidence Interval Use bootstrap distribution from StatKey to create a 95% confidence interval using the standard error method. Interpret the confidence interval in context.
37
Statistics: Unlocking the Power of Data Lock 5 Variables Actually, this data is paired (before and after measurements on the same people), so it makes sense to look at the differences in pesticide concentration, before – after. This is a question involving a) One categorical variable b) Two categorical variables c) One quantitative variable d) Two quantitative variables e) One quantitative and one categorical variable
38
Statistics: Unlocking the Power of Data Lock 5 Visualization In analyzing the differences in concentration of pesticides before and after eating organic, how might we visualize this data? a) Bar chart b) Histogram c) Side-by-side boxplots d) Segmented bar charts e) Scatterplot
39
Statistics: Unlocking the Power of Data Lock 5 Histograms
40
Statistics: Unlocking the Power of Data Lock 5 Parameter and Statistic In analyzing the differences in concentration of each pesticide before and after eating organic, what would be an appropriate parameter and statistic for inference? a) Single proportion b) Difference in proportions c) Single mean d) Difference in means e) Correlation
41
Statistics: Unlocking the Power of Data Lock 5 3-PBA Insecticide found in grains, fruits, vegetables Difference in 3-PBA concentration before and after:
42
Statistics: Unlocking the Power of Data Lock 5 Bootstrap for 3-PBA Based on this bootstrap distribution with 1000 bootstrap statistics, the standard error is closest to… a) 1 b) 4 c) 7 d) 10
43
Statistics: Unlocking the Power of Data Lock 5 Confidence Interval Using the statistic and standard error, calculate a 95% confidence interval and interpret in context.
44
Statistics: Unlocking the Power of Data Lock 5 EXAM 1 In-class on Friday, 10/9 Covers everything we have done in class or lab so far (Chapters 1 – 3 in book, except not 2.6) Bring a non-cell phone calculator and a one- sided page of notes (8 ½ x 11 paper) Optional review problems posted on WileyPlus (doing lots of problems is the best way to study!)
45
Statistics: Unlocking the Power of Data Lock 5 Exam Topics: Ch 1-3 Chapter 1: Data Collection Data (cases, variables, etc.) Sampling Observational studies and confounding Randomized experiments Chapter 2: Descriptive Statistics (except not 2.6) Summary statistics for one or two variable(s) Graphical displays for one or two variable(s) Conditional probability Chapter 3: Confidence Intervals Sampling distributions Confidence intervals Margin of error Standard error and 95% confidence intervals Bootstrapping Percentile method for any level of confidence
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.