Statistics: Unlocking the Power of Data (Lock5)
Exam 2 Review
STAT 101, Dr. Kari Lock Morgan, 11/13/12
Review of Chapters 5-9


Exam 2
In class Thursday 11/15
Cumulative, covering chapters 1-9 (but not 8.2 or 9.2… everything we have done so far in the course)
Closed book, but allowed 2 double-sided pages of notes prepared by you
You will need a calculator, and will need to know how to compute p-values for normal, t, chi-square, and F distributions using your calculator
Practice exam and solutions to review problems available under documents on the course webpage

Office Hours This Week
Tuesday: Prof Morgan, 1 – 2:30 pm, Old Chem 216
Wednesday: Prof Morgan, 2 – 3 pm, Old Chem 216
Wednesday: Prof Morgan, 4:30 – 5:30 pm, Old Chem 216
Wednesday: Heather, 8 – 9 pm, Old Chem 211A
Thursday: Prof Morgan, 1 – 2:30 pm, Old Chem 216
Also, the Stat Education Center in Old Chem 211A is open Sunday – Thursday, 4 pm – 9 pm, with stat majors and stat PhD students available to answer questions

Data Collection
Was the sample randomly selected?
  Yes: Possible to generalize to the population
  No: Should not generalize to the population
Was the explanatory variable randomly assigned?
  Yes: Possible to make conclusions about causality
  No: Cannot make conclusions about causality

Variable(s) | Visualization | Summary Statistics
Categorical | bar chart, pie chart | frequency table, relative frequency table, proportion
Quantitative | dotplot, histogram, boxplot | mean, median, max, min, standard deviation, z-score, range, IQR, five number summary
Categorical vs Categorical | side-by-side bar chart, segmented bar chart | two-way table, difference in proportions
Quantitative vs Categorical | side-by-side boxplots | statistics by group, difference in means
Quantitative vs Quantitative | scatterplot | correlation, simple linear regression

Confidence Interval
A confidence interval for a parameter is an interval computed from sample data by a method that will capture the parameter for a specified proportion of all samples.
A 95% confidence interval will contain the true parameter for 95% of all samples.
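The "95% of all samples" statement can be checked by simulation. The sketch below is only an illustration: the population (normal, mean 50, standard deviation 10), the sample size, and the function name `coverage` are all assumptions, and it uses the normal critical value z* for simplicity where the course would use t* for a mean.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def coverage(true_mean=50, true_sd=10, n=40, reps=2000, seed=1):
    """Fraction of simulated 95% intervals that capture the true mean.

    Hypothetical helper: population values are made up for illustration.
    """
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(0.975)  # about 1.96
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        se = stdev(sample) / sqrt(n)
        lower = mean(sample) - z_star * se
        upper = mean(sample) + z_star * se
        if lower <= true_mean <= upper:
            hits += 1
    return hits / reps


rate = coverage()
print(f"intervals capturing the true mean: {rate:.1%}")
```

The printed rate should land close to 95%, slightly under because z* is a little smaller than the matching t*.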

Hypothesis Testing
How unusual would it be to get results as extreme as (or more extreme than) those observed, if the null hypothesis is true?
If it would be very unusual, then the null hypothesis is probably not true!
If it would not be very unusual, then there is no evidence against the null hypothesis.

p-value
The p-value is the probability of getting a statistic as extreme as (or more extreme than) that observed, just by random chance, if the null hypothesis is true.
The p-value measures evidence against the null hypothesis.

Hypothesis Testing
1. State hypotheses
2. Calculate a test statistic, based on your sample data
3. Create a distribution of this test statistic, as it would be observed if the null hypothesis were true
4. Use this distribution to measure how extreme your test statistic is
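The four steps above can be sketched as a randomization test for a difference in means. This is a minimal illustration, not the course's own software: the function name `randomization_p_value`, the two small groups, and the choice of 5,000 reshuffles are all made up for the example.

```python
import random
from statistics import mean


def randomization_p_value(group_a, group_b, reps=5000, seed=0):
    """Two-sided randomization test for a difference in means.

    Step 1 (hypotheses): H0 says the group labels do not matter;
    Ha says the two group means differ.
    Hypothetical helper written for illustration only.
    """
    rng = random.Random(seed)
    # Step 2: test statistic computed from the observed data.
    observed = mean(group_a) - mean(group_b)

    # Step 3: distribution of the statistic if H0 were true,
    # built by reshuffling the pooled values into two groups.
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(reps):
        rng.shuffle(pooled)
        simulated = mean(pooled[:n_a]) - mean(pooled[n_a:])
        # Step 4: count simulated statistics at least as extreme as observed.
        if abs(simulated) >= abs(observed):
            extreme += 1
    return observed, extreme / reps


# Made-up data, for illustration only.
treatment = [12, 15, 14, 16, 18, 17]
control = [10, 11, 13, 12, 9, 14]
observed_diff, p_value = randomization_p_value(treatment, control)
print(f"observed difference = {observed_diff:.2f}, p-value = {p_value:.3f}")
```

A small p-value here means few reshuffles produced a difference as large as the observed one, which is the sense in which the data count as evidence against the null hypothesis.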

Distribution of the Sample Statistic
1. Sampling distribution: distribution of the statistic based on many samples from the population
2. Bootstrap distribution: distribution of the statistic based on many samples with replacement from the original sample
3. Randomization distribution: distribution of the statistic assuming the null hypothesis is true
4. Normal, t, χ², F: theoretical distributions used to approximate the distribution of the statistic
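As a rough illustration of item 2, a bootstrap distribution for a sample mean can be built by resampling with replacement from the original sample. The data, the helper name `bootstrap_means`, and the number of resamples are assumptions chosen for the example.

```python
import random
from statistics import mean, stdev


def bootstrap_means(sample, reps=5000, seed=0):
    """Means of `reps` resamples drawn with replacement from `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    return [mean(rng.choices(sample, k=n)) for _ in range(reps)]


sample = [23, 19, 31, 27, 22, 25, 30, 18, 26, 24]  # made-up data
boot = sorted(bootstrap_means(sample))

# The standard deviation of the bootstrap statistics estimates the standard error.
boot_se = stdev(boot)

# The middle 95% of the bootstrap distribution gives a percentile interval.
lower = boot[int(0.025 * len(boot))]
upper = boot[int(0.975 * len(boot)) - 1]
print(f"SE estimate = {boot_se:.2f}, 95% interval = ({lower:.1f}, {upper:.1f})")
```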

Sample Size Conditions
For large sample sizes, either simulation methods or theoretical methods work.
If sample sizes are too small, only simulation methods can be used.

Using Distributions
For confidence intervals, you find the desired percentage in the middle of the distribution, then find the corresponding value on the x-axis.
For p-values, you find the value of the observed statistic on the x-axis, then find the area in the tail(s) of the distribution.
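Both directions can be sketched with the standard normal distribution from Python's standard library. The observed statistic of 2.1 is a made-up value; the same logic applies to t, χ², and F distributions, which the standard library does not provide.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Confidence interval: start from the middle 95% and read off the x-axis value.
confidence = 0.95
z_star = standard_normal.inv_cdf(1 - (1 - confidence) / 2)

# p-value: start from the observed statistic and find the tail area.
z_observed = 2.1  # hypothetical standardized test statistic
upper_tail = 1 - standard_normal.cdf(z_observed)
two_tailed = 2 * (1 - standard_normal.cdf(abs(z_observed)))

print(f"z* = {z_star:.3f}")
print(f"upper-tail p-value = {upper_tail:.4f}, two-tailed p-value = {two_tailed:.4f}")
```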

Confidence Intervals

Confidence Intervals
Return to original scale with

Hypothesis Testing

General Formulas
When performing inference for a single parameter (or difference in two parameters), the following formulas are used:
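The formulas themselves are not present in this transcript. In their usual textbook form (an assumption about what the slide showed), with SE the standard error and z* the critical value, they read as follows; t* and t replace z* and z when the t distribution applies.

```latex
\[
\text{Confidence interval:}\quad \text{sample statistic} \pm z^{*} \cdot SE
\]
\[
\text{Test statistic:}\quad z = \frac{\text{sample statistic} - \text{null value}}{SE}
\]
```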

General Formulas
For proportions (categorical variables), the normal distribution is used.
For inference involving any quantitative variable (means, correlation, slope), the t distribution is used.

Standard Error
The standard error is the standard deviation of the sample statistic.
The formula for the standard error depends on the type of statistic (which depends on the type of variable(s) being analyzed).

Parameter | Distribution | Conditions | Standard Error
Proportion | Normal | All counts at least 10: np ≥ 10, n(1 – p) ≥ 10 |
Difference in Proportions | Normal | All counts at least 10: n_1 p_1 ≥ 10, n_1 (1 – p_1) ≥ 10, n_2 p_2 ≥ 10, n_2 (1 – p_2) ≥ 10 |
Mean | t, df = n – 1 | n ≥ 30 or data normal |
Difference in Means | t, df = smaller of n_1 – 1, n_2 – 1 | n_1 ≥ 30 or data normal, n_2 ≥ 30 or data normal |
Paired Diff. in Means | t, df = n_d – 1 | n_d ≥ 30 or data normal |
Correlation | t, df = n – 2 | n ≥ 30 |
pg 470
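The Standard Error entries are not present in this transcript. The sketch below uses the standard textbook formulas for these six cases, which is an assumption about what the slide contained; the function names are invented for illustration. For a confidence interval the sample proportion is plugged in for p, while a test uses the null value.

```python
from math import sqrt


def se_proportion(p, n):
    """Standard error of a single proportion."""
    return sqrt(p * (1 - p) / n)


def se_diff_proportions(p1, n1, p2, n2):
    """Standard error of a difference in two proportions."""
    return sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


def se_mean(s, n):
    """Standard error of a mean, from the sample standard deviation s."""
    return s / sqrt(n)


def se_diff_means(s1, n1, s2, n2):
    """Standard error of a difference in two means."""
    return sqrt(s1**2 / n1 + s2**2 / n2)


def se_paired_diff(s_d, n_d):
    """Standard error of a mean of paired differences."""
    return s_d / sqrt(n_d)


def se_correlation(r, n):
    """Standard error of a sample correlation r."""
    return sqrt((1 - r**2) / (n - 2))


# Made-up inputs, for illustration only.
print(se_proportion(0.5, 100))
print(se_mean(10, 25))
print(se_correlation(0.6, 18))
```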

Multiple Categories
These formulas do not work for categorical variables with more than two categories, because there are multiple parameters.
For one or two categorical variables with multiple categories, use χ² tests.
For testing for a difference in means across multiple groups, use ANOVA.
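As a small illustration of the χ² case, the statistic for one categorical variable sums (observed − expected)² / expected over the categories. The counts and the null hypothesis of equal proportions are made up, and `chi_square_statistic` is a hypothetical helper name. The p-value would then come from a χ² distribution with the stated degrees of freedom, for example on a calculator.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Made-up counts for a variable with four categories.
observed = [18, 22, 20, 40]
n = sum(observed)

# Null hypothesis assumed for the example: all four proportions equal 0.25.
null_proportions = [0.25, 0.25, 0.25, 0.25]
expected = [n * p for p in null_proportions]

statistic = chi_square_statistic(observed, expected)
df = len(observed) - 1  # degrees of freedom for a single categorical variable
print(f"chi-square = {statistic:.2f}, df = {df}")
```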

Simple Linear Regression
Simple linear regression estimates the population model with the sample model:
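The model equations are not present in this transcript. In one common notation (an assumption, not necessarily the slide's), the population model y = β0 + β1·x + ε is estimated by the sample line ŷ = b0 + b1·x. A minimal least squares sketch, with made-up data and a hypothetical helper name:

```python
from statistics import mean


def least_squares_line(x, y):
    """Return (intercept, slope) of the least squares line for paired data."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = least_squares_line(x, y)
print(f"predicted y = {b0:.2f} + {b1:.2f} * x")
```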

Simple Linear Regression
Inference for the slope can be done using

Inference for the Slope

Intervals
A confidence interval has a given chance of capturing the mean y value at a specified x value (the point on the line).
A prediction interval has a given chance of capturing the y value for a particular case at a specified x value (the actual point).

Conditions for SLR
Inference based on the simple linear model is only valid if the following conditions hold:
1) Linearity
2) Constant variability of residuals
3) Normality of residuals

Inference Methods