Two-Sample Problems – Means

Presentation transcript:

Two-Sample Problems – Means
1. Comparing two (unpaired) populations
2. Assume: 2 SRSs, independent samples, Normal populations
Make an inference for the difference of their means, μ1 − μ2:
Sample from population 1: n1, x̄1, s1
Sample from population 2: n2, x̄2, s2

S.E. – standard error in the two-sample process
Confidence Interval: Estimate ± margin of error
Significance Test:
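The formulas for this slide were lost from the transcript. A minimal sketch, assuming the standard unpooled two-sample t procedure: the estimate is x̄1 − x̄2, the standard error is sqrt(s1²/n1 + s2²/n2), the confidence interval is estimate ± t* × S.E., and the test statistic is t = estimate / S.E. The function name and the summary statistics below are hypothetical and only for illustration. The degrees of freedom use the Welch approximation, which is an assumption about what the course intends.

```python
import math


def two_sample_t(xbar1, s1, n1, xbar2, s2, n2):
    """Return (estimate, standard error, t statistic, Welch df) from summary stats."""
    v1 = s1 ** 2 / n1  # variance contribution of sample 1
    v2 = s2 ** 2 / n2  # variance contribution of sample 2
    estimate = xbar1 - xbar2
    se = math.sqrt(v1 + v2)
    t_stat = estimate / se  # test statistic for H0: mu1 - mu2 = 0
    # Welch-Satterthwaite approximation to the degrees of freedom (assumed here).
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return estimate, se, t_stat, df


# Hypothetical summary statistics, not taken from the slides.
estimate, se, t_stat, df = two_sample_t(10.0, 2.0, 16, 8.0, 3.0, 9)
print(estimate, round(se, 3), round(t_stat, 3), round(df, 1))
```

The p-value and the critical value t* still have to come from a t table or the calculator, since the Python standard library has no t distribution.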

Using the Calculator
Confidence Interval:
On calculator: STAT, TESTS, 0:2-SampTInt…
Given data, need to enter: list locations, C-Level
Given stats, need to enter, for each sample: x̄, s, n and then C-Level
Select input (Data or Stats), enter appropriate info, then Calculate

Using the Calculator
Significance Test:
On calculator: STAT, TESTS, 4:2-SampTTest…
Given data, need to enter: list locations, Ha
Given stats, need to enter, for each sample: x̄, s, n and then Ha
Select input (Data or Stats), enter appropriate info, then Calculate or Draw
Output: Test stat, p-value

Ex 1. At the 5% significance level, does one model of camp stove boil water any differently than another?
Model 1:
Model 2:

Ex 2. Is there evidence that children get more REM sleep than adults at the 1% significance level?
Children:
Adults:

Ex 3. Create a 98% C.I. for estimating the difference in mean petal lengths (in cm) for two species of iris.
Iris virginica:
Iris setosa:

Ex 4. At the 2% significance level, does one species of iris differ from the other in petal length?
Iris virginica:
Iris setosa:

Two-Sample Problems – Proportions
Make an inference for their difference, p1 − p2:
Sample from population 1:
Sample from population 2:

Using the Calculator
Confidence Interval: Estimate ± margin of error
On calculator: STAT, TESTS, B:2-PropZInt…
Need to enter: x1, n1, x2, n2 and then C-Level
Enter appropriate info, then Calculate.

Using the Calculator
Significance Test:
On calculator: STAT, TESTS, 6:2-PropZTest…
Need to enter: x1, n1, x2, n2 and then Ha
Enter appropriate info, then Calculate or Draw
Output: Test stat, p-value
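The interval formula itself is missing from the transcript. As a hedged sketch, the usual large-sample interval for p1 − p2 is (p̂1 − p̂2) ± z* × sqrt(p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2). The function name and the counts below are hypothetical and are not data from the slides.

```python
import math
from statistics import NormalDist


def two_prop_z_interval(x1, n1, x2, n2, c_level):
    """Large-sample confidence interval for p1 - p2 (unpooled standard error)."""
    p1 = x1 / n1
    p2 = x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # z* cuts off the central c_level area under the standard Normal curve.
    z_star = NormalDist().inv_cdf(0.5 + c_level / 2)
    margin = z_star * se
    estimate = p1 - p2
    return estimate - margin, estimate + margin


# Hypothetical counts: 40 successes out of 100 versus 30 out of 100.
low, high = two_prop_z_interval(40, 100, 30, 100, 0.95)
print(round(low, 3), round(high, 3))
```

An interval that contains 0, as this hypothetical one does, is consistent with no difference between the two proportions.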

Ex 5. Create a 95% C.I. for the difference in proportions of eggs hatched.
Nesting boxes apart/hidden:
Nesting boxes close/visible:

Ex 6. Split 1100 potential voters into two groups, those who get a reminder to register and those who do not. Of the 600 who got reminders, 332 registered. Of the 500 who got no reminders, 248 registered. Is there evidence at the 1% significance level that the proportion of potential voters who registered was greater in the group that received reminders?
Group 1 (reminder): 332 of 600 registered
Group 2 (no reminder): 248 of 500 registered

Ex 6. (continued)
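The worked solution for Ex 6 is not in the transcript. A minimal sketch of the calculation, assuming the usual pooled two-proportion z test with H0: p1 = p2 and Ha: p1 > p2, where group 1 is the reminder group. The counts come from the example; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def two_prop_z_test_right(x1, n1, x2, n2):
    """Pooled two-proportion z test for Ha: p1 > p2. Returns (z, p_value)."""
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion assumed under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 1 - NormalDist().cdf(z)  # right-tail area
    return z, p_value


# Ex 6: 332 of 600 registered with a reminder, 248 of 500 without.
z, p_value = two_prop_z_test_right(332, 600, 248, 500)
print(round(z, 2), round(p_value, 3))
```

Under these assumptions z is roughly 1.90 and the one-sided p-value roughly 0.029. That is larger than 0.01, so at the 1% level there is not enough evidence that reminders raised the registration proportion.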

Ex 7. “Can people be trusted?” Among year olds, 45 said “yes”. Among year olds, 72 said “yes”. Does this indicate that the proportion of trusting people is higher in the older population? Use a significance level of α = 0.05.
Group 1:
Group 2:

Ex 7. (continued)

Scatterplots & Correlation
Two characteristics are recorded for each individual in the population/sample, instead of one.
Goal: to be able to make accurate predictions for one variable in terms of another variable, based on a data set of paired values.

Variables
Explanatory (independent) variable, x, is used to predict a response.
Response (dependent) variable, y, will be the outcome from a study or experiment.
Examples: height vs. weight, age vs. memory, temperature vs. sales

Scatterplots
A plot of paired values helps to determine if a relationship exists.
Ex: variables – height (in), weight (lb)
[Table: Height, Weight]

Scatterplots - Features
Direction: negative, positive
Form: line, parabola, wave (sine)
Strength: how close to following a pattern
Direction:
Form:
Strength:

Scatterplots – Temp vs Oil used
Direction:
Form:
Strength:

Correlation
Correlation, r, measures the strength of the linear relationship between two variables.
r > 0: positive direction
r < 0: negative direction
Close to +1:
Close to -1:
Close to 0:
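The slide gives no formula for r. A minimal sketch, assuming the usual Pearson correlation, r = Σ(x − x̄)(y − ȳ) / sqrt(Σ(x − x̄)² × Σ(y − ȳ)²). The function name and the height and weight values are hypothetical; the slide's own data did not survive.

```python
import math


def correlation(xs, ys):
    """Pearson correlation r for paired lists xs and ys of equal length."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired data: height (in) and weight (lb).
heights = [60, 62, 65, 68, 70]
weights = [110, 120, 140, 155, 170]
r = correlation(heights, weights)
print(round(r, 3))
```

For these made-up values r is close to +1, meaning the points lie near a line with positive slope. A perfectly decreasing linear pattern such as x = 1, 2, 3 and y = 6, 4, 2 gives r = −1.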

.85, -.02, .13,

Lines - Review
y = a + bx
a: y-intercept
b: slope

Regression
Looking at a scatterplot, if the form seems linear, then use a linear model, or regression line, to describe how a response variable y changes as an explanatory variable x changes.
Regression models are often used to predict the value of a response variable for a given value of the explanatory variable.

Least-Squares Regression Line
The line that best fits the data:
where:
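The equation of the line is missing from the transcript. A hedged sketch, assuming the standard least-squares formulas in the a + bx notation used elsewhere in these slides: ŷ = a + bx, with slope b = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² (equivalently b = r × s_y / s_x) and intercept a = ȳ − b × x̄. The function name and the data are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept: the line passes through (x-bar, y-bar)
    return a, b


# Hypothetical paired data, not taken from the slides.
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 7]
a, b = least_squares_line(xs, ys)
prediction = a + b * 3  # predicted response at x = 3
print(round(a, 3), round(b, 3), round(prediction, 3))
```

For this made-up data the line is ŷ = 0.5 + 1.6x, so each one-unit increase in x raises the predicted y by 1.6.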

Example
Fat and calories for 11 fast food chicken sandwiches
Fat:
Calories:

Example
Fat and calories for 11 fast food chicken sandwiches
Fat:
Calories:
[Table or plot: Fat, Calories]

Example-continued
What is the slope and what does it mean?
What is the intercept and what does it mean?
How many calories would you predict for a sandwich with 40 grams of fat?

Why “Least-squares”?
The least-squares line is the line that minimizes the sum of the squared residuals.
Residual: difference between the actual and the predicted value of y
[Table: x, y, …]
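A small sketch of what "minimizes the sum of the squared residuals" means in practice. It fits the least-squares line to hypothetical data, computes residual = actual y − predicted y for each point, and compares the sum of squared residuals with that of another, arbitrarily chosen line. All names and values are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y-hat = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b


def residuals(xs, ys, a, b):
    """Residual = actual y minus predicted y, for each data point."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]


def sum_squared_residuals(xs, ys, a, b):
    return sum(e ** 2 for e in residuals(xs, ys, a, b))


# Hypothetical paired data, not taken from the slides.
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 7]
a, b = fit_line(xs, ys)
sse_best = sum_squared_residuals(xs, ys, a, b)
sse_other = sum_squared_residuals(xs, ys, 0.0, 1.8)  # some other line
print(round(sse_best, 3), round(sse_other, 3))
```

Any line other than the fitted one gives a larger sum of squared residuals for the same data, and the residuals of the fitted line add up to zero.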

Scatterplots – Residuals
To double-check the appropriateness of using a linear regression model, plot the residuals against the explanatory variable.
No unusual pattern in the residual plot suggests that a linear model is appropriate.

Other things to look for
The squared correlation, r², gives the percent of the variation in y explained by the regression line.
Chicken data:
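To make "percent of variation explained" concrete, here is a hedged sketch that computes r² two ways: by squaring the correlation, and as 1 − (sum of squared residuals) / (total variation of y about ȳ). The two agree for a least-squares line. The function name and the data are hypothetical; the chicken sandwich values are not in the transcript.

```python
import math


def r_squared_two_ways(xs, ys):
    """Return (r squared, 1 - SSE/SST) for the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)  # total variation in y (SST)
    b = sxy / sxx
    a = mean_y - b * mean_x
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # unexplained part
    r = sxy / math.sqrt(sxx * syy)
    return r ** 2, 1 - sse / syy


# Hypothetical paired data, not taken from the slides.
from_r, from_residuals = r_squared_two_ways([1, 2, 3, 4], [2, 4, 5, 7])
print(round(from_r, 4), round(from_residuals, 4))
```

For this made-up data both values are about 0.985, so roughly 98.5% of the variation in y is explained by the line.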

Other things to look for
Influential observations:
Prediction vs. Causation: x and y are linked (associated) somehow, but we don’t say “x causes y to occur”. Other forces may be causing the relationship (lurking variables).

Extrapolation: using the regression for a prediction outside of the range of values of the explanatory variable.
[Table: age, weight]

On calculator
Set up: 2nd 0 (Catalog), x⁻¹ (D), scroll down to “DiagnosticOn”, Enter, Enter
Scatterplots: 2nd Y= (Stat Plot), 1, On, select Type and list locations for x values and y values. Then ZOOM, 9 (ZoomStat)
Regression: STAT, CALC, 8:LinReg(a+bx), Enter, list location for x, list location for y, Enter
Graph: Y=, enter line into Y1

Examples:
Animals: Cat, Chick, Dog, Duck, Goat, Lion, Bird, Pig, Bunny, Squirrel
x: Incubation, days; y: Lifespan, years
x: age, years; y: resale, thousands $

Contingency Tables
Making comparisons between two categorical variables
Contingency tables summarize all outcomes
– Row variable: one row for each possible value
– Column variable: one column for each possible value
– Each cell (i, j) gives the number of individuals with those values for the respective variables.
[Table: Age \ Income, with row and column totals]

Info from the table
– # who are over 25 and make under $15,000:
– % who are over 25 and make under $15,000:
– % who are over 25:
– % of the over 25 who make under $15,000:
[Table: Age \ Income, with row and column totals]

Marginal Distributions
– Look to the margins of the table for each individual variable’s distribution
– Marginal distribution for age:
– Marginal distribution for income:
[Table: Age \ Income, with row and column totals]
[Table: Age, Freq., Rel. Freq.; surviving entries: >25 has Freq. 12, Total is 40]
[Table: Income, Freq., Rel. Freq.]

Conditional Distributions
– Look at one variable’s distribution given another
– How does income vary over the different age groups?
– Consider each age group as a separate population and compute relative frequencies:
[Table: Age \ Income, counts]
[Table: Age \ Income, relative frequencies within each age group]
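The counts in the slide's table did not survive, so the following is only a sketch with invented numbers. It shows how marginal distributions come from the row and column totals and how a conditional distribution treats each row (age group) as its own population. The table values, the income labels, and the function name are all hypothetical.

```python
def marginal_and_conditional(table):
    """table maps each row label to a list of counts, one per column."""
    row_totals = {row: sum(counts) for row, counts in table.items()}
    n = sum(row_totals.values())
    n_cols = len(next(iter(table.values())))
    col_totals = [sum(counts[j] for counts in table.values()) for j in range(n_cols)]
    # Marginal distributions: totals divided by the overall sample size.
    row_marginal = {row: total / n for row, total in row_totals.items()}
    col_marginal = [total / n for total in col_totals]
    # Conditional distributions: each row's counts divided by that row's total.
    conditional = {
        row: [count / row_totals[row] for count in counts]
        for row, counts in table.items()
    }
    return row_marginal, col_marginal, conditional


# Invented counts. Columns stand for three hypothetical income brackets.
income_labels = ["low", "middle", "high"]
age_by_income = {"<25": [20, 6, 2], ">25": [2, 4, 6]}
row_marginal, col_marginal, conditional = marginal_and_conditional(age_by_income)
print(row_marginal)
print(dict(zip(income_labels, col_marginal)))
print(conditional)
```

In this invented table the two age groups have very different conditional income distributions, which is the pattern that signals an association between the variables.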

Independence Revisited
Two variables are independent if knowledge of one does not affect the chances of the other.
In terms of contingency tables, this means that the conditional distribution of one variable is (almost) the same for all values of the other variable.
In the age/income example, the conditionals are not even close. These variables are not independent. There is some association between age and income.

Test for Independence
Is there an association between two variables?
– H0: The variables are independent (The two variables are not associated)
– Ha: The variables are not independent (The two variables are associated)
Assuming independence:
– Expected number in each cell (i, j): (% of value i for variable 1) × (% of value j for variable 2) × (sample size) = (row i total × column j total) / (sample size)
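A minimal sketch of the expected-count calculation under independence. Multiplying the two marginal percentages by the sample size is the same as (row total × column total) / sample size, which is what the code computes for every cell. The function name and the observed counts are hypothetical.

```python
def expected_counts(observed):
    """Expected cell counts under independence for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [
        [row_total * col_total / n for col_total in col_totals]
        for row_total in row_totals
    ]


# Hypothetical 2 x 2 table of observed counts, not taken from the slides.
observed = [[30, 10], [20, 40]]
expected = expected_counts(observed)
print(expected)
```

The expected table keeps the same row and column totals as the observed table; only the way the counts are spread across the cells changes.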

Example of Computing Expected Values
[Table: Rh \ Blood type, observed counts; columns A, B, AB, O, Total]
Expected number in cell (A, +):
[Table: Rh \ Blood type, expected counts; columns A, B, AB, O, Total]

Chi-square statistic
To measure the difference between the observed table and the expected table, we use the chi-square test statistic:
X² = Σ (observed − expected)² / expected
where the summation runs over every cell in the table.
1. Skewed right
2. df = (r – 1)(c – 1)
3. Right-tailed test
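A hedged sketch of the chi-square statistic, assuming the usual form: the sum over all cells of (observed − expected)² / expected, with expected counts computed under independence and df = (r − 1)(c − 1). The function name and the observed counts are hypothetical.

```python
def chi_square_statistic(observed):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # count under independence
            statistic += (obs - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df


# Hypothetical 2 x 2 table of observed counts, not taken from the slides.
statistic, df = chi_square_statistic([[30, 10], [20, 40]])
print(round(statistic, 3), df)
```

The p-value is the area to the right of the statistic under the chi-square curve with the given df. It has to come from a table or the calculator, since the Python standard library has no chi-square distribution.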

Test for Independence – Steps
– State variables being tested
– State hypotheses: H0, the null hypothesis, vars independent; Ha, the alternative, vars not independent
– Compute test statistic: if the null hypothesis is true, where does the sample fall? Test stat = X²-score
– Compute p-value: what is the probability of seeing a test stat as extreme as (or more extreme than) that?
– Conclusion: small p-values lead to strong evidence against H0.

ST – on the calculator
On calculator: STAT, TESTS, C:χ²-Test
Observed: [A]
Expected: [B]
Enter observed info into matrix A, then perform test with Calculate or Draw.
Output: Test stat, p-value, df
To enter observed info into matrix A: 2nd, x⁻¹ (Matrix), EDIT, 1: A, change dimensions, enter info in each cell.

Ex. Test whether blood type and Rh factor are independent at a 5% significance level.

Ex. Test whether age and stance on marijuana legalization are associated.
[Table: stance \ age; rows: for, against, Total; last column: Total]

Additional Examples
[Table: personality \ college; rows: extrovert, introvert; columns: Health, Science, Lib Arts, Educator]
[Table: Job grade \ marital status; columns: Single, Married, Divorced]
[Table: City size \ practice status; columns: Government, Judicial, Private, Salaried; rows: <250, , >500,]