Examining the Relationship Between Two Variables (Bivariate Analyses)

What type of analysis?
We have two variables X and Y and we are interested in describing how a response (Y) is related to an explanatory variable (X).
What graphical displays do we use to show the relationship between X and Y?
What statistical analyses do we use to summarize, describe, and make inferences about the relationship?

Type of Displays
Y is continuous, X is continuous: Scatterplot
Y is continuous, X is ordinal or nominal: Comparative Boxplot
Y is ordinal or nominal, X is continuous: Logistic Plot
Y is ordinal or nominal, X is ordinal or nominal: 2-D Mosaic Plot

Fit Y by X in JMP
In the lower left corner of the Fit Y by X dialog box you will see this graphic, which is the same as the more stylized version on the previous slide. It is labeled by X variable/predictor data type and Y variable/response data type.

Type of Analyses
Y is continuous, X is continuous: Correlation and Regression (parametric or nonparametric).
Y is continuous, X is ordinal or nominal: if X has k = 2 levels, use the Two-Sample t-Test or Wilcoxon Rank Sum Test. If X has k > 2 levels, use Oneway ANOVA or the Kruskal-Wallis Test.
Y is ordinal or nominal, X is continuous: if Y has 2 levels, use Logistic Regression. If Y has more than 2 levels, use Polytomous Logistic Regression.
Y is ordinal or nominal, X is ordinal or nominal: if both X and Y have two levels, use Fisher’s Exact Test, RR/OR, and Risk Difference/AR. If either X or Y has more than two levels, use a Chi-square Test. For dependent samples, use McNemar’s Test.
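The decision rules on this slide can be sketched as a small lookup function. This is only an illustration of the slide's logic, not part of JMP: the function name `choose_analysis`, its parameters, and the returned labels are all hypothetical, and ordinal and nominal variables are lumped together as "categorical".

```python
def choose_analysis(y_type, x_type, y_levels=None, x_levels=None, dependent=False):
    """Return the analyses the slide suggests for a pair of data types.

    y_type, x_type: "continuous" or "categorical" (ordinal or nominal).
    y_levels, x_levels: number of levels; required for a categorical variable.
    dependent: True for dependent samples when both X and Y are categorical.
    """
    for name, value, levels in (("y", y_type, y_levels), ("x", x_type, x_levels)):
        if value not in ("continuous", "categorical"):
            raise ValueError(f"{name}_type must be 'continuous' or 'categorical'")
        if value == "categorical" and (levels is None or levels < 2):
            raise ValueError(f"{name}_levels must be at least 2 for a categorical variable")

    if y_type == "continuous":
        if x_type == "continuous":
            return ["correlation", "regression"]
        if x_levels == 2:
            return ["two-sample t-test", "Wilcoxon rank sum test"]
        return ["one-way ANOVA", "Kruskal-Wallis test"]

    # From here on, Y is ordinal or nominal.
    if x_type == "continuous":
        if y_levels == 2:
            return ["logistic regression"]
        return ["polytomous logistic regression"]

    # Both X and Y are ordinal or nominal.
    if dependent:
        return ["McNemar's test"]
    if x_levels == 2 and y_levels == 2:
        return ["Fisher's exact test", "RR/OR", "risk difference/AR"]
    return ["chi-square test"]


# Birthweight (continuous) vs. mother's smoking status (2 levels).
print(choose_analysis("continuous", "categorical", x_levels=2))
```

Each branch corresponds to one cell of the Fit Y by X grid, so the function stays in step with the slide as long as the two data types are classified correctly.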

Fit Y by X in JMP
[Figure: the Fit Y by X grid, labeled X continuous, X nominal/ordinal, Y nominal/ordinal, and Y continuous]

Example: Low Birthweight Study (Note: This is not the NC one)
List of Variables
id – ID # for infant & mother
Continuous:
headcir – head circumference (in.)
leng – length of infant (in.)
weight – birthweight (lbs.)
gest – gestational age (weeks)
mage – mother’s age
mnocig – mother’s cigarettes/day
mheight – mother’s height (in.)
mppwt – mother’s pre-pregnancy weight (lbs.)
fage – father’s age
fedyrs – father’s education (yrs.)
fnocig – father’s cigarettes/day
fheight – father’s height
Nominal:
lowbwt – low birth weight indicator (1 = yes, 0 = no)
mage35 – mother’s age over 35? (1 = yes, 0 = no)
smoker – mother smoked during preg. (1 = yes, 0 = no)
Smoker – mother’s smoking status (Smoker or Non-smoker)
Low Birth Weight – birth weight (Low, Normal)

Example: Low Birthweight Study (Birthweight vs. Gestational Age)
Y = birthweight (lbs.), continuous
X = gestational age (weeks), continuous

Regression and Correlation Analysis from Fit Y by X
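For readers without JMP, the two quantities at the heart of this output, the Pearson correlation and the least-squares line, can be computed by hand. The sketch below is a minimal illustration under stated assumptions: the function name `pearson_r_and_line` is hypothetical, and the numbers in the usage lines are made up for demonstration and are not taken from the low birthweight study.

```python
import math


def pearson_r_and_line(x, y):
    """Return (r, slope, intercept) for the least-squares line y = intercept + slope * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


# Made-up values for illustration only (not from the study).
x_demo = [1.0, 2.0, 3.0, 4.0]
y_demo = [2.1, 3.9, 6.2, 7.8]
r, slope, intercept = pearson_r_and_line(x_demo, y_demo)
print(f"r = {r:.3f}, slope = {slope:.3f}, intercept = {intercept:.3f}")
```

This covers only the parametric summary. Inference (tests and confidence intervals) and the nonparametric alternatives are left to the statistical software.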

Example: Low Birthweight Study (Birthweight vs. Mother’s Smoking Status)
Y = birthweight (lbs.), continuous
X = mother’s smoking status (Smoker vs. Non-smoker), nominal

Independent Samples t-Test from Fit Y by X

Example: Low Birthweight Study (Birthweight Status vs. Mother’s Cigs/Day)
Y = birthweight status (Low, Normal), nominal
X = mother’s cigs./day, continuous
Plot label: P(Low|Cigs/Day)

Logistic Regression from Fit Y by X
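A simple logistic regression models the probability of the outcome, here P(Low|Cigs/Day), as a logistic function of the predictor. The sketch below shows that curve under stated assumptions: the function name `logistic_probability` and the coefficients `b0` and `b1` are hypothetical placeholders, not estimates from the study, and JMP's own output may parameterize the model differently.

```python
import math


def logistic_probability(x, b0, b1):
    """Return P(Y = 1 | x) = 1 / (1 + exp(-(b0 + b1 * x)))."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


# Hypothetical coefficients chosen only to show the shape of the curve.
b0, b1 = -2.0, 0.05
for cigs_per_day in (0, 10, 20, 40):
    p = logistic_probability(cigs_per_day, b0, b1)
    print(f"cigs/day = {cigs_per_day:2d}  P(Low) = {p:.3f}")
```

With a positive slope the predicted probability rises with the predictor and always stays between 0 and 1, which is why this model suits a two-level response.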

Example: Low Birthweight Study (Birthweight Status vs. Mother’s Smoking Status)
Y = birthweight status (Low, Normal), nominal
X = mother’s smoking status (Smoker, Non-smoker), nominal

Independent Samples p1 vs. p2: Fisher’s Exact, Chi-square, Risk Difference, RR, & OR
The arrows are skipped this time; everything should be self-explanatory. Notice that the reported OR is upside-down and needs to be reciprocated: OR = 1/0.342 = 2.92.

Summary
We have seen how bivariate analyses work in JMP and in statistics in general. The appropriate type of analysis depends on the data types of the response (Y) and the explanatory variable or predictor (X).