Ordinal Classification of Heart Disease Severity


Joey Glasser, David Cavender, Vincent Li, Zsofia Voros

Problem and its relevance to health: For years, heart disease has been the leading cause of death in the United States. Although factors such as smoking and being overweight have been associated with heart disease, its causes are still not fully understood. Building a predictive model that relates attributes of a person's health to the presence and severity of heart disease can improve understanding of those causes and, in turn, may help reduce mortality.

The Data: The Cleveland Clinic Foundation collected information on 14 health-related attributes of roughly 300 individuals. The attributes include age, sex, smoking frequency, and cardiovascular stress from exercise. One attribute records whether an individual has heart disease: it is 0 if the individual does not have heart disease, and otherwise an integer from 1 to 4 representing the severity of the disease.

The Model: In this project we will use logistic regression to predict the degree of heart disease from explanatory variables such as sex, age, and cholesterol level. The response is ordinal, meaning its values (0 through 4) are ranked. We will fit four binary logistic models that predict whether the degree of heart disease is greater than 0, greater than 1, greater than 2, and greater than 3. To predict the degree for a given person, their data will be fed into all four models, and the four outputs will be combined to calculate the probability that the person's degree of heart disease is 0, 1, 2, 3, or 4. The overall model predicts the degree with the highest probability.
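The slides do not spell out how the four outputs are combined, so the following is only a minimal sketch of one common choice: take the probability of degree k to be P(degree > k-1) minus P(degree > k), with P(degree > -1) = 1 and P(degree > 4) = 0. The function names, the example probabilities, and the clip-and-renormalize step are assumptions made for illustration, not the authors' code.

```python
def threshold_targets(severity, n_classes=5):
    """Build one binary target per cut: 1 if severity > k, for k = 0..n_classes-2."""
    return [[1 if s > k else 0 for s in severity] for k in range(n_classes - 1)]


def class_probabilities(p_greater):
    """Turn [P(y>0), P(y>1), P(y>2), P(y>3)] into [P(y=0), ..., P(y=4)]."""
    # Pad with P(y > -1) = 1 and P(y > 4) = 0 so every class is a difference.
    padded = [1.0] + list(p_greater) + [0.0]
    raw = [padded[k] - padded[k + 1] for k in range(len(padded) - 1)]
    # Separately fitted models need not give decreasing probabilities, so a
    # difference can be negative. Clipping and renormalizing is an assumed,
    # simple remedy; the raw differences sum to 1, so the total is at least 1.
    clipped = [max(r, 0.0) for r in raw]
    total = sum(clipped)
    return [c / total for c in clipped]


def predict_degree(p_greater):
    """Return the degree (0-4) with the highest combined probability."""
    probs = class_probabilities(p_greater)
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical outputs of the four fitted models for one person.
p_greater = [0.60, 0.35, 0.20, 0.05]
probs = class_probabilities(p_greater)
predicted = predict_degree(p_greater)
```

Here `threshold_targets` shows how the single 0 to 4 label would be recoded into the four binary responses the logistic models are trained on; any binary classifier that outputs probabilities could fill the role of the fitted models.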

How the model will be evaluated: The model will be evaluated using the micro-averaged F1-score on a holdout test data set. We chose the micro-averaged F1-score, which pools true positives, false positives, and false negatives across all five classes, because the class sizes are imbalanced. To see how our method compares to a simple logistic regression model, we will build both and compare their scores on the test data set.

Anticipated challenges: One challenge that might arise is dependence between predictors. In the case of multicollinearity, we would identify the highly correlated predictors and remove as many of them as necessary, or switch from a logistic model to a classification model that is less sensitive to multicollinearity, such as a random forest.
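As a rough illustration of the evaluation metric described above, the sketch below computes a micro-averaged F1-score by pooling counts over the five degrees. The function name and the example label lists are made up for illustration. When every person receives exactly one predicted degree and all degrees are counted, each error adds one false positive and one false negative, so the pooled score coincides with plain accuracy.

```python
def micro_f1(y_true, y_pred, labels=(0, 1, 2, 3, 4)):
    """Micro-averaged F1: pool TP, FP, FN over all labels, then compute F1."""
    tp = fp = fn = 0
    for label in labels:
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1  # predicted this degree and it was correct
            elif p == label:
                fp += 1  # predicted this degree but the truth differs
            elif t == label:
                fn += 1  # truth is this degree but another was predicted
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical holdout labels and predictions (not from the project).
y_true = [0, 0, 0, 1, 2, 3, 4, 0]
y_pred = [0, 0, 1, 1, 2, 2, 4, 0]

score = micro_f1(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

The same function can score both the ordinal model and the baseline logistic model on the same holdout set, which is the comparison the slide proposes.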