Primer on Statistics for Interventional Cardiologists Giuseppe Sangiorgi, MD Pierfrancesco Agostoni, MD Giuseppe Biondi-Zoccai, MD.

What you will learn Introduction Basics Descriptive statistics Probability distributions Inferential statistics Finding differences in mean between two groups Finding differences in mean between more than 2 groups Linear regression and correlation for bivariate analysis Analysis of categorical data (contingency tables) Analysis of time-to-event data (survival analysis) Advanced statistics at a glance Conclusions and take home messages

What you will learn Linear regression and correlation for bivariate analysis –Simple linear regression –Regression diagnostics –Correlation analysis –Non-parametric alternatives: Spearman rho

Regression: How can I assess the quantitative impact of dilation pressure during stenting on final minimum lumen diameter (MLD)? In other words, can I quantitatively predict the change in a dependent variable given a specific change in an independent variable?

Plotting the data beforehand is pivotal. [Scatterplot: minimum lumen diameter (mm) against dilation pressure during stenting (atm)]

Regression: We cannot define a specific mathematical function (eg F = m*a), because there is no exact relationship. Regression describes a relationship that is not exact, in which a given value of the independent variable corresponds to a distribution of values of the dependent variable.

Regression analysis: It models a continuous dependent variable as a function of a continuous independent variable. In the regression equation the dependent variable is expressed as a function of the independent variable, the corresponding parameters (constants to be estimated), and an error term (a random variable representing unexplained variation in the dependent variable). Parameters are estimated so as to give a "best fit" of the data, by means of the least squares method.

Linear regression [Figure: for each value of the independent variable there is a distribution of values of the dependent variable; the regression line passes through the averages of these distributions]

Linear regression: Through regression I can estimate the average value of the dependent variable given a specific value of the independent variable. To do so, I need a specific model: MLD = constant + β * dilation pressure, where β is the slope (angular coefficient) and shows the change in Y (MLD) for a unit change in X (dilation pressure). β is the parameter to estimate, in order to appraise whether it is different from zero (ie whether MLD changes steadily with dilation pressure). How can we estimate β?

Linear regression: Which of the different possible lines that I can graphically trace and compute is the best regression line? Intuitively, it is the line that minimizes the differences between observed values (yᵢ) and estimated values (ŷᵢ); with the least squares method, it is the sum of the squared differences that is minimized.
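The least squares estimate of the line can be sketched in a few lines of Python. This is only an illustration: the function name `fit_line` and the pressure/MLD values are hypothetical and are not taken from the slides.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares around the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx                      # estimate of beta
    intercept = mean_y - slope * mean_x    # estimate of the constant
    return intercept, slope


# Hypothetical data: dilation pressure (atm) and final MLD (mm)
pressure = [8, 10, 12, 14, 16, 18]
mld = [2.4, 2.6, 2.7, 3.0, 3.1, 3.3]

intercept, slope = fit_line(pressure, mld)
print(f"MLD = {intercept:.2f} + {slope:.2f} * dilation pressure")
```

With these made-up numbers the slope is the estimated change in MLD (mm) per 1 atm increase in dilation pressure.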

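Linear regression: Linear regression analysis computes a statistical test to assess whether the coefficient of the independent variable is significantly different from zero. If the p value of the test is lower than the chosen significance level (eg p<0.05), the coefficient is considered significantly different from zero, ie the data support a linear association between the two variables. A significant test does not by itself show that the model assumptions hold.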
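As a rough sketch of what such a test involves, the slope can be divided by its standard error to obtain a t statistic with n - 2 degrees of freedom. The data and the function name `slope_t_statistic` below are hypothetical; the critical value 2.776 is the standard two-sided 5% value of the t distribution with 4 degrees of freedom, hard-coded here because the Python standard library has no t distribution.

```python
import math


def slope_t_statistic(x, y):
    """Return (slope, standard error of the slope, t statistic) for simple linear regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares and residual variance with n - 2 degrees of freedom
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    residual_variance = rss / (n - 2)
    se_slope = math.sqrt(residual_variance / sxx)
    return slope, se_slope, slope / se_slope


# Hypothetical data: dilation pressure (atm) and final MLD (mm), n = 6
pressure = [8, 10, 12, 14, 16, 18]
mld = [2.4, 2.6, 2.7, 3.0, 3.1, 3.3]

slope, se_slope, t_stat = slope_t_statistic(pressure, mld)

# Two-sided 5% critical value of t with n - 2 = 4 degrees of freedom (from a t table)
T_CRITICAL_DF4 = 2.776
significant = abs(t_stat) > T_CRITICAL_DF4
print(f"slope = {slope:.3f}, SE = {se_slope:.4f}, t = {t_stat:.2f}, significant: {significant}")
```

In practice a statistical package reports the exact p value; the comparison against a tabulated critical value is used here only to keep the sketch dependency-free.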

Linear regression: different models and precisions

Linear regression: After squaring and summing the differences, the relationship becomes: Total deviance = Residual deviance + Regression deviance. The ratio of regression deviance to total deviance is R²; the related F statistic (regression deviance relative to residual deviance, each divided by its degrees of freedom) can be used to test the statistical significance of the regression model, ie the null hypothesis that β equals zero.

Linear regression: The best index of regression accuracy is the coefficient of determination, R². It varies between 0 (no accuracy) and 1.0 (perfect accuracy). In other words, R² expresses the proportion of the variability of the dependent variable that is statistically explained by variation in the independent variable. Beware of R² > 0.90 in biology: in most cases such values are fraudulent.
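The decomposition of the deviance and the resulting R² can be checked numerically. The sketch below uses hypothetical data and a hypothetical helper `deviance_decomposition`; it also returns the F statistic for a model with one independent variable, computed as regression deviance divided by residual deviance over n - 2.

```python
def deviance_decomposition(x, y):
    """Return (total, residual, regression, r_squared, f_statistic) for simple linear regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * xi for xi in x]

    total = sum((yi - mean_y) ** 2 for yi in y)                   # observed vs mean
    residual = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # observed vs fitted
    regression = sum((fi - mean_y) ** 2 for fi in fitted)         # fitted vs mean

    r_squared = regression / total
    # One independent variable: 1 and n - 2 degrees of freedom
    f_statistic = regression / (residual / (n - 2))
    return total, residual, regression, r_squared, f_statistic


# Hypothetical data: dilation pressure (atm) and final MLD (mm)
pressure = [8, 10, 12, 14, 16, 18]
mld = [2.5, 2.3, 2.9, 2.7, 3.3, 3.0]

total, residual, regression, r_squared, f_statistic = deviance_decomposition(pressure, mld)
print(f"total = {total:.4f}, residual = {residual:.4f}, regression = {regression:.4f}")
print(f"R^2 = {r_squared:.3f}, F = {f_statistic:.2f}")
```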

The difference between observed values and estimated values can be defined by: Linear regression

Regression Mauri et al, Circulation 2005

Regression Mauri et al, Circulation 2005

Regression Mauri et al, Circulation 2005

Regression diagnostics: Once a regression model has been constructed, it may be important to confirm the goodness of fit of the model and its statistical significance. Common checks of goodness of fit are R², analyses of the pattern of residuals (which must be randomly and normally distributed, with constant variance), and hypothesis testing. Statistical significance can be checked by an F-test of the overall fit, followed by t-tests of individual parameters. Interpretations of these diagnostic tests rest heavily on the model assumptions.

Regression diagnostics: Although examination of the residuals can be used to invalidate a model, the results of a t-test or F-test are sometimes more difficult to interpret if the model's assumptions are violated. If the error term does not have a normal distribution, in small samples the estimated parameters will not follow normal distributions, which complicates inference. With relatively large samples, however, the central limit theorem (CLT) can be invoked, so that hypothesis testing may proceed using asymptotic approximations.

Residuals: Residuals are the differences between the observed and the predicted values of Y at each value of X. They should be randomly and normally distributed, without any apparent trend or curvature. The plot of the residuals against X provides a visual assessment of the distribution of the residuals: this distribution should appear random (Crawley's "sky at night") if the model reasonably predicts the trend in Y.
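A minimal sketch of how residuals are obtained, assuming a simple least-squares fit and hypothetical data (the helper name `residuals` is illustrative). The printed table is what one would plot against X to look for trend or curvature.

```python
def residuals(x, y):
    """Return the residuals (observed minus predicted Y) of the least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical data: dilation pressure (atm) and final MLD (mm)
pressure = [8, 10, 12, 14, 16, 18]
mld = [2.5, 2.3, 2.9, 2.7, 3.3, 3.0]

res = residuals(pressure, mld)
print("pressure  residual")
for xi, ei in zip(pressure, res):
    print(f"{xi:>8}  {ei:+.3f}")
```

By construction, least-squares residuals sum to zero and are uncorrelated with X, so a visible pattern in the plot points to a problem with the model rather than with the arithmetic.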

Residual plots

Checklist for linear regression: To check that linear regression is an appropriate analysis for these data, ask yourself these questions.
Q1: Can the relationship between X and Y be graphed as a straight line? In many experiments the relationship between X and Y is curved, making linear regression inappropriate. Either transform the data, or use a program that can perform nonlinear curve fitting.
Q2: Is the scatter of data around the line Gaussian (at least approximately)? Linear regression analysis assumes that the scatter is Gaussian.
Q3: Is the variability the same everywhere? Linear regression assumes that the scatter of points around the best-fit line has the same standard deviation all along the line. The assumption is violated if the points with high or low X values tend to be further from the best-fit line. The assumption that the standard deviation is the same everywhere is termed homoscedasticity.

Checklist for linear regression (continued):
Q4: Do you know the X values precisely? The linear regression model assumes that X values are exactly correct, and that experimental error or biological variability only affects the Y values. This is rarely the case, but it is sufficient to assume that any imprecision in measuring X is very small compared to the variability in Y.
Q5: Are the data points independent? Whether one point is above or below the line should be a matter of chance, and should not influence whether another point is above or below the line.
Q6: Are the X and Y values intertwined? If the value of X is used to calculate Y (or the value of Y is used to calculate X), then linear regression calculations are invalid. One example would be a graph of midterm LVEF (X) vs. long-term LVEF (Y). Since the midterm LVEF is a component of the long-term LVEF, linear regression is not valid for these data.

Multiple linear regression: More than one independent variable can be included in the model, yielding a multiple linear regression model: Y = a + β₁X₁ + β₂X₂ + β₃X₃ + … Statistical analysis can even simultaneously appraise the quantitative contribution of each β!
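One way to see how several β coefficients are estimated at once is to solve the least-squares normal equations directly. The sketch below is a dependency-free illustration, not the procedure used by any particular package: the helpers `solve` and `fit_multiple`, the two predictors (dilation pressure and stent diameter), and the noise-free data are all hypothetical.

```python
def solve(a, b):
    """Solve the linear system a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_multiple(rows, y):
    """Least-squares fit of y = a + b1*x1 + b2*x2 + ...; returns [a, b1, b2, ...]."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 for the constant a
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical predictors: (dilation pressure in atm, stent diameter in mm)
rows = [(8, 2.5), (10, 3.0), (12, 2.5), (14, 3.5), (16, 3.0), (18, 4.0)]
# Noise-free outcome generated from known coefficients, so the fit can be checked
y = [0.5 + 0.05 * p + 0.6 * d for p, d in rows]

coefficients = fit_multiple(rows, y)
print([round(c, 3) for c in coefficients])
```

Because the outcome was generated without noise, the fit recovers the generating coefficients; with real data each β would come with its own standard error and test.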

Correlation: In simple linear regression, the square root of the coefficient of determination (R²), taken with the sign of the slope, is the Pearson correlation coefficient (R). It shows the degree of linear association between 2 continuous variables, but disregards causation. It takes values between -1.0 (perfect negative association) and +1.0 (perfect positive association), with 0 indicating no linear association. It can be summarized as a point estimate, with its standard error, 95% confidence interval, and p value. [Portrait: K. Pearson]
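The link between the correlation coefficient and R² can be verified numerically. The sketch below computes Pearson's R from its definition and compares its square with the R² of the corresponding simple regression; the data and the function name `pearson_r` are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: dilation pressure (atm) and final MLD (mm)
pressure = [8, 10, 12, 14, 16, 18]
mld = [2.5, 2.3, 2.9, 2.7, 3.3, 3.0]

r = pearson_r(pressure, mld)

# R^2 of the simple regression of MLD on pressure, computed independently
n = len(pressure)
mean_x = sum(pressure) / n
mean_y = sum(mld) / n
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(pressure, mld))
         / sum((xi - mean_x) ** 2 for xi in pressure))
intercept = mean_y - slope * mean_x
rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(pressure, mld))
total = sum((yi - mean_y) ** 2 for yi in mld)
r_squared = 1.0 - rss / total

print(f"R = {r:.3f}, R squared = {r * r:.3f}, regression R^2 = {r_squared:.3f}")
```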

Regression and correlation Briguori et al, Eur Heart J 2002

Regression and correlation Briguori et al, Eur Heart J 2002

Correlation Escolar et al, AJC 2007

Correlation Escolar et al, AJC 2007

What about non-linear associations? [Figure: scatterplots of different patterns; each number corresponds to the correlation coefficient for linear association (R)]
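A small numerical illustration of the point, using made-up data: Y is completely determined by X through a parabola, yet the linear correlation coefficient is zero because the association is not linear. The function name `pearson_r` is illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


# Y depends exactly on X, but through a symmetric, non-linear function
x = [-3, -2, -1, 0, 1, 2, 3]
y = [xi ** 2 for xi in x]

r_parabola = pearson_r(x, y)
print(f"R for y = x^2 on symmetric x: {r_parabola:.3f}")
```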

Dangers of not plotting data: four sets of data, all with the same R=0.81
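The slide does not name its data sets; the sketch below assumes they are Anscombe's quartet, the classic example of four very different scatterplots sharing nearly the same correlation coefficient. The function name `pearson_r` is illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


# Anscombe's quartet (assumed to be the four data sets shown on the slide)
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

correlations = {name: pearson_r(x, y) for name, (x, y) in quartet.items()}
for name, r in correlations.items():
    print(f"Data set {name}: R = {r:.3f}")
```

Only a scatterplot reveals that one set is roughly linear, one is curved, one is linear with an outlier, and one is driven by a single extreme point.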

Pearson vs Spearman: Whenever the independent and dependent variables can both be assumed to be normally distributed, the Pearson linear correlation method can be used, maximizing statistical power. Whenever the data are sparse and/or not normally distributed, the non-parametric Spearman correlation method should be used, which yields the rank correlation coefficient (rho), but not an R². [Portrait: C. Spearman]
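Spearman's rho can be sketched as the Pearson coefficient computed on ranks, with tied values given the average of their ranks. The helper names and the data below are hypothetical; the example uses a monotonic but non-linear relationship to show where the two coefficients differ.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical monotonic but non-linear relationship
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(f"Spearman rho = {rho:.3f}, Pearson R = {r:.3f}")
```

Because the ranks of X and Y agree perfectly, rho reaches 1 even though the points do not lie on a straight line and Pearson's R stays below 1.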

Spearman rho Abbate et al, JACC 2003

Spearman rho Abbate et al, JACC 2003

Regression and correlation: do-it-yourself with SPSS

Linear regression

Scatterplot

Correlation

Thank you for your attention For any correspondence: For further slides on these topics feel free to visit the metcardio.org website: