Measuring of Correlation

Presentation transcript:

Measuring of Correlation

Definition Correlation is a measure of mutual correspondence between two variables and is denoted by the coefficient of correlation.

Statistical Analysis in a Simple Experiment
Define the population of interest.
Randomly select a sample of subjects to study.
Half the subjects receive one treatment and the other half another treatment (usually placebo).
Measure baseline variables in each group.
Use statistical techniques to make inferences about the distribution of the variables in the general population and about the effect of the treatment.

Randomization Randomization is the process of making something random. This can mean: generating a random permutation of a sequence (such as when shuffling cards); selecting a random sample of a population (important in statistical sampling); generating random numbers; or transforming a data stream using a scrambler in telecommunications.

Applications and characteristics a) The simple correlation coefficient, also called Pearson's product-moment correlation coefficient, indicates the extent to which two variables change with one another in a linear fashion.

Applications and characteristics b) The correlation coefficient can range from -1 to +1 and is unitless (Fig. A, B, C).

Applications and characteristics c) When the correlation coefficient approaches -1, a change in one variable is more strongly associated with an inverse linear change (i.e., a change in the opposite direction) in the other variable.

Randomization Techniques Although historically "manual" randomization techniques (such as shuffling cards, drawing pieces of paper from a bag, spinning a roulette wheel) were common, nowadays automated techniques are mostly used. As both selecting random samples and random permutations can be reduced to simply selecting random numbers, random number generation methods are now most commonly used, both hardware random number generators and pseudo-random number generators.
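As an illustration of how permutations and samples reduce to drawing random numbers, here is a minimal sketch using Python's pseudo-random number generator. The function names, the seed and the list of ten subject numbers are assumptions made for this example; the slides do not prescribe a particular algorithm. The shuffle used is the standard Fisher-Yates method.

```python
import random


def shuffle_with_random_numbers(items, rng):
    """Return a random permutation of items (Fisher-Yates shuffle).

    Each step only needs one random integer, so the whole permutation
    is built from random numbers alone.
    """
    result = list(items)
    for i in range(len(result) - 1, 0, -1):
        j = rng.randrange(i + 1)  # random integer in 0..i
        result[i], result[j] = result[j], result[i]
    return result


def sample_with_random_numbers(population, k, rng):
    """Random sample of size k without replacement: the first k items of a random permutation."""
    return shuffle_with_random_numbers(population, rng)[:k]


rng = random.Random(42)          # arbitrary seed, used only for reproducibility
subjects = list(range(1, 11))    # hypothetical subject numbers 1..10
permutation = shuffle_with_random_numbers(subjects, rng)
sample = sample_with_random_numbers(subjects, 4, rng)
print(permutation)
print(sample)
```

In practice the library calls `random.shuffle` and `random.sample` do the same job; the hand-written version only makes the reduction to random numbers visible.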

Applications and characteristics d) When the correlation coefficient equals zero, there is no linear association between the changes of the two variables.

Types of Data
Categorical data: values belong to categories.
Nominal data: there is no natural order to the categories, e.g. blood groups.
Ordinal data: there is a natural order, e.g. adverse events (Mild/Moderate/Severe/Life Threatening).
Binary data: there are only two possible categories, e.g. alive/dead.
Numerical data: the value is a number (either measured or counted).
Continuous data: measurement is on a continuum, e.g. height, age, haemoglobin.
Discrete data: a "count" of events, e.g. number of pregnancies.

Applications and characteristics e) When the correlation coefficient approaches +1, a change in one variable is more strongly associated with a direct linear change in the other variable.

Applications and characteristics A correlation coefficient can be calculated validly only when both variables are subject to random sampling and each is chosen independently.
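The properties listed above can be checked numerically. The following is a minimal sketch of Pearson's product-moment coefficient in its usual textbook form (sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations). The function name `pearson_r` and the three small data sets are assumptions made for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient for paired data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


x = [1, 2, 3, 4, 5]
direct = pearson_r(x, [2, 4, 6, 8, 10])   # y rises linearly with x
inverse = pearson_r(x, [10, 8, 6, 4, 2])  # y falls linearly as x rises
curved = pearson_r(x, [1, 2, 3, 2, 1])    # y rises, then falls
print(direct, inverse, curved)
```

The third data set gives a coefficient of zero even though y clearly depends on x, which is why a zero coefficient is read as the absence of a linear association only.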

Correlation coefficient


Types of correlation The following types of correlation (relation) exist between phenomena and characteristics in nature: a) a cause-and-effect connection, that is, a connection between factors and phenomena, between a factor characteristic and a result characteristic; b) the dependence of parallel changes in several characteristics on some third quantity.

Quantitative types of connection A functional connection is one in which every value of one characteristic corresponds to a strictly defined value of the second (for example, a given radius of a circle corresponds to a definite area of the circle).

Quantitative types of connection A correlation connection is one in which each value of one characteristic corresponds to several values of the other characteristic associated with it, scattered around their average (for example, height and body mass are known to be linked: in a group of persons of identical height there are different body masses, but these body masses vary within certain limits around their average value).

Correlative connection A correlative connection implies a dependence between phenomena that does not have a clear functional character. It shows up only in a mass of observations, that is, in the totality. Establishing a correlative connection presupposes identifying the causal connection that confirms the dependence of one phenomenon on the other.

Correlative connection By direction (character), a correlative connection can be direct or reverse. A coefficient of correlation that characterizes a direct connection carries the plus sign (+), and one that characterizes a reverse connection carries the minus sign (-). By strength, a correlative connection can be strong, average or weak; it can also be complete or absent.

Estimation of correlation by coefficient of correlation

Strength of connection   Direct (+)           Reverse (-)
Complete                 +1                   -1
Strong                   from +1 to +0.7      from -1 to -0.7
Average                  from +0.7 to +0.3    from -0.7 to -0.3
Weak                     from +0.3 to 0       from -0.3 to 0
No connection            0                    0
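The scale above can be applied mechanically. The sketch below is one possible reading of it: the function name `describe_correlation` is hypothetical, and because the source ranges share their end points, assigning the boundary values 0.7 and 0.3 to the stronger category is an assumption of this example.

```python
def describe_correlation(r):
    """Label a correlation coefficient by strength and direction."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and +1")
    if r == 0:
        return "no connection"
    direction = "direct" if r > 0 else "reverse"
    size = abs(r)
    if size == 1.0:
        strength = "complete"
    elif size >= 0.7:   # assumption: 0.7 itself counts as strong
        strength = "strong"
    elif size >= 0.3:   # assumption: 0.3 itself counts as average
        strength = "average"
    else:
        strength = "weak"
    return f"{strength} {direction}"


for r in (0.85, -0.5, 0.1, -1.0, 0.0):
    print(r, describe_correlation(r))
```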

Types of correlative connection By direction: direct (+): as one characteristic increases, the mean value of the other increases; reverse (-): as one characteristic increases, the mean value of the other decreases.

Types of correlative connection By character: rectilinear: relatively uniform changes in the mean values of one characteristic are accompanied by equal changes in the other (for example, minimal and maximal arterial pressure); curvilinear: with a uniform change in one characteristic, the mean values of the other may either increase or decrease.

Average Values
Mean: the average of the data; sensitive to outlying data.
Median: the middle of the data; not sensitive to outlying data.
Mode: the most commonly occurring value.
Range: the difference between the largest observation and the smallest.
Interquartile range: the spread of the data; commonly used for skewed data.
Standard deviation: a single number which measures how much the observations vary around the mean.
Symmetrical data: data that follow a normal distribution (mean = median = mode); report mean, standard deviation and n.
Skewed data: not normally distributed (mean ≠ median ≠ mode); report median and interquartile range.
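These summary measures can be computed with Python's standard `statistics` module. The data set below is hypothetical and deliberately contains one outlying value, so that the sensitivity of the mean (and the robustness of the median) is visible. Note that several conventions exist for quartiles; this sketch simply uses the module's default method.

```python
import statistics

data = [2, 3, 3, 4, 5, 6, 30]  # hypothetical observations; 30 is an outlier

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
data_range = max(data) - min(data)

# Quartiles with the module's default ("exclusive") method.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

sd = statistics.stdev(data)  # sample standard deviation (divides by n - 1)

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("range:", data_range)
print("interquartile range:", iqr)
print("standard deviation:", sd)
```

Because the data are skewed by the outlier, the mean is pulled well above the median, which is the situation in which the median and interquartile range are the recommended summary.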

Average Values The limit is defined by the extreme variants of a variation series: lim = Vmin to Vmax

Average Values The amplitude is the difference between the extreme variants of a variation series: Am = Vmax - Vmin

Average Values The average quadratic deviation (standard deviation) characterizes the dispersion of the variants around the mean value (the internal structure of the totality).

Average quadratic deviation (σ): simple arithmetical method

Average quadratic deviation d = V - M is the true deviation of each variant (V) from the true arithmetic mean (M)

Average quadratic deviation: method of moments

The average quadratic deviation is needed for: 1. Estimating how typical the arithmetic mean is (M is typical for the series if σ is less than 1/3 of the mean value). 2. Obtaining the error of the mean value. 3. Determining the average norm of the phenomenon under study (M±1σ), the subnorm (M±2σ) and extreme deviations (M±3σ). 4. Constructing a sigma grid for assessing the physical development of an individual.
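A short sketch of the first and third uses follows. It assumes the common population form of the average quadratic deviation, the square root of the mean squared deviation d = V - M; the slides' own formula is not reproduced in this transcript, so this form, the function name and the height values are all assumptions of the example.

```python
import math


def mean_and_sigma(variants):
    """Arithmetic mean M and average quadratic deviation sigma.

    Assumes sigma = sqrt(sum(d^2) / n) with d = V - M (population form).
    """
    n = len(variants)
    m = sum(variants) / n
    deviations = [v - m for v in variants]  # d = V - M
    sigma = math.sqrt(sum(d * d for d in deviations) / n)
    return m, sigma


heights = [160, 165, 170, 175, 180]  # hypothetical heights in cm
M, sigma = mean_and_sigma(heights)

# Use 1: the mean is considered typical if sigma is less than 1/3 of it.
mean_is_typical = sigma < M / 3

# Use 3: norm, subnorm and extreme-deviation bands around the mean.
bands = {
    "norm (M±1σ)": (M - sigma, M + sigma),
    "subnorm (M±2σ)": (M - 2 * sigma, M + 2 * sigma),
    "extreme (M±3σ)": (M - 3 * sigma, M + 3 * sigma),
}

print(M, sigma, mean_is_typical)
for name, (low, high) in bands.items():
    print(name, round(low, 1), round(high, 1))
```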

Average quadratic deviation The dispersion of the variants around the mean is characterized by the average quadratic deviation (σ)

The coefficient of variation is a relative measure of variability; it is the ratio of the standard deviation to the arithmetic mean, expressed as a percentage.
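Because the coefficient of variation is unit-free, it allows the variability of quantities measured in different units to be compared. The sketch below assumes the population standard deviation (`statistics.pstdev`); the function name and both data sets are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Standard deviation as a percentage of the arithmetic mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.pstdev(values) / mean * 100


heights_cm = [160, 165, 170, 175, 180]  # hypothetical
weights_kg = [50, 60, 70, 80, 90]       # hypothetical

cv_height = coefficient_of_variation(heights_cm)
cv_weight = coefficient_of_variation(weights_kg)
print(round(cv_height, 1), round(cv_weight, 1))
```

In this made-up example body mass varies relatively more than height, even though the two are measured in different units.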

Terms Used To Describe The Quality Of Measurements Reliability is the variability between subjects divided by the sum of between-subject variability and measurement error. Validity refers to the extent to which a test or surrogate is measuring what we think it is measuring.

Measures Of Diagnostic Test Accuracy Sensitivity is defined as the ability of the test to identify correctly those who have the disease. Specificity is defined as the ability of the test to identify correctly those who do not have the disease. Predictive values are important for assessing how useful a test will be in the clinical setting at the individual patient level. The positive predictive value is the probability of disease in a patient with a positive test. Conversely, the negative predictive value is the probability that the patient does not have disease if he has a negative test result. Likelihood ratio indicates how much a given diagnostic test result will raise or lower the odds of having a disease relative to the prior probability of disease.
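These measures all follow from the four cells of a 2x2 table of test result against true disease status. The sketch below uses the standard definitions; the function name and the counts (1000 hypothetical patients, 10% of whom have the disease) are assumptions for illustration, and no guard is included for empty rows or columns.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Accuracy measures from a 2x2 table.

    tp: diseased, test positive     fn: diseased, test negative
    fp: not diseased, test positive tn: not diseased, test negative
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


# Hypothetical counts: 100 diseased and 900 non-diseased patients.
result = diagnostic_accuracy(tp=90, fp=45, fn=10, tn=855)

# The likelihood ratio converts prior (pre-test) odds into post-test odds.
prevalence = (90 + 10) / 1000
prior_odds = prevalence / (1 - prevalence)
post_odds = prior_odds * result["lr_positive"]
post_probability = post_odds / (1 + post_odds)

for name, value in result.items():
    print(name, round(value, 3))
print("post-test probability after a positive test:", round(post_probability, 3))
```

When the prior probability equals the prevalence in the table, the post-test probability after a positive result coincides with the positive predictive value.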


Expressions Used When Making Inferences About Data
Confidence intervals: The results of any study sample are an estimate of the true value in the entire population. The true value may actually be greater or less than what is observed.
Type I error (alpha) is the probability of incorrectly concluding there is a statistically significant difference in the population when none exists.
Type II error (beta) is the probability of incorrectly concluding that there is no statistically significant difference in a population when one exists.
Power is a measure of the ability of a study to detect a true difference.
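To make the idea of a confidence interval concrete, here is a minimal sketch for the mean of a sample. It assumes the normal approximation (mean plus or minus a z multiple of the standard error); for small samples a t-based interval would usually be preferred. The function name and the ten measurements are hypothetical.

```python
import math
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Approximate confidence interval for the population mean (normal approximation)."""
    n = len(sample)
    mean = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    # z multiplier for a two-sided interval, about 1.96 for 95%
    z = statistics.NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return mean - z * standard_error, mean + z * standard_error


sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.5, 4.7, 5.2, 5.9]  # hypothetical measurements
low95, high95 = mean_confidence_interval(sample, 0.95)
low99, high99 = mean_confidence_interval(sample, 0.99)
print("95% CI:", round(low95, 2), round(high95, 2))
print("99% CI:", round(low99, 2), round(high99, 2))
```

Asking for more confidence widens the interval: the 99% interval always contains the 95% one.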

Multivariable Regression Methods Multiple linear regression is used when the outcome data is a continuous variable such as weight. For example, one could estimate the effect of a diet on weight after adjusting for the effect of confounders such as smoking status. Logistic regression is used when the outcome data is binary such as cure or no cure. Logistic regression can be used to estimate the effect of an exposure on a binary outcome after adjusting for confounders.

Survival Analysis Kaplan-Meier analysis measures the ratio of surviving subjects (or those without an event) divided by the total number of subjects at risk for the event. Every time a subject has an event, the ratio is recalculated. These ratios are then used to generate a curve to graphically depict the probability of survival. Cox proportional hazards analysis is similar to the logistic regression method described above with the added advantage that it accounts for time to a binary event in the outcome variable. Thus, one can account for variation in follow-up time among subjects.
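The recalculation described for Kaplan-Meier analysis can be sketched as follows, using the standard product-limit form: at each time where an event occurs, the running survival probability is multiplied by the fraction of at-risk subjects who did not have the event. The function name, the follow-up times and the event flags are hypothetical, and subjects censored at an event time are treated as still at risk at that time (a common convention, assumed here).

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimates.

    times:  follow-up time for each subject
    events: True if the subject had the event, False if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Collect every subject whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                n_events += 1
            n_leaving += 1
            i += 1
        if n_events:
            survival *= (n_at_risk - n_events) / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving  # events and censored subjects leave the risk set
    return curve


# Hypothetical follow-up in months for seven subjects.
times = [2, 3, 3, 5, 8, 8, 10]
events = [True, True, False, True, False, True, False]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 3))
```

Censored subjects contribute to the number at risk up to their last follow-up time, which is how the method accounts for variation in follow-up time among subjects.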

Kaplan-Meier Survival Curves

Why Use Statistics?

Descriptive Statistics
Identifies patterns in the data.
Identifies outliers.
Guides the choice of statistical test.

Types of descriptive statistics: graphs, measures of central tendency, measures of variability.

Statistical Analysis in a Simple Experiment
Define the population of interest.
Randomly select a sample of subjects to study (exclusion criteria, but define a precise patient population).
Half the subjects receive one treatment and the other half another treatment (usually placebo).
Measure baseline variables in each group (e.g. age, APACHE II, to ensure randomisation was successful).
Use statistical techniques to make inferences about the distribution of the variables in the general population and about the effect of the treatment.

Percentage of Specimens Testing Positive for RSV (respiratory syncytial virus)

Descriptive Statistics

Data
Categorical data: values belong to categories.
Nominal data: there is no natural order to the categories, e.g. blood groups.
Ordinal data: there is a natural order, e.g. adverse events (Mild/Moderate/Severe/Life Threatening).
Numerical data: the value is a number (either measured or counted).

Distribution of Course Grades

Measures of Dispersion: range, standard deviation, skewness