PTP 565 Fundamental Tests and Measures Thomas Ruediger, PT, DSc, OCS, ECS Statistics Overview.

Outline Statistic(s) Central Tendency Distribution Standard Error Referencing Sources of Errors Reliability Validity – Sensitivity/Specificity – Likelihood Ratios Receiver Operating Characteristics (ROC) Curves Clinical Utility

Statistic(s) A statistic – “Single numerical value or index…” (Rothstein and Echternach) Index – a number or ratio (a value on a scale of measurement) derived from a series of observed facts (wordnet.princeton.edu/perl/webwn) Descriptive or inferential? – Descriptive: what we did and what we saw – Inferential: what you should expect in the general population Examples – 61.5 kg, 0.75, 0.25, 3.91 GPA, i.e., numbers and ratios

Central Tendency What is an average? – Mean? μ for population, x̄ for sample – Median? – Mode? Which do we use for each of these? Distribution of Names = mode (nominal, counting) Distribution of Ages = it depends Distribution of Gender = mode (nominal, counting) Distribution of Body Mass Distribution of Strength How is it calculated? – Mean: sum/n – Median: middle value (or the mean of the middle two values) – Mode: most frequent value
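As a rough illustration of the three calculations on this slide, the sketch below uses Python's standard `statistics` module. The list of ages is hypothetical and only serves to show sum/n, the middle value, and the most frequent value.

```python
from statistics import mean, median, multimode

# Hypothetical ages in years (illustrative data, not from the slides)
ages = [22, 23, 23, 24, 25, 27, 31]

age_mean = mean(ages)        # sum / n
age_median = median(ages)    # middle value of the sorted list
age_modes = multimode(ages)  # list of the most frequent value(s)

print(age_mean, age_median, age_modes)
```

With a long right tail (one much older person, for example), the mean would shift upward more than the median, which is one reason the best choice for ages "depends".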

Bell Curve 68.2% +/- 1 SD 95.4% +/- 2SD 99.7% +/- 3SD Mu=mean of population
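The percentages on this slide can be checked numerically. The sketch below assumes a perfectly normal distribution and uses the error function from Python's `math` module; the helper name `fraction_within` is made up for illustration.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution lying within +/- k SD of the mean."""
    return math.erf(k / math.sqrt(2))

within = {k: fraction_within(k) for k in (1, 2, 3)}

for k, fraction in within.items():
    print(f"+/- {k} SD: {fraction:.1%}")
```

The slide's 68.2% and 95.4% are the same values truncated rather than rounded.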

Variability (of the Population) How measurements differ from each other – Measured from the mean In total, these differences always sum to zero Variance handles this – Sum of squared deviations – Divided by the number of measurements – σ² for population variance Standard deviation – Square root of variance – σ for population SD

Variability (of the Sample, not Population) How measurements differ from each other – Measured from the mean In total, these differences always sum to zero Variance handles this – Sum of squared deviations – Divided by (the number of measurements – 1) – s² for sample variance (now an estimate) – Also called an “unbiased estimate of the parameter σ²” (P & W p 396) Standard deviation – Square root of variance – s for sample standard deviation

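Calculating Variance and SD Data: 1, 3, 5, 7, 9 (mean = 5) Squared deviations from the mean: (5−1)² = 16, (5−3)² = 4, (5−5)² = 0, (5−7)² = 4, (5−9)² = 16 Sum of squared deviations: 16 + 4 + 0 + 4 + 16 = 40 Population variance: 40/5 = 8 Population SD: √8 ≈ 2.83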
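A short sketch of this calculation for the data 1, 3, 5, 7, 9, written out step by step and then checked against Python's `statistics` module. It also shows the sample version (dividing by n − 1) from the previous slide for comparison.

```python
from statistics import pvariance, variance

data = [1, 3, 5, 7, 9]
n = len(data)
mean_value = sum(data) / n                               # 5.0

deviations = [x - mean_value for x in data]              # these sum to zero
sum_of_squares = sum(d ** 2 for d in deviations)         # 40.0

pop_variance = sum_of_squares / n                        # divide by n
pop_sd = pop_variance ** 0.5
sample_variance = sum_of_squares / (n - 1)               # divide by n - 1
sample_sd = sample_variance ** 0.5

# Cross-check against the standard library
assert pop_variance == pvariance(data)
assert sample_variance == variance(data)

print(pop_variance, round(pop_sd, 2), sample_variance, round(sample_sd, 2))
```

The variance is already in squared units, so the SD is its square root, not the square root of the variance squared.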

Skewed distributions

Mode=15 Median=15.26 Mean=15.6

Skewness The amount of asymmetry of the distribution Kurtosis The peakedness of the distribution

Standard error of measurement (SEM) Product of the standard deviation of the data set and the square root of (1 − ICC) – SEM = SD × √(1 − ICC) An indication of the precision of the score The SEM is used to construct a confidence interval (CI) around a single measurement, within which the true score is estimated to lie The 95% CI around the observed score would be: observed score ± 1.96 × SEM – Nearly ± 2 SEM, but not quite Weir JP. Quantifying test-retest reliability using the intraclass correlation coefficient and the SEM. J Strength Cond Res. Feb 2005;19(1):
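A minimal sketch of the SEM formula and the 95% confidence interval described on this slide. The SD of 10, ICC of 0.91, and observed score of 50 are hypothetical values chosen to keep the arithmetic simple, and the function names are made up for illustration.

```python
import math

def standard_error_of_measurement(sd, icc):
    """SEM = SD * sqrt(1 - ICC)."""
    return sd * math.sqrt(1 - icc)

def ci95(observed, sem):
    """95% confidence interval around a single observed score."""
    margin = 1.96 * sem
    return observed - margin, observed + margin

# Hypothetical values: SD = 10, ICC = 0.91, observed score = 50
sem = standard_error_of_measurement(sd=10.0, icc=0.91)
low, high = ci95(50.0, sem)

print(round(sem, 2), round(low, 2), round(high, 2))
```

Note how a higher ICC shrinks the SEM: with a perfectly reliable measure (ICC = 1) the interval collapses onto the observed score.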

Minimum detectable difference (MDD)? SEM doesn’t take into account the variability of a second measure SEM is therefore not adequate to compare paired values for change Of course there is a way to handle this: MDD = 1.96 × SEM × √2 Weir JP. Quantifying test-retest reliability using the intraclass correlation coefficient and the SEM. J Strength Cond Res. Feb 2005;19(1): Eliasziw M, Young SL, Woodbury MG, Fryday-Field K. Statistical methodology for the concurrent assessment of interrater and intrarater reliability: using goniometric measurements as an example. Phys Ther. Aug 1994;74(8):
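The MDD formula on this slide can be sketched as follows. The SEM of 3.0 is a hypothetical value, and the function name is made up for illustration.

```python
import math

def minimum_detectable_difference(sem):
    """MDD = 1.96 * SEM * sqrt(2); the sqrt(2) accounts for error in both measurements."""
    return 1.96 * sem * math.sqrt(2)

# Hypothetical SEM of 3.0 units
mdd = minimum_detectable_difference(3.0)

print(round(mdd, 2))
```

Under these assumptions, a change between two measurements smaller than the MDD cannot be distinguished from measurement error at the 95% level.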

Standard error of the mean (S.E. mean) An estimate of the standard deviation of the distribution of sample means An indication of the sampling error Three points relative to the sample – The sample is a representation of the larger population – The larger the sample, the smaller the error – If we take multiple samples, the distribution of the sample means looks like a bell-shaped curve Sample standard deviation divided by the square root of the sample size (s/√n) Equation 18.1 P & W
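A brief sketch of s/√n, using a small illustrative sample. The helper name `standard_error_of_mean` is an assumption for this example; `statistics.stdev` computes the sample SD with n − 1 in the denominator.

```python
import math
from statistics import stdev

def standard_error_of_mean(sample):
    """Sample SD divided by the square root of the sample size (s / sqrt(n))."""
    return stdev(sample) / math.sqrt(len(sample))

sample = [1, 3, 5, 7, 9]             # illustrative data
se_small = standard_error_of_mean(sample)

# Repeating the same values four times keeps the spread similar but raises n
se_large = standard_error_of_mean(sample * 4)

print(round(se_small, 3), round(se_large, 3))
```

The second value is smaller, which matches the slide's point that a larger sample gives a smaller error.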

Normative Reference How does this datum compare to others? Gives you a comparison to the group Datum should be compared to a similar group – 55-year-old stroke patient vs. 25-year-old athlete? WRONG – 25-year-old soccer player vs. 25-year-old swimmer? CORRECT! Datum may (or may not) indicate capability – Strength is +3 SD above normal – Can he bench 200 kg?

Criterion Reference How does this datum compare to a standard? For example, in many graduate courses – All could earn an “A” – All could fail In contrast to norm referencing – Same group as above, but in a norm-referenced course – Some would be “A”, some “B”, some “C”… Criterion references are often used in PT for – Progression – Discharge

Percentiles 100 equal parts Relative position – 89th percentile – 89% below this Quartiles a common grouping – 25th (Q1), 50th (Q2), 75th (Q3), 100th (Q4) – Interquartile Range: distance between Q3 and Q1 (Q3 − Q1), the middle 50% – Semi-interquartile Range: half the interquartile range, a useful variability measure for skewed distributions
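A small sketch of quartiles, the interquartile range, and the semi-interquartile range, using `statistics.quantiles` from the Python standard library. The scores are hypothetical. Several quartile conventions exist, so other software may give slightly different cut points for the same data.

```python
from statistics import quantiles

# Hypothetical scores 1 through 11
scores = list(range(1, 12))

# n=4 returns the three cut points Q1, Q2 (the median), and Q3
q1, q2, q3 = quantiles(scores, n=4)

iqr = q3 - q1        # spread of the middle 50%
semi_iqr = iqr / 2   # half the interquartile range

print(q1, q2, q3, iqr, semi_iqr)
```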

Stanines STAndard NINE Nine-point scale Results are ranked lowest to highest Lowest 4% is stanine 1, highest 4% is stanine 9 Calculating Stanines: percentage of scores in stanines 1 through 9 is 4%, 7%, 12%, 17%, 20%, 17%, 12%, 7%, 4%
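One way to turn the stanine percentages into a lookup is sketched below. It assumes the input is a percentile rank between 0 and 100, and that a rank falling exactly on a boundary belongs to the lower stanine; that boundary rule is an assumption of this example, not something stated on the slide.

```python
from bisect import bisect_left
from itertools import accumulate

# Percentage of scores in stanines 1 through 9
STANINE_PERCENTAGES = [4, 7, 12, 17, 20, 17, 12, 7, 4]

# Cumulative upper bounds for stanines 1 through 8: 4, 11, 23, 40, 60, 77, 89, 96
UPPER_BOUNDS = list(accumulate(STANINE_PERCENTAGES))[:-1]

def stanine(percentile_rank):
    """Map a percentile rank (0-100) to a stanine (1-9)."""
    return bisect_left(UPPER_BOUNDS, percentile_rank) + 1

print(stanine(2), stanine(50), stanine(99))
```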

Sources of Measurement Error Systematic: ruler is 1 inch too short for true foot Random: usually cancels out Individual – Trained – Untrained The instrument – Right instrument – Same instrument Variability of the characteristic – Time of day – Pre or post therapy

Reliability Test-Retest – Attempt to control variation – Testing effects – Carryover effects Intra-rater – Can I (or you) get the same result two different times? Inter-rater – Can two testers obtain the same measurement? Required to have validity

Reliability ICC reflects both correlation and agreement – What PTs commonly use Kappa Others

Validity Not required for reliability The measurement measures what it is intended to measure Is not something an instrument simply has: it has to be valid for measuring “something” Is specific to the intended use Multiple types – Face – Content – Criterion-referenced: Concurrent, Predictive – Construct

Sensitivity and Specificity are components of validity

Sensitivity The true positive rate Sensitivity – Can the test find it if it’s there? Sensitivity increases as: – More with a condition are correctly classified – Fewer with the condition are missed Highly sensitive test is good for ruling out a disorder – If the result is negative – SnNout 1 − sensitivity = false negative rate Example: a test that labels everyone in a class as female finds every female (high sensitivity), but every male is then a “false positive”

Specificity The true negative rate Specificity – Can the test correctly come back negative if it isn’t there? Specificity increases as: – More without a condition are correctly classified – Fewer are falsely classified as having the condition Highly specific test is good for ruling in a disorder – If the result is positive – SpPin 1 − specificity = false positive rate

Likelihood Ratios Useful for confidence in our diagnosis Importance ↑ as they move away from 1 An LR of 1 is useless: the test result is equally likely with or without the condition – Negative LR: 0 to 1 – Positive LR: 1 to infinity LR+ = true positive rate / false positive rate LR− = false negative rate / true negative rate

2 × 2 table (test result vs. truth)
            Truth +    Truth −
Test +         a          b
Test −         c          d
Sn = a/(a+c)    Sp = d/(b+d)
PPV = a/(a+b)    NPV = d/(c+d)
LR+ = Sn/(1−Sp)    LR− = (1−Sn)/Sp
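The quantities from the 2 × 2 table can be computed together, as in the sketch below. The cell labels a, b, c, d follow the usual layout (a = true positives, b = false positives, c = false negatives, d = true negatives), and the counts are hypothetical. The function name is made up for illustration.

```python
def diagnostic_stats(a, b, c, d):
    """Diagnostic accuracy measures from a 2 x 2 table.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    Assumes no denominator is zero (e.g., specificity is neither 0 nor 1).
    """
    sn = a / (a + c)   # sensitivity (true positive rate)
    sp = d / (b + d)   # specificity (true negative rate)
    return {
        "Sn": sn,
        "Sp": sp,
        "PPV": a / (a + b),
        "NPV": d / (c + d),
        "LR+": sn / (1 - sp),   # true positive rate / false positive rate
        "LR-": (1 - sn) / sp,   # false negative rate / true negative rate
    }

# Hypothetical counts: 100 people with the condition, 100 without
stats = diagnostic_stats(a=90, b=20, c=10, d=80)

for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

In this made-up example the positive likelihood ratio is about 4.5 and the negative likelihood ratio about 0.125, both well away from 1.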

Receiver Operating Characteristics (ROC) Curves Tradeoff between missing cases and over diagnosing Tradeoff between signal and noise Well demonstrated graphically In the next slide you see the attempt to maximize the area under the curve P & W have an example on page 637

Receiver Operating Characteristics (ROC) Curves [Figure: ROC curve. Axis labels: true positive rate (aka sensitivity) and false positive rate (aka 1 − specificity)]
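To make the tradeoff concrete, the sketch below builds ROC points by sweeping a cutoff over a set of test scores and then estimates the area under the curve with the trapezoid rule. The scores and true labels are hypothetical, the function names are made up, and a higher score is assumed to indicate the condition.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per cutoff, starting at (0, 0)."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Lowering the cutoff step by step classifies more people as positive
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoid-rule area under the ROC points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Hypothetical test scores and true status (1 = has the condition)
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
auc = area_under_curve(points)

print(points)
print(round(auc, 3))
```

An area of 0.5 would correspond to a test no better than chance, and an area of 1.0 to a test that separates the two groups perfectly.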

Clinical Utility Is the literature valid? – Subjects – Design – Procedures – Analysis Meaningful Results – Sn, Sp, Likelihood ratios Do they apply to my patient? – Similar to tested subjects? – Reproducible in my clinic? – Applicable? – Will it change my treatment? – Will it help my patient?

Hypotheses Directional – I predict “A” intervention is better than “B” intervention Non-directional – I think there is a difference between “A” intervention and “B” intervention

Evidence-based practice Ask clinically relevant and answerable questions Search for answers Appraise the evidence Judge the validity, impact and applicability Does it apply to this patient? Sackett et al. Evidence-Based Medicine: How to Practice and Teach EBM. 2nd ed.