Chapter 7 Criterion-Referenced Reliability and Validity


Criterion-Referenced Testing
Mastery learning
Standard development: judgmental, normative, empirical, combination

Guidelines for Writing Behavioral Objectives (Mager, 1962)
Identify the desired behavior by name
Define the desired behavior
Specify the criteria of acceptable performance

Advantages of Criterion-Referenced Measurement
Standards represent specific, desired performance levels linked to a criterion
Standards are independent of the proportion of the population that meets them
If a standard is not met, specific diagnostic evaluations can be made
Degree of performance is not important; reaching the standard is

Limitations of Criterion-Referenced Measurement
Cutoff scores always involve subjective judgment
Misclassifications can be severe
Students who meet the cutoff may no longer be motivated to do better

Setting a Cholesterol “Cut-Off”
Figure: number of deaths by cholesterol level (mg/dl)

Statistical Analysis of CRTs
Nominal data
Contingency table development
Phi coefficient (PPM)
Chi-square analysis
Review chapter 5

Considerations With CRT
The same as norm-referenced testing
Reliability: consistency of measurement
Validity: truthfulness of measurement

Figure 7.1a FITNESSGRAM Standards
2x2 classification of run/walk test result (did not achieve vs. did achieve the standard) against criterion VO2max (below vs. above the criterion)
Cell values: 24 (4%), 21 (4%), 64 (11%), 472 (81%)

Figure 7.1b AAHPERD Physical Best Standards
2x2 classification of run/walk test result (did not achieve vs. did achieve the standard) against criterion VO2max (below vs. above the criterion)
Cell values: 130 (22%), 23 (4%), 201 (35%), 227 (39%)

Meeting Criterion-Referenced Standards: Possible Decisions
                            Truly below criterion   Truly above criterion
Did not achieve standard    Correct decision        False positive
Did achieve standard        False negative          Correct decision

Table 7.1 CRT Test-Retest Reliability Example
Day 1 (rows) by Day 2 (columns): Did not achieve the standard, Did achieve the standard, Total
P = .825, K = .576, phi = .586, chi-square (df = 1) p < .001
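
The statistics reported with Table 7.1 (P, K, phi, chi-square) can all be computed from the four cells of a 2x2 test-retest table. The sketch below shows one way to do that. The function name `crt_agreement` and the cell counts are hypothetical, chosen only for illustration; they are not the Table 7.1 data. The chi-square value is the uncorrected Pearson statistic, which for a 2x2 table equals N times phi squared.

```python
import math


def crt_agreement(a, b, c, d):
    """Agreement statistics for a 2x2 criterion-referenced table.

    Cell layout (an assumption for this sketch):
        a = did not achieve on both days
        b = did not achieve on Day 1, achieved on Day 2
        c = achieved on Day 1, did not achieve on Day 2
        d = achieved on both days
    """
    n = a + b + c + d
    # Proportion of agreement: cases classified the same way both times.
    p = (a + d) / n
    # Agreement expected by chance, from the row and column totals.
    p_chance = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    kappa = (p - p_chance) / (1 - p_chance)
    # Phi is the Pearson correlation between two dichotomous variables.
    phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    # Uncorrected Pearson chi-square for a 2x2 table.
    chi2 = n * phi ** 2
    return {"P": p, "kappa": kappa, "phi": phi, "chi2": chi2}


# Hypothetical counts, NOT taken from Table 7.1.
stats = crt_agreement(a=20, b=5, c=5, d=70)
for name, value in stats.items():
    print(f"{name} = {value:.3f}")
```

With equal row and column totals, as in these made-up counts, kappa and phi coincide; in general they differ, as they do in Table 7.1.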

Table 7.2 Criterion-Referenced Equivalence Reliability Between the 1 Mile Run/Walk and PACER Tests
Columns: Total sample, Boys, Girls
Rows: Trial 1 (P, K), Trial 2 (P, K)

Figure 7.3 A Theoretical Example of the Divergent Group Method

Examples of Criterion-Referenced Standards
Cholesterol < 240 mg/dl
Systolic blood pressure < 130 mmHg
Diastolic blood pressure < 90 mmHg
FITNESSGRAM 1-mile run time for boy age 10 < 11:30
President’s Challenge Health Fitness curl-ups for girl age 14 > 24

CRT Reliability
2x2 table: Day 1 (fail, pass) by Day 2 (fail, pass)

CRT Validity
2x2 table: field test (fail, pass) by criterion (fail, pass)

Racquetball Example
Can a wall volley test serve as a good criterion measure to determine who should enter intermediate racquetball?
Example: reliability study, validity study

Racquetball Test Illustration
Court diagram: front wall, broken line, 2 extra racquetballs
You must always hit the ball from behind the broken line
The test: Trial 1, 60 seconds; Trial 2, 60 seconds

Reliability Study
Set a standard for passing the field test. Our standard is set at 25 hits: you must hit the ball against the front wall at least 25 times in a trial. This meets the “standard” for entry into intermediate racquetball. You want to see whether players achieve the standard on each trial. Determining how consistently they meet the standard across trials is a criterion-referenced reliability study.
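
The classification step described above can be sketched in a few lines: score each trial against the 25-hit standard, then cross-tabulate the two pass/fail decisions. The hit counts and the helper names `met_standard` and `cross_tab` are hypothetical, for illustration only; only the standard of 25 comes from the slide.

```python
from collections import Counter

STANDARD = 25  # hits required in a 60-second trial (from the slide)


def met_standard(hits, standard=STANDARD):
    """True when a trial reaches the standard ("at least 25 hits")."""
    return hits >= standard


def cross_tab(trial1, trial2, standard=STANDARD):
    """Count players in each (Trial 1 decision, Trial 2 decision) cell."""
    return Counter(
        (met_standard(h1, standard), met_standard(h2, standard))
        for h1, h2 in zip(trial1, trial2)
    )


# Hypothetical hit counts for six players, NOT the study data.
trial1 = [18, 25, 30, 24, 27, 12]
trial2 = [20, 26, 29, 25, 24, 15]

table = cross_tab(trial1, trial2)
# Consistent decisions: met the standard on both trials, or on neither.
consistent = table[(True, True)] + table[(False, False)]
percent_agreement = 100 * consistent / len(trial1)
print(dict(table))
print(f"Percent agreement = {percent_agreement:.1f}%")
```

Note that a score of exactly 25 counts as meeting the standard, which is why the comparison is `>=` rather than `>`.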

Reliability: What You Would Like to See
                                  Trial 2
Trial 1                           Failed to meet standard (< 25)   Met the standard (≥ 25)
Failed to meet standard (< 25)    People here on BOTH trials       No one here
Met the standard (≥ 25)           No one here                      People here on BOTH trials

PASW Output
Meet standard on Trial 1? (rows) by Meet standard on Trial 2? (columns): Did NOT meet standard of 25, DID meet standard of 25, Total
Chi square = 23.6, p < .001
Phi = 0.65
Percent agreement = 48/56 = 85.7%
This field test demonstrates acceptable criterion-referenced reliability.

Validity Study
The standard for passing the field test is 25 hits. We need a criterion measure of TRUE racquetball ability. We used self-reported racquetball experience:
Inexperienced = novice player
Experienced = skilled OR completed beginning racquetball class
You want to see whether experienced players are more likely to achieve the standard on the field test and inexperienced players are less likely to meet it. This is a criterion-referenced validity study.

Criterion-Referenced Validity: What You Would Like to See
                         Criterion
Results of field test    Inexperienced       Experienced
< 25 hits                Many people here    No one here
≥ 25 hits                No one here         Many people here
The criterion is that the student has at least completed a beginning racquetball class.

PASW Output: Trial 1 vs. Criterion
Meet standard on trial 1? (rows: Did NOT meet standard of 25, DID meet standard of 25, Total) by Criterion (columns: Inexperienced, Experienced, Total)
Chi square = 6.7, p < .01
Phi = 0.35
Percent agreement = (33 + 8)/56 = 41/56 = 73%

PASW Output: Trial 2 vs. Criterion
Meet standard on trial 2? (rows: Did NOT meet standard of 25, DID meet standard of 25, Total) by Criterion (columns: Inexperienced, Experienced, Total)
Chi square = 4.8, p < .03
Phi = 0.29
Percent agreement = (30 + 9)/56 = 39/56 = 70%
The results of the TWO validity studies suggest that this field test, with the criterion of 25 hits, is a moderately valid measure of racquetball experience.
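
The reported values in the three PASW outputs can be cross-checked against one another. For a 2x2 table, the magnitude of phi equals the square root of chi-square divided by N, and percent agreement is the sum of the two agreeing cells divided by N. The sketch below applies both relations to the figures quoted in the slides (N = 56). The dictionary layout and the function name `phi_from_chi2` are illustrative choices, not part of the PASW output.

```python
import math

N = 56  # players in the racquetball example

# Values as reported on the slides.
reported = {
    "reliability (trial 1 vs. trial 2)": {"chi2": 23.6, "agree": 48},
    "validity (trial 1 vs. criterion)": {"chi2": 6.7, "agree": 33 + 8},
    "validity (trial 2 vs. criterion)": {"chi2": 4.8, "agree": 30 + 9},
}


def phi_from_chi2(chi2, n):
    """Magnitude of phi for a 2x2 table (the sign is not recoverable)."""
    return math.sqrt(chi2 / n)


checks = {}
for label, values in reported.items():
    phi = round(phi_from_chi2(values["chi2"], N), 2)
    pct = round(100 * values["agree"] / N, 1)
    checks[label] = (phi, pct)
    print(f"{label}: phi = {phi}, percent agreement = {pct}%")
```

The recomputed phi values (0.65, 0.35, 0.29) match the slides, and the percent agreements (85.7, 73.2, 69.6) match the reported 85.7%, 73%, and 70% after rounding.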

Table 7.8 Research Designs in Epidemiology
Type                                          Description
Experimental
  Randomized clinical trial                   Randomly assign subjects to treatments or exposures
  Community trial                             Randomly assign whole communities to treatments or exposures
Observational
  Case series                                 Noting cases at a particular time or place
  Cross-sectional                             A snapshot of identifiable groups at one point in time
  Proportionate mortality or morbidity study  Compare results of a study group to the population
  Case-control                                Compares known cases of mortality or morbidity with matched noncases
  Cohort                                      Longitudinal, generally long-term tracking of populations

Epidemiological Statistics
Incidence: the number, proportion, rate, or percentage of new cases of mortality and morbidity. Incidence could be calculated in a randomized clinical trial or a prospective, longitudinal cohort study.
Prevalence: the number, proportion, rate, or percentage of total cases of mortality and morbidity. Prevalence would be calculated in a cross-sectional study.

Estimates of Risk
Absolute risk: the risk (proportion, percentage, rate) of mortality or morbidity in a population that is exposed or not exposed to a risk factor.
Relative risk: the ratio of the risks in the exposed and unexposed populations. This statistic is calculated with incidence measures.
Odds ratio: an estimate of relative risk used in prevalence studies.
Attributable risk: the risk of mortality and morbidity directly related to a risk factor. It can be thought of as the reduction in risk related to removing a risk factor.

Table 7.9 Results of a Hypothetical Study Relating Cholesterol and Heart Attack Mortality
                       Outcome
Exposure               Heart attack deaths   No heart attack deaths
High cholesterol       A = 25                B = 31
No high cholesterol    C = 7                 D = 37
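
As a worked illustration, the risk estimates defined earlier can be computed from the four cells of Table 7.9 (A = 25, B = 31, C = 7, D = 37). The sketch below is a minimal version under two assumptions: the function name `risk_estimates` is made up for this example, and attributable risk is taken as the simple risk difference between exposed and unexposed groups, which is one common definition and may not be the exact form the chapter uses.

```python
def risk_estimates(a, b, c, d):
    """Risk estimates from a 2x2 exposure-by-outcome table.

    a = exposed with outcome,   b = exposed without outcome
    c = unexposed with outcome, d = unexposed without outcome
    """
    risk_exposed = a / (a + b)      # absolute risk, exposed group
    risk_unexposed = c / (c + d)    # absolute risk, unexposed group
    return {
        "risk_exposed": risk_exposed,
        "risk_unexposed": risk_unexposed,
        "relative_risk": risk_exposed / risk_unexposed,
        "odds_ratio": (a * d) / (b * c),
        # Assumption: attributable risk as the risk difference.
        "attributable_risk": risk_exposed - risk_unexposed,
    }


# Cell values from Table 7.9.
est = risk_estimates(a=25, b=31, c=7, d=37)
for name, value in est.items():
    print(f"{name} = {value:.2f}")
```

For these data the absolute risks are about 0.45 (high cholesterol) and 0.16 (no high cholesterol), giving a relative risk of about 2.81, an odds ratio of about 4.26, and a risk difference of about 0.29. The odds ratio exceeds the relative risk here because heart attack death is not a rare outcome in this hypothetical sample.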

Setting Up an Epidemiological Study in a 2x2 Contingency Table
                       Outcome
Exposure               “Bad” thing here    “Good” thing here
“Risky” thing here
“Better” thing here

Setting Up an Epidemiological Study in a 2x2 Contingency Table
                       Outcome
Exposure               Dead                Alive
Smoker
Non-smoker

Setting Up an Epidemiological Study in a 2x2 Contingency Table
Design a Physical Activity Study
                       Outcome
Exposure               “Bad” thing here    “Good” thing here
“Risky” thing here
“Better” thing here

Setting Up an Epidemiological Study in a 2x2 Contingency Table
                       Outcome
Exposure               Dead                Alive
Sedentary
Physically active

Setting Up an Epidemiological Study in a 2x2 Contingency Table
Design a Physical Activity Study about USDHHS PA guidelines
                       Outcome
Exposure               “Bad” thing here    “Good” thing here
“Risky” thing here
“Better” thing here

Setting Up an Epidemiological Study in a 2x2 Contingency Table
                              Outcome
Exposure                      Hypertensive        Normotensive
Does NOT meet PA guidelines
MEETS PA guidelines