Screening and Diagnostic Tests
Ferdon Mijit, Department of Epidemiology and Biostatistics, School of Public Health, Xinjiang Medical University
It is necessary to distinguish between people in the population who have a disease and those who do not (a challenge in both the clinical and the public health arenas). Thus the quality of screening and diagnostic tests is a critical issue. Whether the test is a physical examination, a chest X-ray, an electrocardiogram, or a blood or urine assay, the same question arises: how good is the test at separating populations of people with and without the disease in question?
Natural history of disease
Figure 18-1A is a schematic representation of the natural history of a disease in an individual.
Screening
"The presumptive identification of unrecognized disease or defect by the application of tests, examinations, or other procedures which can be applied rapidly. Screening tests sort out apparently well persons who probably have a disease from those who probably do not. Persons with positive or suspicious findings must be referred to their physicians for diagnosis and necessary treatment." (definition from the US Commission on Chronic Illness)
[Flow diagram] Ⅰ. A screening test is applied to the population: screening-test negatives (STN) are set aside, while screening-test positives (STP) proceed to Ⅱ. A diagnostic test: diagnostic-test negatives (DTN) are released, while diagnostic-test positives (DTP) proceed to Ⅲ. Treatment.
Screening ≠ diagnostic tests
Features of Screening
1. A screening test is a test for a particular disease given to identify patients who have no symptoms (asymptomatic).
2. Screening tests are generally cheap; they are designed to be sensitive (detect many possible cases of the disease) but not as specific (they do not accurately identify actual cases of disease).
3. The disease or diseases being screened for are common in the general population of patients who receive the screening test.
4. The goal of this type of test is therefore to identify all individuals who might have the disease. (A wide net is cast to catch all suspects.)
5. Those patients so identified must then be subjected to further, highly specific tests (diagnostic tests) that accurately identify real disease.
Criteria for Use of a Screening Test
- Significant burden of disease in the population
- Preclinical stage is detectable and prevalent
- Early detection improves outcome (mortality) with acceptable morbidity
- Screening tests are acceptable to the population, inexpensive, and relatively accurate
- Effective treatment is available for the detected disease
Why screening?
Goal 1: secondary prevention. Detect disease before the clinical point, when cure or improved outcome is still possible, and get people with disease into appropriate treatment (early detection, early diagnosis, early treatment).
Goal 2: primary prevention. Identify high-risk populations for certain diseases.
Evaluation of Screening and Diagnostic Tests
- Distribution of human characteristics
- Validity of screening and diagnostic tests
- Reliability of screening and diagnostic tests
- Predictive values
Ideal distribution of human characteristics: the bimodal curve
Bimodal curve: distribution of human characteristics
[Figure: two separate distributions, "normal" and "diseased", divided cleanly by a "cutoff level".]
Unimodal curve: distribution of human characteristics
[Figure: a single overlapping distribution with a cutoff; "normal" lies to one side and "diseased" to the other, so the cutoff produces both false negatives and false positives.]
Evaluation of Screening and Diagnostic Tests
- Distribution of human characteristics
- Validity of screening and diagnostic tests
- Reliability of screening and diagnostic tests
- Predictive values
VALIDITY OF SCREENING TESTS
The validity of a test is its ability to distinguish between those who have a disease and those who do not. Validity has two components: sensitivity and specificity. The sensitivity of a test is its ability to identify correctly those who have the disease. The specificity of a test is its ability to identify correctly those who do not have the disease.
Gold standard
To calculate the sensitivity and specificity of a test, we must know who "really" has the disease and who does not from a source other than the test we are using. We are, in fact, comparing our test results with some "gold standard": an external source of "truth" regarding the disease status of each individual in the population (e.g., cardiac catheterization or tissue biopsy).
Reference Test / Gold Standard
Before evaluating a given test, a gold standard is necessary. A reference test is a diagnostic procedure that classifies individuals into their true categories: diseased and non-diseased. For cancers, histological verification is used as the reference test. For diabetes mellitus (DM), blood glucose is used as the reference test, with different criteria: fasting plasma glucose or 2-hour plasma glucose.
DM diagnostic criteria (WHO, 1999):
1) typical symptoms plus plasma glucose ≥ 11.1 mmol/L; or
2) FPG ≥ 7.0 mmol/L; or
3) OGTT (oral glucose tolerance test) 2-h PG ≥ 11.1 mmol/L.
TRUE CHARACTERISTICS IN THE POPULATION

Screening test result   Disease: Yes   Disease: No   Total
Positive                a              b             a+b (R1)
Negative                c              d             c+d (R2)
Total                   a+c (C1)       b+d (C2)      a+b+c+d

True positive (TP): has the disease and a positive test (a)
False positive (FP): no disease, but a positive test (b)
False negative (FN): has the disease, but a negative test (c)
True negative (TN): no disease and a negative test (d)
Sensitivity
The proportion of truly diseased persons, as measured by the gold standard, who are also identified as diseased by the test under study.
Sensitivity = True Positives / (True Positives + False Negatives) = a/(a+c), the true positive rate. A highly sensitive test helps rule disease out.
False negative rate = c/(a+c), the rate of missed diagnosis (= 1 − sensitivity).
Specificity
The proportion of truly non-diseased persons, as measured by the gold standard, who are so identified by the test under study.
Specificity = True Negatives / (False Positives + True Negatives) = d/(b+d), the true negative rate. A highly specific test helps rule disease in ("SpIn").
False positive rate = b/(b+d), the misdiagnosis rate (= 1 − specificity).
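The four formulas above can be checked with a few lines of code. This sketch uses the 2x2 labels from the earlier table (a = TP, b = FP, c = FN, d = TN); the counts themselves are invented for illustration.

```python
# Hypothetical 2x2 counts (a=TP, b=FP, c=FN, d=TN), laid out as in the
# table above; the numbers are illustrative, not from the lecture.
a, b, c, d = 80, 100, 20, 800

sensitivity = a / (a + c)          # true positive rate: 80/100
specificity = d / (b + d)          # true negative rate: 800/900
false_negative_rate = c / (a + c)  # missed-diagnosis rate = 1 - sensitivity
false_positive_rate = b / (b + d)  # misdiagnosis rate    = 1 - specificity

print(round(sensitivity, 3), round(specificity, 3))  # 0.8 0.889
```

Note that sensitivity uses only the diseased column and specificity only the non-diseased column, which is why both are unaffected by how common the disease is in the screened population.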
The Ideal Situation: 100% Agreement
Disease present (n = 200): 200 true positives, 0 false negatives.
Disease absent (n = 800): 800 true negatives, 0 false positives.
Ideally, what a test developer wants to see is the 100% agreement shown in this slide: there are no false positives or false negatives. Presumably, if the new test or procedure is technically easier to perform, more acceptable to patients, less expensive, or lower-risk, the new option will be very attractive. Of course, this degree of concordance is rarely achieved.
A More Likely Outcome
Disease present (n = 200): 170 true positives, 30 false negatives.
Disease absent (n = 800): 30 false positives, 770 true negatives.
It is much more likely that the screening test does not perfectly reflect individuals' true disease states. For some cancer screening tests, the true sensitivity and specificity can only be approximated, because not everyone in the study or population undergoes the diagnostic procedure that is the gold standard for determining disease status. For example, the sensitivity and specificity of the PSA test in a screening trial cannot be determined unless everyone with a negative result undergoes extensive prostatic biopsy or prostatectomy for verification. Thus, the evaluation of many PSA studies has focused on the positive predictive value of the test rather than on sensitivity and specificity.
Sensitivity and Specificity
Consequences of a false positive: even a 3-5% rate is large at the population level; follow-up tests bring cost, potential harm, and anxiety; periodic screening increases the lifetime risk of at least one false positive.
Consequences of a false negative: even one missed case can have tragic implications; at best it creates a false sense of security; the person might neglect future screening tests.
It is necessary to consider the implications of the false positive and false negative rates at the population level. The intention is to use a screening test in the asymptomatic population, usually at a specific interval, so millions of people will be screened on a recurring basis. Every positive result triggers follow-up procedures, which carry their own costs, risks of side effects, and sources of anxiety. A false positive rate of even 3% translates to 30 per 1,000, 300 per 10,000, and 3,000 per 100,000. Also, when periodic screening is necessary, the false positive rate translates into a lifetime "risk" of receiving at least one such result. A false negative result is even more serious, because an existing cancer is missed. The person may skip the next scheduled re-screening, feeling they received a "clean bill of health" and that no cancer could develop over such a seemingly short period. The resulting delay in presentation and treatment can have a potentially disastrous effect on the individual who had the false negative.
The Tradeoff: Sensitivity vs. Specificity
If missing cancers is a concern, sensitivity can be raised by adjusting the diagnostic cut point for a positive result. But the false positive rate will also increase. How will this affect screening program costs?
The Minnesota Cancer Control Study (Mandel et al., 1993; 2000) illustrates how changing the sensitivity of a test to improve the cancer detection rate reduced its specificity. During the 18 years of follow-up, the original 46,551 study participants underwent FOBT screening on an annual or biennial basis. In early screening rounds, test sensitivity was 80%. By rehydrating the FOBT slides, which increased the number of positive test results, sensitivity was raised to 92%; however, specificity fell to around 90% for all age groups. By the end of the study period, about 38% of participants had received a full colonic evaluation (e.g., colonoscopy). Ransohoff and Lang (1997) estimated that rehydration of FOBT slides would add about $3 billion a year to Medicare program costs. Specificity is also critical to the positive predictive value and provides a good overall assessment of a screening test's effectiveness: for example, raising specificity from 96% to 98% would halve the number of false-positive results and double the predictive value of a positive test (Simon 1998).
[Figure repeated: unimodal distribution with a cutoff, showing the "normal" and "diseased" regions, false negatives, and false positives.]
How to Choose the Cutoff Level
If the prognosis of the disease is poor and missing patients would bring severe consequences, move the cutoff point left to increase sensitivity.
If not, move the cutoff point right to increase specificity; likewise, if the cost of false positives is high, move it right.
If specificity and sensitivity are equally important, choose the crossing point of the two distributions.
Developmental characteristics: cut-points and the Receiver Operating Characteristic (ROC) curve
Receiver Operating Characteristic (ROC)
The ROC curve allows comparison of different tests for the same condition without (before) specifying a cutoff point. The test with the largest AUC (area under the curve) is the best.
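A minimal sketch of how an ROC curve and its AUC can be computed by sweeping every possible cutoff over a continuous test marker, plotting (false positive rate, true positive rate) at each cutoff, and applying the trapezoid rule. The score values below are invented for illustration.

```python
# Illustrative marker values for diseased and non-diseased subjects.
diseased     = [6.2, 7.1, 8.4, 9.0, 5.5]
non_diseased = [3.1, 4.0, 4.8, 5.9, 2.7]

# Sweep a cutoff through every observed value (plus one below the minimum,
# so the curve reaches the (1, 1) corner).
cuts = sorted(set(diseased + non_diseased))
points = []
for cut in [cuts[0] - 1] + cuts:
    tpr = sum(x > cut for x in diseased) / len(diseased)          # sensitivity
    fpr = sum(x > cut for x in non_diseased) / len(non_diseased)  # 1 - specificity
    points.append((fpr, tpr))
points.sort()

# Area under the ROC curve via the trapezoid rule.
auc = sum((x2 - x1) * (y1 + y2) / 2
          for (x1, y1), (x2, y2) in zip(points, points[1:]))
print(round(auc, 2))  # 0.96
```

An AUC of 1.0 would correspond to the ideal bimodal situation with no overlap between groups; 0.5 corresponds to a test no better than chance.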
Likelihood Ratios
The likelihood ratio for a test result compares the likelihood of that result in patients with disease to the likelihood of that result in patients without disease.
Positive LR: the likelihood of a (true) positive test among patients with disease, relative to the likelihood of a (false) positive test among those without disease; that is, the true positive rate divided by the false positive rate.
Positive LR = (a/(a+c)) / (b/(b+d)) = sensitivity / (1 − specificity)
Negative LR = (c/(a+c)) / (d/(b+d)) = (1 − sensitivity) / specificity (false negative rate / true negative rate)
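A short sketch of the two likelihood-ratio formulas, using the same 2x2 labels as before (a = TP, b = FP, c = FN, d = TN); the counts are invented for illustration.

```python
# Hypothetical counts: a=TP, b=FP, c=FN, d=TN.
a, b, c, d = 90, 50, 10, 950

sensitivity = a / (a + c)   # 90/100 = 0.90
specificity = d / (b + d)   # 950/1000 = 0.95

lr_pos = sensitivity / (1 - specificity)   # true positive rate / false positive rate
lr_neg = (1 - sensitivity) / specificity   # false negative rate / true negative rate

print(round(lr_pos, 1), round(lr_neg, 2))  # 18.0 0.11
```

An LR+ of 18 falls in the "large change in likelihood" band discussed on the next slide, so a positive result from this hypothetical test would substantially raise the probability of disease.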
Impact on Disease Likelihood
LR > 10 or < 0.1: large changes in likelihood
LR 5-10 or 0.1-0.2: moderate changes
LR 2-5 or 0.2-0.5: small changes
LR between 0.5 and 2: little or no change
Ruling In and Ruling Out
Does the patient have the disease? A higher positive LR means the disease is more likely to be present if the test is positive.
Does the patient not have the disease? A lower negative LR means the disease is less likely to be present, or to be the cause of the patient's current condition.
Evaluation of Screening and Diagnostic Tests
- Distribution of human characteristics
- Validity of screening and diagnostic tests
- Predictive values
- Reliability of screening and diagnostic tests
Understanding Predictive Values
Clinician's perspective: if a test result is positive, how likely is it that this individual has the disease?
Predictive value varies with the prevalence of the disease in the screened population. Bayes' theorem: as the prevalence of a disease increases, the positive predictive value (PPV) of the test increases and its negative predictive value (NPV) decreases.
Physicians and health care practitioners are often more interested in the probability that a person actually has the disease given a positive screening test result. Predictive values are expressed as percentages. Prevalence is the total number of cases of the disease that exist within a population at a given point in time. While sensitivity and specificity are relatively independent of the prevalence of the disease, predictive values are highly dependent on it. Bayes was an eighteenth-century English mathematician; the relationships among predictive values, test characteristics, and disease prevalence can be expressed as equations:
PPV = (sensitivity × prevalence) / [(sensitivity × prevalence) + (1 − specificity) × (1 − prevalence)]
NPV = (specificity × (1 − prevalence)) / [(specificity × (1 − prevalence)) + (1 − sensitivity) × prevalence]
Note that the converse is also true: as prevalence decreases, the NPV increases and the PPV decreases. There are websites featuring clinical calculators that perform the math for these equations once the appropriate values are entered.
Predictive Values
The predictive value of a screening test is determined by the sensitivity and specificity of the test, and by the prevalence of the condition for which the test is used.
Positive Predictive Value
PPV = True Positives / (True Positives + False Positives) = a/(a+b): the probability that a person with a positive test is a true positive (does have the disease).
As specificity rises, PPV rises; as prevalence rises, PPV rises.
Negative Predictive Value
NPV = True Negatives / (True Negatives + False Negatives) = d/(c+d): the probability that a person with a negative test truly does not have the disease.
As sensitivity rises, NPV rises; as prevalence rises, NPV falls.
What Affects Predictive Values?
Sensitivity, specificity, and prevalence:
+PV = (Sens × Prev) / [(Sens × Prev) + (1 − Spec) × (1 − Prev)]
−PV = (Spec × (1 − Prev)) / [(Spec × (1 − Prev)) + (1 − Sens) × Prev]
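The two formulas above can be wrapped in a small function to show how predictive values move with prevalence while sensitivity and specificity stay fixed; the 90%/95% test characteristics below are invented for illustration.

```python
# Bayes' theorem for predictive values, as in the formulas above.
def predictive_values(sens, spec, prev):
    ppv = (sens * prev) / (sens * prev + (1 - spec) * (1 - prev))
    npv = (spec * (1 - prev)) / (spec * (1 - prev) + (1 - sens) * prev)
    return ppv, npv

# The same hypothetical test (90% sensitive, 95% specific) applied to a
# low-prevalence and a high-prevalence population:
for prev in (0.01, 0.20):
    ppv, npv = predictive_values(0.90, 0.95, prev)
    print(f"prevalence {prev:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

At 1% prevalence the PPV is only about 15%, while at 20% prevalence it rises above 80%, which is why a predictive value cannot be quoted without stating the prevalence of the screened population.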
Using Predictive Values
Keep clinical significance in mind:
- Terminal or rare disease
- Impact of a false negative on patient outcome
- Benefit of testing to the patient
- Is the population tested high or low risk?
- Alternative tests available for screening
Reliability of the Test
Let us consider another aspect of assessing diagnostic and screening tests: whether a test is reliable, or repeatable. Can the results be replicated if the test is repeated?
The factors that contribute to variation between test results:
- intrasubject variation (variation within individual subjects)
- intraobserver variation (variation in the reading of test results by the same reader)
- interobserver variation (variation between those reading the test results)
When evaluating any test result, it is important to consider the conditions under which the test was performed, including the time of day.
Readings Performed by Two Radiologists
In this diagram, the readings of observer 1 are cross-tabulated against those of observer 2, and the number of readings in each cell is denoted by a letter. Thus, A X-rays were read as abnormal by both radiologists; C X-rays were read as abnormal by radiologist 2 and as doubtful by radiologist 1; M X-rays were read as abnormal by radiologist 1 and as normal by radiologist 2.
Agreement rate = (A + D) / (A + B + C + D)
Percent agreement is significantly affected by the fact that, even if two observers used completely different criteria to classify subjects as positive or negative, we would expect them to agree to some extent solely as a function of chance.
Kappa Statistic
Kappa expresses the extent to which the observed agreement exceeds the agreement that would be expected by chance alone (numerator), relative to the most the observers could hope to improve their agreement beyond chance (100% minus the agreement expected by chance alone: the denominator). Thus kappa quantifies the excess of observed over chance-expected agreement as a proportion of the maximum possible improvement beyond chance. The kappa statistic can be defined by the equation:
kappa = (percent agreement observed − percent agreement expected by chance) / (100% − percent agreement expected by chance)
Kappa ranges from −1 to +1. Kappa = +1 indicates complete agreement between the two observers; kappa = −1 indicates complete disagreement; kappa = 0 indicates that the observed agreement is no better than chance. As a rough guide: kappa ≥ 0.75, excellent agreement; kappa between 0.4 and 0.75, moderate agreement; kappa < 0.4, poor agreement.
Two-Stage Screening
To increase sensitivity or specificity, tests can be combined (multiple testing).
Series testing: higher specificity (e.g., diabetes, HIV infection)
Parallel testing: higher sensitivity (e.g., blood donor screening)
Series testing: positive = positive on both tests; negative = negative on either test.
Parallel testing: positive = positive on either test; negative = negative on both tests.
Multiple Testing Example: Screening for Diabetes Mellitus
(joint cell counts consistent with the sensitivities and specificities below)

Blood sugar   Urine sugar   Diabetic   Non-diabetic
+             +             117        21
+             −             33         11
−             +             14         10
−             −             35         7599
Total                       199        7641
Blood sugar: sensitivity = (33 + 117)/199 × 100% = 75.38%; specificity = 7609/7641 × 100% = 99.58%
Urine sugar: sensitivity = (14 + 117)/199 × 100% = 65.83%; specificity = 7610/7641 × 100% = 99.59%
Parallel testing: sensitivity = (33 + 14 + 117)/199 × 100% = 82.41%; specificity = 7599/7641 × 100% = 99.45%
Series testing: sensitivity = 117/199 × 100% = 58.79%; specificity = 7620/7641 × 100% = 99.73%
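The parallel/series arithmetic above can be verified in code. The joint cell counts below are a reconstruction: the 117, 33, 14, 199, 7599, and 7641 appear in the lecture figures, and the remaining cells are back-calculated to be consistent with the quoted sensitivities and specificities.

```python
# Joint test results by disease status, keyed "blood result, urine result".
# Counts reconstructed to match the marginals quoted on the slide.
diseased     = {"++": 117, "+-": 33, "-+": 14, "--": 35}    # n = 199
non_diseased = {"++": 21,  "+-": 11, "-+": 10, "--": 7599}  # n = 7641

n_d  = sum(diseased.values())
n_nd = sum(non_diseased.values())

# Parallel: positive if EITHER test is positive -> gains sensitivity.
par_sens = (n_d - diseased["--"]) / n_d
par_spec = non_diseased["--"] / n_nd

# Series: positive only if BOTH tests are positive -> gains specificity.
ser_sens = diseased["++"] / n_d
ser_spec = (n_nd - non_diseased["++"]) / n_nd

print(f"parallel: sens {par_sens:.2%}, spec {par_spec:.2%}")
print(f"series:   sens {ser_sens:.2%}, spec {ser_spec:.2%}")
```

The output reproduces the slide's figures: parallel testing is more sensitive than either single test (82.41%), while series testing is more specific (99.73%).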
Example
A four-item memory test is assessed as a screening tool for dementia in the elderly. In a sample of 483 persons, 50 had dementia as defined by extensive neurologic and cognitive assessment. Of those with dementia, the screening test was positive for 40 persons. Of those without dementia, the screening test was negative for 416 persons.
What is the prevalence of dementia in the study sample?
What is the sensitivity of the test?
What is the specificity of the test?
What is the positive predictive value of the test?
What is the negative predictive value of the test?
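One way to work this exercise is to rebuild the 2x2 cells from the numbers given and apply the earlier formulas; a sketch:

```python
# Given: 483 screened, 50 with dementia, 40 screen-positive among the
# demented, 416 screen-negative among the non-demented.
n, n_dem, tp, tn = 483, 50, 40, 416

fn = n_dem - tp          # with dementia but screen-negative
fp = (n - n_dem) - tn    # without dementia but screen-positive

prevalence  = n_dem / n            # 50/483
sensitivity = tp / (tp + fn)       # 40/50
specificity = tn / (tn + fp)       # 416/433
ppv = tp / (tp + fp)               # 40/57
npv = tn / (tn + fn)               # 416/426

print(f"prevalence {prevalence:.1%}, sensitivity {sensitivity:.0%}, "
      f"specificity {specificity:.1%}, PPV {ppv:.1%}, NPV {npv:.1%}")
```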
A physical examination was used to screen for breast cancer in 2,500 women with biopsy-proven adenocarcinoma of the breast (breast cancer cases) and in 5,000 control women. The results of the physical examination were positive in 1,800 of the 2,500 cases and in 800 of the 5,000 control women, all of whom showed no evidence of cancer at biopsy. What are the sensitivity, specificity, and positive predictive value of the physical examination? (Hint: set up the 2×2 table.)
Exam result   Cases    Controls   Total
Positive      1,800    800        2,600
Negative      700      4,200      4,900
Total         2,500    5,000      7,500

Sensitivity = 1800/2500 = 72%
Specificity = 4200/5000 = 84%
PPV = 1800/(1800 + 800) = 1800/2600 ≈ 69.2%
Agreement between two observers:

              Doctor B +   Doctor B −   Total
Doctor A +    373          7            380
Doctor A −    1            3057         3058
Total         374          3064         3438
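A sketch of the kappa calculation applied to this table, using the observed-versus-chance-agreement definition given earlier:

```python
# 2x2 agreement table above: Doctor A rows, Doctor B columns.
a, b = 373, 7      # Doctor A +: (Doctor B +, Doctor B -)
c, d = 1, 3057     # Doctor A -: (Doctor B +, Doctor B -)
n = a + b + c + d  # 3438

observed = (a + d) / n                                     # observed agreement
expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2  # chance agreement
kappa = (observed - expected) / (1 - expected)

print(round(kappa, 2))  # 0.99
```

By the rough guide above (kappa ≥ 0.75 is excellent agreement), these two doctors agree almost perfectly, far beyond what chance alone would produce.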