Introduction to Evidence-Based Practice



Literature Search

Evaluating the validity of a Diagnostic study

Are the results valid?

1. Did participating patients present a diagnostic dilemma?

The group of patients in which the test was conducted should include patients with a high, medium and low probability of having the target disease. The clinical usefulness of a test is demonstrated in its ability to distinguish between obvious illness and those cases where it is not so obvious or where the diagnosis might otherwise be confused. The patients in the study should resemble what might be expected in a clinical practice.

2. Did investigators compare the test to an appropriate, independent reference standard?

The reference (or gold) standard is the commonly accepted proof that the target disorder is present or absent. The reference standard might be an autopsy or biopsy. It provides either objective criteria (e.g., a laboratory test not requiring subjective interpretation) or a current clinical standard (e.g., a venogram for deep venous thrombosis) for diagnosis. Sometimes there is no widely accepted reference standard, in which case the authors need to clearly justify their choice of reference test. The reference standard should also be independent of the test under study: those conducting or interpreting one test should not know the results of the other.

3. Were those interpreting the test and reference standard blind to the other results?

To avoid potential bias, those conducting or interpreting each test should not know the results of the other test.

4. Did the investigators apply the same reference standard to all patients regardless of the results of the test under investigation?

Researchers should conduct both tests (the study test and the reference standard) regardless of the results of the test in question. Researchers should not be tempted to forgo either test based on the results of only one of the tests. Nor should the researchers apply a different reference standard to patients with a negative result in the study test.

Key issues for Diagnostic Studies:

  • diagnostic uncertainty
  • blind comparison to gold standard
  • each patient gets both tests

What are the results?

What likelihood ratios (LRs) were associated with the range of possible test results?
How much will different levels of the diagnostic test result raise or lower the pretest probability of disease?


                      Reference Standard     Reference Standard
                      Disease Present        Disease Absent

New Test Positive     a (true positive)      b (false positive)

New Test Negative     c (false negative)     d (true negative)

Sensitivity: measures the proportion of patients with the disease who also test positive for the disease in this study. It is the probability that a person with the disease will have a positive test result.
Sensitivity = true positives / all patients with the disease = a / (a + c)

Specificity: measures the proportion of patients without the disease who also test negative for the disease in this study. It is the probability that a person without the disease will have a negative test result.
Specificity = true negatives / all patients without the disease = d / (b + d)

Sensitivity and specificity are characteristics of the test but do not provide enough information for the clinician to act on the test results.
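As a minimal sketch, the two formulas above can be computed directly from the four cells of the 2x2 table, where a = true positives, b = false positives, c = false negatives and d = true negatives. The function names and the counts below are hypothetical and invented for illustration; they do not come from any study.

```python
def sensitivity(a: int, c: int) -> float:
    """Proportion of patients with the disease who test positive: a / (a + c)."""
    return a / (a + c)


def specificity(b: int, d: int) -> float:
    """Proportion of patients without the disease who test negative: d / (b + d)."""
    return d / (b + d)


# Hypothetical counts (assumption, for illustration only):
# a = true positives, b = false positives,
# c = false negatives, d = true negatives.
a, b, c, d = 90, 20, 10, 180

print(f"Sensitivity: {sensitivity(a, c):.2f}")
print(f"Specificity: {specificity(b, d):.2f}")
```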

Likelihood ratios (LR): indicate the likelihood that a given test result would be expected in a patient with the target disorder compared to the likelihood that the same result would be expected in a patient without that disorder.

Likelihood ratio of a positive test result (LR+): indicates how much the odds of having the disease increase after a positive test result.

Likelihood ratio of a negative test result (LR-): indicates how much the odds of having the disease decrease after a negative test result.
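For a test with only two outcomes, the likelihood ratios follow from sensitivity and specificity by the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The sketch below assumes a specificity strictly between 0 and 1 so that neither division is undefined; the function names and input values are hypothetical.

```python
def positive_lr(sensitivity: float, specificity: float) -> float:
    """LR+ = sensitivity / (1 - specificity)."""
    return sensitivity / (1.0 - specificity)


def negative_lr(sensitivity: float, specificity: float) -> float:
    """LR- = (1 - sensitivity) / specificity."""
    return (1.0 - sensitivity) / specificity


# Hypothetical test characteristics (assumption, for illustration only).
sens, spec = 0.90, 0.90

print(f"LR+: {positive_lr(sens, spec):.1f}")
print(f"LR-: {negative_lr(sens, spec):.2f}")
```

A test with several result levels has one likelihood ratio per level rather than a single LR+ and LR-; this sketch covers only the two-outcome case.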

How much do LRs change disease likelihood?

LRs greater than 10 or less than 0.1         cause large changes

LRs 5 - 10 or 0.1 - 0.2                      cause moderate changes

LRs 2 - 5 or 0.2 - 0.5                       cause small changes

LRs 1 - 2 or 0.5 - 1                         cause tiny changes

LRs = 1.0                                    cause no change at all
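The bands above can be expressed as a small lookup function. This is only a sketch: the source does not say which band a boundary value such as exactly 5 or 0.2 belongs to, so the choice made here (boundary values fall in the stronger band, except that exactly 10 and exactly 0.1 count as moderate) is an assumption, as is the function name.

```python
def describe_lr(lr: float) -> str:
    """Return the size of the change in disease likelihood for a given LR."""
    if lr < 0:
        raise ValueError("A likelihood ratio cannot be negative.")
    if lr == 1.0:
        return "none"
    if lr > 10 or lr < 0.1:
        return "large"
    if lr >= 5 or lr <= 0.2:
        return "moderate"
    if lr >= 2 or lr <= 0.5:
        return "small"
    return "tiny"  # between 1 and 2, or between 0.5 and 1


# Hypothetical likelihood ratios (assumption, for illustration only).
for lr in (20.0, 7.0, 3.0, 1.5, 1.0, 0.7, 0.3, 0.15, 0.05):
    print(f"LR {lr}: {describe_lr(lr)} change")
```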

How to use a nomogram with a likelihood ratio
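A likelihood ratio nomogram is a graphical shortcut for a short calculation: convert the pretest probability to odds, multiply the odds by the likelihood ratio, and convert the result back to a probability. The sketch below shows that arithmetic; the function name and the example pretest probability and likelihood ratios are hypothetical.

```python
def posttest_probability(pretest_prob: float, lr: float) -> float:
    """Apply a likelihood ratio to a pretest probability via odds."""
    if not 0.0 <= pretest_prob < 1.0:
        raise ValueError("Pretest probability must be in [0, 1).")
    pretest_odds = pretest_prob / (1.0 - pretest_prob)
    posttest_odds = pretest_odds * lr
    return posttest_odds / (1.0 + posttest_odds)


# Hypothetical patient and test (assumption, for illustration only).
pretest = 0.30
for label, lr in (("positive result, LR+ = 9", 9.0),
                  ("negative result, LR- = 0.11", 0.11)):
    post = posttest_probability(pretest, lr)
    print(f"{label}: posttest probability {post:.3f}")
```

A likelihood ratio of 1.0 returns the pretest probability unchanged, which matches the "no change at all" row for LRs equal to 1.0.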

More about likelihood ratios: Deeks JJ, Altman DG. Diagnostic tests 4: likelihood ratios. BMJ 2004;329:168-169.

How can I apply the results to patient care?

Will the reproducibility of the test result and its interpretation be satisfactory in your clinical setting?
Does the test yield the same result when reapplied to stable participants?
Do different observers agree about the test results?

Are the study results applicable to the patients in your practice?
Does the test perform differently (different LRs) for different severities of disease?
Does the test perform differently for populations with different mixes of competing conditions?

Will the test results change your management strategy?
What are the test and treatment thresholds for the health condition to be detected?
Are the test LRs high or low enough to shift posttest probability across a test or treatment threshold?
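The threshold question can be checked numerically. The sketch below assumes the usual reading of the two thresholds: below the test threshold the disease is unlikely enough that no further testing or treatment is needed, above the treatment threshold it is likely enough to treat without further testing, and in between more testing is warranted. All names and numbers are hypothetical; real thresholds depend on the condition and on the risks of testing and treatment.

```python
def posttest_probability(pretest_prob: float, lr: float) -> float:
    """Convert probability to odds, apply the LR, convert back."""
    odds = pretest_prob / (1.0 - pretest_prob) * lr
    return odds / (1.0 + odds)


def management_zone(prob: float, test_threshold: float,
                    treatment_threshold: float) -> str:
    """Place a disease probability relative to the two thresholds."""
    if prob < test_threshold:
        return "no further testing or treatment"
    if prob >= treatment_threshold:
        return "treat"
    return "test further"


# Hypothetical values (assumption, for illustration only).
pretest, test_thr, treat_thr = 0.30, 0.05, 0.75
lr_pos, lr_neg = 9.0, 0.11

print("Before test:", management_zone(pretest, test_thr, treat_thr))
for label, lr in (("positive", lr_pos), ("negative", lr_neg)):
    post = posttest_probability(pretest, lr)
    zone = management_zone(post, test_thr, treat_thr)
    print(f"After {label} result: {post:.3f} -> {zone}")
```

With these invented numbers either result moves the patient out of the "test further" zone, so the test would change management; with weaker LRs the posttest probability could stay between the thresholds.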

Will patients be better off as a result of the test?
Will patient care differ for different test results?
Will the anticipated changes in care do more good than harm?

Based on: Guyatt G, Rennie D, Meade MO, Cook DJ. Users' Guides to the Medical Literature: A Manual for Evidence-Based Clinical Practice, 2nd Edition. 2008.

Note: For criteria for other types of studies, see the following supplements:
 Therapy | Prognosis | Etiology/ Harm | Systematic Review


Revised July 2010