Methodological quality of studies and
strength of recommendations
A grading system was used for the strength of the recommendations. This grading system is simple and easy to apply, and shows a large degree of consistency between the grading of therapeutic and preventive studies, prognostic studies and diagnostic studies. The system is based on the original ratings of the AHCPR Guidelines (1994) and the levels of evidence used in systematic (Cochrane) reviews on low back pain.
Strength of recommendations
1. Therapy and prevention:
Level A: Generally consistent findings provided by (a systematic review of) multiple high quality randomised controlled trials (RCTs).
Level B: Generally consistent findings provided by (a systematic review of) multiple low quality RCTs or non-randomised controlled trials (CCTs).
Level C: One RCT (either high or low quality), or inconsistent findings from (a systematic review of) multiple RCTs or CCTs.
Level D: No RCTs or CCTs.
Systematic review: systematic methods of selection and inclusion of studies, methodological quality assessment, data extraction and analysis.
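To make the logic of this rating explicit, the following is a minimal illustrative sketch in Python (not part of the guideline). It assumes each trial is summarised by its design ("RCT" or "CCT") and a high/low quality flag, and that the consistency of findings across studies has already been judged; the function name and data representation are hypothetical. The same A-D pattern is applied below to prognostic and diagnostic studies, with cohort and diagnostic studies taking the place of trials.

# Illustrative sketch (hypothetical names): assigning a level of evidence
# for therapy / prevention. Each study is a (design, high_quality) pair,
# e.g. ("RCT", True); consistency of findings is judged beforehand.
def evidence_level(studies, findings_consistent):
    rcts = [high for design, high in studies if design == "RCT"]
    ccts = [high for design, high in studies if design == "CCT"]
    if not rcts and not ccts:
        return "D"              # no RCTs or CCTs
    if len(rcts) == 1 and not ccts:
        return "C"              # one RCT, either high or low quality
    if not findings_consistent:
        return "C"              # inconsistent findings across multiple RCTs or CCTs
    if sum(rcts) >= 2:
        return "A"              # consistent findings from multiple high quality RCTs
    return "B"                  # consistent findings from multiple low quality RCTs or CCTs

# Example: two consistent high quality RCTs yield level "A".
print(evidence_level([("RCT", True), ("RCT", True)], findings_consistent=True))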
2. Prognosis:
Level A: Generally consistent findings provided by (a systematic review of) multiple high quality prospective cohort studies.
Level B: Generally consistent findings provided by (a systematic review of) multiple low quality prospective cohort studies or other low quality prognostic studies.
Level C: One prognostic study (either high or low quality), or inconsistent findings from (a systematic review of) multiple prognostic studies.
Level D (no evidence): No prognostic studies.
High quality prognostic studies: prospective cohort studies.
Low quality prognostic studies: retrospective cohort studies, follow-up of untreated control patients in an RCT, case series.
3. Diagnosis:
Level A: Generally consistent findings provided by (a systematic review of) multiple high quality diagnostic studies.
Level B: Generally consistent findings provided by (a systematic review of) multiple low quality diagnostic studies.
Level C: One diagnostic study (either high or low quality), or inconsistent findings from (a systematic review of) multiple diagnostic studies.
Level D (no evidence): No diagnostic studies.
High quality diagnostic study: independent, blind comparison of patients from an appropriate spectrum, all of whom have undergone both the diagnostic test and the reference standard. (An appropriate spectrum is a cohort of patients who would normally be tested for the target disorder. An inappropriate spectrum compares patients already known to have the target disorder with patients diagnosed with another condition.)
Low quality diagnostic study: study performed in a set of non-consecutive patients, or confined to a narrow spectrum of study individuals (or both), all of whom have undergone both the diagnostic test and the reference standard; or if the reference standard was unobjective, unblinded or not independent; or if positive and negative tests were verified using separate reference standards; or if the study was performed in an inappropriate spectrum of patients; or if the reference standard was not applied to all study patients.
The methodological quality of additional studies will only be assessed in areas that have not yet been covered by a systematic review, or for studies from the non-English literature.
The methodological quality of trials is usually
assessed using relevant criteria related to the internal
validity of trials. High quality trials are less likely
to be associated with biased results than low quality
trials. Various criteria lists exist, but differences
between the lists are subtle.
Quality assessment should ideally be done by at least
two reviewers, independently, and blinded with regard to
the authors, institution and journal. However, as experts are usually involved in quality assessment, it may often not be feasible to blind the studies. Criteria
should be scored as positive, negative or unclear, and
it should be clearly defined when criteria are scored
positive or negative. Quality assessment should be pilot
tested on two or more similar trials that are not
included in the systematic review. A consensus method should be used to resolve disagreements, and a third reviewer should be consulted if disagreements persist. If
the article does not contain information on the
methodological criteria (score 'unclear'), the authors
should be contacted for additional information. This
also gives authors the opportunity to respond to
negative or positive scores.
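As an illustration of this procedure, the sketch below (hypothetical names, not part of the guideline) reconciles the independent scores of two reviewers for a single trial, where each criterion has been scored "positive", "negative" or "unclear" as described above. Criteria on which the reviewers disagree go to consensus (and, if needed, a third reviewer), and criteria scored "unclear" are flagged so that the authors can be contacted.

# Illustrative sketch (hypothetical names): reconciling two reviewers'
# independent criterion scores for one trial.
def reconcile_scores(reviewer1, reviewer2):
    # Both arguments map a criterion name to 'positive', 'negative' or 'unclear'.
    agreed, to_consensus, ask_authors = {}, [], []
    for criterion, score1 in reviewer1.items():
        score2 = reviewer2[criterion]
        if score1 == score2:
            agreed[criterion] = score1
        else:
            to_consensus.append(criterion)   # resolve by consensus; third reviewer if needed
        if "unclear" in (score1, score2):
            ask_authors.append(criterion)    # contact the authors for additional information
    return agreed, to_consensus, ask_authors

# Example with two criteria from the therapy / prevention checklist below.
print(reconcile_scores(
    {"randomisation": "positive", "concealment": "unclear"},
    {"randomisation": "positive", "concealment": "negative"},
))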
The following checklists are recommended:
Checklist for methodological quality of therapy / prevention studies
Items:
1) Adequate method of randomisation
2) Concealment of treatment allocation
3) Withdrawal / drop-out rate described and acceptable
4) Co-interventions avoided or equal
5) Blinding of patients
6) Blinding of observer
7) Blinding of care provider
8) Intention-to-treat analysis
9) Compliance
10) Similarity of baseline characteristics
Checklist for methodological quality of prognosis (observational) studies
Items:
1) Adequate selection of study population
2) Description of in- and exclusion criteria
3) Description of potential prognostic factors
4) Prospective study design
5) Adequate study size (> 100 patient-years)
6) Adequate follow-up (> 12 months)
7) Adequate loss to follow-up (< 20%)
8) Relevant outcome measures
9) Appropriate statistical analysis
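Items 5-7 are the only quantitative criteria in this list; as a minimal illustration (variable names are hypothetical, not part of the guideline), the cut-offs could be checked as follows.

# Illustrative sketch of the numeric cut-offs in items 5-7 above.
def quantitative_criteria(patient_years, follow_up_months, loss_to_follow_up):
    # Loss to follow-up is expressed as a fraction (0.15 = 15%).
    return {
        "adequate study size (item 5)": patient_years > 100,
        "adequate follow-up (item 6)": follow_up_months > 12,
        "adequate loss to follow-up (item 7)": loss_to_follow_up < 0.20,
    }

# Example: a cohort contributing 250 patient-years, followed for 24 months,
# with 15% lost to follow-up, meets all three criteria.
print(quantitative_criteria(250, 24, 0.15))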
Checklist for methodological quality of diagnostic studies
Items:
1) Was at least one valid reference test used?
2) Was the reference test applied in a standardised manner?
3) Was each patient submitted to at least one valid reference test?
4) Were the interpretations of the index test and reference test performed independently of each other?
5) Was the choice of patients who were assessed by the reference test independent of the results of the index test?
6) When different index tests are compared in the study: were the index tests compared in a valid design?
7) Was the study design prospective?
8) Was a description included regarding missing data?
9) Were data adequately presented in enough detail to calculate test characteristics (sensitivity and specificity)?
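The test characteristics referred to in item 9 are computed from the standard 2x2 table of index test result against reference standard; the sketch below illustrates this (the counts in the example are hypothetical).

# Illustrative sketch: sensitivity and specificity from a 2x2 table.
def test_characteristics(tp, fp, fn, tn):
    # tp/fp/fn/tn: true/false positives and negatives of the index test
    # against the reference standard.
    sensitivity = tp / (tp + fn)   # proportion of patients with the disorder who test positive
    specificity = tn / (tn + fp)   # proportion of patients without the disorder who test negative
    return sensitivity, specificity

# Example 2x2 table: 40 true positives, 10 false positives,
# 5 false negatives and 45 true negatives.
print(test_characteristics(tp=40, fp=10, fn=5, tn=45))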