FROM:
European Journal of Physiotherapy 2019 (Jul 8); 1–32
Nadège Lemeunier, Minisha Suri-Chilana, Patrick Welsh, Heather M. Shearer, Margareta Nordin, et al.
Institut Franco-Européen de Chiropraxie,
72 chemin de la Flambère,
31300, Toulouse, France.
nlemeunier@ifec.net
The purpose of this study is to determine the reliability and validity of clinical tests used to assess cervical function, muscle strength and endurance in adults with neck pain and its associated disorders (NAD). Systematic review and update of the Bone and Joint Decade 2000–2010 Task Force on NAD. We systematically searched five electronic databases. Eligible reliability and validity studies were critically appraised using the QAREL and QUADAS-2 tools, respectively. Validity studies were ranked according to the Sackett and Haynes classification to determine clinical utility. Early studies of novel tests provide preliminary evidence, and phase III/IV studies are necessary to confirm the validity of tests in clinical practice. We conducted a best evidence synthesis. We screened 7846 citations and critically appraised 28 articles. Eighteen low risk of bias articles provide preliminary evidence of reliability and validity (phase I/II) for the cranio-cervical flexion test and deep cervical extensor (DCE) test in patients with NAD. Only two clinical tests were found to be reliable and valid. Cranio-cervical flexion test and DCE test could assess cervical muscle strength in adults with NAD. However, the evidence is supported by only phase I and II validity studies from the Sackett and Haynes classification.
KEYWORDS: Systematic review, neck pain, functional tests, neck strength, neck endurance, reliability, validity
From the FULL TEXT Article:
Introduction
Neck pain and its associated disorders (NAD) is a common
disorder and can have a significant impact on an individual’s
function and quality of life. NAD is ranked as the fourth leading
cause for disability and 21st for overall burden of disease. [1] The annual prevalence of non-specific NAD is estimated
to be 30–50% globally, with 1.7–11.5% reporting activity-limiting
pain. [2] Furthermore, 50–85% of individuals
reported a second episode of neck pain 1–5 years following
initial onset. [3] Overall, NAD is often recurrent and represents
a significant source of pain and activity limitations in
the working population. [3, 4]
The physical examination of patients with NAD involves
observation, range of motion, palpation, and neurological
examination. Functional tests are employed as additional
examination tools to provide measures of an individual’s abilities
to perform physical tasks such as lifting and overhead
reaching. [5–8] In theory, these tests can provide clinicians
with performance-based outcomes to evaluate daily physical
abilities and inform return to function and clinical goals.
Although functional tests are frequently employed by clinicians,
their reliability and validity remain unclear.
A systematic review by the Bone and Joint Decade
2000–2010 Task Force on Neck Pain and its Associated
Disorders (Neck Pain Task Force) investigated the reliability
and validity of various assessment and diagnostic procedures
for NAD, including neck function, muscle strength and
endurance tests. [9] The Neck Pain Task Force identified:
(1) preliminary evidence that lower functional ability is associated
with higher pain intensity in patients with chronic NAD;
(2) evidence that muscle testing of the neck and upper
extremity has poor reliability which may be due to measurement
error (Kappa < 0.60); and
(3) preliminary evidence that
cervical flexor endurance tests in the supine position may
help differentiate between patients with whiplash-associated
disorder (WAD) grade II and healthy controls. [9]
The search
used by the Neck Pain Task Force included literature published
up to 2006, and an update of the systematic review is
needed to determine the reliability and validity of functional
tests for the assessment of NAD.
Aim
The purpose of our systematic review was to update the
Neck Pain Task Force and determine the reliability and validity
of clinical tests used in the assessment of neck function
in adults aged 18 years or older with NAD grades I–IV. This
review is the last in a series of five systematic reviews updating
the Neck Pain Task Force on assessment of patients with NAD. [10–13] Together, these reviews will inform the development
of a clinical practice guideline for the clinical assessment
of NAD.
Methods
Registration
We registered two review protocols with the International
Prospective Register of Systematic Reviews (PROSPERO) on 2
February 2016 (CRD4201603XXXX for the functional tests section
and CRD4201603XXXX for the muscle strength and
endurance tests section).
Eligibility criteria
Population:
We included studies of adults, 18 years of age
and older, with NAD (grades I–IV) including WAD (grades
I–IV). We defined NAD according to the Neck Pain Task Force
(Supplementary Table S1) [14] and WAD according to the
Quebec Task Force [15] (Supplementary Table S2). NAD
includes non-traumatic neck pain and neck pain subsequent
to a traffic collision (whiplash), with or without its associated
disorders, which include arm pain radiating from the neck
and upper thoracic pain, and/or headache, and/or temporomandibular
joint pain where they are associated with neck
pain. [14]
According to the Neck Pain Task Force, NAD is classified into four grades [14]:
Grade I: Pain of low intensity and related to low levels of
disability and interference with activities of daily living. No
signs or symptoms suggestive of major structural pathology
and no or minor interference with activities of daily living.
Grade II: Pain of high intensity, but associated with low
level of disability and interference with activities of daily
living. No signs or symptoms of major structural pathology,
but major interference with activities of daily living.
Grade III: Pain that is associated with high levels of disability
and moderate limitations in activities of daily living.
No signs or symptoms of major structural pathology,
but presence of neurologic signs such as decreased deep
tendon reflexes, weakness, and/or sensory deficits.
Grade IV: Pain that is associated with high levels of disability and severe limitations in activities of daily living.
Signs or symptoms of major structural pathology, such as fracture, myelopathy, neoplasm, or systemic disease; requires prompt investigation and treatment.
The Quebec Task Force Classification of Grades of Whiplash-associated Disorder [15]:
Grade I WAD: Neck pain and associated symptoms in the absence of objective physical signs.
Grade II WAD: Neck pain and associated symptoms in the presence of objective physical signs and without evidence of neurological involvement.
Grade III WAD: Neck pain and associated symptoms with evidence of neurological involvement including decreased or absent reflexes, decreased or limited sensation, or muscular
weakness.
Grade IV WAD: Neck pain and associated symptoms accompanied by fracture and dislocation.
Interventions
We limited our review to studies assessing the reliability and
validity of neck function, muscle strength and endurance
tests used to assess NAD patients. Reliability refers to the
ability of a test to give an equivalent result with repeated
application in a person with a particular level of a disease. [16] Reliability can be measured within (intra-rater) and
between (inter-rater) individuals performing a test. We also
considered test–retest reliability which is defined as the stability
of a clinical phenomenon in subjects who are supposed
to have not changed. Validity refers to the degree to which
persons with or without the condition under study are correctly
categorised. [16] Construct validity is the degree to
which a test measures what it purports to measure, while criterion
validity compares a measure to a gold standard. [16]
Definition of functional tests
The definition of a functional test is adapted from Solway
et al. [8] Functional tests are measures of functional status
and capacity, referring primarily to the ability to undertake
physically demanding activities of daily living or work-related
tasks. [8] Examples of functional tests include, but are not
limited to, the assessment of lifting, stepping, hopping or
general movement (e.g. walking, running, or gait). We
included studies that assessed home and work-related function
and functional capacity evaluations. We excluded active
and passive range of motion tests and orthopaedic tests, which
were reported in other reviews. [10, 13]
Definition of muscle strength and endurance tests
The National Strength and Conditioning Association of
America defines muscle strength as the maximal force that a
muscle or muscle group can generate at a specified velocity
or as an isometric contraction. [17] Muscle endurance is
defined as the time limit of a person’s ability to maintain an
isometric force or a power level involving combinations of
concentric and/or eccentric muscular contractions. [17] Tests
of neck strength and endurance include but are not limited
to manual muscle testing, dynamometry, and endurance
tests. [7, 18–20]
Study characteristics
To be included in the systematic review, studies met the following
inclusion criteria:
(1) English or French language;
(2) published from 1 January 2005 to 7 November 2017;
(3) published
in a peer-reviewed journal;
(4) reliability or validity
studies of neck functional tests, or of muscle strength and/or
endurance tests; and
(5) study population including adults (18
years of age or older) with grades I–IV neck pain (including
non-traumatic neck pain and neck pain subsequent to a traffic
collision) with or without its associated disorders.
If studies
included a mixed population with individuals less than 18
years of age, results must be stratified for adults 18 years of
age and older. In studies with multiple diagnostic assessments
or tests (e.g. strength, range of motion, and palpation),
results must be stratified for each test.
We excluded studies meeting any of the following criteria:
(1) publication types including guidelines, letters, editorials,
commentaries, unpublished manuscripts, dissertations, government
reports, books and book chapters, conference proceedings,
meeting abstracts, lectures and addresses,
consensus development statements, guideline statements;
(2) study designs including systematic and non-systematic
reviews, and case studies;
(3) cadaveric or animal studies;
(4) studies only targeting individuals with serious pathology or
systemic diseases (including but not limited to fractures, dislocations,
myelopathy, neoplasms, and infection);
(5) sample
size less than 20 per group;
(6) studies utilising devices that
are not commonly used or very expensive for a typical clinical
practice (e.g. electromyography [EMG]).
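The main inclusion and exclusion criteria above can be read as a single screening predicate. The following is a minimal sketch only, assuming a hypothetical `StudyRecord` structure whose field names are invented for illustration; it covers the criteria that are easy to encode and does not reproduce the reviewers' actual screening forms or their judgement on publication type.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical record; field names are illustrative and not taken from the review.
@dataclass
class StudyRecord:
    language: str                     # e.g. "English" or "French"
    published: date                   # publication date
    peer_reviewed: bool
    design: str                       # e.g. "reliability", "validity", "review"
    smallest_group_size: int          # participants in the smallest group
    adult_results_available: bool     # adults only, or results stratified for adults
    serious_pathology_only: bool      # e.g. fractures, myelopathy, neoplasms
    uses_uncommon_device: bool        # e.g. electromyography

# Search window stated in the Methods.
SEARCH_START = date(2005, 1, 1)
SEARCH_END = date(2017, 11, 7)

def is_eligible(study: StudyRecord) -> bool:
    """Apply the encodable inclusion and exclusion criteria to one record."""
    return (
        study.language in {"English", "French"}
        and SEARCH_START <= study.published <= SEARCH_END
        and study.peer_reviewed
        and study.design in {"reliability", "validity"}
        and study.smallest_group_size >= 20
        and study.adult_results_available
        and not study.serious_pathology_only
        and not study.uses_uncommon_device
    )
```

In practice each criterion was judged by paired reviewers, so a predicate like this would only serve as a checklist, not as a replacement for screening.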
Data sources and searches
We developed a search strategy in consultation with a health
sciences librarian, which was reviewed by a second librarian.
We systematically searched the following electronic databases
from 1 January 2005 to 7 November 2017: MEDLINE,
Cochrane Central Register of Controlled Trials, CINAHL, and
PubMed. We also searched SPORTDiscus for the muscle
strength and endurance strategy. Search terms consisted of
subject headings specific to each database (e.g. MeSH in
MEDLINE) and free text words relevant to
(1) NAD or WAD I–IV,
(2) diagnosis/validity/reliability/reproducibility, and
(3) neck muscle strength and/or endurance, or functional test and/or visual inspection (Supplementary Tables S3 and S4).
Visual inspection findings were reported in a separate review. [13] We first developed the search strategy in MEDLINE and
subsequently adapted the search to the other bibliographic
databases. Our search overlapped the Neck Pain Task Force search by one
year to ensure studies were not missed during this period.
Study selection
We exported all citations identified by the search strategy
into EndNote for reference management and tracking of the
screening process. Eight pairs of independent reviewers
screened articles in two stages. Stage one involved screening
of titles and abstracts for relevant and possibly relevant
citations based on the inclusion and exclusion criteria.
Citations deemed possibly relevant from the first stage were
reviewed in the second stage using the full text article.
Disagreements were resolved by discussion between the
paired reviewers to reach consensus. If consensus could not
be reached, the citation was independently screened by a
third reviewer and discussed with the other two reviewers to
reach consensus.
Assessment of risk of bias
Pairs of reviewers (thirteen pairs in total) independently critically
appraised all relevant studies. We assessed the internal
validity of each study using the modified Quality Appraisal
Tool for Studies of Diagnostic Reliability (QAREL) [21] criteria
for diagnostic reliability studies, and the modified Quality
Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) criteria
for diagnostic accuracy studies. [22] We modified the
QAREL and QUADAS-2 instruments to include
(1) a question on whether the study objective was clear;
(2) not applicable options for certain questions (QAREL items 3, 4, 5, 6 and 8; QUADAS-2 items 3.1, 3.2, 3.3 and 3.B); and
(3) the Sackett and Haynes classification (in the QUADAS-2 instrument) described below. [23]
Based on these critical appraisal criteria,
a study was considered low risk of bias if reviewers
agreed that selection bias (questions 2 and 3 of the QAREL
checklist and questions in domain 1 for the QUADAS-2
checklist) and measurement bias (questions 4–10 of the
QAREL checklist and questions in domains 2 and 3 for the
QUADAS-2 checklist) did not threaten the internal validity of
a study.
Consensus between reviewers was reached through discussion
and an independent third reviewer was involved
when consensus could not be reached. We contacted
authors if additional information was needed to ensure the
critical appraisal was accurate. Following appraisal, we considered
studies with adequate internal validity as low risk of
bias and included these studies in the best evidence synthesis.
We classified each low risk of bias study according to the
classification system described by Sackett and Haynes. [23]
Phase I studies assess differences in the results of the diagnostic
test between patients and healthy individuals. Phase II
studies assess the association between test results compared
to a reference standard in patients diagnosed with the condition
(i.e. NAD), whereas phase III assess a test’s ability to
perform in a population with the suspected condition.
Finally, phase IV examines whether patients who were
assessed with the test have better outcomes than untested
individuals. [23] Early studies of novel tests provide preliminary
evidence of clinical utility, and phase III or IV studies are
needed to inform the validity and utility of a test in clinical
practice.
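The phase classification described above can be summarised in a small lookup structure. This is an illustrative sketch, assuming hypothetical names (`Phase`, `PHASE_QUESTION`, `evidence_is_preliminary`); the one-line questions paraphrase the text above and the helper simply encodes the rule that evidence remains preliminary until a phase III or IV study exists.

```python
from enum import IntEnum

class Phase(IntEnum):
    """Sackett and Haynes phases, as summarised in the text above."""
    I = 1
    II = 2
    III = 3
    IV = 4

# Paraphrased question addressed by each phase (wording is ours, not the source's).
PHASE_QUESTION = {
    Phase.I: "Do test results differ between patients and healthy individuals?",
    Phase.II: "Are test results associated with a reference standard in diagnosed patients?",
    Phase.III: "Does the test perform in a population with the suspected condition?",
    Phase.IV: "Do tested patients have better outcomes than untested individuals?",
}

def evidence_is_preliminary(phases):
    """Return True when no phase III or IV study is available for a test."""
    return not any(p >= Phase.III for p in phases)
```

For example, a test supported only by phase I and II studies would be flagged as preliminary under this rule.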
Data extraction and synthesis of results
Two reviewers (M. S., P. W.) extracted data from low risk of
bias studies to build evidence tables. A second reviewer
checked the extracted data (H. S., J. W., or N. L.). Meta-analysis
would be performed in the event that the accepted studies
were statistically and clinically homogenous. In the case of
heterogeneity, a qualitative synthesis of findings from the
studies with a low risk of bias would be performed to develop
evidence statements according to the principles of best evidence
synthesis. [24] Specifically, the research team used evidence
tables to outline the best evidence on each topic,
identify consistencies and inconsistencies in this evidence, and
formulate summary statements to describe the body of evidence
and compare the results to the Neck Pain Task Force findings. [9]
Statistical analyses
We computed the inter-rater reliability for the screening of
articles using the kappa coefficient (k) and 95% confidence
intervals (CI). [25] We also calculated the percentage agreement
for classifying studies into high or low risk of bias following
independent critical appraisal.
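As an illustration of the two agreement statistics named above, the sketch below computes Cohen's kappa with an approximate 95% CI and the raw percentage agreement for two raters. It assumes the unweighted kappa and a simple large-sample standard error; the source does not state which variance formula was used, and the screening decisions in the example are invented.

```python
import math
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Proportion of items on which the two raters gave the same label."""
    return sum(x == y for x, y in zip(rater_a, rater_b)) / len(rater_a)

def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa with an approximate 95% confidence interval.

    The standard error is a simple large-sample approximation (an assumption,
    not necessarily the formula used in the review). Undefined when chance
    agreement equals 1, i.e. both raters use one identical category throughout.
    """
    n = len(rater_a)
    p_obs = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from the product of the raters' marginal proportions.
    p_exp = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    kappa = (p_obs - p_exp) / (1 - p_exp)
    se = math.sqrt(p_obs * (1 - p_obs) / (n * (1 - p_exp) ** 2))
    return kappa, (kappa - 1.96 * se, kappa + 1.96 * se)

# Invented screening decisions for 50 citations (not data from the review).
a = ["in"] * 25 + ["out"] * 25
b = ["in"] * 20 + ["out"] * 5 + ["in"] * 10 + ["out"] * 15
kappa, ci = cohen_kappa(a, b)
```

Percentage agreement ignores agreement expected by chance, which is why kappa is usually reported alongside it.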
Reporting
Our review complies with the Preferred Reporting Items for
Systematic Reviews and Meta-Analyses (PRISMA) statement [26] and Statement for Reporting Studies of Diagnostic
Accuracy (STARD). [27]
Results
Study selection
We identified 10,085 citations, removed 2239 duplicates, and
screened 7,846 articles for eligibility (Figures 1 and 2).
We screened 165 citations for eligibility using full text and 129
were excluded due to
(1) sample size less than 20 (n = 22);
(2) irrelevant outcomes (n = 49);
(3) ineligible study population (n = 27);
(4) ineligible study design (n = 15);
(5) ineligible publication type (n = 14);
(6) ineligible language (n = 1); or
(7) ineligible device (i.e. equipment too sophisticated for typical clinical practice) (n = 1).
We critically appraised 36 articles, of which nine focussed only on
visual inspection and were reported in a separate review. [13]
The remaining 27 articles appraised include 28 studies as
articles could explore both reliability and validity. Eighteen
articles (reporting on 19 studies) were low risk of bias, which
included nine reliability studies [28–36] and 10 validity studies [37–45]; among which one article assessed both reliability
and validity. [34] Nine studies were deemed to be high risk
of bias [37, 42, 45, 47–52] and excluded from the best evidence
synthesis; these include the reliability sections of two articles
whose validity sections were low risk of bias [42, 45] and the
muscle strength study of a low risk of bias article. [37]
The inter-rater reliability for screening of articles was
k = 0.98 (95% CI 0.67; 0.84) for the visual inspection and
functional tests section, and k = 0.88 (95% CI: 0.80–0.96) for
the muscle strength and endurance tests section. The visual
inspection findings were reported in a separate review. [13]
In total, the percentage agreement for independent critical
appraisal of studies (high versus low risk of bias) was 75%
(21/28). A meta-analysis was not possible due to clinical
heterogeneity between studies; we therefore conducted a
best evidence synthesis.
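The counts reported in this section can be checked with simple arithmetic. The sketch below uses only numbers stated in the text; the variable names are ours.

```python
# Counts taken from the study selection text above.
identified = 10085
duplicates = 2239
screened = identified - duplicates            # titles and abstracts screened

full_text_screened = 165
exclusion_reasons = {
    "sample size < 20": 22,
    "irrelevant outcomes": 49,
    "ineligible population": 27,
    "ineligible design": 15,
    "ineligible publication type": 14,
    "ineligible language": 1,
    "ineligible device": 1,
}
excluded = sum(exclusion_reasons.values())
appraised = full_text_screened - excluded     # articles critically appraised

visual_inspection_only = 9                    # reported in a separate review
remaining = appraised - visual_inspection_only

# Agreement on high versus low risk of bias: 21 of 28 studies.
agreement = 21 / 28
```

Each derived value matches the figure given in the text (7846 screened, 129 excluded, 36 appraised, 27 remaining, 75% agreement).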
Study characteristics
Of the nine reliability studies with low risk of bias, five examined
inter-rater reliability [28, 29, 31, 33, 36], one examined
intra-rater reliability [32], and three examined both. [30, 34, 35]
Reliability was studied for neck:
(1) function (n = 3) [28–30];
(2) muscle strength (n = 2) [34, 35]; and
(3) muscle endurance (n = 6). [31–36]
In these articles, neck pain
was defined as chronic NAD I–II [30–32, 34, 35], NAD I–II [29],
or NAD I–III [28] of unknown duration, and chronic NAD
I–III. [33, 36]
Of the 10 validity studies with low risk of bias,
eight were phase I studies [34, 37, 39–41, 43–45],
of which four included a phase II component [34, 39, 41, 43]
and two studies were phase II only. [38,42]
The validity was studied for neck:
(1) function (n = 3) [37–39];
(2) muscle strength (n = 4) [34, 40, 42, 45]; and
(3) muscle endurance (n = 4) [34, 41, 43, 44]
in patients with NAD.
Seven studies targeted NAD grades I
and II [34, 37, 38, 40, 42, 44, 45], one targeted NAD grades I–III [43], and two targeted NAD grade III. [39,41]
The studies were conducted in the United States [31, 33, 36], Australia [32, 42], Canada [39], Denmark [34, 35],
Hong Kong [40], Portugal [44], Spain [30, 45], Sweden [28, 37, 41, 43], and Switzerland [29, 38]. The examiners were physical therapy students [34] or trained physiotherapists [28–31, 33, 35–37, 39, 42, 46], while the remaining studies did not report the background of the examiners. [32, 38, 40, 41, 43, 44]
A variety of functional tests were used:
(1) active cervical
movement control tests (ROM) in various positions (n = 3) [28–30];
(2) active shoulder movement control tests (n = 3) [28–30];
(3) handgrip strength (n = 2) [37, 38]; and
(4) lifting overhead, overhead working and repetitive reaching (n = 2) [38, 39] or other functional tests (n = 2). [29, 30]
Tests of neck muscle strength included the cranio-cervical
flexion test (CCFT) performed with pressure biofeedback [34, 35, 40, 42] and cervical muscle strength testing. [45]
Tests of neck muscle endurance included the chin tuck
neck flexion test [31–33, 44], neck flexor muscle endurance
(NFME) test [35], neck extensor test (NET) [32, 35, 36, 44], deep
cervical extensor (DCE) test [34], and neck muscle endurance
(NME) test. [41, 43] All tests are described in the glossary
(Table 1).
Risk of bias within studies
All reliability studies with low risk of bias had clear study
objectives as well as appropriate sample selection, inter-rater
blinding, time intervals between measurements, interpretation
of tests, and use of statistics (Table 2).
However, some studies did not provide
(1) clear information regarding raters (n = 2) [29, 32];
(2) adequate information regarding intra-rater blinding (n = 2) [34, 35];
(3) a clear description of blinding to the reference standard (n = 1) [35];
(4) a clear description of blinding to clinical information (n = 2) [33, 36] or
to additional cues, such as scars or unique identifying features
on imaging films (n = 7) [29, 30, 32–36]; and
(5) information on, or any variation in, the order of examination
(n = 5). [30, 31, 33, 34, 36]
All validity studies with low risk of bias had appropriate
exclusion criteria, reference standard and time interval
(Table 3). However, some studies did not provide clear information
regarding patient flow (n = 3) [40, 41, 43] and blinding
to reference standard or index results (n = 3) [38, 41, 43]
(Table 3).
Nine studies were deemed to be high risk of bias [37, 42, 44, 46–51] and excluded from the best evidence synthesis;
these include the reliability sections of two articles whose validity
sections were low risk of bias [42, 44] and the muscle strength study of a low
risk of bias article. [37]
These studies were excluded due to
(1) inappropriate rater selection [44];
(2) absence of blinding [44, 49, 50];
(3) incorrect interpretation of results [51];
(4) unexplained
study flow [46, 51];
(5) inappropriate reference standards
(not reliable or valid) [48]; or (6) insufficient information
in the Methods section. [47]
Summary of evidence
Reliability of neck functional tests
Three studies assessed the reliability of functional tests used
for neck pain assessment. Two examined inter-rater reliability
only [28, 29] and one examined both inter- and intra-rater
reliabilities. [30] Evidence from three studies supports the
reliability of active cervical and arms control tests in the
assessment of NAD I and II (Table 4). However, the reliability
findings of active shoulder tests for the assessment of NAD
I–III patients showed important measurement errors [28]
(Table 4).
Active cervical and shoulder control tests.
Patroncini et al.
examined the reliability of active movement control tests of
the cervical spine and upper limb in individuals with NAD I
and II of unknown duration. Movement control tests examined
whether there was impairment in the control of movement
during functional activities. [29] Statistically significant
differences of clinical importance for diagnosis were shown
for all index tests. Specifically, active cervical motion produced
similar results for all ranges, with higher reliability
shown for nodding movement with head on the wall
(k = 0.80, 95% CI: 0.55–1.00), chin protraction–retraction
(k = 0.91, 95% CI: 0.75–1.00), and neck flexion when supine
(k = 0.81, 95% CI: 0.61–1.00). Unilateral arm
flexion (k = 0.74, 95% CI: 0.47–0.95) and bilateral shoulder
elevation (k = 1.0) also showed inter-rater reliability
(Table 4). [29]
Segarra et al. examined chronic NAD I-II patients in comparison
to patients with musculoskeletal disorders other than
the cervical spine. [30] Both inter- and intra-rater reliabilities were
assessed for active cervical and shoulder ROM in various
body positions. [30] Inter-rater reliability was assessed by
video recordings with a 2–week period between ratings. All
tests were shown to have significant reliability in seated,
standing and 4–point kneeling positions (Table 4). Inter-rater
reliability was greatest for active cervical rotation (k = 0.81,
95% CI: 0.58–1.00), active upper cervical rotation in 4–point
kneeling (k = 0.80, 95% CI: 0.66–0.93), active cervical extension
while seated (k = 0.73, 95% CI: 0.29–0.91), and active
bilateral arm flexion while standing (k = 0.71, 95% CI:
0.44–0.93). The same tests were shown to have intra-rater
reliability (ICC ranging from 0.70 (95% CI: 0.49–0.83) to 0.94 (95% CI: 0.90–0.97)),
along with active cervical extension (ICC = 0.92; 95% CI:
0.86–0.95) in 4–point kneeling (Table 4). [30]
A recent study by Aasa et al. assessed the inter-rater reliability
of index tests in patients with NAD I-III of unknown duration and healthy
age-matched controls. [28] Index tests included: active maximal
neck extension, rotation, active scapulo-/gleno-humeral
medial rotation in the scapular plane, serratus anterior and
lower trapezius control tests. Serratus anterior testing
involved downward rotation, elevation and retraction of the
scapula in 4–point kneeling position. Lower trapezius control
was tested with downward elevation and retraction of the
scapula while prone.
Raters observed video recordings independently.
Inter-rater reliability of all index tests was statistically
significant with important measurement errors. For
expert raters, greater inter-rater reliability was found for
active neck rotation and serratus anterior tests (k = 0.89; SE
= 0.08; p<0.01) (Table 4). For the novice pair, statistically
significant results were found for gleno-humeral medial rotation
right (k = 0.60; SE = 0.18; p<0.01) and left (k = 0.62; SE
= 0.20; p<0.01), and lower trapezius control tests (k = 0.58;
SE = 0.17; p<0.01). Experts were found to have a higher
inter-rater reliability compared to novice raters on all tests,
except right gleno-humeral medial rotation (Table 4). [28]
Active movement control tests of the upper limb.
Patroncini et al. examined the reliability of active movement
control tests of the upper limb in individuals with NAD I-II of
unknown duration. [29] Movement control tests that
assessed upper body forward-backward motion (k = 0.84,
95% CI: 0.68–0.94) and forward bending in standing (k = 1.0)
demonstrated inter-rater reliability, as well as weighted arm
flexion to 90° (k = 0.85, 95% CI: 0.55–1.00) (Table 4). [29]
Segarra et al. examined chronic NAD I-II patients in comparison
to patients with musculoskeletal disorders other than
the cervical spine. [30] Both inter- and intra-rater reliabilities
were assessed for rocking backwards in 4–point kneeling. [30]
This test was shown to have significant reproducibility, with
higher intra-rater reliability (k ranging from 0.78 (95% CI: 0.54–0.99) to
0.80 (95% CI: 0.54–1.0)) than inter-rater reliability (k = 0.36; 95% CI:
0.12–0.68) (Table 4). [30]
Validity of neck functional tests
Three articles examined the validity of functional tests used
for NAD patients. [37–39] Two phase I and two phase II studies
provide preliminary evidence that active shoulder control
tests, handgrip strength and tests such as lifting overhead,
overhead working and repetitive reaching may be helpful for
the assessment of NAD I-II patients (Table 5). However, the
clinical accuracy of these tests is not known.
Active shoulder control tests.
Juul-Kristensen et al. designed
a phase I study comparing computer workers with recurrent
neck and shoulder trouble with healthy workers reporting little
or no neck or shoulder dysfunction in the last year. [37]
Functional tests evaluated were maximum voluntary contraction
(MVC) of shoulder elevation. Construct validity was
assessed by determining the mean difference in functional
outcomes between groups (symptomatic-control). There was
a statistically significant difference of clinical importance for
diagnosis between groups (Table 5). Participants with self-reported
neck trouble had decreased shoulder elevation
compared to their asymptomatic colleagues (mean differences
symptomatic versus asymptomatic of –44.00 N (95% CI: –51.38, –36.62) and –66.00 N (95% CI: –77.46, –54.54)
for the right and the left side, respectively). [37]
Handgrip strength.
Juul-Kristensen et al. also evaluated
handgrip strength using a dynamometer. [37] Construct validity
was assessed by determining the mean difference in
functional outcomes between groups (symptomatic-control).
Participants with self-reported neck trouble had decreased
right-handed grip strength compared to their asymptomatic
colleagues. However, they found there was no statistically
significant difference between groups using left handgrip
strength (Table 5). [37]
A phase II study by Trippolini et al. measured the construct
validity of functional tests in patients with persistent
NAD I and II. [38] Participants performed tests by incrementally
increasing weight until reaching their maximal ability.
Correlations were calculated between handgrip strength and the reference
standards: numeric rating scale (NRS) for pain, spinal
function sort (SFS) for functional ability, neck disability
index (NDI) for disability, and the hospital anxiety and
depression scale (HADS-A/D) for anxiety and depression.
For handgrip strength (in kgF), all correlations
with reference standards were statistically and clinically significant,
indicating that decreased grip strength was associated
with increased pain, disability, anxiety and depression (Pearson’s r ranging from –0.28 (95% CI: –0.38 to –0.17) to –0.25 (95% CI: –0.35 to –0.15)) and decreased functional ability (Pearson’s r = 0.38
(95% CI: 0.28–0.47)) (Table 5). [38] Statistically significant gender
differences were also shown, with all favouring greater
abilities in males (Table 5). [38]
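Confidence intervals for a Pearson correlation, such as those quoted above, are commonly obtained with the Fisher z-transformation. The sketch below shows that calculation under the assumption that this standard method applies; the source does not state how its intervals were computed, and the sample size in the example is a made-up value, not one reported by the study.

```python
import math
from statistics import NormalDist

def pearson_r_ci(r, n, level=0.95):
    """Approximate confidence interval for Pearson's r via Fisher's z.

    r: observed correlation; n: number of paired observations (n > 3).
    """
    z = math.atanh(r)                       # Fisher z-transformation
    se = 1.0 / math.sqrt(n - 3)             # standard error on the z scale
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)

# Hypothetical example: r = 0.38 with an assumed n = 300 participants.
low, high = pearson_r_ci(0.38, 300)
```

An interval that excludes zero corresponds to a correlation that is statistically significant at the chosen level, which is how the intervals above should be read.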
Lifting overhead, overhead working and repetitive reaching.
A phase II study by Trippolini et al. measured the construct
validity of functional tests in patients with persistent
NAD I and II. [38] The correlation between lifting overhead,
overhead working, and repetitive reaching and reference
standards were calculated. Reference standards included:
numeric rating scale (NRS) for pain, spinal function sort (SFS)
for functional ability, neck disability index (NDI) for disability,
and the hospital anxiety and depression scale (HADS-A/D) for
anxiety and depression. Lifting overhead (kg), working overhead,
and repetitive reaching significantly correlated with all
reference standards indicating decreased functionality was
associated with increased pain, disability, anxiety, depression
and decreased functional ability (Table 5). [38] Statistically
significant gender differences were also shown for lifting and
repetitive reaching, all favouring greater abilities in males
(mean differences between male and female from 3.80 kg
(95% CI: 2.57–5.03) to 8.20 s (95% CI: 2.23–14.17), respectively)
(Table 5). [38]
A phase I and II validity study assessed functional
impairment tests in patients with WAD II and controls. [39]
Three timed tasks were performed:
(1) the waist-up test consisting
of grabbing, lifting, moving, and placing containers
on waist-level and 25 cm above waist level shelves;
(2) the
same task except that the two shelves are placed at eye level
and 25 cm below; and
(3) an overhead work task.
There was
a significant mean difference between groups (WAD II versus
controls) for all tasks (p<0.001). Performance scores of the
tasks were negatively correlated with pain intensity (NPRS) (Spearman’s r from –0.37 to –0.46), neck disability (NDI)
(Spearman’s r from –0.32 to –0.43), and arm and shoulder disability
(DASH) (Spearman’s r from –0.25 to –0.36), and positively correlated
with cervical range of motion (CROM)
(Spearman’s r from 0.01 to 0.51). [39]
Reliability of neck muscle strength tests
Two low risk of bias studies provide evidence of inter-rater
and intra-rater reliabilities for the CCFT in patients with NAD
I and II and healthy controls [34, 35] (Table 4). Evidence from
these studies supports the reliability of Cranio-Cervical
Flexion Test in the assessment of NAD I and II patients.
Cranio-cervical flexion test.
The inter-rater reliability intraclass correlation coefficient (ICC) ranged from 0.63 (95% CI:
0.41–0.78) to 0.82 (95% CI: 0.67–0.91) measured at two
minutes. The intra-rater reliability ranged from 0.70 (95% CI:
0.43–0.85) to 0.86 (95% CI: 0.72–0.93), with measurements
occurring between one and seven days. [34]
Juul et al.
(2013) reported an inter-rater reliability ICC ranging between
0.85 (95% CI: 0.76–0.91) and 0.86 (95% CI: 0.81–0.93) measured
at ten minutes. The intra-rater reliability ranged from
0.69 (95% CI: 0.53–0.80) to 0.81 (95% CI: 0.70–0.88) with
measurements taken at 1 and 3 working days (Table 4). [35]
Validity of neck muscle strength tests
Three low risk of bias studies provide evidence of validity
(phases I and II) for the CCFT as a measure of neck muscle
strength in patients with NAD/WAD grades I and II (Table 5). [34, 40, 42] Results from these studies provide preliminary evidence
that the cranio-cervical flexion test may be helpful in
the assessment of NAD and WAD I and II patients. Another
phase I study reported preliminary evidence of validity for a cervical
muscle strength test in the assessment of patients with NAD I-II. [45] However, the clinical accuracy of these tests is
not known.
Cranio-cervical flexion test (CCFT).
Two phase I validity
studies reported a significant difference in CCFT scores
[1.71/30 mmHg (95% CI: 0.22–3.21); p = 0.03] [34] and performance
(p<0.001) [40] between patients with NAD I and II
and healthy controls (Table 5). A phase II validity study
reported a statistically non-significant negative correlation
between muscle strength (as measured by CCFT activation
score and performance index) and both pain intensity and
disability (as measured by the Visual Analogue Scale and
Neck Disability Index respectively) (Table 5). [42] Similarly,
Jorgensen et al. (phase II study) reported a negative correlation
between the CCFT and both the Neck Disability Index
and Numeric Rating Scale, and a positive correlation with the
SF36-Physical Component Score (Table 5). [34]
Cervical muscle strength test.
A phase I validity study
reported a significant median difference in cervical muscle
strength between NAD I and II patients and a control group
(from 3.25 kg (95% CI: 1.75–4.76); p<0.05 in lateral flexion to
4.82 kg (95% CI: 2.93–6.71); p<0.05 in extension) [45]
(Table 5).
Reliability of NME tests
Six studies reported on the reliability of five NME tests
(Table 4). [31–36] Evidence from these studies supports the
reliability of the chin tuck neck flexion test, the neck extensor endurance
test, and the DCE test in the assessment of NAD I and II
patients, and the reliability of the NFME and NE tests in the
assessment of NAD I–III patients.
Chin tuck neck flexion test.
Cleland et al. (2006) reported
the inter-rater reliability [ICC = 0.57 (95% CI: 0.14–0.81)] of
the chin tuck neck flexion test among adult patients with
NAD grades I and II, with a mean symptom duration of 69 days [31]
(Table 4).
Two articles studied a test similar to the chin tuck neck
flexion test (described in Table 1). One study measured neck
flexor endurance in patients with NAD I and II of at least 6
months’ duration. [32] The intra-rater reliability coefficient
was ICC = 0.93 (95% CI: 0.86–0.97), measured three days later. [32] Hanney et al. reported on the inter-rater reliability of
the neck flexor endurance test in patients with NAD grades
I–III with a mean symptom duration of 259 days, ICC = 0.70
(95% CI: 0.40–0.87). [33]
NFME test.
One study provided evidence of the intra-rater
reliability of the NFME test in patients with NAD I and II and
healthy controls. [35] Juul et al. (2013) measured neck flexor
endurance in supine and while seated.
The intra-rater reliability
ICC ranged from 0.68 (95% CI: 0.52–0.80) to 0.75 (95% CI:
0.61–0.85) in supine, and between 0.42 (95% CI: 0.18–0.60)
and 0.59 (95% CI: 0.40–0.73) when seated. [35] Juul et al.
also reported the inter-rater reliability of the NFME in supine
ranging from 0.73 (95% CI: 0.59–0.83) to 0.70 (95% CI:
0.55–0.81), and between 0.56 (95% CI: 0.37–0.71) and 0.74
(95% CI: 0.56–0.84) when seated. [35]
NET.
Three studies provide evidence of the reliability of the NET. [32, 35, 36] Edmondston et al. reported the intra-rater reliability for patients with NAD I and II of greater than 6 months duration [ICC = 0.88 (95% CI: 0.75–0.95)]. [32] Juul et al. (2013) reported that intra-rater ICC ranged between 0.41 (95% CI: 0.17–0.60) and 0.14 (95% CI: –0.17 to 0.37) for patients with NAD I and II of greater than 4 weeks duration. [35] The inter-rater reliability of the NET ranged between 0.19 (95% CI: –0.06 to 0.42) and 0.25 (95% CI: –0.01 to 0.47) [35] (Table 4).
Sebastian et al. provided evidence of the inter-rater reliability for the cervical extensor endurance test in patients with NAD I and III of unknown duration [kappa = 0.80 (95% CI: 0.59–1.01)] [36] (Table 4).
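Inter-rater agreement on a categorical test outcome is commonly summarised with a kappa coefficient, which corrects observed agreement for agreement expected by chance. The Python sketch below illustrates Cohen's kappa for two raters. It is only a minimal illustration: the function name `cohen_kappa` and the ratings are hypothetical, the data do not come from any included study, and the included studies may have used a different kappa variant or software.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical ratings of the same patients."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of patients given the same rating by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over categories.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    categories = set(count_a) | set(count_b)
    p_e = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings of 10 patients ("pos" = test positive, "neg" = test negative).
rater_a = ["pos", "pos", "pos", "pos", "pos", "neg", "neg", "neg", "neg", "neg"]
rater_b = ["pos", "pos", "pos", "pos", "neg", "neg", "neg", "neg", "neg", "pos"]

kappa = cohen_kappa(rater_a, rater_b)
print(round(kappa, 2))  # observed agreement 0.8, chance agreement 0.5 -> 0.6
```

In this made-up example the raters agree on 8 of 10 patients, but half of that agreement would be expected by chance, so kappa is 0.6 rather than 0.8.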
DCE test.
A study by Jorgensen et al. assessed the DCE test
in patients with NAD grades I and II and healthy controls. [34] The inter-rater reliability ranged from 0.75 (95% CI: 0.55–0.87) to 0.76 (95% CI: 0.59–0.86) with an intra-rater reliability
ranging from 0.77 (95% CI: 0.55–0.89) to 0.90 (95% CI: 0.79–0.95) [34] (Table 4).
Validity of neck endurance tests
Four articles provided evidence of phase I/II validity of the
NME test, DCE test or neck flexor and
extensor tests. [34, 41, 43, 44] These studies provide
preliminary evidence for the validity of the chin tuck neck
flexion, neck extensor, and DCE tests in NAD I and II patients and
of NME tests in NAD I–III patients (Table 5). However, the clinical
accuracy of these tests is not known.
Chin tuck neck flexion test.
A phase I validity study
reported a median difference (in seconds) in deep neck
flexor endurance between NAD I and II patients (18.82 s
(interquartile range: 8.08)) and a control group (26.29 s (interquartile range:
24.13)) [44] (Table 5).
NET.
A phase I validity study reported a median difference
(in minutes) in deep neck extensor endurance between NAD
I and II patients (3.44 min (interquartile range: 3.03)) and a control
group (3.54 min (interquartile range: 2.04)) [44] (Table 5).
NME test.
Two studies provide evidence of validity of the
NME test. [41, 43] Halvorsen et al. examined the construct validity
(phase II) of the NME test compared to the Visual
Analogue Scale, Neck Disability Index (NDI), and Tampa Scale
of Kinesiophobia (TSK). [41] Results suggested that patients
with NAD III had a significantly reduced NME time compared
to healthy controls (prone: p < 0.01; supine: p = 0.017) (Table
5). [41] Peolsson et al. investigated the construct validity
(phase II) of the NME test compared to the VAS and NDI in
patients with NAD I–III, cervical disc disease, and healthy
controls. [43]
There was a negative correlation between
(1) NME test performed in a dorsal position and Visual Analogue
Scale for pain intensity (Pearson’s r = 0.30; p = 0.01); and
(2) NME test performed in ventral position and the Neck
Disability Index (Pearson’s r = 0.23; p = 0.07), which demonstrates
that as neck pain and disability increase, neck endurance
decreases [43] (Table 5).
DCE test.
A phase I validity study reported a significant
median difference in time on the deep extension endurance
between NAD I and II patients and a control group
(29.21 seconds; p = 0.06) [34] (Table 5).
Jorgensen et al. also reported on the construct validity of
the DCE test compared to the NDI, SF-36-PCS, and the
Numeric Rating Scale in NAD I and II patients. [34] There was
a negative correlation between the DCE test and both the Neck
Disability Index and the Numeric Rating Scale,
and a positive correlation between the test and the SF-36-PCS
(Table 5). [34]
Discussion
Summary of the results and update of the neck pain task force findings
Only tests with both reliability and validity findings could be
used in clinic. For functional testing as an assessment tool,
our review suggests inter- and intra-rater reliability for active
movement control tests of the cervical spine and upper
extremity. In particular, bilateral shoulder elevation/flexion
and forward bending when standing produced a perfect
inter-rater reliability score in NAD II patients. The evidence
suggests inter-rater reliability for active cervical rotation and
scapular medial and downward rotation as well. However,
the evidence suggests some important measurement errors
and the clinical accuracy of these tests is not known.
For muscle and endurance tests, our results demonstrate
preliminary evidence of reliability and phase I and II validity
for the cranio-cervical flexion test and deep cervical extensor
test in the assessment of patients with NAD. In addition,
evidence for reliability and preliminary validity of NME
tests including both neck flexor tests and extensor tests was
identified. However, the clinical accuracy of these tests is
not known.
Update of the Bone and Joint Decade 2000–2010 Task Force on neck pain’s systematic review
For functional tests, the NPTF reported on one study that
showed some evidence for construct validity (phase II). [9]
Ljungquist et al. showed that patients with NAD I-II had less
lifting ability from waist to shoulder compared to those with
lower back pain. [52] They found a higher rating of pain
intensity or pain behaviour associated with lower
performance on functional tests (i.e. stepping, lifting, and
walking). There were no studies reported on the reliability of
this assessment method. Our findings add new evidence
for functional tests.
For muscle and endurance tests, the reliability and validity
results of this review are in agreement with the Neck Pain
Task Force, which accepted seven studies dealing with neck
muscle strength (active arm and shoulder control tests and
flexor tests) as an assessment tool for patients with neck
pain. [9] Six of these articles were phase I/II studies and one was a
phase III study that used EMG to measure muscle strength
(which was excluded from this review). In their review,
Nordin et al. (2008) reported that neck muscle strength tests
had a slight to moderate inter-examiner reliability (kappa ≤0.60) in patients with neck pain, with or without radiculopathy. [9]
The Neck Pain Task Force also reported on one validity
study which found that cervical flexor muscle endurance
could distinguish between patients with WAD II and healthy
controls. [9] Our review is in agreement with the Neck Pain
Task Force, as we identified significant mean differences in
NFME (supine) between patients with NAD III and healthy
controls. [41]
Furthermore, the Neck Pain Task Force did not identify
any low risk of bias studies assessing cervical extensor endurance
tests. However, we found new evidence that examines
the preliminary validity of two cervical extensor endurance
tests for NAD: the NME Test prone and the DCE test. [34, 41]
Even though the Neck Pain Task Force was published
more than 10 years ago, there is still an important need for
phase III and IV validity studies to establish the clinical utility
of these tests in clinical practice.
Comparison of results to previous systematic reviews
One previous systematic review [7] investigated the reliability
and validity of neck muscle strength and endurance tests
since the Neck Pain Task Force. [9] de Koning et al. reported
that muscle endurance tests of the short neck flexors and
the cervical progressive iso-inertial lifting evaluation (PILE)
test could be reliable, with intra-rater ICCs ranging
from 0.88 to 0.96. An almost perfect inter-observer reliability
coefficient was also reported (ICC = 1.00 (95% CI: 0.99–1.0)). [7] No
studies using the PILE test in the current review met our
inclusion criteria (sample size < 20). [37]
Strengths and limitations
There are several strengths to our review. First, we worked
with a librarian to develop a search strategy that was comprehensive
and methodologically rigorous. To minimise
errors, this search strategy was reviewed by a second independent
librarian. Second, prior to reviewing the literature,
we outlined detailed inclusion and exclusion criteria to identify
relevant citations from the searched literature. Third, we
searched multiple databases using database-specific subject
headings (e.g. MeSH) when available. Fourth, we had multiple
pairs of independent reviewers complete screening and
critical appraisal to minimise error and bias. Fifth, we used
standardised quality assessment tools (QAREL/QUADAS-2) for
the critical appraisal process. Finally, we used best-evidence
synthesis to minimise the risk of bias associated with the
inclusion of low-quality studies. No cut-off points were used
to decide the risk of bias of each article, and pairs of
reviewers were trained to determine the overall internal validity
of the studies and to assess how biases influenced
the results.
Our review has some limitations. First, our literature
search was restricted to the English and French languages
and potentially admissible non-English/French studies may
have been excluded. However, previous systematic reviews
of clinical trials have investigated the impact of language
restriction and found that it does not lead to bias as most
large trials are published in English. [53–57]
Second, it is possible
that we missed potentially relevant studies, despite
using a sensitive search strategy and an independent screening
process. We updated our literature search to November
2017, but found no new information. Nevertheless, it is possible
that new research has been published since.
Third,
there is judgment in the critical appraisal process, which may
vary between reviewers. However, we minimised this by
using pairs of independent, trained reviewers and standardised
quality assessment tools. Fourth, most clinical tests
described in this review involve a subjective evaluation of
the patient, which may lead to measurement error. This could
be especially important when comparing novice versus experienced
examiners. This could explain some of the inconsistency outlined
in our results. Finally, we elected to keep shoulder tests
because it is important to report what has been published in
the literature. However, the pathophysiological rationale for
some of these tests was lacking or ill-conceived.
Conclusions
We found active shoulder tests reliable and valid to assess
neck function in adults with NAD. Experts were found to
have a higher inter-rater reliability compared to novice raters.
The cranio-cervical flexion test and DCE test were also
reported to be reliable and valid for the assessment of cervical
muscle strength in NAD patients. Overall, the evidence
is preliminary at best, supported by phase I and II validity
studies from the Sackett and Haynes classification. [23]
Clinicians must consider the preliminary nature of the evidence
when considering the use of these tests in clinic.
More than 10 years after the publication of the Neck Pain
Task Force, we still know little about the reliability and validity
of clinical tests used to assess cervical function, muscle
strength, and endurance in adults with neck pain. At best
the current literature provides preliminary evidence for the
active shoulder tests, cranio-cervical flexion tests and the
DCE test. Therefore, the clinical utility of these tests remains
unknown. Future high-quality studies, particularly phase III
validity studies, are needed to inform the use of these tests
for the assessment of NAD in clinical practice and their utility
for treatment recommendations.
Acknowledgement:
The authors acknowledge and thank Mrs Sophie Despeyroux, librarian at
the Haute Autorité de Santé, for her suggestions and review of the
search strategy. This research was undertaken, in part, thanks to funding
from the Canada Research Chairs programme to Dr Pierre Côté, Canada
Research Chair in Disability Prevention and Rehabilitation at the
University of Ontario Institute of Technology.
Conflict of Interest
The authors report no declarations of interest. None of these associations
were involved in the collection of data, data analysis, interpretation
of data, or drafting of the manuscript.
Funding
This study was funded by the Institut Franco-Européen de Chiropraxie,
the Association Française de Chiropraxie and the Fond de Dotation de
Recherche en Chiropraxie in France. Fond de Dotation en Recherche
Chiropratique.
References:
GBD 2017 DALYs and HALE Collaborators.
Global, regional, and national disability-adjusted life-years (DALYs) for 359 diseases
and injuries and healthy life expectancy (HALE) for 195 countries and territories,
1990–2017: a systematic analysis for the Global Burden of Disease Study 2017.
The Lancet. 2018;392(10159):1859–1922.
Hogg-Johnson, S, van der Velde, G, Carroll, LJ et al.
The Burden and Determinants of Neck Pain in the General Population: Results of the
Bone and Joint Decade 2000–2010 Task Force on Neck Pain and Its Associated Disorders
Spine (Phila Pa 1976). 2008 (Feb 15); 33 (4 Suppl): S39–51
Carroll LJ, Hogg-Johnson S, Cote P, van der Velde G, Holm LW, et al.
Course and Prognostic Factors for Neck Pain in Workers: Results of the Bone and Joint Decade
2000–2010 Task Force on Neck Pain and Its Associated Disorders
Spine (Phila Pa 1976). 2008 (Feb 15); 33 (4 Suppl): S93–100
Cote P, van der Velde G, Cassidy JD, Carroll LJ, Hogg-Johnson S, Holm LW, et al.
The Burden and Determinants of Neck Pain in Workers: Results of the Bone and Joint Decade
2000–2010 Task Force on Neck Pain and Its Associated Disorders
Spine (Phila Pa 1976). 2008 (Feb 15); 33 (4 Suppl): S60–74
Silverman JL, Rodriquez AA, Agre JC.
Quantitative cervical flexor strength in healthy subjects and in subjects with mechanical neck pain.
Arch Phys Med Rehabil. 1991;72:679–681.
Rodriquez AA, Bilkey WJ, Agre JC.
Therapeutic exercise in chronic neck and back pain.
Arch Phys Med Rehabil. 1992;73:870–875.
de Koning CH, van den Heuvel SP, Staal JB, et al.
Clinimetric evaluation of methods to measure muscle functioning in patients with non-specific
neck pain: a systematic review.
BMC Musculoskelet Disord. 2008;9:142.
Solway S, Brooks D, Lacasse Y, et al.
A qualitative systematic overview of the measurement properties of functional walk test
used in the cardiorespiratory domain.
Chest. 2001;119:256–270.
Nordin M, Carragee EJ, Hogg-Johnson S, Weiner SS, Hurwitz EL, Peloso PM, et al.
Assessment of Neck Pain and Its Associated Disorders: Results of the Bone and Joint Decade
2000–2010 Task Force on Neck Pain and Its Associated Disorders
Spine (Phila Pa 1976). 2008 (Feb 15); 33 (4 Suppl): S101–S122
Lemeunier N; da Silva-Oolup S; Chow N; Southerst D; Carroll L; Wong JJ; et al.
Reliability and Validity of Clinical Tests to Assess the Anatomical Integrity of the Cervical Spine
in Adults with Neck Pain and its Associated Disorders: Part 1- A Systematic Review from the
Cervical Assessment and Diagnosis Research Evaluation (CADRE) Collaboration
European Spine Journal 2017 (Sep); 26 (9): 2225–2241
Moser N, Lemeunier N, Southerst D, Shearer H, Murnaghan K, Sutton D, Cote P (2017)
Validity and Reliability of Clinical Prediction Rules used to Screen for Cervical Spine Injury
in Alert Low-risk Patients with Blunt Trauma to the Neck: Part 2. A Systematic Review
from the Cervical Assessment and Diagnosis Research Evaluation
(CADRE) Collaboration
European Spine Journal 2018 (Jun); 27 (6): 1219–1233
Lemeunier N; da Silva-Oolup S; Olesen K; Carroll LJ; Shearer H; Wong JJ; Brady OD; et al.
Reliability and validity of clinical tests to assess measurements of pain and disability in adults
with neck pain and its associated disorders: Part 3. A systematic review from the Cervical
Assessment and Diagnosis Research Evaluation (CADRE) Collaboration
Musculoskeletal Science & Practice 2018 (Dec); 38: 128–147
Lemeunier N, Jeoun EB, Suri M, et al.
Reliability and Validity of Clinical Tests to Assess Posture, Pain Location, and Cervical Spine Mobility
in Adults with Neck Pain and its Associated Disorders: Part 4. A Systematic Review from the
Cervical Assessment and Diagnosis Research Evaluation (CADRE) Collaboration
Musculoskeletal Science & Practice 2018 (Dec); 38: 128–147
Guzman J, Hurwitz EL, Carroll LJ, Haldeman S, Cote P, Carragee EJ, et al.
A New Conceptual Model Of Neck Pain: Linking Onset, Course, And Care
Results of the Bone and Joint Decade 2000–2010 Task Force on
Neck Pain and Its Associated Disorders
Spine (Phila Pa 1976). 2008 (Feb 15); 33 (4 Suppl): S14–23
Spitzer WO, Skovron ML, Salmi LR, Cassidy JD, Duranceau J, Suissa S, Zeiss E.
Scientific Monograph of the Quebec Task Force on Whiplash-Associated Disorders
Redefining Whiplash and its Management
Spine (Phila Pa 1976). 1995 (Apr 15); 20 (8 Suppl): S1-S73
Rothman KJ.
Modern epidemiology.
Philadelphia, USA:
Wolters Kluwer Health/Lippincott Williams & Wilkins; 2008.
Knuttgen HG, Kraemer WJ.
Terminology and measurement in exercise performance.
J Strength Cond Res. 1987;1:1–10.
Strimpakos N, Oldham JA.
Objective measurements of neck function. A critical review of their validity and reliability.
Phys Ther Rev. 2001;6:39–51.
Strimpakos N.
The assessment of the cervical spine. Part 2: strength and endurance/fatigue.
J Bodyw Mov Ther. 2011;15:417–430.
Dvir Z, Prushansky T.
Cervical muscles strength testing: methods and clinical implications.
J Manipulative Physiol Ther. 2008;31: 518–524.
Lucas N, Macaskill P, Irwig L, et al.
The reliability of a quality appraisal tool for studies of diagnostic reliability (QAREL).
BMC Med Res Methodol. 2013;13:111.
Whiting PF, Rutjes AW, Westwood ME, et al.
QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies.
Ann Intern Med. 2011;155:529–536.
Sackett DL, Haynes RB.
The architecture of diagnostic research.
BMJ. 2002;324:539–541.
Slavin RE.
Best evidence synthesis: an intelligent alternative to meta-analysis.
J Clin Epidemiol. 1995;48:9–18.
Viera AJ, Garrett JM.
Understanding interobserver agreement: the kappa statistic.
Fam Med. 2005;37:360.
Moher D, Liberati A, Tetzlaff J, Altman DG.
Preferred Reporting Items for Systematic Reviews
and Meta-Analyses: The PRISMA Statement
PLoS Medicine 2009 (Jul 21); 6 (7): e1000100
Bossuyt PM, Reitsma JB, Bruns DE, et al.
Toward complete and accurate reporting of studies of diagnostic accuracy.
Am J Clin Pathol. 2003;119:18–22.
Aasa B, Lundström L, Papacosta D, et al.
Do we see the same movement impairments? The inter-rater reliability of movement tests
for experienced and novice physiotherapists.
Eur J Physiother. 2014;16:173–182.
Patroncini M, Hannig S, Meichtry A, et al.
Reliability of movement control tests on the cervical spine.
BMC Musculoskelet Disord. 2014;15:402.
Segarra V, Duenas L, et al.
Inter-and intra-tester reliability of a battery of cervical movement control dysfunction tests.
Man Ther. 2015;20:570.
Cleland JA, Childs JD, Fritz JM, Whitman JM.
Interrater reliability of the history and physical examination in patients with
mechanical neck pain.
Arch Phys Med Rehabil. 2006;87:1388–1395.
Edmondston SJ, Wallumrød ME, Macleid F, Kvamme LS, et al.
Reliability of isometric muscle endurance tests in subjects with postural neck pain.
J Manipulative Physiol Ther. 2008;31:348–354.
Hanney WJ, George SZ, Kolber MJ, et al.
Inter-rater reliability of select physical examination procedures in patients with neck pain.
Physiother Theory Pract. 2011;27:345–352.
Jørgensen R, Ris I, Falla D, Juul-Kristensen B.
Reliability, construct and discriminative validity of clinical testing in subjects with
and without chronic neck pain.
BMC Musculoskelet Disord. 2014;15: 408.
Juul T, Langberg H, Enoch F, et al.
The intra- and inter-rater reliability of five clinical muscle performance tests in patients
with and without neck pain.
BMC Musculoskelet Disord. 2013;14:339.
Sebastian D, Chovvath R, Malladi R.
Cervical extensor endurance test: a reliability study.
J Bodyw Mov Ther. 2015;19:213–216.
Juul-Kristensen B, Kadefors R, Hansen K, et al.
Clinical signs and physical function in neck and upper extremities among elderly female
computer users: the NEW study.
Eur J Appl Physiol. 2006; 96:136–145.
Trippolini MA, Dijkstra PU, Geertzen JHB, et al.
Construct validity of functional capacity evaluation in patients with
Whiplash-Associated disorders.
J Occup Rehabil. 2015;25:481–492.
Pierrynowski M, McPhee C, Mehta SP, et al.
Intra and inter-rater reliability and convergent validity of FITHaNSA in individuals with
Grade II Whiplash Associated disorder.
Open Orthop J. 2016;10:179–189.
Chiu TTW, Law EYH, Chiu THF.
Performance of the craniocervical flexion test in subjects with and without chronic neck pain.
J Orthop Sports Phys Ther. 2005;35:567–571.
Halvorsen M, Abbott A, Peolsson A, et al.
Endurance and fatigue characteristics in the neck muscles during sub-maximal isometric
test in patients with cervical radiculopathy.
Eur Spine J. 2014;23: 590–598.
Hudswell S, von Mengersen M, Lucas N.
The cranio-cervical flexion test using pressure biofeedback: a useful measure of cervical
dysfunction in the clinical setting?
Int J Osteopath Med. 2005;8: 98–105.
Peolsson A, Kjellman G.
Neck muscle endurance in nonspecific patients with neck pain and in patients after anterior
cervical decompression and fusion.
J Manipulative Physiol Ther. 2007;30: 343–350.
Lourenço AS, Lameiras C, Silva AG.
Neck flexor and extensor muscle endurance in subclinical neck pain: intrarater reliability,
standard error of measurement, minimal detectable change, and comparison with
asymptomatic participants in a university student population.
J Manipulative Physiol Ther. 2016;39:427–433.
Lopez-de-Uralde-Villanueva I, Sollano-Vallez E, Del Corral T.
Reduction of cervical and respiratory muscle strength in patients with chronic nonspecific
neck pain and having moderate to severe disability.
Disabil Rehabil. 2017;96(3):203–210.
Trippolini MA, Reneman MF, Jansen B, et al.
Reliability and safety of functional capacity evaluation in patients with whiplash
associated disorders.
J Occup Rehabil. 2013;23:381–390.
O’Leary S, Jull G, Vicenzino B.
Do dorsal head contact forces have the potential to identify impairment during
graded-craniocervical flexor muscle contractions?
Arch Phys Med Rehabil. 2005;86: 1763–1766.
Kahlaee AH, Rezasoltani A, Ghamkhar L.
Is the clinical cervical extensor endurance test capable of differentiating the local
and global muscles?
Spine J. 2017;17:913–921.
Rastovic P, Gojanovic MD, Berberovic M, et al.
Isometric muscle fatigue of the paravertebral and upper extremity muscles after whiplash injury.
Ann Saudi Med. 2017;37:297–307.
Martins F, Bento A, Silva AG.
Within-session and between-session reliability, construct validity, and comparison
between individuals with and without neck pain of four neck muscle tests.
Pm R. 2018;10:183–193.
Cagnie B, Cools A, De Loose V, et al.
Differences in isometric neck muscle strength between healthy controls and women with
chronic neck pain: the use of a reliable measurement.
Arch Phys Med Rehabil. 2007;88:1441.
Ljungquist T, Jensen IB, Nygren A, et al.
Physical performance tests for people with long-term spinal pain:
aspects of construct validity.
J Rehabil Med. 2003;35:69–75.
Jüni P, Holenstein F, Sterne J, et al.
Direction and impact of language bias in meta-analyses of controlled trials: empirical study.
Int J Epidemiol. 2002;31:115–123.
Moher D, Fortin P, Jadad AR, et al.
Completeness of reporting of trials published in languages other than English:
implications for conduct and reporting of systematic reviews.
Lancet. 1996;347: 363–366.
Moher D, Pham B, Lawson ML, et al.
The inclusion of reports of randomised trials published in languages other than English
in systematic reviews.
Health Technol Assess. 2003;7:1–90.
Morrison A, Polisena J, Husereau D, et al.
The effect of English-language restriction on systematic review-based meta analyses:
a systematic review of empirical studies.
Int J Technol Assess Health Care. 2012;28:138–144.
Sutton AJ, Duval SJ, Tweedie RL, et al.
Empirical assessment of effect of publication bias on meta-analyses.
BMJ. 2000;320: 1574–1577.