
Validity of objective methods for measuring sedentary behaviour in older adults: a systematic review



The evidence showing the ill health effects of prolonged sedentary behaviour (SB) is growing. Most studies of SB in older adults have relied on self-report measures of SB. However, SB is difficult for older adults to recall and objective measures that combine accelerometry with inclinometry are now available for more accurately assessing SB. The aim of this systematic review was to assess the validity and reliability of these accelerometers for the assessment of SB in older adults.


EMBASE, PubMed and EBSCOhost databases were searched for articles published up to December 13, 2017. Articles were eligible if they: a) described reliability, calibration or validation studies of SB measurement in healthy, community-dwelling individuals, b) were published in English, Portuguese or Spanish, and c) were published or in press as journal articles in peer-reviewed journals.


The review identified 15 studies in 17 papers. Of the included studies, 11 assessed the ActiGraph accelerometer. Of these, three examined reliability only, seven (in eight papers) examined validity only and one (in two papers) examined both. The strongest evidence from the studies reviewed is from studies that assessed the validity of the ActiGraph. These studies indicate that analysis of the data using 60-s epochs and a vertical magnitude cut-point < 200 cpm or using 30- or 60-s epochs with a machine learning algorithm provides the most valid estimates of SB. A non-wear algorithm of 90+ min of consecutive zeros is also suggested for the ActiGraph.


Few studies have examined the reliability and validity of accelerometers for measuring SB in older adults. Studies to date suggest that the criteria researchers use for classifying an epoch as sedentary instead of as non-wear time (e.g., the non-wear algorithm used) may need to be different for older adults than for younger adults. The required number of hours and days of wear for valid estimates of SB in older adults was not clear from studies to date. More older-adult-specific validation studies of accelerometers are needed, to inform future guidelines on the appropriate criteria to use for analysis of data from different accelerometer brands.

Trial registration

PROSPERO ID# CRD42017080754 registered December 12, 2017.


Evidence showing the negative health consequences of sedentary behaviour continues to grow. ‘Sedentary behaviour’ (SB) is any awake behaviour done while sitting, reclining, or lying down that requires no more than 1.5 metabolic units of energy expenditure [1]. As well as being associated with psychological distress [2] and poor physical functioning [3], greater amounts of SB have been shown to increase risk of cardiovascular disease incidence and mortality, diabetes incidence, cancer incidence and mortality [4], and all-cause mortality [5,6,7,8,9].

Evidence further suggests a dose-response relationship between subjectively- and objectively-measured SB and poor health outcomes in older adults [3]. Being the most sedentary age cohort [10, 11], older adults are at risk of SB-related diseases. UK researchers found that older adults spent, on average, 11–12 h/day in SB [12]. Half the sampled older adults spent 80% of their time in SB. Similarly, a Canadian study suggested that 94% of older Canadians spent at least 8 h/day sitting [13]. Both these studies measured SB objectively with accelerometers.

An international group of experts in SB research concluded in a consensus statement that future SB research with older adults should provide a better understanding of the correlates of SB to inform intervention studies and that interventions that aim to decrease SB should measure the impact of interventions on SB [14]. Both types of research require accurate measurement of SB, and self-report measures have limited utility for assessing total SB [14]. Indeed, a review of 31 international studies of SB in adults aged ≥60 years found that mean daily SB time was significantly greater when measured with accelerometers (9.4 h/day) than self-report measures (5.3 h/day) [15].

To objectively measure time spent in SB, researchers often use accelerometers [16], which measure changes in acceleration. Although accelerometers were built to measure physical activity, they can indicate low levels of and the absence of movement. However, since movement is determined by acceleration, not body posture [17], they cannot distinguish between sitting and standing still. For this reason, inclinometers (instruments that measure slope or tilt) have been incorporated into some newer accelerometers to detect postures and transitions between postures.

With the addition of inclinometers to accelerometers, studies are being conducted to assess the reliability and validity of newer models of accelerometers for assessing SB. Authors of a 2014 systematic review of the use of accelerometers in older adults [18] reported that few accelerometer validation studies had been conducted with older adults. This is an important omission due to the potential to misclassify as non-wear time the large proportions of the day that older adults spend sitting still when standard non-wear algorithms for adults are used [18]. The non-wear algorithm selected for processing data affects estimates of SB [19] because a long string of zeros could represent either (a) time that the monitor was not worn or (b) an extended period in which the monitor wearer is still. The authors of the review [18], therefore, advocated for older adult-specific validation studies. Those authors also reported that to classify SB, accelerometer cut-points ranging from 50 to 500 counts per min (cpm) were being used. The reliability and validity of the cut-points were not discussed. Only one reviewed study included an accelerometer with an inclinometer. The current study systematically reviews the current literature on the reliability and validity of accelerometers with or without inclinometers for measuring SB in older adults.


The review was guided by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [20].

Search strategy and study selection

The EMBASE, PubMed and three EBSCOhost databases (MEDLINE, CINAHL and SPORTDiscus) were searched for articles published up to December 13, 2017. A three-step process was used. First, KCH searched titles and abstracts using the search terms shown in Table 1. The reference lists of located articles were also searched. Second, two authors (KCH and RLH) independently reviewed the titles and abstracts of each located article to assess its eligibility for inclusion. Disagreements between reviewers were discussed, and consensus reached about which articles would be reviewed in the final step, a review of the full text of articles. For the final step, KCH and RLH independently reviewed the full text of articles and came to consensus about articles to include in the review.

Table 1 Search terms

Inclusion criteria

As in a previous review of measurement in older adults [18], the search was limited to older adults (those aged ≥65 years), although samples with mean ages ≥60 years were included if they were located through the search strategy. Articles were eligible if they: a) described reliability, calibration or validation studies of SB measurement in healthy, community-dwelling individuals, b) were published in English, Portuguese or Spanish, and c) were published or in press as journal articles in peer-reviewed journals. The reliability and validity of accelerometers in populations living in residential care facilities or having a specific disease or disability were not included. Editorials, reviews, and conference abstracts were also not included.

Reliability and validity of accelerometers

Reliability of accelerometers refers to the consistency in accelerometer readings. Most research on reliability of accelerometers assesses test-retest reliability, which is typically estimated with the intra-class correlation coefficient (ICC) for continuous data [21] including accelerometer data.
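As an illustration of how test-retest reliability can be quantified, the following is a minimal sketch of a one-way random-effects ICC computed from repeated daily SB estimates. The choice of the one-way form, the function name `icc_oneway` and the data are assumptions made for illustration; the reviewed studies may have used other ICC forms.

```python
def icc_oneway(scores):
    """One-way random-effects ICC for a list of per-participant lists.

    scores[i][j] is the j-th repeated measurement (e.g. SB min/day on day j)
    for participant i; every participant must have the same number of repeats.
    """
    n = len(scores)        # number of participants
    k = len(scores[0])     # number of repeated measurements per participant
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]

    # Between-participant and within-participant mean squares.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, row_means) for x in row
    ) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical data: SB minutes/day for 4 participants on 3 days.
sb_minutes = [
    [600, 610, 590],
    [480, 500, 470],
    [700, 690, 720],
    [550, 540, 565],
]
icc = icc_oneway(sb_minutes)
```

In this made-up example, day-to-day variation within participants is small relative to differences between participants, so the ICC is close to 1.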

Validity refers to the extent to which an accelerometer accurately measures SB. Two types of validity are of interest: criterion and concurrent. Criterion validity refers to the extent to which the findings from the accelerometer agree with the findings produced with a ‘gold standard’ measure [22]. For assessing the criterion validity of an accelerometer, the gold standard is typically calorimetry or direct observation. Concurrent validity refers to the extent to which findings from an accelerometer agree with the findings produced from another type of accelerometer [22]. Because accelerometer counts can be analysed for varying epoch lengths, validity is analysed for specific epochs. Assessing validity requires either (a) using an a-priori cut-point between SB and non-SB behaviours or (b) assessing a range of cut-points. To assess a range of cut-points, researchers typically evaluate which ones optimise sensitivity (the % of epochs classified as SB by the criterion or concurrent measure that the accelerometer also classified as SB) and specificity (the % of epochs classified as not SB by the criterion or concurrent measure that the accelerometer also classified as not SB). The area under the receiver operating characteristic (ROC) curve is often reported as well. Values closer to 1.00 indicate more accurate classification of SB, and values closer to 0.5 indicate less accurate classification of SB [23]. Statistical models (e.g., non-parametric or regression models) or Bland–Altman methods [24] may be used in addition to, or alternatively to, ROC methods, to examine relationships or agreement between the accelerometer of interest and the criterion or concurrent measure. Additional file 1 contains a more detailed discussion of reliability and validity.
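The evaluation of a range of cut-points can be sketched as follows. This is a minimal illustration, assuming that an epoch is classified as SB when its count falls below the cut-point and that the preferred cut-point is the one that maximises the sum of sensitivity and specificity. The function names and the data are hypothetical and are not taken from any reviewed study.

```python
def sensitivity_specificity(counts, criterion_sb, cut_point):
    """Compare cut-point classification with criterion labels.

    counts: accelerometer counts per epoch.
    criterion_sb: True where the criterion measure labelled the epoch as SB.
    An epoch is classified as SB when its count is below cut_point.
    """
    tp = fn = tn = fp = 0
    for count, is_sb in zip(counts, criterion_sb):
        predicted_sb = count < cut_point
        if is_sb and predicted_sb:
            tp += 1
        elif is_sb:
            fn += 1
        elif predicted_sb:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)   # share of criterion SB epochs detected
    specificity = tn / (tn + fp)   # share of criterion non-SB epochs detected
    return sensitivity, specificity


def best_cut_point(counts, criterion_sb, candidates):
    """Return the candidate that maximises sensitivity + specificity."""
    return max(
        candidates,
        key=lambda c: sum(sensitivity_specificity(counts, criterion_sb, c)),
    )


# Hypothetical 60-s epochs: counts per minute and criterion labels.
counts = [0, 10, 40, 150, 90, 300, 20, 500]
criterion_sb = [True, True, True, False, True, False, True, False]
optimal = best_cut_point(counts, criterion_sb, [25, 50, 100, 200])
```

With these made-up data, a cut-point that is too low misses true SB epochs (lower sensitivity), whereas one that is too high classifies light activity as SB (lower specificity).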

Data extraction

Extracted from eligible articles were: participant and monitor characteristics, study setting, methodological considerations, and results. Extraction tables for the first six studies reviewed were produced independently by two authors (NA, KCH) before then being checked for consistency and accuracy against the original articles by a third author (RLH). For the remaining studies, KCH produced the extraction tables, and RLH checked them for consistency and accuracy against the original articles.


The search identified 550 separate articles (Fig. 1). After articles out of scope were removed, the full text of 32 articles was examined. After applying the selection criteria, 15 different studies of five accelerometer brands (Table 2), reported in 17 papers, were included.

Fig. 1 PRISMA flow chart

Table 2 Description of the accelerometers/inclinometers reported in articles included in the systematic review

Descriptive characteristics of included studies

The ActiGraph accelerometer (ActiGraph LLC, Fort Walton Beach, FL, USA) was examined in 11 studies [17, 25,26,27,28,29,30,31,32,33,34,35,36] (see Tables 3 and 4). The most commonly-used model of ActiGraph was the GT3X+. Studies included between 20 and 7650 participants, and the mean age of participants ranged from 61 to 78 years. Three studies included only women [27, 28, 30, 33, 34], and the remainder included women and men [17, 25, 26, 29, 31, 32, 35, 36].

Table 3 Characteristics and results of studies that examined reliability of ActiGraph models for measuring sedentary behaviour in older adults (mean age ≥ 60 years), ordered from largest to smallest sample size
Table 4 Characteristics and results of studies that examined validity and accuracy of ActiGraphs for measuring sedentary behaviour in older community-dwelling, healthy adults (mean age ≥ 60 years), ordered from largest to smallest sample size

One study each assessed the Actical, the ActivPAL3, the GENEActiv, and the MotionWatch 8. These studies included men and women. The Actical study [37], the largest of the four, included 200 participants with a mean age of 64 years. The ActivPAL3 study [38] included 53 participants with a mean age of 75 years. The GENEActiv study [39] included 40 participants with a mean age of 74 years. Last, the MotionWatch 8 study [40] included 23 adults with a mean age of 70 years.

Reliability of the ActiGraph

Test-retest reliability of the ActiGraph in free-living conditions was assessed in three studies (Table 3). In each study, an ActiGraph was worn on the hip, and the data were analysed in 60-s epochs. In a 21-day study of the reliability of the ActiGraph 7164, Hart et al. [35] found that 5 days of measurement were required to attain an acceptable level of reliability (ICC = 0.80) in measuring SB in older adults. SB was defined as vertical axis (VA) ≤50 cpm. In a 7-day study of the ActiGraph 7164 [36], 2- to 3-day protocols provided reliable estimates of the percentage of time spent in SB, but the authors concluded that estimates should be adjusted for greater time spent in SB on weekend days than on weekdays. SB was defined as VA < 100 cpm. In a study of the 2–3 year test-retest reliability of the ActiGraph GT3X+ [27] with SB defined as vertical magnitude (VM) < 200 cpm, reliability was slightly lower for daily minutes spent in SB than seen in the other studies, but excellent given the long intervals between measurement periods (ICC = 0.75). Overall, these findings leave uncertainty about the number of days required for reliable estimates of SB. The two studies that directly addressed the required number of days were both conducted with the ActiGraph 7164, and results could be different in newer models. Also, differences in data collection periods, cut-points used to determine SB, non-wear time algorithms, and axis used to assess SB (one or three) make direct comparison among studies problematic. Therefore, the evidence to date does not provide a clear indication of the number of days of measurement with an ActiGraph that is required for older adults.

A fourth study [32] assessed the reliability of two filters that can be used with the ActiGraph GT3X: the normal filter, which is the standard filter, and the low-frequency extension (LFE) filter, which was designed to better capture low-intensity activities. Participants wore two monitors on a hip for 8 days in free-living conditions. For analysis, 60-s epochs were used. The researchers found large mean differences between filters in min/day and % of time in SB, with estimates systematically lower when the LFE filter was used than when the normal filter was used. The results were the same when the VA cut-point for SB was changed from < 150 cpm to < 100 cpm or < 200 cpm. The results indicate that the estimates of time spent in SB differ depending on the filter selected, and therefore, results of studies that use one type of filter are not comparable to studies that use the other type.

Validation and accuracy of ActiGraph cut-points for classifying SB

All validation studies of the ActiGraph assessed the GT3X+. With the ActivPAL as the concurrent measure, two 7-day studies showed moderate to good accuracy of the hip-worn ActiGraph for classifying SB in free-living conditions [17, 29] (Table 4). Aguilar-Farias et al. [17] reported that the optimal cut-points for the VA were < 1 count/s, < 10 counts/15 s, and < 25 cpm. The percentage of correctly classified SB epochs was good (74–80%). For VM, the optimal cut-points were < 1 count/s, < 70 counts/15 s, and < 200 cpm. For VA and VM, accuracy was better for the cpm threshold than for the 1-s and 15-s thresholds. Koster et al. [29] also showed better accuracy for 60-s epochs than 15-s epochs: for a 60-s epoch a VA cut-point of < 22 cpm and a VM cut-point of < 174 cpm were optimal. These values are slightly lower than those reported by Aguilar-Farias et al. [17]. The researchers noted that if the commonly-used VA cut-point of < 100 cpm had been used, there would have been an overestimation of SB of 114 min/day. However, the commonly-used VM cut-point of < 200 cpm produced an overestimate of only 10 mins/day of SB.

Koster et al. [29] also computed optimal SB cut-points for the wrist-worn GT3X+, and these showed comparable accuracy properties to those reported when the monitor was worn on the hip. The most accurate VM cut-points were < 2303 cpm on the dominant wrist and < 1853 cpm on the non-dominant wrist. Their optimal 60-s epoch cut-points for hip- and wrist-worn monitors produced more accurate results than their optimal 15-s epoch cut-points.

Using data collected in a laboratory setting, Evenson et al. [33] computed optimal SB cut-points for the GT3X+. The criterion measure was a portable calorimeter. With 15-s epochs, VM was more accurate than VA for classifying SB, and the LFE filter was not substantially better than the normal filter. Accuracy was highest when the sum of sensitivity and specificity with either a normal filter (optimal cut-point: ≤42 counts/15 s) or LFE filter (optimal cut-point: ≤65 counts/15 s) was maximized. For another analysis of that study’s data, Bai et al. [34] showed that activity counts with the normal or LFE filter generally performed poorly against the portable calorimeter in differentiating between SB and two light activities.

These findings suggest that for using ActiGraph GT3X+ worn at the hip, 60-s epochs and VM provide more accurate estimates of SB than shorter epochs or VA, respectively. The findings from the studies in free-living conditions suggest that VM < 200 cpm provides valid estimates of SB time. However, if VA is used, the cut-point should be much smaller than typically used (e.g., < 22–25 cpm). One study also provided VM cut-points for ActiGraph GT3X+ worn on the wrist. Moreover, findings suggest that the use of LFE does not substantially improve the estimates of SB time produced with a normal filter although caution is warranted in extrapolating the laboratory-based findings to free-living conditions.
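The difference between a single-axis and a three-axis cut-point can be illustrated with a short sketch. It assumes that VM is computed as the square root of the sum of squared counts across the three axes, as is common for triaxial ActiGraph output, and that the first value in each epoch is the vertical axis. The function names, the axis order and the data are hypothetical; the default thresholds are the commonly-used values mentioned above (VM < 200 cpm and VA < 100 cpm).

```python
import math


def vector_magnitude(x, y, z):
    """Combine counts from three axes into a single magnitude (assumed VM)."""
    return math.sqrt(x * x + y * y + z * z)


def sedentary_minutes_vm(epochs, vm_cut_point=200):
    """Count 60-s epochs whose three-axis magnitude is below vm_cut_point."""
    return sum(
        1 for x, y, z in epochs if vector_magnitude(x, y, z) < vm_cut_point
    )


def sedentary_minutes_va(epochs, va_cut_point=100):
    """Count 60-s epochs whose vertical-axis count is below va_cut_point."""
    return sum(1 for x, _, _ in epochs if x < va_cut_point)


# Hypothetical 60-s epochs of (vertical, second axis, third axis) counts.
epochs = [
    (0, 0, 0),
    (30, 40, 0),
    (120, 160, 0),
    (150, 100, 80),
    (90, 400, 300),
]
vm_sb = sedentary_minutes_vm(epochs)
va_sb = sedentary_minutes_va(epochs)
```

In the last made-up epoch the vertical count is low but the other two axes register movement, so a VA-only rule labels it as SB while the three-axis rule does not. This is one way the two approaches can diverge.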

Validity and accuracy of machine learning algorithms for classifying SB with the ActiGraph

Three studies assessed whether machine learning algorithms accurately classify SB (Table 4). In a study by Rosenberg et al. [30] participants wore a camera on a lanyard around their necks while wearing the ActiGraph on a hip for 7 days in free-living conditions. Epochs were set to 60 s. The researchers showed that a machine learning algorithm using ActiGraph output could more accurately differentiate SB from non-sedentary behaviours than other methods. The researchers also reported that the median counts for sitting were much lower than would be detected by a < 100 cpm threshold and the median counts for riding in a vehicle were higher than would be detected at that threshold. This finding further supported the superiority of the algorithm over the use of a set cut-point.

Sasaki et al. [31] compared two machine learning algorithms for classifying activities and examined whether algorithms created in laboratory conditions were as accurate as those created in free-living conditions for detecting SB in free-living conditions. Direct observation was the criterion measure for both conditions. Using 20-s epochs for ActiGraphs worn on the hip, wrists and ankle, the laboratory-based algorithms were not as accurate as ones developed in free-living conditions (over 2–3 h, % of minutes correctly classified as SB was > 80% except for one wrist-worn algorithm). For algorithms produced under free-living conditions, the researchers showed that the accuracy in correctly classifying minutes as SB was optimal (defined as 80% of minutes correctly classified as SB) when the ActiGraph was placed at the hip or ankle (not wrist) and 15- or 30-s epochs were used. The highest overall classification rates were for 30-s epochs.

These machine-learning algorithms performed substantially better than an algorithm developed for another study [25]. As in the study by Sasaki et al. [31], direct observation was the criterion measure for both laboratory-based and free-living conditions, which were conducted in sessions lasting less than 1 day. Epochs were set to 5 s, which the findings by Sasaki et al. [31] suggest is not as accurate as using longer epoch lengths. Also, data from laboratory-based and free-living components of the study were combined for analysis, which could have negatively impacted the findings, given that Sasaki et al. [31] found differences in accuracy between laboratory-based and free-living algorithms.

Overall, these findings indicate that machine learning algorithms may provide more accurate estimates than cut-points, particularly when these algorithms use large epochs, 30-s and 60-s, with data from free-living conditions.

Validity and reliability of other brands of accelerometers

Three other monitors underwent testing in laboratories: the activPAL3, GENEActiv, and MotionWatch 8 (Table 5). Klenk et al. [38] compared the newer activPAL3 to the original activPAL. Both were worn on the thigh. The researchers reported high agreement (98%) between monitors, but for a 24-h period, the researchers calculated, between monitors, a mean difference of 45 min in time spent sitting/lying. The findings suggest that the two monitors should not be used interchangeably for assessing SB. In the second study, Wullems et al. [39] validated the GENEActiv, worn on the thigh, against indirect calorimetry. Three cut-point algorithms and one machine learning algorithm performed well at classifying SB. In the final study, Landry et al. [40] conducted the first validation of the MotionWatch 8. In that study, participants wore two watches on non-dominant wrists while performing SB and other activities. The watch was validated against a portable calorimeter. The optimal cut-point for SB was ≤179 cpm.

Table 5 Characteristics and results of studies that examined validity and accuracy of other accelerometers and inclinometers for measuring sedentary behaviour in older community-dwelling, healthy adults (mean age ≥ 60 years), ordered from largest to smallest sample size

These early validity assessments indicate that the newest models of non-ActiGraph monitor brands show promise for classifying SB in older adults. Future studies in free-living conditions are needed to verify whether these findings hold in real-life conditions.

Accuracy of non-wear time algorithms for classifying SB

Two studies [26, 28] examined the influence of the non-wear-time algorithm selected on the classification of SB (Table 4). Both studies used the ActiGraph GT3X+, worn on the hip during free-living conditions, for 7 days. Both studies used 60-s epochs and defined SB as VA < 100 cpm, with one study [28] also requiring VM < 200 cpm. For determining non-wear time, Keadle et al. [28] found that the use of a paper log with the Choi algorithm [41] was better than using this algorithm only or using another algorithm with or without a log, for minimising missing data. The algorithms examined used a ≥60 min threshold. Dates on logs were needed because accelerometers were mailed to participants. Without a log of wear-time dates, the algorithm misclassified ‘wear’ days as ‘in the mail’ days. Chudyk et al. [26] showed that algorithms that counted ≥90 min of consecutive zeroes as non-wear time were more accurate in estimating SB compared with ones using a ≥60 min threshold.

In contrast, Hutto et al. [37] examined the accuracy of the wrist-worn Actical in producing estimates of SB across non-wear estimation algorithms (see Table 5). Participants wore the monitor for 7 days and kept wear-time logs. SB was defined as VA ≤100 cpm. The analysis showed that estimates of time spent in SB were similar among algorithms that counted ≥120 min of consecutive zeros as non-wear time (with no allowance for intervals of non-zero cpm). Using 60- or 90-min intervals produced underestimations of time in SB.

In summary, for the ActiGraph, it is not clear whether treating ≥60 min or ≥ 90 min of consecutive zeroes as missing data is best for accurately classifying SB, but findings from the only study that compared the two indicated that the ≥90 min is optimal. For the Actical, initial findings indicate that non-wear time should include a longer string of consecutive zeros (e.g., ≥120 min of consecutive zeros).
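The effect of the non-wear threshold on SB estimates can be sketched with a simple consecutive-zeros rule. This is an illustrative simplification that makes no allowance for brief interruptions of non-zero counts and is not the Choi algorithm [41]. The function names, the default cut-point and the data are hypothetical.

```python
def nonwear_mask(counts, min_zero_run=90):
    """Flag epochs in runs of >= min_zero_run consecutive zeros as non-wear.

    counts: counts per 60-s epoch. Returns a list of booleans (True = non-wear).
    """
    mask = [False] * len(counts)
    run_start = None
    # A non-zero sentinel is appended so that a trailing run of zeros is closed.
    for i, count in enumerate(list(counts) + [1]):
        if count == 0:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_zero_run:
                for j in range(run_start, i):
                    mask[j] = True
            run_start = None
    return mask


def sedentary_wear_minutes(counts, cut_point=100, min_zero_run=90):
    """Count wear-time epochs with counts below cut_point."""
    mask = nonwear_mask(counts, min_zero_run)
    return sum(
        1 for count, nonwear in zip(counts, mask)
        if not nonwear and count < cut_point
    )


# Hypothetical recording: 10 min of light movement, 75 min sitting still,
# 5 min of walking, 120 min with the monitor removed, 10 min of light movement.
counts = [50] * 10 + [0] * 75 + [300] * 5 + [0] * 120 + [20] * 10
sb_90 = sedentary_wear_minutes(counts, min_zero_run=90)
sb_60 = sedentary_wear_minutes(counts, min_zero_run=60)
```

In this made-up recording, the 75 min of sitting still is retained as SB under the ≥90 min rule but discarded as non-wear time under the ≥60 min rule, which lowers the SB estimate.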


Accurate measurement of SB is critical for the evaluation of patterning and prevalence of this behaviour and of future health promotion strategies aimed at decreasing SB. The aim of this systematic review was to assess the validity and reliability of accelerometers for the assessment of SB in older adults. Fifteen eligible studies were identified.

Comparison among studies of older adults in this field is challenging due to the heterogeneous assumptions used for the measurement parameters. For example, studies varied greatly in what constituted a valid day, number of measurement days, epoch length, use of VA or VM, and cut-point thresholds. Validity was assessed predominately using the ActiGraph GT3X+. Reliability was assessed using the ActiGraph GT3X+ and 7164. Most studies used accelerometers worn on the hip and utilized 60-s epochs.

Test-retest reliability estimates of the ActiGraph 7164 were similar to estimates found in younger adult populations (ICC 0.74–0.94) [42]. However, the number of days required for a reliable estimate of SB in older adults remains uncertain. Although 2–5 days are suggested from the studies reviewed, these estimates are drawn from only two studies, which used an older model accelerometer (7164); therefore, estimates may be less relevant for newer models. More research is required with newer model accelerometers, to determine a reliable number of wear days in older adults. Decisions about the number of wear-days selected for use in this population must also consider that adherence to the generally recommended 7-day wear-time protocols can be burdensome to older adults [36]. A move to a wrist location, which would avoid the need for removal when changing clothes, showering or sleeping, may result in greater compliance with wear-time requirements [43]. Data from NHANES show compliance of 40–70% with waist-worn protocols but 70–80% with wrist-worn protocols [44]. However, further investigation into the validity and reliability of wrist-worn accelerometers in older adults is required before their use in research with this population is recommended.

Another consideration is the selected cut-points, which can greatly impact the amount of SB recorded. For example, Gorman and colleagues [18] reported that in a population of older women the min/day spent in SB ranged from 475 when the cut-point for SB was ≤50 cpm to 665 when the cut-point was < 500 cpm. The current review found only two studies that examined appropriate ActiGraph SB cut-points for older adults in free-living conditions. The evidence from these studies of the GT3X+ suggests a cut-point of < 200 cpm with VM and a cut-point of < 22–25 cpm with VA, when the wear location is the hip and the normal filter is used. However, a commonly-accepted cut-point for adults is VA < 100 cpm. Results of a study that analysed GT3X data from office workers (mean age = 47 years) indicated that a cut-point of < 150 cpm was optimal although < 100 cpm was acceptable [45]. The analysis used VA and a normal filter. The results of more recent studies in younger adults (university employees and university students) that used LFE with the GT3X+ suggested that a < 65 cpm cut-point with VA [46] or a < 150 cpm cut-point with VM [47] was appropriate. In short, the totality of evidence provides early indications that higher cut-points are needed for assessing SB in older adults than in their younger counterparts. Differences between age groups in these cut-points could indicate that estimates of movement patterns using cut-points may vary for different life stages, due to dissimilar balance and gait speed as well as the nature and contexts of movement [48, 49].

Other factors influencing estimated SB include decisions about epoch length and non-wear-time algorithms. From the findings of this review, it appears that 60-s epochs are the most accurate to use with older adults for assessing SB with the ActiGraph, and a 90+ minute non-wear time algorithm may be most accurate although this result is from only one study. In younger adults, 60-s epochs and 60+ minutes of non-wear time [50] are generally used in analysis. Others [18] have suggested that 60+ minute non-wear time algorithms are not likely to be appropriate for older adults because the large percentage of the day that older adults spend sitting quietly could be misclassified as non-wear time. Although cut-points remain the most common method for accelerometer data reduction [18], the choice of cut-points and their inherent assumptions (e.g., epoch length, non-wear time) greatly impact validity and reliability. Assumptions also affect comparisons of SB and PA estimates in other life stages (e.g., children [51] and adults [52]).

A developing alternative approach for estimating SB is the use of machine learning, or pattern recognition [53]. Three studies in this review indicated that machine learning algorithms provide more accurate estimates of SB than other methods when using a 30-s or 60-s epoch. The findings from these studies further suggest that algorithms developed with ActiGraph data from free-living conditions are more accurate than those developed with laboratory data in classifying activity as SB. These findings support those from similar studies in younger adults [54]. As highlighted by Sasaki and colleagues [31], there is a need for more rigorous field-based assessment of SB using machine learning as few such assessments have been conducted in older or younger adults.

Of the non-ActiGraph monitors examined, early validity assessments of the ActivPAL3, GENEActiv and MotionWatch 8 show promise for classifying SB in older adults. However, studies were conducted in laboratories; studies with older adults in free-living conditions are needed to verify whether findings hold in real-life conditions. In contrast, some studies of the GENEActiv [43, 55, 56] and the ActivPAL or ActivPAL3 [45, 57,58,59,60] in younger samples have been conducted in free-living conditions. These suggest that the wrist-worn GENEActiv and thigh-worn ActivPAL/ActivPAL3 monitors may be suitable for estimating population levels of SB, at least in younger adults, and therefore, the suitability of these for use in older adults merits further exploration. Also noteworthy from this review are the inter-brand differences for older adults, with findings suggesting that non-wear time is best captured for the Actical using the rule of ≥120 min of consecutive zeros, compared with ≥90 min for the ActiGraph. Standard algorithms for analysis of ActiGraph data call for ≥60 min [61, 62].

Strengths and limitations

This review used a systematic search of multiple bibliographic databases. The major strength is the inclusion of studies of all brands and models of accelerometers that examined the reliability or validity of accelerometers. Previous reviews have tended to narrow the focus to one monitor brand, ActiGraphs. This is reasonable given that most validation studies have been done with ActiGraphs [18]. However, for a comprehensive review of the reliability and validity of all accelerometers that are being used with older adults, it is important to include all accelerometers. Another strength of this review was the inclusion of papers published in Spanish and Portuguese in addition to those published in English. However, all studies that met the inclusion criteria were published in English.

Two limitations of the review should be noted. First, studies were not rated on their quality. Although there are reporting lists for diagnostic studies, we are not aware of quality rating lists for studies of measurement characteristics that are relevant across different models and brands of monitors or monitors using different assumptions. However, one strength of the included studies was the rigorous designs used overall, with most validation studies reporting their assumptions, collecting data in free-living conditions across multiple days, and using appropriate concurrent or criterion measures. Most studies also described criteria for inclusion of data in analysis and use of non-wear time algorithms. Second, we only included studies of healthy, community-dwelling older adults. Although the literature on the validity of accelerometers in other older populations (e.g., residential care facilities or with specific diseases) is growing, the assumptions underlying analysis are likely to differ in those situations.


This paper reviewed the literature on the reliability and validity of accelerometers for measuring SB in older adults. The number of studies identified was small: 15 studies in 17 papers. Most studies assessed hip- or waist-worn ActiGraphs. The studies of validity assessed the GT3X+ model. These studies indicated that analysis of 60-s epochs with a VM cut-point of < 200 cpm in free-living conditions, or the use of 30-s or 60-s epochs with machine learning algorithms, provides the most valid SB estimates. Non-wear algorithms of 90+ min of consecutive zeros were suggested. This finding indicates that the criteria researchers use for classifying an epoch as sedentary instead of as non-wear time (e.g., the non-wear algorithm used) may need to be different for older adults than for younger adults. However, this conclusion is based on the findings of only one study. Two studies of an older model ActiGraph (7164) examined the number of wear-days that would be required for an acceptable reliability estimate (> 0.80). Results varied (2–5 days), and the relevance of these findings to new models is unknown. Also noteworthy was the paucity of studies on the reliability and validity of other accelerometer brands. Overall, more older-adult-specific validation studies of accelerometers are needed, to inform future guidelines on the appropriate criteria to use for analysis of data from various accelerometer brands.
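As an illustration of the cut-point approach summarised above, the following minimal Python sketch counts sedentary minutes from VM counts in 60-s epochs. It assumes that VM counts per epoch and a per-epoch wear flag are already available; the function name `sedentary_minutes` and the example values are hypothetical and are not drawn from any reviewed study.

```python
from typing import List

# Cut-point quoted in the text: VM < 200 cpm on 60-s epochs.
SEDENTARY_VM_CUTPOINT_CPM = 200


def sedentary_minutes(vm_counts_per_min: List[int],
                      wear_flags: List[bool],
                      cutpoint: int = SEDENTARY_VM_CUTPOINT_CPM) -> int:
    """Count 60-s epochs that are worn and fall strictly below the cut-point.

    Epochs flagged as non-wear are excluded, so a zero-count epoch only
    counts as sedentary when the monitor is judged to have been worn.
    """
    if len(vm_counts_per_min) != len(wear_flags):
        raise ValueError("counts and wear flags must have the same length")
    return sum(1 for count, worn in zip(vm_counts_per_min, wear_flags)
               if worn and count < cutpoint)


# Made-up example: six 60-s epochs, the first flagged as non-wear.
vm = [0, 150, 199, 200, 850, 30]
worn = [False, True, True, True, True, True]
total_sedentary = sedentary_minutes(vm, worn)
```

The sketch makes the dependence on the non-wear rule explicit: the same zero-count epoch is either dropped or counted as sedentary depending solely on its wear flag.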



cpm: Counts per minute

SB: Sedentary behaviour

Vertical axis

VM: Vertical magnitude


  1. Tremblay MS, Aubert S, Barnes JD, Saunders TJ, Carson V, Latimer-Cheung AE, Chastin SFM, Altenburg TM, Chinapaw MJM. Sedentary behavior research network (SBRN) – terminology consensus project process and outcome. Int J Behav Nutr Phys Act. 2017;14:75.

  2. Hamer M, Coombs N, Stamatakis E. Associations between objectively assessed and self-reported sedentary time with mental health in adults: an analysis of data from the Health Survey for England. BMJ Open. 2014;4:e004580.

  3. Copeland JL, Ashe MC, Biddle SJH, Brown WJ, Buman MP, Chastin S, et al. Sedentary time in older men and women: a critical review of measurement, associations with health, and interventions. Br J Sports Med. 2017;51:1539.

  4. Biswas A, Oh PI, Faulkner GE, Bajaj RR, Silver MA, Mitchell MS, Alter DA. Sedentary time and its association with risk for disease incidence, mortality, and hospitalization in adults: a systematic review and meta-analysis. Ann Intern Med. 2015;162:123–32.

  5. Biswas A, Alter DA. Sedentary time and risk for mortality. Ann Intern Med. 2015;162:875–6.

  6. Grunseit AC, Chau JY, Rangul V, Turid Lingaas H, Bauman A. Patterns of sitting and mortality in the Nord-Trondelag Health Study (HUNT). Int J Behav Nutr Phys Act. 2017;14:8.

  7. Matthews CE, Moore SC, George SM, Sampson J, Bowles HR. Improving self-reports of active and sedentary behaviors in large epidemiologic studies. Exerc Sport Sci Rev. 2012;40:118–26.

  8. Pavey TG, Peeters GG, Brown WJ. Sitting-time and 9-year all-cause mortality in older women. Br J Sports Med. 2015;49:95–9.

  9. de Rezende LFM, Rey-López JP, Matsudo VKR, Luiz OC. Sedentary behavior and health outcomes among older adults: a systematic review. BMC Public Health. 2014;14:333.

  10. Harrington DM, Barreira TV, Staiano AE, Katzmarzyk PT. The descriptive epidemiology of sitting among US adults, NHANES 2009/2010. J Sci Med Sport. 2014;17:371–5.

  11. Matthews CE, Chen KY, Freedson PS, Buchowski MS, Beech BM, Pate RR, et al. Amount of time spent in sedentary behaviors in the United States, 2003-2004. Am J Epidemiol. 2008;167:875.

  12. Davis MG, Fox KR, Hillsdon M, Coulson JC, Sharp DJ, Stathi A, et al. Getting out and about in older adults: the nature of daily trips and their association with objectively assessed physical activity. Int J Behav Nutr Phys Act. 2011;8:116.

  13. Copeland JL, Clarke J, Dogra S. Objectively measured and self-reported sedentary time in older Canadians. Prev Med Rep. 2015;2:90–5.

  14. Dogra S, Ashe MC, Biddle SJH, Brown WJ, Buman MP, Chastin S, et al. Sedentary time in older men and women: an international consensus statement and research priorities. Br J Sports Med. 2017;51:1526.

  15. Harvey JA, Chastin SFM, Skelton DA. Prevalence of sedentary behavior in older adults: a systematic review. Int J Environ Res Pub Health. 2013;10:6645–61.

  16. Atkin AJ, Gorely T, Clemes SA, Yates T, Edwardson C, Brage S, et al. Methods of measurement in epidemiology: sedentary behaviour. Int J Epidemiol. 2012;41:1460–71.

  17. Aguilar-Farias N, Brown WJ, Peeters GM. ActiGraph GT3X+ cut-points for identifying sedentary behaviour in older adults in free-living environments. J Sci Med Sport. 2014;17:293–9.

  18. Gorman E, Hanson HM, Yang PH, Khan KM, Liu-Ambrose T, Ashe MC. Accelerometry analysis of physical activity and sedentary behavior in older adults: a systematic review and data analysis. Eur Rev Aging Phys Act. 2014;11:35–49.

  19. Tudor-Locke C, Camhi SM, Troiano RP. A catalog of rules, variables, and definitions applied to accelerometer data in the National Health and Nutrition Examination Survey, 2003–2006. Prev Chronic Dis. 2012;9:E113.

  20. Moher D, Liberati A, Tetzlaff J, Altman DG, Group TP. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med. 2009;6:e1000097.

  21. Sallis JF, Saelens BE. Assessment of physical activity by self-report: status, limitations, and future directions. Res Q Exerc Sport. 2000;71:S1–4.

  22. Kelly P, Fitzsimons C, Baker G. Should we reframe how we think about physical activity and sedentary behaviour measurement? Validity and reliability reconsidered. Int J Behav Nutr Phys Act. 2016;13:32.

  23. Zweig MH, Campbell G. Receiver-operating characteristic (ROC) plots: a fundamental evaluation tool in clinical medicine. Clin Chem. 1993;39:561–77.

  24. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;1:307–10.

  25. Bourke AK, Ihlen EA, Van de Ven P, Nelson J, Helbostad JL. Video analysis validation of a real-time physical activity detection algorithm based on a single waist mounted tri-axial accelerometer sensor. Conf Proc IEEE Eng Med Biol Soc. 2016;2016:4881–4.

  26. Chudyk AM, McAllister MM, Cheung HK, McKay HA, Ashe MC. Are we missing the sitting? Agreement between accelerometer non-wear time validation methods used with older adults’ data. Cogent Med. 2017;4:1313505.

  27. Keadle SK, Shiroma EJ, Kamada M, Matthews CE, Harris TB, Lee IM. Reproducibility of accelerometer-assessed physical activity and sedentary time. Am J Prev Med. 2017;52:541–8.

  28. Keadle SK, Shiroma EJ, Freedson PS, Lee IM. Impact of accelerometer data processing decisions on the sample size, wear time and physical activity level of a large cohort study. BMC Public Health. 2014;14:1210.

  29. Koster A, Shiroma EJ, Caserotti P, Matthews CE, Chen KY, Glynn NW, et al. Comparison of sedentary estimates between activPAL and hip- and wrist-worn ActiGraph. Med Sci Sports Exerc. 2016;48:1514–22.

  30. Rosenberg D, Godbole S, Ellis K, Di C, Lacroix A, Natarajan L, et al. Classifiers for accelerometer-measured behaviors in older women. Med Sci Sports Exerc. 2017;49:610–6.

  31. Sasaki JE, Hickey AM, Staudenmayer JW, John D, Kent JA, Freedson PS. Performance of activity classification algorithms in free-living older adults. Med Sci Sports Exerc. 2016;48:941–50.

  32. Wanner M, Martin BW, Meier F, Probst-Hensch N, Kriemler S. Effects of filter choice in GT3X accelerometer assessments of free-living activity. Med Sci Sports Exerc. 2013;45:170–7.

  33. Evenson KR, Wen F, Herring AH, Di C, LaMonte MJ, Tinker LF, et al. Calibrating physical activity intensity for hip-worn accelerometry in women age 60 to 91 years: the Women's Health Initiative OPACH calibration study. Prev Med Rep. 2015;2:750–6.

  34. Bai J, Di C, Xiao L, Evenson KR, LaCroix AZ, Crainiceanu CM, et al. An activity index for raw accelerometry data and its comparison with other activity metrics. PLoS One. 2016;11:e0160644.

  35. Hart TL, Swartz AM, Cashin SE, Strath SJ. How many days of monitoring predict physical activity and sedentary behaviour in older adults. Int J Behav Nutr Phys Act. 2011;8:62.

  36. Kocherginsky M, Huisingh-Scheetz M, Dale W, Lauderdale DS, Waite L. Measuring physical activity with hip accelerometry among U.S. older adults: How many days are enough? PLoS One. 2017;12:e0170082.

  37. Hutto B, Howard VJ, Blair SN, Colabianchi N, Vena JE, Rhodes D, et al. Identifying accelerometer nonwear and wear time in older adults. Int J Behav Nutr Phys Act. 2013;10:120.

  38. Klenk J, Buchele G, Lindemann U, Kaufmann S, Peter R, Laszlo R, et al. Concurrent validity of activPAL and activPAL3 accelerometers in older adults. J Aging Phys Act. 2016;24:444–50.

  39. Wullems JA, Verschueren SMP, Degens H, Morse CI, Onambélé GL. Performance of thigh-mounted triaxial accelerometer algorithms in objective quantification of sedentary behaviour and physical activity in older adults. PLoS One. 2017;12:e0188215.

  40. Landry GJ, Falck RS, Beets MW, Liu-Ambrose T. Measuring physical activity in older adults: calibrating cut-points for the MotionWatch 8. Front Aging Neurosci. 2015;7:165.

  41. Choi L, Ward SC, Schnelle JF, Buchowski MS. Assessment of wear/nonwear time classification algorithms for triaxial accelerometer. Med Sci Sports Exerc. 2012;44:2009–16.

  42. Sirard JR, Forsyth A, Oakes JM, Schmitz KH. Accelerometer test-retest reliability by data processing algorithms: results from the Twin Cities Walking Study. J Phys Act Health. 2011;8:668–74.

  43. Pavey TG, Gomersall SR, Clark BK, Brown WJ. The validity of the GENEActiv wrist-worn accelerometer for measuring adult sedentary time in free living. J Sci Med Sport. 2016;19:395–9.

  44. Freedson PS, John D. Comment on “estimating activity and sedentary behavior from an accelerometer on the hip and wrist”. Med Sci Sports Exerc. 2013;45:962.

  45. Kozey-Keadle S, Libertine A, Lyden K, Staudenmayer J, Freedson PS. Validation of wearable monitors for assessing sedentary behavior. Med Sci Sports Exerc. 2011;43:1561–7.

  46. Clarke-Cornwell AM, Farragher TM, Cook PA, Granat MH. Empirically derived cut-points for sedentary behaviour: are we sitting differently? Physiol Meas. 2016;37:1669–85.

  47. Peterson NE, Sirard JR, Kulbok PA, DeBoer MD, Erickson JM. Validation of accelerometer thresholds and inclinometry for measurement of sedentary behavior in young adult university students. Res Nurs Health. 2015;38:492–9.

  48. Johannsen DL, DeLany JP, Frisard MI, Welsch MA, Rowley CK, Fang X, et al. Physical activity in aging: comparison among young, aged, and nonagenarian individuals. J Appl Physiol. 2008;105:495–501.

  49. Strath SJ, Pfeiffer KA, Whitt-Glover MC. Accelerometer use with children, older adults, and adults with functional limitations. Med Sci Sports Exerc. 2012;44:S77.

  50. Evenson KR, Terry JW Jr. Assessment of differing definitions of accelerometer nonwear time. Res Q Exerc Sport. 2009;80:355–62.

  51. Trost SG, Loprinzi PD, Moore R, Pfeiffer KA. Comparison of accelerometer cut points for predicting activity intensity in youth. Med Sci Sports Exerc. 2011;43:1360–8.

  52. Freedson P, Bowles HR, Troiano R, Haskell W. Assessment of physical activity using wearable monitors: recommendations for monitor calibration and use in the field. Med Sci Sports Exerc. 2012;44:S1.

  53. Pavey TG, Gilson ND, Gomersall SR, Clark B, Trost SG. Field evaluation of a random forest activity classifier for wrist-worn accelerometer data. J Sci Med Sport. 2017;20:75–80.

  54. Lyden K, Keadle SK, Staudenmayer J, Freedson PS. A method to estimate free-living active and sedentary behavior from an accelerometer. Med Sci Sports Exerc. 2014;46:386–97.

  55. Rowlands AV, Olds TS, Hillsdon M, Pulsford R, Hurst TL, Eston RG, et al. Assessing sedentary behavior with the GENEActiv: introducing the sedentary sphere. Med Sci Sports Exerc. 2014;46:1235–47.

  56. Rowlands AV, Yates T, Olds TS, Davies M, Khunti K, Edwardson CL. Sedentary sphere: wrist-worn accelerometer-brand independent posture classification. Med Sci Sports Exerc. 2016;48:748–54.

  57. Lyden K, Kozey Keadle SL, Staudenmayer JW, Freedson PS. Validity of two wearable monitors to estimate breaks from sedentary time. Med Sci Sports Exerc. 2012;44:2243–52.

  58. Lyden K, Keadle SK, Staudenmayer J, Freedson PS. The activPAL™ accurately classifies activity intensity categories in healthy adults. Med Sci Sports Exerc. 2017;49:1022–8.

  59. Ryde GC, Gilson ND, Suppini A, Brown WJ. Validation of a novel, objective measure of occupational sitting. J Occup Health. 2012;54:383–6.

  60. Steeves JA, Bowles HR, McClain JJ, Dodd KW, Brychta RJ, Wang J, et al. Ability of thigh-worn ActiGraph and activPAL monitors to classify posture and motion. Med Sci Sports Exerc. 2015;47:952–9.

  61. Choi L, Liu Z, Matthews CE, Buchowski MS. Validation of accelerometer wear and nonwear time classification algorithm. Med Sci Sports Exerc. 2011;43:357–64.

  62. Troiano RP, Berrigan D, Dodd KW, Masse LC, Tilert T, McDowell M. Physical activity in the United States measured by accelerometer. Med Sci Sports Exerc. 2008;40:181–8.

  63. John D, Freedson P. ActiGraph and Actical physical activity monitors: a peek under the hood. Med Sci Sports Exerc. 2012;44:S86–9.



The authors acknowledge the excellent editorial work of Naomi Stekelenburg in preparing the final draft of the manuscript.


The authors disclosed receipt of the following financial support for conducting the systematic review: KCH was supported by a Queensland University of Technology Professional Development Award. Nicolas Aguilar-Farias was supported by the International Collaboration Program CONICYT-CNPQ 441970–2016/8 (DIUFRO DIE17–0006).

Availability of data and materials

Data sharing is not applicable to this article as no datasets were generated or analysed during the current study.

Author information

KCH: Designed the final review process, conducted the review, performed the analysis, drafted the methods and results sections, co-drafted the introduction and strengths and limitations sections, and incorporated feedback. RLH: Conducted the review, performed the analysis, co-drafted the introduction section, and provided intellectual input into the remainder of the manuscript. NAF: Piloted the review process and provided intellectual input into the design of the review. JGZvU: Co-drafted the strengths and limitations and conclusion sections and provided intellectual input into the remainder of the manuscript. TP: Drafted the discussion section, co-drafted the conclusion section and provided intellectual input into the remainder of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Kristiann C. Heesch.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Additional file

Additional file 1:

Definitions and descriptions of test-retest reliability and validity for assessment of accelerometers and inclinometers. (DOCX 15 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Heesch, K.C., Hill, R.L., Aguilar-Farias, N. et al. Validity of objective methods for measuring sedentary behaviour in older adults: a systematic review. Int J Behav Nutr Phys Act 15, 119 (2018).
