Validity of questionnaire-based assessment of sedentary behaviour and physical activity in a population-based cohort of older men; comparisons with objectively measured physical activity data

Background Older adults are the most inactive age group and self-reporting of activities may be complicated by age-related reductions in structured activities and misclassification or recall biases. We investigate the validity of simple questionnaires about sedentary behaviour (SB), (including the widely used proxy television (TV) viewing), and physical activity (PA) in comparison with objective measures. Methods Community dwelling men aged 71–93 years, from a UK population-based cohort wore a GT3X accelerometer over the right hip for 7 days and self-completed a questionnaire including information about SB (TV, reading, computer use and car use) and PA (leisure and sporting domains). Results 1566/3137 surviving men (mean age 79 years) attended. 1377 ambulatory men provided questionnaire and accelerometer data. Questionnaires under-estimated mean daily sedentary time; 317 minutes total SB (TV, computer use, reading or driving), 176 minutes (TV) vs 619 minutes (objectively measured). Correlations between objective measures and self-reports were 0.18 (total SB) and 0.17 (TV), both P < 0.001. Objective SB levels were similar across the lowest three quartiles of self-reported SB but raised in the highest quartile. Correlations between steps/day or moderate to vigorous PA with self-reported total PA were both 0.49, P < 0.001 and measured PA levels were progressively higher at higher levels of self-reported PA. Conclusions Among older men, simple SB questions performed poorly for identifying total SB time, although simple PA questions were associated with a graded increase with objectively measured PA. Future studies of health effects of SB in older men would benefit from objective measures of SB. Electronic supplementary material The online version of this article (doi:10.1186/s12966-016-0338-1) contains supplementary material, which is available to authorized users.


Background
Both physical activity (PA) and sedentary behaviour (SB) are core determinants of health and are associated with mortality and CVD risk in prospective cohort studies [1,2]. UK guidelines include recommendations about both PA and SB [3]. At older ages levels of PA are lowest and SB are highest [4] and burdens of PA preventable disease and disability rise steeply with age [5]. Accurate assessment of PA and SB is therefore essential in this age group, yet there is little data about validity of both PA and SB in the oldest old, aged up to 90 years. Until recently all studies relating PA and SB to morbidity and mortality used self-reports (with television (TV) viewing the most commonly used proxy for SB) [6]. TV viewing is consistently associated with elevated risks of mortality, CVD and diabetes events [7]. Yet some studies reporting detrimental association between self-reported TV time and CVD risk markers, fail to find associations with accelerometer measured SB [8]. This may be because TV viewing accounts for only a small part of SB and does not reflect other domains (leisure, occupational and transport) [9,10]. Furthermore, there are concerns about questionnaire-based assessment of both PA and SB in older people, in whom misclassification bias and recall bias (which could be exacerbated by memory loss) [11] are particularly likely and who report difficulty in answering questions about typical SB [12]. Moreover, this age-group undertakes fewer sporting and structured exercise and more light activities (including functional tasks such as walking for transport, household tasks, caring and gardening [13]), which may be harder to recall, or not be included in questionnaires designed for younger age groups. Sensor technology permits objective measurements of PA and SB in population-based studies, providing estimates of time spent at different intensities of activity. In order to better understand the health effects of SB and to design effective interventions to change patterns of SB in older adults, it is important to assess the validity of commonly used instruments to measure SB in large scale studies of community-dwelling older adults. Likewise, it is important to understand the validity of existing PA questionnaires which have been used to generate estimates of PA, from which the dose-response curves between PA and clinical outcomes are estimated This paper therefore aims first to investigate how the most widely used measure of SB (TV viewing) and a more comprehensive SB score made up of common sedentary activities (TV viewing, computer use, reading and car driving time) are related to accelerometer-measured SB and different intensities of activities in the oldest old. Second, it aims to evaluate whether the established British Regional Heart Study (BRHS) PA questionnaire which has been used extensively to investigate health effects of PA [14][15][16], can capture the "low active" older adults who have greatest potential to benefit from activity interventions, and differentiate physical activity levels across "high active" groups. Third, it aims to investigate the properties of a single question about recreational PA which could be useful in screening for low active older individuals in time or resource-poor settings. Additionally, it investigates associations between questionnaire and objective measures of activity in older adults (content validation) and associations with FEV 1 and heart rate as markers of fitness (construct validation).

Measures Sample
The British Regional Heart Study is an on-going prospective, population-based cohort study following up 7735 men recruited from primary care centres in 24 British towns in 1978-80 when aged 40-59 years [17]. In 2010-2012, 3137 survivors were invited to a clinic reassessment. The National Research Ethics Service (NRES) Committee London provided ethical approval. Participants provided informed written consent to the investigation. FEV 1 was measured using a Vitalograph Compact II Spirometer, with the subject standing. A nurse demonstrated how to blow into the mouthpiece and then the participant completed three tests (blows). The highest of three consecutive test readings was selected using American Thoracic Society criteria [18] : if the best test variation was >5%, a further test was recorded. FEV 1 values were standardised to height squared. Men also had a resting 12-lead electrocardiogram (Burdick Atria 6100 ECG Machine) recorded, during which heart rate was measured. Men lay on a bed and rested for at least 5 minutes prior to measurement. Sensitivity analyses of Heart Rate data excluded (i) men with atrial fibrillation or tachycardia diagnosed on ECG using Minnesota coding guidelines, and (ii) men who reported taking beta blockers, which are associated with lower heart rate. Men completed a standard study questionnaire [19,20] about PA and SB and were fitted with an accelerometer to objectively measure PA and SB.

Accelerometer data
Men wore a GT3x accelerometer (Actigraph, Pensacola, Florida) over the right hip for 7 days, during waking hours, removing it for bathing or swimming. Accelerometer data were processed using standard methods as previously described [21]. Valid wear days were defined as ≥600 minutes wear time, and participants with ≥3 valid days were included in analyses, a conventional requirement to estimate usual PA level [22][23][24]. The number of minutes per day spent in PA of different intensity levels was categorised using standard count-based intensity threshold values of counts per minute developed for older adults [25]: <100 for SB (<1.5 MET),100-1040 for light activity (1.5-3 MET) and > =1040 for MVPA,(> = 3 MET). The time spent in bouts of MVPA was labelled as MVPA1+: which is the total number of individual minutes in MVPA (irrespective of whether they occurred consecutively or not), and also as MVPA10+ which is the number of minutes spent in bouts lasting 10 consecutive minutes or more (without a break).

Questionnaire data
Men completed a questionnaire including questions about how many hours per week they spent (i) watching TV, videos or DVDs (ii) reading (iii) using a computer (iv) driving or sitting in a car. A total SB score was derived by summing time spent in the four activities. One question about self-rated recreational and domestic activity was asked "Compared to a man who spends two hours on most days on activities such as walking, gardening, household chores, DIY (do it yourself ) projects, how physically active would you consider yourself?" (much more active/ more active/ similar/ less active /much less active). Men also completed the standard BRHS physical activity questionnaire (used since 1978); usual PA is self-reported under the headings of regular walking or cycling, recreational activity, and sporting (vigorous) activity. Regular walking and cycling relate to daily journeys. Recreational activity included gardening, pleasure walking, and do-it yourself jobs. Sporting activity included running, golf, swimming, tennis, sailing and digging. A PA score (validated in relation to heart rate and FEV 1 [19,20]) was derived for each man. Scores were assigned for each type of activity and duration on the basis of the intensity and energy demands of the activities reported based on Minnesota intensity codes [26]. Men were categorized into six groups based on their total score. The total score for each man is not a measure of total time spent doing PA but is a relative measure of how much PA has been carried out [20].

Statistical methods
Quartiles of the number of hours per week of (i) total SB (sum of watching TV, reading, using a computer and driving or sitting in a car) and (ii) watching TV were calculated. The standard six category BRHS PA score was calculated for all men [20]. Spearman's correlations between total minutes/day recorded as SB by accelerometer and (i) total sedentary hours (sum of watching TV, reading, using a computer and driving or sitting in a car) and (ii) hours watching TV were calculated. Bland Altman plots compared the self-reported total SB to accelerometer measured SB, plotting the difference between the two measures (in the same metric, minutes/day) against the average of the two measures, differences and limits of agreement were calculated. Plots were repeated for TV viewing time. Neither the questionnaire nor accelerometer are gold standard measures of SB, so these plots give evidence of concurrent rather than criterion validity. Bland Altman plots were not calculated for PA measures as the questionnaire gives ranks rather than absolute PA levels and cannot therefore be compared to accelerometer PA measures on the same scale.
Random effects linear regression models estimated associations between quartiles of self-reported TV time and each of the accelerometer measures, accounting for clustering, with accelerometer measurement (range 3-8 days) at level 1 and person at level 2. Models were adjusted for measurement related factors; age, day order (first, second, third etc. day of wear), season, accelerometer wear time (minutes/day) and town of residence. Adjusted mean values for each objective measure of PA or SB were calculated for each self-reported SB or PA measure. Complete case analysis was used. Statistical analyses were run in R version 2.15.3 and Stata version 13 [27,28].

Results
Among 3137 men invited, 1655 attended (52.8 %), of whom 1455 (46.4 %) provided questionnaire and adequate accelerometer data (≥600 minutes wear time on 3-7 days) ( Table 1). Analyses were restricted to 1377 independently mobile, community-dwelling men, aged on average 79 years (range 71-93), with a BMI of 27.1kg/m 2 , 22.6 % of whom had pre-existing CVD. Comparing characteristics of men who did participate in the accelerometer survey to those who did not, participants were more active at previous time points (59 % vs 46 % were at least moderately active at a survey 10 years previously), younger (mean age 78.5 years vs 80.1years), and less likely to smoke (7.2% vs 12.9 %) compared to men who did not participate.
1377 men met the criteria above and also had complete PA questionnaire data, of which 1257 also had data on HR and FEV 1 . 1112 men had complete data on SB score (men with missing data both about participating in an SB activity and hours per week in that activity were not included in the SB score, Table 1). Among the 1112 men, the average total SB was 317 minutes/day, comprising 176 minutes watching TV, 65 minutes reading, 42 minutes driving and 35 minutes using a computer. Over 90 % of men watched TV, read, drove or sat in a car but only 51.8% used a computer. 13% of men were moderately vigorously active and 9 % of men vigorously active according to the habitual PA score (Table 2). On average each day men registered 187 accelerometer counts per minute, took 4769 steps, and spent 619 minutes in SB (122 minutes in bouts lasting > =1 hour), 200 minutes in light activity and 39 minutes in MVPA. On average only 9 minutes/day were spent in bouts lasting ≥10 minutes (Table 3), reflecting the fact that many men did not sustain a single 10 minute bout of MVPA per day.
Validity of self-reported sedentary behaviour: TV viewing, computer use, sitting in a car, reading Both self-reported total SB and TV viewing time were weakly positively correlated with accelerometer SB time (Spearman's r = 0.18 and 0.17 respectively, both P < 0.001). Bland Altman plots showed that accelerometers recorded higher SB time than either self-report measure (Figs. 1 and 2); 300 (95 % CI 291,309) minutes/day (limits of agreement −6 to 607) more compared to self-reported total SB time and 440 (95 % CI 433,447) minutes/day more (limits of agreement 193 to 687) compared to self-reported TV viewing. For both total SB time and TV time, the discrepancy between accelerometer and self-report TV time was greater (with participants reporting much less SB time than was measured) in the participants who were least sedentary. In sensitivity analyses, no evidence was found that the mean differences varied by social class (manual vs non manual), presence of pre-existing CHD or stroke and presence of mobility limitations However, among men aged <80 years and > =80 respectively, accelerometers recorded 285(95 % CI 274,295) more minutes per day and 335(95 % CI 318, 352) more minutes per day for self-reported total SB time and equivalently 431(95 % CI 423,440) and 460(95 % CI 446,473) more minutes per day for self-reported TV viewing, suggesting that error in selfreported SB as particularly marked among the over 80s. Descriptive data for the total self-reported SB and PA levels stratified by age group are presented in Additional file 1: Table S1.
Self-reported total SB and TV viewing time were strongly correlated (Spearman's r = 0.78, P < 0.001), so patterns of associations of each self-report measure with accelerometer measured SB were similar (Table 3). In both cases, men reporting highest SB levels (top quartile) had markedly higher measured SB than the rest, whereas measured SB was very similar across the lowest three quartiles of self-reported SB ( Table 2). An inverse pattern was observed for the PA measures (CPM, steps, high light activity and MVPA). As noted above, men self-reported substantially less SB time than registered on accelerometers; data in Table 3 show that in the lowest quartile of the SB score, self-reported total SB score accounted for about 25 % of measured daily SB time, compared to 80 % in the top quartile of the SB score. In the lowest quartile of TV viewing, TV time accounted for about 10% of measured SB whereas it rose to 50 % in the men in the highest quartile.

Validity of self-reported physical activity
Correlations between the self-reported PA score and accelerometer measured CPM, steps, light and MVPA were 0.49, 0.49, 0.33 and 0.49 respectively, all P < 0.001. Selfreported PA levels were associated with higher CPM, step count, light and MVPA and lower SB in a graded, stepwise manner (Table 4). There was a larger incremental difference in step count, CPM and high light activity between "inactive" and "occasional" groups than for increases in groups at higher activity levels. The gradient in activity level was stronger for step count and total MVPA than for low light activity. There was a plateau across the three highest categories of the PA scores for values of low light PA and bouts of SB lasting ≥ one hour. Overall 15 % of men achieved 150 minutes/week in bouts of ≥10 minutes, rising from 0.4 % in the "inactive" to 25.4 % in the "vigorously active".
The associations between self-reported and objectively measured PA did not vary by presence of pre-existing CVD, or by age group (interactions all p > 0.2). The only exception was the presence of interactions between age group and both CPM and total MVPA minutes (likelihood ratio test, P = 0.017 and P = 0.046 respectively). Among younger men, the PA scores were associated with higher objectively measured CPM and MVPA minutes than among older men, and this difference was greater at the higher scores for the younger men.
Total PA score was linearly and inversely related to resting heart rate and positively related to FEV 1 (Additional file 1: Table S2).

Validity of self-reported recreational activity
Responses to a single item question about how much recreational activity men did compared to other men their age were positively and linearly associated with mean CPM, daily steps, minutes of light PA and of MVPA1+ (Additional file 1: Table S3). Correlations between this single item PA question and accelerometer measured CPM, steps, light and MVPA were 0.46, 0.45, 0.42 and 0.43 respectively, all P < 0.001. The prevalence of adhering to MVPA guidelines increased though flattened out in the two most active categories. Mean daily minutes in sedentary behaviour and long sedentary bouts decreased linearly, although again there was a plateau at higher scores. Self-reported recreational activity was positively related to HR and FEV 1 (Additional file 1: Table S2).

Discussion
In this large sample of community dwelling older men, the validity of self-reported SB was weak (either as a score with four sedentary behaviours or just TV viewing), with correlations of <0.2. The top quartile of the self-reported SB score identified the most sedentary men, however measured SB and PA were similar across the three lower quartiles. Self-report SB scores had poor content validity and performed most poorly among the men aged over 80 years. In contrast, we found evidence for content and construct validity for both the PA score (a simple to administer selfreport based on different domains of activity (habitual active transport; walking and cycling, recreational and sporting activities)) and a single item recreational PA question. Both were strongly positively associated both with objective measures of total MVPA and step counts, and inversely with long and short bouts of SB, demonstrating content validity and associated with measures of fitness (higher FEV 1 and lower Heart Rate), demonstrating construct validity. Hence both were valid for ranking PA in older adults, although they cannot necessarily identify the actual intensity of physical activity level.

Validity of self-reported Sedentary behaviours
As seen in other studies [29,30], accelerometers recorded much more SB time than either the self-reported total SB score (including reading, car and computer use) or just TV score. This is partly because the scores omit some habitual SBs (such as eating, sitting talking, resting). Further there may be social desirability bias (where undesirable activities are underestimated) or recall bias (that activities are systematically forgotten). However recall bias specifically about SB seems unlikely, given that both self-reported PA scores performed very well in relation to objective measures. The four components of our SB score are reported to be among the most common individual SBs in other studies of older adults examining more individual SBs, furthermore they are among the most reliably recalled, with highest test-retest ICCs, compared to other behaviours [9,31]. In our study, we did not observe differences in performance of the SB scores by age group, presence of CVD and mobility limitations, except that the SB self-report performed better in the under 80s, which fits with other data showing better performance of an SB score in <75 year olds [31]. Overall, our correlations between SB scores and measured SB of <0.2 fit with data from other studies of older adults investigating the validity of self-reported TVwatching compared to objective measures which report correlations of 0.23 for >60 year olds [30] and 0.22 in 65-92 year olds [32] in the latter study all nine other single SB questions were also correlated ≤0.2 with accelerometer measured SB. A different single item question asking about the time spent sitting in an average day also had poor agreement with accelerometer measured SB (Kappa −0.0003) [33]. Given that TV and the other domains of SB we asked about accounted for a smaller proportion of the total daily SB time in the lower compared to the higher quartiles of the self-report scores, we were interested in whether selfreport SB scores derived from questionnaire instruments including more domains of SB might perform better. However correlations between accelerometer measured SB and self-report SB were 0.3 for a 12-item instrument [31], 0.14 [33] and 0.12 [10] respectively for 9 and 8 item scores derived from different modifications of the SB questions on the Community Health Activities Model Program for Seniors (CHAMPS) questionnaire, and 0.3 [9] for a 7-item questionnaire designed to measure SB in older adults. A correlation of 0.35 for a 10-item instrument, was maximised to 0.46 by selecting a sub-set of 6 items however sleep was included in both of these selfreport scores [32], so they are not directly comparable to other studies reported here, as sleep is specifically excluded from definitions of SB [34]. Hence the validity of self report scores for assessing total time in SB appears to be limited in older adults, although self-reported SB questionnaires may give insight into the context of behaviours. Our findings have implications for studies of self-reported SB in relation to health outcomes; the poor discrimination of the lower levels of self-reported SB time in relation to objective PA and SB may account for some of the discrepancies in the reported relations between cardio-metabolic markers and self-reported TV time compared to objectively measured SB [8,35,36].

Validity of self-reported Physical activity
Our results suggested good content and construct validity for the six-point PA score: correlations between the score and step counts, MVPA and CPM were 0.5; stronger than for the SB score and than for some other PA Table 2 Health, demographic, physical activity and sedentary behaviour characteristics (n = 1377 men), mean (SD) or % (n) Men with PA score self-reported and accelerometer data a , n 1377 Region, % (n)  [37]. Also, the PA score differentiated between individuals performing across the range of PA levels, including at low activity levels. Hence the score could help identify low-active individuals who would most benefit from PA interventions. The PA score also differentiated total measured MVPA well. However the higher PA scores performed less well at identifying MVPA in bouts lasting at least 10 minutes (needed to fulfil current PA guidelines) and light physical activity; hence the questionnaire may misclassify light activity levels in some of the more active adults. The total physical activity score differentiated sedentary behaviours well: both total amount and prolonged bouts of SB declined steadily as the score increased.
Single item screening questions asking participants to rank themselves against others have been used in many large scale epidemiologic studies because they are simple to administer and have a low participant burden. They are often used to control for total amount of physical activity as a confounding variable. Our results suggested good content and construct validity of our single item question although there was a ceiling effect, so the question was   better at ranking low-active than highly active individuals. This is important in older populations given that many other questionnaires are reported to perform more poorly in identifying low active adults. Our measure may be useful in population-based studies requiring a ranking of older adults' PA levels.

Strengths and limitations
We studied concurrent objectively measured and selfreported SB using commonly used activity questions and accelerometers in a large population based-sample of free-living community-dwelling older men recruited from across the UK. Whilst the cut point we used for defining MVPA from the accelerometer data are validated in relation to energy expenditure in older adults (8), SB defined as <100 CPM from an Actigraph accelerometer worn over the hip could include some standing time, however average CPM was <10 CPM during SB. Therefore varying the definition of SB from <100 to <50 CPM changed the total SB time recorded very little, so any biases are likely to be small. Hip-worn Actigraph-measured SB has been demonstrated to have minimal bias compared to thigh-worn Activpal measured SB and the two measures correlate r = 0.76 [38]. Accelerometer study response rates (50%) are similar or superior to other studies of older adults; in other studies valid data were obtained from 21 % [23], 43 % [22] and (in the Health Survey for England population) 37 % of women and 48 % of men over 75 years had 4 or more days with valid data [39]. Nevertheless, participants were more often younger, and had healthier behaviours than non-participants and may therefore have been more physically active and less sedentary than the general population. As this study includes only men results may not be generalizable to older women, who have different patterns of PA [21] and may have different reporting biases or recall. Indeed it has been reported that TV time is a better indicator of SB time for women than for men [40]; future studies should investigate the properties of self-reported compared to objectively measured SB in older women.

Conclusions
Our simple PA score is useful in older adult populations for differentiating the high and low active and classifying the men across low activity levels, but may misclassify light intensity activity. The single screening question about recreational activities performed well for grading all intensities of PA and SB. Validity of self-reported SB was poor overall, especially at lower levels of SB, both for the single TV viewing question and for the combined 4-domain measure. This suggests that objective assessment of SB is likely to be more accurate than self-reported measures similar to those used in this study. As discussed above, more extended SB instruments used in other studies also related poorly to objective SB measures in older adults.
Given the widespread use of TV viewing as a proxy for SB in population studies, there may be implications for accurate estimation of the health effects of SB, particularly among individuals who watch less TV. Hence there is a need for studies examining the health effects of PA and also SB to use objective assessments.

Additional file
Additional file 1: Table S1. Physical activity and sedentary behaviour characteristics, mean (SD) or % (n), stratified by age 80 years. Table S2.
Associations Between PA and SB Score and Heart Rate (Beats Per Minute) and FEV 1 (Litres). Table S3. Associations between self-reported recreational activity score and components of objectively measured PA (n=1337 men). 1 (DOCX 21 kb)