Validity of a multi-context sitting questionnaire across demographically diverse population groups: AusDiab3

Background Sitting time questionnaires have largely been validated in small convenience samples. The validity of this multi-context sitting questionnaire against an accurate measure of sitting time is reported in a large demographically diverse sample allowing assessment of validity in varied demographic subgroups. Methods A subgroup of participants of the third wave of the Australian Diabetes, Obesity, and Lifestyle (AusDiab3) study wore activPAL3™ monitors (7 days, 24 hours/day protocol) and reported their sitting time for work, travel, television viewing, leisure computer use and “other” purposes, on weekdays and weekend days (n = 700, age 36-89 years, 45 % men). Correlations (Pearson’s r; Spearman’s ρ) of the self-report measures (the composite total, contextual measures and items) with monitor-assessed sitting time were assessed in the whole sample and separately in socio-demographic subgroups. Agreement was assessed using Bland-Altman plots. Results The composite total had a correlation with monitor-assessed sitting time of r = 0.46 (95 % confidence interval [CI]: 0.40, 0.52); this correlation did not vary significantly between demographic subgroups (all >0.4). The contextual measure most strongly correlated with monitor-assessed sitting time was work (ρ = 0.25, 95 % CI: 0.17, 0.31), followed by television viewing (ρ = 0.16, 95 % CI: 0.09, 0.24). Agreement of the composite total with monitored sitting time was poor, with a positive bias (B = 0.53, SE 0.04, p < 0.001) and wide limits of agreement (±4.32 h). Conclusions This multi-context questionnaire provides a total sitting time measure that ranks participants well for the purposes of assessing health associations but has limited accuracy relative to activPAL-assessed sitting time. Findings did not differ in demographic subgroups. Electronic supplementary material The online version of this article (doi:10.1186/s12966-015-0309-y) contains supplementary material, which is available to authorized users.


Background
Evidence is accumulating on the risks to health (including premature mortality) posed by prolonged periods of time spent in sedentary behaviours [1,2], defined as sitting or reclining, and expending less than 1.5 metabolic equivalents (METs) of energy during waking hours [3]. Much of this evidence has been derived from studies examining overall sitting throughout the day or specific sedentary behaviours, such as television (TV) viewing time [4]. In order to understand and influence this adverse exposure, specific domains (e.g. occupational) and/or behaviours (e.g. TV viewing) need to be taken into account [5]. Measurement devices can assess duration and time-of-day when the sedentary time occurs; however, these devices typically do not measure the context [6] and may be expensive in large scale studies. There have been recent developments in methods to identify objectively the relevant contexts in which sedentary behaviors take place through, for example, the use of Global Positioning System devices, wireless location systems and wearable cameras, although these have had difficulty with obtaining usable data particularly inside buildings [7]. Multi-item self-report measures that assess sedentary time spent in particular contexts may provide a cost effective alternative.
It is imperative, however, that the properties of such measures are assessed, given the susceptibility of selfreport measures to recall error and bias [8].
Many validity studies for sitting time questionnaires have been carried out in small convenience samples, not representative of the general population (e.g. university staff) [6]. Previous studies have shown that accuracy of sitting time recall can vary by age, gender and education [9][10][11][12], and, therefore, it is important to assess the validity of questionnaires across different population subgroups for use in public health research.
The third wave of the Australian Diabetes, Obesity, and Lifestyle (AusDiab) study included past-week recall questions asked across a comprehensive array of sedentary domains and behaviours, and collection of objective sitting time data (via activPAL) [13,14] in a large sub-sample. This presented the opportunity to further understand the measurement properties of the questionnaire. Specifically, we examined the validity of the composite self-reported sitting time relative to sitting time measured by activPAL3™ monitors (monitor-assessed sitting) in terms of agreement and correlations, along with differences between demographic subgroups in relative validity. We also tested the associations with monitor-assessed total sitting time of the various sitting contexts, in all participants, workers and in non-workers.

Data source
The AusDiab baseline study (n = 11,247) was initially conducted during 1999 to 2000 to examine the prevalence of diabetes and its risk factors in the Australian population [15]. The third survey (AusDiab3) took place in 2011/12 and 4,614 participants (45 % of those potentially eligible from the baseline sample) attended the onsite testing and answered the questionnaires. Detailed methods of sample recruitment and data collection for the AusDiab study have been described elsewhere [15,16]. The study was approved by the Ethics Committee of the International Diabetes Institute and The Alfred Health Human Ethics Committee (no. 39/11), and written informed consent was obtained from all participants.
AusDiab3 used the same TV viewing time questions as in prior surveys and added new questions on sitting for work, transport, leisure-time computer use and "other" sitting. AusDiab3 also added an activity monitor assessment in a subsample of eligible participants, who were recruited from on-site attendees at 46 sites across Australia [16]. Each day, participants for the activity monitor sub-study were invited consecutively, beginning with the first potentially eligible participant (i.e., ambulatory, not already known to be pregnant) until either no more devices were available or five participants had been recruited for that day. Participation in this component required informed written consent, additional to that for participation in the main study. In total 1,014 participants in AusDiab3 were approached to participate, and 782 (77 %) agreed.

Data collection
On the day of recruitment, participants underwent biochemical, anthropometric and behavioural assessments as part of the larger set of AusDiab3 survey procedures. At this visit, the questionnaires were administered by trained interviewers and the activity monitors were either attached by research personnel or self-attached and checked by research staff.

Referent assessment
Objective data on time spent sitting or lying (collectively referred to here as sitting) were collected using the valid and responsive [14,17,18] activPAL3™ activity monitor (PAL Technologies Limited, Glasgow, UK; see Table, Supplemental Digital Content 1 which details methods for the activPAL3™ monitor). The monitor was secured onto the right anterior thigh with a hypoallergenic patch. Participants were asked to wear the monitor continuously (24 h/day) for the seven days following the onsite assessment and to report in a diary all wake up, sleep and monitor removal times (if any). Monitor data were processed in SAS™ 9.3 (SAS Institute Inc., Cary NC; see Additional file 1). Periods spent sleeping or not wearing the monitor, and invalid days, were excluded. Time spent sitting/lying down was totalled for each day and then averaged for all days, weekdays and weekend days deemed valid. Valid days were defined as days with monitor removal <20 % of waking hours, and with ≥10 hours estimated waking wear time (when sleep/wake were not reported in the diary). Sitting time was multiplied by a correction factor (waking hours/worn waking hours) to estimate each individual's sitting time over the entire waking day (not only the waking wear period).

Self-reported sitting time
Participants were asked to report sitting time over the past seven days, separately for weekdays and weekend days, across five contexts (work, transport, TV viewing, leisure time computer use and "other" sitting; see Additional file 2, Sitting time questions from AusDiab3). The periods of recall and activPAL wear did not overlap as the monitors were provided on the same day the questionnaire was administered. Total times over the recall period were converted to average time per day by dividing by five (weekdays), two (weekends), or seven (overall [weekday + weekend]). Such daily averages were calculated for each sitting time item, and the sum of all items (termed composite self-report sitting time).

Other measures
Socio-demographic data collected in the intervieweradministered questionnaire included: age; gender; education; work status; marital status; annual household income; and, area of residence (categorised as per Table 1). Moderate-to vigorous-intensity physical activity (MVPA) was determined using the Active Australia Survey (AAS), a validated and reliable questionnaire [19,20], by summing the time spent over the last week in walking, other moderate and vigorous physical activity; (vigorous time multiplied by two as per AAS procedures) [21]. Physical activity status was then categorised as none (0 mins/week), insufficient (>0-< 150 mins/week) and sufficient (≥150 mins/week) according to Australian guidelines [22]. Participants reported whether the week recalled in the physical activity and sedentary behaviour parts of the questionnaire was a "typical week" for them or not. Body Mass Index (BMI) was calculated using measured height and weight (protocol previously described) [16] and was categorised as normal or underweight (<25 kg/m 2 ), overweight (≥25 to <30 kg/m 2 ) or obese (≥30 kg/m 2 ).

Statistical analyses
Analyses were conducted in SPSS version 22.0 (IBM Corporation, Armonk, NY) with statistical significance set at p < 0.05 (two-tailed). The multistage sampling led to only a very average design effect (1.3); statistics were reported as per a simple random sample. Only participants who provided at least four valid days of monitor data (including at least one weekend day) (n = 711), completed the sitting time questionnaire (n = 702) and whose total self-reported sitting time was plausible (≤18 hours/day) (n = 700) were included in analyses. Analyses did not exclude the 129 participants (18 %) who reported their recall period was not "typical" for them as relative validity (r = 0.54) was not worse for them than for other participants (r = 0.45). Of the included participants 595 participants [85 %] wore the activPAL for seven valid days.
The associations between self-reported and monitorassessed sitting time over the whole week, for weekdays, and for weekend days were reported in terms of Pearson's correlations (r) with 95 % confidence intervals (95 % CI) performed in the whole sample and within various demographic subgroups. Linear regression models with interaction terms (self-report sitting time x demographic characteristic) examined differences between demographic subgroups in associations of self-report sitting with monitor-assessed sitting time. Associations of individual sitting time items with overall monitorassessed sitting time in the total sample, and specifically in workers and non-workers, were tested using Spearman's correlations (ρ). Strength of correlations are described according to the criteria established by Cohen: >0.5 large; 0.5-0.3 moderate; 0.3-0.1 small; and, <0.1 insubstantial [23].
Agreement between composite self-report and monitorassessed sitting time was examined using the method outlined by Bland and Altman [24], with the plot displaying mean difference (MD) and limits of agreement (LoA; +/-1.96 × SD). Linear regression was used to check whether the MD and LoA varied across average values of self-report and monitor-assessed sitting time ([composite self-report sitting + monitor-assessed sitting]/2). Agreement was examined over all days, for weekdays and for weekend days.

Results
The included sample (n = 700) covered participants of ages 36 to 89 years (mean age = 59 years). Most (82 %) reported that the sitting time responses they had provided were indicative of a typical week. Participants were awake (and wearing the monitor) for an average of 15.8 h/day (SD = 1.08) of which 56 % was recorded as sitting or lying down by the activPAL monitor. The included sample was not significantly different to the other AusDiab participants with respect to baseline characteristics of gender, area of residence and work status, but they were younger (p < 0.001) and more likely to be married or in a defacto relationship (p < 0.01), have a post high school qualification (p < 0.001), be born in Australia (p < 0.01), be in the normal BMI range (≥18.5-< 25 kg/m 2 , p < 0.001), and less likely to have low household income (<AU$800/ week, p < 0.001) and to report MVPA of 150 minutes or more per week (p < 0.01) (see Additional file 3, Characteristics of excluded and included participants at AusDiab1).

Context-specific sitting time
Correlations with monitor-assessed sitting time are shown in Table 2, overall and separately for weekdays and weekend days. Correlations between self-report and monitorassessed total sitting time were moderate for weekday totals (r = 0.49; 95 % CI: 0.43, 0.54) and weak for weekend day totals (r = 0.25; 95 % CI: 0.19, 0.32), even within those with two weekend days of activPAL wear (n = 648; r = 0.26; 95 % CI: 0.19, 0.33). Sitting time reported in each context individually had only small or insubstantial correlations with total monitored sitting time, that were typically statistically significant, except for transport sitting, and for "other" sitting. The highest correlations with total monitored sitting overall across the week, on weekdays, and on weekend days respectively were observed for overall work sitting (ρ = 0.25;  Missing data for work status (n = 6), highest qualification (n = 4), and income (n = 7). Standard deviation (SD), Pearson's correlation (r), 95 % confidence interval (95 % CI). P for difference between groups in association of self-report with monitor-assessed total sitting time (linear regression)

Agreement
The Bland-Altman plot for the agreement in total (composite) sitting time (h/day) assessed by self-report versus monitor is shown in Fig. 1. The difference between the instruments increased significantly with the average of the two measures, but the 95 % limits of agreement were constant, and consistently large, at ± 4.32 h/day (Fig. 1). For all but the very highest levels of average sitting time (>approximately 14 h /day), the questionnaire underestimated relative to the monitor. The difference between the two measures was -2.01 h/day on average (i.e., at 7.85 h/day, mean of self-report and activPAL sitting time). A positive bias, and wide limits of agreement were also seen for weekday and weekend day sitting (Fig. 1).

Discussion
In this sample of Australian adults, we found that composite self-reported sitting time recalled over the past seven days, measured using a 10-item multi-context questionnaire, was moderately correlated (r = 0.46, 95 % CI: 0.40, 0.52) with monitor-assessed sitting time overall and across a wide range of demographic groups. The self-report measure was an acceptable method to rank participants' sitting time. This measure is not appropriate when accurate estimates of actual sitting time are required as agreement with monitor-assessed sitting time was poor, both at a group level (mean differences averaging 2 h/day that varied proportionally to average sitting time), and for individuals (>4 h/day limits of agreement). Mostly, sedentary behaviour questionnaires show small to moderate correlations (predominantly <0.40) with referent measures, typically derived from waistworn accelerometers (sedentary time <100 counts per minute, vertical axis) that do not measure sitting [6]. Our findings show correlations of r = 0.4 to r = 0.5, despite the recall and device wear weeks being completely separate, which tends to weaken correlations. Weekend day measures may be the most affected, with the least days of monitoring available to minimise the between-day variation. The relative validity in this current study was similar to that observed against activPAL sitting time within overweight breast cancer survivors for total sitting time assessed using the seven-context Past-day Adults Sedentary Time questionnaire (PAST; on recalled day r = 0.58, 95 % CI 0.40, 0.72; for overall weekly sitting time r = 0.36; 95 % CI 0.11, 0.57) [25] and the Dutch version of the SIT-Q_7d, a five-context past week recall questionnaire (ρ = 0.52, p < 0.001) with similar contexts to the AusDiab questionnaire but with categorical responses [26].
Previous studies have examined and found differences in the validity of sitting time questionnaires by age, gender, and education [9,11,12,27]. We observed no significant or large differences in relative validity between socio-demographic groups aside from poorer relative validity within older adults (≥75 years) than their younger counterparts. Age-related differences in validity have been observed elsewhere [11]. While this could reflect an age bias in the referent measure, some caution should be exercised when using the questionnaire in adults aged ≥75 years.
Sitting for work and for TV viewing were consistently the two contextual measures most strongly correlated with monitor-assessed total sitting time. Their relative importance seemed to depend on the time period of interest (work was important overall and for weekday sitting while TV viewing was especially important for weekend-day sitting) and participant work status. In non-workers, TV viewing was always the most important context, which was somewhat consistent with a finding from NHANES [28], that the correlation of TV viewing time with total accelerometer-derived sedentary time (<100 cpm) appeared stronger within non-workers than workers.
Agreement between composite self-report and monitorassessed sitting time was poor, with a positive linear relationship between the difference and average of the two measures. At mean levels of average sitting time, participants reported over two hours less sitting per day than was recorded by the activPAL. This bias is similar to previous findings for sedentary behaviour questionnaires compared to activPAL sitting time [25,26] and to accelerometerderived sedentary time (100 cpm) [11] for both single-item sitting time questions [10,29] and composite measures of sitting time [11,30]. Limits of agreement were also wide, indicating poor individual accuracy.
Both self-report and monitor-based measures have their own unique place in furthering the science of sedentary behaviour [31]. Self-report measures, such as the AusDiab3 multi-context questionnaire, can provide important behavioural context information, are a cost-effective method of ranking participants, and are useful in assessing relationships between sitting time and health outcomes. However, when accurate sitting time duration is required, monitors are more appropriate in order to provide the necessary precision at both the group and the individual level [13].
A key strength of this study was the comparison selfreport questions against an accurate measure of sitting time (activPAL3) [13,14] in a large and diverse sample of adults. Although not population representative, the sample was much closer to representative than can be said of most validity studies, covering a wide spectrum of Australian adults across multiple geographic locations, and chiefly lacking only adults aged ≤35 years. Findings cannot be generalised to young adults. The validity of individual sitting contexts, such as computer use, may differ for young adults from what we observed. Arguably, this study may have overestimated the validity of the questionnaire as the Fig. 1 Bland-Altman Plots for total sitting time (h/day) reported over the last seven days across five contexts versus monitored by activPAL (n = 700 Australian adults >35 years): overall (a); on weekdays (b); and, on weekend days (c). The solid line represents the mean difference (MD) between the two measures and the dashed lines are the 95 % limits of agreement (LoA) in h/day general reporting abilities may be greater for this group (now involved in the third wave of a longitudinal study) than for the population at large. However, this specific questionnaire had never been administered to the participants in previous waves (with the exception of the TV viewing question). A further limitation of the study design was that sedentary behavior may vary over time, but only a single assessment of each measure was taken, close together in time, but not at the same time. As such our findings likely underestimate what the agreement and correlation would be if our objective and self-report measures were obtained for precisely the same week. Ideally, for behaviors that are likely to vary over time, validity should be assessed for long-term averages [32]. Methods to address this have been developed, involving performing assessments repeatedly, spread across an extended period of time [32]. Additionally, the reliability of this measure was not assessed, although both self-report and devicebased measures of sedentary behavior typically show good reliability [6,33].