Study design
This study uses the wave 1 data from the Interventions, Research, and Action in Cities Team (INTERACT) study [16]. INTERACT is a cohort study designed to use natural experiments to examine the impact of transportation interventions in four Canadian cities (Victoria, Vancouver, Saskatoon, and Montreal) [17]. To provide context, we have included socio-demographic data for the four cities in our study (Additional file 1: Supplement A).
Study participants
Recruitment for wave 1 occurred from May 19 to October 21, 2017 (156 days) in Victoria; April 20 to September 20, 2018 (123 days) in Vancouver; September 19, 2018 to January 4, 2019 (108 days) in Saskatoon; and June 6 to December 21, 2018 (199 days) in Montreal. Inclusion criteria across all sites were being at least 18 years old, being able to read or write English (or French in Montreal) well enough to answer an online questionnaire, and not planning to move out of the city in the next two years. Site-specific inclusion criteria were living in the Capital Regional District and cycling at least once a month in the city of Victoria; living within 3 km of the Greenway in Vancouver; riding the bus at least once in a typical month, or living within 800 m of the proposed Bus Rapid Transit (BRT) lines, in Saskatoon; and living on the Island of Montreal, in Laval, or on the South Shore in Montreal. The minimum requirement for participation was completing an online survey that measured health, physical activity, social participation, travel behaviour, and socio-demographic characteristics using validated measures. Participants could also choose to wear a SenseDoc [18] device for 10 days during waking hours, which recorded GPS and accelerometry data. The SenseDoc records location using GPS at 1 Hz and acceleration using an accelerometer at 50 Hz, continuously, as long as the device is charged and on. The participants analyzed in this paper were those who completed the health survey and wore the SenseDoc device.
Measures
Outcomes
We examined two physical activity outcomes: total time spent in light, moderate, and vigorous physical activity (PA); and time spent in moderate or vigorous physical activity (MVPA). Minutes of sedentary, light, moderate, and vigorous physical activity were calculated using accelerometer data collected by the SenseDoc device worn by participants for 10 days. Minute-by-minute, location-based physical activity level was calculated using methods applied in past research [16]. First, raw accelerometer data were converted to counts using published methods (implemented with Python code) [19, 20]. Vertical axis counts were then used for wear detection, with the Choi algorithm used to identify device wear and non-wear time [21]. For times when the accelerometer was worn, Troiano’s physical activity cut points were used to classify sedentary, light, moderate, and vigorous physical activity at the minute level [22]. Second-level GPS location data were joined to the accelerometer data at the second level. To aggregate the GPS data to the minute level, the median location within each minute was taken as the location for that minute. We did not apply wear time criteria (e.g., 10 h of valid data) to the physical activity data, as our objective was to keep as much of the data as possible for all participants.
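The minute-level classification and the median-location aggregation described above can be sketched as follows. This is a minimal illustration, not the study's code: the cut-point values are the adult thresholds commonly attributed to Troiano et al. and should be checked against [22], the input layout is hypothetical, and taking the median separately for latitude and longitude is an assumption about how "median location" was defined.

```python
from collections import defaultdict
from statistics import median

# Assumed adult cut points in vertical-axis counts per minute, as commonly
# attributed to Troiano et al.; verify against reference [22] before reuse.
SEDENTARY_MAX = 99      # 0-99 counts/min
LIGHT_MAX = 2019        # 100-2019 counts/min
MODERATE_MAX = 5998     # 2020-5998 counts/min; 5999 and above is vigorous


def classify_minute(counts_per_minute):
    """Return the intensity label for one worn minute of accelerometer counts."""
    if counts_per_minute <= SEDENTARY_MAX:
        return "sedentary"
    if counts_per_minute <= LIGHT_MAX:
        return "light"
    if counts_per_minute <= MODERATE_MAX:
        return "moderate"
    return "vigorous"


def minute_locations(gps_fixes):
    """Collapse 1 Hz GPS fixes to one (lat, lon) per minute using the median.

    gps_fixes: iterable of (epoch_second, lat, lon) tuples (hypothetical layout).
    The median is taken separately for latitude and longitude (an assumption).
    """
    by_minute = defaultdict(list)
    for second, lat, lon in gps_fixes:
        by_minute[second // 60].append((lat, lon))
    return {
        minute: (median(p[0] for p in pts), median(p[1] for p in pts))
        for minute, pts in by_minute.items()
    }
```

Non-wear minutes would be dropped before calling `classify_minute`; wear detection itself is not shown here.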
To link physical activity outcome data with built environment and gentrification characteristics, we computed the daily sum of minutes spent in sedentary, light, moderate, or vigorous physical activity in each dissemination area (DA) for each participant on each day. Dissemination areas are Canadian census geographies representing small areas with an average population of 400 to 700 people [23].
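As a rough sketch of this aggregation step, the snippet below sums classified minutes per participant, day, and DA and derives the two outcomes (PA and MVPA). The record layout and the function name are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

INTENSITIES = ("sedentary", "light", "moderate", "vigorous")


def daily_minutes_by_da(minute_records):
    """Sum minutes per (participant, date, DA), split by intensity.

    minute_records: iterable of (participant_id, date, da_id, intensity)
    tuples, one per worn minute (hypothetical layout).
    """
    totals = defaultdict(lambda: dict.fromkeys(INTENSITIES, 0))
    for participant_id, date, da_id, intensity in minute_records:
        totals[(participant_id, date, da_id)][intensity] += 1

    summary = {}
    for key, minutes in totals.items():
        row = dict(minutes)
        # PA = light + moderate + vigorous; MVPA = moderate + vigorous.
        row["pa"] = minutes["light"] + minutes["moderate"] + minutes["vigorous"]
        row["mvpa"] = minutes["moderate"] + minutes["vigorous"]
        row["total"] = sum(minutes.values())
        summary[key] = row
    return summary
```

Each key of the returned dictionary corresponds to one person-day-DA observation.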
Neighborhood built environment and gentrification exposures
We examined how neighborhood environment exposures were linked to the amount of total physical activity (light, moderate, and vigorous) and moderate or vigorous physical activity. Our environment indicators were measured at either the DA level or the census tract (CT) level; CTs are stable areas made up of multiple DAs, with populations of 2,500 to 8,000 people (average of 4,000) [24]. The exposures (active living environments, gentrification, proximity to amenities, and urban compactness) are described in the following paragraphs. We then joined our physical activity dataset to each of the environmental datasets at either the DA or CT level to create our final dataset.
Active living space exposure
Our active living exposure was drawn from the 2016 Canadian Active Living Environments (Can-ALE) database. This geographic-based set of measures is intended to capture the active living friendliness of Canadian communities. A Can-ALE score is calculated by counting the number of intersections, dwellings, points of interest, and public transit stops within a circular 1-km buffer around the DA centroid. In Can-ALE, the four measures are then transformed into z-scores, combined into a composite measure, and divided into quintiles representing the favourability of the active living environment within each DA, from 1 (very low) to 5 (very high). For example, areas with the least favourable active living environments have an average of 12 points of interest within a 1-km buffer, compared to 429 points of interest in areas with very high active living favourability.
Proximity to amenities measures
The proximity measures database released by Statistics Canada in April 2020 provides the proximity to 10 amenity types at the dissemination block level [25]. Dissemination blocks cover all of Canada, are equivalent to a city block bounded by intersecting streets, and are nested within dissemination areas. Two of the proximity measures are based on driving distance: proximity to employment within a 10 km buffer of the dissemination block centroid and proximity to healthcare within a 3 km buffer. The remaining 8 measures rely on walking distance: proximity to grocery stores, pharmacies, public transit, and neighbourhood parks within a 1 km buffer of the dissemination block centroid, and proximity to primary education, secondary education, childcare, and libraries within a 1.5 km walking buffer of the dissemination block centroid. Each proximity measure has been normalized on a 0 to 1 scale, where 0 indicates the lowest proximity and 1 the highest proximity in the data. We aggregated each proximity measure to the DA level by calculating the median value across the dissemination blocks in each DA. For analysis, we created quintiles of each proximity measure across the four cities so that a one-unit change corresponds to a one-quintile difference in proximity. We assumed that the relationship between quintile measures and physical activity outcomes was linear, such that effect sizes represent the average effect across all one-unit differences in a quintile measure.
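The aggregation to DA medians and the pooled quintile assignment can be illustrated with a short sketch. The input layouts and function names are hypothetical, and the tie-breaking rule is an assumption, since the text does not say how ties at quintile boundaries were handled.

```python
from collections import defaultdict
from statistics import median


def da_medians(block_values):
    """Median proximity value per DA.

    block_values: iterable of (da_id, proximity) pairs, one per dissemination
    block (hypothetical layout); proximity is on the 0-1 scale.
    """
    by_da = defaultdict(list)
    for da_id, value in block_values:
        by_da[da_id].append(value)
    return {da_id: median(values) for da_id, values in by_da.items()}


def assign_quintiles(da_values):
    """Rank DAs pooled across all cities and assign quintiles 1 (lowest) to 5.

    Ties are broken by DA identifier here, which is an assumption made only
    so that the example is deterministic.
    """
    ordered = sorted(da_values, key=lambda da_id: (da_values[da_id], da_id))
    n = len(ordered)
    return {da_id: (rank * 5) // n + 1 for rank, da_id in enumerate(ordered)}
```

Because the quintiles are computed on the pooled DAs from all four cities, a given quintile refers to the same range of proximity values in every city-stratified model.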
Urban compactness
Urban compactness, which can be thought of as the inverse of urban sprawl, was calculated using nine urban form indicators representing four dimensions: density, mixed use, street connectivity, and centering [26]. The index was developed at the CT level in Canada using Bayesian multivariate spatial factor analysis. It is similar to an index developed by Ewing et al. in the United States [13].
Gentrification
The GENUINE database provides Canadian gentrification measures calculated from census data at the CT level in all Canadian metropolitan areas [27]. The measures rely on different combinations of change in census measures related to income, housing, occupation, education, and age. We used the measure adapted from Ding et al. to classify areas that had experienced gentrification between 2006 and 2016. A census tract was ‘gentrifiable’, or eligible to gentrify, in 2006 if its median household income was below that of the respective metropolitan area. A gentrifiable census tract was classified as ‘gentrified’ by 2016 if: a) the median gross rent or median home value increased more than citywide increases and b) the proportion of college-educated residents increased more than citywide increases [28, 29]. We used a 3-level gentrification measure to represent areas that are ‘high socioeconomic status (SES) tracts, not eligible to gentrify,’ ‘low SES tracts, did not gentrify,’ and ‘gentrified tracts.’ We joined the CT gentrification measure to each CT's corresponding DAs.
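The classification rule described above can be written as a small decision function. This is a sketch of the rule as stated in the text, not the GENUINE database's own code; the argument names are hypothetical, and it assumes each tract-level change and its citywide counterpart are expressed in the same units.

```python
def classify_tract(
    median_income_2006,
    metro_median_income_2006,
    rent_change,
    home_value_change,
    college_share_change,
    metro_rent_change,
    metro_home_value_change,
    metro_college_share_change,
):
    """Return one of the three gentrification classes described in the text.

    All *_change arguments are 2006-to-2016 changes expressed in the same
    units for the tract and for its metropolitan area (assumed inputs).
    """
    # Eligibility: tract income below the metropolitan median in 2006.
    gentrifiable = median_income_2006 < metro_median_income_2006
    if not gentrifiable:
        return "high SES, not eligible to gentrify"

    # Condition (a): rent OR home value rose more than citywide.
    housing_up = (
        rent_change > metro_rent_change
        or home_value_change > metro_home_value_change
    )
    # Condition (b): college-educated share rose more than citywide.
    education_up = college_share_change > metro_college_share_change

    if housing_up and education_up:
        return "gentrified"
    return "low SES, did not gentrify"
```

Note that a tract must satisfy both conditions (a) and (b) to be classified as gentrified, while either rent or home value is sufficient for condition (a).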
Covariates
We included both demographic and weather covariates. Participant demographics were provided through survey data (age, gender, income, race). In regression models, we used four age groups (18–24, 25–44, 45–64, 65+ years), three gender groups (male, female, trans/non-binary/other), three income groups (annual household income < $50,000, $50,000–$99,999, $100,000+), and a broad race grouping (persons who identified as white or Caucasian, persons who identified as a visible minority or Indigenous). We included a day of the week indicator variable (weekend, weekday) for when the PA occurred. We also included a dichotomous variable indicating whether a dissemination area was the one in which the participant's home address was located. We define this as the Home DA in the analyses. Finally, weather variables, including the total amount of precipitation that day (mm) and the average temperature (in Celsius), were included.
Analyses
Analyses were conducted in R version 4.1.0, RStudio version 1.4.1106, and Stata SE 16. The GPS and accelerometer data were spatially joined to the dissemination area boundaries using the st_join function from the sf package in R. Joined GPS and accelerometer data were aggregated by individual ID, date, and dissemination area so that total minutes, minutes of physical activity, and minutes of MVPA were calculated for each DA, for each day, for each person. Following this, all exposure measures were joined to the GPS and accelerometer data using the unique identifier for each dissemination area within the study area.
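To show what the spatial join does conceptually, the sketch below assigns each point to the polygon that contains it using a ray-casting test. It is a simplified stand-in written in plain Python, not the sf implementation: it assumes planar coordinates and simple polygons without holes, and it does not handle points lying exactly on a boundary in any principled way.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test. polygon is a list of (x, y) vertices, not closed."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray going right from (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def join_points_to_areas(points, areas):
    """Attach the containing area ID (or None) to each point.

    points: list of (point_id, x, y); areas: dict of area_id -> polygon.
    Both layouts are hypothetical and used only for illustration.
    """
    joined = []
    for point_id, x, y in points:
        match = next(
            (
                area_id
                for area_id, polygon in areas.items()
                if point_in_polygon(x, y, polygon)
            ),
            None,
        )
        joined.append((point_id, match))
    return joined
```

Points that fall outside every polygon receive `None`, which corresponds to records with missing geographic location after the join.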
We computed descriptive statistics to characterize our study sample by city. We mapped time spent in light, moderate, and vigorous physical activity (PA) as the sum of minutes spent in each DA across all observation days and participants, and present the outcomes as quintiles to highlight spatial patterns in physical activity across cities [30].
We examined the associations between built environment and gentrification characteristics and physical activity outcomes in all four cities. To answer our research questions, we fit a series of multi-level negative binomial regression models linking the amount of time spent being physically active to built environment and gentrification characteristics. Models were stratified by city to assess whether environmental correlates of physical activity are consistent across places. Separate models were fit for our two outcomes, PA and MVPA. Each model adjusted for participant demographics (age, gender, income, race group) and other covariates (weekend, home dissemination area, precipitation, temperature). To account for correlation within repeated measures for individuals over time, we included a random intercept for person and a random slope for time (number of observation days). We present model coefficients as incidence rate ratios, which can be interpreted as the ratio of the expected time (in minutes) of either PA or MVPA at a particular level of a covariate to the expected time at the reference level.
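One way to write down a model of this kind is given below. The notation is ours and several details are assumptions not stated in the text: a log link, a single dispersion parameter, and a bivariate normal distribution for the random effects. Here y_{ijd} denotes minutes of PA or MVPA for participant i in DA j on day d, x_{ijd} collects the exposure and covariates, and t_{id} is the observation day number.

```latex
\begin{aligned}
y_{ijd} &\sim \mathrm{NegBin}(\mu_{ijd}, \theta), \\
\log \mu_{ijd} &= \beta_0 + \mathbf{x}_{ijd}^{\top}\boldsymbol{\beta} + u_{0i} + u_{1i}\, t_{id}, \\
(u_{0i}, u_{1i})^{\top} &\sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}), \\
\mathrm{IRR}_k &= \exp(\beta_k).
\end{aligned}
```

Under this specification, an incidence rate ratio of 1.10 for a quintile exposure would correspond to 10% more expected minutes per one-quintile increase, holding the other covariates and the random effects fixed.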
To examine the robustness of our study findings, we repeated the series of PA and MVPA models for DAs where participants spent at least 5 minutes per day. Doing so allowed us to distinguish DAs that participants were only traveling through (spending less than 5 minutes) from DAs where they spent more substantial amounts of time being physically active. All analysis code is available online [30].
The initial dataset included 2,493,887 minutes of matched accelerometer and GPS data for 544 participants. Once the data were aggregated at the person, day, and DA level, the complete dataset included 177,104 observations. The observations represented the number of minutes each participant spent per day in each DA they entered. We removed 2,157 records where geographic location or date was missing. In addition, we removed 861 observations associated with 4 participants because of device error.