Land use mix and physical activity in older adults: a longitudinal study examining changes in land use mix in two Dutch cohorts

Data from 1,114 respondents from the Longitudinal Aging Study Amsterdam (LASA) and 1,561 respondents from the Health and Living Conditions of the Population of Eindhoven and Surroundings (GLOBE) study were linked to land use mix (LUM) in 1000-meter sausage network buffers at three time-points. Cycling/walking outcomes were harmonized to include average minutes spent cycling/walking per week. Data were pooled and limited to respondents that did not relocate between follow-up waves. Associations between LUM and cycling/walking were estimated using a Random Effects Within-Between (REWB) model that allows for the estimation of both within and between effects. Sensitivity analyses were performed on smaller (500-meter) and larger (1600-meter) buffers.

The present study found evidence of between-individual associations of land use mix in the residential environment and the average walking time per week, as well as some evidence of negative within-associations between land use mix and the average cycling/walking time in respondents that did not move to a different residential address during follow-up. These findings advocate the use of research methods that combine both between- and within-individual analyses in order to gain more understanding of how land use mix in the residential environment can relate to cycling/walking. More longitudinal research is needed to explore how changes in land use mix over time can influence cycling and walking outcomes.


DATA AVAILABILITY STATEMENT
The datasets generated for the MINDMAP project are not publicly available due to study participant privacy considerations. However, data access can be requested from the individual cohort studies.

INTRODUCTION
Physical inactivity has been identified as the fourth leading risk factor for global mortality [1] and increasing physical activity (PA) has been marked as a top-priority intervention to reduce death rates of non-communicable diseases [2]. Multiple studies have shown positive associations between PA and measures of urban form, such as urban green spaces, public open spaces, residential density, and land use mix [3][4][5]. Changes in the built environment, such as increased investment in green spaces and pedestrian and cycling infrastructure, as well as transforming cities towards more compact, mixed-use environments, can potentially aid in promoting PA [4,6]. Furthermore, extensive modification of the built environment for health-related purposes could gain more traction in the coming years as a co-benefit of structural urban changes, such as climate control efforts.

One commonly studied physical-environmental exposure with regard to PA is that of land use mix (LUM). Land use mix represents how evenly different types of land use are distributed within a specified area [7]. Mixed-use areas contain a variety of different land uses and are believed to encourage PA because they include a larger number of destinations [8][9]. However, much of the evidence linking varying land uses to PA is cross-sectional, which makes it difficult to establish a causal relationship. Many studies adjust for confounding factors, but it remains unclear which factors should be included. Furthermore, selection bias remains an issue as individuals may choose to live in areas based on lifestyle preferences and socioeconomic factors [10].
A physically active person may deliberately choose to live in a PA-friendly area, inflating the possible relation between LUM and PA.

Various methods have been applied to account for these methodological shortcomings, such as adjustments for proxy indicators of preferences, as well as applying fixed effects (FE) models that control for time-invariant characteristics, assuming that they remain stable over time. While the FE model provides a valuable tool for assessing the effects of temporal changes, it disregards between-individual variability. As the method relies solely on within-individual changes, it might not be the best fit for LUM measures, as it is debatable how much LUM in an urban context changes over time. The primary alternative, the random effects (RE) model, makes use of between-individual variability, but in turn does not remove the effects of time-invariant causes, and assumes that the unmeasured causes are uncorrelated with measured causes. The latter is often a difficult assumption to make and, if violated, will result in omitted-variable bias [11]. Methods exist that combine elements of both RE and FE models and take "the best of both worlds." These models go by different names, such as random effects within-between (REWB) models, Mundlak models, or simply hybrid models, and make use of centering of all individual units around their means [12-13]. Such models can be of great value for research considering the impact of LUM on PA as they not only explore the differences between individuals, but also how a change in LUM might influence a change in PA. However, these models have only scarcely been applied within public health.

Further complicating the evidence in the field of environment-PA research is a lack of consistency in both the geographic units and scale used to define the individual's residential environment [14-15].
To quantify environmental exposures, researchers traditionally relied on neighborhood-level data, such as pre-existing administrative units. A more refined method that is especially relevant for PA comes with the use of network buffers that define buffers as areas accessible via a street network. The "sausage" or "line-based" buffering method selects roads within a certain distance of the individual and creates a buffer around these roads by a set distance (e.g. 25 meters). This ensures that only those features that are directly accessible from the street network are selected. This method has several key advantages as it is based directly on the road network where people travel [15][16]. Sausage buffers therefore offer an attractive alternative to more traditional Euclidean buffers, especially when PA is concerned, as these buffers represent areas that are actually accessible via the road network.

Our study uses sausage buffers to define LUM within the individual's residential environment and links these data to cycling and walking outcomes. We linked data from two Dutch cohorts with 10 years of follow-up to a harmonized land use dataset, and explored both within-person and between-person associations of LUM on cycling/walking.

[...] Infrastructure (PDOK) [21]. To maintain respondent privacy, addresses were extracted and geocoded using a process previously described [17,22]. Respondents whose addresses could not be geocoded, who did not participate in all three data collection waves, or who moved outside of the study area for the respective cohorts were excluded. The sample was limited to respondents that did not relocate during follow-up waves, resulting in a final sample of 1,561 respondents for GLOBE and 1,114 respondents for LASA. Sensitivity analyses were performed on the total sample including respondents that moved between follow-up waves.
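The sausage buffering idea described above can be illustrated with a small geometric sketch. It assumes the road segments within the chosen network distance of the home address have already been selected (that network search is not shown), and only tests whether a feature lies within 25 meters of any of those segments. The function names and the coordinate layout (projected coordinates in meters) are hypothetical and are not taken from the study's GIS workflow.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the line segment a-b (all (x, y) in meters)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def in_sausage_buffer(point, segments, radius_m=25.0):
    """True if point lies within radius_m of any pre-selected road segment."""
    return any(point_segment_distance(point, a, b) <= radius_m for a, b in segments)


# Hypothetical example: one straight 100-meter road segment.
roads = [((0.0, 0.0), (100.0, 0.0))]
near_feature = in_sausage_buffer((50.0, 20.0), roads)   # 20 m from the road
far_feature = in_sausage_buffer((50.0, 30.0), roads)    # 30 m from the road
```

In practice a GIS package would build the buffer polygon and intersect it with the land use layer; the distance test here only shows which locations such a polygon would cover.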
The use of personal data in the [...]

The harmonization of the BBG data ensures that observed changes are representative of actual changes in the environment and not related to changes in GIS processing or methodology. The total land use data was grouped into 11 land use categories (supplementary file 2, table 13).

[...] database with all publicly available roads in the Netherlands with either a street name or a road number. Roads that are not available to pedestrians and cyclists, such as highways, were excluded to provide an accurate estimation of reachable destinations. Sausage buffers were created using line buffers with a radius of 25 meters [16,26]. Land use mix was calculated for all buffer sizes using the following entropy formula:

$$\mathrm{LUM} = -\frac{\sum_{i=1}^{n} p_i \ln(p_i)}{\ln(n)}$$

where $\mathrm{LUM}$ is an entropy score with a value between 0 and 1, $p_i$ the proportion of the total buffer area covered by land use class $i$, and $n$ the total number of land use classes. The calculated entropy value represents a measure of heterogeneity, whereby 1 represents a perfect mix of land use classes and 0 no mix of classes [27]. $n$ was set to 11 land use classes to avoid measurement bias and to improve comparability of the changes in LUM over time [28]. The LUM entropy score was scaled in the analyses to represent a 10% change in LUM to improve interpretation. Cohort data from each wave were linked to both NWB and BBG data from a preceding year, in keeping with an appropriate chronology of exposure preceding outcome. LUM exposure data were calculated for all respondents in the final sample.
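As a minimal sketch of the entropy calculation described above, the following function computes a normalized entropy score from the area of each land use class within one buffer. The function name, the input format (a list of areas per class), and the example values are assumptions made for illustration; only the fixed number of 11 classes comes from the text.

```python
import math


def land_use_mix(areas, n_classes=11):
    """Normalized entropy of land use areas within a buffer (0 = no mix, 1 = perfect mix).

    areas: area per land use class (any consistent unit); absent classes may be
    given as 0 or omitted. n_classes is fixed at 11, as described in the text.
    """
    if len(areas) > n_classes:
        raise ValueError("more areas supplied than land use classes")
    total = sum(areas)
    if total <= 0:
        raise ValueError("total buffer area must be positive")
    entropy = 0.0
    for area in areas:
        if area > 0:  # classes with zero area contribute nothing to the entropy
            p = area / total
            entropy -= p * math.log(p)
    return entropy / math.log(n_classes)


# Hypothetical buffers: a single land use, and an even split over all 11 classes.
single_use = land_use_mix([5000.0])
even_mix = land_use_mix([100.0] * 11)
```

Fixing the denominator at ln(11) rather than at the number of classes present in a given buffer keeps scores comparable across buffers and across time-points.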

OUTCOME MEASURES OF WALKING AND CYCLING
Walking and cycling outcomes were assessed using self-reported time spent walking and cycling and defined as average minutes spent walking and cycling per week. GLOBE uses the Short Questionnaire to Assess Health enhancing physical activity (SQUASH) tool, which was created by the Dutch National Institute of Public Health and the Environment to measure habitual physical activity levels in an adult population [29]. In accordance with the SQUASH guidelines, it was assumed that participants who filled in hours or minutes per week, but omitted 'days per week,' had been active for at least one day. If the number of days was provided without a corresponding time frequency, the median minutes per day of all respondents was substituted. LASA uses the LASA Physical Activity Questionnaire (LAPAQ), which asks respondents how often and for how long they engaged in various activities, including walking and cycling, in the last two weeks. LAPAQ has been validated against 7-day physical activity diaries and 7-day pedometer counts in a subsample [...]

Education was considered to be time-invariant because of the relatively old age of the cohorts. Marital status (married/partnership, not married, divorced, widowed), household income (monthly; <€1200, €1200-1800, €1800-2600, >€2600), and employment status (employed, non-employed) were included as relevant time-varying confounders. All time-varying covariates for both studies were measured at all three time points, capturing changes that occurred during follow-up. Missing data on covariates were handled via multiple imputation using the covariates listed above as well as self-rated health (excellent, very good, good, fair, poor), smoking (yes, no), and BMI. Only the covariates education, income, and employment (GLOBE), and income and employment (LASA) had missing values, ranging from 2% to 11% for GLOBE and 5% to 12% for LASA.
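The two substitution rules described above for incomplete SQUASH answers can be sketched as follows. This is an illustrative reading of the text rather than the official SQUASH scoring procedure: the function name, the argument names, and the choice to represent each answer as days per week and minutes per day are all assumptions.

```python
def weekly_minutes(days_per_week, minutes_per_day, median_minutes_per_day):
    """Average minutes per week for one activity, applying the two substitution rules.

    days_per_week / minutes_per_day: reported values, or None when left blank.
    median_minutes_per_day: median over all respondents (hypothetical input).
    """
    if days_per_week is None and minutes_per_day is None:
        return None  # nothing reported; left missing in this sketch
    if days_per_week is None:
        # Time reported but days omitted: assume at least one active day.
        days_per_week = 1
    if minutes_per_day is None:
        # Days reported but time omitted: substitute the median minutes per day.
        minutes_per_day = median_minutes_per_day
    return days_per_week * minutes_per_day


# Hypothetical respondents, with a sample median of 20 minutes per day.
complete = weekly_minutes(2, 45, 20)          # both values reported
days_missing = weekly_minutes(None, 30, 20)   # days omitted
time_missing = weekly_minutes(3, None, 20)    # time omitted
```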

STATISTICAL ANALYSES
The imputed data of both cohorts were pooled and limited to respondents with three measurements on the outcomes. Pooling the data enabled us to observe more changes in the environment as well as increasing variation in environmental exposure, therefore strengthening both the between- and within-analyses. The analysis was restricted to non-movers to limit selection effects. Sensitivity analyses were performed on data from the separate cohorts as well as on the total sample including those who had moved between data collection waves.

We constructed a random effects within-between (REWB) model to conduct the analyses [11, 13]. This model decomposes the time-varying LUM variable into deviations from the individual-specific means (within-individual estimates) and individual-specific means (between-individual estimates). The estimated between-individual regression coefficient represents how the exposure across all participant-observations is related to the outcome, and the within-individual coefficient represents how variation in exposure around the individual's mean level is related to the outcomes. In addition, the model can include both time-varying and time-invariant covariates. A random intercept is added to account for the dependence of multiple measurements for each participant. The following model was used for the analyses:

Within-individual changes in LUM were observed for approximately 44% of all person-observations (Table 2). The observed changes consisted of both decreases and increases in the LUM, which corresponded to an average 5% decrease and an average 3% increase. Within-individual changes were also observed for both outcomes, with approximately 18% (cycling) and 14% (walking) reporting no change in the average amount of minutes spent walking/cycling per week.
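For orientation, a generic REWB specification consistent with the description in the statistical analyses can be written as below. This is the general form found in the REWB literature, not necessarily the exact equation used in this study, and all symbols are assumptions: $y_{it}$ is the outcome for individual $i$ at wave $t$, $x_{it}$ the LUM exposure, $\bar{x}_i$ the individual-specific mean, $z_{it}$ the covariates, $u_i$ the random intercept, and $\varepsilon_{it}$ the residual.

$$y_{it} = \beta_0 + \beta_W (x_{it} - \bar{x}_i) + \beta_B \bar{x}_i + \gamma^\top z_{it} + u_i + \varepsilon_{it}$$

The decomposition step itself can be sketched in a few lines. The function name and record layout are hypothetical, and fitting the mixed model is left to a statistics package.

```python
from collections import defaultdict


def within_between(records):
    """Split a time-varying exposure into between and within components.

    records: list of (person_id, wave, x) tuples.
    Returns one dict per record with the person mean ("between") and the
    deviation of x from that mean ("within").
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for person_id, _wave, x in records:
        sums[person_id] += x
        counts[person_id] += 1
    means = {pid: sums[pid] / counts[pid] for pid in sums}
    return [
        {"person": pid, "wave": wave, "between": means[pid], "within": x - means[pid]}
        for pid, wave, x in records
    ]


# Hypothetical LUM scores: person A's environment changes, person B's does not.
data = [
    ("A", 1, 0.4), ("A", 2, 0.5), ("A", 3, 0.6),
    ("B", 1, 0.3), ("B", 2, 0.3), ("B", 3, 0.3),
]
decomposed = within_between(data)
```

The "within" column is what a fixed effects model would use, while the "between" column carries the stable differences between individuals; entering both as separate predictors yields the two coefficients discussed in the text.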

In the present study, we found evidence of between-individual associations of land use mix in 1000-meter buffers and the average amount of walking per week. We did not find evidence of within-associations between LUM in 1000-meter buffers and walking, nor did we find evidence of within- or between-associations between LUM in 1000-meter buffers and cycling. We did find evidence of a negative within-effect on cycling in larger 1600-meter buffers, and evidence of a positive between- and negative within-effect on walking in 500-meter buffers.

The 1000-meter network buffer is a commonly used exposure measure in PA research as it is believed to be a reasonable distance that people can walk [8]. The associations that we found for this buffer are in line with other studies on this subject. For example, a recent study using the GLOBE data found no evidence of within-associations of green spaces in 1000-meter buffers on cycling and walking outcomes [33]. Our study also found no evidence of within-associations between a change in LUM in the residential environment and cycling/walking. A study conducted in Brisbane, Australia, found that estimates from random effects models indicated positive associations between a 10% increase in LUM and any walking for transport, which is in line with the between-associations that we observed for walking [8]. This Australian study also found positive, if less pronounced, within-individual associations. While our study did not observe within-associations for our main exposure buffers, we did observe within-associations for the smaller 500-meter buffers, but these were the inverse of the between-associations.

Little consensus exists about what buffer sizes to use when analyzing how LUM and cycling/walking relate, with other studies reporting both smaller and larger buffers [34].
As both the GLOBE and LASA cohorts include a large proportion of older adults, we included a smaller buffer of 500 meters in our sensitivity analyses to test whether LUM in this smaller buffer was associated with walking. We also included a larger 1600-meter (approximately 1 mile) buffer in our analyses specifically for the cycling outcome. The 1600-meter buffer is another commonly used buffer and can be especially relevant for cycling as larger distances can be covered compared to walking. The results for the larger and smaller buffer sizes were contrary to what we expected based on the existing literature. For example, a study conducted in Perth, Australia, found that an increase in access to destinations in the residential environment was associated with taking up cycling, providing evidence that changes in the built environment may support the uptake of cycling among formerly non-cycling adults [35]. Our study did not find evidence that a change in LUM in the residential environment is associated with time spent cycling in our main exposure buffers of 1000 meters, and found some evidence of negative associations between LUM and cycling in larger 1600-meter buffers (supplementary file 1, table 5). Explanations for these results may be found in cultural differences between cycling in The Netherlands and Australia, but also in the definition of the exposure and the mechanisms between LUM and cycling outcomes. Whereas the study in Perth included respondents that moved to a new residential neighborhood, our study only included respondents that did not relocate during follow-up. The within-changes are therefore indicative of changes in the residential environment and not the result of moving to a different residential environment. Different mechanisms may therefore be at play when compared to the effect that moving to a different neighborhood can have.
As our study provides mixed results, more research is needed that explores how changes in the residential environment relate to cycling/walking. This is not only an important question from a scientific point of view, but also from a policy perspective as it provides policy makers with more insights into how a change in the environment might relate to a change in cycling/walking.

These findings have several implications for research on the effects of LUM on cycling/walking outcomes. Firstly, this study provides evidence that associations between environmental exposures and health outcomes can vary greatly based on the size and type of the buffers used ("crow-fly" Euclidean buffers or network buffers). This is not a new phenomenon and has been described extensively in the health and environment literature [15,36]. A study comparing different buffer types for PA research concluded that the sausage buffer method remains the most defensible method for creating network buffers as it increases both comparability and repeatability [15]. By including multiple individual-specific network buffers and by excluding roads that are not accessible to pedestrians and cyclists, we aimed to provide an accurate exposure measure that accounts for these issues as much as possible. Secondly, the between-individual and within-individual effects of LUM on cycling/walking appear to be substantially different. Our study found robust positive between-associations of LUM and walking, but unexpected negative within-associations for our 500-meter buffers. These results therefore strongly advocate the use of both between- and within-individual analyses when the effect of (built-)environmental exposures on cycling/walking outcomes is considered. More longitudinal research on this topic is therefore urgently needed; a call that has been echoed by other authors in the field in recent years [37].

STRENGTHS & LIMITATIONS
The present study adds to the literature on how the residential environment relates to cycling and walking by using data from two Dutch cohorts with 10 years of follow-up and linking these data to harmonized LUM exposures. It fills an important methodological gap by exploring both between-individual and within-individual effects of LUM on cycling/walking. By applying the REWB framework to longitudinal data of respondents that did not relocate during follow-up, we gain more insight into how different levels of LUM affect cycling/walking and how a change in LUM can potentially change the average cycling and walking time. The REWB model retains the advantages of the standard FE model, but also incorporates between-individual variation, while allowing control for measured time-invariant confounders. By retaining the virtues of the standard FE approach, it helps to infer potential causal relationships between changes in LUM and cycling/walking that have more potential for evidence-based action [13]. It also helps to answer a relevant (policy) question: is a change in LUM in the residential environment associated with a change in cycling/walking? As most of the research on LUM and cycling/walking is cross-sectional, answering this question can broaden the understanding of potential causal pathways between LUM and PA.

The use of sausage network buffers offers numerous improvements over traditional buffering methods. By excluding roads that are not accessible to pedestrians and cyclists, we ensured that the resulting network buffers were representative of the areas that can be reached while cycling or walking. This has the limitation that specific land use destinations that can easily be accessed by cars, but less easily by bike or on foot, are excluded.
However, we estimate that the impact of this methodological choice is limited as our study was conducted in urban areas with a high density of roads accessible to cyclists and pedestrians and the buffer areas were limited to the residential environment. Network buffers offer improvements in this regard compared to more traditional Euclidean or "crow-fly" buffers that do not consider if the street network allows or prevents access to specific locations. The sausage buffering technique also offers improvements in the repeatability and consistency of network buffer measures compared to other methods, such as ESRI's ArcGIS Network Analyst. The sausage buffer method results in a representative area for area-based measures regardless of street network connectivity, and ensures that only those features that are accessible from the road network are included. By applying the buffers to a harmonized land use dataset, we ensured that changes observed in the data are representative of actual changes in the environment and not the result of changes in data processing or GIS methodology.

Finally, the present study also adds to the existing literature by considering the effects of changes in LUM on cycling/walking in a Dutch socio-spatial context where cycling is a big part of everyday life, and for cities that are already very compact compared to those in other countries such as Australia or the United States. Evidence from such countries suggests that a move towards more compact cities with a mixed-use environment can have a positive effect on cycling and walking, but there is little evidence from cities that are already very compact and dense such as the ones in this study [9]. By pooling data from two Dutch cohorts, we were able to both increase variation in environmental exposures and increase the statistical power of our analyses.

Our study also has some limitations to consider.
First, while individual-level network buffers offer great improvements in measuring exposure compared to more traditional neighborhood-based measures, we were not able to control for other urban-environmental and social-urban factors, such as residential density, safety, or neighborhood socio-economic status. A study conducted in Amsterdam, The Netherlands, found evidence that neighborhood safety was associated with cycling [38]. As we used individual-specific network buffers, we were not able to control for such effects in our analyses. Secondly, we were also not able to control for time spent away from the residential environment. However, it has been theorized that older adults may be particularly susceptible to environmental factors in the residential environment as they are likely to spend more time closer to home than younger adults [39]. Finally, in order to pool the data from both cohorts, variables had to be retrospectively harmonized, which means that study variables are harmonized after they have been collected. While retrospective harmonization is a good way to make comparisons between cohorts possible, it does inherently come with the limitation that some detail is lost in the process. For example, income classes in both cohorts did not match well and therefore had to be generalized in order to be comparable. Harmonization choices like these inevitably lead to a loss in sensitivity and specificity of the data. More prospective harmonization would alleviate these limitations and therefore make better comparisons between cohorts possible.

CONCLUSIONS
The present study found evidence of between-individual associations of land use mix in the residential environment and the average walking time per week, as well as some evidence of negative within-associations between land use mix and the average cycling/walking time in respondents that did not move to a different residential address during follow-up. These findings advocate the use of research methods that combine both between- and within-individual analyses in order to gain more understanding of how land use mix in the residential environment can relate to cycling/walking. More longitudinal research is needed to explore how changes in land use mix over time can influence cycling and walking outcomes.