The aim of this study was to explore how variations in the categories of land uses included in entropy calculations of LUM measures in the WI can impact on the observed associations with total, transport and recreational walking. This study is unique in that it allowed a comparison of different LUM computations within the same data set. Until now comparisons have only been possible between different LUM measures used in different studies in different contexts and this limits the comparability of findings.
Irrespective of the LUM measure used, our results show that residents living in highly walkable neighbourhoods do more walking than those in low walkable environments and that WIs are more strongly related to walking for transport than to recreational walking. Depending upon which LUM was incorporated into the WI, residents living in highly walkable neighbourhoods were up to twice as likely to walk for transport as residents in low walkable neighbourhoods. While these findings agree with the work of others, our results show that the associations varied by type of walking, and by the amount of walking (e.g., > 0, ≥ 60 or ≥ 150 mins/week). Owen et al. also reported differences by type of walking, with significant associations with walking for transport but no association with recreational walking. Our findings show that reporting more than an hour per week of transport walking had the strongest and most significant association with a WI that included 'Residential', 'Retail', 'Office', 'Health, welfare and community', and 'Entertainment, culture and recreation', while doing any recreational walking was more strongly associated with a WI that also included 'Public open space', 'Sporting infrastructure' and 'Primary and rural' land uses. There was no association with higher levels of walking (≥ 150 mins/week); however, the prevalence of respondents achieving this level of neighbourhood walking was low and this may have reduced the power to detect significant associations. The variations observed lend further support to the idea that context-specific measures of the built environment (e.g., a recreational walking specific WI) would be more sensitive in detecting associations with different types of walking behaviour [25, 31, 43, 44].
Importantly, this study provides evidence that varying the combination of land uses in the LUM calculation impacts the strength of relationships with different types (and amounts) of walking behaviour. The strongest association between the WI and any transport walking was found in the land use mix computation that included 'Residential', 'Retail', 'Office', 'Health, welfare and community' and 'Entertainment, culture and recreation' land use classifications (Model 2). This LUM is most similar to the later computations of WIs used by Frank and colleagues. In contrast, Model 3, which included Model 2 land classifications plus 'Public open space', 'Sporting infrastructure' and 'Primary and rural', better captured recreational walking. The construction of Model 3 was based upon the work of Forsyth et al. Notably, when 'Public open space', 'Sporting infrastructure' and 'Primary and rural' were included in the LUM measure (Model 3), the association between the land use z-score and transport walking was eliminated, confirming that a land use class that includes public open space is not relevant for transport walking. Rather, a LUM that incorporates transport-related destinations only (i.e., Model 2) appears to be superior for capturing an association between walkability and transport walking. Similarly, Duncan et al. reported that the relationship between Census Collector District-level LUM and walking for transport is stronger when using LUM measures that include only theoretically relevant land uses. These results support our hypothesis that different computations of land use mix are relevant for different types of walking. Furthermore, the results are promising in that they provide evidence to suggest that manipulation of land uses included in the LUM measure can result in improved associations with recreational walking. This work provides an important first step towards developing a WI that better captures recreational walking, although further work is required.
While the aim of this study was to manipulate land use classes to best capture walking behaviours, there are inherent problems with the base data and the calculation used to determine LUM (i.e., entropy formulas) which present significant barriers to the development of behaviour-specific LUM measures. It has already been highlighted that to fully understand the results from these kinds of analyses, a detailed knowledge of the base data is important, particularly the data from which the land use classes are derived. As in other studies of this kind, the land classification system used in RESIDE was designed for planning purposes and commercial employment patterns, not public health research. Various data processing steps are therefore required to create land use measures and these steps are often restricted by the original base data structure and coding. Moreover, data processing can be undertaken in different ways which may not be clearly reported when published. Often the preferred specificity and groupings of land uses are not available or possible from the base data and this could impact on the relationships detected. At worst, the use of broad groupings of land use may obscure associations between the environment and behaviour of interest. These limitations have been observed previously in ecological studies of plant and animal distributions.
Another problem with the base data used to compute the LUM variable is the allocation of a single use to a land area when, in some instances, a multi-use classification may be more appropriate. For example, a large city park with a small kiosk on site would be classified as a large 'Retail' area based on the single-use hierarchy of land use classifications in Table 1. Not only does this classification fail to represent the reality on the ground (i.e., presence of both green space and a retail outlet) but it would likely alter the observed associations between the neighbourhood attributes and specific walking behaviours. A more appropriate classification for this land parcel would have been both 'Public open space' and 'Retail'.
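The park-and-kiosk example can be illustrated with a minimal sketch. The priority order, parcel footprints and function names below are hypothetical and chosen only so that 'Retail' outranks 'Public open space', as in the example above; the hierarchy actually used in the study is the one given in Table 1.

```python
# Hypothetical priority order for illustration only (not the study's Table 1).
HIERARCHY = ["Retail", "Office", "Public open space", "Residential"]

def single_use_class(footprints):
    """Return the one class a single-use hierarchy would assign to a parcel."""
    for land_use in HIERARCHY:
        if footprints.get(land_use, 0.0) > 0:
            return land_use
    return "Unclassified"

def area_by_class(parcels, multi_use=False):
    """Sum area per land use class across parcels.

    Single-use: the whole parcel area goes to its highest-priority class.
    Multi-use: each use keeps its own footprint within the parcel.
    """
    totals = {}
    for footprints in parcels:
        if multi_use:
            for land_use, area in footprints.items():
                totals[land_use] = totals.get(land_use, 0.0) + area
        else:
            land_use = single_use_class(footprints)
            totals[land_use] = totals.get(land_use, 0.0) + sum(footprints.values())
    return totals

# A hypothetical 10 ha city park containing a 0.1 ha kiosk.
park = {"Public open space": 9.9, "Retail": 0.1}

single = area_by_class([park])
multi = area_by_class([park], multi_use=True)
print(single)
print(multi)
```

Under the single-use rule the entire park is counted as retail land, whereas the multi-use allocation retains almost all of it as public open space, which is the distinction the paragraph above describes.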
A further limitation associated with base data is incomplete data coverage. Land uses may be omitted from the spatial classification system due to insufficient data. For example, in RESIDE it was likely that areas identified as 'Unclassified' included attractive vegetation and/or natural amenities such as waterways (streams) conducive to recreational walking. Thus, exclusion of unclassified land in the base data set may have attenuated associations with recreational walking. We therefore tested models with (Model 4) and without (Model 5) the 'Unclassified' land use to explore its potential contribution, but found that when it was added there was no association with recreational walking. We suggest that the 'Unclassified' category may include land uses that are both positively (vegetation) and negatively (derelict land) associated with walking. Future studies may therefore wish to explore this further, but in the interim it appears justified within the West Australian context to exclude this land classification and remove the 'noise' associated with potential measurement error.
Another underlying issue potentially affecting the observed relationships is the calculation of LUM itself, specifically the limitations associated with the entropy formula. As highlighted by Brown and colleagues in a study exploring patterns of obesity, entropy scores of LUM have a number of limitations and these include: not capturing the presence of a wide range of land uses (usually a maximum of only six land use classes is included); each land use class is treated as equal when the relationship between different land uses may be relative to one another; not capturing differences in the aesthetic appeal of land uses; and, as noted above, unclassified land is simply ignored. Furthermore, entropy scores give a relative score of land use (range 0-1) and do not reflect the absolute size of area. Despite these limitations, the RESIDE study had access to a reasonably well organised and accessible source of land use data. Unlike studies that report a lack of coordination in the collection of land use information, information from the Valuer General's Office of Western Australia and the public land vesting information [37, 40] provide a strong data infrastructure that can be manipulated to support public health research. However, the methodological issues noted here highlight that comparisons between studies may be problematic and caution is required when undertaking within- and between-country comparisons of the association between neighbourhood walkability and physical activity.
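To make these properties concrete, the following is a minimal sketch of an entropy-based LUM score. It assumes the commonly used normalised form, -Σ p_k ln(p_k) / ln(N), where p_k is the share of included area in class k and N is the number of classes included; the exact formula used in RESIDE is not restated in this section, and the function name and all areas are hypothetical. The class lists follow the Model 2 and Model 3 groupings described above.

```python
import math

def entropy_lum(areas, classes):
    """Entropy land use mix over `classes`: 0 = a single use, 1 = a perfectly even mix."""
    if len(classes) < 2:
        raise ValueError("entropy needs at least two land use classes")
    included = [areas.get(c, 0.0) for c in classes]  # classes not listed are ignored
    total = sum(included)
    if total == 0:
        return 0.0
    shares = [a / total for a in included if a > 0]  # treat 0 * ln(0) as 0
    return -sum(p * math.log(p) for p in shares) / math.log(len(classes))

MODEL_2 = [
    "Residential", "Retail", "Office",
    "Health, welfare and community",
    "Entertainment, culture and recreation",
]
MODEL_3 = MODEL_2 + ["Public open space", "Sporting infrastructure", "Primary and rural"]

# Hypothetical areas (hectares) within one participant's service area.
areas = {
    "Residential": 60.0, "Retail": 8.0, "Office": 4.0,
    "Health, welfare and community": 3.0,
    "Entertainment, culture and recreation": 2.0,
    "Public open space": 25.0, "Sporting infrastructure": 5.0,
    "Primary and rural": 0.0, "Unclassified": 40.0,
}

lum_model_2 = entropy_lum(areas, MODEL_2)
lum_model_3 = entropy_lum(areas, MODEL_3)

# Doubling every area leaves the score unchanged: entropy is a relative measure.
doubled = {name: 2 * area for name, area in areas.items()}
lum_doubled = entropy_lum(doubled, MODEL_2)

print(round(lum_model_2, 3), round(lum_model_3, 3), round(lum_doubled, 3))
```

In this sketch the same neighbourhood receives different scores under the two class lists, the 'Unclassified' area never enters the calculation, and scaling all areas has no effect on the result.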
It is evident that the prediction of different amounts and types of walking behaviours may depend on the types and combinations of land use classes included in the LUM component of a WI. A simple measure of the total area of 'walkable' land uses (e.g., public open space, retail, residential) may provide a better measure of LUM than an entropy score. For example, Brown and colleagues reported that for body mass index the presence of walkable land uses was more important than the equal mix of walkable land uses calculated from entropy scores. Furthermore, the presence or density of specific destinations is relatively easy to compute and is viewed as an acceptable substitute for LUM measures [47–49]. Nevertheless, it can be difficult to generate a concise and current listing of destinations in a study area, and considerable variation in data quality exists between commercially available sources and researcher-conducted field audit data. Until this issue can be resolved, the use of destination data in WIs may be limited.
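The contrast between an entropy score and a simple total-area measure can be sketched as follows. The two neighbourhoods, their areas and the helper names are hypothetical, and the entropy helper assumes the commonly used normalised form rather than any formula confirmed by this study.

```python
import math

WALKABLE = ["Residential", "Retail", "Public open space"]

def entropy_score(areas, classes=WALKABLE):
    """Normalised entropy of the area shares across `classes` (0 to 1)."""
    included = [areas.get(c, 0.0) for c in classes]
    total = sum(included)
    if total == 0:
        return 0.0
    shares = [a / total for a in included if a > 0]
    return -sum(p * math.log(p) for p in shares) / math.log(len(classes))

def walkable_area(areas, classes=WALKABLE):
    """Total area (hectares) of the land uses treated as walkable."""
    return sum(areas.get(c, 0.0) for c in classes)

# Hypothetical neighbourhoods: a small, evenly mixed one and a large, uneven one.
compact = {"Residential": 10.0, "Retail": 10.0, "Public open space": 10.0}
large = {"Residential": 120.0, "Retail": 20.0, "Public open space": 60.0}

for name, hood in [("compact", compact), ("large", large)]:
    print(name, round(entropy_score(hood), 3), walkable_area(hood))
```

Here the entropy score ranks the small, evenly mixed neighbourhood highest, while the total-area measure ranks the larger neighbourhood highest, so the two measures need not order neighbourhoods in the same way.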
It is also possible that other attributes of urban design over and above LUM may improve the explanatory value of WIs. For example, the presence or absence of sidewalks, the amount of natural vegetation (greenness index), road traffic volume, as well as the aesthetic quality of the neighbourhood could be included in an expanded WI. This may result in stronger associations with walking behaviours, which may vary across the life course from children through to older adults. Others have noted that the omission of aesthetic measures from WIs may contribute to their failure to predict variation in patterns of recreational walking [23, 31]. Future RESIDE analyses will investigate ways to create neighbourhood walkability measures which have a stronger relationship with recreational walking. As a longitudinal cohort study, RESIDE is also uniquely placed to explore associations over time to determine if changes in neighbourhood walkability cause people to do more or less transport and recreational walking.
Finally, it is possible that the association between different types of walking and LUM and other design characteristics could vary by scale [24, 51, 52]. RESIDE used a 1600 m service area to define a person's neighbourhood because, theoretically, it represents how far a participant could walk from their house at a 'moderate' intensity pace within 15 minutes, half the recommended daily physical activity for an adult. Future research should explore variations in LUM computations at different scales and consider the use of Global Positioning System (GPS) units to examine variation in both the size and shape of participants' neighbourhoods and the effect this has on the association between LUM and walking behaviour. Another area of future research could involve examining thresholds for the components of WIs in different areas. At this stage, cut points for WI quartiles are sample-specific. Pooling data from different areas would allow common cut points to be established and so enable comparisons between studies.
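The consequence of sample-specific cut points can be illustrated with a minimal sketch. The WI scores for the two study areas are hypothetical, and the quartile method (Python's default 'exclusive' method in `statistics.quantiles`) is an assumption rather than the procedure used in RESIDE.

```python
from bisect import bisect_right
from statistics import quantiles

def quartile_cut_points(scores):
    """Return the three cut points that split `scores` into quartiles."""
    return quantiles(scores, n=4)

def quartile_of(score, cut_points):
    """Return the quartile (1 = lowest, 4 = highest) that `score` falls into."""
    return bisect_right(cut_points, score) + 1

# Hypothetical WI scores (sums of z-scores) from two different study areas.
area_a = [-3, -2, -1, 0, 1, 2, 3, 4]
area_b = [0, 1, 2, 3, 4, 5, 6, 7]

cuts_a = quartile_cut_points(area_a)
cuts_b = quartile_cut_points(area_b)
cuts_pooled = quartile_cut_points(area_a + area_b)

score = 3.0  # the same neighbourhood score, classified three ways
print(quartile_of(score, cuts_a), quartile_of(score, cuts_b), quartile_of(score, cuts_pooled))
```

With these illustrative values, an identical WI score is placed in a different quartile depending on which sample supplied the cut points, whereas cut points derived from the pooled data give a single common classification.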
Although RESIDE is a quasi-experimental study, the data presented are cross-sectional and causality cannot be inferred. A number of GIS-related limitations mentioned in the discussion are relevant to this study. The land use base data may not accurately represent what is actually present in the environment and was not assessed for its accuracy, which is a limitation. Furthermore, the allocation of land use to a single use prevents multi-use classifications and this could have resulted in LUM scores for some neighbourhoods being under-represented. Moreover, as RESIDE participants are people building new homes, they are not representative of the general population. As they were selected from people building homes across the entire metropolitan area, they are, however, likely to be representative of new home buyers. In the low-density, car-dependent cities seen in Australia and the US, walking and cycling are likely to make a smaller contribution to total physical activity compared with (say) Europe. This will limit the associations observed, and thus there is a need for global thresholds of components of WIs to enable comparisons across countries. Finally, the limitations of using self-report physical activity data are well documented.