A burgeoning literature links attributes of neighbourhoods’ built environments to residents’ physical activity, food choices, weight, and/or obesity risk. While these studies do not necessarily view the relationship as causal, it is sometimes implied. If the neighbourhood built environment influences residents’ physical activity, food choices, and/or weight, then changing the built environment may be an important public policy tool that could help reduce Americans’ rising overweight and obesity risk. But what if people choose to live in neighbourhoods that support their dietary and physical activity preferences? This latter view has recently been espoused by land-use developers . Different public policy implications would arise depending upon which mechanism is correct. Do environments affect weight or are weight and residential selection simultaneously determined?
Cross-sectional studies are especially disadvantaged in their ability to draw conclusions about causal relationships between neighbourhood environments and overweight/obesity risk. Analyses of non-experimental data gathered at a single point in time have the potential to contain residential self-selection biases . As such, they may misstate the underlying causal relationship between neighbourhood environments and health-related outcomes such as physical activity, transportation mode choices, dietary intake, and/or healthy body weight. Although authors typically note this cross-sectional limitation [3–5], rarely do they invoke any of the statistical techniques designed to adjust for such self-selection.
Cross-sectional estimates of the association between neighbourhood walkability (measured by a range of variables) and BMI are typically small but statistically significant. For instance, in an analysis of adolescents’ BMI, Ewing and his colleagues find that a one-unit increase in a county-level sprawl index (i.e., a change toward more compact development) is significantly associated with a 0.003 decline in an adolescent’s risk of being overweight, holding other factors constant . These very modest sprawl effect sizes are also found in studies of adults [7, 8]. While the estimated effects of neighbourhood characteristics on BMI are small, the relationships are nonetheless important from the perspective of policymakers, as changes in neighbourhood characteristics have the potential to affect the weight of thousands of residents.
Researchers typically acknowledge that residential selection may confound estimates of the causal relationship between the built environment and behaviours associated with healthy body weight. When available, they exploit the time ordering of longitudinal data to generate improved estimates of causal effects. The results of these longitudinal studies are mixed. Some studies find little or no evidence of a causal relationship between the built environment and physical activity or healthy body weight [9–11], while others find evidence of a reciprocal causal relationship, supporting both environmental and selection influences [7, 8, 12–16]. Investigations that compare cross-sectional analyses with longitudinal assessments find that statistical relationships between the built environment and physical activity or healthy body weight sometimes change from significant to insignificant, or vice versa, when moving from cross-sectional to longitudinal analyses [6, 10, 17, 18]. This mixed evidence may reflect the small, sometimes idiosyncratic samples that often form the basis of these investigations. Regardless of the reasons, the existing longitudinal studies provide no clear consensus on the causal relationship between the built environment and physical activity, transportation mode choice, and/or healthy body weight.
Perhaps not surprisingly, there are a large number of cross-sectional studies that investigate the built environment and various outcomes related to healthy weight. According to recent reviews, few of these studies adopt procedures to assess the effects of residential selection [19, 20]. The few cross-sectional studies that do make adjustments adopt one of two general strategies. The first strategy is to use information about residential preferences to disentangle the cross-sectional relationships. Most often this information is acquired through survey questions (e.g., asking about the importance of having stores within walking distance of one’s home). Variables measuring these preferences are included as controls in the empirical work that relates features of the built environment to the outcomes of interest [18, 21–27]. Infrequently, researchers have attempted to adjust for residential preferences by controlling for unobserved heterogeneity [28, 29] or by comparing the estimates for individuals who can act on their residential preferences (e.g., young adults) to individuals who are far less able to act on their preferences (e.g., adolescents) .
Most of the studies that make use of preference information are focused on questions regarding how the built environment affects transportation choices [24, 25, 29, 31] and consequently they are only marginally relevant to our outcome of interest. More germane to the question at hand are studies where the outcome is physical activity and/or some measure of healthy body weight. These studies report that the relationship between the built environment and physical activity/BMI declines in magnitude and statistical significance once one adjusts for preferences [14, 23, 28, 30].
Results of studies that rely on direct questions represent some progress in addressing the residential selection issue. But, as Mokhtarian and Cao  highlight, such surveys may be trading off smaller sample size for greater detail on residential preferences. In addition, direct questions used to map residential preferences may generate new sources of bias if respondents’ answers are prone to error because post-relocation preferences are distorted by memory, dissonance reduction, and/or social desirability, or if preferences are endogenous with residential selection.
Often researchers working with cross-sectional data do not have measures of residential preferences, or they may conclude that the measures they do have are subject to the measurement biases described above. In those instances, analysts turn to statistical strategies to control for residential selection bias. Several statistical methods have been proposed to make this adjustment in the cross-section, including propensity scores and structural equation modelling [12, 29, 32]. Propensity scores create equivalent groups of “treatment” and “control” individuals by matching groups on multiple sources of observed differences, under the assumption that there is no correlation between the unobservable characteristics and the outcome of interest (i.e., BMI). Structural equation modelling, often utilizing a two-stage least squares or a full-information maximum likelihood approach, corrects for selection by the use of variables called instruments. By definition, the instruments must be variables that relate to the proposed predictors, such as choice of neighbourhood, but that relate to outcomes, such as BMI, only through those predictors. Both approaches have advantages and disadvantages [32, 33], and the approach selected is often based on data availability.
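To make the instrumental variables logic concrete, the following is a minimal sketch on simulated data, not an analysis from any of the studies cited. All variable names, coefficients, and the instrument itself are hypothetical assumptions chosen for illustration. An unobserved preference for active living raises the chance of living in a walkable neighbourhood and independently lowers BMI, so ordinary least squares overstates the neighbourhood effect, while two-stage least squares recovers a value close to the assumed true effect.

```python
import numpy as np


def ols(y, regressors):
    """Least-squares coefficients for y on [intercept, regressors...]."""
    X = np.column_stack([np.ones(len(y)), regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def two_stage_least_squares(y, x_endog, z_instr):
    """2SLS point estimates; returns [intercept, effect of x_endog].

    Only point estimates are computed. Standard errors taken from a
    naive second-stage regression would be incorrect and are omitted.
    """
    # Stage 1: predict the endogenous regressor from the instrument.
    Z = np.column_stack([np.ones(len(y)), z_instr])
    gamma, *_ = np.linalg.lstsq(Z, x_endog, rcond=None)
    x_hat = Z @ gamma
    # Stage 2: regress the outcome on the predicted regressor.
    return ols(y, x_hat)


# --- Simulated data (every number below is an assumption) ---
rng = np.random.default_rng(0)
n = 20_000
true_effect = -0.3                      # assumed causal effect on BMI

preference = rng.normal(size=n)         # unobserved taste for activity
instrument = rng.normal(size=n)         # hypothetical valid instrument
walkability = 0.8 * instrument + 0.6 * preference + rng.normal(size=n)
bmi = 27.0 + true_effect * walkability - 1.0 * preference + rng.normal(size=n)

beta_ols = ols(bmi, walkability)
beta_iv = two_stage_least_squares(bmi, walkability, instrument)

print("OLS estimate of walkability effect: ", round(float(beta_ols[-1]), 3))
print("2SLS estimate of walkability effect:", round(float(beta_iv[-1]), 3))
print("Assumed true effect:                ", true_effect)
```

In this sketch the instrument is valid by construction, because it enters the BMI equation only through walkability. With real data that condition cannot be tested directly and must be argued on substantive grounds.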
In the current study, we build on the existing literature that makes use of cross-sectional data to assess the causal effect of neighbourhood characteristics on BMI by incorporating corrections for residential selection using an instrumental variables modelling approach. Specifically, we make use of two-stage least squares techniques to adjust for the possible endogeneity of neighbourhood selection and BMI. We ask whether and to what extent controlling for the effect of residential selection alters our estimates of neighbourhood walkability effects on BMI and overweight/obesity risk. We discuss the implications of our findings for researchers and policymakers concerned about reducing Americans’ overweight/obesity risk.
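For readers unfamiliar with the technique, a generic two-stage least squares specification can be written as follows. The notation is an illustrative assumption rather than the specification estimated in this study: $W_i$ denotes a neighbourhood walkability measure for individual $i$, $Z_i$ a vector of instruments, $X_i$ a vector of exogenous controls, and $v_i$ and $\varepsilon_i$ the error terms.

```latex
\begin{align}
  W_i &= \pi_0 + \pi_1' Z_i + \pi_2' X_i + v_i
      && \text{(first stage)} \\
  \mathrm{BMI}_i &= \beta_0 + \beta_1 \widehat{W}_i + \beta_2' X_i + \varepsilon_i
      && \text{(second stage)}
\end{align}
```

Here $\widehat{W}_i$ is the fitted value from the first stage, and $\beta_1$ is interpreted as the effect of walkability on BMI net of residential selection, provided that $Z_i$ is correlated with $W_i$ and uncorrelated with $\varepsilon_i$.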