Identifying state-level policy and provision domains for physical education and physical activity in high school

Background: It is important to quickly and efficiently identify policies that are effective at changing behavior; therefore, we must be able to quantify and evaluate the effect of those policies and of changes to those policies. The purpose of this study was to develop state-level physical education (PE) and physical activity (PA) policy domain scores at the high-school level. Policy domain scores were developed with a focus on measuring policy change.
Methods: Exploratory factor analysis was used to group items from the state-level School Health Policies and Programs Study (SHPPS) into policy domains. Items that related to PA or PE at the high-school level were identified from the 7 SHPPS health program surveys. Data from 2000 and 2006 were used in the factor analysis.
Results: From the 98 items identified, 17 policy domains were extracted. Average policy domain change scores were positive for 12 policy domains, with the largest increases for “Discouraging PA as Punishment”, “Collaboration”, and “Staff Development Opportunities”. On average, states increased scores in 4.94 ± 2.76 policy domains, decreased in 3.53 ± 2.03, and had no change in 7.69 ± 2.09 policy domains. Significant correlations were found between several policy domain scores.
Conclusions: Quantifying policy change and its impact is integral to the policy making and revision process. Our results build on previous research offering a way to examine changes in state-level policies related to PE and PA of high-school students and the faculty and staff who serve them. This work provides methods for combining state-level policies relevant to PE or PA in youth for studies of their impact.


Background
In the United States, state and local governments have far-reaching responsibilities for public schools and the youth attending those schools, including their health and welfare. In recent years, growing concerns about the epidemic of childhood obesity and low levels of physical activity (PA) have prompted a large number of legislative and regulatory actions that aim to increase PA in schools, directly or indirectly. In 2011, 41 states and the District of Columbia (DC) had legislation introduced that was related to PA or Physical Education (PE) in schools (Database of State Legislative and Regulatory Action to Prevent Obesity and Improve Nutrition and Physical Activity, accessed Jan 2012). While previous research has shown that some state-level legislation and local policies are positively related to PE time and PA levels of students [1][2][3][4][5], there is little empirical support for many of the legislative actions that are pending or have been enacted. This includes support for legislative action directly related to PA (e.g. allowing community access to school playgrounds and fields) and legislation more peripheral to PA levels (e.g. creating a model framework for teacher and principal evaluation instruments or requiring public meetings about education issues). Without evidence for effectiveness it is not known which policy actions are useful and which are ineffectual, placing an undue burden on a system with limited resources.
As state budgets tighten, it becomes increasingly important to quickly and efficiently identify policies that are effective. This requires methods to quantify policies and policy change in a meaningful way to allow careful evaluation of implemented policies. This measurement task is difficult due to the large numbers and types of policies, many of which are strongly related to each other in terms of their specific goal, target behavior and/or agent of change. While policies can be evaluated one-by-one, it seems obvious that related policies will interact with each other in real life settings and that examining each policy individually could yield misleading results. Indeed, previous research in this area has suggested that due to the complexity and reach of state-level legislation it may be more effective to evaluate changes in policy factors or domains defined as combinations of individual policies that may overlap and tend to change and act together [6].
We are aware of two systems or policy scoring mechanisms that have been developed to group and quantify school-level PA and/or PE policies [6,7]. One of these, the Physical Education and Recess State Policy Classification System (PERSPCS), was developed to assess the "nature and extent" of state-level PE statutes and regulations in six areas: PE time, PA time, staffing, curriculum, assessment and recess [7]. The system uses a rating scale (e.g. 0 to 4) that allows each policy area to be "graded" based on the strength, specificity and comprehensiveness of the legislation. Summary and area-specific (e.g. PE time, curriculum) scores can be computed for elementary, middle, and high school levels and for all grade levels combined. Currently, state-level ratings are available from 2003 to 2008 and 2010. While the development of this system was an important move forward, it is somewhat limited in scope, covering only a few policy domains, and may require specialized legal training to grade policy areas accurately.
A second policy scoring system was developed as a comprehensive measure of state-level, school-based obesity prevention policies using data collected as part of the 2006 School Health Policies and Programs Study (SHPPS; [6]). At the state-level, the purpose of SHPPS is to provide data that can be used to describe policies and programs from seven school health program components. Nanney et al. created a PA policy scoring system using 146 items from the PA and PE components of SHPPS. These items were grouped into 10 policy domains using principal components factor analysis, expert opinion, and the relationships among items and policy domains. This approach capitalized on the large number of policy and provision items to construct policy domain scores that combined multiple items to create robust measures of important policy areas. Policy domain specific and an overall summary score were computed using the proportion of policies characterized as "required" (score = 1). Despite several strengths, the system lacks grade specific policy domain scores, which are useful because PE requirements and implementation are different across grade levels. In addition the policy domain scores were developed using only items from the 2006 version of the SHPPS survey, making it difficult to use them to evaluate the frequency and impact of policy change if item content and response options change from one administration to the next.
In this paper we build upon this previous research to develop state-level high-school PE/PA policy domain scores specifically designed with a focus on policy change. We use information from both the 2000 and 2006 SHPPS surveys to identify the policy domains that can be used to assess change over that period. We describe a set of policy domain scores that can be computed using surveillance data collected as part of the SHPPS survey and present state-level policy domain scores and change. Exploratory factor analysis was used to identify groups of items or variables that were statistically related and together represented a concept or domain of interest. Items that group together have shared variance and can be combined, or modeled, as a single variable. This combination of information from multiple related items generally results in more robust variables and simplified statistical models that are representative of the relationships among the individual items but easier to interpret and apply to processes like policy evaluation.

2000 and 2006 SHPPS data
Data for this study are from the 2000 and 2006 SHPPS [8][9][10][11]. This national survey is conducted by the Centers for Disease Control and Prevention every 6 years, and is designed to collect information on school health policies (e.g. Has your state adopted a policy…) and practices (e.g. Has your state provided funding or offered…) at the state, district, school, and classroom levels. For this work we use only state-level data for high schools. Although SHPPS provides data for many grade levels, this analysis was limited to high school to allow for future comparisons with the PA YRBS data, which is only available for high school students.
In the SHPPS survey, "policy" is defined as: "any law, rule, regulation, administrative order, or similar kind of mandate issued by the state board of education, state legislature, or other state agency with authority over schools in your state." SHPPS data were collected through computer-assisted telephone interviews or self-administered mailed questionnaires from state personnel who are considered most knowledgeable about the relevant policy area. In 87% of states, the PE component of the survey was completed by the self-identified coordinator of PE. All states and the District of Columbia (included in the term "states" from here on) participated in SHPPS in both 2000 and 2006.
SHPPS contains 7 health program component surveys: 1. Faculty and Staff Health Promotion; 2. School Policy and Environment; 3. Food Service; 4. Health Education; 5. Health Services; 6. Mental Health and Social Services; and 7. Physical Education. For this project, items from all 7 surveys were examined to identify questions that related to PA or PE at the high school level. In total, 151 items were identified (see Figure 1). Items were compared between the 2000 and 2006 surveys to ensure that policy domain change scores could be computed. Items were checked for wording (removal or addition of information), format, and response options at both time points. Of the items identified, 104 were sufficiently similar in the 2000 and 2006 surveys that they could be matched for the purpose of calculating change scores. Three of these items were found to have irrelevant or redundant information and were dropped, leaving 101. In addition, we decided that the information from three pairs of questions (6 items) that were connected through skip patterns should be combined to create 3 items with 3 levels each (NO, Recommend, Require). These questions asked respondents if the state had a written policy about some topic; respondents who answered "YES" were classified as having a policy. Respondents who answered "NO" were asked a follow-up question about whether the state recommended this topic (YES/NO). Responses to the resulting 98 items collected in the 51 states were used for this analysis.
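The recoding of the skip-pattern question pairs and the item count can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `combine_skip_pair` and the response labels are hypothetical, and mapping a "YES" on the written-policy question to "Require" is our reading of the three-level scale described above.

```python
from typing import Optional


def combine_skip_pair(has_policy: Optional[str],
                      recommends: Optional[str]) -> Optional[str]:
    """Collapse a policy question and its follow-up into one 3-level item.

    has_policy: answer to "does the state have a written policy?" (YES/NO)
    recommends: answer to the follow-up asked only after a NO (YES/NO)
    """
    if has_policy == "YES":
        return "Require"          # assumption: written policy -> Require
    if has_policy == "NO":
        if recommends == "YES":
            return "Recommend"
        if recommends == "NO":
            return "NO"
    return None                   # missing or unusable response


# Item bookkeeping, using the counts reported in the text.
matched_items = 104
after_removal = matched_items - 3        # 3 irrelevant/redundant items dropped
final_items = after_removal - 6 + 3      # 6 paired items merged into 3
print(final_items)
```

The bookkeeping at the end reproduces the reported count of 98 items from the 104 matched items.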

State-level policy domains
Policy domains were developed using the results from several exploratory factor analysis models, item grouping from the SHPPS survey, and item/scale psychometrics. Analyses were conducted separately for data from the 2000 and 2006 SHPPS using available information from all 51 states. A summary of item selection, item grouping, and the final policy domains can be found in Figure 1.
All items were scored on a two (NO/YES) or three (NO, Recommend, Require) level scale. Details on the SHPPS scoring system are available in the technical documentation for the survey [12]. For the purpose of this project items were scored 0 for no policy or 1 for presence of a policy. Several items included a middle category, recommend/encourage; this was scored 0.5 to simplify the creation of factors. The ratio of the sample (51) to items (98) was small, which could reduce stability of the exploratory factor analysis results [13]. Therefore, we initially grouped items based on the structure of the SHPPS survey and previous research [6]. The grouping resulted in 12 exploratory factor analysis models with sample to item ratios ranging from 2.5:1 to 10:1. It was expected that by increasing this ratio the results of the exploratory factor analysis would be more stable.
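The item scoring and the sample-to-item ratios can be illustrated with a short sketch. The numeric scores (0, 0.5, 1) and the counts (51 states, 98 items) come from the text; the response labels and function names are hypothetical, and the per-model item counts assume that all 51 states contribute to each model.

```python
# Hypothetical response labels; the numeric values are those given in the text.
ITEM_SCORES = {"NO": 0.0, "Recommend": 0.5, "Require": 1.0, "YES": 1.0}


def score_item(response):
    """Return the numeric score for a response, or None if it is missing."""
    return ITEM_SCORES.get(response)


def sample_to_item_ratio(n_states, n_items):
    """Number of respondents (states) per item in a factor model."""
    return n_states / n_items


# A single model with all items has roughly half a state per item.
overall_ratio = sample_to_item_ratio(51, 98)

# Items per model implied by the reported range of ratios,
# assuming all 51 states are available for each model.
items_at_lowest_ratio = 51 / 2.5    # about 20 items per model
items_at_highest_ratio = 51 / 10    # about 5 items per model

print(round(overall_ratio, 2), items_at_lowest_ratio, items_at_highest_ratio)
```

Under these assumptions, the 12 smaller models would each hold roughly 5 to 20 items, compared with 98 items in a single combined model.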
One of the goals of this project was to develop policy domains that could be used to examine policy change. To ensure this, decisions about item retention and factor selection were made systematically using both sets of results (2000 and 2006). The final factors from SHPPS 2000 contained the same set of items as the final factors from SHPPS 2006. During this process the exploratory factor analysis for a group of items was conducted in both samples. Results were then compared. Any item with no factor loading (correlation between the factor and the variable) greater than 0.40 in either sample was removed, and the exploratory factor analysis was repeated. The next steps involved identifying individual items that did not fit well at one of the time points. These items were removed individually with the rule that final factors had to have the same items in both data sets. Most items were excluded due to low factor loadings (< 0.40) or large cross loadings (correlation with another factor) (> 0.40). For several factors, the final models produced estimates with negative error variance for an item. While not ideal, the occurrence of Heywood cases (items with negative variance estimates) is not unexpected given the size of the sample [13]. In each of these cases the final model and items were inspected for over-factoring, and relationships among the items were examined using correlations, Cronbach's alpha, and item-total correlations. All exploratory factor analyses were conducted using a robust weighted least squares estimator (WLSMV), Geomin rotation, and variables classified as categorical. Mplus v6 was used for these analyses.
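The loading-based screening rules can be sketched in a few lines. This is an illustration only, not the authors' procedure: the loadings below are invented, the function names are hypothetical, and two readings are assumed, namely that the first pass drops an item only when it has no loading above 0.40 in both years, and that loadings are compared by absolute value.

```python
THRESHOLD = 0.40  # loading cutoff reported in the text


def salient(loadings):
    """Flag each factor loading whose absolute value exceeds the cutoff."""
    return [abs(value) > THRESHOLD for value in loadings]


def first_pass_drop(load_2000, load_2006):
    """Items with no salient loading in either year (assumed reading)."""
    return sorted(
        item for item in load_2000
        if not any(salient(load_2000[item]))
        and not any(salient(load_2006[item]))
    )


def has_cross_loading(loadings):
    """True when an item loads above the cutoff on more than one factor."""
    return sum(salient(loadings)) > 1


# Invented two-factor loadings for three hypothetical items.
load_2000 = {"item_a": [0.72, 0.10], "item_b": [0.31, 0.22], "item_c": [0.55, 0.48]}
load_2006 = {"item_a": [0.65, 0.05], "item_b": [0.38, 0.15], "item_c": [0.60, 0.20]}

dropped = first_pass_drop(load_2000, load_2006)
flagged_2000 = [item for item in load_2000 if has_cross_loading(load_2000[item])]
print(dropped, flagged_2000)
```

In this invented example, `item_b` would be removed in the first pass, while `item_c` would be flagged for a cross loading in 2000 only and so would need the item-by-item review described above.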

State-level policy domain changes
Summaries and comparisons of policy domain and policy domain change scores were estimated using SAS v9.2. Scores were computed for each policy domain using the 2000 and 2006 data and Cronbach's alpha was calculated [14]. Policy domain change scores were computed as ($\text{score}_{2006} - \text{score}_{2000}$), and were considered "no change" from 2000 to 2006 if the value changed by less than 20% of the policy domain change score standard deviation. Most states with no change had policy domain change scores of 0.
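The two computations described above, Cronbach's alpha and the change-score classification, can be sketched as follows. This is a minimal sketch rather than the SAS code used in the study: the function names and example data are hypothetical, the raw (unstandardized) form of alpha is assumed, and the standard deviation is assumed to be the sample standard deviation of the change scores across states within one policy domain.

```python
from statistics import stdev, variance


def cronbach_alpha(rows):
    """Raw Cronbach's alpha; rows holds one list of item scores per state."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        return None
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    if total_var == 0:
        return None  # undefined when the summed score does not vary
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def classify_changes(score_2000, score_2006, fraction=0.20):
    """Label each state's 2006 minus 2000 change for one policy domain."""
    changes = {s: score_2006[s] - score_2000[s] for s in score_2000}
    cutoff = fraction * stdev(list(changes.values()))
    labels = {}
    for state, delta in changes.items():
        if delta == 0 or abs(delta) < cutoff:
            labels[state] = "no change"
        elif delta > 0:
            labels[state] = "increase"
        else:
            labels[state] = "decrease"
    return changes, labels


# Invented item scores for four hypothetical states on a 3-item domain.
items = [[1.0, 1.0, 1.0], [0.0, 0.0, 0.0], [1.0, 1.0, 0.5], [0.0, 0.5, 0.0]]
alpha = cronbach_alpha(items)

# Invented domain scores for the same four hypothetical states.
s2000 = {"A": 0.5, "B": 0.5, "C": 1.0, "D": 0.0}
s2006 = {"A": 0.5, "B": 1.0, "C": 0.75, "D": 0.0}
changes, labels = classify_changes(s2000, s2006)
print(round(alpha, 3), labels)
```

Because the cutoff depends on the spread of change scores within each domain, the same absolute change could count as meaningful in one domain and as no change in another.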

State-level policy domains
Based on results from the exploratory factor analysis 17 policy domains were extracted using 83 of the original 98 items selected (Table 1 and Figure 1). Sample sizes for the exploratory factor analyses ranged from 45 to 51 states, with 75% including at least 49 states. Three items did not have any variation in the 2000 sample, but were found to be significant in the model for 2006. These items were retained for their respective policy domain scores. Four of the final policy domains included only 2 items each, while 8 policy domains contain 5 or more items each. A complete list of the items in each policy domain is provided in Additional file 1.
Cronbach's alpha ranged from a low of 0.54 for "Exemptions from PE: religious or disability" (PD3) in 2000 to a high of 0.99 for "Goals and Objectives for PE" in 2000. About 67% of the policy domains had alpha values greater than 0.75 and all but one alpha was greater than 0.60. On average the alphas differed only slightly between years (0.07 units), with 10 higher in 2000 and 7 higher in 2006. The largest difference between alphas at the two time points was about 0.2 units for "Exemptions from PE: religious" and "Provide PE information". The alpha for "Physical Activity Promotion for Staff" could not be computed in 2000 because two of the three items had zero variance.
Average policy domain change scores were positive for 12 policy domains, with the largest increases for "Discouraging PA as Punishment", "Collaboration", and "Staff Development Opportunities". Using our criteria for meaningful change (at least 20% of the policy domain change score standard deviation), the average policy domain scores for only 6 policy domains changed from 2000 to 2006. Each of these policy domain scores increased overall, but did not increase in all states. In Figure 2 we show that the number of states that increased, decreased, or had no change from 2000 to 2006 varied considerably across the 17 policy domains. For "Physical activity promotion for faculty and staff" 43 out of 49 states had no change, while 31 states showed an increase in "Collaboration". At least 20 states also showed increases in "Provide PE information or materials" (PD7), "Staff development opportunities" (PD10), "Discourage physical activity as punishment" (PD8), and "Standards and compliance for PE" (PD11).

Discussion
Quantifying policy change and its impact is integral to the policy making and revision process. Building on previous work in this area, the results of this study were used to identify a set of 17 policy domains. They were developed to be specific to high-schools and to contain the same information over time, enhancing our ability to examine change in policy. Data from two administrations of the SHPPS survey (2000 and 2006), a national policy surveillance instrument, were used. The resulting policy domain scores can be applied during the evaluation process to summarize policy change related to student behavior and will be useful in gaining a better understanding of the similarities and differences among specific policies and provisions for PA and PE. In addition, it will be interesting to see how policy change progresses in each policy domain by applying these results to data from the 2012 administration of the SHPPS survey.

State-level policy domains
Previous work in this area provided guidance in developing state-level PE and PA policy domains. In their work, Nanney and colleagues identified 10 policy domains using state-level policy and practice data for elementary, middle, junior, and senior high schools from SHPPS 2006 [6]. Nine could be applied to senior high schools (walking to school was not applicable for high schools). Of these, five are similar to those identified in the current study. Three are nearly identical (Physical Activity as Punishment (PD8), Protective Gear (PD5), and Adaptive PE (PD9)), while Testing (PD12) and Collaboration (PD2) are similar to the Assessment and Collaboration policy domains identified by Nanney et al. (2010), but contain fewer items. The difference in items is primarily due to the fact that in the previous study, items that applied to elementary, middle, and junior high school were included in the policy domain development.
While some items and policy domains will be similar across grade levels, we feel that grade-specific policy domain scores are useful for several reasons. First, PE requirements and implementation are quite different across grade levels. This means that while PE policies may be related for middle and high schools, they are likely not the same. Therefore, a state-level policy domain score for "standards" that includes all grades may not truly reflect the strength or weakness in policy at a given grade level, making it more difficult to assess policy impact. Second, the available data on PE and PA participation for students of different ages are often collected in different ways (e.g. high schools collect self-report data such as the YRBS; elementary schools rely on observation or proxy report). This makes it difficult to compute the state-level behavioral outcomes needed for comparison to a general (all grade levels) policy domain score. Finally, differentiation of policy effects may be particularly important during different developmental periods. For example, requiring more PE or PA in school may be most beneficial during early to middle adolescence, when overall activity levels decline more rapidly, especially in girls [15].
Having only general policy domain scores would make it hard, if not impossible, to identify potentially important effects of policy change during these influential periods. The final two policy domains identified by Nanney et al., Standards and Training, included a large number of items. In our work several smaller, more specific policy domains were identified within these larger groups of items. For example, the previous study created one training policy domain with 38 items, including 27 related to high school. Our analysis suggested that they should be separated into policy domains related to "PE certification" (PD15, PD16), "Coaches training" (PD2), and "Staff development" (PD10). Our correlational and state-level change results suggest that these policy domains are distinct. For the Standards policy domain Nanney and colleagues identified 35 items, 10 of which apply to high school. Our results suggest that these items may not represent a single policy domain, but rather "General PE standards" (PD11), "PE goals" (PD13), and "PE teaching/time requirements" (PD17). In our correlational results, "General PE standards" and "PE goals" had the strongest relationship (r ~ 0.75). This suggests that these policy domains might be combined. Given the other data available, like item content, scatter plots, and policy domain change scores, it is difficult to tell if these factors should be merged or if they represent separate ideas and actions that are related but need to be differentiated. At this time we suggest that these policy domains be studied separately. Future research may show that these policy domains are related to behavioral outcomes or legislative change in similar ways, but for now they should be treated as distinct.

State-level policy domain changes
Averaged over all states, 11  While the average policy domain score results are similar, our data showed more variation between states. In our sample, every state changed on at least 4 policy domains with most having substantial change on at least 8 policy domain scores. The difference between the PERSPCS data and our results is likely related to differences in data collection and content coverage.
The PERSPCS data and scoring focus on laws and regulations in six key areas which were systematically scored by trained researchers. In contrast, SHPPS data were self-reported, and covered a greater number of policy domains and included more policy and provision items. Often, important changes in policies and provisions for PA in high schools may be implemented without specific changes to state laws and regulations. When this occurs the PERSPCS system is unlikely to detect change. It should also be noted that while one study has concluded that reliability and validity evidence for the SHPPS data is acceptable [16], measurement error could be inflating the amount of change estimated in the new policy domains. At this point it is safe to say that both scoring systems are important to understanding the relationship between policy and PA. Future research should help to pinpoint where each is most useful and how policy domain scores from each relate to behavioral outcomes.

Limitations
This research study benefited from the comprehensiveness of the data collected in the SHPPS survey, but the number of items compared to the number of respondents was less than ideal for factor development. This is the primary reason we conducted several smaller exploratory factor analysis models and used expert judgment and inter-item relationships when making final decisions about a specific policy domain or a questionable item. With only 51 possible respondents the robustness and usefulness of some domains could be questioned. We also recognize that the correlations between combinations of policies can be influenced by unmeasured policies or other unmeasured attributes. This type of problem is not unique to this analysis, but analyses of numerous combined policies in this area of study are relatively new, and important sources of bias and confounding may not yet be fully understood. We suggest that researchers continue to search for variables that influence associations between policies and their targets and that the policy domains proposed here be reevaluated after the SHPPS survey is re-administered in 2012.

Conclusions
Examining the effects of policy change on its intended targets is a major part of the policy evaluation-revision cycle. This research supports this type of future work by providing a means of examining changes in state-level policy domains related to PE and PA of high-school students and the faculty and staff who serve them. The results build on previous research to offer a new way to examine the effects of policy change on behaviors. Future research should connect policy change not only to PE, but also to overall PA, and provide guidance to policy makers who seek ways to promote PA and health in children.