Measurement of availability and accessibility of food among youth: a systematic review of methodological studies

Background: Comprehensive and psychometrically tested measures of availability and accessibility of food are needed in order to explore availability and accessibility as determinants and predictors of dietary behaviors. The main aim of this systematic review was to update the evidence regarding the psychometric properties of measures of food availability and accessibility among youth. A secondary objective was to assess how availability and accessibility were conceptualized in the included studies.
Methods: A systematic literature search was conducted using Medline, Embase, PsycINFO and Web of Science. Methodological studies published between January 2010 and March 2016 and reporting on at least one psychometric property of a measure of availability and/or accessibility of food among youth were included. Two reviewers independently extracted data and assessed study quality. Existing criteria were used to interpret reliability and validity parameters.
Results: A total of 20 studies were included. While 16 studies included measures of food availability, three included measures of both availability and accessibility; one study included a measure of accessibility only. Different conceptualizations of availability and accessibility were used across the studies. The measures aimed at assessing availability and/or accessibility in the home environment (n = 11), the school (n = 4), stores (n = 3), childcare/early care and education services (n = 2) and restaurants (n = 1). Most studies followed systematic steps in the development of the measures. The most common psychometrics tested for these measures were test-retest reliability and criterion validity. The majority of the measures had satisfactory evidence of reliability and/or validity. None of the included studies assessed the responsiveness of the measures.
Conclusions: The review identified several measures of food availability or accessibility among youth with satisfactory evidence of reliability and/or validity. Findings indicate a need for more studies including measures of accessibility and addressing its conceptualization. More testing of some of the identified measures in different population groups is also warranted, as is the development of more measures of food availability and accessibility in the broader environment such as the neighborhood food environment.


Background
The promotion of healthy dietary behaviors among youth is pivotal for the prevention of overweight/obesity as well as non-communicable diseases [1,2]; dietary behaviors learned in childhood are also found to track into adulthood [3]. To develop effective interventions targeting different dietary behaviors, it is imperative to understand the important correlates of these behaviors. Examples of such correlates include the availability and accessibility of food, as well as other factors such as self-efficacy, food preferences, parental modeling and parental rules. Availability and accessibility of foods as potential determinants of food choice and dietary intake are recognized in most up-to-date theories aiming to predict and/or explain health behaviors including dietary behaviors. For example, social-ecological theories of health behavior [4] posit that the physical and social environment we live in importantly influences our health behaviors. The food environment children and adolescents live in, especially the home and school environments, defines what foods are available and accessible to them. A further detailing and specification of social cognitive theory for energy balance-related behaviors, including dietary behavior, was proposed by Kremers et al. [5]. They argue that the influence of such environmental factors, including availability and accessibility, may be mediated and moderated by individual level, social and demographic determinants such as intentions, preferences, self-efficacy, and social environmental factors such as socioeconomic position, parenting, and modeling.
Most major theoretical models aiming to explain food choice and dietary behaviors now recognize, directly or indirectly, the importance of and interplay between physical environmental factors (such as availability and accessibility of foods), social environmental factors and personal factors as important drivers of food choice and dietary behavior. Further insight into, and an overview of, the measurement qualities of the measures used to assess these factors is therefore important. However, existing evidence suggests that there is high variation in the conceptualization of correlates and determinants of dietary behaviors, as well as common use of measurement instruments whose psychometric properties have not been tested [6,7]. These issues are problematic for several reasons. The first is the difficulty of identifying important determinants of dietary behaviors in the presence of significant measurement errors. These errors might be particularly pronounced in studies involving children due to varying cognitive development that might affect comprehension and recall of the construct in question. The second is the inability to compare findings across different populations and settings when different measures are used to explore the same correlate.
Availability and accessibility of foods are among the correlates most consistently associated with dietary behaviors among youth [8][9][10][11]. In addition, their importance in explaining socioeconomic differences in dietary behaviors has been evidenced by several studies using formal tests of mediation [12][13][14]. However, the conceptualization of these constructs has not always been uniform, in particular in relation to the concept of accessibility of foods. Availability is related to the physical presence of food; this can include foods offered/served in different settings. In a recent Delphi study aimed at clarifying food parenting practices related to snacking, different descriptions were given by experts in relation to availability including having food at home, offering food, serving food and making sure foods are prepared [15]. Accessibility, on the other hand, has been defined as reflecting whether foods are available in a form and location that facilitate their consumption [16]. The need for the food to be retrievable and ready to eat has also been highlighted [17]. In the aforementioned Delphi study, the following descriptions were put forward by experts as being related to the accessibility of snacks: "storing snacks in a location the child cannot access on his or her own", "not giving the child money for snacks at school", "avoiding going to shops where snacks are available", "putting snacks on the table all day" etc. [15]. These conceptualizations show the different dimensions of these constructs and the need to consider them when examining instruments that measure these concepts and when summarizing evidence on the role of these correlates in influencing dietary behaviors.
Previous reviews have looked at measurement properties of correlates of dietary behaviors among youth including availability and/or accessibility of food [18][19][20]. Findings indicate that several measures of availability and accessibility of food, in particular related to the home environment, do exist. While evidence of reliability exists for several of these instruments, a lack of validity assessment was documented across reviews. The present review includes studies published from 2010 onwards not included in previous reviews of studies exploring the psychometric properties of measures of the availability and/or accessibility of food among youth. Unlike the previous reviews, the focus is only on methodological studies, and, in addition to summarizing the psychometric properties of the instruments, the review will also describe differences in the conceptualization of these correlates across studies. Providing such an overview of existing measures and their psychometric properties will help to avoid unnecessary replication of existing measures and help identify gaps in the measurement of the constructs of interest.

Search strategy
The systematic steps outlined in the PRISMA guidelines were used in this review [21]. The studies of interest were those reporting on the psychometric properties of measures assessing the availability and/or accessibility of foods/drinks among youth. The location could be at home, at school and in the neighborhood (e.g. stores). Therefore, the search was conducted by combining, using the "AND" Boolean operator, five main groups of keywords: keywords for dietary behaviors (e.g. food habits, dietary habits, dietary behavior), keywords for the correlates (e.g. availability, accessibility), keywords for psychometric properties (e.g. validity, reliability), keywords for methods used (e.g. survey, questionnaire) and keywords for the population of interest (e.g. children, adolescents, youth). Within each of these categories, keywords were combined using the "OR" Boolean operator. The search strategy is available from the corresponding author upon request. The following databases were searched for relevant articles using keywords and Medical Subject Headings: Medline, Embase, PsycINFO and Web of Science. In addition, reference lists of relevant publications were manually searched.
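The way the keyword groups were combined can be illustrated with a minimal sketch. It uses only the example terms named above, not the full search strategy (which is available from the corresponding author), and the names `KEYWORD_GROUPS`, `quote` and `build_query` are hypothetical. Database-specific syntax such as field tags or Medical Subject Headings is not modeled.

```python
# Illustrative only: these are the example terms given in the text,
# not the complete search strategy used in the review.
KEYWORD_GROUPS = {
    "dietary_behaviors": ["food habits", "dietary habits", "dietary behavior"],
    "correlates": ["availability", "accessibility"],
    "psychometric_properties": ["validity", "reliability"],
    "methods": ["survey", "questionnaire"],
    "population": ["children", "adolescents", "youth"],
}


def quote(term: str) -> str:
    """Wrap multi-word terms in double quotes so they are searched as phrases."""
    return f'"{term}"' if " " in term else term


def build_query(groups: dict[str, list[str]]) -> str:
    """Join terms within each group with OR, then join the groups with AND."""
    blocks = [
        "(" + " OR ".join(quote(term) for term in terms) + ")"
        for terms in groups.values()
    ]
    return " AND ".join(blocks)


query = build_query(KEYWORD_GROUPS)
print(query)
```

A record is retrieved by such a query only if it matches at least one term from every one of the five groups.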

Inclusion criteria
The following inclusion criteria were used in this review: i) methodological studies where the aim/one of the aims of the study is to evaluate at least one measurement property of an instrument measuring accessibility and/or availability of food, ii) the measurement instrument relates to the availability and accessibility of food in one or more of different food environments of youth (0-18 years), iii) studies published in English in peer-reviewed journals, iv) studies published between January 2010 and March 2016.

Identification of relevant studies and data extraction
Titles and abstracts of retrieved studies were screened to assess whether inclusion criteria were met. When the abstract was considered insufficient to make conclusions about inclusion, the full text was screened. A standardized form for extraction of detailed data from each included study was developed. The data extracted included study and sample description (including year of study publication, country, age of participants, gender composition, sample size, socioeconomic background (when available)). Information about the measures used was also extracted and included the type of construct assessed (availability and/or accessibility), the name (if any) of instrument, the type of instrument (e.g. self-report questionnaire, inventory, observation form), the number of items included and the methods for item development.
Information on the conceptualization of availability and accessibility was extracted, when explicitly presented in the studies. Information regarding internal consistency, test-retest and inter-rater reliability, as well as content (including face), construct and criterion validity was also extracted when and where available. Responsiveness, which refers to the sensitivity of the measure for the assessment of change [22], was also of interest. Results of test-retest and inter-rater reliability, often expressed as correlation coefficients, were interpreted using Landis & Koch's criteria: slight (r = 0.00-0.19); fair (r = 0.20-0.39); moderate (r = 0.40-0.59); substantial (r = 0.60-0.79); and almost perfect (r = 0.80-1.0) reliability. Kappa values were similarly interpreted [23].
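The interpretation bands for test-retest and inter-rater reliability can be written as a small lookup, shown here as a sketch. The function name `interpret_reliability` is hypothetical, and rejecting coefficients outside 0 to 1 is an assumption, since the criteria as stated above do not cover such values.

```python
def interpret_reliability(coefficient: float) -> str:
    """Map a reliability coefficient (or kappa) to the bands listed in the text."""
    # Assumption: values outside [0, 1] are not covered by the stated bands.
    if not 0.0 <= coefficient <= 1.0:
        raise ValueError("coefficient is outside the range covered by the criteria")
    if coefficient >= 0.80:
        return "almost perfect"
    if coefficient >= 0.60:
        return "substantial"
    if coefficient >= 0.40:
        return "moderate"
    if coefficient >= 0.20:
        return "fair"
    return "slight"
```

Lower bounds are used as cut points, so a value such as 0.595 falls in the moderate band.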
Internal consistency reliability was defined as adequate when a Cronbach's alpha coefficient above 0.6 was reported [24]. It was also considered adequate if exploratory factor analysis was conducted [25].
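For readers less familiar with the statistic, the following sketch shows one standard way to compute Cronbach's alpha from item-level scores and to apply the 0.6 cut-off used here. It is an illustration, not the procedure of any included study; the function names and the example data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores: list[list[float]]) -> float:
    """Cronbach's alpha; scores[i][j] is respondent i's score on item j."""
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(column) for column in zip(*scores))
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return n_items / (n_items - 1) * (1 - item_variance_sum / total_variance)


def is_adequate(alpha: float, cutoff: float = 0.6) -> bool:
    """Apply the review's criterion: alpha strictly above the cut-off."""
    return alpha > cutoff


# Hypothetical data: three respondents answering three perfectly consistent items.
example_scores = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
example_alpha = cronbach_alpha(example_scores)
```

With perfectly consistent items, as in the hypothetical data, alpha is approximately 1; real instruments yield lower values.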
Face and content validity: face validity is an aspect of content validity and involves a subjective assessment with no specific standards as to how it should be assessed and cannot be quantified [22]. When the study sample was clearly described, and a clear conceptualization of the measure was provided or a previously validated instrument was used or the item development and refinement was clearly presented, the instrument was described as having face validity. Use of independent experts is an aspect of the assessment of content validity [22]; instruments in studies where experts were used in the development or refinement phase were therefore considered to have content validity.
Criterion validity and construct validity can be quantified using different parameters such as correlation coefficients or percentage of agreement values. For these constructs, a correlation of 0.3 or above was considered acceptable [26,27]; a correlation of ≥ 0.7 was considered very good [27]. When percentage agreement values were used, agreement levels were categorized as follows: "good to excellent" (>75%), "moderate" (60-74%), or "poor" (< 60%) [28].
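The validity cut-offs can be sketched in the same way. The function names and the label "below acceptable" are hypothetical. The stated categories leave percentage agreement values from 74% up to and including 75% unassigned; treating them as "moderate" is an assumption made here for completeness.

```python
def interpret_validity(correlation: float) -> str:
    """Classify a criterion or construct validity correlation."""
    if correlation >= 0.7:
        return "very good"
    if correlation >= 0.3:
        return "acceptable"
    # Hypothetical label: the text names no category below 0.3.
    return "below acceptable"


def interpret_agreement(percent: float) -> str:
    """Classify a percentage agreement value (0-100)."""
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percentage agreement must lie between 0 and 100")
    if percent > 75.0:
        return "good to excellent"
    # Assumption: values from 74% up to and including 75% are treated as moderate.
    if percent >= 60.0:
        return "moderate"
    return "poor"
```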

Assessment of study quality
In addition to the quality criteria for the assessment of reliability and validity described above, assessment of the quality of the methods of item development and refinement was done.
The assessment was based on how systematic the process of item development was, including the methods that were used in the process of item development (e.g. use of items from existing instruments, use of expert opinion, use of theory, use of existing literature, use of qualitative methods etc.). It also included an assessment of whether any method was used for item refinement (e.g. pilot testing, cognitive interviews or use of experts). The following scores were given: 4 = fully systematic process of item development and use of at least one method of item refinement; 3 = fully systematic process used for item development but no method reported for item refinement OR process not fully systematic but item refinement was done; 2 = process of item development was not fully systematic and no method reported for item refinement; 1 = no systematic process was reported for the development or refinement of items. This grading was adapted from the grading developed by Vaughn et al. [29] to fit the type of studies included in the present review (i.e. methodological studies only) and the constructs assessed.
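The scoring rules above can be expressed as a small decision function, given here as a sketch. The names `Development` and `quality_score` are hypothetical. The rubric as described does not say how to score a study that reports no systematic development process but does report item refinement, so the sketch raises an error for that combination rather than guessing.

```python
from enum import Enum


class Development(Enum):
    """How systematic the reported item development process was."""
    FULLY_SYSTEMATIC = "fully systematic"
    NOT_FULLY_SYSTEMATIC = "not fully systematic"
    NOT_REPORTED = "no systematic process reported"


def quality_score(development: Development, refinement_reported: bool) -> int:
    """Return the 1-4 quality score for item development and refinement."""
    if development is Development.FULLY_SYSTEMATIC:
        return 4 if refinement_reported else 3
    if development is Development.NOT_FULLY_SYSTEMATIC:
        return 3 if refinement_reported else 2
    if not refinement_reported:
        return 1
    # Assumption: this combination is not covered by the rubric as described.
    raise ValueError("combination not covered by the scoring rubric")
```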
Two researchers (MKG, CV) independently extracted data and assessed study quality; discrepancies were resolved through discussion.
Characteristics of measures/instruments assessing availability and accessibility
Table 1 (columns 4-6) shows the characteristics of measures/instruments included. Different types of instruments were used. Over half of the studies included self-report questionnaires; other types of instruments included were: observational tools [36,38,40,43], checklists [35,42], inventories [30,37] and a telephone interview [39]. The number of items included in the final instruments varied significantly between studies. Some studies used broader categories of foods and/or drinks with subcategories although reliability and validity were in some cases reported at the food category level only. For example, availability of different types of fruits was assessed by Hearst et al.; reliability was however reported at the category level (i.e. fruit) [35]. In many of the included studies, included instruments also measured other correlates of dietary behaviors; some also included measurements of the physical activity environment.

Conceptualization of availability and accessibility
With regards to availability, measured in 18 studies, participants were either asked to report whether a certain type of food/drink was present at the time of data collection by selecting options such as yes or no, or by reporting the usual presence of foods. In some studies, the participants were asked whether the food item of interest was served/offered or sold in specified places, e.g. canteens, vending machines etc. In three studies, operational definitions of availability were provided. In the study by Izumi et al., fruits and vegetables were considered to be available if a single portion of the item was present in the store in a ready-to-eat form; for some items, if at least one variety met the criteria, the item was considered available [42]. Nepper et al. defined availability as the existence of food in different locations regardless of whether it is readily visible or accessible to the child [37]. Boles et al. defined availability as whether food is physically located within the home [38]. In the study by Petty et al., the scale assessing availability of fruits and vegetables at home using parental report included items regarding parental consumption [34]. Accessibility was also defined in different manners in the four studies with measures of accessibility. Hearst et al. measured accessibility by exploring whether the food items were present on the kitchen counter or visible when opening the refrigerator door [35]. Nepper et al. used the following definition for accessibility: "a food that is retrievable, ready to eat, or in a location where it is easy for a child to reach it" [37]. In the study by Boles et al. accessibility was defined as whether the child could reach the food [38]. Accessibility was measured in terms of beliefs about family's affordance of food items; ability to eat everything in reasonable amount and ease of access to a variety of food items in the study by Bennaroch et al. [32].

Item development and refinement including quality assessment
Table 1 (column 7) provides details of the item development and refinement for each included study. Many of the included studies provided a clear description of how the items were chosen or developed and followed systematic steps in item development and refinement. Twelve of the included studies received the maximum quality score of 4 for item development and refinement [30-33, 36, 38, 39, 41-43, 46, 48], six studies received a score of 3 [35,40,44,45,47,49] and two studies received a score of 2 [34,37]. Methods used in the development or refinement of items included using or building on available instruments [30, 34, 36-38, 41, 45, 46, 49], literature review [31,32,39,41,42,47], expert opinion [30-33, 36, 38, 39, 46, 48] and use of qualitative methods [31]. Several studies combined two or more of these methods.

Assessment of reliability and validity
Reliability assessment
Table 2 presents the results of reliability analysis in the included studies. Reliability assessment was conducted in 14 studies. Seven studies assessed test-retest reliability [31,33,34,37,44-46]. The gap between the test and retest was 2 weeks in three studies [31,33,34], 1 week in two studies [37,45] and 2-4 weeks in one study [44]. Test-retest reliability was almost perfect in three studies [31,33,34]; it was moderate to substantial for all [44,45] or most [37] items in the other studies. Test-retest reliability was slight to substantial in the study by Ward et al. [46], with most items having intra-class correlation (ICC) < 0.40 for 1-day ICC; 4-day ICC showed moderate to almost perfect reliability except for one item.
Six studies assessed internal consistency reliability [31-34, 37, 44]; an adequate internal consistency was reported in all studies where it was assessed except for 1 item in the study by Ding Ding et al. [44]. Six studies assessed inter-rater reliability [30,36,38,42,43,49]. Inter-rater reliability as measured by kappa revealed substantial agreement for all [30] or most [36,42] of the items included. Percentage agreement was moderate to good-to-excellent for most items in the studies where it was computed [43,49]. Items with inadequate kappa values (< 0.60) were removed in the study by Boles et al. [38].

Validity assessment
All of the studies either provided a clear description of the sample and provided a clear conceptualization of the measure, or provided a clear description of the item development and/or refinement, or used a previously validated measure. Therefore all of the measures were considered as addressing face validity. In addition, nine studies used experts in the development and/or refinement of survey items [30-33, 36, 38, 39, 46, 48]; the measures were therefore considered to have content validity.
Construct validity was assessed in six studies (Table 2). Correlations of the measures with dietary behaviors were assessed in two studies [32,44] and associations were mostly weak (r < 0.30). Petty et al. reported significant associations between the availability measure and dietary behaviors (reported as regression coefficients); convergent validity (comparing response with that of the other parent) was acceptable [34]. Acceptable construct validity was reported by Singh et al. (based on ICC and percentage agreement values) [45]. Dewar et al. explored factorial validity using confirmatory factor analysis and found fit indices that were a good or exact fit of the hypothesized model [31]. Boles et al. assessed the discriminant validity of their measure and found differences between obese and non-obese children for the availability and accessibility of vegetables but not for other included foods [38].
Criterion validity was assessed in ten studies (Table 2). The criterion measure was either direct observation by trained research staff [30,39-41,46] or comparison with responses reported by research staff trained to use the instrument [35,37,48]. In one study, digital photography was used in one sample and direct observation in the second sample [47]. Acceptable validity was achieved for all final items in some studies [30,40,41,48]. In other studies most items achieved adequate criterion validity [35,37,39,46,47].

Responsiveness
None of the included studies included an assessment of the responsiveness of the instrument.

Discussion
The review aimed to summarize the results of methodological studies exploring the psychometric properties of measures of the availability and/or accessibility of food among youth, and to describe differences in the conceptualization of these correlates across studies. A total of 20 studies were identified and only four assessed accessibility. There were differences in the conceptualization of the correlates between studies, in particular accessibility of food. Different assessments of reliability and validity were conducted in the included studies, and most measures were found to have adequate evidence of reliability and/or validity. None of the studies assessed responsiveness.

Samples included
Determination of sample size for studies of validity and reliability assessment has long been a matter of discussion and no strong consensus exists [50,51]. However, the sample size of some of the included studies [35,37] appears small even by the most liberal estimates [22]. Testing these instruments among a larger sample would therefore appear important. It is also important to report information about missing data and participation rate, which was not always the case in the included studies, as these might affect the interpretation of results. Many of the studies included participants or settings with different socioeconomic backgrounds although several studies had fairly homogeneous samples. Two studies focused on low income participants [35,40], another included participants with a predominantly low SEP [30]. Availability and accessibility of food are known to differ by socioeconomic position [52] and groups with low socioeconomic position often bear a disproportionate burden of overweight and obesity. More testing of instruments in specific socioeconomic subgroups or settings, and in particular in low socioeconomic groups and ethnic minorities, is thus needed. Studies, if powered to do so, could also look at similarities or differences between different socioeconomic subgroups within the same study.

Conceptualization of availability and accessibility
There were some variations in how availability was conceptualized in different studies, partly dependent on the setting explored. Availability and accessibility of food can be assessed in all the different environments that youth get exposed to (i.e. home, school or early care settings, stores or restaurants). Availability, which refers to the physical presence of food, should be overall conceptualized in the same manner across these settings. For the home environment, this implies that the food is present somewhere in the home; for schools, stores or restaurants, a food is available if sold or offered in these settings. However, further refinements should be made as appropriate to increase the feasibility of use of the measurement instrument as the type and volume of foods in these different settings vary significantly. In this regard, one of the issues that might arise with the assessment of availability is whether one should count all possible varieties of certain foods or whether there is some acceptable minimum number of varieties [53]. This is particularly relevant when environments such as food stores are assessed and when food types with a large number of categories are assessed. A clear definition of availability should be provided in such cases. There were also variations in the conceptualization of accessibility across studies. Accessibility can cover different aspects [15] including location (e.g. fruits and vegetables put on the table vs. the refrigerator), form (e.g. whole vs. sliced fruits) and cost. There is a need to uniformly define what is meant by accessibility of food in order to compare studies and make conclusions about the most important aspects of accessibility that influence dietary behaviors. For example, visibility has been used as a dimension of accessibility [35]; it has also been defined as a construct by itself, separate from accessibility [37]. Food could indeed be visible but not reachable. 
Even though visibility can increase a child's attention to food, its effect might differ from that of accessibility, which includes food being retrievable and reachable [37], so this separation of constructs appears justified. Applying a theory or framework to identify relevant items to be included in the measurement instruments and conducting thorough analyses (e.g. factor analyses) to refine items when multiple items are used appears important. Another way to develop a thorough conceptualization of these concepts is the use of expert opinion, which was used in several studies, as well as the use of qualitative methods, which can also help generate theory. Because few studies assessed accessibility of food, this review cannot provide further clarity on the conceptualization of accessibility.

Reliability and validity
In most of the included studies, there was satisfactory evidence of internal consistency reliability, which could among other things reflect the thorough process of development of items via the use of literature, expert opinion, theory and previously validated instruments. Evidence of test-retest reliability was also largely satisfactory although some variation between items was observed. Although there are specific cut points for the identification of reliability values that are acceptable, the item and the type of data collected should also be taken into consideration. Parameters of reliability such as the ICC are affected by inter-individual variation, being low in homogeneous samples [22]. This reflects the importance of assessing the ICC of an instrument in the same or comparable population to the one where it will be used [22]. The low reliability values for the measures of the availability of some foods might also indicate factors such as the day-to-day variability in the availability of those foods and not necessarily errors in reporting from the side of participants [46]. In the present review, the time gap between the two administrations for establishing test-retest reliability varied between studies, with a time gap of up to 4 weeks used. Although there is no single recommended time gap for the administration of instruments while assessing test-retest reliability, this time gap can have an effect on reliability estimates.
Previous reviews of measurement instruments of correlates of energy balance-related behaviors among youth have concluded that there was a limited assessment of criterion validity of measures, partly due to a lack of a gold standard [18,23,54,55]. The assessment of reliability and validity is influenced by different factors including the purpose for which the instrument is used. There are two pathways through which the food environment can affect behaviors: through perceptions of the environment or through actual physical characteristics of the environment [20]. The choice of measurement instrument and the type of validity assessed depend on which of these two aspects is explored. Perceptions are commonly measured using self-report instruments (e.g. surveys, interviews) [20]. The objective environment is best assessed via methods such as direct observation or documentation [20], or via self-report instruments validated using methods such as direct observation (i.e. criterion methods). Several of the studies included in this review which developed measurement instruments of the availability of food assessed criterion validity. The criterion measures were direct observation by trained research staff, instruments filled in by trained staff and digital photography. This assessment of validity is very valuable as it reflects the ability of these measures to provide a good assessment of the actual presence (i.e. availability) of foods, and measures with such evidence of validity [30,35,37,39-41,46-48] should be recommended for use in future studies, or for further testing in other samples. However, criterion validity would not be useful when the interest is to explore perceptions. Unlike availability, which reflects the actual presence of foods, accessibility can involve the perception of whether what is available is actually easily accessible.
For example, the assessment of the ease with which food can be consumed as well as the cost related to food, which might be components of food accessibility, are prone to differences in perceptions, making it difficult to use any criterion measure. In these cases, other measures of validity such as construct validity are more relevant, in particular when thorough empirical or theoretical information is used to define hypothesized associations. Construct validity was assessed in some studies included in this review with associations that varied from weak to strong. The weak or absent associations might reflect a genuine lack of association of the measures with the chosen outcome, in particular if the hypothesized associations were not based on thorough evidence or theory; they might also reflect lack of validity of the measures, of the hypothesized determinant (availability or accessibility) or of the outcome.

Responsiveness
Although the present review is limited to methodological studies, none of the included studies measured responsiveness. The lack of assessment of responsiveness is of concern, as evidence about sensitivity to change of such instruments is important, in particular for intervention studies. However, assessment of responsiveness requires time and resources as follow-up of participants is required.
Direct comparison of the psychometric properties of measures of food availability and accessibility within and across reviews is complicated by differences in, for example, the type of measures, number of items, food types included and the age group targeted. Previous reviews have documented the presence of measures of food availability and accessibility among youth with one or more satisfactory psychometric properties tested in specific samples [18][19][20]. The present review similarly identified several systematically developed measures with adequate psychometric properties including several measures with criterion validity. Some of the survey instruments/measures were culturally adapted and new for the specific context they were used in. These include the instrument by Hua et al. used to assess the Chinese food environment [49] and the instrument by Petty et al., which was a translation of the parental mealtime action scale into Portuguese and was used in a Brazilian sample [34]. Hearst et al. similarly refined and validated an existing home food inventory for use in specific cultural groups (Spanish- and Somali-speaking low income families) [35].

Other considerations
Fluctuations in availability that might occur because of factors such as seasonal variations and shopping routines are important to consider when actual availability is assessed (e.g. with inventories). Surveys conducted directly after food purchase will give estimates that are different from surveys conducted at a later time point. This might be more problematic in low-income settings where availability might be even more variable [53]. Different methods can be used to address this problem. One is to measure usual availability or availability over a given period. This was done by Nepper et al., who developed both a one-time assessment tool and a 30-day home environmental survey instrument [37]. Another is to conduct inventories immediately after shopping episodes. Recording items on repeated occasions and adjusting data for the number of days since shopping was last done have also been suggested [19].
Some of the included measures, as evidenced by their number of items (Table 1), are lengthy and might require considerable time for participants to complete; their feasibility might thus be limited for some studies.

Strengths and limitations of review
This review was limited to methodological studies, as the main aim was to include instruments where the process of development as well as validity and reliability evidence were described in detail, and to focus mainly on newly developed instruments. It is however possible that there are measures with at least some evidence of test-retest reliability and/or construct validity that were missed by excluding non-methodological studies. This includes studies that might have used the instruments included in this review. The search strategy may not have captured all relevant articles. The strengths include the systematic process used in the identification of studies, the assessment of quality of processes for instrument development and evaluation, as well as the use of two independent researchers for the extraction of data and quality assessment.
Strengths and weaknesses of available instruments and recommendations for further research
Several of the instruments in the included studies represented improvements on previous instruments; they were developed to increase feasibility of use and to improve use in different population groups as well as different settings (including diverse school types). While self-report questionnaires were predominant, as was also the case with older instruments assessing availability and accessibility of food among youth [18,19], some observational tools were also developed and tested in schools, afterschool programs, stores and at home. While observational tools using trained researchers might provide the most objective assessment of the food environment, they might be costly, and hence using instruments that can be filled in by parents or available personnel at school or daycare is important [46]. There is a strong focus on the home environment in relation to the dietary behaviors of youth [9]. In line with that, methodological studies including measures of home availability and accessibility of food appeared predominant. As the obesogenic environment is being increasingly recognized as a driver of the obesity epidemic, including among youth, more studies addressing availability and accessibility outside the home are called for, although some good measures do already exist. This is particularly true for the neighborhood food environment, such as stores, which might be important sources of unhealthy food, in particular among adolescents. Few studies have focused on accessibility; more methodological studies looking at measures of food accessibility are therefore needed, with a focus on operationalization of the construct in addition to testing psychometric properties. Studies in countries other than the US, in particular in non-Western settings, and studies looking at particular population subgroups are also needed.

Conclusion
This review identified 20 methodological studies exploring the psychometric properties of measures of availability and/or accessibility of food related to youth. The studies included several food items, and most studies followed systematic steps in the development of the measures. More than half of the measures focused on the home environment, and only four studies included measures of accessibility. The majority of the measures had satisfactory evidence of reliability and/or validity. However, there were variations in the conceptualization of the correlates, in particular accessibility. More studies including measures of accessibility and addressing its conceptualization are needed. More testing of some of these measures in different population groups is also warranted, as is the development of measures of food availability and accessibility in the broader environment such as the neighborhood food environment.
Abbreviations
ICC: Intra-class correlation; PRISMA: Preferred Reporting Items for Systematic reviews and Meta-Analyses