This work was approved by the Institutional Review Board of the University of North Carolina (NC) at Chapel Hill.
The sixth largest population of American Indians in the US and the highest concentration of American Indians east of the Mississippi River reside in NC (http://www.doa.state.nc.us/cia/). The 2010 US Census estimates that 122,110 American Indian/Alaskan Native individuals live in NC. The state is home to eight tribes and four urban Indian organizations. Seven of the eight tribes agreed to participate in the American Indian Healthy Eating Project: the Coharie Indian Tribe, Haliwa-Saponi Indian Tribe, Lumbee Tribe of NC, Occaneechi Band of the Saponi Nation, Meherrin Indian Tribe, Sappony, and Waccamaw Siouan Tribe. The one federally recognized tribe in the state, which resides on a reservation, opted out of the study, citing existing local efforts to address healthy eating. We did not examine food access for the four urban Indian organizations in NC because the concentration of American Indians in these four metropolitan areas was low.
The Census uses State Designated Tribal Statistical Areas (SDTSAs) to represent a compact, contiguous area containing a statistically significant concentration of people who identify with a specific recognized tribe without a reservation and/or residing on off-reservation trust land (http://www.census.gov/geo/www/tsap2010/tsap2010_sdtsa.pdf). We used preliminary 2010 SDTSA maps, available in fall 2009, to determine our study areas. Sappony is physically located in NC and is recognized as a tribe in this state. Sappony is also physically located in Virginia, but the state of Virginia has yet to recognize the tribe and Sappony does not have an SDTSA in Virginia. Therefore, for the data validation component of the study, we did not include food data gathered for Sappony in Virginia.
Using ArcGIS 9.3.1, we overlaid ZIP Code and county boundaries with SDTSA boundaries to identify NC ZIP Codes and counties that intersected or were co-located with the SDTSAs. ZIP Codes (n=78) and counties (n=21) co-located with the seven SDTSAs were used to gather information by tribe on food outlets from one free online directory (online Yellow Pages), two government sources (county health departments and the state agriculture department), and two commercial sources (ReferenceUSA and Dun & Bradstreet).
Our protocol for gathering information from online Yellow Pages was to enter “food” into the search box labeled “find” for each ZIP Code co-locating with each SDTSA. Only outlets physically located within our ZIP Code of interest were included. Food outlets listed in the following categories were included initially, and then phone and Internet searches were used to establish that each outlet sold food to the public: canners & food processors, convenience stores, fast food restaurants, food and beverage consultants, food banks, food delivery service, food facilities consultants, food processing and manufacturing, food processing equipment and supplies, food products, food products-wholesale, food service management, frozen food locker plants, frozen food, frozen food-wholesale, fruit and vegetable-wholesale, fruit and vegetable markets, grocers-ethnic foods, grocers-specialty foods, grocers-wholesale, grocery stores, health and diet food products, health and diet food products-wholesale, health food restaurants, Mexican food products, natural food, nuts-edible, restaurants, soul food restaurants, and vitamins and food supplements.
For local county health department food inspection listings, all co-locating NC counties (n=21) were called in fall 2009. All 21 counties mailed, emailed, or faxed free copies of their latest inspection lists or directed us to a website where their local food inspection data could be accessed and downloaded for free. Food outlets listed in the following categories were included initially, and phone and Internet searches were used to establish that each outlet sold food to the public: food stands, meat markets, mobile food units, pushcarts, and restaurants. For the NC Department of Agriculture and Consumer Services food inspection listings, the Department provided us in December 2009 with an up-to-date listing of all food establishments it inspects within all co-locating NC counties (n=21). Food outlets listed in the following categories were included initially, and phone and Internet searches were used to establish that each outlet sold food to the public: bakeries, farmers’ markets, and stores with packaged goods sold to the public.
Using our university’s e-research tools, we accessed ReferenceUSA. We conducted a custom search for our selected NAICS codes within all co-locating NC ZIP Codes (n=78) and gathered all matching outlets by ZIP Code. The outlets identified through this search were reviewed and sorted to flag or eliminate questionable food outlets and to delete duplicates. Food outlets listed under the following NAICS codes were included initially, and phone and Internet searches were used to establish that each outlet sold food to the public: 445 Food and Beverage Stores, 4451 Grocery Stores, 445110 Supermarkets and Other Grocery (except Convenience) Stores, 445120 Convenience Stores, 4452 Specialty Food Stores, 445210 Meat Markets, 445220 Fish and Seafood Markets, 445230 Fruit and Vegetable Markets, 445291 Baked Goods Stores, 445292 Confectionery and Nut Stores, 445299 All Other Specialty Food Stores, 447 Gasoline Stations, 447110 Gasoline Stations with Convenience Stores, 72 Accommodation and Food Services, 722 Food Services and Drinking Places, 7221 Full-Service Restaurants, 722110 Full-Service Restaurants, 7222 Limited-Service Eating Places, 722211 Limited-Service Restaurants, 722212 Cafeterias, Grill Buffets, and Buffets, 722213 Snack and Nonalcoholic Beverage Bars, 4529 Other General Merchandise Stores, 452910 Warehouse Clubs and Supercenters, 452990 All Other General Merchandise Stores, 452112 Discount Department Stores, and 446110 Pharmacies and Drug Stores. Using resources from the NC Department of Commerce, Economic Development Intelligence Systems, we accessed Dun & Bradstreet without charge. We conducted a custom search for our selected NAICS codes within all co-locating NC counties (n=21) and gathered all matching outlets by county. Food outlets listed under the same NAICS codes noted above for ReferenceUSA were included initially. Phone and Internet searches were used to establish that each outlet sold food to the public.
Our general approach was to include any food outlet open and regularly selling publicly accessible food. For each food outlet, we gathered the name, address, city, state, ZIP Code, and phone number. We tracked discrepancies, such as differing names and addresses for outlets determined through phone calls and Internet searches to be the same. Each outlet was viewed in Google Street View, and any differences in name, address, and open/closed status were documented, and then verified through phone calls when possible. We separated conjoined outlets such as KFC/Taco Bell into two outlets. We noted that an outlet was closed if we could verify this in the field, through a phone call with the county health inspector, or a phone call with a new food outlet operating at or near the closed outlet’s location.
Intra-reliability was assessed by comparing the name, address, city, and ZIP Code of all food outlets gathered for four ZIP Codes against each other (n=110; 3% of the final number of secondary food outlets). The listings for these four ZIP Codes, which were co-located with two tribes, were compared before they were reconciled into one list per ZIP Code. Then, four reviewers (SF, GR, DS, AR) identified duplicates and non-food sources. Any outlet identified as questionable by the four reviewers was examined further before it was eliminated as a true duplicate or non-food source, or combined and modified to reflect the most accurate name, address, city, state, and ZIP Code available through the phone, online, and community verification processes. Any outlet that was combined with another food outlet, modified, or edited was tracked separately, and these changes were recorded by data source and type of change. For example, if Dun & Bradstreet named a food outlet at 123 Jones Street a McDonald’s while InfoUSA identified a Burger King at a similar address, and phone calls or field observations showed that both data sources referred to the same fast food outlet currently operating as a McDonald’s at 124 Jones Street, then the two listings were combined into one food retail listing, and the edits made to combine them were counted as edits to the secondary data sources. These combinations were not considered “true duplicates”, which we defined as outlets with the same exact name and address. Additional file 1 provides further details on our protocol development for each of the secondary data sources, our secondary data editing steps, and our inter-rater reliability procedures.
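The distinction between "true duplicates" (same exact name and address) and listings that needed manual reconciliation can be illustrated with a minimal sketch. The record layout, field names, and sample listings below are hypothetical and are not taken from the study data; the study's own procedure also relied on reviewer judgment, which this sketch does not attempt to reproduce.

```python
from collections import defaultdict

def find_true_duplicates(records):
    """Group listings that share the same exact name and address.

    `records` is a list of dicts with hypothetical keys
    'name', 'address', and 'source'.
    """
    groups = defaultdict(list)
    for rec in records:
        key = (rec["name"], rec["address"])  # exact string match only
        groups[key].append(rec["source"])
    # Keep only name/address pairs that appear more than once.
    return {key: srcs for key, srcs in groups.items() if len(srcs) > 1}

# Hypothetical listings for illustration.
listings = [
    {"name": "McDonald's", "address": "124 Jones Street", "source": "Source A"},
    {"name": "McDonald's", "address": "124 Jones Street", "source": "Source B"},
    {"name": "Burger King", "address": "123 Jones Street", "source": "Source C"},
]

true_duplicates = find_true_duplicates(listings)
print(true_duplicates)
```

Listings that differ in name or address, such as the third record here, are not flagged and would instead go to manual review.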
In ArcGIS (Esri, Redlands, CA), we used the addresses from secondary data sources and the 2009 TIGER/Line roads data from the Census Bureau to geocode the food sources identified by secondary data (n=3389). The geocoding process assigned geographic coordinates to addresses by matching them with a geospatial database. We were able to geocode 2816 of the 3389 outlets identified (83%). For the remaining unmatched outlets (n=573), we used the Excel Geocoding tool v3.1 from Juice Analytics (http://www.juiceanalytics.com/) and found 336 address-level precision geocodes. We were unable to geocode 237 outlets at the address-level using either geocoding tool. Ultimately, 3152 outlets out of 3389 outlets (93%) were geocoded and included in the analysis.
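The geocoding counts reported above can be checked with simple arithmetic. The sketch below uses only the counts stated in the text; the function and variable names are illustrative.

```python
def geocoding_summary(total, first_pass, second_pass):
    """Summarize a two-pass geocoding workflow from raw counts."""
    unmatched_after_first = total - first_pass
    unmatched_after_second = unmatched_after_first - second_pass
    geocoded = first_pass + second_pass
    return {
        "unmatched_after_first": unmatched_after_first,
        "unmatched_after_second": unmatched_after_second,
        "geocoded": geocoded,
        "first_pass_rate": first_pass / total,
        "final_rate": geocoded / total,
    }

# Counts reported in the text.
summary = geocoding_summary(total=3389, first_pass=2816, second_pass=336)

for key, value in summary.items():
    # Rates are fractions; counts are integers.
    print(key, f"{value:.1%}" if isinstance(value, float) else value)
```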
To directly observe the food environment, we developed a ground-truthing protocol to drive all roads and streets in each SDTSA (Additional file 2). The Census 2009 TIGER/Line roads data have been shown to be reliable. These road data were used to calculate the road mileage in each SDTSA and to create a map of the roads to ground-truth in each SDTSA. The Lumbee Tribe of NC encompasses over 6000 miles, so we worked with the Lumbee Tribal Council and consulted with a demographer to focus on ground-truthing the largest US Census-Designated Place (CDP) in this tribe’s SDTSA with 75% or more American Indian residents (i.e., Lumberton, NC), along with another CDP with 75% or more American Indian residents that is considered the “heart” of the tribe, where all tribal government and services are located (i.e., Pembroke, NC).
The following types of roads were not driven: private roads, roads in industrial parks, unpaved roads, and residential roads such as those in apartment complexes, residential subdivisions, condominium complexes, and trailer parks. Roads not illustrated on the map but within the SDTSA, while few, were driven and documented by name, and their relative location was noted on the ground-truthing master map. GPS assisted in identifying a few unlabeled or unidentified roads while in the field. Usually, these new roads were small residential blocks without any food outlets located on them.
We collected the latitude and longitude of each food outlet, completed a short survey of the outlet’s location and food classification, and used photography to help capture the outlet’s location and food classification. Outlets that appeared closed or had signs indicating that they were under renovation or coming soon were also captured. We determined whether these stores were in business through Internet searches, phone calls, re-visiting the area, or during the inter-rater reliability testing. Primary data collection was conducted from February through June 2010. Two independent research assistants (JSR, DS) conducted an inter-rater reliability process of our ground-truth protocol in September-October 2010 by driving 10% of all roads within the SDTSA for six of the tribes and 10% of all roads within Lumberton. GPS data were uploaded into Google Earth and then converted to a shapefile in ArcGIS using the Arc2Earth extension. A distance of 1600 meters was used to compare the outlets identified during the inter-rater process to the outlets identified during the primary ground-truthing data collection. Matches were determined by name. Minor reconciliations were made to differences in names between primary ground-truthed and inter-rater reliability data.
Categorizing the food outlets
Food outlet types identified through both secondary data and ground-truthing were consolidated into six categories: (1) convenience stores, (2) general merchandise stores (e.g., dollar stores and discount department stores, such as Kmart, Target, and Wal-Mart, without a full grocery section), (3) grocery stores, (4) specialty markets & shops (e.g., meat markets, produce stands, bakeries, donut shops, and ice cream shops), (5) restaurants (e.g., fast food, full-service, and coffee shops), and (6) food banks and community gardens. To assist in classifying the secondary data, Internet searches were conducted, phone calls were made to questionable outlets, and experiential knowledge was utilized. During ground-truthing, information to classify chain food outlets was generally gathered from outside the food outlet; for non-chain food outlets, researchers generally went into the outlet and asked a store employee about the foods sold and, for restaurants, the type of service provided. For some convenience stores in rural areas, researchers asked if gas was currently sold at the location.
To classify food outlets identified through secondary data sources or ground-truthing, we modified the Nutrition Environment Measurement Survey (NEMS) food store and restaurant classification codes [34, 35]. We used “other” to capture outlets not easily described with our modified NEMS codes. For restaurants, we used one or more of the following to describe the type of service provided: fast food restaurant (e.g., limited service, counter-only, McDonald’s); fast-casual restaurant (e.g., order at counter but delivered to your table, Corner Bakery); full-service restaurant (e.g., waiter comes to your table and takes your order); buffet-style restaurant (e.g., all you can eat buffet option); banquet (e.g., weddings, special events); catering (e.g., bring food to you); delivery (e.g., pizza); and to-go or drive-thru (e.g., pick up and go). Additional file 2 provides the complete list of food codes used in our study and also explains other approaches we used to classify the food outlets [13, 34, 35]. Inter-rater reliability for classifying all food outlets identified through secondary data sources and through ground-truthing was assessed as the percent agreement between two raters, for both our modified NEMS codes and the six-category food classification system used for statistical analyses.
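Percent agreement between two raters is the share of outlets to which both raters assigned the same category. The sketch below is a minimal illustration; the category labels and ratings are hypothetical and do not come from the study.

```python
def percent_agreement(rater_a, rater_b):
    """Percent of items on which two raters assigned the same code."""
    if len(rater_a) != len(rater_b):
        raise ValueError("Both raters must classify the same outlets.")
    if not rater_a:
        raise ValueError("At least one outlet is required.")
    agreements = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * agreements / len(rater_a)

# Hypothetical classifications of four outlets by two raters.
rater_1 = ["grocery", "restaurant", "convenience", "restaurant"]
rater_2 = ["grocery", "restaurant", "grocery", "restaurant"]

agreement = percent_agreement(rater_1, rater_2)
print(f"{agreement:.1f}% agreement")
```

Percent agreement does not correct for agreement expected by chance, which is one reason chance-corrected statistics such as kappa are often reported alongside it.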
Categorizing the level of urbanization
Using 2000 Rural–Urban Commuting Area (RUCA) codes obtained from the US Department of Agriculture, each outlet identified was categorized by its ZIP Code. Similar to other consolidations [19, 37], the 10-tiered RUCA system was consolidated into four levels: urban (RUCA 1), sub-urban (RUCA 2), large town (RUCA 3), and small town/rural (RUCA 4).
Matching ground-truthed data to secondary data
The ground-truthed and secondary data were merged into a single file. The point distance tool in ArcGIS was used to calculate the distance between all outlets identified in secondary data within 1600 meters of outlets identified in ground-truthed data. Internet searches and phone calls were made to confirm matches for convenience stores, diners, and smaller, non-chain venues that were questionably similar but not exact matches in name or relative distance. We also explored possible matches with secondary data that did not geocode or were not within 1600 meters of the ground-truthed outlet. In ArcGIS, we used the select-by-location tool to identify outlets that fell within the boundaries of the six SDTSAs and the two CDPs examined, excluding secondary data outlets identified outside of the SDTSA.
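The 1600-meter candidate search can be approximated outside a GIS. The sketch below is not the ArcGIS point distance tool: it assumes unprojected latitude/longitude coordinates and uses the haversine great-circle formula, and the outlet names and coordinates are hypothetical. As in the text, distance only produces candidates; confirming a match still requires comparing names and, where needed, follow-up checks.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in meters

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def candidates_within(ground, secondary, max_m=1600.0):
    """For each ground-truthed outlet, list secondary outlets within max_m."""
    result = {}
    for g in ground:
        near = []
        for s in secondary:
            d = haversine_m(g["lat"], g["lon"], s["lat"], s["lon"])
            if d <= max_m:
                near.append((s["name"], d))
        # Nearest candidates first.
        result[g["name"]] = sorted(near, key=lambda pair: pair[1])
    return result

# Hypothetical outlets for illustration.
ground = [{"name": "Ground outlet A", "lat": 34.680, "lon": -79.190}]
secondary = [
    {"name": "Listing near", "lat": 34.685, "lon": -79.190},  # about 0.56 km away
    {"name": "Listing far", "lat": 34.700, "lon": -79.190},   # about 2.2 km away
]

matches = candidates_within(ground, secondary)
print(matches)
```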
Sensitivity, kappa, positive predictive value (PPV), and concordance were calculated to assess the validity of secondary data sources. These were interpreted using the Landis and Koch criteria (<0.00 poor, 0.00-0.20 slight, 0.21-0.40 fair, 0.41-0.60 moderate, 0.61-0.80 substantial, and 0.81-1.00 almost perfect). Sensitivity was calculated as the proportion of ground-truthed outlets that matched secondary data outlets, that is, the number of matched ground-truthed outlets divided by the sum of matched and unmatched ground-truthed outlets. PPV was calculated as the proportion of the establishments listed by the secondary data sources that were observed on the ground. Concordance was calculated as the proportion of the establishments both observed on the ground and listed by the secondary data sources among all the establishments either observed on the ground or listed. We calculated 95% confidence intervals for each of these proportions by approximating the binomial distribution with a normal distribution. Analyses were conducted using SAS software (version 9.2; SAS Institute, Inc., Cary, NC).
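The three proportions defined above, with normal-approximation (Wald) 95% confidence intervals, can be sketched as follows. The analyses in the study were run in SAS; this Python version is only an illustration of the formulas, the counts are hypothetical, and kappa is omitted because its computation is not spelled out in the text.

```python
import math

def wald_ci(p, n, z=1.96):
    """Normal-approximation confidence interval for a binomial proportion."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)

def validity_measures(matched, ground_only, secondary_only):
    """Sensitivity, PPV, and concordance from three outlet counts.

    matched        : outlets both observed on the ground and listed
    ground_only    : outlets observed on the ground but not listed
    secondary_only : outlets listed but not observed on the ground
    """
    denominators = {
        "sensitivity": matched + ground_only,
        "ppv": matched + secondary_only,
        "concordance": matched + ground_only + secondary_only,
    }
    results = {}
    for label, n in denominators.items():
        p = matched / n
        lower, upper = wald_ci(p, n)
        results[label] = {"estimate": p, "ci_lower": lower, "ci_upper": upper}
    return results

# Hypothetical counts for illustration only.
measures = validity_measures(matched=80, ground_only=20, secondary_only=60)
for label, vals in measures.items():
    print(label, {k: round(v, 3) for k, v in vals.items()})
```

The Wald interval is simple but can behave poorly when a proportion is near 0 or 1 or when counts are small; the bounds are clipped to [0, 1] here for that reason.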