Systematic review of control groups in nutrition education intervention research

Background: Well-designed research trials are critical for determining the efficacy and effectiveness of nutrition education interventions. To determine whether behavioral and/or cognition changes can be attributed to an intervention, the experimental design must include a control or comparison condition against which outcomes from the experimental group can be compared. Despite the impact different types of control groups can have on study outcomes, the treatment provided to participants in the control condition has received limited attention in the literature.
Methods: A systematic review of control groups in nutrition education interventions was conducted to better understand how control conditions are described in peer-reviewed journal articles compared with experimental conditions. To be included in the systematic review, articles had to be indexed in CINAHL, PubMed, PsycINFO, WoS, and/or ERIC and report primary research findings of controlled nutrition education intervention trials conducted in the United States with free-living consumer populations and published in English between January 2005 and December 2015. Key elements extracted during data collection included treatment provided to the experimental and control groups (e.g., overall intervention content, tailoring methods, delivery mode, format, duration, setting, and session descriptions, and procedures for standardization, fidelity of implementation, and blinding); rationale for control group type selected; sample size and attrition; and theoretical foundation.
Results: The search yielded 43 publications; about one-third of these had an inactive control condition, which is considered a weak study design. Nearly two-thirds of reviewed studies had an active control condition, which is considered a stronger research design; however, many failed to report one or more key elements of the intervention, especially for the control condition. None of the experimental and control group treatments were sufficiently detailed to permit replication of the nutrition education interventions studied.
Conclusions: Findings advocate for improved intervention study design and more complete reporting of nutrition education interventions.
Electronic supplementary material: The online version of this article (doi:10.1186/s12966-017-0546-3) contains supplementary material, which is available to authorized users.


Background
A major goal of nutrition education research is to elucidate factors that enable individuals to improve diet-related behaviors and/or cognitions associated with better health and greater longevity. These factors can then be incorporated in educational and health promotion interventions which, in turn, can be evaluated to determine whether the intervention effects change behaviors and/or cognitions among those assigned to the intervention vs. those in a control condition.
Well-designed research trials are critical for determining the efficacy and effectiveness of new interventions [1]. The basic components of educational research intervention trials include experimental variables, such as a novel curriculum; strong, measurable research questions or hypotheses; valid and reliable instruments for documenting change in behavior and/or cognitions; a strong data analysis plan; and an experimental design that minimizes threats to internal validity. To determine whether behavioral and/or cognition changes can be attributed to the intervention, the experimental design must include a control or comparison condition against which outcomes from the experimental group can be compared [2-5]. The randomized controlled trial (RCT) is typically considered the "gold standard" for ascertaining intervention efficacy and effectiveness [2].
Experts emphasize that to robustly minimize biases and variability of factors that may influence intervention trial outcomes, the control and experimental conditions must: 1) contain randomly assigned participants; 2) occur simultaneously to ensure both conditions experience the same history (i.e., external events, such as political change, natural disasters, scientific discoveries) and maturation (i.e., internal events, such as physical growth, memory decline with aging); 3) be structurally equivalent on as many non-specific factors as possible (i.e., factors other than the "active" ingredients in the experimental condition, such as participant time commitment, format and timeline of activities and data collection, and extent of attention and support from research staff) [5]; and 4) offer equal value, attractiveness, credibility, and outcome expectations to keep participants blind to their condition assignment and thereby avoid novelty effects, differential dropout rates, disappointment arising from assignment to the control group, and/or efforts by control group participants to seek an alternate source of the treatment offered to the experimental group [1,3,4,6-16]. The control condition also must not modify the intervention's specific factors (i.e., behavior and/or cognitions targeted in the experimental condition) [4,7].
To reduce the risk of a Type 1 error (acceptance of an ineffective intervention) [1,9,17], treatment received by control condition participants should differ from that received in the experimental condition only in the non-receipt of the "active ingredient" of the intervention hypothesized to affect study outcomes [4,6]. Rigorous control of non-specific factors, however, tends to increase intervention research costs because a plausible control intervention must be developed and implemented. Additionally, as the stringency of control exerted over non-specific factors increases, the risk of understating the effectiveness of the intervention rises because effect size is inversely associated with rigor of non-specific factor control [9,17-19]. Therefore, when the control and experimental group treatments are structurally equivalent, larger sample sizes are needed to detect treatment effects and avoid Type 2 errors (failure to recognize that an intervention is effective) than when a less equivalent control treatment is used [1,9,17].
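The link between a smaller expected effect size and a larger required sample can be illustrated with a standard normal-approximation power calculation for a two-group comparison of means. The sketch below is illustrative only: the function name `n_per_group`, the effect sizes (Cohen's d of 0.8, 0.5, and 0.2), the 5% two-sided alpha, and the 80% power are assumptions chosen for the example and are not values reported by the reviewed studies.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per group for a two-sided comparison
    of two group means, using the normal approximation:
        n = 2 * ((z_(1 - alpha/2) + z_power) / d) ** 2
    effect_size is a standardized mean difference (Cohen's d)."""
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value guarding against Type 1 error
    z_power = z.inv_cdf(power)          # quantile guarding against Type 2 error
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Hypothetical effect sizes: the tighter the control of non-specific factors,
# the smaller the difference expected between groups.
for d in (0.8, 0.5, 0.2):
    print(f"d = {d}: about {n_per_group(d)} participants per group")
```

Under these assumptions, halving the expected effect size roughly quadruples the number of participants needed per group, which is why structurally equivalent control conditions raise recruitment demands.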
A key challenge to nutrition education researchers is selecting a suitable treatment for the control condition that is congruent with the research question, study resources, availability of standard treatment/usual care, and ethical considerations [7,9,10,12,20,21]. Control condition participants may receive treatment ranging from nothing at all to extensive treatment in an alternate "active" control condition unrelated to the experimental condition. As indicated in Table 1, the type of control condition selected can have important effects on study resources, participants, internal validity, and outcomes. For instance, resource investment in the treatment for the control condition can range from zero for the inactive control to considerable for active control. Ethical issues may be more highly problematic in inactive control conditions when participants in need of the intervention are denied treatment, but ethical issues are lessened when a standard or usual treatment can be offered. Preventing disappointed control group participants from seeking alternate sources of the treatment may not be possible, which weakens internal validity and undermines a true evaluation of the intervention's effect. Even in active control conditions where participants receive a contemporaneous intervention equal to the treatment condition in all aspects, except the "active ingredient", researchers may inadvertently treat control participants differently. Those delivering the intervention (e.g., research staff, educators) also may dislike being in the control condition [22] and seek opportunities to provide participants with treatment like that being given to the experimental group.
Clearly, the efficacy and "effectiveness of the experimental condition inherently depends as much on the control condition as on the experimental condition" [1],p.276. Despite the impact different types of control groups can have on study outcomes [23], the treatment provided to participants in the control condition has received limited attention in the literature [1,7,12,17,20,24-26] and sometimes is not even described in research designs [27,28]; yet in the words of Mohr et al. with regard to psychological interventions, "inappropriate control conditions can overestimate the effectiveness of a treatment, or kill off a potentially useful treatment" [1],p.283. Thus, a systematic review of control groups in nutrition education interventions was conducted with the goal of better understanding how control conditions are described in peer-reviewed primary outcomes journal articles in comparison with experimental conditions. An additional goal of this investigation is to open discussions among colleagues as to how best to improve reporting of control and experimental condition treatments in intervention evaluation studies to facilitate advancement of the field.

Methods
A systematic literature search was conducted after review of guidance from the Nutrition Education Systematic Review Project [29]. The study team then identified databases to use in the systematic review, search terms, and inclusion and exclusion criteria.

Table 1 Control Condition Treatments
Inactive Control: Control group receives no comparison treatment at all during the study or receives treatment after the study ends.
(+) No resource input for control condition development [footnote a].
(+) Increased likelihood of yielding large effect size because least likely to change targeted cognitions or behaviors or these may worsen without treatment [1].
(−) Potential to overstate outcome of intervention because nearly all interventions are more effective in changing outcomes than simple passage of time [6,85].
(−) Increased risk of control group refusal to participate.
(−) Increased risk of attrition and/or seeking alternate source of treatment during the waiting period [10].
No Treatment Control: Control participants receive no treatment.
Additional Points:
(−) Ethical issues when depriving a group in need of intervention of help when a suitable standard treatment/usual care is available; ethical problem lessens when no immediate risks (e.g., disease treatment) [20,90].
(−) Vulnerable to treatment fidelity issues (temptation of research staff/clinicians to offer some treatment to needy participant) [1].
Wait-list (delayed treatment) Control: Closely related to no treatment control; control participants wait until the study concludes to receive treatment. During waiting period, wait-list control participants may receive standard treatment/usual care, which may impact study outcomes.
Additional Points:
(+) No additional input for control condition development, but implementation costs must be considered.
(+) All participants receive the "active ingredient" treatment.
(−) Ethical issues lessened unless control group is in immediate need of treatment and available standard treatment/usual care not provided.
Active Control: Control group receives a different treatment contemporaneously with the experimental group. [4,17]
(+) Considered a strong design [7].
(+) If control group is given a bona fide treatment, possibility of ethical issues diminished.
(+) Controlling non-specific treatment effects (e.g., participant burden, activity, and data collection format and scheduling, attention from researchers) [85] minimizes threats to internal validity and permits effects of the intervention to be more accurately attributed to the "active ingredient" hypothesized to affect the dependent variables [7].
(−) Creating a credible control treatment that is equally preferred by participants is difficult [20].
(−) Detrimental effects may occur if active control treatments lead to inaccurate conclusions about their personal health or other conditions and/or lack of action to improve a health or other condition. See Pagoto et al. [91] for more detailed discussion.
Usual or Standard Treatment: Control participants receive a treatment that is typically offered.
Additional Points:
(+) Limited additional resource input for control condition development.
(+) Provides opportunity to investigate whether new intervention is superior to existing treatment.
(−) Non-specific treatment effects likely different from intervention (e.g., differs in frequency of contact, type of intervention [e.g., passive vs active], time commitment, and/or provider qualifications, experience, and/or researcher/clinician allegiance to the protocol) [6]. For instance, if the experimental condition requires greater effort, experimental group completers likely will be more motivated than control participants and confound results [6].
(−) "Usual" treatment interventions often do not exist in nutrition education, negating this as an option.
(−) "Usual" treatment intervention components often insufficiently described (e.g., in peer-reviewed articles or implementation manuals) to permit comparison by external reviewers [1,6,25].
(−) Often no verification of fidelity of usual treatment to protocol implementation (e.g., process evaluation, manual, or oversight of providers) [1].
(−) Lack of equipoise (sincere uncertainty of whether intervention will be beneficial over usual practices) may affect research staff interactions with participants [1] (detailed implementation manuals, frequent process evaluation, and strong supervision can mitigate this) [1].
(−) Research staff personality differences and variations (even inadvertently) affect their behavior toward and expectations of control vs. experimental participants [6].
(−) Comparing "usual" practices to experimental condition is reasonable only if experimental participants are blind to the novelty of the experimental condition [2].

Search strategies
Search strategies were formulated according to the PRISMA guidelines [30]. Subject headings or search terms unique to each database were identified and searched in combination with keywords derived from the major concepts of "nutrition education intervention" and "control groups" or "study design". Table 2 shows the final search strategy for the selected databases (i.e., CINAHL, PubMed, PsycINFO, WoS, and ERIC). Searches were conducted in winter 2016. To be included in the systematic review, the articles had to report primary research findings of controlled nutrition education intervention trials from peer-reviewed journals. Included studies could address content other than nutrition, but nutrition had to be a key component. Additionally, included interventions had to focus on health promotion and disease prevention and have an education component. Inclusion criteria also required that interventions consist of more than one session and be conducted in the United States with free-living consumer populations. All included articles were published in English between January 2005 and December 2015. In cases where more than one article from the same study was located, only the primary outcomes paper was included in the review to prevent overrepresentation of the type of control group used.
Excluded articles were studies reporting pilot, feasibility, cross-sectional, follow-up, or secondary analysis findings and those lacking a control or comparison group. Studies that focused on weight loss or disease management/treatment and those lacking an education component (e.g., those solely manipulating environmental factors) also were excluded. Additionally, all studies targeting professionals (e.g., health care, child care) or individuals recruited due to a pre-existing disease, such as diabetes, eating disorders, and obesity, or hospitalization, were excluded.

Data management
Citations for the 1164 articles returned by the systematic literature search were entered in a citation management tool (Fig. 1). After removal of duplicates (n = 46) and publications that were not complete primary research articles (e.g., commentaries, viewpoints, editorials, letters, survey studies, abstracts, review articles, n = 50), two members of the study team independently conducted an initial screening of all article titles to identify those congruent with the study purpose. The title review yielded 195 articles that appeared to meet inclusion criteria. Next, article abstracts were independently reviewed by the same team members and 83 were identified as congruent with study purposes. Four team members scanned the articles and identified 53 articles meeting inclusion criteria. During data extraction, 10 additional articles were eliminated because they did not meet inclusion criteria, thereby yielding a total of 43 reviewed articles.

Table 1 Control Condition Treatments (Continued)
Alternative Active Treatment: Control group receives an alternative treatment equal in non-specific treatment effects (e.g., participant burden, activity, and data collection format and scheduling, attention from researchers) to the experimental group and differs only in the non-receipt of the "active ingredient" of the intervention hypothesized to affect the dependent variables (e.g., only the subject matter content of the intervention differs). [4,6]
Additional Points:
(+) Controls for non-specific treatment effects enhances ability to ascribe efficacy to the experimental treatment [7].
(−) Control treatment components often insufficiently described (e.g., in peer-reviewed articles or implementation manuals) to permit comparison by external reviewers [1,6].
(−) Often no verification of fidelity to protocol implementation (e.g., process evaluation, manual, or oversight of providers) for either experimental or control groups [1].
(−) Research staff personality differences and variations (even inadvertently) affect their behavior toward and expectations of control vs. experimental participants [6].
(−) Comparing control and experimental condition is reasonable only if both are blind to treatment group assignment [2,92].
(−) Additional resource input for control condition development; using alternative active treatment when the effect of attention on participant outcome is unknown may be an unnecessary expense.
(−) Rigorous control of non-specific treatment tends to contribute to study effects (i.e., control participant improvement), thus larger sample sizes or an increased risk of Type 1 error (e.g., p-level set higher than typical <0.05) is needed to prevent erroneously rejecting effective interventions as ineffective and to detect potentially small yet clinically important effect sizes [1,9,13,17,24].
Dismantling (or Additive) Component Attention Control: Typically used with a multi-part intervention where the individual parts are separated to identify which are most salient to the outcomes (often with the goal of increasing cost-effectiveness by paring down intervention parts). [7] Example: study of the effectiveness of a self-instructional guide accompanied by telephone counseling compared to the guide alone.
Additional Points:
(+) Method is well suited if "usual" care is effective and desire is to improve on it; also overcomes ethical issue of denying treatment to those in need [93].
(−) Adequate sample size needed for each part of the multi-part intervention [1].
(−) Outcomes may be confounded if effect is due to differing exposure levels rather than the added component itself [24].
(−) Lower statistical power if added parts have small effect compared to existing intervention [7].
(−) Lack of equipoise (genuine uncertainty of whether individual intervention parts will be beneficial alone and/or better than usual practices) may affect research staff interactions with participants [1] (detailed implementation manuals, frequent process evaluation, and strong supervision can mitigate this) [1].
a Resources include time investment by participant and/or researcher, money, and research staff expertise
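The screening counts reported in the data management paragraph can be tallied stage by stage, as in the minimal sketch below. The stage labels and the function name `screening_flow` are illustrative assumptions; the number of titles screened (1068) is derived here by subtraction and is not stated in the article.

```python
# Counts reported in the article.
retrieved = 1164
removed_before_screening = {
    "duplicates": 46,
    "not complete primary research articles": 50,
}

# Articles retained after each successive screening stage, as reported.
retained_by_stage = [
    ("title review", 195),
    ("abstract review", 83),
    ("article scan", 53),
    ("data extraction", 43),
]

# Derived value (an inference, not a figure reported in the article).
titles_screened = retrieved - sum(removed_before_screening.values())


def screening_flow(start, stages):
    """Return (stage, retained, excluded) tuples for a sequence of screening stages."""
    flow = []
    remaining = start
    for stage, retained in stages:
        excluded = remaining - retained
        if excluded < 0:
            raise ValueError(f"{stage}: retained count exceeds articles entering the stage")
        flow.append((stage, retained, excluded))
        remaining = retained
    return flow


flow = screening_flow(titles_screened, retained_by_stage)
print(f"titles screened: {titles_screened}")
for stage, retained, excluded in flow:
    print(f"{stage}: retained {retained}, excluded {excluded}")
```

The final stage reproduces the 10 articles eliminated during data extraction and the 43 articles reviewed.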

Data collection and analysis
After scrutinizing guidance from the Nutrition Education Systematic Review Project [29] and Cochrane Collaboration [31,32] as well as previously published systematic reviews [33-35], data extraction tables were designed by the study team. These tables were iteratively pilot-tested and refined.
Data were extracted by one team member and independently checked for accuracy by two other team members. As shown in Table 3, the factors extracted included treatment provided to the experimental and control groups, overall intervention content, procedures used to tailor the intervention to participants, intervention delivery mode (e.g., group, individual), intervention format (e.g., curriculum, website, brochure) and duration, intervention setting, individual intervention session description (e.g., number of sessions or interactions, session duration, session frequency, content of each session, time allotment for each session component, overall duration of the intervention), procedures for standardizing intervention across multiple sites/practitioners, procedures for assessing fidelity of implementation across multiple sites, and procedures for blinding (masking) participants and/or intervention staff to participant group assignment, rationale for control group type selected, as well as sample size, attrition rate, and theoretical foundation. The goal of data extraction was to document the explicit presence or absence of each factor reported in the article. Additionally, only the 43 articles identified in the search were reviewed; extracting additional data from bibliographical references to previous developmental work cited in articles was beyond the scope of this study. A written narrative describing the treatment groups was prepared for each study.
Extraction tables were content analyzed by team members to identify themes used to prepare a narrative synthesis of findings.

Table 2 Final search strategy
Web of Science: #1 "Nutrition Education" OR "Nutrition Instruction" OR "Nutrition Intervention" AND #2 "Control Groups" OR "Research Design" OR "Quasi experimental Design"
ERIC via EBSCO: #1 "Nutrition Instruction" OR ("Nutrition" AND "Education") OR ("Nutrition" AND "Instruction") AND #2 "Control Groups" OR "Research Design" OR "Quasiexperimental Design" OR "Quasi experimental Design"
a Search results were limited to English and publication from January 2005 to December 2015

Results
The treatments provided to the experimental and control conditions in the studies meeting the inclusion criteria are described in Table 4. For accuracy, these descriptions used verbiage from the original research inasmuch as possible [36]. More than one-third of the 43 studies in the review had an inactive control condition; that is, the control group received no treatment or delayed treatment (wait-list). Because a key goal of this study was to compare how control and experimental conditions are described in peer-reviewed literature, results will focus on the 28 studies that had an active control condition. Of these studies, 7 had a usual or standard treatment for the control group, 12 offered an alternative active treatment to control participants, and 9 were dismantling (or additive) component active controls (2 of the 9 were mixed in that control groups received an alternative active treatment whereas the experimental groups received additive treatments).
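As a quick consistency check, the control condition counts reported above can be tallied as in the sketch below. The variable names are illustrative, and the inactive control count (15) is derived by subtraction; the article reports it only as "more than one-third" of the 43 studies.

```python
total_studies = 43

# Active control condition counts as reported in the Results.
active_controls = {
    "usual or standard treatment": 7,
    "alternative active treatment": 12,
    "dismantling (or additive) component": 9,
}

active_total = sum(active_controls.values())
inactive_total = total_studies - active_total  # derived, not reported directly

print(f"active controls: {active_total} ({100 * active_total / total_studies:.1f}%)")
print(f"inactive controls: {inactive_total} ({100 * inactive_total / total_studies:.1f}%)")
```

The derived shares (about 65% active, about 35% inactive) agree with the "nearly two-thirds" and "about one-third" wording used in the abstract.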

Factors extracted in reviewed articles
Additional file 1: Table S5 compares the presence of factors extracted in the systematic review of articles. Each factor is described below, citing examples of studies demonstrating the factor.

Description of overall intervention content
Reviewed articles commonly included a description of the overall intervention content provided. Content tended to focus on increasing fruit and/or vegetable intake, lowering fat intake, and healthy eating in general. The extensiveness of the overall content description for experimental groups ranged from only naming the general topic area (e.g., fruits and vegetables) [37] to listing topics and content addressed [38,39] to reporting content and participant activities [40-42] and teaching strategies [43-46]. Descriptions of the overall content for the control conditions tended to provide much less detail compared to experimental conditions. For example, among those employing usual or standard treatment, one study indicated only that "control classrooms did not receive vegetable-related instruction" [40],p.39 whereas another study reported that health education with no nutrition content was given [43], with neither indicating what control group participants received. Other descriptions of the control condition of usual treatment studies were equally vague, indicating these participants received "traditional", "regular", or "normal" lessons [37,38,41,47]. Descriptions of treatment provided to the control groups in some alternative active treatment studies also were vague (e.g., control received pamphlet on fruits and vegetables [48], "packet of 5 printed commercially available booklets" [49], videos on sleep disorders [50]). However, several alternative active treatment investigations were more informative, including content similar in detail to the experimental group [46,49,51-53]. Dismantling studies tended to provide the greatest detail about the control condition largely because most experimental conditions were additive to the base formed by the control.

Table 3 (Continued)
9. … procedures for assessing intervention implementation with fidelity to that planned (e.g., staff supervised during implementation or videotaped/observed to ensure implementation was as planned; staff surveys to describe what they did during the intervention)
10. … procedures for blinding participant and researcher to treatment group assignment. If researchers were not blind, procedures for preventing differential treatment.
11. … rationale for selection of control group type.

Table 4 Description of experimental and control group treatments of nutrition education interventions (n = 43) (Continued)
Treatment: Low-income middle school children received face-to-face instruction using the Harvest of the Month exposure-based nutrition education intervention that promotes F/V intake with monthly in-class F/V tasting activities, informational materials provided to teachers, parent newsletters, promotional posters and banners, related books in the school library, informative pages in the students' day planners, and school bulletin announcements; program lasted 7 months.
Alaimo et al., 2015 [69]
Year 1: N = 320 baseline, 281 posttest. Year 2: 367 baseline, 281 posttest
Attrition: 12% (year 1), 23% (year 2)
Treatment: 3rd, 4th, and 5th grade teachers were trained and encouraged to offer 20 h of classroom-based nutrition education per year to their students; teachers were given nutrition education resources/support including newsletters and classroom nutrition education kits, healthy eating coaching in the cafeteria, and taste testing; teachers were encouraged to sign up for the YMCA "Nutrition in Action" program (a 6 week nutrition education program taught in the classroom by YMCA representatives), provided with non-food reward boxes, and social marketing materials (e.g., Project FIT health messages through mini-media, branded promotional materials, and wellness event ideas); the program also provided wellness training for after-school staff; improvements in school policies, programs, and environment through Health School Action Tools with a trained facilitator; and parent nutrition education.
Treatment: Over a 10-week period, college students had access to 21 mini-web-based lessons to foster healthy weight-related lifestyle behaviors (eating behavior, physical activity, stress management, and non-diet approach to weight management; viewing lessons was not required) and received 3 weekly email nudges (short, entertaining, stage-tailored messages with videos personalized to participant stage of change for F/V consumption, physical activity, and stress management) and 1 nudge reminding them to view new lessons, and set goals each week for 1 to 3 targeted behaviors.
N = 583 baseline
Attrition: 24% a
Treatment: Children in 3rd-5th grade enrolled in low-income school districts were taught for 12 weeks by a registered dietitian using the EB4 K with Play, a multicomponent school-based nutrition and energy balance intervention that included food tastings, physical activity games, strategies to help students meet physical activity and nutrition goals; a registered dietitian worked with school staff and parents to implement wellness policies and improvements in school food service; a play coach offered structured active recess activities before and during school and led a physical activity session every other week and 4 afterschool 5-week long sports leagues throughout the year. Teachers were trained to implement Play works games and management strategies in students' physical education sessions.

Description of how the intervention was tailored
Unless a goal of an investigation was to determine the effects of tailoring, little information on this factor was reported for experimental or control conditions regardless of whether a usual or other active control condition was used. In usual treatment control conditions, only one study mentioned tailoring for the experimental group [37]. A few alternative active treatment control condition studies tailored experimental and control treatments to demographic characteristics (e.g., older adult learners, African American women) [51,52]. Some investigations tailored treatments for experimental groups by allowing participants to choose topics or materials [45,49], with one study giving both experimental and control groups the ability to select topics [51]. The aim of most dismantling studies was to assess the effects of tailoring (experimental groups) vs not tailoring (control group); thus, tailoring descriptions for the control group generally were not applicable. On the other hand, the relative importance of the tailoring method to study aims made reasonably complete descriptions of this process requisite to report for experimental groups. Gans et al. [54] reported that tailoring was based on participants' fat, fruit, and vegetable intake and related behaviors, self-identified needed behavior changes, personal motivators, barriers, and other psychosocial issues associated with healthy eating, needs, and interests. Resnicow et al.'s [53] report is notable in that these authors provided a table describing messages and graphic images used to tailor study newsletters.

Description of intervention delivery mode, material type used, duration, and setting
Across all types of control conditions, investigators consistently reported the intervention delivery mode, with the most common being group sessions or online. Descriptions for experimental conditions tended to express delivery mode in explicit terms whereas for control conditions, it was often left to the reader to decide on the mode using implicit clues. This was particularly the case when the control group received a "usual" treatment without further clarification [40,41,43,47,55]. The type of material that provided intervention content directed to participants tended to be printed (e.g., brochures, pamphlets, manuals, newsletters) and online (e.g., websites, videos). Interventions delivered by instructors to groups used mostly curricula and "lessons." Some of the reviewed articles gave bibliographical references, internet links, or other means for obtaining intervention materials, with sources for instructional materials more commonly given for experimental than control groups [38, 40-43, 47, 55-59]. An examination by control group type found that references for resources used to deliver usual treatment to control groups were not included. Among alternative active treatment studies, the material types used with both experimental and control groups had comparably detailed descriptions [39,42,51,60], with some exceptions where great detail about the materials used by the experimental group was provided while giving only limited descriptions of those intended for the control group [44,48]. Material type descriptions tended to be more even across dismantling studies.
Total duration of the intervention delivered to the experimental group was explicitly stated in nearly all studies reviewed. For control groups, total duration was less likely to be clearly described and frequently had to be deduced from a review of the study timeline (e.g., when the baseline and post-test were administered) and comparison to statements made about the experimental group. The setting where group sessions were delivered normally was overtly indicated (e.g., school, community center). Interventions directed to individuals who received mailed materials or used websites generally only implied the setting as being home or worksite [49,50,56,57] and did not report where participants generally used intervention materials.

Description of individual intervention sessions
Across all types of control groups, the number of sessions or interactions (e.g., newsletters) usually was explicitly stated for both treatment groups. The duration of individual sessions or length of materials was more commonly reported for experimental than usual treatment control groups; for other types of control groups, duration was somewhat more consistently reported for both treatment groups [48,61]. Reporting of frequency of sessions was fairly even across experimental and control groups in all types of control conditions except usual treatment, where this information was rarely included.
Reports of the content of individual sessions/interactions were provided in about half the active control articles reviewed, with most descriptions being abbreviated for the experimental group and virtually non-existent for the control group. In a few cases, researchers provided a table or figure listing concepts/topics/objectives addressed in each session/interaction for the experimental group [40,41,54,61,62]. Only 2 studies provided a table describing the content of both the experimental and control treatments [46,49]. Descriptions of the duration of each main component of individual sessions/interactions were rare. The exceptions were Ratcliffe et al. [61], who stated "[e]ach hour-long session consisted of approximately 20 min of instruction followed by 40 min of hands-on garden experiences" (p.38); Herbert et al. [38], who reported "Energize engages children in 1, 60-minute class once a week … by involving them in 15 minutes of nutrition education, a 10-minute warm-up … and 35 minutes of aerobic exercise activities and fitness games" (p.781); and Pobocik et al. [41], who indicated "[a]pproximately 20 minutes of the 45-minute class were allotted to presenting information … remaining time … for testing, activities, and demonstrations" (p.22). Comparable descriptions for control groups were not included.

Procedures for assessing fidelity of implementation
Only about half of active control studies addressed fidelity of adherence to procedures, with most of these including information about procedures for both the experimental and control conditions. Methods used to establish fidelity of implementation for both experimental and control groups in active control studies where teachers or instructors delivered the treatment included detailed/scripted presentations [43,46], frequent meetings with researchers [38,46,47], random observation/videotaping of instructors [43,46,55], teaching/feedback logs [43,52], and audiotaping [57]. Methods used in active control group studies in which participants self-directed their engagement with pre-established treatments (e.g., web-based, printed materials) included completing forms documenting usage of treatment materials immediately after use [50,64,65], self-report posttest survey items that gauged extent of treatment use [53,58], and website tracking data [59].
The vast majority of active control studies provided little detail about fidelity procedures. One notable exception was McCaughtry et al. [43], who described fidelity procedures as including "very detailed (nearly scripted) lessons in the curriculum…a research assistant [who] conducted randomized school visits to observe each health education teacher's instruction to guarantee that the control teachers were not teaching nutrition content and that the intervention teachers were implementing the curriculum with fidelity" (p.279). Another noteworthy example was provided by Wolf et al.: "Treatment fidelity checks were conducted on 200 (41%) of the intervention calls. Trained raters listened to audio recordings of the calls and completed a checklist documenting whether specific points were covered and whether the interventionist spoke at an appropriate pace, responded to questions with clear answers and probed at appropriate times" [57], p.34.

Procedures for blinding participants and researchers to treatment group assignment
Limited attention was given to the issue of blinding participants or researchers in the reviewed articles. In many cases, it was not clear whether participants were blinded (or aware there was a control vs experimental group), although this is a typical component of informed consent procedures. None of the studies providing the control group with usual treatment addressed participant blinding. Two articles blinded participants to group assignment by explaining that they were getting one of two programs or using alternate names for "control" or "experimental" groups. Specifically, McCarthy et al. stated "A portion of the script used by project staff read … This is a cancer prevention study to compare two programs designed to help black women reduce their risk of cancer and improve their appearance. The first program involves 8 weekly 2-h sessions on diet and exercise. The second program involves 8 weekly 2-h sessions on current health topics of interest to black women, such as breast cancer and menopause. Both programs will be conducted by black women physicians and other professionals. We'll decide which group you'll be assigned to randomly, for example, by flipping a coin…" [51], p.247. In McClelland et al.'s crossover design study, these researchers assigned participants "to either the Apples Group (n=6) with the treatment curriculum … delivered first or the Beans Group (n=7) with the control curriculum … delivered first" [42], p.2. Another study reported that participant blinding efforts may not have worked. These researchers stated that "[g]irls, mothers, and troop leaders were masked to their group membership assignment;" but went on to say "because the project was called the Osteoporosis Prevention Project, some individuals in the control troops may have determined their status owing to the generic health focus of the sessions" [46], p.158.
The issue of blinding research staff likely is less important when interventions are automated and participant exposure to staff is minimal or non-existent. However, even when there was significant interaction with staff (e.g., in interventions delivering in-person or phone-based treatments), studies rarely addressed staff blinding. A few investigators reported using different instructors for experimental and control conditions [51,52], whereas others indicated that instructors were not blind to condition due to the nature of the intervention [46,55,57]. Blinding also would have been difficult in some of the dismantling studies where part of the treatment for only one of multiple experimental groups involved live interactions with staff [59,63]. In a few cases, articles reported that study evaluators were kept blind to participant study group assignment [57,58,64].

Rationale for selection of control group type
Reviewed studies seldom provided a rationale for the type of control group used and for those that did, various reasons were cited. These included convenience and comparability (e.g., "Three comparison [college] courses … were selected because they also were upper-level Human Biology courses, were delivered the same quarter, and were taught by experienced health promotion researchers and focused on a health message" [44], p.544) and relative strength (e.g., "Control group participants received fewer follow-up mailings … [that] resulted in a difference in "attention" between treatment arms, it is nonetheless a stronger design than a no-treatment control group" [60], p.62). Appropriateness to setting and participants also was considered (e.g., "Employees … were … assigned to the Web-based … or the print condition. It was recognized that the print materials could also be effective instruments of health behavior improvement (unlike a no-treatment control group) and could be a challenge as a control group … [and] would be a likely workplace alternative to an online program; therefore, the print group was thought to be an appropriate control group for the study" [49], p.e17). Yet, after finding both interventions yielded similar improvements, the article added to the control group rationale by stating… "[b]ecause it was originally thought that the print materials would form a relatively weak intervention compared to the Web program, a no-treatment control was not included in the design" [49],p.e17. 
Only 3 studies indicated the rationale for the control group was to control for non-specific effects (i.e., "[t]he control group provided an intervention of identical intensity and program delivery format as the experimental group, ruling out "attention" effects in the experimental group" [52], p.386, "we used an attention control group to take into account the effect of participation" [65], p.37, and "[t]he purpose of this group was to control for any nonspecific effects from being educated about healthy lifestyles and from contact time and number of sessions … with professionals" [46], p.158).

Behavior change theory use
Nearly one-quarter of all reviewed studies did not indicate whether a theory was used to guide the intervention. Of those that indicated application of a behavior change theory, more than half used the social cognitive theory and about one-quarter used the transtheoretical model. Most studies named the theory used with little additional explanation of how it was operationalized. The most explicit reporting of theory application was by Pobocik et al. [41], who included a table listing social cognitive theory constructs, definition of the construct, and an example of how the construct was operationalized in the Do Dairy intervention. Of those reporting how theories were applied, several used the stage of change construct for tailoring materials [48,63,66] and/or selecting assessment scales [40,48,50,54,64]. Particularly illustrative of theory use in assessment were the tables Wall et al. [40] and Elder et al. [64] provided that listed theory constructs and corresponding evaluation items.

Comparison across control condition types
In the 7 investigations using a usual or standard active control condition, consisting of "traditional" or "regular" instruction, participants tended to be children enrolled in school or participants in government sponsored programs, perhaps because these systems have an ongoing program available for comparison. Articles gave fairly complete descriptions of the intervention provided to the experimental group, which were mostly curriculum based. They tended not to indicate if or how interventions were tailored and rarely provided information on the content of each session/interaction or how time was apportioned in each session, although this information may be available in the curricula referenced. With regard to the control group intervention, other than the overall intervention content, delivery (individual or group), and setting, little other information was provided. In most cases, too little information was provided about the usual treatment to determine whether the control group's treatment was comparable on non-specific factors to that received by the experimental group [38,40,41,47,55]. Descriptions in one study, which compared differences in teaching strategies (e.g., traditional vs. tailored online), indicated fairly similar attention to nonspecific factors [37].
In the 12 studies providing an alternative active treatment to the control group, investigators included a fairly even description of the treatments given to both experimental and control groups; a notable exception for both groups was a lack of specificity regarding the amount of time in each session devoted to the main components of the treatment. Additionally, many of the interventions were mail- or web-based and did not explicitly indicate the intervention setting. A comparison of the intensity of the treatments offered indicates that in some studies, the control group received "lighter" treatment doses than the experimental group (e.g., control group received a single pamphlet whereas the experimental group received tailored monthly magazines for 8 months [48], packet of printed booklets vs. highly interactive web-based program [49], manual vs manual coupled with coaching calls, tailored newsletters, and personalized feedback [56]). Many studies appeared comparable across a range of non-specific factors that could affect study outcomes [42,51,52]. One example of comparable treatment is Wolf et al. [57], who provided both experimental and control groups with a brochure (different topics) and tailored telephone education. Healy et al. [39] offer a second example in which both groups received a treatment that was the same length of time (7 50-min sessions over 1.5 weeks), used similar teaching strategies (i.e., lecture, discussion, question/answer, group activities), and differed only on content taught.
The 7 dismantling (additive) component active control studies tended to have 2 or more experimental groups. Interestingly, in all but one of these studies, the differences between the experimental and control treatments hinged on tailoring [61]. The control, or comparison, group in nearly all of these studies received less personalized and less intensive treatment than the experimental group [54,59,61,63,64]. In one study, for example, 3 groups of women either received non-tailored newsletters, tailored newsletters, or tailored newsletters and visits with lay health advisors [64]. Because of the derivative nature and increasing intensity of treatment provided by most dismantling studies [54,59,61,63,64], there was an imbalance in non-specific factors between/among study groups. The in-person and frequent phone contact received by one experimental group vs ongoing access to the project website and automated individual risk profiling given to a second experimental group vs printed materials provided one time to control participants demonstrated the imbalanced attention across study groups [59]. Among dismantling studies, the greatest balance in non-specific effects was achieved by Resnicow et al. [53] in that both experimental and control groups received the same newsletters except the tailoring of the experimental newsletters was more specific.
An additional two dismantling studies were classified as "mixed" [46,65] because the control participants received an alternate treatment that was not a derivative of the experimental group but was similar to treatment provided to control participants in alternative active treatment conditions. For instance, control condition participants in one study received 2 45-min web sessions on anatomy whereas those in the 2 experimental groups received 2 45-min web sessions on nutrition or 2 45-min web sessions plus a 45-min booster session [65]. The comparability of treatment provided to control groups in these 2 mixed dismantling active control studies tended to be more balanced on non-specific factors than the other 7 dismantling studies that did not have an alternative treatment.
More than 3 out of 4 studies reviewed had random assignment of participants or intact groups (e.g., classrooms). Of the 10 non-randomized trials, half had no treatment control conditions. Of the remainder, one did not address randomization [41], one indicated the experimental group was comprised of students in classrooms with teachers who volunteered to participate [38], and another involving college students used intact classes and did not randomize the classes [44]. Two studies offered more explanation. One that was offered in WIC clinics indicated randomization was impractical and stated that "the practicality of being able to actually study comparisons of nutrition education intervention modalities in a typical clinic setting overcompensated for the lack in ability to develop a randomized design" [37],p.754. Authors of the second study offered this rationale, "The high cost and limited availability of randomized controlled trials in community settings highlight a need to evaluate and report on nonrandomized interventions that can be implemented in existing community settings" [45],p. 265.
Terminology used to describe control groups was not always consistent with definitions in Table 1. For example, two papers referred to control groups who received usual instruction as no treatment controls [37,43]. Another provided an alternative active treatment, yet referred to it as a standard treatment [48]. Still another referred to the alternative active treatment control group as an attention placebo group [65]. A placebo should have no effect on a person; however, because learning likely occurred in this and other alternate education-related control conditions, the term placebo does not accurately describe the control condition.

Discussion
The goal of this study was to conduct a systematic review of control groups in nutrition education interventions and describe how control conditions are reported in peer-reviewed primary outcomes journal articles in comparison with experimental conditions. The findings of this systematic review indicate that the articles sampled focused on a wide array of controlled nutrition education intervention studies. Most addressed fruits and vegetables, fat intake, and healthy eating and tended to target school children as well as limited resource youth and families enrolled in government sponsored programs. Overall, descriptions of experimental conditions, regardless of type of active control condition, tended to be far more complete than descriptions of control conditions. Studies tended to report nearly all key factors (i.e., intervention content, delivery mode, material type, total duration, setting, individual session/interaction components [e.g., number, duration or length, frequency, content], standardization procedures, procedures for assessing fidelity of implementation, references for materials, theoretical underpinnings, and randomization) for the experimental condition. However, descriptions of the experimental group commonly lacked procedures for blinding and tailoring (except when the study was comparing differences in the effect of tailoring). In contrast, control conditions lacked descriptions of many key factors, with the most commonly omitted factors being individual sessions/interactions (e.g., number, duration, frequency, content of individual sessions), procedures for standardization, procedures for assessing implementation fidelity, blinding procedures, rationale for the type of control group selected, and references for instructional materials. Additionally, the factors that were reported for control conditions tended to be less explicit and included fewer details than provided for the experimental condition.
In many cases, too little information was provided to determine the comparability of the control group vis-à-vis non-specific factors. Overall, the descriptions of both control and experimental group treatments became more complete as the type of active control became stronger and more complex; that is, alternative active treatments and dismantling studies provided the most detailed descriptions of the control group condition whereas usual or standard control conditions provided the least detail.
One-third of the 43 reviewed studies had inactive control conditions (i.e., no treatment or delayed treatment), a research design that is considered weak [7,17]. The Food and Drug Administration instructs that a no-treatment control be used only when investigation outcomes are entirely objective and cannot be biased by lack of blinding [74]; although this advice is directed at drug trials, it can be reasonably applied to education trials using inactive controls. For instance, in one delayed treatment study, researchers stated that a lack of blinding among those teaching the educational intervention was problematic (i.e., they "generally did not like to be randomized to the control condition" [22], p.31). Failure to implement procedures to prevent differential treatment, commitment, and engagement of both experimental and control condition instructors has the potential to confound results [75]. Likely many researchers conducting the 43 reviewed studies had implemented appropriate blinding procedures for participants, instructors, and researchers; however, descriptions of procedures for blinding and/or prevention of differential treatment were not reported in most studies.
Active control conditions, considered a stronger research design than inactive [7], were used in two-thirds of the reviewed studies. In this and other studies [76], usual treatment was considered an active control whereas some researchers categorize usual treatment as inactive (or passive) because it typically is not structurally equivalent on non-specific factors to the experimental condition [6,7,24,32,76]. All usual treatment conditions in reviewed studies offered control groups traditional or regular instruction that did not include content offered to the experimental group. As Street and Luoma point out, it usually is not possible to equalize all non-specific factors (particularly credibility and outcome expectations) when using education about an unrelated topic as the usual treatment [6]. The limited information about the usual treatment given to control participants negated the possibility of confidently affirming equivalency of intensity and structure of control and experimental treatments.
A hallmark of evidence building is replicability. Similar to findings by researchers in other fields [12,26], none of the experimental and control group treatments were sufficiently detailed to permit replication of the nutrition education interventions studied. About half of the experimental treatment descriptions included a reference for the intervention materials and a third of the control treatment descriptions included this information; these materials may mitigate replication issues associated with missing information in the reviewed article. Another alternative is to contact authors to obtain intervention details. When Glasziou et al. contacted authors who published non-pharmacological medical treatment intervention outcomes, treatment descriptions improved significantly; however one-third of the studies they reviewed still had insufficient detail, in part because study authors did not respond despite repeated attempts or were unwilling to provide additional information [26].
Standardization and fidelity procedures are equally important for control and experimental conditions; without these procedures, either group may receive more or less than the research protocol intended, which likely will confound outcomes [75,77,78]. The limited reporting of standardization procedures (e.g., use of manuals, standard operating procedures) and process evaluation activities in the reviewed studies, also noted by others in psychological therapy research [77], indicates that either reports are incomplete or these procedures were not implemented. Neither possibility is helpful when trying to weigh the value of the study outcomes and determine whether treatment groups received differential treatment from unblinded research staff.
Random assignment is considered critical to minimizing biases in trial outcomes and maximizing accuracy of analysis of intervention effects. One-quarter of the reviewed studies did not randomly assign participants, and likely suffered from at least some selection bias [79]. Compounding the lack of randomization is that many of these same studies did not address participant or researcher blinding and/or procedures for assessing intervention implementation fidelity, all of which impair internal validity [79].
Reporting sample size seems like a fairly straightforward task, regardless of how complex an intervention design may be. Indeed, CONSORT flow diagrams [80] make reporting changes in sample size at each stage of the study clear and easy to report. Yet, many of the studies reviewed lacked key sample size information, a phenomenon noted by others [81,82]. In some cases, sample size was not declared in tables reporting data [38,47,51,72].
It is interesting that so few articles provided a justification for the type of control condition used, especially given this is a conscious decision made during study planning. A systematic review of psychosocial interventions with substance use disorders also found studies gave little justification for control group choice or considerations for how this choice may have affected study outcomes [24].
The classic work of Campbell and Stanley identifies the Solomon 4 group design as offering the greatest internal and external validity checks [5]. This design includes these groups: experimental (pretest-intervention-posttest), no pretest experimental (intervention-posttest), control (pretest-posttest), no pretest control (posttest). Comparison of posttest scores across the 4 groups reveals whether changes are the result of the intervention and/or learning from the test [5,79]. None of the reviewed studies had non-pretested comparison groups. This lack of control for testing may have important implications; indeed, researchers note that repeated measurements may encourage control condition participants to reflect on behaviors and initiate the behavior targeted in the experimental condition [83,84]. Another research group suggested that the research design for psychosocial treatments that most closely equates to a double-blind design is one that compares "two bona fide interventions … delivered by advocates for those interventions" [24], p.426-427. In the reviewed studies, just one study met these criteria [57]. That is, Wolf et al. reported that immigrant men were given either a fruit/vegetable or prostate cancer prevention brochure [57]. Both groups received 2 tailored telephone education calls that could be considered to be delivered by an "advocate" because callers used a standardized telephone protocol and were audiotaped as a check for fidelity of delivery (however, no mention was made as to whether different callers were used for each treatment). Still another research group felt that to disentangle effects of the "active ingredient" from effects of non-specific factors, studies should include 3 groups: wait list control, attention control, and experimental group [85]. Many of the reviewed studies had 2 of these groups, but none had all 3.
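The posttest comparison in the Solomon 4 group design described above can be illustrated with a minimal sketch. It treats the 4 groups as a 2 x 2 layout (intervention yes/no by pretest yes/no) and contrasts posttest group means. The function name, dictionary keys, and scores are hypothetical and are not drawn from any reviewed study; an actual analysis would also test these contrasts statistically (e.g., with a two-way analysis of variance).

```python
from statistics import mean


def solomon_effects(posttest):
    """Contrast posttest means across the 4 Solomon groups.

    `posttest` maps each (hypothetical) group label to a list of posttest scores.
    """
    e_pre = mean(posttest["experimental_pretested"])    # pretest-intervention-posttest
    e_no = mean(posttest["experimental_unpretested"])   # intervention-posttest
    c_pre = mean(posttest["control_pretested"])         # pretest-posttest
    c_no = mean(posttest["control_unpretested"])        # posttest only

    return {
        # Average difference attributable to receiving the intervention
        "intervention": (e_pre + e_no) / 2 - (c_pre + c_no) / 2,
        # Average difference attributable to having taken the pretest
        "testing": (e_pre + c_pre) / 2 - (e_no + c_no) / 2,
        # Whether the intervention effect depends on having been pretested
        "interaction": (e_pre - c_pre) - (e_no - c_no),
    }


# Made-up scores for illustration only
scores = {
    "experimental_pretested": [8, 9, 10],
    "experimental_unpretested": [7, 8, 9],
    "control_pretested": [5, 6, 7],
    "control_unpretested": [4, 5, 6],
}
effects = solomon_effects(scores)
print(effects)
```

With these made-up scores, the pretested groups score higher than their non-pretested counterparts regardless of treatment, which is the kind of testing effect a conventional pretest-posttest control design cannot separate from the intervention effect.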
Dismantling designs make it possible to separately account for the effects of each intervention component. However, the reviewed dismantling studies were mostly additive; that is, the treatment groups received increasingly intensive treatments, thereby making it impossible to ascertain whether it was the greater dose of the additive treatment that contributed to changes or just the additional element [46,54,58,63,65]. For instance, one had a control group that received 12 weekly nontailored newsletters by mail, an experimental group that received 12 weekly tailored newsletters by mail, and another experimental group that received the 12 weekly tailored newsletters plus weekly home visits from a promotora (lay health advisor/counselor) [64]. There was not a group that received only promotora visits; thus, differentiating between intensity and independent effects of the promotora was not possible.
In the words of Montgomery et al., "[p]oor reporting limits the ability to replicate interventions, synthesise evidence in systematic reviews, and utilise findings for evidence-based policy and practice. The lack of guidance for reporting the specific methodological features of complex intervention RCTs contributes to poor reporting" [86], p.99. To improve reporting, the CONSORT extension underway for randomized controlled trials of social and psychological interventions may be appropriate and/or adaptable for health and nutrition education and promotion programs [86]. Methods for overcoming deficiencies in reporting design and execution of both control and experimental conditions reported by others may serve as models for reporting nutrition education interventions [7,87]. One research group has even suggested creating a repository of treatment descriptions, citing the Centers for Disease Control and Prevention's Replicating Effective Programs (https://www.cdc.gov/hiv/research/interventionresearch/rep/index.html) as an example, and establishing a detailed checklist of characteristics to be included in intervention descriptions [26]. In fact, the supplementary table published by Greaves and colleagues is an excellent reporting method that ensures all salient elements are included [87]. Table 3 in this paper is another tool for ensuring key information is reported in nutrition education outcomes papers.
Strengths of this review lie in the large number of papers included and the extensive extraction of data contributing to this comprehensive description of control groups in nutrition education interventions and how they and experimental conditions are recounted in peer-reviewed journals. Additionally, it is the first study to explore control conditions in nutrition education and is among the first in any field to examine this critically important research intervention study design and reporting component [7,26,85]. This study is, however, limited to studies conducted in the United States. Furthermore, the studies reviewed likely included at least some of the extracted factors reported as missing in Additional file 1: Table S5, but did not explicitly report them in the published paper. Also, no attempt was made to examine cited sources, which may supplement the information provided in the reviewed papers. Examination of the appropriateness of outcome measures, adequacy of sample sizes, and effect of control condition on study outcomes were beyond the scope of this review, but are important targets for future investigations.

Conclusions
Calls for more transparency and detail in reporting interventions have occurred sporadically since at least 1991, yet little has changed [77,88]. In this day and age of ever-constricting research funding, coupled with the dire need for interventions that effectively improve nutritional status and associated outcomes, it is imperative that intervention research use more robust study designs that permit us to understand the effects of each component of the intervention [26,85]. Additionally, researchers and journal editors should assume the responsibility for ensuring that practitioners can easily access the details needed to implement effective interventions with fidelity. The key historic barrier to reporting these data in printed form has been overcome with electronic publishing [26,89]. Clearly there is a great deal of opportunity to improve intervention study design and reporting; seizing this opportunity can only help to advance the field and improve consumer health. A goal set at the outset of the investigation reported here is to open a dialogue among nutrition education researchers that leads to improved reporting of control and experimental condition treatments in intervention evaluation studies to promote advancement and impact of our work.

Additional file
Additional file 1: Table S5. Factors extracted in systematic review of articles. (DOCX 37 kb)