This narrative systematic review has described the range of evidence on 'second' and 'third' generation computer-tailored primary prevention interventions targeting physical activity behavior change in adults. Common characteristics of interventions that produced significant between-group effects and interventions with good retention rates were considered, as were internal and external validity of studies as measures of quality and generalizability.
This review differs from previous systematic reviews on computer-tailored interventions [5, 6] in the following ways: our review was restricted to primary prevention interventions; first generation computer-tailored interventions and studies in which tailoring was not generated through an expert system were excluded; and the review was not limited to RCTs, but also included quasi-experimental studies and studies with limited interpersonal contact that did not involve counseling. To our knowledge, previous reviews have not attempted to gauge the external validity of such intervention studies, although they have included varying measures of external validity in their quality criteria. Doing so is important in determining their generalizability and relevance to health promotion practice.
The volume of evidence has grown since the publication of previous reviews, which indicated that the evidence of effectiveness of computer-tailored interventions in the promotion of physical activity was limited [5, 6]. Several more recent studies reported significant positive effects either in comparison to a control group or over time, which strengthens the overall evidence of efficacy. Overall, just over half of the studies reported positive short-to-medium term effects in comparison to a control group for physical activity behaviors or weight reduction, with the majority of the others reporting positive effects within the treatment group over time or positive effects on behavior mediators.
The efficacy of computer-tailored interventions is dependent on many factors, such as the intervention quality, duration, exposure, intensity, use of theory, method of tailoring, source credibility and mode of delivery. Because few studies isolated the effect of tailoring or of the technology in their study design, no conclusions can be drawn on the relative importance of these factors for success. Comparing participants' behavior to current recommendations, tailoring according to the participant's stage of change and tailoring feedback in more than one way were common in studies reporting significant positive between-group outcomes, but using these tailoring methods was not necessarily predictive of success. More research is required to determine why and when tailoring is effective. In agreement with the findings of a previous review, it seems that whilst the intervention should be based on theory, no one theory has proven to be more applicable or effective. Ensuring multiple intervention exposures may also be important but was neither necessary nor predictive of success.
The quality, intervention intensity, duration and mode of delivery differed widely among the seven computer-tailored interventions reporting significant positive between-group effects on physical activity outcomes. Success of the intervention does not appear to depend on the technology used in its delivery or on its intensity, with little evidence that interventions of greater intensity had better outcomes. This was the case whether additional support was delivered through the technology or through interpersonal communication. However, very few studies compared intervention groups of differing intensity, a finding similar to that reported by a previous review. There is therefore insufficient evidence to determine the optimal intensity for computer-tailored interventions or the best way of delivering interventions targeting more than one behavior change. More research is needed in this area [24, 26].
It has been recommended that studies use a combination of validated self-reports with more objective measures of behavior change; however, fewer than half of the studies included objective measures of physical activity. The use of objective measures may be important in determining whether self-reported changes are real: only one third of the studies that used objective measures of physical activity found positive between-group effects on behavior.
The real-life effectiveness of such interventions is dependent on the external validity of studies, including the setting in which each study was conducted, the characteristics and representativeness of the targeted and recruited population sample, and the methods of recruitment, all of which influence the generalizability of findings to practice. The external validity of the reviewed studies was generally poor, resulting in uncertainty about the generalizability of such interventions. This finding is not surprising given that the majority of studies were RCTs; such designs aim to maximize internal validity and can sacrifice external validity, with results generalizable only to participants who are willing to accept randomization. A stronger focus on effectiveness and dissemination may assist in the development of programs in population-based effectiveness settings. Future RCTs should attempt to increase their external validity by including representative participants and answering real-world questions. However, this review found such design characteristics lacking, with the common use of small, homogeneous or unrepresentative samples, restrictive exclusion criteria and, for some studies, a lack of comparison conditions relevant to real-world decisions (that is, comparison to no-treatment controls only). Such characteristics significantly limit the dissemination of such interventions into practice.
Although determining cost-effectiveness was not the purpose of this review, we recommend that future studies at the very least report basic economic measures such as costs, which are relevant to decision-makers, can assist in intervention uptake and dissemination, and can inform more advanced cost-effectiveness studies [18, 37]. Cost-effectiveness analyses are recommended as they will be important in determining the additional value of such intervention delivery modes over more traditional delivery modes such as face-to-face counseling. The presumed cost savings for participants, who avoid travel time and costs, may be particularly important for those living in rural or isolated areas.
There was a fundamental lack of long-term post-intervention follow-up, with only one study demonstrating that intervention effects were maintained at two years post-baseline. However, the generalizability of this study's findings and their application to practice may be limited. More studies with long-term follow-up of 12 months post-intervention are needed.
Previously noted poor retention rates of computer-tailored interventions, in particular web-based interventions [5–7, 11], prompted consideration of characteristics of interventions that might maintain engagement and retention, such as the intervention's interactivity, duration, intensity, setting and study sample characteristics. However, with the small number of studies, comparing retention rates became problematic owing to their varied follow-up lengths, and we therefore could not form any definite conclusions. Nevertheless, based on our findings and other published reviews, it seems the following intervention characteristics may be important in enhancing participant retention: ensuring multiple exposures to the intervention material, preferably with evolving intervention materials or controlled program delivery; the use of incentives; prompts through another medium; interactive and dynamic web components; and individualized tailoring [6, 11, 31]. Each of these characteristics may be insufficient on its own to result in good retention, and therefore all will need to be considered in future intervention design, sample size calculations and estimates of probable retention rates.
Engaging and retaining interest through the Internet and email, mediums which are increasingly busy and through which many other information sources compete for attention, will be a challenge. Attracting people to return to a website is challenging, in particular for websites that do not offer new information at each visit. Telephone support in Internet-based interventions, used as an intervention strategy and as a maintenance strategy, has been shown to be as effective as face-to-face contact and to result in greater adherence to and maintenance of the intervention. Future research in this area is therefore warranted as a way of increasing Internet engagement.
The limitations of this review must also be acknowledged. Firstly, this review did not actively seek unpublished studies, although one such study was included. Therefore, when considering the findings of this review, the possibility of publication bias, that is, a bias toward studies with positive findings, must be noted. However, given the fairly high proportion of published studies reviewed that did not have significant findings, it is believed that the likelihood of publication bias is minimal.
Secondly, this review did not include articles in which physical activity behavior was not a primary outcome. This meant that articles were excluded in which psychological indicators, behavior mediators or process measures were the only outcome measures reported. Although behavior mediator effects, where available, were examined when behavioral change outcome effects were absent or conflicting, process measures were not described. This limits our discussion of the retention, engagement and acceptability of computer-tailored interventions and their components in different population subgroups and settings. Although this was not the purpose of the review, reviewing research in this area would be worthwhile as it may indicate different levels of acceptance and relative effectiveness in different population subgroups. This may be particularly important given that the majority of reviewed studies had predominantly female, Caucasian, well-educated samples.
Thirdly, we have not attempted to estimate a pooled effect size or to calculate and compare effect sizes of different studies, owing to the heterogeneity of studies in terms of their intervention design, delivery method, exposure and intensity, participants, study design and methods, and outcome measures. Such factors make such comparisons difficult and inadequate; hence a narrative systematic review was conducted. The two previous reviews on computer-tailored health behavior interventions most relevant to this review reported small to medium effect sizes [5, 6]. We agree with these and other authors that despite the small effect sizes found, such interventions can have substantial impact at a population health level, with their potential for wide distribution at low cost [5, 6]. However, it will be critical to determine whether such findings are generalizable and can be replicated, and to ensure adequate reach and engagement within varied population groups for such interventions.
In addition, our findings on common characteristics of successful interventions and those with good retention are limited by the small number of heterogeneous studies included and by our reliance on the varying levels of detail provided in each article. Only a small proportion of the retrieved articles were included in this review. The main reasons for this are that many studies were duplicated in the databases that were searched, broad search terms were used, and the exclusion criteria were specific and detailed. For example, the search terms did not distinguish between first, second and third generation interventions, and first generation interventions, which make up a substantial proportion of the literature, were not considered in this review.
Lastly, due to the small number of studies reporting positive weight reduction outcomes, the relative contributions of nutrition and physical activity behaviors to such outcomes were not examined in this review.
Future research should endeavor to replicate studies in different populations to indicate effectiveness and generalizability. Vandelanotte and colleagues provide an example worth following: the same theory-based intervention was trialed and adapted in different population groups and settings and followed up long term [23, 24, 26, 27, 39], and the acceptability and feasibility of these interventions were reported in individuals of different age, sex, education level and computer literacy [9, 10]. Following this example is important in building the evidence base.