We set out to investigate the cost-effectiveness for weight reduction and the cost-utility of a lifestyle program utilizing e-mail or phone counseling in comparison with self-help among overweight employees. Adherence to both interventions was limited. ICERs and ICURs implied that both interventions were more effective but also more costly than self-help. However, the ICER and ICUR of the internet group were lower (€16/kg and €1337/QALY, respectively) than those of the phone group (€1009/kg and €245,243/QALY) and quite favorable. The phone group had the lowest probability of cost-effectiveness and cost-utility of all groups, whereas the internet group had the highest probability of cost-effectiveness at most willingness-to-pay thresholds, ranging from 47% at €0/kg to 80% at €450/kg, and a 60% probability of cost-utility at €20,000/QALY. The sensitivity analyses generally confirmed the results from the main analysis, with some showing results that favored the internet group more than in the main analysis. The internet-based program therefore appears to be the preferred intervention.
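For readers less familiar with the ratios quoted above, the standard textbook definition of the ICER can be written as follows. The symbols are our own notation and are assumptions for illustration: $\bar{C}$ denotes mean costs per participant and $\bar{E}$ mean effect (kg of body weight lost) in the named group.

```latex
\[
\mathrm{ICER}
  = \frac{\bar{C}_{\text{intervention}} - \bar{C}_{\text{self-help}}}
         {\bar{E}_{\text{intervention}} - \bar{E}_{\text{self-help}}}
\]
```

The ICUR has the same form, with QALYs gained in place of kg lost in the denominator. An ICER of €16/kg thus means that each additional kg lost relative to self-help was associated with €16 in additional costs.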
Participants finished about half of the ten modules, with lower adherence in the internet group. The latter may be related to satisfaction with the different formats. At six months after baseline we conducted a process evaluation in which we asked the participants how satisfied they were with their group allocation: 91% of the phone group participants were satisfied compared with 78% of the internet group. The general appreciation, on a scale of 0 (lowest) to 10 (highest), was 7.4 for the phone format and 6.9 for the internet format.
In the main analyses we found no significant differences in body weight and QALYs gained, in comparison with the control group. Conversely, the complete case analysis showed significant weight loss in the internet group, and a trend towards significant weight loss in the phone group, compared with the control group. However, self-selection seems to have played a role in this result, judged by the differences in baseline and follow-up body weight between complete and incomplete cases. In addition, compared to the imputed cases, within-group weight loss in the complete cases of the internet group was similar, while weight loss decreased in the control group and increased in the phone group. This is surprising as we expected selection effects in the complete cases to result in higher within-group weight losses among all groups. The significant result among complete cases should be treated with caution.
Baseline health utility values were, on a scale from 0.00 (representing death) to 1.00 (representing perfect health), already high, with values around 0.91. A problem of the EQ-5D utility index is that it does not discriminate between health statuses at the high end of the utility range. It is therefore not surprising that, in our relatively healthy population, differences in QALYs gained were small and not statistically significant. Research is ongoing to develop quality-of-life outcomes that are more sensitive to the immediate effects associated with preventive interventions.
When the UK tariff was applied, somewhat more QALYs were gained than with the NL tariff. Dutch respondents ascribe less weight than UK respondents to most dimensions of the EQ-5D. This could mean that the UK tariff is more sensitive to improvements in the EQ-5D dimensions than the NL tariff. Nevertheless, incremental gains remained small.
Health care costs in the internet group differed significantly from those in the control group. Otherwise, no significant differences were found. Like most economic evaluations conducted alongside an RCT, our study was not powered to detect statistically significant differences in costs.
Results of the current study confirm those of two other studies that compared phone counseling of healthy adults on weight-related behaviors with no intervention and concluded that it was not cost-effective [12, 13]. Neither study included societal costs or had follow-up beyond the duration of the intervention. Regarding e-mail counseling interventions, no economic evaluations were identified. However, three trials found a combination of e-mail and phone counseling to be cost-effective in comparison with usual care [9, 10] or another intervention. This suggests that a combination might be more cost-effective than either intervention alone. Another explanation might lie in methodological differences. First, conclusions in the three studies were based on complete cases (29% to 82% of all randomized participants) instead of imputed data sets, possibly leading to inflated effectiveness. Second, two of the studies [10, 11] based their conclusion on the ICER but did not explore uncertainty around these outcomes. Third, these studies did not include costs of productivity loss or all health care costs. Finally, all three studies reported post-intervention outcomes, as opposed to outcomes at 18 months post-intervention in the current study. Weight rebound after initial weight loss is common, and was also seen in our sample [17, 39].
The main purpose of the current economic evaluation was to identify which counseling mode produced the greatest amount of additional health at acceptable costs. It is not clear how much social decision makers (i.e., the Netherlands Ministry of Health, Welfare and Sport) are willing to pay for a kg of body weight lost. Furthermore, in the Netherlands, no maximum societal ceiling ratio per QALY gained is defined. A recent review commissioned by the Dutch government used a threshold of €20,000/QALY for preventive interventions, but higher thresholds have been proposed for both curative and preventive interventions, depending on the burden of disease. Uncertainty about the cost-utility of the internet-based weight control program was appreciable: at the €20,000/QALY threshold, the probability that it was not cost-effective was 40%. The probability of its cost-effectiveness was a respectable 80% at €450/kg, but it seems unlikely that society is willing to pay this much. In addition, from the perspective of a Dutch company, cost-effectiveness of this intervention was fairly uncertain, with a probability of 66% at zero WTP, for both QALYs and kg weight loss.
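To illustrate how a probability of cost-effectiveness at a given willingness-to-pay (WTP) threshold can be read, the sketch below uses one common approach: it counts the share of bootstrap replicates in which the incremental net monetary benefit (WTP times incremental effect, minus incremental cost) is positive. This is a minimal sketch and not necessarily the procedure used in this study. The function name and the replicate values are hypothetical and chosen only for illustration; they are not study data.

```python
def probability_cost_effective(delta_costs, delta_effects, wtp):
    """Share of replicates with positive incremental net monetary benefit.

    delta_costs   -- incremental costs per replicate (intervention minus control)
    delta_effects -- incremental effects per replicate (e.g. kg lost or QALYs)
    wtp           -- willingness to pay per unit of effect
    """
    if len(delta_costs) != len(delta_effects) or not delta_costs:
        raise ValueError("need two non-empty sequences of equal length")
    favorable = sum(
        1
        for d_cost, d_effect in zip(delta_costs, delta_effects)
        if wtp * d_effect - d_cost > 0
    )
    return favorable / len(delta_costs)


# Hypothetical replicates (NOT study data): four bootstrap draws of
# incremental cost (euros) and incremental effect (kg lost).
example_costs = [100.0, -50.0, 200.0, 0.0]
example_effects = [1.0, 0.5, -1.0, 2.0]

for threshold in (0, 150):
    share = probability_cost_effective(example_costs, example_effects, threshold)
    print(f"WTP {threshold} euro/kg: probability {share:.2f}")
```

Evaluating this share over a range of thresholds gives a cost-effectiveness acceptability curve. At a WTP of zero, the probability reduces to the share of replicates in which the intervention is cost-saving.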
A limitation of this study is the rate of missing data. Missing data were multiply imputed for the main analysis. This method gives more valid results than complete case analysis and simple imputation methods such as baseline value carried forward [41, 42]. Multiple imputation assumes that the available data are sufficient to predict missing costs and clinical outcomes, and that the costs and outcomes of those who provided data are similar to those who did not provide data. The latter assumption may not necessarily hold true, but cannot be tested. This makes it impossible to draw firm conclusions about the cost-effectiveness of the studied interventions.
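Estimates from multiply imputed data sets are conventionally combined with Rubin's rules, in which the pooled variance adds the between-imputation variance to the average within-imputation variance. The sketch below illustrates that pooling step. It is an assumption that this is how the imputed data sets were combined here, since the text does not say. The function name and the input numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and their variances with Rubin's rules.

    estimates -- point estimate from each imputed data set
    variances -- squared standard error of each of those estimates
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled_estimate = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance across imputations (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled_estimate, sqrt(total)


# Hypothetical example (NOT study data): a mean weight difference in kg
# estimated in three imputed data sets.
estimate, standard_error = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(f"pooled estimate {estimate:.2f} kg, standard error {standard_error:.2f}")
```

The between-imputation term is what lets the pooled standard error reflect uncertainty due to the missing data. It does not protect against bias if the missingness depends on unobserved values, which is the untestable assumption noted above.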
Retention is challenging in behavioral weight control studies. In the current study, 45% of participants had dropped out after two years. Few previous studies in this field had a follow-up beyond one year. A modeling study estimated that 50% of participants in weight control studies will have dropped out after two years, which is comparable to the dropout we found. This indicates that conclusions regarding efficacy and (cost-)effectiveness in the weight control field are seriously hampered. Future studies should aim to minimize loss to follow-up. Upcoming technologies, like weighing scales that are connected to the internet, could make measurement of body weight for study purposes less burdensome. Research is needed to optimize cost diary and questionnaire design. Finally, participants should be selected on motivation for continued participation in the trial, and motivation for completion of the study could be enhanced.
Another possible limitation of the study is that all cost data, except the costs of the intervention, were self-reported and that the cost diaries covered a relatively long period. More objective data, such as health claims data, are practically inaccessible in the Netherlands, so self-report of resource utilization is the common method. However, it is possible that participants completed the diaries retrospectively at the moment they had to return them instead of completing them prospectively. This could have resulted in recall bias. Contradictory results on the influence of (period of) recall on the precision of self-reported sick leave and health care and medication use have been reported [47–50], but under-reporting of utilization seems likely. Nevertheless, we do not expect under-reporting to differ systematically between the intervention groups.
Strong points of the study are the randomized controlled design, the large study population of nearly 1400 participants, the relatively long follow-up period of two years, and the thorough presentation of uncertainty around the outcomes.