
Table 4 Summary of outcome measures and results

From: A systematic review of artificial intelligence chatbots for promoting physical activity, healthy diet, and weight loss

No.

First author/published year/Country

Main outcome measures

Secondary outcome measures

Physical activity (PA)

Diet

Weight

Engagement

Acceptability and satisfaction

Adverse event

Other outcomes

Results

Results

Results

Results

Results

Results

Results

Randomized controlled trials

1

Kramer Jᵃ/2020/Switzerland [31]

OM (Daily step count obtained from smartphone)

NR

NR

OM (Rate of individuals who stopped using the app)

NR

NR

NR

Daily cash incentives increased step-goal achievement by 8.1% (CI: [2.1, 14.1]) and, only in the no-incentive control group, action planning increased step-goal achievement by 5.8% (CI: [1.2, 10.4]).

NR

NR

30% of participants stopped using the app over the course of the study.

NR

NR

NR

2

Kunzler Fᵇ/2019/Switzerland [33]

OM (Daily step count obtained from smartphone)

NR

NR

OM (Just-in-time-response rate, overall response rate, conversation engagement, response delay obtained from the chatbot)

NR

NR

NR

Physical activity goal completion rate was correlated with overall response rate (r = 0.53, p < 0.001), just-in-time response rate (r = 0.42, p < 0.001), conversation rate (r = 0.38, p < 0.001), and average response delay (r = −0.27, p < 0.001).

NR

NR

Intrinsic factors: Device type, age, and personality traits had a significant effect on the just-in-time response rate, conversation rate, and total response rate.

Extrinsic factors: Time and day of the delivery, phone battery, device interaction, and location had significant effects on just-in-time response, conversation engagement, and response delay.

NR

NR

NR

3

Piao M/ 2020/ South Korea [32]

SR (Self-Report Habit Index)

NR

NR

NR

NR

NR

NR

After 4 weeks of intervention without providing the intrinsic rewards in the control group, the change in SRHI scores was 13.54 (SD ± 14.99) in the intervention group and 6.42 (SD ± 9.42) in the control group (p = .04).

When all rewards were given to both groups, from the fifth to the twelfth week, the change in SRHI scores of the intervention and control groups was comparable at 12.08 (SD ± 10.87) and 15.88 (SD ± 13.29), respectively (p = .21).

The level of physical activity showed a significant difference between the groups after 12 weeks of intervention (p = .045).

NR

NR

NR

NR

NR

NR

4

Carfora V/ 2019/ Italy [34]

NR

SR (Self-reported RPMC; intention; attitude; regret on RPMC)

NR

NR

NR

NR

NR

NR

The emotional condition had stronger anticipated regret and higher intention to reduce RPMC than the control condition (p = .01 and p = .02, respectively). Both the emotional and informational groups showed lower self-reported RPMC than the control group (p = .03 and p = .05, respectively).

NR

NR

NR

NR

NR

Non-randomized studies

5

Maher CA/ 2020/ Australia [20]

SR (Active Australia Survey)

SR (14-item Australian Mediterranean Diet Adherence)

OM (Seca 703)

OM (Number of weekly check-in obtained from the chatbot)

NR

NR

OM (Feasibility of subject enrollment)

Increased MVPA 109.8 (95% CI 1.9 to 217.7, p = .005) minutes per day from baseline to 12 weeks.

Increased 5.7 (95% CI 4.2 to 7.3, p < .001) points in diet adherence from baseline to 12 weeks.

Lost 1.3 (95% CI −25 to −0.7, p = .01) kg from baseline to 12 weeks.

Mean weekly chatbot interaction 6.9 times out of 11 possible interactions.

NR

No adverse events reported

Enrolled 31 out of 99 screened participants in the 6-week enrollment period.

6

Fadhil A/2019/NR [35]

SR (Physical activity intention)

SR (Healthy diet intention)

NR

NR

SR (TAM questionnaire)

NR

SR (AttrakDiff questionnaire)

Physical activity intention scores showed no difference across the three weeks; they remained unchanged.

Healthy diet intention scores showed no difference across the three weeks; they remained unchanged.

NR

NR

The scales “ease of use,” “attitude,” and “intention” towards using the system were significantly higher than the middle score (respectively: t(17) = 4.9, p < .01; t(17) = 2.5, p < .05; t(17) = 3.1, p < .01).

NR

Average scores were statistically higher than 4 for each dimension: pragmatic (t(17) = 5.41, p < .01), hedonic (t(17) = 3.4, p < .01), appealing (t(17) = 4.2, p < .01), and social (t(17) = 2.6, p < .05).

7

Stephens/2019/ U.S. [21]

SR (Target goal progress)

NR

NR

OM (Duration of conversation, number of messages exchanged, hours of support provided, percentage of exchanges outside typical office hours, and ratio of chatbot-initiated to patient-initiated conversations, all obtained from the chatbot)

SR (Helpfulness)

NR

NR

Adolescent patients reported experiencing positive progress toward their goals 81% of the time.

NR

NR

A total of 4123 messages were exchanged between participants and Tess. The average duration of conversations between Tess and patients was approximately 12.5 min (SD = ± 15.62 min), and the median length was nearly 6 min. Tess provided about 55 h and 45 min of support for the adolescent patients, 17.8% of which was provided outside typical office hours.

Most conversations (73.6%) were initiated by Tess rather than by the patient.

Patients indicated that Tess was helpful 96% of the time.

NR

NR

8

Casas J/2018/ Switzerland [36]

NR

SR (Meal consumption)

NR

NR

SR (Chatbot Effectiveness)

NR

NR

NR

Only 11% of participants achieved their goals. Consumption improved in 65% of cases, remained stable in 12%, and worsened in the remaining 24%.

NR

NR

82% of participants said that Rupert helped them reflect on and become aware of their consumption. 86% reported answering the chatbot's daily requests honestly. 70% thought the chatbot intervention was efficient.

NR

NR

9

Kocielnik R/ 2018/ U.S. [37]

SR (Habitual Action; Understanding; Reflection; Critical Reflection, adapted from Kember et al. 2000)

OM (Step count obtained from fitness trackers)

SR (Physical activity awareness)

NR

NR

OM (Participant interactions with the system: 1) number of dialogues responded to, 2) the time until a response was made, 3) the length and content of responses obtained from the chatbot)

SR (Willingness to use the system for an additional 2 weeks without compensation)

NR

NR

SR (Mindfulness)

Significant difference in Habitual Action (HA) for pre (M = 3.16, SD = 1.06) to post (M = 3.53, SD = 0.89) study measurements; t(32) = −2.04, p < 0.05.

A marginally significant increase in Understanding (U) from pre (M = 3.60, SD = 0.98) to post (M = 3.92, SD = 0.84); t(32) = −1.90, p = 0.07.

Step count difference was not significant.

Physical activity awareness difference was not significant.

NR

NR

Participants responded to 96% of all initial questions and to 90% of the follow-up questions sent by the system.

16 out of the 33 participants elected to continue using the system for 2 additional weeks without reward.

NR

NR

No significant changes were observed between pre- and post-study measurements.

  1. Studies a and b employed the same chatbot, named Ally.
  2. Abbreviations: PA, physical activity; SR, self-report; OM, objective measure; MVPA, moderate to vigorous physical activity; RPMC, red and processed meat consumption; NR, not reported.