QUICK REVIEW

[Paper Review] Structured Exploration vs. Generative Flexibility: A Field Study Comparing Bandit and LLM Architectures for Personalised Health Behaviour Interventions

Dominik P. Hofer, Haochen Song|arXiv (Cornell University)|Mar 6, 2026

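Digital Mental Health Interventions | Cited by 0

One-sentence summary

This 4-week field study (N=54) compares five daily messaging approaches that combine contextual bandits and LLMs to deliver personalised health behaviour interventions. LLM-based approaches were rated more helpful than templates, while bandit optimisation yielded no clear additional perceived benefit.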

ABSTRACT

Behaviour Change Techniques (BCTs) are central to digital health interventions, yet selecting and delivering effective techniques remains challenging. Contextual bandits enable statistically grounded optimisation of BCT selection, while Large Language Models (LLMs) offer flexible, context-sensitive message generation. We conducted a 4-week study on physical activity motivation (N=54; 9 post-study interviews) that compared five daily messaging approaches: random templates, contextual bandit with templates, LLM generation, hybrid bandit+LLM, and LLM with interaction history. LLM-based approaches were rated substantially more helpful than templates, but no significant differences emerged among LLM conditions. Unexpectedly, bandit optimisation for BCT selection yielded no additional perceived helpfulness compared with LLM-only approaches. Unconstrained LLMs focused heavily on a single BCT, whereas bandit systems enforced systematic exploration-exploitation across techniques. Quantitative and qualitative findings suggest contextual acknowledgement of user input drove perceived helpfulness. We contribute design suggestions for reflective AI health behaviour change systems that address a trade-off between structured exploration and generative autonomy.

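Research motivation and objectives

Clarify how Behaviour Change Techniques (BCTs) should be selected and delivered in digital health interventions
Investigate whether contextual bandits improve BCT selection relative to LLM-based messaging
Examine the trade-off between structured exploration and generative flexibility in health messaging design
Evaluate the perceived helpfulness of different messaging approaches in a real-world setting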

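Proposed method

Conduct a 4-week field study with 54 participants to evaluate five daily messaging approaches
Compare random templates, contextual bandit with templates, LLM generation, hybrid bandit+LLM, and LLM with interaction history
Measure perceived helpfulness and motivation outcomes, supplemented by post-study interviews (N=9)
Analyse qualitative feedback to identify factors that influence perceived effectiveness
Contrast, at the level of technique choice, the unconstrained LLM's concentration on a single BCT with the bandit's enforced exploration-exploitation across techniques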
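To make the "contextual bandit" conditions more concrete, here is a minimal sketch of how a bandit could pick one BCT per day and learn from a helpfulness rating. Everything specific in it is an assumption made for illustration: the epsilon-greedy rule, the tabular (discrete) context, the BCT labels, the context names, and the simulated ratings are hypothetical and are not taken from the paper, which may use a different bandit algorithm and reward signal.

```python
import random
from collections import defaultdict

# Hypothetical BCT labels, for illustration only.
BCTS = ["goal_setting", "action_planning", "self_monitoring", "social_support"]


class EpsilonGreedyContextualBandit:
    """Tabular contextual bandit: one running mean reward per (context, arm)."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)   # (context, arm) -> number of pulls
        self.means = defaultdict(float)  # (context, arm) -> mean observed reward

    def select(self, context):
        # Forced exploration: try every arm once in each context first.
        untried = [a for a in self.arms if self.counts[(context, a)] == 0]
        if untried:
            return self.rng.choice(untried)
        # With probability epsilon, explore a uniformly random arm.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        # Otherwise exploit the arm with the best mean reward so far.
        return max(self.arms, key=lambda a: self.means[(context, a)])

    def update(self, context, arm, reward):
        key = (context, arm)
        self.counts[key] += 1
        # Incremental update of the running mean.
        self.means[key] += (reward - self.means[key]) / self.counts[key]


def simulate(days=200, seed=0):
    """Run the bandit against made-up user responses (not study data)."""
    rng = random.Random(seed)
    bandit = EpsilonGreedyContextualBandit(BCTS, epsilon=0.1, seed=seed)
    # Assumed probability that a message is rated helpful, per context and BCT.
    true_rate = {
        "low_motivation": dict(zip(BCTS, [0.3, 0.4, 0.2, 0.7])),
        "high_motivation": dict(zip(BCTS, [0.6, 0.5, 0.7, 0.3])),
    }
    for _ in range(days):
        context = rng.choice(list(true_rate))
        arm = bandit.select(context)
        reward = 1.0 if rng.random() < true_rate[context][arm] else 0.0
        bandit.update(context, arm, reward)
    return bandit


if __name__ == "__main__":
    trained = simulate()
    for (context, arm), n in sorted(trained.counts.items()):
        print(f"{context:16s} {arm:16s} pulls={n:3d} mean={trained.means[(context, arm)]:.2f}")
```

In a template condition the selected BCT would index a pre-written message; in a hybrid bandit+LLM condition it could instead be passed to the LLM as a constraint on what to generate. The forced first pull of every arm plus the epsilon term is what guarantees that no technique is ignored entirely, which is the kind of structured exploration an unconstrained LLM does not provide by itself.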

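Results

Research questions

RQ1: Do LLM-based messaging approaches outperform random templates and bandit-assisted approaches in perceived helpfulness?
RQ2: Does adding bandit optimisation for BCT selection on top of LLM-only approaches yield additional perceived benefit?
RQ3: What role does acknowledgement of user input play in perceived helpfulness and motivation?
RQ4: How should structured exploration and generative flexibility be traded off when designing AI health behaviour change systems?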

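Key findings

LLM-based approaches were rated substantially more helpful than templates
No significant differences emerged among the LLM conditions
Bandit optimisation for BCT selection yielded no additional perceived helpfulness compared with LLM-only approaches
Unconstrained LLMs focused heavily on a single BCT, whereas bandit systems enforced systematic exploration-exploitation across techniques
Quantitative and qualitative findings suggest that contextual acknowledgement of user input drove perceived helpfulness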
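One way to quantify the finding that unconstrained LLMs concentrated on a single BCT is to compute the normalised Shannon entropy of the techniques used across messages: values near 0 mean one technique dominates, values near 1 mean usage is spread evenly. The sketch below is an assumption about how such a comparison could be made, not the measure reported in the paper, and the two message logs are invented placeholders rather than study data.

```python
import math
from collections import Counter


def normalised_entropy(labels, n_categories):
    """Shannon entropy of the label distribution, scaled to the range [0, 1].

    n_categories is the number of techniques that could have been used,
    including any that never appear in `labels`.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0 or n_categories < 2:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(n_categories)


if __name__ == "__main__":
    # Invented example logs of which BCT each daily message used.
    llm_like = ["social_support"] * 18 + ["goal_setting"] * 2
    bandit_like = (["social_support"] * 6 + ["goal_setting"] * 5
                   + ["action_planning"] * 5 + ["self_monitoring"] * 4)
    print("concentrated log:", round(normalised_entropy(llm_like, 4), 3))
    print("spread-out log:  ", round(normalised_entropy(bandit_like, 4), 3))
```

Dividing by the logarithm of the number of available techniques makes the score comparable across conditions that draw on the same technique set, regardless of how many messages each condition sent.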

This review was generated by AI and checked by a human editor.