QUICK REVIEW

[Paper Review] Sensing What Surveys Miss: Understanding and Personalizing Proactive LLM Support by User Modeling

Ailin Liu, Yesmine Karoui|arXiv (Cornell University)|Jan 31, 2026
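Digital Mental Health Interventions | Cited by 0
One-Sentence Summary

The paper proposes an adaptive, proactive LLM-assisted support system for surveys that uses physiological signals (EDA) and behavioral signals (mouse movement) to predict when users need help, personalizes the trigger threshold per user, and shows that well-aligned timing improves accuracy and user experience.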

ABSTRACT

Difficulty spillover and suboptimal help-seeking challenge the sequential, knowledge-intensive nature of digital tasks. In online surveys, tough questions can drain mental energy and hurt performance on later questions, while users often fail to recognize when they need assistance or may satisfice, lacking motivation to seek help. We developed a proactive, adaptive system using electrodermal activity and mouse movement to predict when respondents need support. Personalized classifiers with a rule-based threshold adaptation trigger timely LLM-based clarifications and explanations. In a within-subjects study (N=32), aligned-adaptive timing was compared to misaligned-adaptive and random-adaptive controls. Aligned-adaptive assistance improved response accuracy by 21%, reduced false negative rates from 50.9% to 22.9%, and improved perceived efficiency, dependability, and benevolence. Properly timed interventions prevent cascades of degraded responses, showing that aligning support with cognitive states improves both the outcomes and the user experience. This enables more effective, personalized LLM-assisted support in survey-based research.

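Motivation and Objectives

  • Establish that self-administered surveys need real-time, personalized help to prevent cognitive overload and degraded data quality.
  • Develop an adaptive system that fuses physiological (EDA) and behavioral (mouse) signals to predict when respondents need help.
  • Personalize intervention timing to individual users through gradient-based fine-tuning and rule-based threshold adaptation.
  • Compare aligned-adaptive, misaligned-adaptive, and random-adaptive timing in a within-subjects study.
  • Show that well-timed proactive help improves accuracy and user experience while reducing cognitive burden.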

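Proposed Method

  • Collect multimodal data (EDA, mouse dynamics, eye tracking, ECG) during a web-based multiple-choice task with manipulated difficulty.
  • Select informative features with SelectKBest and f_regression to drive two unimodal baseline models (one built on static EDA change features, one on mouse movement features).
  • Compute the final overload score as the maximum of the EDA-based and mouse-based predictions: y_Final = max(y_Mouse, y_EDA).
  • Personalize the model per user through a calibration phase and a one-shot gradient-style update, using gradient descent with L2 regularization.
  • Dynamically adjust the intervention threshold through rule-based updates driven by user interactions and outcomes, so that timing improves over the session.
  • Generate explanations with a locally hosted LLaMA-2-7B model when the user selects text after an intervention.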
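
The fusion rule from the method list can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the `OverloadEstimate` class, the `should_intervene` helper, the assumption that both scores lie in [0, 1], and the `>=` comparison against the threshold are all hypothetical. Only the `max` rule itself comes from the summary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OverloadEstimate:
    """Per-trial overload scores from the two unimodal models.

    Assumption: both scores are scaled to [0, 1].
    """
    y_mouse: float
    y_eda: float

    @property
    def y_final(self) -> float:
        # Fusion rule from the summary: y_Final = max(y_Mouse, y_EDA).
        return max(self.y_mouse, self.y_eda)


def should_intervene(estimate: OverloadEstimate, threshold: float) -> bool:
    # Hypothetical trigger: offer help once the fused score reaches
    # the user's current threshold.
    return estimate.y_final >= threshold
```

Taking the maximum means that either modality alone can trigger help, so a calm EDA signal does not mask erratic mouse behavior, and vice versa.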
Figure 1. Screenshots of the experimental interface: (a) calibration phase, (b) rest break, (c) condition task question, (d) adaptive helper triggered, (e) waiting for the user to select text, and (f) assistance provided by the system.
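
The per-user personalization step (a calibration phase followed by a one-shot gradient update with L2 regularization) might look roughly like the sketch below. The summary does not specify the model form or loss, so the linear predictor, the mean-squared-error objective, and the `lr` and `l2` values are assumptions chosen for illustration; `personalize_once` is a hypothetical name.

```python
from typing import List, Sequence, Tuple


def predict(weights: Sequence[float], bias: float, x: Sequence[float]) -> float:
    # Assumed linear predictor: w . x + b.
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def personalize_once(
    weights: Sequence[float],
    bias: float,
    features: Sequence[Sequence[float]],
    labels: Sequence[float],
    lr: float = 0.1,   # hypothetical learning rate
    l2: float = 0.01,  # hypothetical L2 strength
) -> Tuple[List[float], float]:
    """One gradient-descent step on MSE + l2 * ||w||^2 over calibration data."""
    n = len(features)
    if n == 0 or n != len(labels):
        raise ValueError("need matching, non-empty calibration data")

    # Gradient of the L2 penalty; the bias is left unregularized.
    grad_w = [2.0 * l2 * w for w in weights]
    grad_b = 0.0
    for x, y in zip(features, labels):
        err = predict(weights, bias, x) - y
        for j, xj in enumerate(x):
            grad_w[j] += 2.0 * err * xj / n
        grad_b += 2.0 * err / n

    new_weights = [w - lr * g for w, g in zip(weights, grad_w)]
    new_bias = bias - lr * grad_b
    return new_weights, new_bias
```

A single step nudges the population-level model toward the individual's calibration data, and the L2 term limits how far the weights can grow from a small calibration sample.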

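Experimental Results

Research Questions

  • RQ1: Does proactive LLM assistance with aligned-adaptive timing outperform misaligned-adaptive and random-adaptive timing on task accuracy?
  • RQ2: Do aligned-adaptive interventions yield a better user experience (efficiency, dependability, benevolence) and a lower workload?
  • RQ3: Can physiological (EDA) and behavioral (mouse) signals predict momentary cognitive overload well enough to trigger timely support?
  • RQ4: How does personalization (per-user threshold tuning) affect intervention effectiveness across participants?
  • RQ5: What effect does timing have on preventing cascading response degradation in survey-style tasks?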

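Key Findings

  • Aligned-adaptive timing raised response accuracy from 41% to 62%.
  • Compared with the controls, aligned-adaptive timing reduced the rate of missed help opportunities (false negative rate down from 50.9% to 22.9%).
  • Participants rated the aligned-adaptive system higher on efficiency, dependability, and benevolence.
  • The system achieved its highest acceptance rate under aligned-adaptive timing.
  • Personalization through calibration and gradient updates helped keep predictions consistent with each individual's cognitive load.
  • Proactive, timely LLM explanations lead to better outcomes and a better user experience when they are aligned with the user's real-time cognitive state.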
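
The abstract's 21% accuracy gain and the 41% to 62% figure in the findings agree if the 21% is read as percentage points rather than as a relative change. As a worked check, using only the numbers reported in the abstract and the findings:

```latex
\begin{align*}
\Delta_{\text{accuracy}} &= 62\% - 41\% = 21 \text{ percentage points}, &
\frac{62 - 41}{41} &\approx 51.2\% \text{ (relative)},\\
\Delta_{\text{FNR}} &= 50.9\% - 22.9\% = 28.0 \text{ percentage points}, &
\frac{50.9 - 22.9}{50.9} &\approx 55.0\% \text{ (relative)}.
\end{align*}
```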
Figure 2. Rule-based threshold adaptation logic per trial: positive changes increase the threshold (fewer future interventions), negative changes decrease it (more future interventions).
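
The direction of the rule in the Figure 2 caption can be illustrated with a small sketch. The caption does not say how a "change" is measured or how large each adjustment is, so the signed `change` input, the fixed `step`, and the `lower` and `upper` bounds are assumptions, and `adapt_threshold` is a hypothetical name.

```python
def adapt_threshold(
    threshold: float,
    change: float,
    step: float = 0.05,   # hypothetical adjustment size
    lower: float = 0.1,   # hypothetical floor
    upper: float = 0.9,   # hypothetical ceiling
) -> float:
    """Per-trial rule: a positive change raises the threshold, a negative one lowers it."""
    if change > 0:
        # Positive change: raise the bar, so fewer future interventions.
        threshold += step
    elif change < 0:
        # Negative change: lower the bar, so more future interventions.
        threshold -= step
    # Clamp so the helper can never become always-on or always-off.
    return min(upper, max(lower, threshold))
```

For example, starting from a threshold of 0.5, one negative trial moves it to about 0.45, making the next intervention easier to trigger.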

This review was generated by AI and reviewed by a human editor.