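[Paper Summary] Don't Leave Me Alone: Retrospective Think Aloud supported by Real-time Monitoring of Participant's Physiology
This study evaluated two Retrospective Think Aloud (RTA) protocols, a strict guidance mode and a physiology-supported intervention mode, while monitoring skin conductance, skin temperature, and blood volume pulse in real time. Physiology-supported RTA significantly increased the number of reported usability issues without changing participants' self-reported emotion ratings or their physiological signals during the sessions.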
Think aloud protocols are widely applied in user experience studies. In this paper, the effect of two different applications of the Retrospective Think Aloud (RTA) protocol on the number of user-reported usability issues is examined. To this end, 30 users were asked to use the National Cadastre and Mapping Agency web application and complete a set of tasks, such as measuring the land area of a square in their hometown. The order of tasks was randomized per participant. Next, participants were involved in RTA sessions. Each participant was involved in two different RTA modes: (a) the strict guidance, in which the facilitator stayed in the background and prompted participants to keep thinking aloud based on his judgement and experience, and (b) the physiology-supported interventions, in which the facilitator intervened based on real-time monitoring of the user's physiological signals. During each session, three of the participant's physiological signals were recorded: skin conductance, skin temperature and blood volume pulse. Participants were also asked to provide valence-arousal ratings for each self-reported usability issue. Analysis of the collected data showed that participants in the physiology-supported RTA mode reported significantly more usability issues. No significant effect of the RTA mode was found on the valence-arousal ratings for the reported usability issues. Participants' physiological signals during the RTA sessions also did not differ significantly between the two modes.
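Research Motivation and Objectives
- Investigate whether physiology-supported RTA improves the detection of usability issues compared with traditional strict guidance RTA.
- Examine the effect of the RTA mode on participants' self-reported emotional state, measured through valence-arousal ratings.
- Assess whether physiological signals (skin conductance, skin temperature, blood volume pulse), taken as indicators of emotional state, differ between RTA modes.
- Address the limitations of standard RTA regarding reactivity and recall by using real-time physiological feedback to guide the facilitator's interventions.
- Give usability practitioners evidence-based guidance for applying RTA protocols in user experience evaluations.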
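Proposed Method
- A within-subjects experiment was run with 30 participants using the Greek National Cadastre and Mapping Agency web application.
- Participants first completed a set of tasks (such as measuring a land area) and then took part in two RTA sessions: one in strict guidance mode and one in physiology-supported intervention mode.
- Physiological signals (skin conductance, skin temperature, blood volume pulse) were recorded in real time during each RTA session.
- In the physiology-supported mode, the facilitator watched the physiological data in real time to decide when to intervene, while otherwise following the same procedure as in the strict guidance mode.
- Participants rated the valence and arousal of each reported usability issue on a 10-point scale.
- The statistical analysis used parametric tests (t-test) and non-parametric tests (Mann-Whitney test, Wilcoxon signed-rank test) to compare outcomes between the RTA modes.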
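To make the paired comparison concrete, the following is a minimal sketch of a Wilcoxon signed-rank test on per-participant issue counts. It is not the authors' analysis code: the function name `wilcoxon_signed_rank` and the sample counts are hypothetical, the z value uses the normal approximation without a tie correction, and the effect size follows one common convention, r = |z| / sqrt(n) with n the number of non-zero pairs (other conventions use the total number of observations).

```python
import math


def wilcoxon_signed_rank(x, y):
    """Return (w_plus, w_minus, z, r) for paired samples x and y.

    Zero differences are dropped; tied absolute differences get averaged ranks.
    z uses the normal approximation with no tie correction (an assumption).
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, averaging ranks within tie groups.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    w_plus = sum(rk for rk, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(rk for rk, d in zip(ranks, diffs) if d < 0)

    mean_w = n * (n + 1) / 4
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean_w) / sd_w
    r = abs(z) / math.sqrt(n)  # one common effect-size convention
    return w_plus, w_minus, z, r


# Hypothetical issue counts per participant, NOT data from the paper.
physio_counts = [4, 5, 3, 6, 2, 4]
strict_counts = [3, 3, 3, 4, 3, 1]

w_plus, w_minus, z, r = wilcoxon_signed_rank(physio_counts, strict_counts)
print(f"W+ = {w_plus}, W- = {w_minus}, z = {z:.3f}, r = {r:.3f}")
```

With so few pairs the normal approximation is rough; in practice an exact test from a statistics package would be the safer choice for small samples.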
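Experimental Results
Research Questions
- RQ1: Does the RTA mode affect the total number of usability issues that users report during RTA sessions?
- RQ2: Does the RTA mode affect users' self-reported emotion ratings (valence-arousal) for the usability issues they report?
- RQ3: Does the RTA mode affect users' emotional state during RTA sessions, as reflected in differences in their physiological signals?
- RQ4: Do participants in the physiology-supported mode report more usability issues that fall in the stress region (defined as valence < 5 and arousal > 5)?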
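The stress-region definition in RQ4 can be written as a simple predicate. The sketch below assumes ratings are stored as (valence, arousal) pairs on the 10-point scale; the function name `is_stress_region`, the `midpoint` parameter, and the sample ratings are hypothetical and serve only to illustrate the rule.

```python
def is_stress_region(valence, arousal, midpoint=5):
    """True when a rating has low valence and high arousal (valence < 5, arousal > 5)."""
    return valence < midpoint and arousal > midpoint


# Hypothetical (valence, arousal) ratings, NOT data from the paper.
ratings = [(3, 8), (7, 6), (4, 5), (2, 9), (5, 7)]

# Count how many reported issues fall in the stress region.
stress_count = sum(is_stress_region(v, a) for v, a in ratings)
print(stress_count)
```

Both inequalities are strict, so a rating that sits exactly on the midpoint of either axis is not counted as stressed.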
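Key Findings
- Participants reported significantly more usability issues in the physiology-supported RTA mode (M = 3.96, SD = 1.57) than in the strict guidance mode (M = 2.96, SD = 1.49), with p = 0.033 and effect size r = 0.43.
- The valence-arousal ratings of the reported usability issues did not differ significantly between the two RTA modes; the Mann-Whitney tests gave p = 0.420 for valence and p = 0.325 for arousal.
- Wilcoxon signed-rank tests confirmed that the physiological signals (skin conductance, skin temperature, blood volume pulse) did not differ significantly between RTA modes (all p > 0.05).
- Although the differences were not significant, the mean of all three signals was lower in the physiology-supported mode (GSR: 3.75 vs. 4.05; TEMP: 27.96 vs. 28.43; BVP: -23.18 vs. -22.91).
- Participants reported more usability issues in the stress region in the physiology-supported mode (N = 33) than in the strict guidance mode (N = 22), but the difference did not reach statistical significance (valence p = 0.311, arousal p = 0.190).
- The distribution of reported usability issues was non-normal, which justified the use of non-parametric tests for the emotion ratings and the physiological data.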
This summary was generated by AI and reviewed by a human editor.