QUICK REVIEW

[Paper Digest] Sasha: Creative Goal-Oriented Reasoning in Smart Homes with Large Language Models

Evan King, Haoxiang Yu | arXiv (Cornell University) | May 16, 2023

AI in Service Interactions | References: 49 | Cited by: 7

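One-Sentence Summary

This paper studies the use of large language models to interpret under-specified smart home goals, identifies their failure modes, and proposes Sasha, a smart home assistant built on iterative reasoning, which is evaluated in a hands-on user study.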

ABSTRACT

Smart home assistants function best when user commands are direct and well-specified (e.g., "turn on the kitchen light"), or when a hard-coded routine specifies the response. In more natural communication, however, human speech is unconstrained, often describing goals (e.g., "make it cozy in here" or "help me save energy") rather than indicating specific target devices and actions to take on those devices. Current systems fail to understand these under-specified commands since they cannot reason about devices and settings as they relate to human situations. We introduce large language models (LLMs) to this problem space, exploring their use for controlling devices and creating automation routines in response to under-specified user commands in smart homes. We empirically study the baseline quality and failure modes of LLM-created action plans with a survey of age-diverse users. We find that LLMs can reason creatively to achieve challenging goals, but they experience patterns of failure that diminish their usefulness. We address these gaps with Sasha, a smarter smart home assistant. Sasha responds to loosely-constrained commands like "make it cozy" or "help me sleep better" by executing plans to achieve user goals, e.g., setting a mood with available devices, or devising automation routines. We implement and evaluate Sasha in a hands-on user study, showing the capabilities and limitations of LLM-driven smart homes when faced with unconstrained user-generated scenarios.

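Motivation and Goals

Explore how LLMs can support loosely-constrained user goals in smart homes.
Identify the practical challenges and failure modes that arise when using LLMs for smart home control.
Develop Sasha, which uses iterative reasoning to improve action plans and reduce drift away from the user's goal.
Evaluate LLM-driven smart home control through a hands-on user study and a real-world deployment.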

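Proposed Method

A prototype system uses zero-shot prompting with a JSON home template to generate executable action plans from natural-language commands.
An empirical study with 20 participants analyzes 600 labels and free-form reasoning to identify failure modes and user satisfaction.
Sasha introduces iterative reasoning to steer the LLM toward high-quality plans and reduce hallucinations.
An implementation in a test home with unconstrained user commands evaluates capabilities and limitations in realistic scenarios.
Prompt engineering and JSON-based planning are combined with post-processing to ensure valid action plans.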
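To make the zero-shot prompting and post-processing steps above more concrete, here is a minimal Python sketch. It is not the authors' implementation: the home template, the action schema, and the function names `build_prompt` and `validate_plan` are all assumptions made for illustration, and the model call itself is left out.

```python
import json

# Hypothetical home template: rooms, devices, and fields are illustrative only.
HOME = {
    "living_room": {
        "lamp": {"state": "off", "brightness": 0},
        "thermostat": {"target_temp_c": 20},
    },
    "bedroom": {
        "blinds": {"state": "open"},
    },
}


def build_prompt(home: dict, command: str) -> str:
    """Assemble a zero-shot prompt: instructions, JSON home state, user command."""
    return (
        "You control a smart home described by the JSON below.\n"
        "Reply with only a JSON list of actions of the form "
        '{"room": ..., "device": ..., "field": ..., "value": ...}.\n'
        f"Home: {json.dumps(home, sort_keys=True)}\n"
        f"Command: {command}\n"
    )


def validate_plan(home: dict, raw_reply: str) -> list:
    """Post-process a model reply, keeping only actions on real devices and fields."""
    try:
        plan = json.loads(raw_reply)
    except json.JSONDecodeError:
        return []  # unparseable reply: treat it as "no plan"
    if not isinstance(plan, list):
        return []
    valid = []
    for action in plan:
        if not isinstance(action, dict):
            continue
        room = action.get("room")
        device = action.get("device")
        field = action.get("field")
        if not all(isinstance(x, str) for x in (room, device, field)):
            continue
        # Drop actions that reference a room, device, or field absent from the home.
        if field in home.get(room, {}).get(device, {}) and "value" in action:
            valid.append(action)
    return valid
```

In this sketch, a reply that dims the living room lamp and also turns on a nonexistent fireplace would be reduced to the lamp action alone, which is one simple way post-processing can keep a hallucinated device from reaching the home.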

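Experimental Results

Research Questions

RQ1: What unique capabilities are unlocked by using LLMs for smart home control?
RQ2: What practical challenges do LLM-based systems introduce?
RQ3: Which system design choices can address these practical challenges?
RQ4: How well does this new kind of smart home support user goals in unconstrained scenarios?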

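Key Findings

LLMs can flexibly generate creative action plans for under-specified commands.
LLMs exhibit failure modes that reduce usefulness and user satisfaction.
Sasha's iterative reasoning reduces false positives and device-targeting errors.
In the test home, the system achieved both immediate goals and higher-level persistent goals through automation.
The real-world user study shows that LLM-based smart homes are capable in unconstrained scenarios but still have limitations.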
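This digest does not spell out how Sasha's iterative reasoning works, so the following Python sketch shows only one generic reading of the idea: generate a plan, check it for device-targeting errors, and ask again with feedback. The device inventory, the feedback wording, and the names `refine_plan` and `fake_model` are assumptions for illustration, and `fake_model` stands in for a real LLM call.

```python
from typing import Callable, List

KNOWN_DEVICES = {"lamp", "thermostat", "blinds"}  # hypothetical device inventory


def find_unknown_devices(plan: List[dict]) -> List[str]:
    """Return the device names in the plan that do not exist in the home."""
    return [str(a.get("device")) for a in plan if a.get("device") not in KNOWN_DEVICES]


def refine_plan(
    ask_model: Callable[[str], List[dict]], command: str, max_rounds: int = 3
) -> List[dict]:
    """Query the model repeatedly, feeding back detected errors, until the plan is clean."""
    prompt = command
    plan: List[dict] = []
    for _ in range(max_rounds):
        plan = ask_model(prompt)
        unknown = find_unknown_devices(plan)
        if not unknown:
            return plan
        prompt = (
            f"{command}\nThese devices do not exist: {', '.join(unknown)}. "
            "Revise the plan."
        )
    # Rounds exhausted: drop whatever still targets missing devices.
    return [a for a in plan if a.get("device") in KNOWN_DEVICES]


def fake_model(prompt: str) -> List[dict]:
    """Stand-in for an LLM call: hallucinates a fireplace until told it does not exist."""
    if "do not exist" in prompt:
        return [{"device": "lamp", "value": "dim"}]
    return [{"device": "lamp", "value": "dim"}, {"device": "fireplace", "value": "on"}]


if __name__ == "__main__":
    print(refine_plan(fake_model, "make it cozy"))
```

The cap on rounds is a design choice of the sketch: it bounds latency and falls back to filtering, so a model that keeps hallucinating cannot stall the assistant.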


This digest was generated by AI and reviewed by human editors.