QUICK REVIEW

[Paper Review] ShelfHelp: Empowering Humans to Perform Vision-Independent Manipulation Tasks with a Socially Assistive Robotic Cane

Shivendra Agrawal, Suresh Nayak|arXiv (Cornell University)|May 30, 2024

Robot Manipulation and Learning | Cited by 1

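One-Sentence Summary

ShelfHelp is a socially assistive smart cane that combines a computer-vision product locator with two novel verbal manipulation-guidance planners, enabling people with visual impairments to shop independently without relying on vision. The system reaches human-level performance in guidance time and command efficiency, both planners are rated comparably to the human baseline on perceived competence and intelligence, and the system significantly reduces blind tactile searching among densely packed products.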

ABSTRACT

The ability to shop independently, especially in grocery stores, is important for maintaining a high quality of life. This can be particularly challenging for people with visual impairments (PVI). Stores carry thousands of products, with approximately 30,000 new products introduced each year in the US market alone, presenting a challenge even for modern computer vision solutions. Through this work, we present a proof-of-concept socially assistive robotic system we call ShelfHelp, and propose novel technical solutions for enhancing instrumented canes traditionally meant for navigation tasks with additional capability within the domain of shopping. ShelfHelp includes a novel visual product locator algorithm designed for use in grocery stores and a novel planner that autonomously issues verbal manipulation guidance commands to guide the user during product retrieval. Through a human subjects study, we show the system's success in locating and providing effective manipulation guidance to retrieve desired products with novice users. We compare two autonomous verbal guidance modes achieving comparable performance to a human assistance baseline and present encouraging findings that validate our system's efficiency and effectiveness and through positive subjective metrics including competence, intelligence, and ease of use.

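Research Motivation and Objectives

Address the loss of independence and privacy that people with visual impairments (PVI) face when they rely on human assistance, particularly while shopping.
Overcome the challenges of touch-based product retrieval caused by high product density and environmental hazards in the tactile space.
Extend smart canes from navigation to fine-grained manipulation guidance for product retrieval.
Develop a system that runs offline and supports new products without retraining, ensuring scalability and practicality.
Evaluate user preferences and system effectiveness through a pilot study with novice users against a human baseline.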

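Proposed Method

Equip a smart cane with RealSense D455 and T265 cameras for real-time visual perception and localization.
Implement a two-stage computer vision pipeline that detects the target product using visual features and semantic information from its packaging.
Design a novel planner based on a Markov decision process (MDP) that generates verbal manipulation guidance commands while optimizing guidance time and the number of commands.
Develop two distinct guidance modes: continuous (periodically issuing a "stop" command as confirmation) and discrete (precise, step-by-step movement commands).
Integrate audio feedback for manipulation guidance alongside haptic feedback for navigation, giving a single device both functions.
Conduct a human-subjects study comparing the two planners with a human guide on performance, usability, and subjective metrics.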
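
The MDP-based planner and the discrete guidance mode described above can be illustrated with a toy example. This review does not give the paper's state space, actions, or reward, so everything in the sketch is an assumption made for illustration: the hand position is a cell on a small grid over the shelf plane, the four move actions are deterministic, each move costs 1, and the names `value_iteration`, `greedy_actions`, and `discrete_commands` are hypothetical. It is not the authors' planner.

```python
from itertools import groupby

# Hypothetical action set: direction name -> (d_row, d_col) on a shelf-plane grid.
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action, rows, cols):
    """Deterministic transition; a move that would leave the grid keeps the hand in place."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    return (nr, nc) if 0 <= nr < rows and 0 <= nc < cols else state


def value_iteration(rows, cols, target, gamma=0.95, tol=1e-6):
    """Solve the toy MDP: every move costs 1, and the target cell is absorbing with value 0."""
    values = {(r, c): 0.0 for r in range(rows) for c in range(cols)}
    while True:
        delta = 0.0
        for s in values:
            if s == target:
                continue
            best = max(-1.0 + gamma * values[step(s, a, rows, cols)] for a in ACTIONS)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


def greedy_actions(start, target, values, rows, cols, gamma=0.95):
    """Follow the greedy policy implied by the value function from start to target."""
    s, plan = start, []
    while s != target:
        a = max(ACTIONS, key=lambda a: -1.0 + gamma * values[step(s, a, rows, cols)])
        plan.append(a)
        s = step(s, a, rows, cols)
    return plan


def discrete_commands(plan):
    """Merge runs of identical moves into one spoken command (assumed discrete-style phrasing)."""
    commands = []
    for action, run in groupby(plan):
        n = len(list(run))
        commands.append(f"move {action} {n} cell{'s' if n != 1 else ''}")
    return commands


if __name__ == "__main__":
    rows, cols, target = 4, 6, (2, 3)
    values = value_iteration(rows, cols, target)
    plan = greedy_actions((2, 0), target, values, rows, cols)
    print(discrete_commands(plan))
```

Merging runs of identical moves is one simple way to keep the number of spoken commands low, which is the quantity the planner is said to optimize together with guidance time. A continuous-style mode could instead announce only the direction and issue "stop" once the tracked hand reaches the end of the run; that variant is likewise an assumption and is not shown here.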

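Experimental Results

Research Questions

RQ1: Can a vision-based smart cane system effectively locate target products in dense supermarket environments without retraining for new products?
RQ2: How do the continuous and discrete verbal manipulation guidance strategies differ in guidance time, number of commands, and user perception?
RQ3: To what extent is the system comparable to a human guide in perceived competence, intelligence, and ease of use?
RQ4: What are users' preferences regarding planner attributes such as confirmation mechanisms, precision, and interactivity?
RQ5: Can the system's intelligent guidance significantly reduce the cognitive and physical burden of retrieving products in the tactile space?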

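Key Findings

The discrete planner matched the human baseline in guidance time and number of commands, with no statistically significant difference in performance.
Users rated both the continuous and discrete planners as highly competent and intelligent, with no significant difference from the human guide on these measures.
The continuous planner increased perceived interactivity through confirmation commands such as "stop", but users preferred the precise, unambiguous commands of the discrete planner.
The system significantly reduced the need for blind tactile searching in dense tactile spaces, improving retrieval efficiency.
Participants reported high ease of use, high confidence, low frustration, and low mental and temporal demand, indicating strong usability.
The product locator recognized visual features in real time and does not need retraining when new products are introduced.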


This review was generated by AI and reviewed by a human editor.