QUICK REVIEW

[Paper Review] Understanding and Mitigating the Security Risks of Voice-Controlled Third-Party Skills on Amazon Alexa and Google Home

Nan Zhang, Xianghang Mi|arXiv (Cornell University)|May 3, 2018

Spam and Phishing Detection | References: 36 | Cited by: 56

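ONE-SENTENCE SUMMARY

This paper identifies two remote, voice-based attacks on third-party skills for Alexa and Google Home, namely voice squatting and voice masquerading, demonstrates that both are feasible, and proposes defenses that include a phoneme-based skill name scanner and a context-sensitive detector.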

ABSTRACT

Virtual personal assistants (VPA) (e.g., Amazon Alexa and Google Assistant) today mostly rely on the voice channel to communicate with their users, which however is known to be vulnerable, lacking proper authentication. The rapid growth of VPA skill markets opens a new attack avenue, potentially allowing a remote adversary to publish attack skills to attack a large number of VPA users through popular IoT devices such as Amazon Echo and Google Home. In this paper, we report a study that concludes such remote, large-scale attacks are indeed realistic. More specifically, we implemented two new attacks: voice squatting in which the adversary exploits the way a skill is invoked (e.g., "open capital one"), using a malicious skill with similarly pronounced name (e.g., "capital won") or paraphrased name (e.g., "capital one please") to hijack the voice command meant for a different skill, and voice masquerading in which a malicious skill impersonates the VPA service or a legitimate skill to steal the user's data or eavesdrop on her conversations. These attacks aim at the way VPAs work or the user's mis-conceptions about their functionalities, and are found to pose a realistic threat by our experiments (including user studies and real-world deployments) on Amazon Echo and Google Home. The significance of our findings have already been acknowledged by Amazon and Google, and further evidenced by the risky skills discovered on Alexa and Google markets by the new detection systems we built. We further developed techniques for automatic detection of these attacks, which already capture real-world skills likely to pose such threats.

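MOTIVATION AND OBJECTIVES

Assess the security risks of voice-controlled third-party skills on Amazon Alexa and Google Home.
Demonstrate that remote, large-scale attacks through malicious skills are feasible.
Develop mitigation techniques that detect and prevent voice squatting and voice masquerading attacks.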

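PROPOSED METHOD

Analyze how skills are invoked in order to identify weaknesses in skill vetting and in the interpretation of voice commands.
Conduct a user study (a survey of 156 Amazon Echo/Google Home users) and real-world deployments to evaluate attack feasibility.
Develop and deploy voice squatting and word squatting attack skills on the skill markets to test for vulnerability.
Implement a phoneme-based Skill Name Scanner that uses ARPABET to detect squatting risks across skills.
Build a context-sensitive detector, consisting of a Skill Response Checker (SRC) and a User Intention Classifier (UIC), to mitigate masquerading attacks.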
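
To make the phoneme-based scanning idea concrete, here is a minimal sketch in Python. It is not the authors' implementation: the tiny ARPABET lexicon, the function names (`to_phonemes`, `edit_distance`, `risky_pairs`), and the `max_distance` threshold are all assumptions chosen for illustration. It converts each invocation name to a phoneme sequence and flags pairs whose phoneme-level edit distance is small.

```python
from itertools import combinations

# Illustrative ARPABET entries with stress markers omitted. This mini-lexicon
# is an assumption; a real scanner would need a full pronouncing dictionary.
ARPABET = {
    "capital": ["K", "AE", "P", "AH", "T", "AH", "L"],
    "one": ["W", "AH", "N"],
    "won": ["W", "AH", "N"],
    "sleep": ["S", "L", "IY", "P"],
    "sounds": ["S", "AW", "N", "D", "Z"],
    "sound": ["S", "AW", "N", "D"],
}


def to_phonemes(name):
    """Concatenate the phonemes of every word in an invocation name."""
    phones = []
    for word in name.lower().split():
        phones.extend(ARPABET[word])  # raises KeyError for unknown words
    return phones


def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, 1):
        cur = [i]
        for j, pb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (pa != pb)))  # substitution
        prev = cur
    return prev[-1]


def risky_pairs(names, max_distance=1):
    """Return (name_a, name_b, distance) for names that sound alike."""
    flagged = []
    for x, y in combinations(names, 2):
        d = edit_distance(to_phonemes(x), to_phonemes(y))
        if d <= max_distance:
            flagged.append((x, y, d))
    return flagged
```

Calling `risky_pairs(["capital one", "capital won", "sleep sounds", "sleep sound"])` flags "capital one" against "capital won" (identical phonemes) and "sleep sounds" against "sleep sound" (one phoneme apart). A plain edit distance treats all phoneme substitutions as equally costly; a production scanner would probably weight confusable phonemes differently, which this sketch does not attempt.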

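EXPERIMENTAL RESULTS

Research Questions

RQ1: Can a malicious third-party skill be launched remotely to impersonate a legitimate skill or the VPA service?
RQ2: Are voice squatting and voice masquerading feasible in real-world Alexa/Google Home deployments?
RQ3: Which defenses can effectively detect and mitigate these attacks without degrading the user experience?
RQ4: How prevalent are squatting risks among invocation names in the skill markets?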

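Key Findings

Voice squatting can hijack an invocation command by registering a similarly pronounced or paraphrased name (e.g., "Capital Won" for "Capital One").
Voice masquerading lets a malicious skill imitate the system or a legitimate skill in order to steal data or eavesdrop.
The survey shows that users speak in natural sentences and sometimes switch contexts incorrectly, which creates a risk of unintended invocation: about 85% use natural sentences, and 28% have opened a skill they did not intend to open.
Real-world deployment showed that four attack skills could be uploaded and tested, and that a recognition error triggers invocation of the malicious skill.
A phoneme-based name scanner flagged 4,718 of 19,670 Amazon skills (about 24%) as carrying squatting risk, which indicates that the real-world risk is substantial.
The SRC and UIC detectors provide two layers of protection against masquerading attacks, using phoneme analysis and context-aware intent classification.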
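
As a rough illustration of the response-checking layer, the sketch below assumes that a Skill Response Checker compares what a skill says against utterances normally spoken by the assistant itself and raises a flag on a close match. That reading, the sample utterances, the 0.8 threshold, and the names `similarity` and `looks_like_system` are all assumptions for illustration, not details confirmed by this summary. The sketch uses character-level fuzzy matching from the Python standard library rather than the paper's actual technique, and it does not cover the User Intention Classifier.

```python
from difflib import SequenceMatcher

# Hypothetical examples of system-style utterances; not taken from the paper.
SYSTEM_UTTERANCES = [
    "sorry, i don't know that one",
    "goodbye",
    "which skill would you like to open",
]


def similarity(a, b):
    """Case-insensitive similarity ratio in [0, 1] between two strings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def looks_like_system(response, threshold=0.8):
    """Flag a skill response that closely resembles a system utterance."""
    return any(similarity(response, u) >= threshold
               for u in SYSTEM_UTTERANCES)
```

With these assumptions, `looks_like_system("Sorry, I don't know that one.")` returns `True`, while an ordinary reply such as `looks_like_system("Playing jazz.")` returns `False`. The threshold trades false alarms against missed impersonations and would need tuning on real skill responses.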


This review was generated by AI and checked by a human editor.