QUICK REVIEW

[Paper Review] A Rate-Distortion view of human pragmatic reasoning

Noga Zaslavsky, Jennifer Hu|arXiv (Cornell University)|May 13, 2020

Natural Language Processing Techniques | References: 21 | Cited by: 60

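One-sentence summary

The paper recasts the Rational Speech Act (RSA) framework as an alternating maximization, shows that RSA trades off expected utility against communicative effort, grounds RSA in Rate-Distortion (RD) theory to obtain RD-RSA, and compares the dynamics and behavioral predictions of the two models.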

ABSTRACT

What computational principles underlie human pragmatic reasoning? A prominent approach to pragmatics is the Rational Speech Act (RSA) framework, which formulates pragmatic reasoning as probabilistic speakers and listeners recursively reasoning about each other. While RSA enjoys broad empirical support, it is not yet clear whether the dynamics of such recursive reasoning may be governed by a general optimization principle. Here, we present a novel analysis of the RSA framework that addresses this question. First, we show that RSA recursion implements an alternating maximization for optimizing a tradeoff between expected utility and communicative effort. On that basis, we study the dynamics of RSA recursion and disconfirm the conjecture that expected utility is guaranteed to improve with recursion depth. Second, we show that RSA can be grounded in Rate-Distortion theory, while maintaining a similar ability to account for human behavior and avoiding a bias of RSA toward random utterance production. This work furthers the mathematical understanding of RSA models, and suggests that general information-theoretic principles may give rise to human pragmatic reasoning.

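Research motivation and objectives

Clarify the optimization principle behind RSA's recursive reasoning.
Show that RSA dynamics are an alternating maximization that trades off utility against communicative effort.
Ground RSA in Rate-Distortion theory to derive RD-RSA, and compare its predictions with those of RSA.
Assess whether RD-RSA retains RSA's explanatory power while reducing the bias toward uninformative, random utterances.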

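Proposed method

Formulate RSA as the optimization of G_alpha[S,L] = H_S(U|M) + alpha E_S[V_L], and show that RSA recursion implements an alternating maximization of this objective (the S_t and L_t updates).
Derive RD-RSA by minimizing F_alpha[S,L] = I_S(M;U) - alpha E_S[V_L], which yields the self-consistent update rules S(u|m) ∝ S(u) exp(alpha V_L(m,u)), S(u) = sum_m P(m) S(u|m), and L(m|u) ∝ S(u|m) P(m)/S(u).
Analyze the asymptotic behavior of RSA and RD-RSA as a function of alpha, including the critical point at alpha = 1.
Compare the predictions of RSA and RD-RSA with human data from reference-game experiments, evaluating the fit at different recursion depths.
Discuss the implications for information-theoretic accounts of pragmatic reasoning, and possible connections to related frameworks such as Optimal Transport.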
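The two speaker updates listed above can be sketched side by side. The following is a minimal illustration, not the authors' implementation. It assumes V_L(m,u) = log L(m|u) - C(u) with zero utterance cost, a uniform initial marginal S(u) for RD-RSA, and a hypothetical two-meaning, two-utterance reference game; the names `LEXICON`, `PRIOR`, `COST`, `normalize`, `bayes_listener`, `speaker_update`, and `run` are all introduced here for illustration only.

```python
import math

# Hypothetical toy reference game (not from the paper):
# rows are meanings m, columns are utterances u; 1 means u is true of m.
LEXICON = [[1, 0],   # m0: only u0 applies
           [1, 1]]   # m1: both u0 and u1 apply
PRIOR = [0.5, 0.5]   # P(m), assumed uniform
COST = [0.0, 0.0]    # C(u), assumed zero


def normalize(weights, fallback):
    """Scale weights to sum to 1; use `fallback` if they are all zero."""
    total = sum(weights)
    if total == 0.0:
        return list(fallback)
    return [w / total for w in weights]


def bayes_listener(speaker, prior):
    """L(m|u) proportional to S(u|m) P(m); returned as listener[u][m].

    Normalizing over m is the same as dividing by the marginal S(u).
    """
    n_m, n_u = len(speaker), len(speaker[0])
    return [normalize([speaker[m][u] * prior[m] for m in range(n_m)], prior)
            for u in range(n_u)]


def speaker_update(listener, alpha, cost, marginal=None):
    """S(u|m) proportional to w(u) * exp(alpha * V_L(m, u)), as speaker[m][u].

    Assumes V_L(m, u) = log L(m|u) - C(u), so exp(alpha * V_L) equals
    L(m|u)**alpha * exp(-alpha * C(u)), which avoids evaluating log(0).
    w(u) = 1 gives the RSA speaker; w(u) = S(u) gives the RD-RSA speaker.
    """
    n_u, n_m = len(listener), len(listener[0])
    if marginal is None:
        marginal = [1.0] * n_u
    uniform = [1.0 / n_u] * n_u
    return [normalize([marginal[u] * listener[u][m] ** alpha
                       * math.exp(-alpha * cost[u]) for u in range(n_u)],
                      uniform)
            for m in range(n_m)]


def run(lexicon, prior, cost, alpha, depth, rd=False):
    """Iterate speaker and listener updates `depth` times (depth >= 1)."""
    n_m, n_u = len(lexicon), len(lexicon[0])
    # Literal listener L_0: Bayes' rule applied to the lexicon itself.
    listener = bayes_listener(lexicon, prior)
    marginal = [1.0 / n_u] * n_u  # assumed uniform initial S(u) for RD-RSA
    speaker = None
    for _ in range(depth):
        speaker = speaker_update(listener, alpha, cost,
                                 marginal if rd else None)
        marginal = [sum(prior[m] * speaker[m][u] for m in range(n_m))
                    for u in range(n_u)]
        listener = bayes_listener(speaker, prior)
    return speaker, listener


if __name__ == "__main__":
    for name, rd in (("RSA", False), ("RD-RSA", True)):
        speaker, listener = run(LEXICON, PRIOR, COST, alpha=1.0, depth=2, rd=rd)
        print(name, "S(u|m1) =", speaker[1], "L(m|u0) =", listener[0])
```

In this toy game the two models agree after one step and diverge at depth 2, because only the RD-RSA speaker is weighted by the utterance marginal S(u). The initialization of S(u) is a choice of this sketch and may differ from the paper's.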

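Experimental results

Research questions

RQ1: Does RSA recursion maximize expected utility alone, or does it include a tradeoff with communicative effort?
RQ2: Can RSA be grounded in Rate-Distortion theory to yield RD-RSA, and does this change the speaker update?
RQ3: How do the dynamics of RSA and RD-RSA behave as recursion depth increases and as alpha varies?
RQ4: Do RD-RSA's predictions better avoid uninformative random utterances while retaining explanatory power for human data?
RQ5: How do RSA and RD-RSA compare with human pragmatic reasoning in reference-game data?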

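Key findings

RSA recursion implements an alternating maximization that optimizes a tradeoff between expected utility and communicative effort, rather than strictly maximizing expected utility.
RD-RSA gives RSA a principled grounding in RD theory, with a small but meaningful change to the speaker update rule.
Both RSA and RD-RSA show alpha-dependent dynamics with a critical point at alpha = 1; under certain conditions RD-RSA can converge globally at alpha = 1.
RD-RSA predicts human data about as accurately as RSA, while avoiding RSA's bias toward uninformative random utterance production.
An empirical comparison on reference-game datasets shows that at early recursion depths both RSA and RD-RSA predict better than the literal listener, and that performance plateaus or declines as depth increases.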
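The first finding can be checked numerically on a small example. The sketch below tracks G_alpha = H_S(U|M) + alpha E_S[V_L] and the expected utility E_S[V_L] across RSA iterations. It is an illustration under stated assumptions, not the paper's code: V_L(m,u) = log L(m|u) with zero cost, and the game, `LEXICON`, `PRIOR`, `objective`, and `rsa_trace` are hypothetical names introduced here. Because each update maximizes G_alpha with the other argument held fixed, G_alpha should not decrease from one iteration to the next; expected utility is recorded separately and no monotonicity is assumed for it.

```python
import math

LEXICON = [[1, 0], [1, 1]]  # hypothetical toy game, indexed lexicon[m][u]
PRIOR = [0.5, 0.5]          # P(m), assumed uniform


def normalize(weights):
    """Scale weights to sum to 1 (assumes at least one positive weight)."""
    total = sum(weights)
    return [w / total for w in weights]


def objective(speaker, listener, prior, alpha):
    """Return (G_alpha, expected utility) assuming V_L = log L(m|u)."""
    entropy = 0.0
    utility = 0.0
    for m, p_m in enumerate(prior):
        for u, s in enumerate(speaker[m]):
            if s > 0.0:  # convention: 0 * log 0 = 0
                entropy -= p_m * s * math.log(s)
                utility += p_m * s * math.log(listener[u][m])
    return entropy + alpha * utility, utility


def rsa_trace(lexicon, prior, alpha, depth):
    """Run RSA for `depth` iterations; record (G_alpha, utility) each time."""
    n_m, n_u = len(lexicon), len(lexicon[0])
    # Literal listener L_0(m|u) proportional to lexicon[m][u] * P(m).
    listener = [normalize([lexicon[m][u] * prior[m] for m in range(n_m)])
                for u in range(n_u)]
    trace = []
    for _ in range(depth):
        # Speaker step: S(u|m) proportional to L(m|u)**alpha (zero cost).
        speaker = [normalize([listener[u][m] ** alpha for u in range(n_u)])
                   for m in range(n_m)]
        # Listener step: Bayesian posterior L(m|u) given the new speaker.
        listener = [normalize([speaker[m][u] * prior[m] for m in range(n_m)])
                    for u in range(n_u)]
        trace.append(objective(speaker, listener, prior, alpha))
    return trace


if __name__ == "__main__":
    for depth, (g, utility) in enumerate(rsa_trace(LEXICON, PRIOR, 1.0, 6), 1):
        print(f"depth {depth}: G_alpha = {g:.4f}, E[V_L] = {utility:.4f}")
```

Comparing the two printed columns for different values of alpha is one way to see that the objective RSA improves is G_alpha, not expected utility on its own.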


This review was generated by AI and reviewed by human editors.