QUICK REVIEW

[Paper Review] Monte Carlo Gradient Estimation in Machine Learning

Shakir Mohamed, Mihaela Rosca | arXiv (Cornell University) | Jun 25, 2019

Machine Learning and Algorithms | References: 149 | Cited by: 85

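One-Sentence Summary

The paper surveys Monte Carlo estimators for the gradient of an expectation with respect to the parameters of the underlying distribution, detailing the score-function, pathwise, and measure-valued methods, the connections between them, and variance reduction techniques drawn from several fields.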

ABSTRACT

This paper is a broad and accessible survey of the methods we have at our disposal for Monte Carlo gradient estimation in machine learning and across the statistical sciences: the problem of computing the gradient of an expectation of a function with respect to parameters defining the distribution that is integrated; the problem of sensitivity analysis. In machine learning research, this gradient problem lies at the core of many learning problems, in supervised, unsupervised and reinforcement learning. We will generally seek to rewrite such gradients in a form that allows for Monte Carlo estimation, allowing them to be easily and efficiently used and analysed. We explore three strategies--the pathwise, score function, and measure-valued gradient estimators--exploring their historical development, derivation, and underlying assumptions. We describe their use in other fields, show how they are related and can be combined, and expand on their possible generalisations. Wherever Monte Carlo gradient estimators have been derived and deployed in the past, important advances have followed. A deeper and more widely-held understanding of this problem will lead to further advances, and it is these advances that we wish to support.

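Research Motivation and Goals

Motivate and formalize the problem of computing the gradient of an expectation with respect to the parameters of the distribution.
Survey the three main Monte Carlo gradient estimators and their derivations: score function, pathwise, and measure-valued.
Explain variance reduction techniques and the practical considerations in applying these estimators to learning, inference, and decision-making problems.
Show the connections between the estimators and their generalizations, and offer guidance for future research.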
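
The problem named in the goals above can be written down compactly. The block below is a sketch of the standard formulation; the symbols (\eta for the gradient, f for the cost, g for the sampling path, c_\theta, p^+ and p^- for the measure-valued decomposition) are notation chosen here for illustration, and each identity holds only under the regularity assumptions noted in its comment.

```latex
% Gradient of an expectation with respect to distributional parameters \theta
\eta = \nabla_\theta \, \mathbb{E}_{p(x;\theta)}[f(x)]
     = \nabla_\theta \int p(x;\theta)\, f(x)\, dx

% Score-function form (uses \nabla_\theta p = p \, \nabla_\theta \log p,
% and assumes differentiation and integration can be exchanged)
\eta = \mathbb{E}_{p(x;\theta)}\big[ f(x)\, \nabla_\theta \log p(x;\theta) \big]

% Pathwise form (assumes x = g(\epsilon;\theta) with \epsilon \sim p(\epsilon),
% and f differentiable)
\eta = \mathbb{E}_{p(\epsilon)}\big[ \nabla_\theta f(g(\epsilon;\theta)) \big]

% Measure-valued form (scalar \theta; assumes the decomposition
% \nabla_\theta p(x;\theta) = c_\theta \,( p^+(x;\theta) - p^-(x;\theta) ))
\eta = c_\theta \Big( \mathbb{E}_{p^+(x;\theta)}[f(x)] - \mathbb{E}_{p^-(x;\theta)}[f(x)] \Big)
```

Each right-hand side is an expectation, so it can be approximated by averaging over samples, which is what makes Monte Carlo estimation possible.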

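Proposed Methods

Derive the gradient of an expectation with respect to distributional parameters using the score function, yielding the score-function estimator.
Describe the pathwise method, which differentiates the cost function along the sampling path, including when it can be used given its differentiability requirements.
Introduce the measure-valued gradient estimator, based on the derivative of the measure, including its coupling and variance properties.
Compare the estimators on simple Gaussian examples to illustrate the trade-off between variance and computational cost.
Discuss variance reduction techniques, such as control variates for the score-function estimator and coupling for the measure-valued estimator.
Outline how the estimators relate to one another, can be combined, and generalize to broader applications.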
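
A Gaussian comparison of the kind described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's experimental setup: the cost f(x) = x^2, the choice to differentiate with respect to the mean only, and all function names are assumptions made here. For this cost the exact gradient with respect to the mean is 2 * mu, which gives a reference value to check against.

```python
import numpy as np


def f(x):
    # Illustrative cost (an assumption of this sketch): f(x) = x^2.
    return x ** 2


def grad_f(x):
    # Derivative of the illustrative cost, needed only by the pathwise estimator.
    return 2.0 * x


def score_function_samples(mu, sigma, n, rng):
    # f(x) * d/dmu log N(x; mu, sigma^2), with x drawn from N(mu, sigma^2).
    x = rng.normal(mu, sigma, size=n)
    score = (x - mu) / sigma ** 2
    return f(x) * score


def pathwise_samples(mu, sigma, n, rng):
    # Reparameterize x = mu + sigma * eps, so dx/dmu = 1 and the
    # per-sample gradient is simply f'(x).
    eps = rng.standard_normal(n)
    return grad_f(mu + sigma * eps)


def measure_valued_samples(mu, sigma, n, rng):
    # For the Gaussian mean, the derivative of the density splits into two
    # densities obtained by shifting mu by +/- sigma * w, where w follows a
    # Rayleigh(1) distribution; the constant is 1 / (sigma * sqrt(2 * pi)).
    # The same w is shared by both branches (a simple coupling).
    w = rng.rayleigh(scale=1.0, size=n)
    c = 1.0 / (sigma * np.sqrt(2.0 * np.pi))
    return c * (f(mu + sigma * w) - f(mu - sigma * w))


def compare(mu=1.0, sigma=1.0, n=200_000, seed=0):
    # Return {estimator name: (sample mean, per-sample variance)}.
    rng = np.random.default_rng(seed)
    estimators = {
        "score_function": score_function_samples,
        "pathwise": pathwise_samples,
        "measure_valued": measure_valued_samples,
    }
    results = {}
    for name, sampler in estimators.items():
        samples = sampler(mu, sigma, n, rng)
        results[name] = (samples.mean(), samples.var())
    return results


if __name__ == "__main__":
    for name, (mean, var) in compare().items():
        print(f"{name:>15}: mean={mean:.3f}  variance={var:.3f}")
```

All three sample means should land near 2 * mu, while the per-sample variances differ. Note also the cost side of the trade-off: the measure-valued estimator evaluates f twice per sample, and the pathwise estimator needs the derivative of f, which the other two do not.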

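Experimental Results

Research Questions

RQ1: When no closed-form solution exists, how can we compute the gradient of an expectation with respect to distributional parameters?
RQ2: What are the basic estimators for Monte Carlo gradient estimation (score function, pathwise, measure-valued), and what are their assumptions and limitations?
RQ3: How do these estimators compare in variance, bias, and computational cost on representative problems?
RQ4: Which variance reduction strategies are effective for these estimators, and under what conditions do they apply?
RQ5: How are these gradient estimators applied and generalized in areas such as variational inference, reinforcement learning, sensitivity analysis, and experimental design?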

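Key Findings

The paper analyzes three main gradient estimators (score function, pathwise, and measure-valued), each with different assumptions and trade-offs.
Under their respective conditions the estimators are consistent and unbiased, but they differ in variance characteristics and computational cost.
Variance can be reduced substantially by using control variates for the score-function estimator and coupling for the measure-valued estimator.
The differentiability of the cost function and the structure of the problem affect the applicability and performance of each estimator.
The paper connects these estimators and discusses how they can be combined or generalized to address a broader range of problems.
Applications in variational inference, reinforcement learning, sensitivity analysis, and discrete-event systems are highlighted as core use cases for these estimators.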
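
The variance reduction finding above can be illustrated with a constant baseline, one simple form of control variate for the score-function estimator. The sketch below rests on assumptions made here for illustration: a Gaussian with gradient taken with respect to the mean, a cost f(x) = x^2 in the usage example, and a baseline coefficient estimated from a separate pilot sample so that the main estimate stays unbiased. The function name and parameters are hypothetical.

```python
import numpy as np


def score_function_with_baseline(f, mu, sigma, n, rng, n_pilot=1_000):
    """Per-sample score-function gradients w.r.t. mu, with and without a baseline.

    The score has zero mean, so subtracting beta * score leaves the
    expectation unchanged while (for a good beta) lowering the variance.
    """

    def draw(m):
        x = rng.normal(mu, sigma, size=m)
        score = (x - mu) / sigma ** 2
        return f(x), score

    # Estimate the variance-minimizing coefficient on an independent pilot
    # sample: beta = Cov(f * score, score) / Var(score).
    fx_pilot, score_pilot = draw(n_pilot)
    beta = np.cov(fx_pilot * score_pilot, score_pilot)[0, 1] / score_pilot.var(ddof=1)

    # Main sample: plain estimator versus baseline-corrected estimator.
    fx, score = draw(n)
    plain = fx * score
    controlled = (fx - beta) * score
    return plain, controlled, beta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    plain, controlled, beta = score_function_with_baseline(
        lambda x: x ** 2, mu=1.0, sigma=1.0, n=200_000, rng=rng
    )
    print(f"beta estimate    : {beta:.3f}")
    print(f"plain            : mean={plain.mean():.3f}  variance={plain.var():.3f}")
    print(f"with baseline    : mean={controlled.mean():.3f}  variance={controlled.var():.3f}")
```

Both versions target the same gradient (2 * mu for this cost), so the comparison of interest is the per-sample variance. Estimating beta on the same samples used for the gradient is also common in practice, at the price of a small bias.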

This review was generated by AI and reviewed by human editors.