QUICK REVIEW

[Paper Review] Gradient Surgery for Multi-Task Learning

Tianhe Yu, Saurabh Kumar|arXiv (Cornell University)|Jan 19, 2020

Domain Adaptation and Few-Shot Learning|References: 56|Cited by: 108

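ONE-SENTENCE SUMMARY

Proposes PCGrad, a gradient surgery method that projects each conflicting task gradient onto the normal plane of the other task's gradient to reduce interference in multi-task learning, improving data efficiency and performance on supervised and reinforcement learning tasks.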

ABSTRACT

While deep learning and deep reinforcement learning (RL) systems have demonstrated impressive results in domains such as image classification, game playing, and robotic control, data efficiency remains a major challenge. Multi-task learning has emerged as a promising approach for sharing structure across multiple tasks to enable more efficient learning. However, the multi-task setting presents a number of optimization challenges, making it difficult to realize large efficiency gains compared to learning tasks independently. The reasons why multi-task learning is so challenging compared to single-task learning are not fully understood. In this work, we identify a set of three conditions of the multi-task optimization landscape that cause detrimental gradient interference, and develop a simple yet general approach for avoiding such interference between task gradients. We propose a form of gradient surgery that projects a task's gradient onto the normal plane of the gradient of any other task that has a conflicting gradient. On a series of challenging multi-task supervised and multi-task RL problems, this approach leads to substantial gains in efficiency and performance. Further, it is model-agnostic and can be combined with previously-proposed multi-task architectures for enhanced performance.

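MOTIVATION AND OBJECTIVES

Identify the optimization challenges in multi-task learning that arise from gradient interference.
Characterize the tragic triad of conflicting gradients, dominating gradients, and high curvature.
Develop a gradient surgery technique (PCGrad) to mitigate gradient conflicts.
Demonstrate that PCGrad is model-agnostic and applies to both supervised and reinforcement learning tasks.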
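
As a rough illustration of the first two parts of the triad, the sketch below measures whether two task gradients conflict (negative cosine similarity) and whether one dominates the other in magnitude. The function names are hypothetical, and the norm ratio is only an illustrative diagnostic chosen here, not necessarily the measure used in the paper. Curvature is not covered.

```python
import numpy as np


def cosine_similarity(g_i, g_j):
    """Cosine of the angle between two flattened task gradients."""
    g_i = np.asarray(g_i, dtype=float)
    g_j = np.asarray(g_j, dtype=float)
    return float(g_i @ g_j / (np.linalg.norm(g_i) * np.linalg.norm(g_j)))


def is_conflicting(g_i, g_j):
    """Two gradients conflict when their cosine similarity is negative."""
    return cosine_similarity(g_i, g_j) < 0.0


def norm_ratio(g_i, g_j):
    """Illustrative dominance diagnostic (an assumption of this sketch):
    the larger gradient norm divided by the smaller one."""
    n_i = np.linalg.norm(np.asarray(g_i, dtype=float))
    n_j = np.linalg.norm(np.asarray(g_j, dtype=float))
    return float(max(n_i, n_j) / min(n_i, n_j))
```

A large norm ratio combined with a negative cosine similarity suggests that the summed gradient mostly follows one task while working against the other.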

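PROPOSED METHOD

Define three conditions of the multi-task optimization landscape that cause gradient interference.
Introduce PCGrad, which projects one task's gradient onto the normal plane of another task's gradient whenever the cosine similarity between the two is negative.
Provide a simple algorithm (Algorithm 1) for applying PCGrad with any gradient-based optimizer.
Analyze PCGrad theoretically in a two-task setting with convex, differentiable losses, and derive sufficient conditions under which it improves optimization.
Show that PCGrad is compatible with existing multi-task architectures (such as MTAN and routing networks) and improves their performance.
Evaluate PCGrad on multi-task supervised learning, multi-task RL, and goal-conditioned RL to assess data efficiency and performance.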
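
The projection step described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: it assumes each task gradient has already been flattened into a 1-D vector, visits the other tasks in random order, and sums the projected gradients into a single update direction. The name `pcgrad` and the `rng` parameter are hypothetical. When a gradient g_i conflicts with g_j (negative dot product), it is replaced by g_i - (g_i · g_j / ||g_j||^2) g_j, which removes the component of g_i that points against g_j.

```python
import numpy as np


def pcgrad(task_grads, rng=None):
    """Combine flattened per-task gradients with a PCGrad-style projection.

    task_grads: sequence of 1-D arrays, one gradient per task.
    rng: optional numpy Generator controlling the order in which the
         other tasks are visited (an assumption of this sketch).
    """
    rng = np.random.default_rng() if rng is None else rng
    grads = [np.asarray(g, dtype=float) for g in task_grads]

    projected = []
    for i, g_i in enumerate(grads):
        g_pc = g_i.copy()
        others = [j for j in range(len(grads)) if j != i]
        for j in rng.permutation(others):
            g_j = grads[j]  # always project against the original gradient
            dot = g_pc @ g_j
            if dot < 0.0:  # negative dot product means conflict
                # Remove the component of g_pc that opposes g_j.
                g_pc -= dot / (g_j @ g_j) * g_j
        projected.append(g_pc)

    # The combined update direction is the sum of the projected gradients.
    return np.sum(projected, axis=0)
```

For example, with `g1 = [1, 0]` and `g2 = [-1, 1]` the two gradients conflict, so `g1` becomes `[0.5, 0.5]` and `g2` becomes `[0, 1]`, giving a combined direction of `[0.5, 1.5]`. Gradients that do not conflict are passed through unchanged and simply summed. The result can then be handed to any gradient-based optimizer in place of the plain summed gradient.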

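EXPERIMENTAL RESULTS

Research Questions

RQ1: Does PCGrad reduce gradient interference and improve data efficiency in multi-task learning?
RQ2: Can PCGrad be combined with existing multi-task architectures for further performance gains?
RQ3: Is the proposed triad (conflicting gradients, dominating gradients, high curvature) a major cause of optimization difficulty in multi-task learning?
RQ4: How does PCGrad compare with baselines in supervised learning and reinforcement learning settings?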

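Key Findings

PCGrad substantially improves data efficiency and final performance on multi-task supervised learning and multi-task RL problems.
On CIFAR-100, combining PCGrad with routing networks yields an absolute improvement of 2.8 percentage points in test accuracy.
On CelebA, PCGrad achieves a lower average multi-task classification error than the prior method of Sener and Koltun (8.69 vs. 8.95).
MTAN combined with PCGrad achieves the best score in 8 of the 9 categories on the NYUv2 tasks.
On the Meta-World MT10/MT50 benchmarks, SAC+PCGrad outperforms the baselines in both success rate and data efficiency, solving more tasks with fewer samples.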

This review was generated by AI and reviewed by a human editor.