QUICK REVIEW

[Paper Digest] Space-Time Correspondence as a Contrastive Random Walk

Allan Jabri, Andrew Owens | arXiv (Cornell University) | Jun 25, 2020

Human Pose and Action Recognition | References: 113 | Cited by: 116

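One-sentence summary

This paper proposes a self-supervised method that treats visual space-time correspondence as a contrastive random walk on a space-time graph derived from video. Training is guided by cycle-consistency on palindrome sequences and is further strengthened by edge dropout and test-time adaptation.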

ABSTRACT

This paper proposes a simple self-supervised approach for learning a representation for visual correspondence from raw video. We cast correspondence as prediction of links in a space-time graph constructed from video. In this graph, the nodes are patches sampled from each frame, and nodes adjacent in time can share a directed edge. We learn a representation in which pairwise similarity defines transition probability of a random walk, so that long-range correspondence is computed as a walk along the graph. We optimize the representation to place high probability along paths of similarity. Targets for learning are formed without supervision, by cycle-consistency: the objective is to maximize the likelihood of returning to the initial node when walking along a graph constructed from a palindrome of frames. Thus, a single path-level constraint implicitly supervises chains of intermediate comparisons. When used as a similarity metric without adaptation, the learned representation outperforms the self-supervised state-of-the-art on label propagation tasks involving objects, semantic parts, and pose. Moreover, we demonstrate that a technique we call edge dropout, as well as self-supervised adaptation at test-time, further improve transfer for object-centric correspondence.

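Research motivation and goals

Learn representations from unlabeled video that capture visual correspondence across space and time.
Formulate correspondence as a path-finding problem on a space-time graph of video patches.
Use cycle-consistency on palindrome sequences to provide label-free supervision.
Improve robustness and transfer through edge dropout and test-time adaptation.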

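Proposed method

Construct a directed space-time graph whose nodes are patches from video frames and whose edges connect patches in adjacent frames according to learned similarity.
Learn a patch embedding phi such that pairwise similarities define the stochastic transition matrix of a random walk.
Train on palindrome sequences, which provide label-free targets and enforce cycle-consistency between the forward and backward walks.
Formulate learning as maximizing the likelihood that a walk along the path returns to its starting node, which can be viewed as a contrastive learning objective.
Apply edge dropout to the transition matrix, encouraging the walker to rely on alternative paths and improving the grouping of common-fate regions.
Optionally perform self-supervised adaptation at test time, fine-tuning the embedding on the unlabeled video before label propagation.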
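The steps above can be illustrated with a minimal NumPy sketch of the cycle-consistency objective. It is not the authors' implementation: the function names (`transition`, `cycle_loss`), the temperature value, and the rule of always keeping each row's strongest edge under dropout are assumptions made for illustration. Masking logits before the softmax is used here as one way to zero out edges and renormalize each row.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax; entries set to -inf receive probability 0.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def transition(emb_a, emb_b, temperature=0.07, edge_dropout=0.0, rng=None):
    """Row-stochastic transition matrix from nodes of frame a to nodes of frame b.

    emb_a, emb_b: (N, D) node embeddings, assumed L2-normalised.
    temperature and edge_dropout are illustrative hyperparameters.
    """
    logits = emb_a @ emb_b.T / temperature  # (N, N) pairwise similarities
    if edge_dropout > 0.0:
        if rng is None:
            rng = np.random.default_rng()
        drop = rng.random(logits.shape) < edge_dropout
        # Assumption: never drop the strongest edge, so every row keeps a valid edge.
        drop[np.arange(logits.shape[0]), logits.argmax(axis=1)] = False
        logits = np.where(drop, -np.inf, logits)
    return softmax(logits, axis=1)

def cycle_loss(frames_emb, **kwargs):
    """Negative log-likelihood of returning to the start node on a palindrome.

    frames_emb: sequence of (N, D) embeddings, one per frame.
    Returns (loss, walk), where walk is the (N, N) round-trip transition matrix.
    """
    frames = list(frames_emb)
    palindrome = frames + frames[-2::-1]  # t, ..., t+k, ..., t
    walk = np.eye(frames[0].shape[0])
    for a, b in zip(palindrome[:-1], palindrome[1:]):
        walk = walk @ transition(a, b, **kwargs)  # chain one-step transitions
    # Cross-entropy against the identity: each node should return to itself.
    p_return = np.clip(np.diag(walk), 1e-12, None)
    return -np.log(p_return).mean(), walk
```

In a real training setup the embeddings would come from a learned encoder and the loss would be minimized by gradient descent in an autodiff framework; the sketch only shows how the loss value is assembled from chained transition matrices.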

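Experimental results

Research questions

RQ1: Can a self-supervised representation learn robust visual correspondence from raw video?
RQ2: Can cycle-consistency on palindrome sequences provide supervision without ground-truth labels?
RQ3: Does edge dropout improve object-centric correspondence and segmentation?
RQ4: Does test-time self-supervision further improve transfer to downstream label propagation tasks?

Key findings

Used as a similarity metric for label propagation, without task-specific adaptation, the learned representation outperforms state-of-the-art self-supervised methods on tasks involving objects, pose keypoints, and semantic parts.
Increasing walk length during training improves downstream performance, suggesting that longer-range context is beneficial.
Edge dropout improves robustness by forcing the model to rely on multiple plausible paths, which improves object-centric correspondence.
Test-time self-supervised adaptation further improves object propagation quality, particularly the recall of the segmentation.
The method scales to longer walks and admits simple extensions that do not require complex supervision.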
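To make the label propagation setting in the findings concrete, the following is a simplified sketch of how a learned embedding could be used as a similarity metric to carry labels from one frame to another. The function name `propagate_labels`, the top-k restriction, and the values of `k` and `temperature` are illustrative assumptions; the evaluation protocol in the paper may differ in its details (for example, in how many context frames are used).

```python
import numpy as np

def propagate_labels(src_emb, src_labels, tgt_emb, k=5, temperature=0.07):
    """Propagate soft labels from source nodes to target nodes.

    src_emb:    (N, D) source-frame node embeddings (assumed L2-normalised)
    src_labels: (N, C) one-hot or soft labels for the source nodes
    tgt_emb:    (M, D) target-frame node embeddings
    Returns an (M, C) array of propagated label distributions.
    """
    sim = tgt_emb @ src_emb.T / temperature  # (M, N) similarities
    k = min(k, sim.shape[1])
    # Indices of the k most similar source nodes for each target node.
    topk = np.argpartition(-sim, k - 1, axis=1)[:, :k]
    top_sim = np.take_along_axis(sim, topk, axis=1)
    # Softmax over the k retained neighbours only.
    w = np.exp(top_sim - top_sim.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    # Weighted average of the neighbours' labels.
    return np.einsum('mk,mkc->mc', w, src_labels[topk])
```

Restricting the softmax to the top-k neighbours is a common way to keep propagated labels sharp; with `k` equal to the number of source nodes the function reduces to plain softmax attention over all of them.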


This digest was generated by AI and reviewed by human editors.