QUICK REVIEW

[Paper Review] Unsupervised Learning of Geometry with Edge-aware Depth-Normal Consistency

Zhenheng Yang, Peng Wang|arXiv (Cornell University)|Nov 10, 2017
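Advanced Vision and Imaging | References: 30 | Cited by: 104
One-Sentence Summary

This paper proposes an unsupervised framework that jointly estimates depth and surface normals from monocular video by enforcing geometric consistency and edge-aware smoothness, and it outperforms prior methods on KITTI 2015.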

ABSTRACT

Learning to reconstruct depths in a single image by watching unlabeled videos via deep convolutional network (DCN) is attracting significant attention in recent years. In this paper, we introduce a surface normal representation for unsupervised depth estimation framework. Our estimated depths are constrained to be compatible with predicted normals, yielding more robust geometry results. Specifically, we formulate an edge-aware depth-normal consistency term, and solve it by constructing a depth-to-normal layer and a normal-to-depth layer inside of the DCN. The depth-to-normal layer takes estimated depths as input, and computes normal directions using cross production based on neighboring pixels. Then given the estimated normals, the normal-to-depth layer outputs a regularized depth map through local planar smoothness. Both layers are computed with awareness of edges inside the image to help address the issue of depth/normal discontinuity and preserve sharp edges. Finally, to train the network, we apply the photometric error and gradient smoothness for both depth and normal predictions. We conducted experiments on both outdoor (KITTI) and indoor (NYUv2) datasets, and show that our algorithm vastly outperforms state of the art, which demonstrates the benefits from our approach.
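
The depth-to-normal step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' layer: it uses only the right and bottom neighbours of each pixel and omits the edge-aware weighting that the abstract mentions. The function names `backproject` and `depth_to_normal`, the pinhole intrinsics matrix `K`, and the camera-facing sign convention are assumptions made for illustration.

```python
import numpy as np

def backproject(depth, K):
    """Lift a depth map (H, W) to camera-space 3D points (H, W, 3).

    K is an assumed 3x3 pinhole intrinsics matrix.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)
    rays = pix @ np.linalg.inv(K).T  # K^-1 [u, v, 1]^T for every pixel
    return rays * depth[..., None]   # scale each ray by its depth

def depth_to_normal(depth, K):
    """Estimate unit normals from the cross product of neighbour differences.

    Simplification: only two neighbours per pixel and no edge weighting.
    Returns an array of shape (H-1, W-1, 3).
    """
    pts = backproject(depth, K)
    dx = pts[:-1, 1:] - pts[:-1, :-1]  # vector to the right neighbour
    dy = pts[1:, :-1] - pts[:-1, :-1]  # vector to the bottom neighbour
    n = np.cross(dy, dx)               # operand order chosen so normals face the camera (-z)
    n /= np.linalg.norm(n, axis=-1, keepdims=True) + 1e-12
    return n
```

For a fronto-parallel plane (constant depth), every normal from this sketch points straight back at the camera, which is a quick sanity check on the sign convention.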

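Motivation and Objectives

  • Motivate unsupervised learning of scene geometry (depth and normals) from monocular video.
  • Use view synthesis as the supervision signal to enforce geometric consistency.
  • Use depth-normal consistency as a regularizer to improve both depth and normal estimation.
  • Handle depth discontinuities and low-texture regions with an edge-aware smoothness term and an image-gradient term.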

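Proposed Method

  • An end-to-end CNN that learns camera motion, depth, and surface normals from monocular video sequences.
  • A photometric warping loss based on 3D inverse warping, used to synthesize the target view from source views.
  • An edge-aware smoothness loss that follows image gradients so that depth discontinuities are preserved.
  • An image-gradient matching loss that encourages sharp depth and better-aligned image gradients.
  • Explicit depth2normal and normal2depth layers that enforce geometric consistency between depth and normals.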
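
As an illustration of the edge-aware smoothness idea in the list above, here is a minimal NumPy sketch. It assumes the exponential image-gradient weighting that is common in unsupervised depth estimation; the paper's exact term may differ (for example in gradient order or weighting). The function name `edge_aware_smoothness` and the parameter `alpha` are hypothetical.

```python
import numpy as np

def edge_aware_smoothness(depth, image, alpha=1.0):
    """First-order depth smoothness, down-weighted where the image has edges.

    depth: (H, W) array; image: (H, W, 3) array with values in [0, 1].
    alpha (hypothetical) controls how strongly image edges relax the penalty.
    """
    # Absolute depth differences between horizontal and vertical neighbours.
    d_dx = np.abs(depth[:, 1:] - depth[:, :-1])
    d_dy = np.abs(depth[1:, :] - depth[:-1, :])
    # Image gradient magnitude, averaged over colour channels.
    i_dx = np.mean(np.abs(image[:, 1:] - image[:, :-1]), axis=-1)
    i_dy = np.mean(np.abs(image[1:, :] - image[:-1, :]), axis=-1)
    # Strong image gradients give weights near 0, so depth may jump there.
    w_x = np.exp(-alpha * i_dx)
    w_y = np.exp(-alpha * i_dy)
    return (d_dx * w_x).mean() + (d_dy * w_y).mean()
```

With this weighting, a depth step that coincides with an image edge is penalized far less than the same step inside a textureless region, which is the behaviour the smoothness bullet describes.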

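Experimental Results

Research Questions

  • RQ1: Can depth and surface normals be estimated jointly from monocular video in an unsupervised manner, using geometric and photometric constraints?
  • RQ2: How does explicit depth-normal geometric regularization affect the quality of depth and normal estimates?
  • RQ3: In low-texture regions, how do the edge-aware terms affect depth smoothness and discontinuities?

Key Findings

  • The framework achieves state-of-the-art results on the KITTI 2015 depth and normal evaluation metrics.
  • Adding depth-normal consistency through dedicated layers improves the quality of both the depth and normal maps.
  • Edge-aware smoothness and gradient-based losses help preserve depth discontinuities that align with image edges.
  • View-synthesis supervision (photometric warping) provides a strong geometric signal for learning from monocular video.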

This review was generated by AI and reviewed by human editors.