QUICK REVIEW

[Paper Review] Dynamo-Depth: Fixing Unsupervised Depth Estimation for Dynamical Scenes

Yihong Sun, Bharath Hariharan|arXiv (Cornell University)|Oct 29, 2023

Advanced Vision and Imaging|Cited by 11

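One-sentence summary

Dynamo-Depth jointly learns monocular depth, 3D independent flow, ego-motion, and motion segmentation from unlabeled videos, which disentangles dynamic objects and improves depth estimation in moving regions.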

ABSTRACT

Unsupervised monocular depth estimation techniques have demonstrated encouraging results but typically assume that the scene is static. These techniques suffer when trained on dynamical scenes, where apparent object motion can equally be explained by hypothesizing the object's independent motion, or by altering its depth. This ambiguity causes depth estimators to predict erroneous depth for moving objects. To resolve this issue, we introduce Dynamo-Depth, a unifying approach that disambiguates dynamical motion by jointly learning monocular depth, 3D independent flow field, and motion segmentation from unlabeled monocular videos. Specifically, we offer our key insight that a good initial estimation of motion segmentation is sufficient for jointly learning depth and independent motion despite the fundamental underlying ambiguity. Our proposed method achieves state-of-the-art performance on monocular depth estimation on Waymo Open and nuScenes Dataset with significant improvement in the depth of moving objects. Code and additional results are available at https://dynamo-depth.github.io.

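Motivation and objectives

Resolve the depth-motion ambiguity in unsupervised monocular depth estimation for dynamic scenes.
Use a 3D scene flow framework to decouple camera ego-motion from independent object motion.
Introduce a motion initialization strategy that bootstraps motion segmentation without labels.
Achieve state-of-the-art depth estimation on Waymo Open and nuScenes, with significant gains on moving objects.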
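
The depth-motion ambiguity named above can be made concrete with a toy calculation. The sketch below is not from the paper: it assumes a one-dimensional pinhole camera with a hypothetical focal length of 100 pixels, a camera that drives straight forward, and a single point on an object ahead. All function names are illustrative. It computes the depth a static-scene model would need in order to explain the observed pixel motion.

```python
import math

def project_x(x, z, f=100.0):
    """Pinhole projection of a point at lateral offset x and depth z (assumed toy model)."""
    return f * x / z

def depth_under_static_assumption(u, u_next, cam_forward):
    """Depth that explains the pixel shift u -> u_next if the point were static.

    From u = f*x/Z and u_next = f*x/(Z - cam_forward) it follows that
    Z = u_next * cam_forward / (u_next - u). Zero pixel motion implies infinite depth.
    """
    if math.isclose(u_next, u):
        return math.inf
    return u_next * cam_forward / (u_next - u)

def explained_depth(x, z, cam_forward, obj_forward, f=100.0):
    """Depth a static-scene model would infer for a point that actually moves forward by obj_forward."""
    u = project_x(x, z, f)
    # In the camera frame, the new depth is reduced by camera motion and increased by object motion.
    u_next = project_x(x, z - cam_forward + obj_forward, f)
    return depth_under_static_assumption(u, u_next, cam_forward)

# A point 20 m ahead and 2 m to the side, with the camera advancing 1 m between frames.
for obj_forward in (0.0, 0.5, 1.0):
    print(obj_forward, explained_depth(2.0, 20.0, 1.0, obj_forward))
```

In this toy setup a truly static point is recovered at its real depth, a point moving at half the camera speed is explained as being twice as far away, and a point moving at the camera's speed is explained as infinitely far away. The same pixel motion is therefore consistent with either independent motion or altered depth.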

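Proposed method

Predict depth, camera ego-motion, and 3D independent flow from unlabeled monocular videos.
Model independent motion with a complete flow network and a motion mask that gates the residual flow.
Compute rigid flow from depth and ego-motion, then combine it with the independent flow to reconstruct the target frame.
Use a two-stage motion initialization that freezes depth updates early in training to bootstrap motion segmentation.
Optimize primarily with a photometric reconstruction loss, regularized by edge-aware smoothness, motion consistency, sparsity, and ground-plane penalty terms.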
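
The flow composition described in the steps above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, `K` is an assumed pinhole intrinsics matrix, and `(R, t)` is assumed to map points from the target camera frame to the source camera frame. The networks that would predict depth, ego-motion, independent flow, and the mask are replaced by plain arrays.

```python
import numpy as np

def backproject(depth, K):
    """Lift every pixel to a 3D point in the camera frame: X = depth * K^-1 [u, v, 1]^T."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # column and row index grids, each (h, w)
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)  # (h, w, 3)
    rays = pix @ np.linalg.inv(K).T
    return rays * depth[..., None]

def compose_scene_flow(depth, K, R, t, indep_flow, motion_mask):
    """Return (rigid_flow, total_flow) as per-pixel 3D displacements of shape (h, w, 3)."""
    points = backproject(depth, K)
    # Rigid flow: displacement caused by camera ego-motion alone.
    rigid_flow = points @ R.T + t - points
    # The motion mask gates the independent flow so that it only acts on moving regions.
    total_flow = rigid_flow + motion_mask[..., None] * indep_flow
    return rigid_flow, total_flow

def project(points, K):
    """Project 3D camera-frame points to pixel coordinates (u, v)."""
    proj = points @ K.T
    return proj[..., :2] / proj[..., 2:3]

# Toy inputs: a 2x2 image at constant depth, with the camera moving 1 m forward.
depth = np.full((2, 2), 10.0)
K = np.array([[100.0, 0.0, 1.0], [0.0, 100.0, 1.0], [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.array([0.0, 0.0, -1.0])
indep_flow = np.zeros((2, 2, 3))
indep_flow[..., 2] = 1.0                       # object also moves 1 m forward
motion_mask = np.array([[1.0, 0.0], [0.0, 0.0]])  # only the top-left pixel is "moving"

rigid_flow, total_flow = compose_scene_flow(depth, K, R, t, indep_flow, motion_mask)
# Sampling locations in the source frame, usable for warping and photometric comparison.
source_pixels = project(backproject(depth, K) + total_flow, K)
```

In a training loop, `source_pixels` would drive a differentiable image warp, and the photometric difference between the warped source frame and the target frame would supply the main loss. That part is omitted here.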

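Experimental results

Research questions

RQ1: Can monocular depth estimation be learned reliably without supervision when dynamic objects are present?
RQ2: Compared with the static-scene assumption, does explicitly modeling 3D independent flow and a motion mask improve depth for moving objects?
RQ3: Does motion initialization in the early stage prevent the degenerate solutions that arise when depth and motion jointly explain the reconstruction?
RQ4: On Waymo Open and nuScenes, what gains are achieved in depth accuracy for moving objects and in motion segmentation?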

Key findings

Sem	D	Abs Rel (error)	Sq Rel (error)	RMSE (error)	RMSE log (error)	δ<1.25 (accuracy)	δ<1.25^2 (accuracy)	δ<1.25^3 (accuracy)
	K	0.115	0.882	4.701	0.190	0.879	0.961	0.982
	K	0.101	0.729	4.454	0.178	0.897	0.965	0.983
m	K	0.141	1.026	5.290	0.215	0.816	0.945	0.979
m	K	0.115	0.785	4.698	0.192	0.871	0.959	0.982
b	K	0.114	0.876	4.715	0.191	0.872	0.955	0.981
m	K	0.113	0.835	4.693	0.191	0.879	0.961	0.981
m	K	0.113	0.704	4.581	0.184	0.871	0.961	0.984
	K	0.110	0.719	4.486	0.184	0.878	0.964	0.984
	K	0.120	0.864	4.850	0.195	0.858	0.956	0.982
	N	0.193	2.285	7.357	0.287	0.765	0.885	0.935
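
The column names in the table match the conventional monocular depth metrics. The summary does not define them, so the sketch below assumes the standard definitions used in depth benchmarks. The function name and dictionary keys are illustrative, and any depth capping or median scaling applied before evaluation is left out.

```python
import numpy as np

def depth_metrics(pred, gt):
    """Conventional depth error and accuracy metrics over valid pixels (assumed definitions)."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    diff = pred - gt
    # Threshold accuracy uses the larger of the two ratios, so over- and under-estimates count alike.
    ratio = np.maximum(pred / gt, gt / pred)
    return {
        "abs_rel": float(np.mean(np.abs(diff) / gt)),
        "sq_rel": float(np.mean(diff ** 2 / gt)),
        "rmse": float(np.sqrt(np.mean(diff ** 2))),
        "rmse_log": float(np.sqrt(np.mean((np.log(pred) - np.log(gt)) ** 2))),
        "delta_1": float(np.mean(ratio < 1.25)),
        "delta_2": float(np.mean(ratio < 1.25 ** 2)),
        "delta_3": float(np.mean(ratio < 1.25 ** 3)),
    }

# Two pixels: one predicted at twice its true depth, one predicted exactly.
metrics = depth_metrics(pred=[2.0, 4.0], gt=[1.0, 4.0])
```

Lower is better for the four error columns, and higher is better for the three δ accuracy columns.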

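Achieves state-of-the-art depth accuracy on the Waymo Open and nuScenes datasets.
Delivers significant gains on moving objects, with up to a 62% gain in accuracy and a 68% relative reduction in error.
Reaches a motion segmentation F1 score of up to 71.8% without supervision.
Shows strong overall depth performance while explicitly handling dynamic regions.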

This review was generated by AI and reviewed by human editors.