QUICK REVIEW

[Paper Review] BA-Net: Dense Bundle Adjustment Network

Chengzhou Tang, Ping Tan|arXiv (Cornell University)|Jun 13, 2018

Advanced Vision and Imaging|References: 56|Cited by: 129

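One-Sentence Summary

BA-Net introduces a differentiable feature-metric bundle adjustment layer and a dense depth parameterization built on learned basis depth maps, making multi-view structure-from-motion (SfM) trainable end to end.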

ABSTRACT

This paper introduces a network architecture to solve the structure-from-motion (SfM) problem via feature-metric bundle adjustment (BA), which explicitly enforces multi-view geometry constraints in the form of feature-metric error. The whole pipeline is differentiable so that the network can learn suitable features that make the BA problem more tractable. Furthermore, this work introduces a novel depth parameterization to recover dense per-pixel depth. The network first generates several basis depth maps according to the input image and optimizes the final depth as a linear combination of these basis depth maps via feature-metric BA. The basis depth maps generator is also learned via end-to-end training. The whole system nicely combines domain knowledge (i.e. hard-coded multi-view geometry constraints) and deep learning (i.e. feature learning and basis depth maps learning) to address the challenging dense SfM problem. Experiments on large scale real data prove the success of the proposed method.
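
To make the two ideas in the abstract concrete, the block below writes them out as equations. It is a sketch in assumed notation, not a transcription of the paper: the symbols $B_k$ (basis depth maps), $w_k$ (combination weights), $F_i$ (learned feature map of image $i$), $T_i$ (camera pose), $\pi$ (projection into image $i$), and $q_j$ (a pixel in the reference image) are chosen here for illustration.

```latex
\begin{align}
  % Dense depth of the reference image as a linear combination of K basis maps
  D &= \sum_{k=1}^{K} w_k \, B_k \\
  % Feature-metric residual of pixel q_j between image i and reference image 1,
  % where d_j is the depth that D assigns to q_j
  e^{f}_{i,j}(\mathcal{X}) &= F_i\big(\pi(T_i,\; d_j \, q_j)\big) - F_1(q_j) \\
  % BA objective over the camera poses and the depth weights
  \mathcal{X}^{*} &= \arg\min_{\mathcal{X}} \sum_{i} \sum_{j}
      \big\| e^{f}_{i,j}(\mathcal{X}) \big\|^{2},
  \qquad \mathcal{X} = \{T_i\} \cup \{w_k\}
\end{align}
```

Under this reading, the optimizer only has to solve for the camera poses and $K$ weights rather than one depth value per pixel, which is what keeps dense depth tractable inside BA.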

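Motivation and Goals

Incorporate multi-view geometry constraints into a learnable SfM pipeline through a differentiable BA layer.
Learn feature representations that make bundle adjustment optimization robust.
Develop a compact, learnable basis-depth parameterization for dense depth maps so that the whole system can be trained end to end.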

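Proposed Method

Introduce a differentiable BA-Layer that minimizes the feature-metric error across multiple views.
Build a CNN-based feature pyramid (learned features) that gives the BA optimization stable multi-scale inputs.
Parameterize dense depth as a linear combination of 128 basis depth maps generated by an encoder-decoder network.
Predict the LM damping factor lambda with an MLP so that Levenberg-Marquardt optimization becomes differentiable.
Optimize coarse to fine, running differentiable LM steps over the feature pyramid and reprojections, with 5 iterations per level (15 in total).
Train the backbone, feature pyramid, damping predictor, and basis depth generator end to end with supervised pose and depth losses.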
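
The damped update at the core of the steps above can be sketched in a few lines of NumPy. This is a toy under stated assumptions, not BA-Net's implementation: the residual is a simple linear one (`basis @ weights - observed`) standing in for the feature-metric error, the damping factor `lam` is a fixed constant rather than an MLP prediction, and the names `lm_step` and `refine_weights` are hypothetical.

```python
import numpy as np


def lm_step(residual, jacobian, lam):
    """One Levenberg-Marquardt update: solve (J^T J + lam * diag(J^T J)) dx = -J^T r."""
    jtj = jacobian.T @ jacobian
    damped = jtj + lam * np.diag(np.diag(jtj))
    return np.linalg.solve(damped, -jacobian.T @ residual)


def refine_weights(basis, observed, weights, lam=0.1, iters_per_level=5, levels=3):
    """Run levels * iters_per_level LM steps on the basis-map weights.

    Assumption: this toy reuses the same residual at every level. In a real
    coarse-to-fine scheme each level would use a different pyramid resolution.
    """
    for _ in range(levels):
        for _ in range(iters_per_level):
            residual = basis @ weights - observed  # toy stand-in residual
            # For this linear residual the Jacobian w.r.t. weights is `basis`.
            weights = weights + lm_step(residual, basis, lam)
    return weights


# Synthetic data: 128 basis maps of 64x48 pixels, flattened to columns.
rng = np.random.default_rng(0)
num_pixels, num_basis = 64 * 48, 128
basis = rng.standard_normal((num_pixels, num_basis))
true_weights = rng.standard_normal(num_basis)
observed = basis @ true_weights

estimate = refine_weights(basis, observed, np.zeros(num_basis))
```

Because every operation here is a matrix product or a linear solve, the same computation written in an autodiff framework would let gradients flow through all 15 steps, which is the property the BA-Layer relies on.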

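Experimental Results

Research Questions

RQ1: Does a differentiable feature-metric BA layer allow features for SfM to be learned end to end while still enforcing multi-view geometry constraints?
RQ2: Does a learned basis-depth parameterization improve dense depth recovery and optimization convergence in multi-view settings?
RQ3: How does feature learning tailored to BA compare with photometric and geometric BA, and with prior SfM networks, on real datasets?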

Key Findings

Method	Rotation (deg)	Translation (cm)	Translation (deg)	Abs Rel	Sq Rel	RMSE (linear)	RMSE (log)	RMSE (log, scale-invariant)
Ours	1.018	3.39	20.577	0.161	0.092	0.346	0.214	0.184
Ours*	1.587	10.81	31.005	0.238	0.176	0.488	0.279	0.276
DeMoN*	3.791	15.5	31.626	0.231	0.520	0.761	0.289	0.284
Photometric BA	4.409	21.40	34.36	0.268	0.427	0.788	0.330	0.323
Geometric BA	8.56	36.995	39.392	0.382	0. -	0.876	0.366	0.357
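
For readers unfamiliar with the five depth-error columns in the table above, the sketch below computes them using the definitions commonly used in the depth-estimation literature. It assumes the paper follows these conventions; the function name `depth_metrics` and the dictionary keys are hypothetical, and inputs are assumed to be strictly positive depths over valid pixels only.

```python
import numpy as np


def depth_metrics(pred, gt):
    """Common depth-error metrics for predicted vs. ground-truth depth (both > 0)."""
    pred = np.asarray(pred, dtype=float).ravel()
    gt = np.asarray(gt, dtype=float).ravel()
    diff = pred - gt
    log_diff = np.log(pred) - np.log(gt)
    return {
        "abs_rel": np.mean(np.abs(diff) / gt),       # absolute relative difference
        "sq_rel": np.mean(diff ** 2 / gt),           # squared relative difference
        "rmse": np.sqrt(np.mean(diff ** 2)),         # RMSE in linear depth
        "rmse_log": np.sqrt(np.mean(log_diff ** 2)), # RMSE in log depth
        # Scale-invariant log RMSE: removing the mean log offset makes a
        # global scale error in the prediction cost nothing.
        "rmse_log_si": np.std(log_diff),
    }


gt = np.array([1.0, 2.0, 4.0])
perfect = depth_metrics(gt, gt)
doubled = depth_metrics(2.0 * gt, gt)
```

With `doubled`, the prediction is wrong by a constant factor of two, so the scale-invariant metric stays at zero while the other four do not. This is why that column is reported separately for methods whose depth is only recovered up to scale.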

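BA-Net outperforms DeMoN, LS-Net, and conventional BA baselines on the ScanNet and KITTI datasets.
Feature-metric BA with learned features yields a smoother objective landscape and better convergence than BA on RGB values or pretrained CNN features.
Dense depth is represented effectively as a learned linear combination of basis depth maps, with improved alignment to object boundaries.
Differentiable LM with a learned damping factor makes end-to-end training, and backpropagation through the BA procedure, possible.
On KITTI, BA-Net achieves better camera trajectory and depth metrics than both supervised and unsupervised baselines.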

This review was generated by AI and checked by a human editor.