QUICK REVIEW

[Paper Review] BA-Net: Dense Bundle Adjustment Network

Chengzhou Tang, Ping Tan|arXiv (Cornell University)|Jun 13, 2018

Advanced Vision and Imaging|56 references|129 citations

TL;DR

BA-Net introduces a differentiable feature-metric bundle adjustment layer and a dense depth parameterization via learned basis depth maps, enabling end-to-end training for structure-from-motion on multiple views.

ABSTRACT

This paper introduces a network architecture to solve the structure-from-motion (SfM) problem via feature-metric bundle adjustment (BA), which explicitly enforces multi-view geometry constraints in the form of feature-metric error. The whole pipeline is differentiable so that the network can learn suitable features that make the BA problem more tractable. Furthermore, this work introduces a novel depth parameterization to recover dense per-pixel depth. The network first generates several basis depth maps according to the input image and optimizes the final depth as a linear combination of these basis depth maps via feature-metric BA. The basis depth maps generator is also learned via end-to-end training. The whole system nicely combines domain knowledge (i.e. hard-coded multi-view geometry constraints) and deep learning (i.e. feature learning and basis depth maps learning) to address the challenging dense SfM problem. Experiments on large scale real data prove the success of the proposed method.
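
One way to write the feature-metric error that the abstract refers to is sketched below. The notation is an assumption made for illustration and is not given in this review: F_i is the learned feature map of image i, T_i its camera pose, d_j the depth of pixel q_j in the reference image (index 1), and \pi the projection of the back-projected pixel into view i. The unknowns \mathcal{X} collect the poses and the depth parameters.

```latex
e_{i,j}(\mathcal{X}) = F_i\big(\pi(T_i,\; d_j \cdot q_j)\big) - F_1(q_j),
\qquad
\mathcal{X}^{*} = \arg\min_{\mathcal{X}} \sum_{i} \sum_{j} \big\lVert e_{i,j}(\mathcal{X}) \big\rVert^{2}
```

Replacing the learned features F with raw pixel intensities would turn this into the photometric error used by the photometric BA baseline.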

Motivation & Objective

Incorporate multi-view geometry constraints into a learnable SfM pipeline via a differentiable BA layer.
Learn feature representations tailored for bundle adjustment to improve optimization robustness.
Develop a compact, learnable basis-depth parameterization for dense depth maps to enable end-to-end training.

Proposed method

Introduce a differentiable BA-Layer that minimizes feature-metric error across multiple views.
Construct a CNN-based feature pyramid (learned features) to provide stable, multi-scale inputs for BA optimization.
Parameterize dense depth as a linear combination of 128 basis depth maps generated by an encoder-decoder network.
Predict the LM damping factor lambda through an MLP to enable differentiable Levenberg–Marquardt optimization.
Perform coarse-to-fine optimization with differentiable LM steps and feature warping across the feature pyramid, running 5 iterations per level (15 in total, i.e. 3 pyramid levels).
Train the backbone, feature pyramid, damping predictor, and basis-depth generator end-to-end with supervised losses on pose and depth.
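
The damped Levenberg–Marquardt update listed above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the residual is a toy curve-fitting problem standing in for the feature-metric error, the damping factor `lam` is a fixed input rather than the output of an MLP, and the names `lm_step`, `residual`, `jacobian`, and `optimize` are hypothetical.

```python
import numpy as np

def lm_step(residual_fn, jacobian_fn, x, lam):
    """One damped Levenberg-Marquardt update for 0.5 * ||r(x)||^2."""
    r = residual_fn(x)                  # residual vector, shape (m,)
    J = jacobian_fn(x)                  # Jacobian, shape (m, n)
    H = J.T @ J                         # Gauss-Newton approximation of the Hessian
    # Damping scales the diagonal of H. Here lam is a plain number; the
    # review states that BA-Net predicts it with an MLP instead.
    A = H + lam * np.diag(np.diag(H))
    delta = np.linalg.solve(A, -J.T @ r)
    return x + delta

# Toy stand-in residual (NOT the feature-metric error): fit y = a * exp(b * t).
t = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(-1.5 * t)

def residual(p):
    a, b = p
    return a * np.exp(b * t) - y

def jacobian(p):
    a, b = p
    e = np.exp(b * t)
    return np.stack([e, a * t * e], axis=1)   # columns: d r / d a, d r / d b

def optimize(x0, lam=1e-2, iters=5):
    """Run a fixed number of LM steps, echoing the 5 iterations per level above."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = lm_step(residual, jacobian, x, lam)
    return x

x0 = np.array([1.0, -1.0])
x_opt = optimize(x0)
print("initial cost:", np.sum(residual(x0) ** 2))
print("final cost:  ", np.sum(residual(x_opt) ** 2))
```

Every operation in `lm_step` is a matrix product or a linear solve and the loop has no data-dependent branching, so the same structure can be unrolled and back-propagated through in an autodiff framework.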

Experimental results

Research questions

RQ1: Can a differentiable feature-metric BA layer enforce multi-view geometry constraints while allowing end-to-end learning of features for SfM?
RQ2: Does learning a basis-depth parameterization improve dense depth recovery and optimization convergence in multi-view scenarios?
RQ3: How does feature learning tailored for BA compare to photometric/geometric BA and prior SfM networks on real datasets?

Key findings

Method	Rotation (degree)	Translation (cm)	Translation (degree)	abs relative difference	sqr relative difference	RMSE (linear)	RMSE (log)	RMSE (log, scale inv.)
Ours	1.018	3.39	20.577	0.161	0.092	0.346	0.214	0.184
Ours*	1.587	10.81	31.005	0.238	0.176	0.488	0.279	0.276
DeMoN*	3.791	15.5	31.626	0.231	0.520	0.761	0.289	0.284
Photometric BA	4.409	21.40	34.36	0.268	0.427	0.788	0.330	0.323
Geometric BA	8.56	36.995	39.392	0.382	0. -	0.876	0.366	0.357
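
For readers unfamiliar with the depth columns in the table, the sketch below computes the commonly used definitions of these error measures. It is an illustration under assumptions: the review does not state the exact variants or masking rules used in the paper, and the function name `depth_metrics` and its dictionary keys are hypothetical.

```python
import numpy as np

def depth_metrics(pred, gt):
    """Common depth error measures; lower is better for all of them."""
    pred = np.asarray(pred, dtype=float).ravel()
    gt = np.asarray(gt, dtype=float).ravel()
    valid = (gt > 0) & (pred > 0)       # assumption: ignore non-positive depths
    pred, gt = pred[valid], gt[valid]
    diff = pred - gt
    log_diff = np.log(pred) - np.log(gt)
    return {
        "abs_rel": np.mean(np.abs(diff) / gt),
        "sq_rel": np.mean(diff ** 2 / gt),
        "rmse": np.sqrt(np.mean(diff ** 2)),
        "rmse_log": np.sqrt(np.mean(log_diff ** 2)),
        # Standard deviation of the log error: unchanged if pred is
        # multiplied by any global scale factor.
        "rmse_log_scale_inv": np.std(log_diff),
    }

gt = np.array([1.0, 2.0, 4.0])
print(depth_metrics(2.0 * gt, gt))      # prediction off by a global scale of 2
```

In the usage line the prediction is wrong only by a global scale, so the scale-invariant measure is essentially zero while the other four are not.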

BA-Net outperforms DeMoN, LS-Net, and conventional BA baselines on the ScanNet and KITTI datasets.
Feature-metric BA with learned features yields smoother objective landscapes and better convergence than RGB or pretrained CNN features.
Dense depth is effectively produced as a learned linear combination of basis maps, improving consistency with object boundaries.
The differentiable LM with a learned damping factor enables end-to-end training and back-propagation through the BA process.
On KITTI, BA-Net achieves superior camera trajectories and depth metrics compared to supervised and unsupervised baselines.
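
The basis-depth parameterization mentioned in the findings above reduces a dense depth map to a small weight vector. The sketch below shows the idea with random arrays standing in for the network-generated basis maps. The function name `depth_from_basis`, the array shapes, and the clamp to non-negative depth are assumptions made for illustration.

```python
import numpy as np

def depth_from_basis(basis, weights):
    """Combine K basis depth maps into one dense depth map.

    basis:   array of shape (K, H, W), one basis depth map per channel
    weights: array of shape (K,), the low-dimensional depth parameters
    """
    depth = np.tensordot(weights, basis, axes=1)   # weighted sum, shape (H, W)
    return np.maximum(depth, 0.0)                  # assumption: keep depth non-negative

# Random stand-ins for the 128 learned basis maps.
rng = np.random.default_rng(0)
basis = rng.random((128, 48, 64))
weights = rng.normal(size=128) / 128
depth = depth_from_basis(basis, weights)
print(depth.shape)
```

With this parameterization the optimizer updates only the 128 weights (plus the camera poses) instead of one depth value per pixel, which keeps the BA problem small.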

This review was created by AI and reviewed by human editors.