QUICK REVIEW

[Paper Review] Propagating Confidences through CNNs for Sparse Data Regression

Abdelrahman Eldesokey, Michael Felsberg | arXiv (Cornell University) | May 30, 2018

Advanced Neural Network Applications | 75 citations

TL;DR

Introduces an algebraically-constrained normalized convolution layer for CNNs that handles sparse inputs by propagating continuous confidences through layers, enabling dense outputs and pixel-wise confidence maps for depth completion with far fewer parameters.

ABSTRACT

In most computer vision applications, convolutional neural networks (CNNs) operate on dense image data generated by ordinary cameras. Designing CNNs for sparse and irregularly spaced input data is still an open problem with numerous applications in autonomous driving, robotics, and surveillance. To tackle this challenging problem, we introduce an algebraically-constrained convolution layer for CNNs with sparse input and demonstrate its capabilities for the scene depth completion task. We propose novel strategies for determining the confidence from the convolution operation and propagating it to consecutive layers. Furthermore, we propose an objective function that simultaneously minimizes the data error while maximizing the output confidence. Comprehensive experiments are performed on the KITTI depth benchmark and the results clearly demonstrate that the proposed approach achieves superior performance while requiring three times fewer parameters than the state-of-the-art methods. Moreover, our approach produces a continuous pixel-wise confidence map enabling information fusion, state inference, and decision support.

Motivation & Objective

Address the challenge of regression on sparse, irregular input data in vision tasks.
Develop an algebraically-constrained normalized convolution operator that propagates continuous confidence through layers.
Enforce non-negativity constraints on weights to maintain valid confidences and design a loss balancing data error with output confidence.
Propose a multi-scale network with shared weights that stays compact while enlarging the receptive field.
Demonstrate state-of-the-art depth completion on KITTI with far fewer parameters and provide pixel-wise confidence maps for fusion and decision support.

Proposed method

Use a normalized convolution framework with a confidence mask to handle sparse inputs.
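Learn the applicability (localization) function $a$ under a non-negativity constraint, enforced during training by passing the weights $W$ through a differentiable function $\Gamma(\cdot)$ (e.g., softplus).
Define the forward pass as $Z^{l}_{i,j} = \frac{\sum_{m,n} Z^{l-1}_{i+m,j+n}\, C^{l-1}_{i+m,j+n}\, \Gamma(W_{m,n})}{\sum_{m,n} C^{l-1}_{i+m,j+n}\, \Gamma(W_{m,n}) + \epsilon}$, where the small constant $\epsilon$ sits in the denominator to prevent division by zero.
Propagate confidences using a geometric ratio of Gramian determinants, instantiated as $C^{l}_{i,j} = \frac{\sum_{m,n} C^{l-1}_{i+m,j+n}\, \Gamma(W_{m,n}) + \epsilon}{\sum_{m,n} \Gamma(W_{m,n})}$.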
Introduce a loss combining data error (Huber norm) with a confidence term whose weight decays with the epoch number, so that maximizing output confidence does not come to dominate the loss.
Adopt a hierarchical multi-scale architecture with shared weights and scale-fusion via normalized convolution that leverages confidence maps.
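
The layer update listed above can be illustrated with a minimal plain-Python sketch of a single-channel normalized convolution step. This is not the authors' implementation: the function names `softplus` and `nconv2d`, the zero padding at the borders, the single-channel setting, the default `eps` value, and the placement of `eps` in the denominator of the data update are all assumptions made for illustration.

```python
import math


def softplus(x):
    """Map any real number to a positive one: log(1 + e^x), in a stable form."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def nconv2d(z_prev, c_prev, w, eps=1e-20):
    """One single-channel normalized convolution step (hypothetical helper).

    z_prev: 2D list of data values (ignored wherever confidence is 0)
    c_prev: 2D list of confidences, same shape as z_prev
    w:      2D list of unconstrained kernel weights, odd height and width
    Returns (z_out, c_out), both with the same shape as the input.
    """
    h, wd = len(z_prev), len(z_prev[0])
    kh, kw = len(w), len(w[0])
    # Gamma(W): softplus keeps every kernel tap strictly positive.
    gamma = [[softplus(v) for v in row] for row in w]
    gamma_sum = sum(sum(row) for row in gamma)
    z_out = [[0.0] * wd for _ in range(h)]
    c_out = [[0.0] * wd for _ in range(h)]
    for i in range(h):
        for j in range(wd):
            num = 0.0  # sum of Z_prev * C_prev * Gamma(W)
            den = 0.0  # sum of C_prev * Gamma(W)
            for m in range(kh):
                for n in range(kw):
                    y, x = i + m - kh // 2, j + n - kw // 2
                    # Assumption: zero padding, so out-of-range taps add nothing.
                    if 0 <= y < h and 0 <= x < wd:
                        g = gamma[m][n]
                        num += z_prev[y][x] * c_prev[y][x] * g
                        den += c_prev[y][x] * g
            z_out[i][j] = num / (den + eps)        # confidence-weighted average
            c_out[i][j] = (den + eps) / gamma_sum  # propagated confidence
    return z_out, c_out


# Toy input: a 3x3 map with one valid measurement (5.0 m) in the centre.
z = [[0.0] * 3 for _ in range(3)]
c = [[0.0] * 3 for _ in range(3)]
z[1][1], c[1][1] = 5.0, 1.0
w = [[0.0] * 3 for _ in range(3)]  # softplus(0) = ln 2 for every tap
z_out, c_out = nconv2d(z, c, w)
```

In this toy case every output pixel is filled with approximately 5.0, and each receives a confidence of about 1/9, because only one of the nine equally weighted kernel taps carries a measurement. Stacking such steps lets the confidence map grow denser from layer to layer, which is the propagation idea the review describes.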

Experimental results

Research questions

RQ1: Can continuous confidences be propagated through CNN layers for sparse data regression tasks?
RQ2: Does an algebraically-constrained normalized convolution improve depth completion with sparse inputs while reducing parameter count?
RQ3: What is the effect of multi-scale fusion leveraging confidence information on reconstruction accuracy and uncertainty estimates?
RQ4: How does the proposed method compare to state-of-the-art sparse-depth completion methods on KITTI in terms of accuracy and model size?

Key findings

Method	MAE [m]	RMSE [m]	MRE	delta<1.01	delta<1.01^2	delta<1.01^3	#Params	Output Conf.
CNN	0.78	2.97	-	-	-	-	2.5e4	No
CNN+mask	0.79	2.24	-	-	-	-	2.5e4	No
SparseConv	0.58	1.80	0.035	0.33	0.65	0.82	2.5e4	No
Sparse-To-Dense	0.70	1.68	0.039	0.21	0.41	0.59	3.4e6	No
DCCS-1-Layer	0.83	2.77	0.054	0.30	0.47	0.59	1.0e3	No
DCCS-2-Layers	0.47	1.45	0.028	0.41	0.68	0.80	1.8e3	No
DCCS-3-Layers	0.43	1.35	0.024	0.48	0.73	0.83	1.7e3	No
NConv-1-Scale(16ch)	0.40	1.58	0.022	0.60	0.81	2.5e4	Yes
NConv-1-Scale(4ch)	0.42	1.59	0.022	0.59	0.80	2.0e3	Yes
NConv-HMS	0.38	1.37	0.021	0.60	0.81	4.8e2	Yes
NConv-SF-STD	0.53	3.0	0.037	0.59	0.80	4.8e2	No
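
For readers unfamiliar with the column headers above, the following is a rough sketch of how such error metrics are commonly computed in the depth-estimation literature. The review does not define them, so the formulas are assumptions: in particular, the delta columns are taken here to be the fraction of pixels whose ratio max(pred/gt, gt/pred) falls below 1.01, 1.01^2, and 1.01^3. The function name `depth_metrics` is hypothetical.

```python
import math


def depth_metrics(pred, target, base=1.01):
    """Error metrics over paired lists of predicted and true depths in metres.

    Assumes only pixels with ground truth are passed in, and that all
    predicted and true depths are strictly positive.
    """
    n = len(target)
    abs_err = [abs(p - t) for p, t in zip(pred, target)]
    mae = sum(abs_err) / n                             # mean absolute error
    rmse = math.sqrt(sum(e * e for e in abs_err) / n)  # root mean squared error
    mre = sum(e / t for e, t in zip(abs_err, target)) / n  # mean relative error
    ratio = [max(p / t, t / p) for p, t in zip(pred, target)]
    # Fraction of pixels within each multiplicative threshold base^k.
    deltas = [sum(r < base ** k for r in ratio) / n for k in (1, 2, 3)]
    return {"MAE": mae, "RMSE": rmse, "MRE": mre, "delta": deltas}


metrics = depth_metrics([10.0, 20.0], [10.0, 25.0])
```

For this two-pixel example the MAE is 2.5 m, the MRE is 0.1, and each delta score is 0.5, since only the exactly matching pixel falls inside the tight 1.01-based thresholds.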

The proposed NConv-HMS architecture achieves state-of-the-art results on the KITTI depth benchmark while using only about 480 parameters (4.8e2).
Single-scale NConv-1-Scale(16ch) has lower MAE (0.40 m) and MRE (0.022) than every baseline in the table, although its RMSE (1.58 m) trails DCCS-2-Layers and DCCS-3-Layers, demonstrating the benefit of continuous confidences over binary masks.
The compact NConv-1-Scale(4ch) stays competitive (MAE 0.42 m, RMSE 1.59 m) with about 2.0e3 parameters instead of 2.5e4.
Multi-scale fusion with confidence-aware normalized convolution (NConv-HMS) brings RMSE to 1.37 m, close to the best multi-layer baseline (DCCS-3-Layers, 1.35 m), while keeping the parameter count very low.
Confidence-based scale fusion (NConv-HMS) clearly outperforms standard fusion that ignores confidence information (NConv-SF-STD): MAE 0.38 m vs. 0.53 m and RMSE 1.37 m vs. 3.0 m.
On the test set, the proposed method surpasses published state-of-the-art methods, including DCCS-3-Layers, in overall performance.

This review was created by AI and reviewed by human editors.