QUICK REVIEW

[Paper Review] In Defense of Classical Image Processing: Fast Depth Completion on the CPU

Jason S. Ku, Ali Harakeh|arXiv (Cornell University)|Jan 31, 2018

Advanced Vision and Imaging|13 references|25 citations

TL;DR

This paper proposes a fast, non-learning, CPU-based depth completion algorithm that applies classical image processing operations (dilation, hole filling, and Gaussian blurring) to sparse LIDAR depth maps. With a reported RMSE of 1350.93 mm, it ranked first among published methods on the KITTI depth completion benchmark at the time of submission, outperforming deep learning-based methods while running at 90 Hz with no training data and no GPU.

ABSTRACT

With the rise of data driven deep neural networks as a realization of universal function approximators, most research on computer vision problems has moved away from hand crafted classical image processing algorithms. This paper shows that with a well designed algorithm, we are capable of outperforming neural network based methods on the task of depth completion. The proposed algorithm is simple and fast, runs on the CPU, and relies only on basic image processing operations to perform depth completion of sparse LIDAR depth data. We evaluate our algorithm on the challenging KITTI depth completion benchmark, and at the time of submission, our method ranks first on the KITTI test server among all published methods. Furthermore, our algorithm is data independent, requiring no training data to perform the task at hand. The code written in Python will be made publicly available at https://github.com/kujason/ip_basic.

Motivation & Objective

To demonstrate that well-designed classical image processing algorithms can outperform deep learning-based methods in depth completion.
To develop a fast, real-time depth completion algorithm that runs efficiently on CPU without requiring GPU acceleration.
To create a data-independent method that requires no training data, avoiding overfitting and improving robustness.
To provide a strong, interpretable baseline for depth completion that is simpler and more efficient than complex neural networks.
To validate the effectiveness of traditional image processing in modern computer vision tasks like depth completion.

Proposed method

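The algorithm begins by inverting the valid depths and dilating the sparse depth map to expand depth regions. Because dilation keeps the largest value in each neighbourhood, the inversion lets closer depths take precedence over farther ones, so the edges of nearby objects are not overwritten.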
Small holes are closed using morphological operations, followed by extension of depth values to the top of the image frame to reduce artifacts.
Large holes are filled using a combination of morphological dilation and Gaussian blur to propagate depth values while preserving structure.
A two-stage blurring process applies median and Gaussian filters to reduce noise and smooth depth planes without distorting object edges.
The final output is obtained by inverting the processed depth map to restore original depth values.
The method relies solely on standard image processing operations: no neural networks, no training data, and no image guidance.
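
The first and last steps above (inversion, dilation, inversion back) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, which is available at the repository linked in the abstract. The names `MAX_DEPTH`, `diamond_kernel`, `dilate`, and `complete_depth` are hypothetical, and the inversion constant and kernel radius are assumed values chosen for illustration. Empty pixels are assumed to be encoded as 0, and valid depths are assumed to be smaller than `MAX_DEPTH`.

```python
import numpy as np

MAX_DEPTH = 100.0  # assumed inversion constant in metres (illustrative)


def diamond_kernel(radius):
    """Boolean diamond-shaped structuring element of the given radius."""
    offsets = np.abs(np.arange(-radius, radius + 1))
    return (offsets[:, None] + offsets[None, :]) <= radius


def dilate(image, kernel):
    """Grey-scale dilation: each pixel takes the max over its kernel neighbourhood.

    Assumes a non-negative image, so zero padding never wins the max.
    """
    kh, kw = kernel.shape
    pad_y, pad_x = kh // 2, kw // 2
    padded = np.pad(image, ((pad_y, pad_y), (pad_x, pad_x)), constant_values=0.0)
    out = np.zeros_like(image)
    height, width = image.shape
    for dy in range(kh):
        for dx in range(kw):
            if kernel[dy, dx]:
                # Shifted view of the image for this kernel offset.
                out = np.maximum(out, padded[dy:dy + height, dx:dx + width])
    return out


def complete_depth(sparse_depth, radius=2):
    """Invert valid depths, dilate, then invert back. Empty pixels stay 0."""
    depth = sparse_depth.astype(np.float64)
    valid = depth > 0.0
    # After inversion, closer points have larger values and win the dilation.
    inverted = np.where(valid, MAX_DEPTH - depth, 0.0)
    dilated = dilate(inverted, diamond_kernel(radius))
    filled = dilated > 0.0
    return np.where(filled, MAX_DEPTH - dilated, 0.0)


# Usage: two measurements on an otherwise empty 5x5 depth map.
sparse = np.zeros((5, 5))
sparse[2, 2] = 10.0
sparse[2, 3] = 30.0
dense = complete_depth(sparse, radius=1)
```

With `radius=1`, each measurement spreads to its four direct neighbours, and where the two neighbourhoods overlap the closer depth is kept. The hole closing, extension to the top of the frame, and blurring steps are omitted from this sketch.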

Figure 1: A flowchart of the proposed algorithm. Clockwise starting at top left: Input LIDAR depth map (enhanced for visibility), inversion and dilation, small hole closure, small hole fill, extension to top of frame, large hole fill and blur, inversion for output, image of scene (not used, only for

Experimental results

Research questions

RQ1: Can classical image processing techniques outperform deep learning-based methods in depth completion?
RQ2: Is it possible to achieve state-of-the-art performance on the KITTI depth completion benchmark using only CPU-based, non-learning algorithms?
RQ3: How does the choice of morphological kernel shape and size affect the performance of a classical depth completion pipeline?
RQ4: What is the optimal combination of blurring techniques (e.g., median, Gaussian, bilateral) for minimizing depth error in sparse input scenarios?
RQ5: Can a data-independent, non-trainable algorithm achieve real-time performance (90 Hz) on CPU while maintaining high accuracy?

Key findings

The proposed algorithm achieved an RMSE of 1350.93 mm and MAE of 305.35 mm on the KITTI depth completion benchmark, ranking first among all published methods at the time of submission.
The algorithm runs at 90 Hz on CPU, demonstrating real-time performance without requiring GPU acceleration or model inference hardware.
Using a combination of median and Gaussian blur reduced RMSE by more than 150 mm compared to no blur, while keeping the per-frame runtime at about 0.011 seconds (roughly 90 Hz).
The Gaussian blur variant achieved the lowest RMSE (1350.93 mm), while the bilateral blur version preserved object structure better and is recommended for practical applications.
The algorithm outperformed a custom sparsity-invariant convolutional neural network (SIC-Net) by a significant margin, despite being non-learning and non-trainable.
The method is robust to image quality and calibration errors since it does not rely on color images or synchronized sensors, making it suitable for embedded deployment.
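
For readers unfamiliar with the reported metrics, the following minimal sketch shows how RMSE and MAE in millimetres could be computed for a single depth map, and how a per-frame runtime relates to a frame rate. It assumes depths are stored in metres, that pixels without ground truth are encoded as 0, and that only pixels with ground truth are scored. The function names `depth_errors_mm` and `frames_per_second` are hypothetical, and this is not the official KITTI evaluation code.

```python
import numpy as np


def depth_errors_mm(predicted_m, ground_truth_m):
    """Return (RMSE, MAE) in millimetres over pixels that have ground truth."""
    valid = ground_truth_m > 0.0  # assumed encoding: 0 means no ground truth
    diff_mm = (predicted_m[valid] - ground_truth_m[valid]) * 1000.0
    rmse = float(np.sqrt(np.mean(diff_mm ** 2)))
    mae = float(np.mean(np.abs(diff_mm)))
    return rmse, mae


def frames_per_second(seconds_per_frame):
    """Convert a per-frame runtime into a frame rate in Hz."""
    return 1.0 / seconds_per_frame


# Usage: a 2x2 example where one pixel has no ground truth.
ground_truth = np.array([[10.0, 0.0], [20.0, 30.0]])
predicted = np.array([[10.5, 99.0], [20.0, 29.0]])
rmse, mae = depth_errors_mm(predicted, ground_truth)
rate = frames_per_second(0.011)
```

Because RMSE squares each error before averaging, it penalizes a few large errors more heavily than MAE does, which is why the two metrics can rank methods differently.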

Figure 2: A toy example summarizing the problem formulation described in equation 1. Empty values are coloured in red, and filled by applying the function $f$ to $D_{sparse}$.

This review was created by AI and reviewed by human editors.