QUICK REVIEW

[Paper review] A low-complexity method for efficient depth-guided image deblurring

Ziyao Yi, Diego Valsesia | arXiv (Cornell University) | Jan 7, 2026

Image Enhancement Techniques | Cited by 0

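One-sentence summary

EDIBNet introduces a compact wavelet-domain encoder-decoder equipped with depth-guided adapters for depth-aware deblurring on edge devices. It delivers competitive quality while reducing FLOPs, runtime, and memory by up to two orders of magnitude.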

ABSTRACT

Image deblurring is a challenging problem in imaging due to its highly ill-posed nature. Deep learning models have shown great success in tackling this problem but the quest for the best image quality has brought their computational complexity up, making them impractical on anything but powerful servers. Meanwhile, recent works have shown that mobile Lidars can provide complementary information in the form of depth maps that enhance deblurring quality. In this paper, we introduce a novel low-complexity neural network for depth-guided image deblurring. We show that the use of the wavelet transform to separate structural details and reduce spatial redundancy as well as efficient feature conditioning on the depth information are essential ingredients in developing a low-complexity model. Experimental results show competitive image quality against recent state-of-the-art models while reducing complexity by up to two orders of magnitude.

Motivation and objectives

Make image deblurring practical on edge devices, using depth maps from mobile LiDAR as guidance.
Develop a low-complexity encoder–decoder architecture that operates in the wavelet domain.
Fuse depth information through lightweight adapters to improve structure-aware restoration.
Demonstrate substantial reductions in FLOPs, runtime, and memory while maintaining competitive image quality.

Proposed method

Apply a 2-level Haar DWT to decompose the image into sub-bands, and run the network only on the second-level sub-bands to reduce computation.
Process LL(2), LH(2), HL(2), and HH(2) in a lightweight encoder–decoder backbone with residual blocks and skip connections.
Inject depth information through efficient depth adapters that modulate decoder features using a two-branch approximation of guided-filter statistics.
Fuse depth features with image features through concatenation, a light convolution, and channel attention, and propagate the depth features to subsequent stages.
Pass the remaining high-frequency sub-bands directly to the inverse transform, bypassing the network, to save computation while maintaining quality.
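
The wavelet pipeline described above can be sketched in NumPy. This is a minimal illustration, not the authors' code: the function names (haar_dwt2, haar_idwt2, restore) are hypothetical, the network is replaced by an identity placeholder, and it assumes that the sub-bands sent straight to the inverse transform are the level-1 high-frequency ones. Naming of the LH and HL bands varies between libraries.

```python
import numpy as np

def haar_dwt2(x):
    """One level of the orthonormal 2-D Haar DWT (height and width must be even)."""
    a = x[0::2, 0::2]  # top-left pixel of every 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # low-pass in both directions
    lh = (a + b - c - d) / 2.0  # difference between the two rows
    hl = (a - b + c - d) / 2.0  # difference between the two columns
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, (lh, hl, hh)

def haar_idwt2(ll, highs):
    """Exact inverse of haar_dwt2."""
    lh, hl, hh = highs
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=ll.dtype)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x

def restore(image, network=lambda bands: bands):
    """2-level decomposition; only the level-2 sub-bands go through `network`."""
    ll1, high1 = haar_dwt2(image)        # level 1
    ll2, high2 = haar_dwt2(ll1)          # level 2
    bands = np.stack([ll2, *high2])      # LL(2), LH(2), HL(2), HH(2) as 4 channels
    out = network(bands)                 # placeholder for the encoder-decoder
    ll1_hat = haar_idwt2(out[0], tuple(out[1:]))
    return haar_idwt2(ll1_hat, high1)    # level-1 details bypass the network (assumption)

image = np.random.default_rng(0).random((16, 24))
assert np.allclose(restore(image), image)  # identity network gives perfect reconstruction
```

Under these assumptions the network sees four channels at one quarter of the input height and width, so each processed sub-band has 1/16 of the original pixels.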

Figure 1: Proposed EDIBNet architecture. An efficient encoder-decoder neural network operates in the low-frequency wavelet sub-bands. Efficient adapters are added at each level of the decoder to modulate image features with depth features.
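
As background for the depth adapters, the classic guided filter derives a per-pixel scale and shift from local means, variances, and covariances of a guide and a target. The sketch below is that textbook filter in NumPy, with a depth map as the guide and one feature channel as the target. It is not EDIBNet's adapter, which is described as a learned two-branch approximation of these statistics. The function names, the window radius r, and eps are illustrative assumptions.

```python
import numpy as np

def box_mean(x, r):
    """Mean over a (2r+1)x(2r+1) window, replicating edge values at the borders."""
    k = 2 * r + 1
    padded = np.pad(x, r, mode="edge")
    h, w = x.shape
    out = np.zeros_like(x, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

def guided_filter_coeffs(guide, target, r=1, eps=1e-4):
    """Per-pixel scale `a` and shift `b` of the classic guided filter."""
    mean_g = box_mean(guide, r)
    mean_t = box_mean(target, r)
    var_g = box_mean(guide * guide, r) - mean_g ** 2
    cov_gt = box_mean(guide * target, r) - mean_g * mean_t
    a = cov_gt / (var_g + eps)   # scale: how strongly the target follows the guide
    b = mean_t - a * mean_g      # shift: what the guide does not explain
    return a, b

def modulate(guide, a, b, r=1):
    """Filter output: smoothed scale times the guide plus smoothed shift."""
    return box_mean(a, r) * guide + box_mean(b, r)

rng = np.random.default_rng(0)
depth = rng.random((12, 10))        # stand-in for a depth map
feature = 2.0 * depth + 1.0         # a feature channel that is affine in depth
a, b = guided_filter_coeffs(depth, feature, eps=1e-12)
assert np.allclose(modulate(depth, a, b), feature, atol=1e-6)
```

The two coefficient maps play the role of a scale branch and a shift branch; a learned adapter can predict similar maps with a few convolutions instead of computing window statistics explicitly.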

Experimental results

Research questions

RQ1: Can a wavelet-domain, low-complexity network achieve competitive deblurring quality using depth guidance from mobile LiDAR?
RQ2: What is the impact of wavelet level, wavelet basis, and depth adapters on performance and efficiency?
RQ3: How does EDIBNet compare to state-of-the-art models in PSNR/SSIM while minimizing FLOPs, runtime, and memory on edge devices?

Key findings

Model	PSNR (dB) ↑	SSIM ↑	LPIPS ↓	Parameters (M)	FLOPs (G)	Runtime (s)	Memory (MB)
Restormer	34.52	0.9318	0.2596	26.1	4083	46.56	32456
Depth-Restormer	36.62	0.9446	0.2223	30.0	8786	55.84	41304
NAFNet	37.24	0.9430	0.2474	17.1	673	4.54	4216
Depth-NAFNet	37.28	0.9434	0.2433	23.7	1388	7.28	11260
EDIBNet (w/o depth & adapter)	34.59	0.9667	0.3093	1.45	15	0.12	280
EDIBNet (channel=16)	34.73	0.9673	0.3117	2.84	44	0.20	358
EDIBNet (channel=32)	35.10	0.9681	0.2971	11.3	178	0.40	816

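EDIBNet reaches the highest SSIM in the table and a PSNR slightly above Restormer, though roughly 2 dB below NAFNet. The channel=16 variant does so with roughly one to two orders of magnitude fewer FLOPs, lower runtime, and lower memory than the compared state-of-the-art models.
Incorporating real-world depth information through the lightweight adapters yields a small PSNR gain of about 0.14 dB (34.59 dB to 34.73 dB in the table), while runtime rises from 0.12 s to 0.20 s.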
A 2-level wavelet decomposition provides the best accuracy-efficiency balance among 1-level and 3-level configurations.
Haar wavelets perform comparably to the rbio1.1 and bior1.1 bases, and Haar is the simplest to implement.
The depth adapter designed here reduces parameters and FLOPs while improving SSIM over a competing adapter.
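
The size of the complexity reduction can be checked directly from the table. The short script below copies the FLOPs, runtime, and memory columns and divides each baseline by the EDIBNet (channel=16) row; the variable and function names are illustrative.

```python
import math

# (FLOPs in G, runtime in s, memory in MB), copied from the results table
baselines = {
    "Restormer": (4083, 46.56, 32456),
    "Depth-Restormer": (8786, 55.84, 41304),
    "NAFNet": (673, 4.54, 4216),
    "Depth-NAFNet": (1388, 7.28, 11260),
}
edibnet_c16 = (44, 0.20, 358)

def reduction_factors(baseline, ours):
    """How many times larger each baseline cost is than ours."""
    return tuple(b / o for b, o in zip(baseline, ours))

for name, costs in baselines.items():
    flops, runtime, memory = reduction_factors(costs, edibnet_c16)
    print(
        f"{name:16s} FLOPs x{flops:6.1f}  runtime x{runtime:6.1f}  "
        f"memory x{memory:6.1f}  (FLOPs: 10^{math.log10(flops):.1f})"
    )
```

With these numbers the factors range from roughly 12x (NAFNet memory) to roughly 280x (Depth-Restormer runtime), which is where the "one to two orders of magnitude" figure comes from.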

Figure 2: Architecture of the proposed efficient Adapter. The module takes as input both image features and depth features. Each input is first normalized and passed through lightweight bias adjustment layers. The features are then concatenated and processed through a chunking and spatial condition


This review was generated by AI and reviewed by human editors.