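[Paper Explained] Learnable Multi-level Discrete Wavelet Transforms for 3D Gaussian Splatting Frequency Modulation
This paper extends AutoOpti3DGS by introducing a multi-level learnable discrete wavelet transform (DWT) to modulate frequency content in 3D Gaussian Splatting. It learns only a single scaling parameter and reduces the number of Gaussians while maintaining rendering quality.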
3D Gaussian Splatting (3DGS) has emerged as a powerful approach for novel view synthesis. However, the number of Gaussian primitives often grows substantially during training as finer scene details are reconstructed, leading to increased memory and storage costs. Recent coarse-to-fine strategies regulate Gaussian growth by modulating the frequency content of the ground-truth images. In particular, AutoOpti3DGS employs the learnable Discrete Wavelet Transform (DWT) to enable data-adaptive frequency modulation. Nevertheless, its modulation depth is limited by the 1-level DWT, and jointly optimizing wavelet regularization with 3D reconstruction introduces gradient competition that promotes excessive Gaussian densification. In this paper, we propose a multi-level DWT-based frequency modulation framework for 3DGS. By recursively decomposing the low-frequency subband, we construct a deeper curriculum that provides progressively coarser supervision during early training, consistently reducing Gaussian counts. Furthermore, we show that the modulation can be performed using only a single scaling parameter, rather than learning the full 2-tap high-pass filter. Experimental results on standard benchmarks demonstrate that our method further reduces Gaussian counts while maintaining competitive rendering quality.
Research Motivation and Objectives
- Motivate reducing Gaussian counts in 3D Gaussian Splatting (3DGS) without sacrificing rendering quality.
- Introduce a multi-level discrete wavelet transform (DWT) framework for data-adaptive coarse-to-fine frequency modulation in 3DGS.
- Mitigate gradient competition by limiting learnable parameters to a single scaling factor for the high-pass filter.
- Demonstrate that deeper DWT levels yield stronger early supervision and further Gaussian count reductions.
- Provide an empirical evaluation on standard 3DGS benchmarks showing reduced Gaussian counts vs. baselines.
Proposed Method
- Extend AutoOpti3DGS with a multi-level DWT that recursively decomposes the LL (low-low) subband, enabling a deeper frequency modulation curriculum.
- Fix DWT filters (using Haar) and learn a single scaling parameter alpha on the high-pass analysis filters to control HF content.
- Impose Perfect Reconstruction-inspired residual losses (alias-cancellation and no-distortion) to regularize alpha and maintain reconstruction fidelity.
- Integrate the multi-level DWT-based modulation into the 3DGS training objective: L = L_3DGS + lambda_PR * L_PR, where L_PR combines alias and distortion terms.
- Evaluate on LLFF (3 views) and Mip-NeRF 360 (12 views), comparing against Vanilla 3DGS, Opti3DGS, and AutoOpti3DGS.
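The modulation described in the list above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names (`analysis_1d`, `synthesis_1d`, `modulate`, `pr_residuals`) are hypothetical, the Haar high-pass output is simply multiplied by `alpha`, and the synthesis filters are assumed to stay fixed at Haar. The exact form of L_PR is not given here, so the residuals use the textbook two-channel alias-cancellation and no-distortion conditions as a stand-in. Image sides are assumed to be divisible by 2^levels.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def analysis_1d(x, alpha, axis):
    """One Haar analysis step along `axis`; the high-pass output is scaled by alpha."""
    x = np.moveaxis(x, axis, 0)
    even, odd = x[0::2], x[1::2]
    lo = (even + odd) / SQRT2
    hi = alpha * (even - odd) / SQRT2
    return np.moveaxis(lo, 0, axis), np.moveaxis(hi, 0, axis)

def synthesis_1d(lo, hi, axis):
    """Fixed Haar synthesis step; exactly inverts analysis_1d when alpha == 1."""
    lo, hi = np.moveaxis(lo, axis, 0), np.moveaxis(hi, axis, 0)
    out = np.empty((2 * lo.shape[0],) + lo.shape[1:], dtype=float)
    out[0::2] = (lo + hi) / SQRT2
    out[1::2] = (lo - hi) / SQRT2
    return np.moveaxis(out, 0, axis)

def modulate(image, alpha, levels):
    """Decompose the LL subband `levels` times, then reconstruct the image."""
    ll = np.asarray(image, dtype=float)
    details = []
    for _ in range(levels):
        lo, hi = analysis_1d(ll, alpha, axis=0)   # filter rows
        ll, lh = analysis_1d(lo, alpha, axis=1)   # filter columns of the low band
        hl, hh = analysis_1d(hi, alpha, axis=1)   # filter columns of the high band
        details.append((lh, hl, hh))
    for lh, hl, hh in reversed(details):          # coarsest level first
        lo = synthesis_1d(ll, lh, axis=1)
        hi = synthesis_1d(hl, hh, axis=1)
        ll = synthesis_1d(lo, hi, axis=0)
    return ll

def pr_residuals(alpha):
    """Squared residuals of the two-channel PR conditions (assumed form)."""
    h0 = np.array([1.0, 1.0]) / SQRT2             # analysis low-pass (fixed)
    h1 = alpha * np.array([1.0, -1.0]) / SQRT2    # analysis high-pass, scaled
    g0 = np.array([1.0, 1.0]) / SQRT2             # synthesis low-pass (fixed)
    g1 = np.array([-1.0, 1.0]) / SQRT2            # synthesis high-pass (fixed)
    flip = np.array([1.0, -1.0])                  # H(z) -> H(-z) for 2-tap filters
    alias = np.convolve(h0 * flip, g0) + np.convolve(h1 * flip, g1)
    target = np.array([0.0, 2.0, 0.0])            # pure delay 2 z^-1
    distortion = np.convolve(h0, g0) + np.convolve(h1, g1) - target
    return float(np.sum(alias ** 2)), float(np.sum(distortion ** 2))
```

With `alpha = 0` the reconstruction collapses to block averages over 2^levels × 2^levels pixels, which is the coarsest supervision. With `alpha = 1` the input is recovered exactly and both residuals vanish. Because the 2D transform is applied separably, the diagonal (HH) subband ends up scaled by `alpha**2` in this sketch.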
Experimental Results
Research Questions
- RQ1: Does a multi-level DWT provide a deeper and more effective frequency modulation curriculum than a 1-level DWT in 3DGS?
- RQ2: Can the HF modulation be effectively achieved by learning only a single scaling parameter for the high-pass filter, reducing gradient conflicts?
- RQ3: How does the multi-level DWT impact Gaussian counts and rendering quality across standard 3DGS benchmarks?
- RQ4: What is the trade-off between deeper DWT levels and PSNR/SSIM/LPIPS performance in practice?
Key Findings
| Dataset | Method | PSNR (↑) | SSIM (↑) | LPIPS (↓) | #G (↓) | Time (s) (↓) |
|---|---|---|---|---|---|---|
| LLFF (3 views) | Opti3DGS | 19.59 | 0.660 | 0.228 | 247K | 105 |
| LLFF (3 views) | AutoOpti3DGS | 19.59 | 0.703 | 0.200 | 249K | 131 |
| LLFF (3 views) | Ours | 20.34 | 0.687 | 0.222 | 218K | 142 |
| LLFF (3 views) | 3DGS | 20.40 | 0.706 | 0.197 | 272K | 109 |
| Mip-NeRF 360 (12 views) | Opti3DGS | 19.19 | 0.552 | 0.360 | 636K | 151 |
| Mip-NeRF 360 (12 views) | AutoOpti3DGS | 19.24 | 0.541 | 0.381 | 615K | 182 |
| Mip-NeRF 360 (12 views) | Ours | 19.29 | 0.560 | 0.355 | 589K | 200 |
| Mip-NeRF 360 (12 views) | 3DGS | 19.30 | 0.564 | 0.352 | 701K | 155 |
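- The multi-level DWT reduces Gaussian counts below the 1-level baselines on both LLFF and Mip-NeRF 360. In the table above, the proposed method uses 218K Gaussians vs. 247K–249K in the first group of rows and 589K vs. 615K–636K in the second. Compared with vanilla 3DGS (272K and 701K), it uses roughly 50K–110K fewer.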
- Using a single scaling parameter for the high-pass filter further reduces Gaussian counts compared to learning full high-pass filters.
- Deeper DWT levels yield progressively lower Gaussian counts but may cause PSNR to drop due to coarser initial reconstructions, indicating a trade-off between modulation depth and rendering fidelity.
- The method achieves competitive PSNR/SSIM/LPIPS while reducing Gaussian counts relative to Opti3DGS and AutoOpti3DGS, at the cost of about 10–20 s of additional training time over AutoOpti3DGS (142 s vs. 131 s and 200 s vs. 182 s in the table).
- The best-performing configuration balances DWT depth against reconstruction quality, showing that the proposed approach preserves rendering quality while limiting Gaussian proliferation.
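The Gaussian-count savings can be checked directly from the results table. The short script below is only an arithmetic aid: the counts are copied from the table, while the group labels and the `savings` helper are names introduced here for illustration.

```python
# Gaussian counts in thousands, copied from the results table
# (first four rows and last four rows respectively).
counts_k = {
    "first group": {"Opti3DGS": 247, "AutoOpti3DGS": 249, "Ours": 218, "3DGS": 272},
    "second group": {"Opti3DGS": 636, "AutoOpti3DGS": 615, "Ours": 589, "3DGS": 701},
}

def savings(group):
    """Return {method: (absolute saving in K, relative saving in %)} vs. 'Ours'."""
    ours = group["Ours"]
    return {
        name: (n - ours, round(100.0 * (n - ours) / n, 1))
        for name, n in group.items()
        if name != "Ours"
    }

for label, group in counts_k.items():
    for name, (abs_k, pct) in savings(group).items():
        print(f"{label}: {abs_k}K fewer Gaussians than {name} ({pct}% reduction)")
```

Reporting the relative saving alongside the absolute one makes the two groups comparable, since their baseline Gaussian counts differ by more than a factor of two.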
This explainer was generated by AI and reviewed by a human editor.