QUICK REVIEW

[Paper Review] S-NeRF: Neural Radiance Fields for Street Views

Ziyang Xie, Junge Zhang | arXiv (Cornell University) | Mar 1, 2023

Advanced Vision and Imaging | Cited by 13

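One-Sentence Summary

S-NeRF extends NeRF to unbounded street views by jointly modeling large-scale backgrounds and moving vehicles, using noisy, sparse LiDAR with a learned depth confidence and pose transformations to achieve state-of-the-art results on nuScenes and Waymo.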

ABSTRACT

Neural Radiance Fields (NeRFs) aim to synthesize novel views of objects and scenes, given object-centric camera views with large overlaps. However, we conjecture that this paradigm does not fit the nature of street views, which are collected by many self-driving cars from large-scale unbounded scenes. Also, the onboard cameras perceive scenes with little overlap. Thus, existing NeRFs often produce blur, 'floaters' and other artifacts in street-view synthesis. In this paper, we propose a new street-view NeRF (S-NeRF) that jointly considers novel view synthesis of both the large-scale background scenes and the foreground moving vehicles. Specifically, we improve the scene parameterization function and the camera poses to learn better neural representations from street views. We also use the noisy and sparse LiDAR points to boost the training, and learn a robust geometry and reprojection based confidence to address the depth outliers. Moreover, we extend S-NeRF to reconstruct moving vehicles, which is impracticable for conventional NeRFs. Thorough experiments on large-scale driving datasets (e.g., nuScenes and Waymo) demonstrate that our method outperforms state-of-the-art rivals, reducing the mean-squared error in street-view synthesis by 7% to 40% and achieving a 45% PSNR gain for moving-vehicle rendering.

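Research Motivation and Goals

Address the limitations of conventional NeRFs on unbounded street-view data with limited camera overlap.
Jointly reconstruct the background scene and the foreground moving vehicles for street-view synthesis.
Use noisy, sparse LiDAR points with a learnable depth confidence to supervise geometry robustly.
Refine camera poses for both the static background and moving vehicles to improve the quality of the neural representation.
Demonstrate high-quality novel-view rendering on large-scale driving datasets, and provide vehicle rendering for simulation and VR.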

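Proposed Method

Improve the scene parameterization with a bounded mapping f(x) to handle large outdoor extents.
Apply a pose refinement network to the static background poses, and introduce a virtual camera transformation to handle moving vehicles.
Propagate the sparse LiDAR depth with NLSPN, and learn a confidence mechanism that combines reprojection and geometry cues.
Train with RGB and depth losses plus an edge-aware smoothness term to balance color fidelity against depth quality.
Compute a learnable combination of confidence metrics (rgb, ssim, vgg, depth, flow) for robust supervision.
Render depth through a differentiable pipeline, and use Mip-NeRF's frustum-based sampling to capture near-field detail.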
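
The bounded mapping and the confidence-weighted depth supervision listed above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the contraction form (a mip-NeRF 360 style mapping with a radius parameter r), the L1 depth penalty, and the names contract and confidence_weighted_depth_loss are all assumptions made for this example.

```python
import math


def contract(x, r=1.0):
    """Map an unbounded 3D point into a ball of radius 2 (assumed form).

    Points within distance r of the origin are scaled linearly by 1/r.
    Farther points are squashed so that their norm approaches 2.
    """
    norm = math.sqrt(sum(c * c for c in x))
    if norm <= r:
        return [c / r for c in x]
    scale = (2.0 - r / norm) / norm
    return [c * scale for c in x]


def confidence_weighted_depth_loss(rendered, lidar, confidence):
    """Mean absolute depth error, weighted per pixel by a confidence in [0, 1].

    A confidence of 0 removes a pixel from the supervision, which is how
    an outlier in the propagated LiDAR depth could be ignored.
    """
    total = sum(c * abs(d - g) for d, g, c in zip(rendered, lidar, confidence))
    return total / len(rendered)
```

With r = 1, contract([4.0, 0.0, 0.0]) returns [1.75, 0.0, 0.0], so a distant point lands inside the radius-2 ball while nearby points keep their linear scale. In the loss, a depth outlier with confidence 0 contributes nothing, whereas a trusted pixel contributes its full absolute error.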

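Experimental Results

Research Questions

RQ1: Can S-NeRF synthesize photorealistic novel views of large-scale background street scenes and foreground moving vehicles from sparse, noisy data?
RQ2: How does depth supervision from sparse LiDAR, aided by a learnable confidence mechanism, affect geometry and rendering quality?
RQ3: Does pose refinement improve NeRF performance on unbounded street-view data and moving objects?
RQ4: How does S-NeRF compare with state-of-the-art large-scale street-view NeRFs on the nuScenes and Waymo datasets?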

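Key Findings

S-NeRF achieves higher fidelity on street views than state-of-the-art NeRF baselines, reducing the mean-squared error on background scenes by 7–40%.
For moving vehicles, S-NeRF improves novel-view rendering PSNR by 45% over the latest mesh-based method.
Foreground static vehicles: PSNR 18.81, SSIM 0.785, LPIPS 0.194 (our method vs. baselines).
Foreground moving vehicles: PSNR 18.00, SSIM 0.736, LPIPS 0.226 (our method).
Background scenes (four nuScenes sequences): PSNR 26.21, SSIM 0.831, LPIPS 0.228 (S-NeRF vs. baselines).
Ablation studies show that the depth confidence and the smoothness loss help improve depth quality and reduce artifacts.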
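
Because the findings mix MSE reductions with PSNR figures, it may help to recall how the two metrics relate. The sketch below assumes the standard definition PSNR = 10 * log10(MAX^2 / MSE) with pixel values in [0, 1]. The function names are illustrative and do not come from the paper.

```python
import math


def psnr(mse, max_val=1.0):
    """Peak signal-to-noise ratio in dB for a given mean-squared error."""
    return 10.0 * math.log10(max_val ** 2 / mse)


def psnr_gain_from_mse_reduction(fraction):
    """dB gain when the MSE is reduced by `fraction` (0 < fraction < 1)."""
    return -10.0 * math.log10(1.0 - fraction)
```

Under this definition, a 7% MSE reduction corresponds to a gain of about 0.32 dB and a 40% reduction to about 2.22 dB, provided the peak value is unchanged. This is plain arithmetic on the definition, not a figure reported in the source.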

This review was generated by AI and reviewed by human editors.