QUICK REVIEW

[Paper Review] Visibility-aware Multi-view Stereo Network

Jingyang Zhang, Yao Yao|arXiv (Cornell University)|Aug 18, 2020

Advanced Vision and Imaging | References: 31 | Cited by: 52

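One-sentence summary

Vis-MVSNet explicitly models pixel-wise visibility and integrates it through matching uncertainty, reducing the influence of occluded pixels in multi-view stereo and improving depth accuracy, especially under severe occlusion.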

ABSTRACT

Learning-based multi-view stereo (MVS) methods have demonstrated promising results. However, very few existing networks explicitly take the pixel-wise visibility into consideration, resulting in erroneous cost aggregation from occluded pixels. In this paper, we explicitly infer and integrate the pixel-wise occlusion information in the MVS network via the matching uncertainty estimation. The pair-wise uncertainty map is jointly inferred with the pair-wise depth map, which is further used as weighting guidance during the multi-view cost volume fusion. As such, the adverse influence of occluded pixels is suppressed in the cost fusion. The proposed framework Vis-MVSNet significantly improves depth accuracies in the scenes with severe occlusion. Extensive experiments are performed on DTU, BlendedMVS, and Tanks and Temples datasets to justify the effectiveness of the proposed framework.

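Motivation and objectives

Enable accurate 3D reconstruction when occlusion corrupts multi-view cues.
Propose an end-to-end network that jointly estimates depth and per-pixel matching uncertainty.
Integrate uncertainty-based weighting into multi-view cost volume fusion to suppress contributions from occluded regions.
Adopt a coarse-to-fine strategy together with practical techniques such as group-wise correlation to improve performance.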

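Proposed method

For each reference-source pair, compute a pairwise cost volume using group-wise correlation.
Regress a depth map and an uncertainty map for each pair using a 3D CNN and soft-argmax; the uncertainty is the entropy of the probability distribution along the depth dimension.
Convert each pairwise latent volume into a probability volume to obtain the pairwise depth and uncertainty, then fuse all pairwise latent volumes by a weighted sum with weights exp(-uncertainty).
Regularize the fused volume and obtain the final depth map via soft-argmax.
Use a coarse-to-fine scheme: each stage progressively narrows the depth range, centered on the previous stage's estimate.
Train depth and uncertainty jointly with a Laplacian likelihood objective on the depth residual, so the uncertainty itself needs no direct supervision.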
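
The entropy-based uncertainty and the exp(-uncertainty) fusion described above can be illustrated for a single pixel. This is a minimal sketch under stated assumptions, not the authors' implementation: the helper names (`softmax`, `soft_argmax`, `entropy`, `fuse_pairwise`) and the toy values are hypothetical, the 3D CNN regularization steps are omitted, and dividing by the sum of the weights is a choice made for this sketch.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_argmax(prob, depths):
    """Expected depth under the per-pixel probability distribution."""
    return sum(p * d for p, d in zip(prob, depths))

def entropy(prob):
    """Shannon entropy of the depth distribution, used here as the uncertainty."""
    return -sum(p * math.log(p) for p in prob if p > 0.0)

def fuse_pairwise(latents, uncertainties):
    """Weighted sum of pairwise latent vectors with weights exp(-uncertainty).

    Normalizing by the sum of the weights is an assumption of this sketch.
    """
    weights = [math.exp(-u) for u in uncertainties]
    norm = sum(weights)
    n_depth = len(latents[0])
    return [
        sum(w * lat[k] for w, lat in zip(weights, latents)) / norm
        for k in range(n_depth)
    ]

# Toy single-pixel example: 4 depth hypotheses, 2 source views (hypothetical values).
depths = [1.0, 2.0, 3.0, 4.0]
latent_visible = [0.0, 6.0, 0.0, 0.0]   # sharp peak: pixel visible in this source view
latent_occluded = [1.0, 1.0, 1.0, 1.0]  # flat response: pixel likely occluded

latents = [latent_visible, latent_occluded]
probs = [softmax(lat) for lat in latents]
uncertainties = [entropy(p) for p in probs]           # low for visible, high for occluded
pair_depths = [soft_argmax(p, depths) for p in probs]  # pairwise depth estimates

fused = fuse_pairwise(latents, uncertainties)
final_depth = soft_argmax(softmax(fused), depths)
```

In this toy case the flat (occluded) distribution has the higher entropy, so its latent vector receives the smaller weight and the fused estimate stays closer to the depth supported by the visible view than plain averaging would.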

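Experimental results

Research questions

RQ1: Within a learning-based MVS framework, can pixel-wise visibility be inferred directly, without an external EM-style step?
RQ2: Can explicit uncertainty estimation improve depth fusion when occlusion is present?
RQ3: How does uncertainty-guided fusion affect depth accuracy in occluded and non-occluded regions?
RQ4: How does the proposed Vis-MVSNet perform on standard MVS benchmarks (DTU, BlendedMVS, Tanks and Temples) compared with existing methods?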

Key findings

Method	Tanks and Temples mean F-score	Family	Francis	Horse	Lighthouse	M60	Panther	Playground	Train	DTU Acc. (mm)	DTU Comp. (mm)	DTU Overall (mm)
Vis-MVSNet	60.03	77.40	60.23	47.07	63.44	62.21	57.28	60.54	52.07	0.369	0.361	0.365

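Vis-MVSNet achieves state-of-the-art or competitive results on the Tanks and Temples, DTU, and BlendedMVS datasets.
Occlusion-aware fusion with uncertainty weighting improves depth accuracy, especially in scenes with severe occlusion.
Two-step cost volume regularization combined with the coarse-to-fine strategy improves reconstruction quality.
The uncertainty-based loss and entropy-derived uncertainty enable end-to-end training without explicit visibility supervision.
Ablation results show that explicit visibility-aware fusion outperforms variance-based fusion as well as simple average and max fusion baselines.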
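
To make the uncertainty-based loss in the findings concrete, here is a small sketch of a Laplacian negative log-likelihood on the depth residual. It is an illustration under assumptions, not the paper's exact loss: the function names `laplacian_nll` and `mean_loss` are hypothetical, the uncertainty is treated as the log of the Laplace scale, the constant log(2) is dropped, and any weighting between the two terms is left out because this summary does not state it.

```python
import math

def laplacian_nll(pred_depth, gt_depth, uncertainty):
    """Per-pixel negative log-likelihood of a Laplace distribution.

    Assumes the scale is b = exp(uncertainty); the constant log(2) is dropped.
    """
    residual = abs(pred_depth - gt_depth)
    return math.exp(-uncertainty) * residual + uncertainty

def mean_loss(pred, gt, unc):
    """Average the per-pixel loss over a list of pixels."""
    losses = [laplacian_nll(p, g, u) for p, g, u in zip(pred, gt, unc)]
    return sum(losses) / len(losses)

# A large residual costs less when the pixel is marked as uncertain...
big_error_confident = laplacian_nll(5.0, 2.0, 0.0)
big_error_uncertain = laplacian_nll(5.0, 2.0, 1.0)
# ...but uncertainty is penalized when the prediction is already accurate.
small_error_confident = laplacian_nll(2.0, 2.0, 0.0)
small_error_uncertain = laplacian_nll(2.0, 2.0, 1.0)
```

The first term lets the network discount residuals at pixels it flags as unreliable, while the second term keeps it from declaring every pixel uncertain. This trade-off is why no visibility labels are needed for the uncertainty to become informative.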


This review was generated by AI and reviewed by human editors.