QUICK REVIEW

[Paper Review] ReLi3D: Relightable Multi-view 3D Reconstruction with Disentangled Illumination

Jan-Niklas Dihlmann, Mark Boss|arXiv (Cornell University)|Mar 20, 2026

Computer Graphics and Visualization Techniques | Cited by 0

ONE-SENTENCE SUMMARY

Unified fast 3D reconstruction that disentangles materials and lighting from few views, producing relightable 3D assets with HDR illumination.

ABSTRACT

Reconstructing 3D assets from images has long required separate pipelines for geometry reconstruction, material estimation, and illumination recovery, each with distinct limitations and computational overhead. We present ReLi3D, the first unified end-to-end pipeline that simultaneously reconstructs complete 3D geometry, spatially-varying physically-based materials, and environment illumination from sparse multi-view images in under one second. Our key insight is that multi-view constraints can dramatically improve material and illumination disentanglement, a problem that remains fundamentally ill-posed for single-image methods. Key to our approach is the fusion of the multi-view input via a transformer cross-conditioning architecture, followed by a novel unified two-path prediction strategy. The first path predicts the object's structure and appearance, while the second path predicts the environment illumination from image background or object reflections. This, combined with a differentiable Monte Carlo multiple importance sampling renderer, creates an optimal illumination disentanglement training pipeline. In addition, with our mixed domain training protocol, which combines synthetic PBR datasets with real-world RGB captures, we establish generalizable results in geometry, material accuracy, and illumination quality. By unifying previously separate reconstruction tasks into a single feed-forward pass, we enable near-instantaneous generation of complete, relightable 3D assets. Project Page: https://reli3d.jdihlmann.com/

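RESEARCH MOTIVATION AND GOALS

Propose a unified method that jointly recovers geometry, materials, and illumination from sparse multi-view images, addressing the ill-posedness of single-view inverse rendering.
Use multi-view constraints to improve material-illumination disentanglement and material realism.
Achieve near-instant inference by avoiding per-object optimization, making the method suitable for production workflows.
Bridge synthetic and real data through mixed-domain training to improve cross-domain generalization.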

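PROPOSED METHOD

Cross-view fusion: a shared cross-conditioning transformer accepts an arbitrary number of views and builds a unified triplane feature representation that drives both prediction paths.
Two-path illumination disentanglement: the geometry-and-appearance path predicts a mesh and spatially varying BRDF parameters; the illumination path predicts an HDR environment from the image background or object reflections, using a RENI++ latent representation.
Disentanglement training with a differentiable Monte Carlo renderer (MC+MIS): multiple importance sampling enforces a physically meaningful separation of materials and illumination and enables mixed-domain supervision.
Mixed-domain training: combines synthetic PBR data with real-world RGB captures and uses image-space self-supervision to generalize to real scenes.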
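
To make the multiple importance sampling (MIS) idea behind the MC+MIS renderer more concrete, here is a minimal Python sketch. It is not the paper's renderer, and it is not differentiable. It assumes a toy setup: a diffuse surface point with normal +z, an analytic "environment" `toy_radiance` standing in for an HDR map, and two sampling strategies (uniform hemisphere sampling as a stand-in for light sampling, cosine-weighted sampling as a stand-in for BRDF sampling) combined with the balance heuristic. All function names are hypothetical and chosen for illustration only.

```python
import numpy as np


def sample_uniform_hemisphere(rng, n):
    """Uniform directions on the hemisphere z >= 0; pdf = 1 / (2*pi)."""
    z = rng.random(n)
    phi = 2.0 * np.pi * rng.random(n)
    r = np.sqrt(1.0 - z * z)
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=-1)


def sample_cosine_hemisphere(rng, n):
    """Cosine-weighted directions; pdf = cos(theta) / pi (Malley's method)."""
    u = rng.random(n)
    phi = 2.0 * np.pi * rng.random(n)
    r = np.sqrt(u)
    z = np.sqrt(1.0 - u)
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=-1)


def pdf_uniform(d):
    return np.full(d.shape[0], 1.0 / (2.0 * np.pi))


def pdf_cosine(d):
    return d[:, 2] / np.pi


def toy_radiance(d):
    """Hypothetical environment: brighter toward the zenith (assumption)."""
    return 1.0 + d[:, 2]


def mis_irradiance(radiance, n, rng):
    """Estimate E = integral of L(w) * cos(theta) over the hemisphere.

    Uses n samples from each strategy. With the balance heuristic the
    combined estimator reduces to sum(f / (n * p_uniform + n * p_cosine))
    over all samples drawn from both strategies.
    """
    total = 0.0
    for d in (sample_uniform_hemisphere(rng, n), sample_cosine_hemisphere(rng, n)):
        f = radiance(d) * d[:, 2]  # integrand: radiance times cosine term
        denom = n * pdf_uniform(d) + n * pdf_cosine(d)
        total += np.sum(f / denom)
    return total


rng = np.random.default_rng(0)
estimate = mis_irradiance(toy_radiance, 20000, rng)
exact = 5.0 * np.pi / 3.0  # analytic value for L = 1 + cos(theta)
print(estimate, exact)
```

The estimate should land close to the analytic value 5π/3. In a training setting like the one described above, the same weighting would be applied inside a differentiable renderer so that gradients reach both the material and the illumination predictions; that part is outside the scope of this sketch.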

Figure 1: Fast, illumination disentangled reconstructions. ReLi3D reconstructs high-quality 3D meshes with physically based materials from sparse input images, while disentangling illumination effects; all in just 0.3s. It is robustly trained on cross-domain datasets and excels in both single- and m

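EXPERIMENTAL RESULTS

Research Questions

RQ1: Can multi-view constraints overcome the ill-posed problem of separating material properties from illumination in 3D reconstruction?
RQ2: Is it possible to predict geometry, spatially varying PBR materials, and HDR environment illumination simultaneously from sparse views in a single feed-forward pass?
RQ3: How does cross-view fusion affect material accuracy and relighting fidelity on synthetic and real-world data?
RQ4: Can mixed-domain training narrow the gap between synthetic and real-world data and improve the generalization of relightable 3D assets?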

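Key Findings

ReLi3D achieves competitive geometry reconstruction at interactive speeds while delivering state-of-the-art material and illumination disentanglement.
It predicts spatially varying PBR materials (albedo, roughness, metallic) on the object surface, with accuracy improving as more views are provided.
Relighting under out-of-distribution HDR environments is strong, with relit results closely matching the ground-truth illumination.
HDR environment maps are inferred accurately even from sparse views, aided by background information and multi-view awareness.
Mixed-domain training achieves robustness on real-world data with only 174k objects, a significantly smaller data requirement than many large-scale methods.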

Figure 2: ReLi3D Overview. Multi-view input images are fused by a shared cross-conditioning transformer into two parallel paths: a Geometry & Appearance Path (blue) using a Triplane Transformer to predict mesh geometry and PBR materials, and an Illumination Path (green) using a Multi-View Illuminati

This review was generated by AI and reviewed by a human editor.