QUICK REVIEW

[Paper Review] Unsupervised Learning of Geometry with Edge-aware Depth-Normal Consistency

Zhenheng Yang, Peng Wang | arXiv (Cornell University) | 2017-11-10

Advanced Vision and Imaging | References: 30 | Citations: 104

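One-Line Summary

This paper presents an unsupervised framework, trained on monocular video, that jointly estimates depth and surface normals from a single frame by enforcing geometric consistency and edge-aware smoothness, and it reports state-of-the-art performance on KITTI 2015.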

ABSTRACT

Learning to reconstruct depths in a single image by watching unlabeled videos via deep convolutional network (DCN) is attracting significant attention in recent years. In this paper, we introduce a surface normal representation for unsupervised depth estimation framework. Our estimated depths are constrained to be compatible with predicted normals, yielding more robust geometry results. Specifically, we formulate an edge-aware depth-normal consistency term, and solve it by constructing a depth-to-normal layer and a normal-to-depth layer inside of the DCN. The depth-to-normal layer takes estimated depths as input, and computes normal directions using cross production based on neighboring pixels. Then given the estimated normals, the normal-to-depth layer outputs a regularized depth map through local planar smoothness. Both layers are computed with awareness of edges inside the image to help address the issue of depth/normal discontinuity and preserve sharp edges. Finally, to train the network, we apply the photometric error and gradient smoothness for both depth and normal predictions. We conducted experiments on both outdoor (KITTI) and indoor (NYUv2) datasets, and show that our algorithm vastly outperforms state of the art, which demonstrates the benefits from our approach.
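The abstract describes a depth-to-normal layer that computes normal directions from neighboring pixels with a cross product. The following is a minimal sketch of that idea only, not the authors' layer. It assumes a pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, uses just the right and lower neighbors, and omits the edge-aware weighting the paper applies. The function names are illustrative.

```python
import math

def backproject(x, y, depth, fx, fy, cx, cy):
    """Lift pixel (x, y) with the given depth to a 3D point in camera space."""
    return ((x - cx) * depth / fx, (y - cy) * depth / fy, depth)

def subtract(a, b):
    """Component-wise difference of two 3D points."""
    return tuple(ai - bi for ai, bi in zip(a, b))

def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normal_at(depth_map, x, y, fx, fy, cx, cy):
    """Unit normal at (x, y) from the right and lower neighbors.

    Assumption: two neighbors and no edge-aware weights (a simplification).
    With x to the right, y down, and z forward, a fronto-parallel plane
    gets a normal along +z under this orientation convention.
    """
    p = backproject(x, y, depth_map[y][x], fx, fy, cx, cy)
    p_right = backproject(x + 1, y, depth_map[y][x + 1], fx, fy, cx, cy)
    p_down = backproject(x, y + 1, depth_map[y + 1][x], fx, fy, cx, cy)
    n = cross(subtract(p_right, p), subtract(p_down, p))
    length = math.sqrt(sum(c * c for c in n))
    return tuple(c / length for c in n)

# Usage: a constant-depth (fronto-parallel) patch and a patch slanted along x.
flat = [[2.0, 2.0, 2.0] for _ in range(3)]
slanted = [[2.0, 2.1, 2.2] for _ in range(3)]
flat_normal = normal_at(flat, 0, 0, fx=100.0, fy=100.0, cx=1.0, cy=1.0)
slanted_normal = normal_at(slanted, 1, 1, fx=100.0, fy=100.0, cx=1.0, cy=1.0)
```

On the flat patch the normal lies along the optical axis, while on the slanted patch it tilts toward negative x. A learned layer would apply the same computation densely and differentiably over the whole depth map.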

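Motivation and Objectives

Learn scene geometry (depth and surface normals) from monocular video without supervision.
Use view synthesis as the supervisory signal to ensure geometric consistency.
Introduce depth-normal consistency as a regularizer to improve both depth and normal estimation.
Handle depth discontinuities and low-texture regions through edge-aware smoothness and image gradient terms.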

Proposed Method

An end-to-end CNN learns camera motion, depth, and surface normals from monocular video sequences.
A photometric warping loss based on 3D inverse warping synthesizes target views from source views.
An edge-aware smoothness loss respects image gradients to preserve depth discontinuities.
An image gradient matching loss encourages sharp depth and improves alignment with image gradients.
Explicit depth2normal and normal2depth layers enforce geometric consistency between depth and normals.
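The edge-aware smoothness term in the list above can be illustrated with a small sketch. It assumes a commonly used first-order form, in which each depth gradient is down-weighted by `exp(-alpha * |image gradient|)`. The exact formulation in the paper may differ (for example in the order of the derivative), and `alpha`, the grayscale image input, and the function name are assumptions made for illustration.

```python
import math

def edge_aware_smoothness(depth, image, alpha=10.0):
    """Mean depth-gradient penalty, down-weighted where the image has edges.

    depth, image: 2D lists of floats with the same shape (image is grayscale).
    alpha: hypothetical edge sensitivity; larger values relax the penalty
    more strongly across image edges.
    """
    height, width = len(depth), len(depth[0])
    total, count = 0.0, 0
    for y in range(height):
        for x in range(width):
            if x + 1 < width:  # horizontal neighbor pair
                depth_grad = abs(depth[y][x + 1] - depth[y][x])
                image_grad = abs(image[y][x + 1] - image[y][x])
                total += depth_grad * math.exp(-alpha * image_grad)
                count += 1
            if y + 1 < height:  # vertical neighbor pair
                depth_grad = abs(depth[y + 1][x] - depth[y][x])
                image_grad = abs(image[y + 1][x] - image[y][x])
                total += depth_grad * math.exp(-alpha * image_grad)
                count += 1
    return total / count

# Usage: the same depth step is penalized less when an image edge coincides with it.
depth_step = [[1.0, 1.0, 5.0, 5.0], [1.0, 1.0, 5.0, 5.0]]
image_with_edge = [[0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0]]
image_flat = [[0.0] * 4 for _ in range(2)]
loss_aligned = edge_aware_smoothness(depth_step, image_with_edge)
loss_unaligned = edge_aware_smoothness(depth_step, image_flat)
```

With these inputs the loss is much smaller when the depth step lines up with the image edge. This is the behavior that lets a network keep sharp depth boundaries while smoothing textureless regions.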

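Experimental Results

Research Questions

RQ1: Can depth and surface normals be jointly estimated from monocular video in an unsupervised manner using geometric and photometric constraints?
RQ2: How does depth-normal geometric regularization affect the quality of depth and normal estimates?
RQ3: How do the edge-aware terms affect depth smoothness in low-texture regions and the preservation of depth discontinuities?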

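Key Findings

The framework achieves state-of-the-art performance on KITTI 2015 on both depth and normal evaluation metrics.
Introducing depth-normal consistency through dedicated layers improves the quality of both the depth maps and the normal maps.
The edge-aware smoothness and gradient-based losses help preserve depth discontinuities that align with image boundaries.
View synthesis supervision (photometric warping) provides a strong geometric signal for learning from monocular video.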
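The geometric core of view synthesis supervision is the reprojection of a target pixel into a source view. The sketch below shows that step for a single pixel under stated assumptions: a pinhole camera with hypothetical intrinsics `(fx, fy, cx, cy)`, a relative pose given as a 3x3 rotation `R` and translation `t` from target to source camera, and illustrative names throughout. It is not the paper's implementation. A full photometric loss would additionally sample the source image at the returned location (typically with bilinear interpolation) and compare it with the target pixel.

```python
def reproject(x, y, depth, intrinsics, R, t):
    """Map target pixel (x, y) with known depth to source-view pixel coordinates.

    intrinsics: (fx, fy, cx, cy) of an assumed pinhole camera.
    R, t: assumed relative pose taking target-camera points to the source camera.
    """
    fx, fy, cx, cy = intrinsics
    # 1. Back-project the pixel to a 3D point in the target camera frame.
    point = ((x - cx) * depth / fx, (y - cy) * depth / fy, depth)
    # 2. Apply the rigid transform into the source camera frame.
    moved = tuple(sum(R[i][j] * point[j] for j in range(3)) + t[i]
                  for i in range(3))
    # 3. Project back onto the source image plane.
    return (fx * moved[0] / moved[2] + cx, fy * moved[1] / moved[2] + cy)

# Usage: identity pose leaves the pixel in place; a sideways shift moves it.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
camera = (100.0, 100.0, 50.0, 40.0)
same = reproject(60.0, 45.0, 2.0, camera, identity, (0.0, 0.0, 0.0))
shifted = reproject(50.0, 40.0, 2.0, camera, identity, (0.1, 0.0, 0.0))
```

Because the pixel displacement depends on both the predicted depth and the predicted camera motion, an error in either one shows up as a photometric mismatch, which is what makes the warping loss usable as a training signal.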


This review was generated by AI and reviewed by a human editor.