QUICK REVIEW

[Paper Review] Learning to Upsample by Learning to Sample

Wenze Liu, Hao Lü|arXiv (Cornell University)|2023. 08. 29.

Advanced Neural Network Applications | Citations: 24

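One-Line Summary

DySample is a lightweight dynamic upsampler that recasts upsampling as content-aware point sampling, achieving strong accuracy at low cost across multiple dense prediction tasks. It avoids dynamic convolution and high-resolution guidance, and outperforms kernel-based upsamplers by using PyTorch's grid_sample with learnable offsets.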

ABSTRACT

We present DySample, an ultra-lightweight and effective dynamic upsampler. While impressive performance gains have been witnessed from recent kernel-based dynamic upsamplers such as CARAFE, FADE, and SAPA, they introduce much workload, mostly due to the time-consuming dynamic convolution and the additional sub-network used to generate dynamic kernels. Further, the need for high-res feature guidance of FADE and SAPA somehow limits their application scenarios. To address these concerns, we bypass dynamic convolution and formulate upsampling from the perspective of point sampling, which is more resource-efficient and can be easily implemented with the standard built-in function in PyTorch. We first showcase a naive design, and then demonstrate how to strengthen its upsampling behavior step by step towards our new upsampler, DySample. Compared with former kernel-based dynamic upsamplers, DySample requires no customized CUDA package and has much fewer parameters, FLOPs, GPU memory, and latency. Besides the light-weight characteristics, DySample outperforms other upsamplers across five dense prediction tasks, including semantic segmentation, object detection, instance segmentation, panoptic segmentation, and monocular depth estimation. Code is available at https://github.com/tiny-smart/dysample.

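Motivation and Goals

Present a lightweight, low-cost, general-purpose upsampling operator suited to dense prediction, free of heavy dynamic convolution and high-resolution guidance.
Recast upsampling as learnable, content-aware point sampling built from standard PyTorch built-in operations.
Systematically refine a naive sampling-based design into a practical DySample with low latency and low memory usage.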

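Proposed Method

Express upsampling as interpolation into a continuous map, then resample that map at content-aware points.
Generate per-point offsets with a linear projection, then form the sampling grid via pixel shuffle or reshaping.
Introduce static and dynamic scope factors to limit offset movement and reduce overlap between sampling regions.
Split the channels into groups and generate offsets per group to improve efficiency.
Provide four DySample variants (LP/PL combined with static/dynamic scope) and compare their complexity and performance.
Validate experimentally on semantic segmentation, object detection/instance segmentation, panoptic segmentation, and monocular depth estimation.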
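
The sampling view of upsampling described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: it handles a single-channel 2D map stored as nested lists, and the names `bilinear_sample` and `upsample_by_sampling`, the half-pixel base grid, the border clamping, and the default scope value of 0.25 are all assumptions made for illustration. In PyTorch, the sampling step would typically be delegated to `grid_sample`, and the offsets would come from a learned projection rather than being passed in by hand.

```python
import math


def bilinear_sample(feat, y, x):
    """Bilinearly sample the 2D list `feat` at continuous (y, x).

    Positions outside the map are clamped to the border (an assumption here).
    """
    h, w = len(feat), len(feat[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    top = feat[y0][x0] * (1 - wx) + feat[y0][x1] * wx
    bottom = feat[y1][x0] * (1 - wx) + feat[y1][x1] * wx
    return top * (1 - wy) + bottom * wy


def upsample_by_sampling(feat, offsets=None, scale=2, scope=0.25):
    """Upsample `feat` by sampling it at (base grid + scope * offset).

    `offsets[i][j]` is a hypothetical (dy, dx) pair per output pixel, given in
    input-pixel units. With no offsets this reduces to bilinear upsampling.
    """
    h, w = len(feat), len(feat[0])
    out = []
    for i in range(h * scale):
        row = []
        for j in range(w * scale):
            # Base sampling position of output pixel (i, j) in input coordinates.
            base_y = (i + 0.5) / scale - 0.5
            base_x = (j + 0.5) / scale - 0.5
            dy, dx = offsets[i][j] if offsets is not None else (0.0, 0.0)
            row.append(bilinear_sample(feat, base_y + scope * dy, base_x + scope * dx))
        out.append(row)
    return out


if __name__ == "__main__":
    feat = [[0.0, 4.0]]
    # Zero offsets: plain bilinear upsampling.
    print(upsample_by_sampling(feat)[0])  # [0.0, 1.0, 3.0, 4.0]
    # Shifting every sampling point to the right changes what is read.
    shift_right = [[(0.0, 1.0)] * 4 for _ in range(2)]
    print(upsample_by_sampling(feat, shift_right)[0])  # [0.0, 2.0, 4.0, 4.0]
```

The scope factor scales every offset before it is added to the base grid, which keeps each sampling point close to its starting position and limits how far neighboring points can drift into one another's regions.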

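Experimental Results

Research Questions

RQ1: Can a sampling-based upsampling operator built from standard PyTorch built-in operations match or outperform kernel-based dynamic upsampling?
RQ2: Which initialization, scope control, and grouping strategies maximize the performance and efficiency of a sampling-based upsampler?
RQ3: How does DySample compare with CARAFE, FADE, and SAPA in accuracy and resource usage across dense prediction tasks?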

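Key Results

DySample achieves state-of-the-art or competitive results with far fewer parameters, FLOPs, memory, and latency than kernel-based dynamic upsamplers.
On ADE20K with SegFormer-B1, DySample-S+ reaches 43.58 mIoU and outperforms several baselines on both mIoU and boundary metrics.
Across five dense prediction tasks, DySample improves segmentation and detection/instance segmentation metrics, with meaningful gains over CARAFE and others in some settings.
The LP and PL variants trade off parameters and speed differently; PL often performs better with SegFormer and MaskFormer.
DySample+ (dynamic scope) and grouping (g=4) yield notable gains over the static, single-group configuration.
Compared with bilinear interpolation, DySample helps reduce artifacts and preserve interior-region quality while adding almost no overhead.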
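
The difference between a static scope and the dynamic scope used by the "+" variants can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: the constants 0.25 and 0.5, the sigmoid gate, and the names `static_scope`, `dynamic_scope`, and `gate_logit` are hypothetical choices. The idea shown is that a static scope multiplies every raw offset by one fixed factor, while a dynamic scope lets a second, content-dependent value choose the factor per point within a bounded range.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def static_scope(raw_offset, factor=0.25):
    """Scale a raw offset by a fixed factor (the value is an assumption)."""
    return factor * raw_offset


def dynamic_scope(raw_offset, gate_logit, max_factor=0.5):
    """Scale a raw offset by a per-point factor in [0, max_factor].

    `gate_logit` stands in for the output of a hypothetical second projection
    of the input feature at the same point.
    """
    return max_factor * sigmoid(gate_logit) * raw_offset


if __name__ == "__main__":
    for gate in (-3.0, 0.0, 3.0):
        print(gate, static_scope(1.0), dynamic_scope(1.0, gate))
```

With a gate value of zero the dynamic factor equals 0.25 in this sketch, so both forms start from the same scaling and the dynamic one can then shrink or enlarge the allowed movement depending on content.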


This review was generated by AI and reviewed by a human editor.