QUICK REVIEW

[Paper Review] LDConv: Linear deformable convolution for improving convolutional neural networks

Xin Zhang, Yingze Song | arXiv (Cornell University) | 2023-11-20

Advanced Neural Network Applications | Citations: 45

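One-Line Summary

The LDConv paper presents AKConv, an Alterable Kernel Convolution that supports arbitrary sampling shapes and sizes while its parameter count grows only linearly. It adds learned offsets so the sampling shape can adapt to the target, and it reports better object detection performance than standard convolution and deformable convolution.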

ABSTRACT

Neural networks based on convolutional operations have achieved remarkable results in the field of deep learning, but there are two inherent flaws in standard convolutional operations. On the one hand, the convolution operation is confined to a local window, so it cannot capture information from other locations, and its sampled shape is fixed. On the other hand, the size of the convolutional kernel is fixed to $k \times k$, which is a fixed square shape, and the number of parameters tends to grow quadratically with size. Although Deformable Convolution (Deformable Conv) addresses the problem of fixed sampling in standard convolutions, its number of parameters also tends to grow quadratically. In response to these problems, the Linear Deformable Convolution (LDConv) is explored in this work, which gives the convolution kernel an arbitrary number of parameters and arbitrary sampled shapes to provide richer options for the trade-off between network overhead and performance. In LDConv, a novel coordinate generation algorithm is defined to generate different initial sampled positions for convolutional kernels of arbitrary size. To adapt to changing targets, offsets are introduced to adjust the shape of the samples at each position. LDConv corrects the growth trend of the number of parameters for standard convolution and Deformable Conv to a linear growth. Moreover, it completes the process of efficient feature extraction by irregular convolutional operations and brings more exploration options for convolutional sampled shapes. Object detection experiments on representative datasets COCO2017, VOC 7+12, and VisDrone-DET2021 fully demonstrate the advantages of LDConv. LDConv is a plug-and-play convolutional operation that can replace the convolutional operation to improve network performance. The code for the relevant tasks can be found at https://github.com/CV-ZhangXin/LDConv.

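Motivation and Objectives

Improve the flexibility of CNNs by allowing arbitrary sampling shapes and kernel sizes.
Propose AKConv, which generates initial sampling coordinates for a kernel of arbitrary size and learns offsets that adapt the sampling shape.
Show that AKConv works as a plug-and-play replacement for standard convolution and improves detection performance on benchmarks.
Compare AKConv with Deformable Conv and DSConv to demonstrate its wider flexibility in shape and size.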

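Proposed Method

Define a coordinate generation algorithm that produces the initial sampling positions for a kernel of arbitrary size.
Introduce learnable offsets that adjust the sampling positions at each spatial location.
Realize irregular sampling shapes by rearranging the sampled features and aggregating them with a suitable convolution operation.
Show that AKConv keeps parameter growth linear in kernel size, unlike standard convolution and other convolutions whose parameter counts grow quadratically.
Extend AKConv to multiple initial sampling shapes and arbitrary sizes so it can be applied to a variety of tasks.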
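The first two steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the initial positions are laid out as a regular grid anchored at (0, 0), with rows of width round(sqrt(n)) and any leftover points placed in one extra partial row. The function names `initial_sampling_coords` and `apply_offsets` are hypothetical, and the offsets are passed in as plain numbers here, whereas in the method they are predicted by a learned layer.

```python
import math


def initial_sampling_coords(n):
    """Return n (row, col) grid positions anchored at (0, 0).

    Assumed layout: full rows of width round(sqrt(n)), then one
    partial row holding the remaining points.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    base = round(math.sqrt(n))                # width of each full row
    full_rows, remainder = divmod(n, base)    # remainder is always < base
    coords = [(r, c) for r in range(full_rows) for c in range(base)]
    coords += [(full_rows, c) for c in range(remainder)]
    return coords


def apply_offsets(center, coords, offsets):
    """Shift each initial position by the kernel location and its offset.

    center  : (y, x) location of the kernel on the feature map
    coords  : initial positions from initial_sampling_coords
    offsets : one (dy, dx) pair per position (hypothetical values here)
    """
    cy, cx = center
    return [(cy + r + dy, cx + c + dx)
            for (r, c), (dy, dx) in zip(coords, offsets)]


print(initial_sampling_coords(5))
# [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0)]
```

The resulting positions are generally fractional, so features at those positions would have to be read with interpolation (for example bilinear) before they are aggregated; that step is left out of this sketch.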

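Experimental Results

Research Questions

RQ1: Can AKConv provide arbitrary sampling shapes and sizes with linear parameter growth while maintaining or improving detection performance?
RQ2: How do learnable offsets affect the sampling geometry and network accuracy across different datasets and models?
RQ3: How does AKConv compare with standard Conv, Deformable Conv, and DSConv in performance and efficiency on object detection benchmarks?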

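Key Findings

Replacing the convolutions in YOLOv5 with AKConv improved the AP metrics on COCO2017.
AKConv with larger sizes (e.g., 5, 9, 11) generally improves AP and AP50/AP75 while keeping GFLOPS and parameter count close to, or slightly above, the baseline.
Compared with Deformable Conv and DSConv, AKConv offers more flexible sampling shapes and sizes, with competitive or better detection performance.
In fair comparisons, AKConv's zero padding contributes to the performance gain.
AKConv shows consistent improvements on the COCO, VOC 7+12, and VisDrone-DET2021 datasets, which suggests that it generalizes.
AKConv's performance can depend on the initial sampling shape, so the design should be chosen to fit the task and dataset.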
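The linear-versus-quadratic parameter growth behind these findings can be checked with simple arithmetic. The sketch below is illustrative only and rests on stated assumptions: biases are ignored, the kernel with n sampling points has one weight per point per input/output channel pair, and the offsets come from a separate 3 x 3 convolution with 2n output channels. The function names and the `offset_k` parameter are hypothetical and are not taken from the paper.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights of a k x k standard convolution (bias ignored)."""
    return c_in * c_out * k * k


def linear_kernel_params(c_in, c_out, n, offset_k=3):
    """Weights of a kernel with n sampling points plus an offset branch.

    Assumption: the offset branch is an offset_k x offset_k convolution
    that predicts 2 * n values (dy, dx per sampling point) per location.
    """
    kernel_weights = c_in * c_out * n
    offset_weights = c_in * (2 * n) * offset_k * offset_k
    return kernel_weights + offset_weights


c_in = c_out = 64
for k in (3, 5, 7):
    print(k, standard_conv_params(c_in, c_out, k))   # grows with k * k
for n in (5, 9, 11):
    print(n, linear_kernel_params(c_in, c_out, n))   # grows with n
```

Under these assumptions each extra sampling point adds a fixed number of weights (`c_in * c_out + 2 * c_in * offset_k**2`), whereas moving a standard kernel from k to k + 1 adds `c_in * c_out * (2 * k + 1)`, a step that keeps increasing with k.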

This review was generated by AI and reviewed by a human editor.