QUICK REVIEW

[Paper Review] Meta-learning with differentiable closed-form solvers

Luca Bertinetto, João F. Henriques | arXiv (Cornell University) | 2018. 05. 21.

Domain Adaptation and Few-Shot Learning | References: 54 | Citations: 151

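One-line summary

This paper presents differentiable ridge and logistic regression solvers (R2-D2 and LR-D2) as fast, episode-specific base learners, enabling rapid few-shot adaptation by backpropagating through closed-form or IRLS-based solvers. Using high-dimensional embeddings and the Woodbury identity, it achieves competitive or superior performance on Omniglot, mini-ImageNet, and CIFAR-FS.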

ABSTRACT

Adapting deep networks to new concepts from a few examples is challenging, due to the high computational requirements of standard fine-tuning procedures. Most work on few-shot learning has thus focused on simple learning techniques for adaptation, such as nearest neighbours or gradient descent. Nonetheless, the machine learning literature contains a wealth of methods that learn non-deep models very efficiently. In this paper, we propose to use these fast convergent methods as the main adaptation mechanism for few-shot learning. The main idea is to teach a deep network to use standard machine learning tools, such as ridge regression, as part of its own internal model, enabling it to quickly adapt to novel data. This requires back-propagating errors through the solver steps. While normally the cost of the matrix operations involved in such a process would be significant, by using the Woodbury identity we can make the small number of examples work to our advantage. We propose both closed-form and iterative solvers, based on ridge regression and logistic regression components. Our methods constitute a simple and novel approach to the problem of few-shot learning and achieve performance competitive with or superior to the state of the art on three benchmarks.

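Research motivation and goals

Motivate fast adaptation in few-shot learning with base learners that are simple, differentiable, and have closed-form or fast-converging solutions.
Integrate ridge regression and logistic regression solvers into a meta-learning framework so that errors can be backpropagated through the learning steps.
Improve computational efficiency in high-dimensional embedding settings via the Woodbury identity.
Evaluate the proposed methods on standard few-shot benchmarks and compare them with state-of-the-art methods.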

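Proposed method

Proposes a meta-learning setup whose base learner is a differentiable ridge regression layer that computes episode-specific weights W from the episode data.
Uses the Woodbury identity to compute the ridge solution efficiently when the embedding dimension is large but the number of samples per episode is small.
Extends the approach to an iterative logistic regression base learner using IRLS (Iteratively Reweighted Least Squares), which amounts to a few Newton-like update steps.
Calibrates the regression outputs with a learned scale and bias so that they suit the cross-entropy loss.
Learns the representation and hyperparameters end to end by backpropagating through the episode-level learning step across many episodes.
Provides a training protocol in which the meta-parameters (feature extractor weights, ridge lambda, calibration parameters) are trained with SGD/Adam to minimize the held-out episode loss.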
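
The ridge base learner and the Woodbury reformulation listed above can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names, the toy episode, and the values of `lam`, `alpha`, and `beta` are assumptions made for illustration (the review states that these are meta-learned), and random vectors stand in for CNN embeddings. NumPy shows only the forward computation; in the actual method an autodiff framework would backpropagate through the solve. The two forms compared are W = (XᵀX + λI)⁻¹XᵀY and W = Xᵀ(XXᵀ + λI)⁻¹Y.

```python
import numpy as np

def ridge_primal(X, Y, lam):
    """Standard ridge solution: solves an (e x e) system, e = embedding dim."""
    e = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(e), X.T @ Y)

def ridge_woodbury(X, Y, lam):
    """Equivalent form via the Woodbury identity: solves an (n x n) system,
    where n is the number of support samples in the episode."""
    n = X.shape[0]
    return X.T @ np.linalg.solve(X @ X.T + lam * np.eye(n), Y)

# Toy 5-way 1-shot episode; random vectors stand in for learned embeddings.
rng = np.random.default_rng(0)
n_way, n_shot, emb_dim = 5, 1, 512
labels = np.repeat(np.arange(n_way), n_shot)
X_support = rng.standard_normal((n_way * n_shot, emb_dim))
Y_support = np.eye(n_way)[labels]   # one-hot regression targets
lam = 1.0                           # assumed value; meta-learned in the method

W_primal = ridge_primal(X_support, Y_support, lam)    # 512 x 512 solve
W_dual = ridge_woodbury(X_support, Y_support, lam)    # 5 x 5 solve

# Calibration: a scale and bias map regression outputs to logits that can be
# fed to a cross-entropy loss. The values here are hypothetical placeholders.
alpha, beta = 10.0, 0.0
X_query = X_support + 0.1 * rng.standard_normal(X_support.shape)
logits = alpha * (X_query @ W_dual) + beta
pred = logits.argmax(axis=1)
```

Both functions return the same weights up to numerical precision, but with 5 support samples and 512-dimensional embeddings the Woodbury form factorizes a 5 x 5 matrix instead of a 512 x 512 one, which is why a small episode size becomes an advantage.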

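Experimental results

Research questions

RQ1: Can fast-converging differentiable solvers (ridge and logistic regression) serve as effective base learners for few-shot tasks?
RQ2: Does backpropagating through closed-form or IRLS-based solvers yield competitive meta-learning performance with high-dimensional embeddings?
RQ3: How does the Woodbury identity affect the computational efficiency of few-shot learning in high-dimensional settings?
RQ4: How do the ridge regression and logistic regression base learners compare with state-of-the-art meta-learning methods on standard benchmarks?
RQ5: Is a calibration step on the regression outputs needed when training with a cross-entropy loss for classification?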

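Key results

R2-D2 (ridge regression) achieves near state-of-the-art performance on mini-ImageNet and CIFAR-FS even with a shallow architecture.
LR-D2 (iterative logistic regression) achieves similar performance across different iteration counts, showing the flexibility of IRLS within this meta-learning framework.
On Omniglot the method is competitive and works well across a range of problems, including higher-shot settings.
The Woodbury-based formulation substantially reduces computational cost when the embedding dimension is high and the episode size is small.
Calibrating the regression outputs (scaling and bias) is effective at improving compatibility with the cross-entropy loss in few-shot classification.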
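
To make the role of the iteration count in an IRLS-based learner such as LR-D2 concrete, here is a minimal NumPy sketch of L2-regularized logistic regression solved by IRLS, with each Newton step written in the Woodbury form so that only an n x n system is solved. It is an illustration under stated assumptions, not the authors' code: the binary setting, the function name `irls_logistic`, the fixed `lam`, the clipping constant, and the toy data are all hypothetical, and the code shows only the forward computation that a meta-learner would differentiate through.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def irls_logistic(X, y, lam=1.0, n_iters=5):
    """Binary logistic regression via IRLS (Newton steps on the
    L2-regularized negative log-likelihood), starting from w = 0.

    Each step solves w = X^T (X X^T + lam * S^-1)^-1 z, an (n x n) system,
    where S = diag(mu * (1 - mu)) and z = X w + S^-1 (y - mu).
    """
    n, e = X.shape
    w = np.zeros(e)
    K = X @ X.T                                  # (n x n) Gram matrix
    for _ in range(n_iters):
        mu = sigmoid(X @ w)
        s = np.clip(mu * (1.0 - mu), 1e-6, None) # guard against division by ~0
        z = X @ w + (y - mu) / s                 # working response
        w = X.T @ np.linalg.solve(K + lam * np.diag(1.0 / s), z)
    return w

def grad_norm(X, y, w, lam=1.0):
    """Norm of the regularized loss gradient; near zero at the optimum."""
    return np.linalg.norm(X.T @ (sigmoid(X @ w) - y) + lam * w)

# Toy binary episode: 10 support samples with 256-dim stand-in embeddings.
rng = np.random.default_rng(0)
X = rng.standard_normal((10, 256))
y = np.array([0, 1] * 5, dtype=float)

w_1 = irls_logistic(X, y, n_iters=1)
w_10 = irls_logistic(X, y, n_iters=10)
train_pred = (sigmoid(X @ w_10) > 0.5).astype(float)
```

The number of IRLS steps is a plain loop bound here, which is the knob the result above refers to: more steps move the weights closer to the regularized optimum, at the cost of a longer computation to backpropagate through.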


This review was generated by AI and reviewed by a human editor.