QUICK REVIEW

[Paper Review] Meta-Learning with Implicit Gradients

Aravind Rajeswaran, Chelsea Finn|arXiv (Cornell University)|2019. 09. 10.

Domain Adaptation and Few-Shot Learning|Citations: 217

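One-Line Summary

The paper introduces iMAML (implicit MAML), a memory-efficient meta-learning method that computes accurate meta-gradients without differentiating through the inner-loop optimization path. It relies on implicit differentiation and Hessian-vector products. This decouples the meta-gradient from the inner optimizer while achieving competitive or better performance on few-shot recognition benchmarks.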

ABSTRACT

A core capability of intelligent systems is the ability to quickly learn new tasks by drawing on prior experience. Gradient (or optimization) based meta-learning has recently emerged as an effective approach for few-shot learning. In this formulation, meta-parameters are learned in the outer loop, while task-specific models are learned in the inner-loop, by using only a small amount of data from the current task. A key challenge in scaling these approaches is the need to differentiate through the inner loop learning process, which can impose considerable computational and memory burdens. By drawing upon implicit differentiation, we develop the implicit MAML algorithm, which depends only on the solution to the inner level optimization and not the path taken by the inner loop optimizer. This effectively decouples the meta-gradient computation from the choice of inner loop optimizer. As a result, our approach is agnostic to the choice of inner loop optimizer and can gracefully handle many gradient steps without vanishing gradients or memory constraints. Theoretically, we prove that implicit MAML can compute accurate meta-gradients with a memory footprint that is, up to small constant factors, no more than that which is required to compute a single inner loop gradient and at no overall increase in the total computational cost. Experimentally, we show that these benefits of implicit MAML translate into empirical gains on few-shot image recognition benchmarks.

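Research Motivation and Goals

Raises the scalability problem of gradient-based meta-learning that arises when the inner-loop optimization must be differentiated through.
Proposes computing the meta-gradient via implicit differentiation, so that it depends only on the inner solution and not on the optimization path.
Develops the iMAML algorithm with proximal regularization, which stabilizes the inner optimization and enables memory efficiency.
Provides theoretical guarantees on the memory and computation needed for approximate meta-gradients, and demonstrates empirical benefits on few-shot learning tasks.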

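Proposed Method

Formulates meta-learning as a bi-level optimization whose inner problem is regularized by a proximal term centered on the meta-parameters.
Derives the implicit Jacobian of the inner optimization solution, yielding the meta-gradient without differentiating through the inner loop.
Introduces a practical iMAML algorithm that combines a δ-accurate inner solver with a δ'-approximate Jacobian computation, obtained by conjugate gradient (CG) using only Hessian-vector products.
Shows that iMAML matches the minimax complexity of backpropagating through the inner optimization while using O(1) memory with respect to the number of inner steps.
Provides theoretical guarantees: an ε-approximate meta-gradient can be obtained with memory independent of the number of inner iterations, using CG-based Hessian-vector products.
Demonstrates empirically, on Omniglot and Mini-ImageNet, competitive performance relative to MAML and FOMAML along with a favorable compute/memory trade-off.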
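
For readers who want the formulation behind the points above, it can be written out roughly as follows. The notation is an assumption chosen for this sketch and does not appear in the review text: θ denotes the meta-parameters, φ the task parameters, λ the proximal regularization strength, L̂_i the training loss and L_i the test loss of task i, and M the number of tasks.

```latex
% Outer (meta) problem over M tasks
\theta^{*} = \arg\min_{\theta} \frac{1}{M} \sum_{i=1}^{M}
    \mathcal{L}_i\!\left(\phi_i^{*}(\theta)\right)

% Inner problem with a proximal term centered on the meta-parameters
\phi_i^{*}(\theta) = \arg\min_{\phi}\;
    \hat{\mathcal{L}}_i(\phi) + \frac{\lambda}{2}\,\lVert \phi - \theta \rVert^{2}

% Stationarity of the inner problem
\nabla_{\phi} \hat{\mathcal{L}}_i(\phi_i^{*}) + \lambda\,(\phi_i^{*} - \theta) = 0

% Differentiating the stationarity condition with respect to theta
\frac{d\phi_i^{*}}{d\theta}
    = \left(I + \frac{1}{\lambda}\,\nabla_{\phi}^{2} \hat{\mathcal{L}}_i(\phi_i^{*})\right)^{-1}

% Per-task meta-gradient: one linear solve, no unrolling of the inner loop
g_i = \left(I + \frac{1}{\lambda}\,\nabla_{\phi}^{2} \hat{\mathcal{L}}_i(\phi_i^{*})\right)^{-1}
      \nabla_{\phi} \mathcal{L}_i(\phi_i^{*})
```

The last line is why the meta-gradient depends only on the inner solution: the linear system can be solved with conjugate gradient, which needs Hessian-vector products but never the full Hessian or the optimizer's trajectory.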

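Experimental Results

Research Questions

RQ1: Can implicit differentiation yield accurate meta-gradients without differentiating through the inner optimization path?
RQ2: How do the memory and compute costs of iMAML compare with standard MAML as the number of inner-loop steps grows?
RQ3: Can iMAML-based meta-gradients scale to more complex inner optimizers and larger datasets without vanishing gradients?
RQ4: Do the experimental results on few-shot benchmarks support the theoretical memory/compute advantages and the performance gains?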

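Key Results

iMAML can compute accurate meta-gradients with memory that does not grow with the number of inner-loop steps, while its total computation is comparable to backpropagation-based methods.
On a synthetic test, iMAML asymptotically matches the exact meta-gradient and provides a better finite-step approximation than MAML.
On Omniglot, iMAML with a gradient-descent inner loop is competitive with full MAML and outperforms first-order variants, and Hessian-free inner optimization provides additional gains.
On Mini-ImageNet, iMAML achieves higher accuracy than MAML and FOMAML in the reported setting.
The theoretical results show that an ε-approximate meta-gradient can be obtained in a memory-efficient way using CG-based Hessian-vector products, and that iMAML finds a stationary point of the outer objective under mild assumptions.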
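
The synthetic-test claim can be illustrated with a small NumPy sketch. This is not the authors' code or experimental setup. It assumes a toy quadratic inner loss 0.5·φᵀAφ − bᵀφ with a proximal term (λ/2)·‖φ − θ‖² and an outer loss 0.5·‖φ − target‖², so that the exact meta-gradient has a closed form to compare against. All function names (`conjugate_gradient`, `inner_solve`, `implicit_meta_gradient`, `run_synthetic_check`) and all parameter values are hypothetical.

```python
import numpy as np


def conjugate_gradient(matvec, rhs, num_iters=50, tol=1e-10):
    """Solve matvec(x) = rhs for a symmetric positive definite operator."""
    x = np.zeros_like(rhs)
    r = rhs.copy()   # residual rhs - matvec(x), valid because x starts at 0
    p = r.copy()     # current search direction
    rs_old = r @ r
    for _ in range(num_iters):
        if np.sqrt(rs_old) < tol:
            break
        Ap = matvec(p)
        alpha = rs_old / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x


def inner_solve(A, b, theta, lam, num_steps=500):
    """Gradient descent on the regularized inner problem (toy quadratic).

    Only the final iterate is returned; no intermediate state is kept.
    """
    step = 1.0 / (np.linalg.eigvalsh(A).max() + lam)
    phi = theta.copy()
    for _ in range(num_steps):
        grad = A @ phi - b + lam * (phi - theta)
        phi = phi - step * grad
    return phi


def implicit_meta_gradient(hvp, outer_grad, lam):
    """Solve (I + H / lam) g = outer_grad using Hessian-vector products only."""
    return conjugate_gradient(lambda v: v + hvp(v) / lam, outer_grad)


def run_synthetic_check(dim=5, lam=2.0, seed=0):
    rng = np.random.default_rng(seed)
    # Random symmetric positive definite Hessian with eigenvalues in [0.5, 5].
    Q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    A = Q @ np.diag(np.linspace(0.5, 5.0, dim)) @ Q.T
    b = rng.standard_normal(dim)
    theta = rng.standard_normal(dim)
    target = rng.standard_normal(dim)

    # Implicit route: inner solution, outer gradient, then one CG solve.
    phi = inner_solve(A, b, theta, lam)
    outer_grad = phi - target  # gradient of 0.5 * ||phi - target||^2
    implicit = implicit_meta_gradient(lambda v: A @ v, outer_grad, lam)

    # Closed-form reference, available only because the toy problem is quadratic.
    shifted = A + lam * np.eye(dim)
    phi_exact = np.linalg.solve(shifted, b + lam * theta)
    exact = lam * np.linalg.solve(shifted, phi_exact - target)
    return implicit, exact
```

Calling `implicit, exact = run_synthetic_check()` and printing `np.max(np.abs(implicit - exact))` gives the gap between the two routes. Under the assumptions above, the implicit route stores only the final inner iterate and a few CG vectors, regardless of `num_steps`.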

This review was generated by AI and checked by a human editor.