QUICK REVIEW

[Paper Review] Functional Regularisation for Continual Learning with Gaussian Processes

Michalis K. Titsias, Jonathan Schwarz | arXiv (Cornell University) | 31 January 2019

Gaussian Processes and Bayesian Inference | References 48 | Citations 66

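One-line summary

This paper proposes a continual learning framework that regularises learning in function space: it maintains Gaussian-process posteriors over task-specific functions, summarised at inducing points, and uses them to prevent forgetting.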

ABSTRACT

We introduce a framework for Continual Learning (CL) based on Bayesian inference over the function space rather than the parameters of a deep neural network. This method, referred to as functional regularisation for Continual Learning, avoids forgetting a previous task by constructing and memorising an approximate posterior belief over the underlying task-specific function. To achieve this we rely on a Gaussian process obtained by treating the weights of the last layer of a neural network as random and Gaussian distributed. Then, the training algorithm sequentially encounters tasks and constructs posterior beliefs over the task-specific functions by using inducing point sparse Gaussian process methods. At each step a new task is first learnt and then a summary is constructed consisting of (i) inducing inputs -- a fixed-size subset of the task inputs selected such that it optimally represents the task -- and (ii) a posterior distribution over the function values at these inputs. This summary then regularises learning of future tasks, through Kullback-Leibler regularisation terms. Our method thus unites approaches focused on (pseudo-)rehearsal with those derived from a sequential Bayesian inference perspective in a principled way, leading to strong results on accepted benchmarks.

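Motivation and objectives

Enable continual learning without catastrophic forgetting by working in function space rather than parameter space.
Propose a scalable Bayesian method that summarises and preserves task knowledge using sparse Gaussian processes.
Unify rehearsal-based learning and Bayesian continual learning through inducing-point summaries and KL regularisation.
Enable sequential learning when task boundaries are unknown, via a task-change detection mechanism.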

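Proposed method

Treat the weights of the network's last layer as random and Gaussian distributed, which induces a Gaussian process over the task-specific functions.
For each task, obtain a sparse variational GP approximation using inducing points.
For each task i, maintain and optimise a variational distribution q(u_i) over the function values at the inducing points.
Regularise learning of each new task with KL(q(u_i) || p_θ(u_i)) terms, which accumulate over past tasks.
When learning task k, optimise an ELBO that combines the likelihood of the current task with the KL terms for all previous tasks.
Optionally perform weight-space inference on the current task to improve posterior accuracy, then distil it into an inducing-point summary used for regularisation.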
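
The KL regulariser in the steps above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes last-layer weights w ~ N(0, sigma_w^2 I), so that the induced kernel is k(x, x') = sigma_w^2 phi(x)^T phi(x'), and a Gaussian q(u_i) = N(mu, S). The function names, the sigma_w parameter, the jitter term, and the random stand-in features are all hypothetical.

```python
import numpy as np

def last_layer_kernel(phi_a, phi_b, sigma_w=1.0):
    """Kernel induced by a Gaussian last layer (assumed form):
    k(x, x') = sigma_w^2 * phi(x)^T phi(x')."""
    return sigma_w ** 2 * phi_a @ phi_b.T

def kl_to_gp_prior(mu, S, K, jitter=1e-6):
    """KL( N(mu, S) || N(0, K) ) for M inducing values.
    The jitter is an illustrative numerical safeguard, not from the paper."""
    M = mu.shape[0]
    K = K + jitter * np.eye(M)
    chol_K = np.linalg.cholesky(K)
    logdet_K = 2.0 * np.sum(np.log(np.diag(chol_K)))
    _, logdet_S = np.linalg.slogdet(S)
    trace_term = np.trace(np.linalg.solve(K, S))
    maha_term = mu @ np.linalg.solve(K, mu)
    return 0.5 * (trace_term + maha_term - M + logdet_K - logdet_S)

# Toy illustration with random stand-in features (hypothetical data).
rng = np.random.default_rng(0)
M, D = 5, 16                          # inducing points, feature width
phi_old = rng.normal(size=(M, D))     # features when task i was learnt
mu = rng.normal(size=M)               # stored posterior mean of u_i
S = 0.1 * np.eye(M)                   # stored posterior covariance of u_i

# While a later task is trained the features drift, so the prior
# p_theta(u_i) = N(0, K_Z) changes and the KL penalty changes with it.
phi_new = phi_old + 0.5 * rng.normal(size=(M, D))
print("KL under old features:", kl_to_gp_prior(mu, S, last_layer_kernel(phi_old, phi_old)))
print("KL under new features:", kl_to_gp_prior(mu, S, last_layer_kernel(phi_new, phi_new)))
```

In this sketch the stored summary (mu, S) stays fixed while the prior covariance is recomputed from the current features, which is what makes the term a penalty on the shared feature parameters θ when a later task is trained.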

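Experimental results

Research questions

RQ1: Does FRCL achieve state-of-the-art performance on standard continual learning benchmarks?
RQ2: How important is the inducing-point selection criterion for performance and scalability?
RQ3: Can task boundaries be detected automatically using Bayesian uncertainty?
RQ4: How does function-space regularisation compare with weight-space regularisation and rehearsal-based approaches?
RQ5: How does the number of inducing points per task affect accuracy and forgetting?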

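Key findings

FRCL achieves strong performance in the experiments and sets a new state of the art on the Permuted-MNIST and Omniglot benchmarks.
An approximate posterior distribution over task function values provides more effective regularisation than a simple replay buffer.
Optimising the inducing points (especially with the trace-based criterion) substantially improves performance as the inducing set shrinks.
FRCL's predictions adapt automatically to changes in the feature representation, which counteracts forgetting of earlier tasks.
When optimised, inducing points tend to spread evenly across classes, supporting a diverse representation of the task.
Rehearsal-based baselines are strong competitors, but FRCL offers principled, uncertainty-aware regularisation that complements replay buffers.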
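
A trace-based selection criterion of the kind mentioned above can be illustrated as follows. The sketch assumes the criterion takes the standard sparse-GP form tr(K_XX - K_XZ K_ZZ^{-1} K_ZX), which is small when the inducing inputs Z explain the kernel over all task inputs X. The greedy forward search is a simple stand-in chosen for illustration and is not necessarily the search procedure used in the paper. The function names, the jitter term, and the random features are hypothetical.

```python
import numpy as np

def trace_criterion(K, idx, jitter=1e-6):
    """tr(K_XX - K_XZ K_ZZ^{-1} K_ZX), with Z = X[idx] (assumed form)."""
    idx = np.asarray(idx)
    K_zz = K[np.ix_(idx, idx)] + jitter * np.eye(len(idx))
    K_xz = K[:, idx]
    return np.trace(K) - np.trace(K_xz @ np.linalg.solve(K_zz, K_xz.T))

def greedy_select(K, m):
    """Pick m inducing inputs one at a time, each time adding the
    candidate that lowers the trace criterion the most."""
    chosen = []
    for _ in range(m):
        candidates = [j for j in range(K.shape[0]) if j not in chosen]
        scores = [trace_criterion(K, chosen + [j]) for j in candidates]
        chosen.append(candidates[int(np.argmin(scores))])
    return chosen

# Toy illustration: kernel built from random stand-in features.
rng = np.random.default_rng(0)
phi = rng.normal(size=(30, 8))        # 30 task inputs, 8 features
K = phi @ phi.T
random_idx = list(rng.choice(30, size=4, replace=False))
greedy_idx = greedy_select(K, 4)
print("random subset:", trace_criterion(K, random_idx))
print("greedy subset:", trace_criterion(K, greedy_idx))
```

The cost of this greedy search grows quickly with the number of task inputs, so it is only meant to make the criterion concrete on small examples.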

This review was generated by AI and checked by a human editor.