QUICK REVIEW

[Paper Review] Selfless Sequential Learning

Rahaf Aljundi, Marcus Rohrbach | arXiv (Cornell University) | 2018. 06. 14.

Domain Adaptation and Few-Shot Learning | References: 40 | Citations: 53

One-Line Summary

This paper introduces SLNID, a representation-based regularizer that promotes sparse, locally inhibited, neuron importance-aware activations in order to improve lifelong learning under a fixed model capacity.

ABSTRACT

Sequential learning, also called lifelong learning, studies the problem of learning tasks in a sequence with access restricted to only the data of the current task. In this paper we look at a scenario with fixed model capacity, and postulate that the learning process should not be selfish, i.e. it should account for future tasks to be added and thus leave enough capacity for them. To achieve Selfless Sequential Learning we study different regularization strategies and activation functions. We find that imposing sparsity at the level of the representation (i.e. neuron activations) is more beneficial for sequential learning than encouraging parameter sparsity. In particular, we propose a novel regularizer, that encourages representation sparsity by means of neural inhibition. It results in few active neurons which in turn leaves more free neurons to be utilized by upcoming tasks. As neural inhibition over an entire layer can be too drastic, especially for complex tasks requiring strong representations, our regularizer only inhibits other neurons in a local neighbourhood, inspired by lateral inhibition processes in the brain. We combine our novel regularizer, with state-of-the-art lifelong learning methods that penalize changes to important previously learned parts of the network. We show that our new regularizer leads to increased sparsity which translates in consistent performance improvement over alternative regularizers we studied on diverse datasets.

Motivation and Objectives

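Motivates lifelong learning under a fixed model capacity by the need to leave room for future tasks.
Compares representation sparsity with parameter sparsity as a way to reduce interference between tasks.
Proposes a new regularizer (SLNID) that implements local neural inhibition and neuron importance discounting.
Demonstrates that representation-based sparsity and SLNID improve performance across a range of datasets and baselines.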

Proposed Method

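Proposes a regularization framework that combines MAS/EWC-style importance preservation with a representation-level sparsity objective on the activations.
Introduces Sparse coding through Local Neural Inhibition and Discounting (SLNID), which penalizes the simultaneous activation of neighbouring neurons (local inhibition).
Extends the local inhibition term with neuron importance discounting, so that neurons that were important for previous tasks are protected from inhibition when they are relevant to the current task.
Formulates SLNID as a local, Gaussian-weighted inhibition term over the hidden activations, modulated by the neuron importance (alpha_i) learned on previous tasks.
Integrates SLNID with MAS (and also shows compatibility with EWC) and evaluates it on permuted MNIST, CIFAR-100, and Tiny ImageNet.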
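
The method described above can be illustrated with a small NumPy sketch. It is not the authors' implementation: the exponential form of the importance discount, the normalization constant, the default `sigma`, and the names `slnid_penalty`, `importance_penalty`, `total_loss`, `lam_ssl`, and `lam_omega` are all assumptions made for illustration. The sketch only follows what the summary states: a Gaussian-weighted penalty on the co-activation of nearby neurons, reduced for neurons with high importance alpha_i, added to a MAS/EWC-style penalty on changes to important parameters.

```python
import numpy as np


def slnid_penalty(h, alpha=None, sigma=2.0):
    """Local-inhibition penalty on a batch of non-negative activations.

    h:     (M, N) array, M examples by N neurons of one hidden layer.
    alpha: (N,) neuron importance from earlier tasks; None disables discounting.
    sigma: width of the Gaussian neighbourhood over neuron indices (assumed).
    """
    h = np.asarray(h, dtype=float)
    m, n = h.shape
    alpha = np.zeros(n) if alpha is None else np.asarray(alpha, dtype=float)

    # Co-activation of every neuron pair, summed over the batch: shape (N, N).
    coact = h.T @ h

    # Gaussian locality weight: neurons with close indices inhibit each other most.
    idx = np.arange(n)
    locality = np.exp(-((idx[:, None] - idx[None, :]) ** 2) / (2.0 * sigma ** 2))

    # Importance discount (assumed exponential form): pairs involving
    # previously important neurons contribute less to the penalty.
    discount = np.exp(-(alpha[:, None] + alpha[None, :]))

    weights = locality * discount
    np.fill_diagonal(weights, 0.0)  # a neuron does not inhibit itself

    # Normalization by batch size and neuron count is an assumption.
    return float((weights * coact).sum() / (m * max(n - 1, 1)))


def importance_penalty(theta, theta_prev, omega):
    """MAS/EWC-style quadratic penalty on changes to important parameters."""
    theta, theta_prev, omega = (
        np.asarray(x, dtype=float) for x in (theta, theta_prev, omega)
    )
    return float((omega * (theta - theta_prev) ** 2).sum())


def total_loss(task_loss, h, alpha, theta, theta_prev, omega,
               lam_ssl=1e-3, lam_omega=1.0):
    """Task loss plus both regularizers; the lam_* weights are hypothetical."""
    return (task_loss
            + lam_ssl * slnid_penalty(h, alpha)
            + lam_omega * importance_penalty(theta, theta_prev, omega))


if __name__ == "__main__":
    dense = np.ones((4, 6))          # every neuron fires on every example
    sparse = np.zeros((4, 6))
    sparse[:, 0] = 1.0               # a single neuron fires
    print(slnid_penalty(dense), slnid_penalty(sparse))
    print(slnid_penalty(dense, alpha=np.full(6, 5.0)))
```

In this sketch a layer where one neuron fires alone incurs no penalty, a layer where all neurons fire together is penalized, and raising `alpha` shrinks the penalty for the same activations, which is the intended protective effect of discounting.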

Experimental Results

Research Questions

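RQ1: In a fixed-capacity model, does representation sparsity (sparsity of the activations) yield better lifelong learning performance than parameter sparsity?
RQ2: Can local (rather than global) neural inhibition, combined with neuron importance discounting, preserve past knowledge while reserving capacity for future tasks?
RQ3: How does SLNID perform across datasets (permuted MNIST, CIFAR-100, Tiny ImageNet) and across base lifelong learning (LLL) methods (MAS, EWC)?
RQ4: How does SLNID affect capacity usage (active and important neurons) and the representation (sparsity and decorrelation) over a sequence of tasks?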

Key Findings

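Representation-based regularizers outperform parameter-based ones in the sequential learning setting.
The SLNID regularizer reaches higher accuracy at the end of the sequence and retains more capacity for future tasks across datasets.
Local inhibition and neuron importance discounting improve robustness to forgetting, with gains ranging from a few to tens of percentage points over strong baselines on the permuted MNIST, CIFAR-100, and Tiny ImageNet sequences.
Combined with MAS (and combinable with EWC), SLNID consistently improves performance and lets a smaller network match a larger unregularized model.
Ablations show that locality and importance discounting are critical to performance, and that SLNID produces sparser activations and leaves more unused parameters for future tasks.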
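
The sparsity and decorrelation effects mentioned in RQ4 and in the findings above can be quantified in several ways. The sketch below shows two simple measures and is not the paper's evaluation protocol: the function names, the `threshold` default, and the choice of mean absolute off-diagonal correlation are assumptions made for illustration.

```python
import numpy as np


def active_fraction(h, threshold=0.0):
    """Fraction of (example, neuron) entries whose activation exceeds threshold."""
    h = np.asarray(h, dtype=float)
    return float((h > threshold).mean())


def mean_abs_offdiag_correlation(h, eps=1e-12):
    """Mean absolute correlation between distinct neurons over a batch.

    h is an (M, N) activation matrix with N >= 2 neurons. Values near 0
    indicate decorrelated neurons; values near 1 indicate redundant ones.
    """
    h = np.asarray(h, dtype=float)
    centered = h - h.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / max(h.shape[0] - 1, 1)
    std = np.sqrt(np.diag(cov))
    # eps keeps constant (zero-variance) neurons from causing a division by zero.
    corr = cov / (np.outer(std, std) + eps)
    off_diagonal = corr[~np.eye(corr.shape[0], dtype=bool)]
    return float(np.abs(off_diagonal).mean())


if __name__ == "__main__":
    h = np.array([[1.0, 0.0, 0.0, 0.0],
                  [0.0, 2.0, 0.0, 0.0]])
    print(active_fraction(h))  # 2 active entries out of 8
```

A lower `active_fraction` after training on a task would suggest that more neurons remain free for later tasks, under the assumption that a near-zero activation means the neuron is unused.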

This review was generated by AI and checked by a human editor.