[Paper Walkthrough] Incremental Learning for Metric-Based Meta-Learners
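This paper proposes an incremental learning framework for metric-based meta-learners that allows continual adaptation during meta-training without catastrophic forgetting. The meta-learner is updated incrementally as new data arrives, using existing metric-based algorithms, and the approach stays stable while reaching performance comparable to training on the full dataset.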
The majority of modern meta-learning methods for few-shot classification operate in two phases: a meta-training phase, in which the meta-learner learns a generic representation by solving multiple few-shot tasks sampled from a large dataset, and a testing phase, in which the meta-learner leverages its learned internal representation for a specific few-shot task involving classes that were not seen during meta-training. To the best of our knowledge, all such methods sample tasks from a single base dataset during meta-training and do not adapt the algorithm afterwards. This strategy may not scale to real-world use cases in which the meta-learner may not have access to the full meta-training dataset from the beginning and must be updated incrementally as additional training data becomes available. Through our experimental setup, we develop a notion of incremental learning during the meta-training phase and propose a method that can be used with multiple existing metric-based meta-learning algorithms. Experimental results on benchmark data show that our approach performs favorably at test time compared to training a model on the full meta-training set and incurs a negligible amount of catastrophic forgetting.
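Research Motivation and Goals
- Address the limitation that existing meta-learning methods depend on a single, static meta-training dataset.
- Let the meta-learner adapt continually as new data arrives in real-world settings.
- Develop a general method that is compatible with multiple existing metric-based meta-learning algorithms.
- Minimize catastrophic forgetting during incremental updates while preserving performance on few-shot classification tasks.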
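Proposed Method
- The method introduces a new incremental learning protocol into the meta-training phase of metric-based meta-learners.
- It lets the meta-learner be updated incrementally as new batches of data arrive, with no need to retrain from scratch.
- The method keeps the capacity of the internal representation fixed and uses experience replay or parameter regularization to reduce forgetting.
- It is designed to be compatible with existing metric-based meta-learners such as Prototypical Networks and Matching Networks.
- The framework uses a task sampling strategy that draws on both old and new tasks during incremental updates.
- By retaining knowledge of previously seen classes throughout incremental learning, the method keeps performance stable.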
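The task sampling and metric-based classification steps listed above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the `old_fraction` mixing ratio, and the choice to mix old and new classes inside a single episode are all assumptions made for illustration. The classification rule (mean support embedding per class, nearest prototype by squared Euclidean distance) follows the standard Prototypical Networks formulation.

```python
import random
from typing import Dict, List, Sequence

Vector = Sequence[float]


def sample_mixed_episode(old_classes: Sequence[str], new_classes: Sequence[str],
                         n_way: int, old_fraction: float,
                         rng: random.Random) -> List[str]:
    """Pick n_way class labels, part of them from previously seen classes.

    old_fraction is a hypothetical knob, not a parameter from the paper.
    """
    n_old = min(len(old_classes), round(n_way * old_fraction))
    n_new = n_way - n_old
    if n_new > len(new_classes):
        raise ValueError("not enough new classes for the requested episode")
    return rng.sample(list(old_classes), n_old) + rng.sample(list(new_classes), n_new)


def class_prototypes(support: Dict[str, List[Vector]]) -> Dict[str, List[float]]:
    """Compute one prototype per class as the mean of its support embeddings."""
    protos = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        protos[label] = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return protos


def nearest_prototype(query: Vector, protos: Dict[str, List[float]]) -> str:
    """Return the label whose prototype is closest in squared Euclidean distance."""
    def sq_dist(a: Vector, b: Vector) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(protos, key=lambda label: sq_dist(query, protos[label]))
```

In a real pipeline the vectors would come from a trained embedding network, and the episode loss would be backpropagated through it; here the embeddings are passed in directly so the sketch stays self-contained.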
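Experimental Results
Research Questions
- RQ1: Can the meta-learner be updated incrementally in an effective way when new data arrives after the initial meta-training?
- RQ2: How does incremental meta-learning compare with full-dataset meta-training in terms of test accuracy?
- RQ3: To what extent does the proposed method mitigate catastrophic forgetting during incremental updates?
- RQ4: Does the method deliver consistent performance across multiple existing metric-based meta-learning algorithms?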
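Key Findings
- The proposed incremental learning method reaches test performance comparable to training on the complete meta-training dataset.
- Catastrophic forgetting is negligible: performance on previously seen classes drops only minimally during incremental updates.
- After incremental updates, the method keeps high accuracy on both previously seen and new few-shot tasks.
- The method is effective on multiple benchmark datasets and compatible with several metric-based meta-learners.
- The incremental training protocol makes scalable deployment possible in real-world settings where data arrives sequentially.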
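The two quantities these findings compare can be written down as simple differences. The sketch below is one plausible way to compute them, under assumed definitions rather than the paper's exact evaluation code: forgetting is taken as the accuracy lost on old-class tasks after an update, and the gap to full training as the accuracy difference against a model trained on the complete meta-training set. All function names are hypothetical.

```python
from statistics import mean
from typing import List


def mean_accuracy(episode_correct: List[List[bool]]) -> float:
    """Average per-episode accuracy, where each inner list marks correct queries."""
    return mean(sum(ep) / len(ep) for ep in episode_correct)


def forgetting(acc_before: float, acc_after: float) -> float:
    """Accuracy lost on old-class tasks after an incremental update.

    A positive value means the model got worse on previously seen classes.
    """
    return acc_before - acc_after


def gap_to_full_training(acc_incremental: float, acc_full: float) -> float:
    """How far the incrementally trained model trails full-dataset training."""
    return acc_full - acc_incremental
```

A value of `forgetting` near zero on old-class episodes, together with a small `gap_to_full_training` on test episodes, is what the findings above describe qualitatively.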
This walkthrough was generated by AI and reviewed by a human editor.