QUICK REVIEW

[Paper Review] Revisiting Fine-tuning for Few-shot Learning

Akihiro Nakamura, Tatsuya Harada|arXiv (Cornell University)|Oct 1, 2019

Domain Adaptation and Few-Shot Learning | References: 23 | Cited by: 30

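One-Sentence Summary

This paper revisits network fine-tuning for few-shot learning and shows that, with suitable hyperparameters (in particular a low learning rate, an adaptive optimizer such as Adam, and whole-network updates when the domain shift is large), fine-tuning matches or exceeds the accuracy of common purpose-built few-shot learning methods on low-resolution, high-resolution, and cross-domain few-shot image classification benchmarks.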

ABSTRACT

Few-shot learning is the process of learning novel classes using only a few examples and it remains a challenging task in machine learning. Many sophisticated few-shot learning algorithms have been proposed based on the notion that networks can easily overfit to novel examples if they are simply fine-tuned using only a few examples. In this study, we show that in the commonly used low-resolution mini-ImageNet dataset, the fine-tuning method achieves higher accuracy than common few-shot learning algorithms in the 1-shot task and nearly the same accuracy as that of the state-of-the-art algorithm in the 5-shot task. We then evaluate our method with more practical tasks, namely the high-resolution single-domain and cross-domain tasks. With both tasks, we show that our method achieves higher accuracy than common few-shot learning algorithms. We further analyze the experimental results and show that: 1) the retraining process can be stabilized by employing a low learning rate, 2) using adaptive gradient optimizers during fine-tuning can increase test accuracy, and 3) test accuracy can be improved by updating the entire network when a large domain-shift exists between base and novel classes.

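Motivation and Objectives

Re-evaluate how standard network fine-tuning performs in few-shot learning, challenging the assumption that it is inherently inferior to specialized algorithms.
Investigate whether fine-tuning can reach competitive accuracy on low-resolution, high-resolution, and cross-domain few-shot learning benchmarks.
Identify the key hyperparameters and training strategies that stabilize and improve fine-tuning in low-data regimes.
Analyze how domain shift affects the effectiveness of fine-tuning, and determine which network components should be updated.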

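Proposed Method

Fine-tune a pretrained deep neural network (e.g., ResNet-18, VGG-16) on the few-shot support set using standard stochastic gradient descent with a low learning rate.
Use adaptive gradient optimizers (e.g., Adam, Adamax, Adagrad, RMSprop) to improve convergence and test accuracy during fine-tuning.
Systematically compare fine-tuning different parts of the network: the classifier head only, the batch-normalization and fully connected layers, or the entire network.
Use a normalized classifier head to improve generalization and stability in few-shot classification.
Run experiments in three benchmark settings: low-resolution mini-ImageNet (the standard setting), high-resolution mini-ImageNet (a practical single-domain setting), and cross-domain datasets (with a large domain shift).
Tune the learning rate and the number of fine-tuning epochs on a validation set to ensure robustness across tasks.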
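
The fine-tuning step described above can be illustrated with a small, framework-free sketch. It is not the authors' implementation: it assumes a frozen backbone has already produced feature vectors, uses a plain linear softmax head rather than the normalized head, and the toy features, the learning rate of 1e-3, and the helper names (`softmax`, `loss_and_grad`, `adam_finetune`) are all illustrative assumptions. It only shows how an Adam update with a small learning rate adjusts a classifier head on a handful of support examples.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def loss_and_grad(weights, features, labels):
    """Mean cross-entropy of a linear head and its gradient w.r.t. the weights."""
    n_classes, dim = len(weights), len(weights[0])
    grad = [[0.0] * dim for _ in range(n_classes)]
    loss = 0.0
    for x, y in zip(features, labels):
        logits = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in weights]
        probs = softmax(logits)
        loss -= math.log(probs[y])
        for c in range(n_classes):
            # d(loss)/d(logit_c) for softmax cross-entropy is p_c - 1[c == y].
            delta = probs[c] - (1.0 if c == y else 0.0)
            for d in range(dim):
                grad[c][d] += delta * x[d]
    n = len(features)
    return loss / n, [[g / n for g in row] for row in grad]


def adam_finetune(weights, features, labels, lr=1e-3, steps=200,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """Update the head in place with Adam; return the loss measured before each update."""
    m = [[0.0] * len(row) for row in weights]  # first-moment estimates
    v = [[0.0] * len(row) for row in weights]  # second-moment estimates
    history = []
    for t in range(1, steps + 1):
        loss, grad = loss_and_grad(weights, features, labels)
        history.append(loss)
        for c, row in enumerate(grad):
            for d, g in enumerate(row):
                m[c][d] = beta1 * m[c][d] + (1 - beta1) * g
                v[c][d] = beta2 * v[c][d] + (1 - beta2) * g * g
                m_hat = m[c][d] / (1 - beta1 ** t)  # bias correction
                v_hat = v[c][d] / (1 - beta2 ** t)
                weights[c][d] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return history


# Toy 3-way 1-shot support set: one made-up 4-dimensional feature per class.
random.seed(0)
support_features = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(3)]
support_labels = [0, 1, 2]
head = [[0.0] * 4 for _ in range(3)]  # zero-initialized classifier head
losses = adam_finetune(head, support_features, support_labels, lr=1e-3, steps=200)
print(f"loss before: {losses[0]:.4f}, after: {losses[-1]:.4f}")
```

Because Adam normalizes each step by a running estimate of the gradient magnitude, every weight moves by roughly `lr` per update at the start, so a small learning rate keeps the head close to its initialization. In a real setup the same loop would run over the parameters of a pretrained network, with the learning rate and number of epochs chosen on a validation set as the list above describes.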

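Experimental Results

Research Questions

RQ1: When hyperparameters are properly tuned, can standard network fine-tuning outperform specialized few-shot learning algorithms?
RQ2: How does the choice of learning rate affect the stability and accuracy of fine-tuning in few-shot settings?
RQ3: Compared with standard SGD, do adaptive gradient optimizers such as Adam improve few-shot classification accuracy?
RQ4: Under what conditions is updating the entire network more effective than fine-tuning only the classifier head?
RQ5: How does the domain shift between base and novel classes affect the performance of different fine-tuning strategies?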

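Key Findings

On the 1-shot low-resolution mini-ImageNet task, fine-tuning with tuned hyperparameters achieves higher accuracy than common few-shot learning algorithms.
On the 5-shot low-resolution task, fine-tuning achieves nearly the same accuracy as the state-of-the-art algorithm.
A low learning rate (e.g., 0.0001) stabilizes the fine-tuning process and prevents divergence.
Adaptive gradient optimizers such as Adam notably improve test accuracy, especially when combined with a normalized classifier head.
When a large domain shift exists between base and novel classes, updating the entire network yields higher test accuracy.
The gain from whole-network fine-tuning is largest on cross-domain tasks, where the domain shift is greatest.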
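
The three update scopes compared in the paper (classifier head only, batch normalization plus fully connected layers, or the entire network) amount to choosing which parameters receive gradient updates. The sketch below is one hypothetical way to express that choice. The function `select_trainable`, the scope labels, and the parameter names (loosely modeled on a ResNet-style naming scheme) are assumptions made for illustration, not taken from the paper.

```python
def select_trainable(param_names, scope):
    """Return the parameter names to update for a given fine-tuning scope.

    scope: "head"  -> classifier head only
           "bn_fc" -> batch-normalization layers plus the classifier head
           "all"   -> every parameter in the network
    """
    if scope == "all":
        return list(param_names)
    if scope == "head":
        return [n for n in param_names if n.startswith("fc.")]
    if scope == "bn_fc":
        return [n for n in param_names
                if n.startswith("fc.") or n.startswith("bn") or ".bn" in n]
    raise ValueError(f"unknown scope: {scope!r}")


# Hypothetical parameter names for a small ResNet-style network.
names = [
    "conv1.weight",
    "bn1.weight", "bn1.bias",
    "layer1.0.conv1.weight",
    "layer1.0.bn1.weight", "layer1.0.bn1.bias",
    "fc.weight", "fc.bias",
]

for scope in ("head", "bn_fc", "all"):
    print(scope, select_trainable(names, scope))
```

Following the findings above, the "all" scope would be the natural choice when the novel classes come from a different domain than the base classes, while the narrower scopes update fewer parameters from the few available examples.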


This review was generated by AI and reviewed by a human editor.