[Paper Explained] LEEP: A New Measure to Evaluate Transferability of Learned Representations
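LEEP estimates how well a source model will transfer to a target task using a single forward pass over the target data and no training on the target task, and it predicts both transfer performance and convergence speed. Its correlation with actual transfer accuracy is better than that of NCE and H scores.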
We introduce a new measure to evaluate the transferability of representations learned by classifiers. Our measure, the Log Expected Empirical Prediction (LEEP), is simple and easy to compute: when given a classifier trained on a source data set, it only requires running the target data set through this classifier once. We analyze the properties of LEEP theoretically and demonstrate its effectiveness empirically. Our analysis shows that LEEP can predict the performance and convergence speed of both transfer and meta-transfer learning methods, even for small or imbalanced data. Moreover, LEEP outperforms recently proposed transferability measures such as negative conditional entropy and H scores. Notably, when transferring from ImageNet to CIFAR100, LEEP can achieve up to 30% improvement compared to the best competing method in terms of the correlations with actual transfer accuracy.
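Motivation and Objectives
- Motivate the need for reliable, inexpensive transferability estimates for deep representations.
- Introduce LEEP as a measure that requires only a single forward pass over the target data.
- Establish theoretical properties that relate LEEP to the average log-likelihood of the optimal retrained model (assuming the hypothesis space contains the EEP).
- Validate LEEP empirically across transfer and meta-transfer scenarios, including small and imbalanced data.
- Demonstrate its practical use for source model selection and convergence prediction.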
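Proposed Method
- Define LEEP as the average log-likelihood of the expected empirical predictor (EEP), a classifier built from the source model and the target data.
- Compute dummy source-label distributions by running the target data through the source model once.
- Estimate the empirical conditional distribution P(y|z) from the observed target labels and the dummy source labels.
- Compute T(θ, D) = (1/n) sum_i log(sum_z P(y_i|z) θ(x_i)_z) as the transferability score, where θ(x_i)_z is the probability that the source model θ assigns to source label z for target input x_i.
- Show that LEEP is a lower bound on the average log-likelihood of the optimal retrained model (assuming the hypothesis space contains the EEP).
- Relate LEEP to negative conditional entropy (NCE), and compare its interpretability and computational cost with earlier measures.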
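The steps above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' code: the function name `leep_score`, its argument names, and the small clipping constant are assumptions made here, and `source_probs` is assumed to hold the source model's softmax outputs on the target inputs.

```python
import numpy as np


def leep_score(source_probs: np.ndarray, target_labels: np.ndarray) -> float:
    """Sketch of LEEP (names and signature are illustrative assumptions).

    source_probs:  (n, |Z|) array; row i is theta(x_i), a distribution over
                   source labels z produced by one forward pass.
    target_labels: (n,) integer array of target labels y_i in {0, ..., |Y|-1}.
    """
    n = source_probs.shape[0]
    num_target = int(target_labels.max()) + 1

    # Empirical joint P(y, z) = (1/n) * sum over {i : y_i = y} of theta(x_i)_z.
    one_hot = np.eye(num_target)[target_labels]        # (n, |Y|)
    joint = one_hot.T @ source_probs / n               # (|Y|, |Z|)

    # Empirical marginal P(z) and conditional P(y | z) = P(y, z) / P(z).
    # The clip only guards against division by zero for unused source labels.
    marginal_z = joint.sum(axis=0, keepdims=True)      # (1, |Z|)
    cond_y_given_z = joint / np.clip(marginal_z, 1e-12, None)

    # EEP prediction: sum_z P(y | z) * theta(x_i)_z for every target label y.
    eep = source_probs @ cond_y_given_z.T              # (n, |Y|)

    # LEEP: average log-likelihood of the true target labels under the EEP.
    return float(np.mean(np.log(eep[np.arange(n), target_labels])))


# Toy check: uniform source predictions carry no information about y,
# so with two balanced target classes the score is log(1/2).
uniform_probs = np.full((2, 2), 0.5)
labels = np.array([0, 1])
uniform_score = leep_score(uniform_probs, labels)
```

Because the score is an average log-likelihood, it is never positive, and values closer to zero indicate that the source model's predictions are more informative about the target labels.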
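Experimental Results
Research Questions
- RQ1: Can LEEP predict the performance of common transfer learning methods (head retraining and fine-tuning) without any training on the target task?
- RQ2: Does LEEP correlate with meta-transfer learning performance (e.g., CNAPs) and with convergence speed during fine-tuning?
- RQ3: How does LEEP compare with NCE and H scores across data regimes (small, imbalanced, noisy) and source/target pairs?
- RQ4: Can LEEP help choose which source model to deploy for a given target task?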
Key Findings
| Measure | Experimental setup | LEEP correlation | NCE correlation | H score correlation | Details |
|---|---|---|---|---|---|
| LEEP | CIFAR10 -> CIFAR100, Source ResNet20 (pre-trained on CIFAR10) | 0.982 | 0.982 | 0.831 | Sec. 5.1 |
| LEEP | ImageNet -> CIFAR100, Source ResNet18 (pre-trained on ImageNet) | 0.974 | 0.973 | 0.924 | Sec. 5.1 |
| LEEP | CIFAR10 -> CIFAR100, Source ResNet20 (small, balanced) | 0.744 | 0.743 | 0.877 | Sec. 5.2 |
| LEEP | ImageNet -> CIFAR100, Source ResNet18 (small, balanced) | 0.798 | 0.715 | 0.026 (∗) | Sec. 5.2 |
| LEEP | CIFAR10 -> FashionMNIST, Source ResNet20 (small, balanced) | 0.518 | 0.429 | 0.787 | Sec. 5.2 |
| LEEP | ImageNet -> FashionMNIST, Source ResNet18 (small, balanced) | 0.631 | 0.622 | 0.005 (∗) | Sec. 5.2 |
| LEEP | CIFAR10 -> CIFAR100, Source ResNet18 (small, balanced, noisy) | 0.612 | 0.579 | 0.017 (∗) | Sec. 5.2 |
| LEEP | CIFAR10 -> CIFAR100, Source ResNet20 (small, imbalanced) | 0.862 | 0.847 | 0.787 | Sec. 5.3 |
| LEEP | ImageNet -> CIFAR100, Source ResNet18 (small, imbalanced) | 0.522 | 0.484 | -0.058 (∗) | Sec. 5.3 |
| LEEP | CIFAR10 -> FashionMNIST, Source ResNet20 (small, imbalanced) | 0.704 | 0.688 | 0.822 | Sec. 5.3 |
| LEEP | ImageNet -> FashionMNIST, Source ResNet18 (small, imbalanced) | 0.645 | 0.624 | 0.059 (∗) | Sec. 5.3 |
| LEEP | CIFAR10 -> CIFAR100, Source ResNet18 (large, balanced) | 0.967 | 0.967 | 0.787 | Sec. 5.1 |
| LEEP | ImageNet -> CIFAR100, Source ResNet18 (large, balanced) | 0.944 | 0.945 | 0.875 | Sec. 5.1 |
| LEEP | CIFAR10 -> CIFAR100, Source ResNet20 (small, balanced) | 0.396 | 0.401 | 0.737 | Sec. 5.2 |
| LEEP | ImageNet -> CIFAR100, Source ResNet18 (small, balanced) | 0.762 | 0.584 | -0.029 (∗) | Sec. 5.2 |
| LEEP | CIFAR10 -> FashionMNIST, Source ResNet20 (small, balanced) | 0.339 | 0.258 | 0.826 | Sec. 5.2 |
| LEEP | ImageNet -> FashionMNIST, Source ResNet18 (small, balanced) | 0.609 | 0.578 | 0.018 (∗) | Sec. 5.2 |
| LEEP | ImageNet -> CIFAR100, Source ResNet18 (small, balanced, noisy) | 0.348 | 0.324 | 0.06 (∗) | Sec. 5.2 |
| LEEP | CIFAR10 -> CIFAR100, Source ResNet20 (small, imbalanced) | 0.597 | 0.582 | 0.758 | Sec. 5.3 |
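- LEEP scores correlate strongly with transfer accuracy across tasks and transfer methods; in the Sec. 5.1 settings the Pearson correlation coefficient exceeds 0.94 with p < 0.001.
- LEEP remains predictive when the target data is small or imbalanced, and even when the labels are noisy.
- LEEP predicts how quickly fine-tuning converges and can indicate when a transferred model outperforms a reference model trained from scratch.
- In most comparisons LEEP outperforms NCE and H scores, improving the correlation with actual transfer accuracy by up to 30%.
- LEEP is also a usable measure for meta-transfer learning (CNAPs), where it shows a significant correlation with performance (0.591, p < 0.001).
- For source model selection, LEEP-based estimates track transfer performance more closely than NCE or H scores in most cases.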
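The evaluation protocol behind these findings, correlating a transferability score with measured transfer accuracy and then ranking candidate source models by score, can be sketched as below. The helper names `pearson_correlation` and `rank_source_models` and all numbers in the example are hypothetical placeholders chosen for illustration; they are not values from the paper.

```python
import numpy as np


def pearson_correlation(scores, accuracies) -> float:
    """Pearson correlation between transferability scores and accuracies."""
    scores = np.asarray(scores, dtype=float)
    accuracies = np.asarray(accuracies, dtype=float)
    # np.corrcoef returns the 2x2 correlation matrix; take the off-diagonal.
    return float(np.corrcoef(scores, accuracies)[0, 1])


def rank_source_models(score_by_model: dict) -> list:
    """Order candidate source models from highest to lowest score."""
    return sorted(score_by_model, key=score_by_model.get, reverse=True)


# Hypothetical placeholder data, NOT results from the paper:
# one (score, accuracy) pair per target task.
toy_scores = [-3.0, -2.5, -2.0, -1.5]
toy_accuracies = [0.40, 0.50, 0.55, 0.70]
r = pearson_correlation(toy_scores, toy_accuracies)

# Hypothetical candidates for a single target task, keyed by model name.
toy_candidates = {"model_a": -2.8, "model_b": -1.9, "model_c": -2.3}
ranking = rank_source_models(toy_candidates)
```

A correlation close to 1 means the score orders tasks almost the same way as the measured accuracy does, which is what makes picking the top-ranked source model a reasonable heuristic.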
This explainer was generated by AI and reviewed by a human editor.