QUICK REVIEW

[Paper Digest] Cross Attention Network for Few-shot Classification

Ruibing Hou, Hong Chang|arXiv (Cornell University)|Oct 17, 2019

Domain Adaptation and Few-Shot Learning | Cited by 143

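One-sentence summary

CAN introduces a Cross Attention Module that highlights target objects by modeling the semantic relevance between class features and query features, and adds a transductive inference step that uses confident query samples to augment the support set, reaching state-of-the-art results on few-shot benchmarks.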

ABSTRACT

Few-shot classification aims to recognize unlabeled samples from unseen classes given only few labeled samples. The unseen classes and low-data problem make few-shot classification very challenging. Many existing approaches extracted features from labeled and unlabeled samples independently, as a result, the features are not discriminative enough. In this work, we propose a novel Cross Attention Network to address the challenging problems in few-shot classification. Firstly, Cross Attention Module is introduced to deal with the problem of unseen classes. The module generates cross attention maps for each pair of class feature and query sample feature so as to highlight the target object regions, making the extracted feature more discriminative. Secondly, a transductive inference algorithm is proposed to alleviate the low-data problem, which iteratively utilizes the unlabeled query set to augment the support set, thereby making the class features more representative. Extensive experiments on two benchmarks show our method is a simple, effective and computationally efficient framework and outperforms the state-of-the-arts.

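Motivation and goals

Address the insufficiently discriminative features that unseen classes and scarce data cause in few-shot classification.
Propose a Cross Attention Module (CAM) that highlights target regions by learning the cross-correlation between class features and query features.
Introduce a transductive inference algorithm that uses unlabeled query samples to enrich the class representations.
Show that CAN is simple and efficient, and that it achieves state-of-the-art results on standard benchmarks.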

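Proposed method

Propose a Cross Attention Module (CAM) that computes cross attention maps between the class feature map and the query feature map through a correlation layer and a meta fusion layer.
Compute the class correlation map R^p and the query correlation map R^q from the cosine similarity of local features, then generate the attention maps A^p and A^q with a meta-learner that outputs a kernel w.
Apply residual attention by weighting the features with 1 + A^p and 1 + A^q, yielding the more discriminative features P̄ and Q̄.
Train CAN with the joint loss L = λL1 + L2, where L1 is a nearest-neighbor-based local supervision loss and L2 is a global classification loss.
At inference, either an inductive or a transductive strategy can be used; transductive inference augments the support set with pseudo-labeled query samples to refine the class features iteratively.
Optionally, CAM and transductive inference can be extended to other models (e.g., Matching Network, Prototypical Network, Relation Network).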
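The correlation, fusion, and residual attention steps listed above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the pooling step, the two-layer meta-learner with random weights `W1` and `W2`, the softmax, and the temperature `tau` are all hypothetical choices made here to produce a runnable example, and the function names are invented for illustration.

```python
import numpy as np

def l2_normalize(x, axis, eps=1e-8):
    """Scale vectors along `axis` to unit L2 norm."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def fuse(R, W1, W2, tau):
    """Meta fusion (assumed form): map an (m, m) correlation map to m attention weights."""
    pooled = R.mean(axis=0)                # averaged correlation vector, shape (m,)
    w = W2 @ np.maximum(W1 @ pooled, 0.0)  # hypothetical two-layer meta-learner -> kernel w, shape (m,)
    scores = (R @ w) / tau                 # one fused score per local position
    scores = scores - scores.max()         # stabilise the softmax
    e = np.exp(scores)
    return e / e.sum()

def cross_attention(P, Q, W1, W2, tau=0.025):
    """P, Q: class and query feature maps of shape (c, h, w). tau is an assumed value."""
    c, h, w_ = P.shape
    m = h * w_
    p = l2_normalize(P.reshape(c, m), axis=0)  # unit-norm local class features
    q = l2_normalize(Q.reshape(c, m), axis=0)  # unit-norm local query features
    R_p = p.T @ q                              # R_p[i, j] = cosine(p_i, q_j)
    R_q = R_p.T                                # R_q[j, i] = cosine(q_j, p_i)
    A_p = fuse(R_p, W1, W2, tau).reshape(h, w_)
    A_q = fuse(R_q, W1, W2, tau).reshape(h, w_)
    # Residual attention: weight each feature map by 1 + attention.
    return P * (1.0 + A_p), Q * (1.0 + A_q), A_p, A_q

# Toy inputs: 8 channels on a 3x3 grid, meta-learner with 4 hidden units.
rng = np.random.default_rng(0)
P = rng.normal(size=(8, 3, 3))
Q = rng.normal(size=(8, 3, 3))
W1 = rng.normal(size=(4, 9))
W2 = rng.normal(size=(9, 4))
P_bar, Q_bar, A_p, A_q = cross_attention(P, Q, W1, W2)
```

Because the weights are 1 + A rather than A alone, positions with near-zero attention keep their original features instead of being zeroed out, which is the point of the residual form.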

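Experimental results

Research questions

RQ1: In few-shot tasks, can cross attention between support (class) features and query features improve discrimination on unseen classes?
RQ2: Under data scarcity, does transductive inference that augments the support set with confidently pseudo-labeled query samples improve performance?
RQ3: Is the cross attention approach computationally efficient enough for practical use on standard few-shot benchmarks?
RQ4: Do CAM-guided features improve both inductive and transductive few-shot classification across multiple datasets?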

Key findings

Model	Embedding	Inference time (s)	miniImageNet 1-shot (%)	miniImageNet 5-shot (%)	tieredImageNet 1-shot (%)	tieredImageNet 5-shot (%)
CAN	ResNet-12	0.044	63.85 ± 0.48	79.44 ± 0.34	69.89 ± 0.51	84.23 ± 0.37
CAN+T	ResNet-12	-	67.19 ± 0.55	80.64 ± 0.35	73.21 ± 0.58	84.93 ± 0.38
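As a quick reading aid, the gain of the transductive variant (CAN+T) over inductive CAN can be computed directly from the two rows above. The snippet uses only the mean accuracies from the table and ignores the ± intervals; the variable names are illustrative.

```python
# Mean accuracies copied from the CAN and CAN+T rows of the table.
can = {
    "miniImageNet 1-shot": 63.85,
    "miniImageNet 5-shot": 79.44,
    "tieredImageNet 1-shot": 69.89,
    "tieredImageNet 5-shot": 84.23,
}
can_t = {
    "miniImageNet 1-shot": 67.19,
    "miniImageNet 5-shot": 80.64,
    "tieredImageNet 1-shot": 73.21,
    "tieredImageNet 5-shot": 84.93,
}

# Absolute gain in percentage points from adding transductive inference.
gains = {setting: round(can_t[setting] - can[setting], 2) for setting in can}
for setting, gain in gains.items():
    print(f"{setting}: +{gain:.2f}")
```

The gains are 3.34 and 3.32 points in the two 1-shot settings versus 1.20 and 0.70 points in the 5-shot settings, so the transductive step helps most when labeled support data is scarcest.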

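CAN achieves state-of-the-art results in the 5-way 1-shot and 5-way 5-shot settings on miniImageNet and tieredImageNet.
CAN+T (transductive CAN) outperforms prior transductive methods on the reported benchmarks, leading by 8% on 1-shot and 5% on 5-shot.
Ablation studies show that the global classification loss and the Cross Attention Module each improve performance substantially, and that the meta-learner in CAM is effective at generating adaptive kernels.
The transductive inference designed for CAN also generalizes to other few-shot models (Matching Network, Prototypical Network, Relation Network) and improves them.
Thanks to CAM's efficient correlation-based attention and lightweight meta-learner, CAN achieves these gains with little computational overhead and few additional parameters.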
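To make the transductive idea concrete, here is a minimal sketch applied to a Prototypical-Network-style classifier on plain feature vectors. It is an assumption-laden illustration rather than the paper's algorithm: the cosine-similarity confidence, the mean-based prototype update, and the parameters `n_iter` and `per_iter` are hypothetical choices, and CAN itself operates on feature maps refined by CAM.

```python
import numpy as np

def cosine_scores(queries, prototypes, eps=1e-8):
    """Cosine similarity between each query and each class prototype."""
    q = queries / (np.linalg.norm(queries, axis=1, keepdims=True) + eps)
    p = prototypes / (np.linalg.norm(prototypes, axis=1, keepdims=True) + eps)
    return q @ p.T  # shape (n_query, n_class)

def transductive_predict(support, support_labels, queries, n_class, n_iter=2, per_iter=2):
    """Iteratively refine prototypes with the most confident pseudo-labeled queries.

    Assumes every class has at least one support sample.
    """
    prototypes = np.stack(
        [support[support_labels == k].mean(axis=0) for k in range(n_class)]
    )
    for it in range(1, n_iter + 1):
        scores = cosine_scores(queries, prototypes)
        pseudo = scores.argmax(axis=1)       # pseudo-label for every query
        confidence = scores.max(axis=1)      # similarity to the nearest prototype
        t = min(it * per_iter, len(queries)) # candidate set grows each iteration
        chosen = np.argsort(-confidence)[:t] # indices of the t most confident queries
        refined = []
        for k in range(n_class):
            extra = queries[chosen][pseudo[chosen] == k]
            members = np.concatenate([support[support_labels == k], extra])
            refined.append(members.mean(axis=0))
        prototypes = np.stack(refined)
    return cosine_scores(queries, prototypes).argmax(axis=1)

# Toy 2-way 1-shot episode with four unlabeled queries.
support = np.array([[1.0, 0.0], [0.0, 1.0]])
support_labels = np.array([0, 1])
queries = np.array([[0.9, 0.1], [0.8, 0.3], [0.1, 0.9], [0.2, 0.7]])
preds = transductive_predict(support, support_labels, queries, n_class=2)
print(preds)  # [0 0 1 1]
```

The candidate set is rebuilt from scratch in every iteration, so an early pseudo-label that turns out to be unreliable can drop out once the prototypes have moved.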


This digest was generated by AI and reviewed by human editors.