QUICK REVIEW

[Paper Review] Improving neural network representations using human similarity judgments

Lukas Muttenthaler, Lorenz Linhardt|arXiv (Cornell University)|Jun 7, 2023

Domain Adaptation and Few-Shot Learning | Cited by 10

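One-sentence summary

This paper proposes the gLocal transform, which aligns the global structure of neural network representations with human similarity judgments while preserving their local structure, improving few-shot learning and anomaly detection without sacrificing local neighborhoods.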

ABSTRACT

Deep neural networks have reached human-level performance on many computer vision tasks. However, the objectives used to train these networks enforce only that similar images are embedded at similar locations in the representation space, and do not directly constrain the global structure of the resulting space. Here, we explore the impact of supervising this global structure by linearly aligning it with human similarity judgments. We find that a naive approach leads to large changes in local representational structure that harm downstream performance. Thus, we propose a novel method that aligns the global structure of representations while preserving their local structure. This global-local transform considerably improves accuracy across a variety of few-shot learning and anomaly detection tasks. Our results indicate that human visual representations are globally organized in a way that facilitates learning from few examples, and incorporating this global structure into neural network representations improves performance on downstream tasks.

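Motivation and objectives

Investigate whether explicit global alignment with human similarity judgments improves downstream transfer.
Develop a transform that combines global alignment with preservation of local structure.
Evaluate the effect of global-local alignment on few-shot learning and anomaly detection across a range of models and datasets.
Assess whether the gLocal transform remains consistent with human similarity judgments while improving task performance.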

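Proposed method

Define a global alignment loss that matches model similarities to human triplet judgments by means of a softmax likelihood over triplets.
Compare a naive linear transform that maximizes global alignment with a regularized global transform that is pulled toward a scaled identity matrix.
Introduce a local loss that preserves the neighborhood structure of the original space, using a contrastive objective between the untransformed and transformed spaces.
Combine the global alignment and local preservation losses into the gLocal objective, with a regularization term on the transformation matrix.
Embed ImageNet images using penultimate-layer representations, and optimize W and b to minimize a weighted sum of the global and local losses.
Select the hyperparameters (alpha, lambda, tau) by grid search to balance alignment against local structure.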
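The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the dot-product similarity in the global loss, the cosine similarity in the local loss, and the (1 - alpha) / alpha weighting are all assumptions made for the example. Only the overall shape (a softmax likelihood over triplets, a contrastive neighborhood term, a penalty pulling W toward a scaled identity, and the hyperparameters alpha, lambda, tau) comes from the summary.

```python
import numpy as np

def log_softmax(z, axis=-1):
    # Numerically stable log-softmax.
    z = z - z.max(axis=axis, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=axis, keepdims=True))

def global_loss(Z, triplets):
    # triplets[:, 0] and triplets[:, 1] index the pair humans judged most
    # similar; triplets[:, 2] is the odd one out (assumed encoding).
    S = Z @ Z.T                       # dot-product similarity (assumption)
    a, b, c = triplets.T
    logits = np.stack([S[a, b], S[a, c], S[b, c]], axis=1)
    # Negative log-likelihood of the human-chosen pair (column 0).
    return -log_softmax(logits, axis=1)[:, 0].mean()

def neighbor_log_probs(X, tau):
    # Log of a softmax over cosine similarities to all *other* points.
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    sim = Xn @ Xn.T / tau
    np.fill_diagonal(sim, -1e9)       # effectively exclude self-similarity
    return log_softmax(sim, axis=1)

def local_loss(X, Z, tau):
    # Cross-entropy between neighbor distributions in the original space (X)
    # and in the transformed space (Z).
    p = np.exp(neighbor_log_probs(X, tau))
    log_q = neighbor_log_probs(Z, tau)
    return -(p * log_q).sum(axis=1).mean()

def identity_penalty(W):
    # Squared Frobenius distance between W and a scaled identity matrix,
    # with the scale taken as the mean of W's diagonal (assumption).
    scale = np.trace(W) / W.shape[0]
    return np.sum((W - scale * np.eye(W.shape[0])) ** 2)

def glocal_objective(W, b, X_human, triplets, X_images, alpha, lam, tau):
    # X_human: features of the images used in the human triplet task.
    # X_images: features of the images used for the local term.
    g = global_loss(X_human @ W + b, triplets)
    l = local_loss(X_images, X_images @ W + b, tau)
    return (1 - alpha) * g + alpha * l + lam * identity_penalty(W)
```

In practice W and b would be fitted with a gradient-based optimizer in an autodiff framework; the NumPy version only evaluates the objective. With W equal to the identity and b equal to zero, the penalty is zero and the local term reduces to the entropy of the original neighbor distributions, which is its minimum.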

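Experimental results

Research questions

RQ1: Does aligning the global structure of representations with human similarity judgments improve downstream task performance?
RQ2: Can a regularized transform achieve global alignment while preserving local structure?
RQ3: How does gLocal perform on few-shot learning and anomaly detection compared with the naive transform and the original representations?
RQ4: Do gLocal-aligned representations remain consistent with human similarity judgments across multiple human similarity datasets?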

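Key findings

The gLocal transform incorporates globally human-aligned structure while preserving local neighborhood structure.
Naive global alignment can harm downstream performance; gLocal mitigates this by adding a locality regularizer.
gLocal consistently improves few-shot learning and anomaly detection performance across several CLIP-based models and datasets.
Representations aligned with gLocal match human judgments about as well as naively aligned ones, even though gLocal also preserves local structure.
gLocal's gains are robust across multiple human similarity datasets and come without a large loss on human-alignment metrics.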
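For triplet-style human data, one simple way to quantify agreement with human judgments is odd-one-out accuracy: the fraction of triplets in which the model's most similar pair is the pair humans chose. The sketch below is illustrative only. The summary does not spell out which alignment metrics were used, and the function name, triplet encoding, and dot-product similarity are assumptions.

```python
import numpy as np

def odd_one_out_accuracy(Z, triplets):
    # triplets[:, 0] and triplets[:, 1] index the human-chosen most similar
    # pair; triplets[:, 2] is the odd one out (assumed encoding).
    S = Z @ Z.T                       # dot-product similarity (assumption)
    a, b, c = triplets.T
    pair_sims = np.stack([S[a, b], S[a, c], S[b, c]], axis=1)
    # The model agrees with humans when the chosen pair (column 0) is the
    # most similar of the three pairs.
    return float((pair_sims.argmax(axis=1) == 0).mean())
```

Evaluating this on the original features and again on the transformed features X @ W + b gives a before-and-after comparison of alignment.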

Figure, panel (b): Downstream task performance vs. human alignment.

This review was generated by AI and reviewed by a human editor.