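[Paper Digest] Graph Density-Aware Losses for Novel Compositions in Scene Graph Generation
This paper proposes a density-normalized edge loss to improve zero-shot and few-shot generalization in scene graph generation (SGG). It addresses two issues: (1) the standard loss is unintentionally a function of graph density, so individual edges in large sparse graphs, which contain diverse rare relationships, are neglected during training; (2) models exploit frequency bias, which hurts generalization. The method improves certain zero-shot and few-shot metrics by more than a factor of two, at no added computational cost and with no architectural changes.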
Scene graph generation (SGG) aims to predict graph-structured descriptions of input images, in the form of objects and relationships between them. This task is becoming increasingly useful for progress at the interface of vision and language. Here, it is important - yet challenging - to perform well on novel (zero-shot) or rare (few-shot) compositions of objects and relationships. In this paper, we identify two key issues that limit such generalization. Firstly, we show that the standard loss used in this task is unintentionally a function of scene graph density. This leads to the neglect of individual edges in large sparse graphs during training, even though these contain diverse few-shot examples that are important for generalization. Secondly, the frequency of relationships can create a strong bias in this task, such that a blind model predicting the most frequent relationship achieves good performance. Consequently, some state-of-the-art models exploit this bias to improve results. We show that such models can suffer the most in their ability to generalize to rare compositions, evaluating two different models on the Visual Genome dataset and its more recent, improved version, GQA. To address these issues, we introduce a density-normalized edge loss, which provides more than a two-fold improvement in certain generalization metrics. Compared to other works in this direction, our enhancements require only a few lines of code and no added computational cost. We also highlight the difficulty of accurately evaluating models using existing metrics, especially on zero/few shots, and introduce a novel weighted metric.
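Motivation and goals
- Address the poor generalization of SGG models to rare or unseen compositions of objects and predicates.
- Show that the standard SGG loss depends on graph density and favors dense graphs, neglecting large sparse graphs that are rich in rare relationships.
- Show that frequency bias in the training data leads models to overfit to common relationships, which hurts zero-shot and few-shot performance.
- Propose a lightweight loss that normalizes edge supervision by graph density to improve learning of rare compositions.
- Introduce a weighted evaluation metric that gives rare and unseen relationships more weight, making evaluation more sensitive to generalization.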
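Proposed method
- Propose a density-normalized edge loss: each edge's cross-entropy loss is scaled by the inverse of the graph density (described here as the number of edges per node), so that edges in large sparse graphs are no longer under-weighted during training.
- Apply the modified loss when training SGG models; this takes only a few lines of code and adds no computational cost.
- Introduce a new weighted metric that gives rare and unseen relationships more importance, making evaluation more sensitive to generalization.
- Use message-passing models (e.g., GCN-style message passing) for SGG, trained and evaluated on the Visual Genome and GQA datasets.
- Evaluate predicted triplets by IoU matching (≥ 50%) against the ground truth; predictions are ranked by the product of the softmax scores of the subject, object, and predicate.
- Validate the gains on two strong baseline models, [37] and [41], showing consistent improvements across models and datasets.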
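The density normalization described in the list above can be sketched in a few lines. This is a minimal illustration, not the authors' code, and it rests on assumptions that the digest does not spell out: density is taken to be the fraction of candidate edges that carry an annotated (foreground) relationship, label `0` is assumed to mean "background / no relationship", and `gamma` is a hypothetical scaling hyperparameter. The names `cross_entropy` and `edge_losses` are illustrative only.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy of one edge: -log softmax(logits)[target]."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


def edge_losses(edge_logits, edge_labels, bg_label=0, gamma=1.0):
    """Return (baseline, density_normalized) edge losses for one graph.

    baseline:           summed per-edge loss divided by the number of ALL
                        candidate edges M, so every edge weighs less as the
                        graph becomes larger and sparser.
    density_normalized: the same sum divided by the number of foreground
                        edges M_FG and scaled by gamma (assumed form).
    """
    per_edge = [cross_entropy(l, y) for l, y in zip(edge_logits, edge_labels)]
    total = sum(per_edge)
    m_all = len(per_edge)
    m_fg = sum(1 for y in edge_labels if y != bg_label)
    baseline = total / m_all
    density_normalized = gamma * total / max(m_fg, 1)  # avoid division by zero
    return baseline, density_normalized


# A sparse graph: 4 candidate edges, only 1 annotated relationship.
logits = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
labels = [1, 0, 0, 0]
baseline, normalized = edge_losses(logits, labels)
```

Under these assumptions the density-normalized value equals the baseline multiplied by `gamma * M / M_FG`, that is, by `gamma` over the density, which is why a sparse graph contributes more to training than it does under the baseline.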
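Experimental results
Research questions
- RQ1: To what extent does graph density in the training data affect how well SGG models generalize to rare and unseen compositions?
- RQ2: To what extent does frequency bias in the training data cause SGG models to perform poorly in zero-shot and few-shot settings?
- RQ3: Can a simple density-aware loss reweighting strategy substantially improve generalization without architectural changes or extra computation?
- RQ4: How can evaluation metrics be improved so that they better reflect model performance on rare and unseen relationships in SGG?
- RQ5: Does the proposed method improve generalization to rare compositions under both the existing metrics and the newly proposed weighted metric?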
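Key findings
- The standard SGG loss unintentionally down-weights individual edges in large sparse graphs, so the model neglects the rare relationships those graphs contain.
- The proposed density-normalized edge loss improves certain few-shot and zero-shot metrics by more than a factor of two, with only minimal code changes.
- Models trained with the new loss make more diverse predictions and are less affected by frequency bias, as qualitative comparisons on Visual Genome confirm.
- The new weighted metric captures performance on rare compositions more accurately and exposes the poor performance of frequency-biased models on unseen relationships.
- The method is reported to reach state-of-the-art performance on both Visual Genome and GQA, most notably in zero-shot and few-shot generalization, without architectural changes.
- Even when ground-truth labels are mislabeled or use synonyms (e.g., "plant" vs. "flower"), models trained with the new loss remain more robust and generalize better.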
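To make the evaluation protocol and the weighted metric more concrete, here is a rough sketch of a frequency-weighted recall. The ranking by score product and the IoU threshold of 0.5 follow the description in this digest. The weighting `1 / (n + 1)`, where `n` is the number of times a triplet occurs in the training set, is an assumption chosen for illustration, as are the dictionary layout and all function names; the paper's exact definition may differ.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_hit(pred, gt, thresh=0.5):
    """Match if the (subject, predicate, object) labels agree and both
    boxes overlap the ground truth with IoU >= thresh."""
    return (pred["labels"] == gt["labels"]
            and iou(pred["subj_box"], gt["subj_box"]) >= thresh
            and iou(pred["obj_box"], gt["obj_box"]) >= thresh)


def top_k(preds, k):
    """Rank predictions by the product of subject, predicate, object scores."""
    ranked = sorted(preds, key=lambda p: p["s"] * p["p"] * p["o"], reverse=True)
    return ranked[:k]


def weighted_recall(gts, preds, train_counts, k):
    """Recall@k where each ground-truth triplet is weighted by 1 / (n + 1),
    n being its training-set frequency (assumed weighting scheme)."""
    kept = top_k(preds, k)
    num = den = 0.0
    for gt in gts:
        w = 1.0 / (train_counts.get(gt["labels"], 0) + 1)
        den += w
        if any(is_hit(p, gt) for p in kept):
            num += w
    return num / den if den > 0 else 0.0
```

With such a weighting, a model that only recovers frequent triplets scores far lower than it would under plain recall, because each unseen triplet (`n = 0`) carries the maximum weight of 1.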
This digest was generated by AI and reviewed by a human editor.