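[Paper Summary] Exploring Object Relation in Mean Teacher for Cross-Domain Detection
The paper proposes MTOR, a novel Mean Teacher framework for cross-domain object detection that integrates object relations into consistency regularization. It reports state-of-the-art results, including a new single-model record of 22.8% mAP on the Syn2Real detection benchmark.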
Rendering synthetic data (e.g., 3D CAD-rendered images) to generate annotations for learning deep models in vision tasks has attracted increasing attention in recent years. However, simply applying the models learnt on synthetic images may lead to high generalization error on real images due to domain shift. To address this issue, recent progress in cross-domain recognition has featured the Mean Teacher, which directly simulates unsupervised domain adaptation as semi-supervised learning. The domain gap is thus naturally bridged with consistency regularization in a teacher-student scheme. In this work, we advance this Mean Teacher paradigm to be applicable for cross-domain detection. Specifically, we present Mean Teacher with Object Relations (MTOR) that novelly remolds Mean Teacher under the backbone of Faster R-CNN by integrating the object relations into the measure of consistency cost between teacher and student modules. Technically, MTOR firstly learns relational graphs that capture similarities between pairs of regions for teacher and student respectively. The whole architecture is then optimized with three consistency regularizations: 1) region-level consistency to align the region-level predictions between teacher and student, 2) inter-graph consistency for matching the graph structures between teacher and student, and 3) intra-graph consistency to enhance the similarity between regions of same class within the graph of student. Extensive experiments are conducted on the transfers across Cityscapes, Foggy Cityscapes, and SIM10k, and superior results are reported when comparing to state-of-the-art approaches. More remarkably, we obtain a new record of single model: 22.8% of mAP on Syn2Real detection dataset.
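The teacher-student scheme mentioned in the abstract can be sketched as follows. In the usual Mean Teacher formulation, the teacher is not trained by gradient descent; its weights track an exponential moving average of the student's weights. The function name `ema_update`, the dictionary-of-floats weight layout, and the `decay` value are illustrative assumptions, not details taken from the paper.

```python
def ema_update(teacher, student, decay):
    """Return new teacher weights: decay * teacher + (1 - decay) * student.

    `teacher` and `student` are hypothetical dicts mapping a parameter
    name to a single float; real models would hold tensors instead.
    """
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }


# Toy weights: the teacher drifts slowly toward the student.
teacher = {"w": 0.0, "b": 1.0}
student = {"w": 1.0, "b": 0.0}

for _ in range(3):
    # In training, the student would take a gradient step before this call.
    teacher = ema_update(teacher, student, decay=0.9)
```

A decay close to 1 makes the teacher change slowly, which is what lets it serve as a stable target for the consistency losses.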
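Motivation and Goals
- Address the domain shift that makes models trained on synthetic data perform poorly on real images, specifically in synthetic-to-real object detection.
- Extend the Mean Teacher framework, originally designed for semi-supervised learning, to cross-domain detection by introducing structured object relations.
- Improve domain generalization by enforcing consistency both at the region level and on the relational graph structure over object proposals.
- Improve feature discriminability by strengthening intra-class consistency within the graph, reducing mislocalization and false positives in target-domain detection.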
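Proposed Method
- MTOR builds a relational graph for the teacher and for the student from the cosine similarity between region features, capturing the relations between objects.
- Region-level consistency aligns the detection predictions (classification and bounding-box regression) of corresponding region proposals in the teacher and the student.
- Inter-graph consistency matches the structure of the teacher's and the student's relational graphs, improving robustness to input perturbations.
- Intra-graph consistency increases the similarity between same-class regions within the student's graph, reducing intra-class variance and improving feature discriminability.
- The three consistency losses (region-level, inter-graph, and intra-graph) are optimized end to end as a weighted combination controlled by the hyperparameters λ and α.
- Evaluation covers cross-domain transfers among Cityscapes, Foggy Cityscapes, and SIM10k, using the standard mAP metric.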
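The method steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the mean-squared-difference form of the region-level and inter-graph costs, the `1 - similarity` form of the intra-graph cost, the use of a single weight `lam` on the summed losses, and all function names are hypothetical. Region features, class probabilities, and per-region labels are assumed to be given.

```python
import math


def cosine(u, v):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def relation_graph(features):
    """Affinity matrix: entry (i, j) is the cosine similarity of regions i and j."""
    return [[cosine(fi, fj) for fj in features] for fi in features]


def mean_squared_diff(a, b):
    """Mean squared difference between two equally shaped matrices."""
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def region_consistency(student_probs, teacher_probs):
    """Assumed form: MSE between per-region class probabilities."""
    return mean_squared_diff(student_probs, teacher_probs)


def inter_graph_consistency(student_graph, teacher_graph):
    """Assumed form: MSE between the two affinity matrices."""
    return mean_squared_diff(student_graph, teacher_graph)


def intra_graph_consistency(student_graph, labels):
    """Assumed form: mean (1 - similarity) over same-label region pairs."""
    n = len(labels)
    costs = [
        1.0 - student_graph[i][j]
        for i in range(n)
        for j in range(i + 1, n)
        if labels[i] == labels[j]
    ]
    return sum(costs) / len(costs) if costs else 0.0


def total_consistency(region, inter, intra, lam=1.0):
    """Assumed combination: one weight `lam` applied to the summed losses."""
    return lam * (region + inter + intra)


# Toy example: three regions with 2-D features; regions 0 and 2 share a label.
teacher_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
student_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
labels = [0, 1, 0]

teacher_graph = relation_graph(teacher_feats)
student_graph = relation_graph(student_feats)

inter = inter_graph_consistency(student_graph, teacher_graph)
intra = intra_graph_consistency(student_graph, labels)
region = region_consistency([[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]],
                            [[0.8, 0.2], [0.1, 0.9], [0.6, 0.4]])
loss = total_consistency(region, inter, intra, lam=1.0)
```

In the toy data the student already places regions 0 and 2 at the same point, so its intra-graph cost is zero, while the inter-graph cost is positive because the teacher's graph differs.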
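Experimental Results
Research Questions
- RQ1: Does integrating object relations into the Mean Teacher framework improve generalization in cross-domain object detection?
- RQ2: How does enforcing consistency on the relational graph structure (inter-graph consistency) affect robustness to domain shift?
- RQ3: To what extent does intra-graph consistency reduce intra-class variance and improve detection accuracy?
- RQ4: Does combining region-level and graph-structure consistency outperform existing domain adaptation methods in synthetic-to-real object detection?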
Key Findings
- MTOR sets a new single-model state-of-the-art record of 22.8% mAP on the Syn2Real detection benchmark, clearly outperforming prior methods.
- On the Cityscapes → Foggy Cityscapes transfer, MTOR outperforms state-of-the-art approaches, showing stronger generalization under domain shift.
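- Ablations show that both inter-graph and intra-graph consistency contribute to detection accuracy, with the best performance at λ = 1.0 and α ≈ 0.98.
- Qualitative results show that MTOR detects objects (such as pedestrians) that the Source-only and DA baselines miss.
- Error analysis shows that, compared with DA, MTOR produces fewer mislocalizations and background false positives and a higher share of correct detections (IoU ≥ 0.5).
- Visualizations of the relational graphs show that the intra-class similarities learned by MTOR are more discriminative than those of Source-only and DA, supporting the intra-graph consistency strategy.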
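The error categories used in the analysis above depend on the IoU between a detection and the ground-truth boxes. A minimal sketch of that bookkeeping follows. The 0.5 threshold for a correct detection comes from the findings; the 0.3 lower bound separating mislocalization from background, the `(x1, y1, x2, y2)` box layout, and the function names are assumptions for illustration. Class labels are ignored here for brevity.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def categorize(detection, ground_truths, correct_thr=0.5, background_thr=0.3):
    """Label a detection by its best IoU against any ground-truth box.

    `background_thr` is a hypothetical value, not taken from the summary.
    """
    best = max((iou(detection, gt) for gt in ground_truths), default=0.0)
    if best >= correct_thr:
        return "correct"
    if best >= background_thr:
        return "mislocalized"
    return "background"


ground_truths = [(0.0, 0.0, 10.0, 10.0)]
detections = [
    (0.0, 0.0, 10.0, 10.0),    # exact match
    (5.0, 0.0, 15.0, 10.0),    # half-overlapping box, IoU = 50 / 150
    (20.0, 20.0, 30.0, 30.0),  # no overlap
]
categories = [categorize(d, ground_truths) for d in detections]
```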
This summary was generated by AI and reviewed by a human editor.