QUICK REVIEW

[Paper Review] QdaVPR: A novel query-based domain-agnostic model for visual place recognition

Shanshan Wan, Lai Kang|arXiv (Cornell University)|Mar 8, 2026

Robotics and Sensor-Based Localization | Citations: 0

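One-line summary

QdaVPR adds a dual-level adversarial framework to a Bag-of-Queries VPR backbone and combines it with query-combination triplet supervision and synthetic domain augmentation to achieve domain-agnostic visual place recognition.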

ABSTRACT

Visual place recognition (VPR) aiming at predicting the location of an image based solely on its visual features is a fundamental task in robotics and autonomous systems. Domain variation remains one of the main challenges in VPR and is relatively unexplored. Existing VPR models attempt to achieve domain agnosticism either by training on large-scale datasets that inherently contain some domain variations, or by being specifically adapted to particular target domains. In practice, the former lacks explicit domain supervision, while the latter generalizes poorly to unseen domain shifts. This paper proposes a novel query-based domain-agnostic VPR model called QdaVPR. First, a dual-level adversarial learning framework is designed to encourage domain invariance for both the query features forming the global descriptor and the image features from which these query features are derived. Then, a triplet supervision based on query combinations is designed to enhance the discriminative power of the global descriptors. To support the learning process, we augment a large-scale VPR dataset using style transfer methods, generating various synthetic domains with corresponding domain labels as auxiliary supervision. Extensive experiments show that QdaVPR achieves state-of-the-art performance on multiple VPR benchmarks with significant domain variations. Specifically, it attains the best Recall@1 and Recall@10 on nearly all test scenarios: 93.5%/98.6% on Nordland (seasonal changes), 97.5%/99.0% on Tokyo24/7 (day-night transitions), and the highest Recall@1 across almost all weather conditions on the SVOX dataset. Our code will be released at https://github.com/shuimushan/QdaVPR.

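Motivation and objectives

Address the need for VPR that stays robust under domain variation.
Develop a domain-agnostic model that requires no target-domain data at inference time.
Use a dual-level adversarial framework that encourages domain-invariant representations in both the query features and the image feature maps.
Increase the discriminative power of the descriptors with a query-combination triplet loss.
Demonstrate state-of-the-art performance on diverse VPR benchmarks that involve seasonal changes and other broad domain shifts.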

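Proposed method

Base architecture: built on Bag-of-Queries (BoQ), in which a fixed set of learnable queries acts as global probes for transformer-based feature aggregation.
Dual-level adversarial learning: gradient reversal and domain classification are applied to both the query features and the image feature maps to encourage domain-invariant representations.
Query-combination triplet supervision: multiple query combinations are built from intermediate outputs, and a selective triplet loss focuses on the most informative combinations.
Synthetic domain augmentation: the GSV-Cities training data is extended with six synthetic domains (fog, rain, snow, wind, night, sun), which provide auxiliary domain supervision during training.
Global descriptor: formed as a weighted combination of the query features; the adversarial components are removed at inference time to preserve efficiency.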
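The gradient reversal step in the adversarial branches can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the binary logistic domain classifier, the feature and weight values, and the names `GradientReversal` and `domain_loss_and_grad` are all assumptions made for illustration (the paper trains with several synthetic domains, which would call for a multi-class classifier). The sketch only shows the core mechanism: the layer is an identity in the forward pass and flips the sign of the gradient in the backward pass, so a descent step on the features makes the domain harder to predict.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class GradientReversal:
    """Identity in the forward pass; scales gradients by -lam in the backward pass."""

    def __init__(self, lam=1.0):
        self.lam = lam  # reversal strength (hypothetical default)

    def forward(self, x):
        return list(x)

    def backward(self, grad_output):
        return [-self.lam * g for g in grad_output]


def domain_loss_and_grad(features, weights, bias, domain_label):
    """Binary cross-entropy of a logistic domain classifier and dLoss/dFeatures."""
    z = sum(w * f for w, f in zip(weights, features)) + bias
    p = sigmoid(z)
    loss = -(domain_label * math.log(p) + (1 - domain_label) * math.log(1 - p))
    grad_features = [(p - domain_label) * w for w in weights]
    return loss, grad_features


# Hypothetical toy values, chosen only for illustration.
features = [0.5, -1.0, 2.0]   # stand-in for a query or image feature vector
weights = [0.8, -0.3, 0.1]    # stand-in for domain classifier weights
bias = 0.0
domain_label = 1              # e.g. 1 = synthetic domain, 0 = original domain

grl = GradientReversal(lam=1.0)
loss_before, grad = domain_loss_and_grad(
    grl.forward(features), weights, bias, domain_label
)

# The feature extractor receives the reversed gradient ...
reversed_grad = grl.backward(grad)

# ... so an ordinary gradient-descent step on the features increases the
# domain classification loss, pushing the features toward domain invariance.
lr = 0.5
updated = [f - lr * g for f, g in zip(features, reversed_grad)]
loss_after, _ = domain_loss_and_grad(updated, weights, bias, domain_label)

print(loss_after > loss_before)
```

Because the reversal only affects the backward pass, dropping the domain classifiers at inference leaves the forward computation of the descriptor unchanged, which is consistent with the efficiency point above.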

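Experimental results

Research questions

RQ1: Can a query-based BoQ VPR model achieve domain invariance without target-domain data at test time?
RQ2: Does dual-level adversarial learning on both query features and image features improve cross-domain VPR performance?
RQ3: Does query-combination triplet supervision increase the discriminative power of domain-agnostic descriptors?
RQ4: How does synthetic domain augmentation affect the learning of domain-robust representations?
RQ5: Does QdaVPR achieve state-of-the-art Recall@1/Recall@10 on benchmarks covering seasonal, day-night, weather, and long-term changes?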
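For context on RQ3, the generic triplet margin loss that triplet supervision builds on can be sketched as follows. This is the standard formulation on L2-normalized descriptors, not the paper's query-combination variant, whose construction and selection rules are not detailed in this review. The margin value and the toy descriptors are hypothetical.

```python
import math


def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_margin_loss(anchor, positive, negative, margin=0.1):
    """max(0, d(anchor, positive) - d(anchor, negative) + margin)."""
    a, p, n = (l2_normalize(v) for v in (anchor, positive, negative))
    return max(0.0, euclidean(a, p) - euclidean(a, n) + margin)


# Hypothetical 2-D descriptors: the positive shows the same place as the anchor.
anchor = [1.0, 0.0]
positive = [0.9, 0.1]
easy_negative = [0.0, 1.0]    # far from the anchor: contributes no loss
hard_negative = [0.95, 0.05]  # closer to the anchor than the positive

easy_loss = triplet_margin_loss(anchor, positive, easy_negative)
hard_loss = triplet_margin_loss(anchor, positive, hard_negative)
print(easy_loss, hard_loss)
```

Only triplets that violate the margin produce a non-zero loss, which is why focusing on informative (hard) cases matters for discriminative descriptors.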

Main findings

Method	Image size	Nordland R@1	Nordland R@5	Nordland R@10	Tokyo24/7 R@1	Tokyo24/7 R@5	Tokyo24/7 R@10	MSLS-val R@1	MSLS-val R@5	MSLS-val R@10	AmsterTime R@1	AmsterTime R@5	AmsterTime R@10	Pitts30k-test R@1	Pitts30k-test R@5	Pitts30k-test R@10
QdaVPR(ours)	224x224	89.3	95.7	97.2	95.2	97.5	98.4	92.0	96.1	96.5	57.9	77.7	82.2	92.7	96.4	97.3

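QdaVPR achieves state-of-the-art Recall@1 and Recall@10 in most test scenarios, including Nordland (seasonal change), Tokyo24/7 (day-night), and SVOX (weather/illumination).
With 224x224 input, it reaches R@1 89.3%, R@5 95.7%, R@10 97.2% on Nordland; R@1 95.2%, R@5 97.5%, R@10 98.4% on Tokyo24/7; R@1 92.0%, R@5 96.1%, R@10 96.5% on MSLS-val; R@1 57.9%, R@5 77.7%, R@10 82.2% on AmsterTime; and R@1 92.7%, R@5 96.4%, R@10 97.3% on Pitts30k-test.
The adversarial branches are excluded at inference, so the model keeps its robustness without any change in runtime compute cost.
Thanks to adversarial alignment and discriminative query-combination supervision, the model generalizes strongly across domains.
The style-transfer-augmented domains provide explicit domain supervision that supports learning domain-invariant representations, without requiring target-domain data at inference time.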
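The Recall@K figures above follow the usual VPR convention: a query counts as correct if at least one of its top-K retrieved database images is a true match. A minimal sketch of that metric is shown below; the function name `recall_at_k`, the data layout, and the toy rankings are assumptions for illustration and are not taken from the QdaVPR code.

```python
def recall_at_k(ranked_results, ground_truth, k):
    """Percentage of queries with at least one true match among the top-k results.

    ranked_results: dict mapping query id -> list of database ids, best first
    ground_truth:   dict mapping query id -> set of correct database ids
    """
    hits = 0
    for query_id, ranking in ranked_results.items():
        if any(db_id in ground_truth[query_id] for db_id in ranking[:k]):
            hits += 1
    return 100.0 * hits / len(ranked_results)


# Hypothetical toy retrieval results for four queries.
ranked = {
    "q1": ["d3", "d1", "d7"],
    "q2": ["d2", "d9", "d4"],
    "q3": ["d5", "d6", "d8"],
    "q4": ["d0", "d4", "d1"],
}
ground_truth = {
    "q1": {"d1"},
    "q2": {"d2"},
    "q3": {"d8", "d9"},
    "q4": {"d7"},
}

print(recall_at_k(ranked, ground_truth, 1))  # 25.0
print(recall_at_k(ranked, ground_truth, 2))  # 50.0
print(recall_at_k(ranked, ground_truth, 3))  # 75.0
```

Recall@K is non-decreasing in K, which matches the pattern in the reported numbers (R@1 ≤ R@5 ≤ R@10 on every benchmark).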


This review was generated by AI and checked by a human editor.