QUICK REVIEW

[Paper Review] Scale-Aware UAV-to-Satellite Cross-View Geo-Localization: A Semantic Geometric Approach

Yibin Ye, Shuo Chen|arXiv (Cornell University)|Mar 8, 2026

UAV Applications and Optimization | Cited by 0

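One-Sentence Summary

The paper proposes a semantic-geometric framework that uses small vehicles as anchors to estimate the absolute scale of monocular UAV images, enabling scale-adaptive satellite cropping for UAV-to-satellite cross-view geo-localization and improving robustness when the UAV image scale is unknown.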

ABSTRACT

Cross-View Geo-Localization (CVGL) between UAV imagery and satellite images plays a crucial role in target localization and UAV self-positioning. However, most existing methods rely on the idealized assumption of scale consistency between UAV queries and satellite galleries, overlooking the severe scale ambiguity commonly encountered in real-world scenarios. This discrepancy leads to field-of-view misalignment and feature mismatch, significantly degrading CVGL robustness. To address this issue, we propose a geometric framework that recovers the absolute metric scale from monocular UAV images using semantic anchors. Specifically, small vehicles (SVs), characterized by relatively stable prior size distributions and high detectability, are exploited as metric references. A Decoupled Stereoscopic Projection Model is introduced to estimate the absolute image scale from these semantic targets. By decomposing vehicle dimensions into radial and tangential components, the model compensates for perspective distortions in 2D detections of 3D vehicles, enabling more accurate scale estimation. To further reduce intra-class size variation and detection noise, a dual-dimension fusion strategy with Interquartile Range (IQR)-based robust aggregation is employed. The estimated global scale is then used as a physical constraint for scale-adaptive satellite image cropping, improving UAV-to-satellite feature alignment. Experiments on augmented DenseUAV and UAV-VisLoc datasets demonstrate that the proposed method significantly improves CVGL robustness under unknown UAV image scales. Additionally, the framework shows strong potential for downstream applications such as passive UAV altitude estimation and 3D model scale recovery.

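Motivation and Objectives

Highlight how scale discrepancy degrades the robustness of UAV-to-satellite CVGL.
Propose a semantic-geometric framework that recovers absolute scale from monocular UAV images using semantic anchors.
Develop a Decoupled Stereoscopic Projection Model and a robust dual-dimension scale recovery strategy.
Implement scale-adaptive satellite cropping to improve cross-view feature alignment.
Demonstrate applicability to UAV altitude estimation and 3D model scale recovery.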

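Proposed Method

Identify small vehicles as stable geometric anchors because they are ubiquitous, have low intra-class size variance, and are easy to detect.
Develop a Decoupled Stereoscopic Projection Model that separates radial and tangential components and estimates absolute scale from 2D detections.
Compute per-instance scale candidates from vehicle length and width, fuse the two dimensions, and obtain a global scale through robust IQR-based aggregation.
Use the global scale to compute the nadir-equivalent resolution and crop satellite imagery scale-adaptively for CVGL.
Augment the DenseUAV and UAV-VisLoc benchmarks with continuous satellite maps and relative scale annotations for validation.
Evaluate CVGL robustness under unknown UAV scale and explore downstream tasks such as altitude estimation and 3D model scale recovery.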
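
The length/width fusion and IQR aggregation steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the prior vehicle dimensions, the plain averaging used as the fusion rule, the 1.5 x IQR fence, and the function names are all assumptions made for illustration, and the radial/tangential perspective compensation of the Decoupled Stereoscopic Projection Model is omitted entirely.

```python
from statistics import mean, median, quantiles

# Assumed prior mean size of a small vehicle in metres.
# Illustrative values only, not taken from the paper.
PRIOR_LENGTH_M = 4.5
PRIOR_WIDTH_M = 1.8


def instance_scales(detections):
    """Return one scale candidate (metres per pixel) per detected vehicle.

    Each detection is a (length_px, width_px) pair. The length-based and
    width-based candidates are fused by a plain average, which is only a
    stand-in for the paper's dual-dimension fusion.
    """
    candidates = []
    for length_px, width_px in detections:
        scale_from_length = PRIOR_LENGTH_M / length_px
        scale_from_width = PRIOR_WIDTH_M / width_px
        candidates.append(0.5 * (scale_from_length + scale_from_width))
    return candidates


def iqr_aggregate(values, k=1.5):
    """Mean of the values that lie inside [Q1 - k*IQR, Q3 + k*IQR]."""
    if len(values) < 4:
        # Too few samples for meaningful quartiles: fall back to the mean.
        return mean(values)
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    inliers = [v for v in values if low <= v <= high]
    # Guard against a degenerate fence that rejects everything.
    return mean(inliers) if inliers else median(values)


if __name__ == "__main__":
    # Five consistent candidates and one gross outlier (hypothetical data).
    candidates = [0.098, 0.100, 0.102, 0.099, 0.101, 0.300]
    print(f"global scale: {iqr_aggregate(candidates):.4f} m/px")
```

Averaging only the inliers keeps a single mis-detected object, such as a truck labelled as a small vehicle, from dragging the global scale away from the consensus of the other vehicles.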

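Experimental Results

Research Questions

RQ1: How does a scale mismatch between UAV queries and the satellite gallery affect CVGL performance?
RQ2: Can absolute scale be recovered from monocular UAV images using semantic anchors and robust aggregation?
RQ3: Does scale-aware cropping improve the robustness of UAV-to-satellite CVGL under unknown scale?
RQ4: Can the recovered scale be used for UAV altitude estimation and 3D model scale recovery?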

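Key Findings

Given enough detectable targets, absolute scale can be estimated from monocular UAV images with a relative error of ≤10%.
Scale-aware cropping guided by the estimated scale improves CVGL robustness under scale ambiguity.
The framework supports downstream tasks such as passive UAV altitude estimation and 3D model scale recovery.
Scale estimation is validated by augmenting the DenseUAV and UAV-VisLoc datasets with scale annotations and continuous satellite galleries.
The decoupled (radial and tangential) projection model compensates for the stereoscopic effect of 3D small vehicles in 2D detections, yielding more accurate scale estimates.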
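
How an estimated scale feeds into cropping and altitude estimation can be sketched as follows. This is a hedged illustration under assumptions that do not come from the paper: the scale is treated as a ground sampling distance (GSD) in metres per pixel, the camera is an ideal nadir-looking pinhole with a known focal length in pixels, and all function names are hypothetical.

```python
def satellite_crop_size_px(uav_width_px, uav_height_px, uav_gsd_m, sat_gsd_m):
    """Size, in satellite pixels, of the ground footprint seen by the UAV image.

    Both GSD values are in metres per pixel.
    """
    ratio = uav_gsd_m / sat_gsd_m
    return round(uav_width_px * ratio), round(uav_height_px * ratio)


def altitude_from_gsd(uav_gsd_m, focal_length_px):
    """Height above ground for a nadir pinhole camera: H = GSD * f (f in pixels)."""
    return uav_gsd_m * focal_length_px


def relative_error(estimate, truth):
    """Relative error of an estimate against a reference value."""
    return abs(estimate - truth) / abs(truth)


if __name__ == "__main__":
    # Hypothetical numbers: a 1000 x 800 px UAV image at 0.1 m/px matched
    # against satellite tiles at 0.25 m/px.
    print(satellite_crop_size_px(1000, 800, 0.1, 0.25))
    print(altitude_from_gsd(0.05, 2000))
    print(relative_error(0.108, 0.100) <= 0.10)
```

Under these assumptions, a relative error in the scale estimate carries over one-to-one to both the crop side length and the altitude, since each is linear in the GSD.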


This review was generated by AI and reviewed by human editors.