QUICK REVIEW

[Paper Review] Positive-First Most Ambiguous: A Simple Active Learning Criterion for Interactive Retrieval of Rare Categories

Kawtar Zaher, Olivier Buisson|arXiv (Cornell University)|2026. 03. 25.

Domain Adaptation and Few-Shot Learning | Citations: 0

One-Line Summary

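Introduces PF-MA, an active learning criterion for interactive retrieval of rare, fine-grained concepts under extreme class imbalance and low annotation budgets, together with a class coverage metric for measuring retrieval diversity.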

ABSTRACT

Real-world fine-grained visual retrieval often requires discovering a rare concept from large unlabeled collections with minimal supervision. This is especially critical in biodiversity monitoring, ecological studies, and long-tailed visual domains, where the target may represent only a tiny fraction of the data, creating highly imbalanced binary problems. Interactive retrieval with relevance feedback offers a practical solution: starting from a small query, the system selects candidates for binary user annotation and iteratively refines a lightweight classifier. While Active Learning (AL) is commonly used to guide selection, conventional AL assumes symmetric class priors and large annotation budgets, limiting effectiveness in imbalanced, low-budget, low-latency settings. We introduce Positive-First Most Ambiguous (PF-MA), a simple yet effective AL criterion that explicitly addresses the class imbalance asymmetry: it prioritizes near-boundary samples while favoring likely positives, enabling rapid discovery of subtle visual categories while maintaining informativeness. Unlike standard methods that oversample negatives, PF-MA consistently returns small batches with a high proportion of relevant samples, improving early retrieval and user satisfaction. To capture retrieval diversity, we also propose a class coverage metric that measures how well selected positives span the visual variability of the target class. Experiments on long-tailed datasets, including fine-grained botanical data, demonstrate that PF-MA consistently outperforms strong baselines in both coverage and classifier performance, across varying class sizes and descriptors. Our results highlight that aligning AL with the asymmetric and user-centric objectives of interactive fine-grained retrieval enables simple yet powerful solutions for retrieving rare and visually subtle categories in realistic human-in-the-loop settings.

Motivation and Objectives

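Motivate interactive retrieval of user-defined, visually subtle concepts in highly imbalanced data.
Develop a lightweight, fast classifier workflow suited to low-latency human-in-the-loop annotation.
Introduce PF-MA, which prioritizes positive samples near the decision boundary while balancing informativeness against immediate usefulness to the user.
Introduce a class coverage metric that quantifies retrieval diversity across the visual modes of the target class.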

Proposed Method

Formalize interactive retrieval: the user defines a class with a small initial query, and a lightweight classifier is trained through iterative binary relevance feedback.
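Define PF-MA as a scoring rule that prioritizes near-boundary positive samples and then samples informative negatives: PF-MA(x) = (1 - |0.5 - f(x)|) * 1_{f(x) >= 0.5} + f(x) * 1_{f(x) < 0.5}, where f(x) is the classifier output and 0.5 is the decision threshold.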
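Compare PF-MA with uncertainty-based (MA), confidence-based (MP), and other baselines (DAL, CoreSet, ALAMP) on long-tailed datasets with a very small per-iteration budget (b = 10).
Use a lightweight linear SVM classifier trained on a small labeled set D_l, which is updated with the annotated selection S_t.
Evaluate retrieval diversity with the class coverage metric cov_t^C, computed by clustering the positive samples into K visual modes and measuring the fraction of clusters represented among the retrieved positives.
Evaluate robustness across datasets (Cifar100-LT, ImageNet-LT, PlantNet300K) and two feature descriptors (CLIP and DINOv2).
Examine sensitivity to the coverage granularity (varying K) and show that PF-MA remains superior across granularities and class sizes.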
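
The PF-MA scoring rule can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes the classifier output f(x) is already mapped to [0, 1] with 0.5 as the decision boundary, and the names `pf_ma_score`, `select_batch`, and the sample values are hypothetical.

```python
def pf_ma_score(f):
    """PF-MA score for a classifier output f, assumed to lie in [0, 1]."""
    if f >= 0.5:
        # Predicted positive: score in [0.5, 1], highest at the boundary.
        return 1.0 - abs(0.5 - f)
    # Predicted negative: score in [0, 0.5), highest just below the boundary.
    return f


def select_batch(outputs, b=10):
    """Return the indices of the b samples with the highest PF-MA score."""
    ranked = sorted(range(len(outputs)),
                    key=lambda i: pf_ma_score(outputs[i]),
                    reverse=True)
    return ranked[:b]


# Hypothetical classifier outputs for six unlabeled samples.
outputs = [0.05, 0.45, 0.55, 0.70, 0.95, 0.30]
batch = select_batch(outputs, b=3)
print(batch)  # [2, 3, 4]
```

Every predicted positive scores at least 0.5 and every predicted negative scores below 0.5, so under the rule as written, negatives enter a batch only when fewer than b samples are predicted positive.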

Figure 1 : Comparison of selected samples. MA (left): near-boundary negatives are oversampled. MP (middle): only positives far from the boundary are selected. PF-MA (right): balance between relevant positives and negatives around the boundary.
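
The contrast in Figure 1 can be reproduced on toy data. The sketch below is an illustration under assumptions: the review does not give formulas for MA and MP, so they are written here in commonly used forms (MA ranks by closeness to the boundary, MP ranks by the raw output), and the output values are hypothetical.

```python
def ma_score(f):
    # Most Ambiguous (assumed form): highest at the boundary, on either side.
    return 1.0 - abs(0.5 - f)


def mp_score(f):
    # Most Positive (assumed form): highest for the most confident positives.
    return f


def pf_ma_score(f):
    # PF-MA as stated in this review: ambiguous positives first, then negatives.
    return 1.0 - abs(0.5 - f) if f >= 0.5 else f


def top_b(outputs, score_fn, b):
    """Return the b classifier outputs ranked highest by score_fn."""
    order = sorted(range(len(outputs)),
                   key=lambda i: score_fn(outputs[i]),
                   reverse=True)
    return [outputs[i] for i in order[:b]]


# Hypothetical pool: several negatives crowd the boundary.
outputs = [0.48, 0.47, 0.46, 0.44, 0.10, 0.55, 0.60, 0.65, 0.90, 0.95, 0.99]

for name, fn in [("MA", ma_score), ("MP", mp_score), ("PF-MA", pf_ma_score)]:
    picked = top_b(outputs, fn, b=4)
    n_pos = sum(f >= 0.5 for f in picked)
    print(name, picked, "predicted positives:", n_pos)
```

On this pool, MA spends three of its four picks on near-boundary negatives, MP picks the four most confident positives, and PF-MA picks the four predicted positives closest to the boundary.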

Experimental Results

Research Questions

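RQ1: Can PF-MA accelerate the discovery of user-defined rare visual concepts in extremely imbalanced, budget-constrained interactive retrieval?
RQ2: Does prioritizing near-boundary positives while preserving ambiguity yield more diverse and informative retrieval than standard uncertainty or purely positive strategies?
RQ3: How does PF-MA perform across different long-tailed datasets and feature descriptors, and how robust is it to visual granularity (the choice of K)?
RQ4: How does PF-MA affect early-stage retrieval quality and overall classifier performance under a strict annotation budget?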

Key Findings

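PF-MA consistently achieves higher coverage than strong baselines across the three long-tailed datasets and two descriptors, particularly in early iterations.
PF-MA keeps the proportion of positives in each selected batch high, often above 80%, while still supplying informative negatives that refine the decision boundary.
Across datasets and descriptor models, PF-MA outperforms MA, MP, and the other baselines on cov_25, demonstrating robust retrieval diversity and rapid concept discovery.
The advantage of PF-MA is especially pronounced in early iterations, quickly delivering user satisfaction with minimal supervision.
The proposed class coverage metric effectively captures retrieval diversity across visual modes, showing that PF-MA covers the class manifold rather than concentrating on a single visual mode.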
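
The class coverage computation described in this review (the fraction of the K visual-mode clusters represented among the retrieved positives) can be sketched as follows. The clustering step is assumed to be done beforehand, since the review does not name the algorithm; the function name, image IDs, and cluster assignments are hypothetical.

```python
def class_coverage(cluster_of_positive, retrieved_positive_ids, k):
    """Fraction of the k clusters hit by at least one retrieved positive."""
    covered = {cluster_of_positive[i] for i in retrieved_positive_ids}
    return len(covered) / k


# Hypothetical assignment of a class's positives to K = 4 visual modes.
cluster_of_positive = {
    "img_a": 0, "img_b": 0,
    "img_c": 1,
    "img_d": 2,
    "img_e": 3, "img_f": 3,
}

# Positives retrieved so far: two from mode 0 and one from mode 1.
retrieved = ["img_a", "img_b", "img_c"]
print(class_coverage(cluster_of_positive, retrieved, k=4))  # 0.5
```

Retrieving a second image from an already covered mode leaves the value unchanged, which is why the metric rewards spreading across modes and not raw positive counts.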

Figure 2 : Interactive retrieval process.

This review was generated by AI and reviewed by a human editor.