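[Paper Explained] Detecting and Correcting for Label Shift with Black Box Predictors
Under label shift, BBSE uses a black-box predictor to estimate the test label distribution with consistency guarantees, enabling shift detection and classifier correction without labeled test data.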
Faced with distribution shift between training and test set, we wish to detect and quantify the shift, and to correct our classifiers without test set labels. Motivated by medical diagnosis, where diseases (targets) cause symptoms (observations), we focus on label shift, where the label marginal $p(y)$ changes but the conditional $p(x| y)$ does not. We propose Black Box Shift Estimation (BBSE) to estimate the test distribution $p(y)$. BBSE exploits arbitrary black box predictors to reduce dimensionality prior to shift correction. While better predictors give tighter estimates, BBSE works even when predictors are biased, inaccurate, or uncalibrated, so long as their confusion matrices are invertible. We prove BBSE's consistency, bound its error, and introduce a statistical test that uses BBSE to detect shift. We also leverage BBSE to correct classifiers. Experiments demonstrate accurate estimates and improved prediction, even on high-dimensional datasets of natural images.
Motivation and Objectives
- Motivate the problem of distribution shift between training and test, focusing on label shift where p(x|y) stays fixed while p(y) changes.
- Develop a method to estimate the test-label distribution q(y) using only unlabeled test data and a fixed predictor f.
- Provide theoretical guarantees: consistency and error bounds for the shift estimates.
- Propose applications to statistical testing for shift detection and to classifier correction via importance-weighted ERM.
Proposed Method
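- Introduce BBSE, which uses a black-box predictor f whose expected confusion matrix is invertible to estimate the importance weights w(y) = q(y)/p(y).
- Formulate a linear system A w = b, where A = C_{ŷ,y} is the joint confusion matrix of f under the training distribution, with entries p(ŷ, y), and b = μ_{ŷ} is the distribution of f's predicted labels on the test data. In practice both are replaced by empirical estimates: Ĉ_{ŷ,y} from labeled training data and μ̂_{ŷ} from unlabeled test data.
- Derive the estimators ŵ = Ĉ_{ŷ,y}^{-1} μ̂_{ŷ} and μ̂_y = diag(ν̂_y) ŵ, where ν̂_y is the empirical label distribution of the training data and μ̂_y estimates the test label distribution q(y).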
- Prove consistency: as n,m → ∞, ŵ → w and μ̂_y → μ_y under Assumptions A.1–A.3.
- Provide error bounds: ||ŵ−w||_2^2 ≤ (C/σ_min^2)(||w||^2 log n / n + k log m / m).
- Offer BBSD (Black Box Shift Detection) as a one-dimensional two-sample test on p(ŷ) vs q(ŷ) for shift detection, with provable Type I error control and power.
- Explain BBSC (Black Box Shift Correction) via importance-weighted ERM using the estimated weights ŵ, including handling of degenerate cases and negative weights.
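The estimation step can be sketched in a few lines. Under label shift, q(ŷ) = Σ_y p(ŷ, y) w(y), so solving the linear system built from the empirical confusion matrix and the test-time prediction frequencies yields the weights. The sketch below is a minimal illustration, not the authors' implementation: the function names `solve_linear` and `bbse_weights`, the integer-encoded labels, and the toy data are all assumptions made for this example, and a real pipeline would more likely call a linear-algebra library than the hand-written Gaussian elimination used here to stay dependency-free.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("confusion matrix is numerically singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def bbse_weights(train_preds, train_labels, test_preds, k):
    """Estimate w(y) = q(y)/p(y) and q(y) from hard predictions (hypothetical helper).

    Labels and predictions are integers in range(k).
    """
    n, m = len(train_labels), len(test_preds)
    # Joint confusion matrix: C[i][j] estimates p(prediction = i, label = j).
    C = [[0.0] * k for _ in range(k)]
    for y_hat, y in zip(train_preds, train_labels):
        C[y_hat][y] += 1.0
    C = [[v / n for v in row] for row in C]
    # Empirical distribution of predicted labels on the unlabeled test set.
    mu = [0.0] * k
    for y_hat in test_preds:
        mu[y_hat] += 1.0
    mu = [v / m for v in mu]
    w = solve_linear(C, mu)
    # Training label marginal is the column sum of the joint confusion matrix.
    nu = [sum(C[i][j] for i in range(k)) for j in range(k)]
    q = [nu[j] * w[j] for j in range(k)]
    return w, q


# Toy example (made-up data): balanced training labels, imperfect predictor.
train_labels = [0] * 50 + [1] * 50
train_preds = [0] * 40 + [1] * 10 + [1] * 45 + [0] * 5
test_preds = [0] * 24 + [1] * 76
w_hat, q_hat = bbse_weights(train_preds, train_labels, test_preds, k=2)
print([round(v, 3) for v in w_hat], [round(v, 3) for v in q_hat])
```

In this toy setup the predictor is wrong on 15% of the training points, yet the estimate still recovers weights of about (0.4, 1.6), corresponding to a test label distribution of about (0.2, 0.8). That matches the point made above: the predictor needs an invertible confusion matrix, not high accuracy.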
Experimental Results
Research Questions
- RQ1: Can a fixed, possibly biased black-box predictor enable consistent estimation of label-shift weights w(y) from unlabeled test data?
- RQ2: Can BBSE detect label shift accurately using only predictions ŷ and training confusion information?
- RQ3: Can the estimated shift be used to improve downstream classifier performance via importance-weighted ERM (BBSC)?
- RQ4: How do BBSE/BBSD/BBSC perform on high-dimensional data compared to kernel-based approaches like KMM?
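To make the detection question (RQ2) concrete, the sketch below compares the predicted-label distributions p(ŷ) and q(ŷ) with a two-sample test. It is an illustration under stated assumptions: the summary does not say which one-dimensional test BBSD uses, so the permutation test and the total-variation statistic here are stand-ins chosen for simplicity, and the names `tv_distance` and `bbsd_permutation_test` are hypothetical.

```python
import random
from collections import Counter


def tv_distance(a, b, k):
    """Total variation distance between the empirical label distributions of a and b."""
    ca, cb = Counter(a), Counter(b)
    return 0.5 * sum(abs(ca[c] / len(a) - cb[c] / len(b)) for c in range(k))


def bbsd_permutation_test(source_preds, target_preds, k, n_perm=1000, seed=0):
    """Permutation two-sample test on predicted labels (illustrative choice of test).

    Returns the observed statistic and a p-value; a small p-value suggests
    that the predicted-label distribution has shifted.
    """
    rng = random.Random(seed)
    observed = tv_distance(source_preds, target_preds, k)
    pooled = list(source_preds) + list(target_preds)
    n = len(source_preds)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        # Small tolerance guards against floating-point ties.
        if tv_distance(pooled[:n], pooled[n:], k) >= observed - 1e-12:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)


# Toy example (made-up predictions): source is balanced, target is skewed.
source = [0] * 50 + [1] * 50
target = [0] * 10 + [1] * 90
stat, p_value = bbsd_permutation_test(source, target, k=2, n_perm=500)
print(round(stat, 3), p_value < 0.05)
```

Because the test only sees the k-valued predictions rather than the raw inputs, its cost does not depend on the input dimension, which is the motivation for running detection on black-box outputs.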
Key Findings
- BBSE provides consistent estimates of w(y) and q(y) under label shift given invertible confusion matrices and shared x|y distributions.
- Theoretical error bounds show estimation accuracy improves with larger n,m and depends on the minimum singular value σ_min of the confusion matrix.
- BBSD delivers controlled Type I error and higher power than kernel two-sample tests on MNIST across various shift strengths.
- BBSE outperforms kernel mean matching (KMM) in accuracy and scales better to high-dimensional data like MNIST/CIFAR, enabling practical shift correction.
- BBSC, using estimated weights, yields improved classifier performance under label shift on benchmark datasets, with robustness to degenerate confusion matrices via soft or merged-class adaptations.
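The correction step reweights each training example by the estimated weight of its label. The sketch below shows only the reweighted risk, not a full training loop. The function name `importance_weighted_risk` is hypothetical, and clipping negative weight estimates to zero is an assumption made here as one simple way to handle them; the summary does not spell out the exact treatment.

```python
def importance_weighted_risk(losses, labels, class_weights):
    """Average per-example loss, reweighted by the estimated w(y) of each label.

    Negative weight estimates are clipped to zero (an assumption of this sketch).
    """
    clipped = [max(w, 0.0) for w in class_weights]
    return sum(clipped[y] * loss for loss, y in zip(losses, labels)) / len(losses)


# Toy example (made-up data): 0-1 losses of a classifier on balanced training data.
labels = [0] * 50 + [1] * 50
losses = [0] * 40 + [1] * 10 + [0] * 45 + [1] * 5
plain = importance_weighted_risk(losses, labels, [1.0, 1.0])
weighted = importance_weighted_risk(losses, labels, [0.4, 1.6])
print(round(plain, 3), round(weighted, 3))
```

With weights (0.4, 1.6), the weighted training error equals the error the same classifier would incur on a test set with label distribution (0.2, 0.8): 0.2 × 0.2 + 0.8 × 0.1 = 0.12, compared with 0.15 unweighted. Minimizing this weighted risk therefore targets test-distribution performance using only training labels.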
This explainer was generated by AI and reviewed by a human editor.