QUICK REVIEW

[Paper Review] A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification

Anastasios N. Angelopoulos, Stephen Bates|arXiv (Cornell University)|2021. 07. 15.

Anomaly Detection Techniques and Applications | References: 11 | Citations: 28

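One-Line Summary

This paper presents conformal prediction as a distribution-free framework for constructing valid uncertainty sets around predictions in both classification and regression, and provides practical procedures, extensions, and diagnostic tools.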

ABSTRACT

Black-box machine learning models are now routinely used in high-risk settings, like medical diagnostics, which demand uncertainty quantification to avoid consequential model failures. Conformal prediction is a user-friendly paradigm for creating statistically rigorous uncertainty sets/intervals for the predictions of such models. Critically, the sets are valid in a distribution-free sense: they possess explicit, non-asymptotic guarantees even without distributional assumptions or model assumptions. One can use conformal prediction with any pre-trained model, such as a neural network, to produce sets that are guaranteed to contain the ground truth with a user-specified probability, such as 90%. It is easy-to-understand, easy-to-use, and general, applying naturally to problems arising in the fields of computer vision, natural language processing, deep reinforcement learning, and so on. This hands-on introduction is aimed to provide the reader a working understanding of conformal prediction and related distribution-free uncertainty quantification techniques with one self-contained document. We lead the reader through practical theory for and examples of conformal prediction and describe its extensions to complex machine learning tasks involving structured outputs, distribution shift, time-series, outliers, models that abstain, and more. Throughout, there are many explanatory illustrations, examples, and code samples in Python. With each code sample comes a Jupyter notebook implementing the method on a real-data example; the notebooks can be accessed and easily run using our codebase.
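
The guarantee described in the abstract can be written out as follows. This is a sketch in notation assumed here, not quoted from the review: $(X_{\text{test}}, Y_{\text{test}})$ is a fresh test point, $\mathcal{C}$ is the prediction set built from $n$ calibration points, and $\alpha$ is the user-chosen error rate (for example $\alpha = 0.1$ for 90% coverage). The lower bound assumes the calibration and test points are exchangeable (i.i.d. is sufficient); the upper bound additionally assumes the conformal scores have no ties.

```latex
1 - \alpha \;\le\; \mathbb{P}\bigl(Y_{\text{test}} \in \mathcal{C}(X_{\text{test}})\bigr) \;\le\; 1 - \alpha + \frac{1}{n+1}
```

The probability is marginal: it averages over the randomness in both the calibration data and the test point, and is not conditional on any particular input.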

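Motivation and Objectives

Provide a practical, self-contained introduction to conformal prediction and distribution-free uncertainty quantification.
Show how conformal prediction turns heuristic notions of model uncertainty into rigorous, finite-sample guarantees.
Demonstrate several conformal procedures for classification and regression on real-data examples.
Discuss evaluation, adaptivity, and extensions to complex tasks and distribution shift.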

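Proposed Method

Introduces the conformal prediction framework: a score function, plus a calibration step that computes a quantile qhat of the calibration scores, which is then used to form prediction sets.
Presents split conformal prediction as the basic, widely used variant with a formal coverage guarantee.
Provides concrete procedures built on different score functions: adaptive prediction sets for classification, conformalized quantile regression for regression, conformalizing scalar uncertainty estimates, and conformalizing Bayes.
Explains that the coverage guarantee holds regardless of the underlying model or data distribution.
Discusses extensions to structured outputs, distribution shift, outliers, and selective decisions (models that abstain), along with practical Python code snippets and notebooks.
Outlines evaluation metrics for adaptivity and correctness, coverage checks, and calibration set size considerations.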
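
The calibration step described in the list above can be sketched in a few lines. This is a minimal, standard-library-only illustration, not the paper's notebook code: the function names (`conformal_quantile`, `prediction_set`, `toy_softmax`, `simulate_coverage`), the toy "model", and all parameter values are assumptions made for the example. It uses the score 1 minus the softmax probability of the true class, and takes qhat as the ceil((n+1)(1−α))-th smallest calibration score.

```python
import math
import random


def conformal_quantile(scores, alpha):
    """Return the ceil((n+1)(1-alpha))-th smallest calibration score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: the set must be trivial.
        return float("inf")
    return sorted(scores)[k - 1]


def prediction_set(probs, qhat):
    """Keep every label whose score 1 - p(label) is at most qhat."""
    return [label for label, p in enumerate(probs) if 1 - p <= qhat]


def toy_softmax(true_label, n_classes, rng):
    """Hypothetical stand-in for a pre-trained classifier's softmax output."""
    logits = [rng.gauss(0.0, 1.0) for _ in range(n_classes)]
    logits[true_label] += 2.0  # the toy model is informative but imperfect
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def simulate_coverage(n_cal=1000, n_test=2000, n_classes=5, alpha=0.1, seed=0):
    """Calibrate on n_cal points, then measure coverage on n_test fresh points."""
    rng = random.Random(seed)

    def draw():
        y = rng.randrange(n_classes)
        return toy_softmax(y, n_classes, rng), y

    # Calibration: score each held-out point by 1 - p(true class).
    cal_scores = []
    for _ in range(n_cal):
        probs, y = draw()
        cal_scores.append(1 - probs[y])
    qhat = conformal_quantile(cal_scores, alpha)

    # Evaluation: fraction of test points whose true label is in the set.
    hits = 0
    for _ in range(n_test):
        probs, y = draw()
        if y in prediction_set(probs, qhat):
            hits += 1
    return hits / n_test
```

Calling `simulate_coverage()` returns the empirical coverage on the synthetic test points. Under these assumptions it should land close to 1−α, with some fluctuation that depends on the calibration and test set sizes.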

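Experimental Results

Research Questions

RQ1: How can distribution-free, finite-sample uncertainty guarantees be obtained for the predictions of any model?
RQ2: How should score functions be designed to produce useful, adaptive prediction sets in classification and regression?
RQ3: What are the practical extensions of conformal prediction to tasks such as time series, outliers, and distribution shift?
RQ4: How can conformal predictors be evaluated for adaptivity and correct coverage on real data?
RQ5: How does the size of the calibration set affect the guarantees and performance?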

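Key Findings

Conformal prediction produces prediction sets with guaranteed marginal coverage of at least 1−α, regardless of the model or the data distribution.
Adaptive prediction sets can be constructed so that set size tracks input difficulty, which increases practical usefulness.
Conformalized quantile regression yields valid intervals for regression by adjusting an existing quantile model with a quantile obtained from calibration.
A variety of conformal extensions, such as conformalizing Bayes, scalar uncertainty estimates, and Bayes-optimal sets, are possible while retaining the distribution-free guarantee.
Calibration set size and diagnostic tools are critical for verifying correct coverage and measuring adaptivity in practice.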
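
The conformalized quantile regression adjustment and the coverage check mentioned above can be illustrated with a small simulation. This is a hedged sketch, not the authors' code: `toy_quantile_model` is a hypothetical stand-in for a fitted quantile regressor (made deliberately too narrow), the data-generating process in `draw_point` is invented for the example, and all names and parameter values are assumptions. The score is max(lower − y, y − upper), and the calibrated interval widens both ends by qhat.

```python
import math
import random


def conformal_quantile(scores, alpha):
    """Return the ceil((n+1)(1-alpha))-th smallest calibration score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return float("inf")
    return sorted(scores)[k - 1]


def toy_quantile_model(x):
    """Hypothetical lower/upper quantile estimates: one noise std each side."""
    spread = 0.5 + x
    return x - spread, x + spread


def draw_point(rng):
    """Assumed synthetic data: y = x + noise, with noise std growing in x."""
    x = rng.random()
    y = x + rng.gauss(0.0, 0.5 + x)
    return x, y


def cqr_score(x, y):
    """Signed distance from y to the nearest interval endpoint (negative inside)."""
    lo, hi = toy_quantile_model(x)
    return max(lo - y, y - hi)


def cqr_interval(x, qhat):
    """Widen (or shrink, if qhat < 0) the model's interval by qhat on each side."""
    lo, hi = toy_quantile_model(x)
    return lo - qhat, hi + qhat


def run_cqr(n_cal=1000, n_test=2000, alpha=0.1, seed=0):
    """Calibrate qhat, then report it with the empirical test coverage."""
    rng = random.Random(seed)
    cal_scores = [cqr_score(*draw_point(rng)) for _ in range(n_cal)]
    qhat = conformal_quantile(cal_scores, alpha)

    hits = 0
    for _ in range(n_test):
        x, y = draw_point(rng)
        lo, hi = cqr_interval(x, qhat)
        if lo <= y <= hi:
            hits += 1
    return qhat, hits / n_test
```

Calling `run_cqr()` returns the calibrated margin qhat and the empirical coverage on fresh draws. Because the toy intervals start out too narrow for 90% coverage, qhat comes out positive here; rerunning with a smaller `n_cal` is one simple way to see how the coverage fluctuates with calibration set size.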


This review was generated by AI and reviewed by a human editor.