QUICK REVIEW

[Paper Review] Achieving Fairness through Adversarial Learning: an Application to Recidivism Prediction

Christina Wadsworth, Francesca Vera | arXiv (Cornell University) | June 30, 2018

Ethics and Social Impacts of AI | References: 11 | Citations: 117

One-Line Summary

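TL;DR: This paper trains an adversarially guided neural network for recidivism prediction that reduces discriminatory bias, comes closer to the parity and equality-of-odds fairness definitions, and is more accurate than COMPAS on the Broward dataset.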

ABSTRACT

Recidivism prediction scores are used across the USA to determine sentencing and supervision for hundreds of thousands of inmates. One such generator of recidivism prediction scores is Northpointe's Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) score, used in states like California and Florida, which past research has shown to be biased against black inmates according to certain measures of fairness. To counteract this racial bias, we present an adversarially-trained neural network that predicts recidivism and is trained to remove racial bias. When comparing the results of our model to COMPAS, we gain predictive accuracy and get closer to achieving two out of three measures of fairness: parity and equality of odds. Our model can be generalized to any prediction and demographic. This piece of research contributes an example of scientific replication and simplification in a high-stakes real-world application like recidivism prediction.

Motivation and Objectives

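Promote fairness in high-stakes recidivism prediction and address the bias of existing scores such as COMPAS.
Propose an adversarial learning framework that removes racial information from a recidivism predictor.
Evaluate fairness and accuracy against COMPAS and a baseline model on the public Broward County data.
Show that the approach can generalize to other predictions and protected attributes.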

Proposed Method

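Use a predictor N that outputs the recidivism probability Ŷ.
Attach an adversary A that tries to predict the demographic D from the predictor's logits (for equality of odds, the true label Y is also given to the adversary).
Train with the losses Ly (predictor) and Ld (adversary): the adversary minimizes Ld, while the predictor optimizes L = Ly - alpha * Ld, which encourages parity or equality of odds.
Evaluate with AUC and fairness gaps: High Risk Gap, FP Gap, and FN Gap, plus calibration plots.
Tune the hyperparameters (number of layers, alpha, learning rate) to first maximize Ld and then minimize Ly.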
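The combined objective above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the authors' implementation: it assumes binary labels for both Y and D, binary cross-entropy for both losses, and a hypothetical one-layer adversary (`adversary_probs`, with made-up weights) standing in for the adversary network A.

```python
import math


def sigmoid(z):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def bce(probs, labels, eps=1e-12):
    """Mean binary cross-entropy between probabilities and 0/1 labels."""
    total = 0.0
    for p, t in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(labels)


def adversary_probs(logits, y, w_logit, w_y, bias, use_label):
    """Hypothetical one-layer adversary: guesses D from the predictor logit.

    With use_label=True the true label Y is an extra input, which is the
    variant the review associates with equality of odds.
    """
    probs = []
    for z, t in zip(logits, y):
        score = w_logit * z + bias + (w_y * t if use_label else 0.0)
        probs.append(sigmoid(score))
    return probs


def predictor_objective(logits, y, d, adv_params, alpha, use_label=False):
    """Return (L, Ly, Ld) with L = Ly - alpha * Ld for the predictor."""
    y_hat = [sigmoid(z) for z in logits]
    l_y = bce(y_hat, y)
    d_hat = adversary_probs(logits, y, *adv_params, use_label)
    l_d = bce(d_hat, d)
    return l_y - alpha * l_d, l_y, l_d


# Toy values, invented purely for illustration.
logits = [2.0, -1.0, 0.5, -2.0]   # predictor outputs before the sigmoid
y = [1, 0, 1, 0]                  # recidivism labels
d = [1, 0, 0, 1]                  # protected-attribute labels
adv_params = (1.0, 0.5, 0.0)      # (w_logit, w_y, bias), hypothetical

total, l_y, l_d = predictor_objective(logits, y, d, adv_params, alpha=1.0)
print(f"Ly={l_y:.4f}  Ld={l_d:.4f}  L={total:.4f}")
```

Setting `alpha=0` reduces the objective to the plain prediction loss Ly, so alpha controls how strongly the predictor is rewarded for making the adversary's job harder. In an actual training loop the adversary would separately take gradient steps to lower Ld.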

Experimental Results

Research Questions

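RQ1: Can the adversarial setup remove racial information from recidivism prediction without sacrificing accuracy?
RQ2: How close can the model get to demographic parity and equality of odds on the Broward COMPAS dataset?
RQ3: Does the adversarial model outperform COMPAS in accuracy while meeting the fairness criteria?
RQ4: Can the approach generalize to other predictions and protected attributes?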

Key Results

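The adversarial model substantially reduces the bias gaps: High Risk Gap 0.02, FN Gap 0.02, FP Gap 0.01.
The adversarial model is more accurate (AUC 0.70) than COMPAS (AUC 0.66) while coming closer to the fairness targets.
Compared with the baseline recidivism model, the adversarial model shows larger improvements in parity and equality of odds while maintaining strong accuracy.
Against several fairness baselines (Bechavod, Zafar, Hardt), the selected adversarial model achieves competitive FP/FN gaps with comparable or better AUC.
A case study illustrates the practical implications: for specific inmates, the COMPAS assessment can differ from the adversarial model's prediction.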
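To make the gap metrics reported above concrete, here is a small sketch of how they could be computed. It rests on assumptions not stated in this review: each gap is taken as the absolute difference between two demographic groups, a score at or above a 0.5 threshold counts as "high risk", and the function names and toy data are invented for illustration.

```python
def safe_rate(numerator, denominator):
    """Ratio that falls back to 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def group_rates(y_true, y_pred):
    """High-risk rate, false positive rate, and false negative rate for one group."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "high_risk": safe_rate(tp + fp, len(y_true)),
        "fpr": safe_rate(fp, fp + tn),
        "fnr": safe_rate(fn, fn + tp),
    }


def fairness_gaps(y_true, scores, group, threshold=0.5):
    """Absolute between-group differences (assumes exactly two groups)."""
    labels = sorted(set(group))
    if len(labels) != 2:
        raise ValueError("this sketch expects exactly two groups")
    per_group = []
    for g in labels:
        idx = [i for i, v in enumerate(group) if v == g]
        truths = [y_true[i] for i in idx]
        preds = [1 if scores[i] >= threshold else 0 for i in idx]
        per_group.append(group_rates(truths, preds))
    a, b = per_group
    return {
        "high_risk_gap": abs(a["high_risk"] - b["high_risk"]),
        "fp_gap": abs(a["fpr"] - b["fpr"]),
        "fn_gap": abs(a["fnr"] - b["fnr"]),
    }


# Toy data, invented for illustration (not from the Broward dataset).
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.2, 0.8, 0.7, 0.3, 0.1]
group = ["a", "a", "a", "a", "b", "b", "b", "b"]

gaps = fairness_gaps(y_true, scores, group)
print(gaps)
```

In this toy example both groups are flagged high risk at the same rate, so the High Risk Gap is zero even though the error rates differ between the groups. That is why parity and equality of odds are tracked as separate criteria.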

This review was generated by AI and reviewed by a human editor.