QUICK REVIEW

[Paper Review] TRAK: Attributing Model Behavior at Scale

Sung‐Min Park, Kristian Georgiev | arXiv (Cornell University) | 2023. 03. 24.

COVID-19 diagnosis using AI | Citations: 10

One-Line Summary

TRAK is a scalable and effective data attribution method that achieves strong attribution performance using only a handful of trained models.

ABSTRACT

The goal of data attribution is to trace model predictions back to training data. Despite a long line of work towards this goal, existing approaches to data attribution tend to force users to choose between computational tractability and efficacy. That is, computationally tractable methods can struggle with accurately attributing model predictions in non-convex settings (e.g., in the context of deep neural networks), while methods that are effective in such regimes require training thousands of models, which makes them impractical for large models or datasets. In this work, we introduce TRAK (Tracing with the Randomly-projected After Kernel), a data attribution method that is both effective and computationally tractable for large-scale, differentiable models. In particular, by leveraging only a handful of trained models, TRAK can match the performance of attribution methods that require training thousands of models. We demonstrate the utility of TRAK across various modalities and scales: image classifiers trained on ImageNet, vision-language models (CLIP), and language models (BERT and mT5). We provide code for using TRAK (and reproducing our work) at https://github.com/MadryLab/trak .

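Motivation and Objectives

Motivates the importance of tracing model predictions back to training data.
Identifies the trade-off in existing data attribution methods between computational tractability and efficacy.
Introduces TRAK as a scalable and effective data attribution approach for large-scale non-convex models.
Demonstrates the effectiveness of TRAK across a range of modalities and architectures.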

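Proposed Method

Introduces TRAK (Tracing with the Randomly-projected After Kernel) as a data attribution method.
Uses only a handful of trained models to compute attributions.
Aims to match the performance of attribution methods that require training thousands of models.
Applies TRAK to large-scale differentiable models across a range of modalities.
Provides code for using TRAK and reproducing the results.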
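
The points above name only two ingredients: a handful of trained models and, per the method's name, a random projection. As a rough illustration of how those two pieces can fit together, the following pure-Python sketch scores training examples by the similarity of randomly projected per-example gradients, averaged over a few independently trained toy models. It is not the TRAK estimator itself and not the API of the MadryLab/trak library. The toy model, the synthetic data, the sizes, and every function name are assumptions made for this example.

```python
import math
import random

# All names, sizes, and the toy model below are hypothetical illustrations.
D_IN, D_HIDDEN, K, NUM_MODELS = 4, 3, 5, 3
N_PARAMS = D_HIDDEN * D_IN + D_HIDDEN

def init_model(rng):
    """Toy differentiable model f(x) = v . tanh(W x)."""
    W = [[rng.gauss(0, 0.5) for _ in range(D_IN)] for _ in range(D_HIDDEN)]
    v = [rng.gauss(0, 0.5) for _ in range(D_HIDDEN)]
    return W, v

def output_and_grad(model, x):
    """Return f(x) and the flattened gradient of f(x) w.r.t. (W, v)."""
    W, v = model
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]
    out = sum(vj * hj for vj, hj in zip(v, h))
    grad_W = [vj * (1.0 - hj * hj) * xi for vj, hj in zip(v, h) for xi in x]
    return out, grad_W + h  # d f / d v_j = h_j

def train(model, data, rng, steps=200, lr=0.1):
    """Plain SGD on the logistic loss log(1 + exp(-y * f(x))), y in {-1, +1}."""
    W, v = model
    for _ in range(steps):
        x, y = rng.choice(data)
        out, grad = output_and_grad(model, x)
        coeff = -y / (1.0 + math.exp(y * out))  # d loss / d f(x)
        for j in range(D_HIDDEN):
            for i in range(D_IN):
                W[j][i] -= lr * coeff * grad[j * D_IN + i]
            v[j] -= lr * coeff * grad[D_HIDDEN * D_IN + j]
    return model

def make_projection(rng):
    """Gaussian random projection from N_PARAMS down to K dimensions."""
    scale = 1.0 / math.sqrt(K)
    return [[rng.gauss(0, 1) * scale for _ in range(K)] for _ in range(N_PARAMS)]

def project(grad, P):
    """Multiply a flattened gradient by the projection matrix P."""
    return [sum(g * row[c] for g, row in zip(grad, P)) for c in range(K)]

def attribution_scores(models, projections, train_xs, test_x):
    """Average over models of <projected test grad, projected train grad>."""
    scores = [0.0] * len(train_xs)
    for model, P in zip(models, projections):
        phi_test = project(output_and_grad(model, test_x)[1], P)
        for n, x in enumerate(train_xs):
            phi_train = project(output_and_grad(model, x)[1], P)
            dot = sum(a * b for a, b in zip(phi_test, phi_train))
            scores[n] += dot / len(models)
    return scores

rng = random.Random(0)
# Synthetic two-class data: features centered at +1 or -1 depending on the label.
labels = [rng.choice([-1, 1]) for _ in range(20)]
train_data = [([y + rng.gauss(0, 0.5) for _ in range(D_IN)], y) for y in labels]
train_xs = [x for x, _ in train_data]
test_x = [1.0] * D_IN

# A "handful" of models, each with its own random projection.
models = [train(init_model(rng), train_data, rng) for _ in range(NUM_MODELS)]
projections = [make_projection(rng) for _ in range(NUM_MODELS)]
scores = attribution_scores(models, projections, train_xs, test_x)

top = sorted(range(len(scores)), key=lambda n: scores[n], reverse=True)[:3]
print("highest-scoring training indices:", top)
```

In this sketch, a higher score means a training example's projected gradient aligns more closely with that of the test example. The method described in the paper involves further steps that are deliberately left out here, so the scores should be read as a gradient-similarity baseline rather than as TRAK scores.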

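Experimental Results

Research Questions

RQ1: Can data attribution be both scalable and effective in large-scale non-convex settings?
RQ2: How does TRAK compare with existing attribution methods that require many trained models?
RQ3: Is TRAK effective across different modalities and model families (vision, vision-language, language)?

Key Findings

TRAK achieves effective data attribution with far fewer trained models than sampling-based baselines.
In the experiments, TRAK matches or comes close to the performance of methods that train thousands of models.
TRAK proves useful across image classifiers (ImageNet), vision-language models (CLIP), and language models (BERT, mT5).
The work provides code for reproducing the results and applying TRAK to other models.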

This review was generated by AI and reviewed by a human editor.