QUICK REVIEW

[Paper Review] A Theory of Usable Information Under Computational Constraints

Yilun Xu, Shengjia Zhao | arXiv (Cornell University) | 2020.02.25

Machine Learning and Algorithms | References: 30 | Citations: 31

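One-Line Summary

The paper introduces predictive V-information, a variational extension of Shannon's information theory that accounts for the observer's computational constraints and modeling power, shows that it can be estimated from data with PAC-style guarantees, and uses it to improve structure learning and fair representation learning.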

ABSTRACT

We propose a new framework for reasoning about information in complex systems. Our foundation is based on a variational extension of Shannon's information theory that takes into account the modeling power and computational constraints of the observer. The resulting \emph{predictive $\mathcal{V}$-information} encompasses mutual information and other notions of informativeness such as the coefficient of determination. Unlike Shannon's mutual information and in violation of the data processing inequality, $\mathcal{V}$-information can be created through computation. This is consistent with deep neural networks extracting hierarchies of progressively more informative features in representation learning. Additionally, we show that by incorporating computational constraints, $\mathcal{V}$-information can be reliably estimated from data even in high dimensions with PAC-style guarantees. Empirically, we demonstrate predictive $\mathcal{V}$-information is more effective than mutual information for structure learning and fair representation learning.

Motivation and Objectives

Propose a notion of usable information that accounts for the observer's modeling power and computational constraints.
Define predictive V-entropy and predictive V-information as constrained counterparts to Shannon entropy and mutual information.

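Proposed Method

Define a predictive family V, a set of predictive models that allows optional ignorance, and define H_V(Y|X) as the infimum over V of the expected negative log-likelihood of Y given X.
Define predictive V-information as I_V(X -> Y) = H_V(Y|∅) - H_V(Y|X).
Show that, for suitable choices of V, V-entropy and V-information recover Shannon entropy, mutual information, and the coefficient of determination (R^2) as special cases.
Propose an estimator of I_V computed from empirical data and bound its error in terms of Rademacher complexity, which gives a PAC-style guarantee (Theorem 1).
Develop a structure learning algorithm (Algorithm 1) that weights directed edges by V-information and builds a maximum directed spanning tree with the Chu-Liu algorithm, with a finite-sample guarantee (Theorem 2).
Demonstrate the empirical advantages of V-information in Chow-Liu tree construction and gene network inference, and show its applicability to fairness.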
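
The definitions above can be written out as a short math block. This is a sketch of the notation rather than a quotation from the paper: the bracket notation f[x] for "the distribution the model f predicts given side information x", the mean function \phi, and the fixed-variance Gaussian family in the last two lines are assumptions chosen to make the R^2 connection concrete for scalar Y.

```latex
\begin{align}
% Predictive conditional V-entropy: best expected log-loss achievable within V
H_{\mathcal{V}}(Y \mid X) &= \inf_{f \in \mathcal{V}}
  \mathbb{E}_{x, y \sim X, Y}\bigl[-\log f[x](y)\bigr] \\
% With no side information the model receives the empty input
H_{\mathcal{V}}(Y \mid \varnothing) &= \inf_{f \in \mathcal{V}}
  \mathbb{E}_{y \sim Y}\bigl[-\log f[\varnothing](y)\bigr] \\
% Predictive V-information: reduction in V-entropy from observing X
I_{\mathcal{V}}(X \to Y) &= H_{\mathcal{V}}(Y \mid \varnothing) - H_{\mathcal{V}}(Y \mid X) \\
% Assumed family: f[x] = N(\phi(x), 1/2) and f[\varnothing] = N(\mu, 1/2)
-\log f[x](y) &= \bigl(y - \phi(x)\bigr)^2 + \tfrac{1}{2}\log \pi \\
% The constants cancel, leaving explained variance
I_{\mathcal{V}}(X \to Y) &= \operatorname{Var}(Y)
  - \min_{\phi} \mathbb{E}\bigl[(Y - \phi(X))^2\bigr]
  = R^2 \operatorname{Var}(Y)
\end{align}
```

Under this assumed family the log-loss is squared error plus a constant, so V-information is the variance of Y explained by the best predictor allowed in V, which is how R^2 appears as a special case.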

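Experimental Results

Research Questions

RQ1: How can information be defined when computation and modeling power are constrained?
RQ2: Under bounded complexity, can predictive V-information generalize mutual information while remaining estimable from finite samples?
RQ3: Does using V-information improve structure learning and fairness tasks compared with Shannon-based methods?
RQ4: Can practical algorithms built on V-information provide finite-sample guarantees for learning tree structures?
RQ5: How does V-information behave under data processing and asymmetry in prediction tasks?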
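
RQ5 can be made concrete with a small numerical sketch. It assumes one particular predictive family that the review does not spell out: Gaussian predictors with an affine mean and fixed variance 1/2, for which V-information has the closed form Cov(X, Y)^2 / Var(X). The function name `v_information_affine` and the toy data are hypothetical, chosen only for illustration.

```python
from statistics import fmean


def v_information_affine(xs, ys):
    """Estimate I_V(X -> Y) for an assumed family: Gaussian predictors with
    affine mean a*x + b and fixed variance 1/2.

    For this family the log-loss is squared error plus a constant, so
    I_V(X -> Y) = Var(Y) - min_{a,b} E[(Y - aX - b)^2] = Cov(X, Y)^2 / Var(X).
    """
    mx, my = fmean(xs), fmean(ys)
    var_x = fmean([(x - mx) ** 2 for x in xs])
    cov_xy = fmean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    # A constant X carries no usable information for an affine predictor.
    return 0.0 if var_x == 0.0 else cov_xy ** 2 / var_x


# Symmetric grid on [-1, 1]; Y is a deterministic function of X.
xs = [i / 1000 for i in range(-1000, 1001)]
ys = [x * x for x in xs]

# An affine predictor cannot exploit X directly, because Cov(X, X^2) = 0 here.
info_raw = v_information_affine(xs, ys)
# After the computation t(X) = X^2, the same family predicts Y perfectly.
info_squared = v_information_affine([x * x for x in xs], ys)

# Asymmetry: rescaling changes the two directions differently.
scaled = [3.0 * x for x in xs]
info_forward = v_information_affine(xs, scaled)   # I_V(X -> 3X)
info_backward = v_information_affine(scaled, xs)  # I_V(3X -> X)

print(f"I_V(X -> Y)    = {info_raw:.6f}")
print(f"I_V(X^2 -> Y)  = {info_squared:.6f}")
print(f"I_V(X -> 3X)   = {info_forward:.6f}")
print(f"I_V(3X -> X)   = {info_backward:.6f}")
```

In this sketch, squaring X raises the usable information about Y from roughly zero to the full variance of Y. Shannon mutual information cannot increase under deterministic processing, so this is the sense in which V-information violates the data processing inequality. The last two values differ by a factor of nine, showing that I_V(X -> Y) and I_V(Y -> X) need not agree.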

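Key Findings

Predictive V-information generalizes mutual information and reduces to it when the predictive family is unrestricted.
V-information can be estimated with PAC-style guarantees when the complexity of the predictive family is bounded (Theorem 1).
The V-information-based approach to learning directed tree structures (Algorithm 1) has a finite-sample performance guarantee (Theorem 2).
Empirically, Chow-Liu trees built with V-information outperform mutual-information-based approaches in high-dimensional structure learning and gene network inference.
V-information offers advantages for fair representation learning and highlights the limitations of MI-based methods under model-class constraints.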
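
The first finding can be checked numerically on a small discrete example. The sketch below assumes the unrestricted family, where every conditional distribution is allowed, so the best predictor is the true conditional P(y|x) and the best predictor for the empty input is the marginal P(y). The function names and the joint distribution are hypothetical and not taken from the paper.

```python
from math import log


def marginals(joint):
    """Return the marginals P(x) and P(y) of a joint pmf {(x, y): prob}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return px, py


def v_information_unrestricted(joint):
    """I_V(X -> Y) in nats when V contains every conditional distribution.

    The infimum of the expected log-loss is attained by the true conditional,
    so H_V(Y|X) = H(Y|X) and H_V(Y|empty) = H(Y).
    """
    px, py = marginals(joint)
    h_y = -sum(p * log(p) for p in py.values() if p > 0)
    h_y_given_x = -sum(
        p * log(p / px[x]) for (x, _), p in joint.items() if p > 0
    )
    return h_y - h_y_given_x


def mutual_information(joint):
    """Shannon mutual information I(X; Y) in nats."""
    px, py = marginals(joint)
    return sum(
        p * log(p / (px[x] * py[y])) for (x, y), p in joint.items() if p > 0
    )


# Hypothetical joint distribution of two correlated binary variables.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

v_info = v_information_unrestricted(joint)
mi = mutual_information(joint)
print(f"I_V(X -> Y) = {v_info:.6f} nats, I(X; Y) = {mi:.6f} nats")
```

The two quantities agree up to floating-point error. At the opposite extreme, a family that ignores its input entirely gives H_V(Y|X) = H_V(Y|∅) and therefore zero V-information. Restricted families used in practice fall between these two cases.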


This review was generated by AI and reviewed by a human editor.