QUICK REVIEW

[Paper Review] Invariant Rationalization

Shiyu Chang, Shuicheng Yan|arXiv (Cornell University)|Mar 22, 2020
Explainable Artificial Intelligence (XAI)|References: 25|Cited by: 37
One-Sentence Summary

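Invariant Rationalization (InvRat) introduces a game-theoretic, environment-based criterion for selective rationalization that avoids spurious correlations and outperforms maximum mutual information (MMI) based methods in both generalization and agreement with human judgments.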

ABSTRACT

Selective rationalization improves neural network interpretability by identifying a small subset of input features -- the rationale -- that best explains or supports the prediction. A typical rationalization criterion, i.e. maximum mutual information (MMI), finds the rationale that maximizes the prediction performance based only on the rationale. However, MMI can be problematic because it picks up spurious correlations between the input features and the output. Instead, we introduce a game-theoretic invariant rationalization criterion where the rationales are constrained to enable the same predictor to be optimal across different environments. We show both theoretically and empirically that the proposed rationales can rule out spurious correlations, generalize better to different test scenarios, and align better with human judgments. Our data and code are available.
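The two criteria contrasted in the abstract can be written side by side. This is a sketch under assumed notation, not taken verbatim from this page: X is the input, Y the label, E the environment, m a binary mask drawn from a constrained set S (for example sparse, contiguous masks), and Z = m ⊙ X the rationale.

```latex
% MMI criterion: choose the mask whose rationale is most informative about Y
\max_{m \in \mathcal{S}} \; I(Y; Z)
\quad \text{s.t.} \quad Z = m \odot X

% Invariant criterion: same objective, plus conditional independence of Y and E given Z
\max_{m \in \mathcal{S}} \; I(Y; Z)
\quad \text{s.t.} \quad Z = m \odot X, \qquad Y \perp E \mid Z
```

The added constraint says that once the rationale is known, the environment carries no further information about the label, which is what lets a single predictor be optimal in every environment.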

Motivation and Objectives

  • Identify and formalize the limitations of maximum mutual information (MMI) as a rationalization criterion for neural models.
  • Propose an invariant rationalization criterion that requires the rationale's predictive power to stay the same across environments.
  • Develop a three-player game-theoretic framework to optimize rationale generation under invariance constraints.
  • Analyze convergence and generalization properties of InvRat.
  • Empirically validate InvRat on datasets with synthetic and real spurious correlations.

Proposed Method

  • Formalize the MMI objective for rationales and illustrate its vulnerability to spurious correlations.
  • Introduce an environment variable and an invariance constraint Y ⟂ E | Z to filter non-causal features.
  • Propose a three-player InvRat framework: a rationale generator G, an environment-agnostic predictor Fi, and an environment-aware predictor Fe.
  • Reformulate the invariance constraint as a minimax objective via a Lagrangian/entropy view and solve it with alternating gradient methods.
  • Implement sparsity/continuity constraints on the rationale mask via soft or hard constraint methods.
  • Provide convergence insights via KKT conditions and a game-theoretic interpretation.
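The three-player setup listed above can be illustrated with a small numeric sketch of the generator's side of the game. It rests on several assumptions not stated on this page: the generator is taken to minimize the environment-agnostic loss plus a penalty `lam * max(0, gap)`, where `gap` is how much the environment-aware predictor beats the environment-agnostic one; `lam`, the function names, and the toy probabilities are all hypothetical. In the actual method the predictors are trained networks updated in alternation with the generator; here their outputs are simply given.

```python
import numpy as np

def cross_entropy(probs, labels):
    """Mean negative log-likelihood of the true labels under predicted class probabilities."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(np.clip(picked, 1e-12, 1.0))))

def invariance_gap(loss_agnostic, loss_aware):
    """How much knowing the environment helps prediction (assumed ReLU penalty shape)."""
    return max(loss_agnostic - loss_aware, 0.0)

def generator_objective(loss_agnostic, loss_aware, lam=1.0):
    """Assumed generator loss: prediction loss plus weighted invariance penalty."""
    return loss_agnostic + lam * invariance_gap(loss_agnostic, loss_aware)

labels = [0, 1, 0, 1]

# Toy case A (made-up numbers): a spurious rationale. Knowing the environment
# makes the prediction much sharper, so the gap is large.
spurious_agnostic = [[0.6, 0.4], [0.4, 0.6], [0.6, 0.4], [0.4, 0.6]]
spurious_aware = [[0.9, 0.1], [0.1, 0.9], [0.9, 0.1], [0.1, 0.9]]

# Toy case B (made-up numbers): an invariant rationale. The environment adds
# nothing, so both predictors output the same probabilities.
invariant_agnostic = [[0.8, 0.2], [0.2, 0.8], [0.8, 0.2], [0.2, 0.8]]
invariant_aware = invariant_agnostic

obj_spurious = generator_objective(
    cross_entropy(spurious_agnostic, labels), cross_entropy(spurious_aware, labels)
)
obj_invariant = generator_objective(
    cross_entropy(invariant_agnostic, labels), cross_entropy(invariant_aware, labels)
)
print(f"spurious rationale objective:  {obj_spurious:.4f}")
print(f"invariant rationale objective: {obj_invariant:.4f}")
```

In this toy setting the invariant rationale incurs no penalty, so a generator minimizing the objective would prefer it. The penalty weight `lam` controls how strongly a rationale that leaks environment information is discouraged.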

Experimental Results

Research Questions

  • RQ1: Can invariant rationales distinguish causal features from spurious correlations better than MMI-based methods?
  • RQ2: Does enforcing invariance across environments improve generalization to unseen test environments?
  • RQ3: How can a three-player game-theoretic framework effectively generate invariant rationales under sparsity constraints?
  • RQ4: Do invariant rationales align more closely with human judgments than traditional rationales?

Key Findings

  • InvRat successfully reduces reliance on spurious correlations that MMI tends to capture.
  • On a synthetic IMDB dataset with environment-driven punctuation cues, InvRat does not select the injected tokens, whereas MMI-based baselines do.
  • InvRat outperforms baselines on a multi-aspect beer reviews dataset, achieving higher alignment with human-annotated rationales across several aspect/rationale-length settings.
  • Human-subject evaluations indicate InvRat-generated rationales better convey the target aspect.
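Alignment with human-annotated rationales, as in the beer review finding above, is commonly quantified with token-level precision, recall, and F1 between the selected mask and the annotated mask. This page does not state which metric was used, so the sketch below is an assumption, and `rationale_overlap` and the example masks are hypothetical.

```python
def rationale_overlap(predicted_mask, human_mask):
    """Token-level precision, recall, and F1 between two binary masks of equal length."""
    if len(predicted_mask) != len(human_mask):
        raise ValueError("masks must cover the same tokens")
    true_pos = sum(1 for p, h in zip(predicted_mask, human_mask) if p and h)
    n_predicted = sum(1 for p in predicted_mask if p)
    n_human = sum(1 for h in human_mask if h)
    precision = true_pos / n_predicted if n_predicted else 0.0
    recall = true_pos / n_human if n_human else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Made-up masks over a six-token review: 1 marks a token chosen as rationale.
predicted = [0, 1, 1, 1, 0, 0]
annotated = [0, 0, 1, 1, 1, 0]
precision, recall, f1 = rationale_overlap(predicted, annotated)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Precision penalizes selecting tokens the annotators did not mark (for example injected punctuation), while recall penalizes missing annotated tokens, so reporting both separates the two failure modes.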

This review was generated by AI and reviewed by human editors.