QUICK REVIEW

[Paper Review] Rethinking Softmax with Cross-Entropy: Neural Network Classifier as Mutual Information Estimator

Zhenyue Qin, Dongwoo Kim | arXiv (Cornell University) | Nov 25, 2019
Adversarial Robustness in Machine Learning | References: 29 | Citations: 37
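One-Sentence Summary

This paper shows that, under a uniform label distribution, training with softmax cross-entropy maximizes the mutual information between inputs and labels; it recasts neural network classifiers as mutual information estimators and introduces infoCAM to identify highly informative input regions.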

ABSTRACT

Mutual information is widely applied to learn latent representations of observations, whilst its implication in classification neural networks remain to be better explained. We show that optimising the parameters of classification neural networks with softmax cross-entropy is equivalent to maximising the mutual information between inputs and labels under the balanced data assumption. Through experiments on synthetic and real datasets, we show that softmax cross-entropy can estimate mutual information approximately. When applied to image classification, this relation helps approximate the point-wise mutual information between an input image and a label without modifying the network structure. To this end, we propose infoCAM, informative class activation map, which highlights regions of the input image that are the most relevant to a given label based on differences in information. The activation map helps localise the target object in an input image. Through experiments on the semi-supervised object localisation task with two real-world datasets, we evaluate the effectiveness of our information-theoretic approach.

Motivation and Objectives

  • Reinterpret neural network classifiers through an information-theoretic lens by relating softmax cross-entropy to mutual information.
  • Develop a practical MI-estimator view that can be used to assess feature informativeness in inputs for classification.
  • Introduce a probability-corrected softmax (PC-softmax) to handle imbalanced datasets while preserving MI estimation.
  • Propose and validate Informative Class Activation Map (infoCAM) for locating regions most informative to labels in images.

Proposed Method

  • Relate cross-entropy with softmax to a variational bound on mutual information and show equivalence under uniform label distribution.
  • Introduce PC-softmax to relax the uniform-label assumption and prove MI consistency with neural networks.
  • Define and compute pointwise mutual information (PMI) differences to quantify region-label informativeness.
  • Derive infoCAM by decomposing PMI differences across image regions to identify informative regions for weakly supervised object localization (WSOL).
  • Empirically compare MI estimators (softmax, MINE, MC) on synthetic data and real datasets (MNIST, CUB-200-2011) and evaluate classification performance.
  • Demonstrate WSOL improvements using infoCAM over traditional CAM across multiple architectures and datasets.
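
The first two steps can be made concrete with a small NumPy sketch. It assumes that PC-softmax has the form exp(f_y(x)) / Σ_y' p(y') exp(f_y'(x)), which this review does not spell out, and that the MI estimate is the average log PC-softmax score of the true label. The function names (`log_pc_softmax`, `mi_estimate`, `softmax_cross_entropy`) and the toy logits are illustrative and not taken from the paper's code.

```python
import numpy as np


def _logsumexp(a):
    """Numerically stable log(sum(exp(a))) over the last axis, keeping dims."""
    m = a.max(axis=-1, keepdims=True)
    return m + np.log(np.exp(a - m).sum(axis=-1, keepdims=True))


def log_pc_softmax(logits, class_priors):
    """Assumed PC-softmax in log space: f_y - log(sum_y' p(y') * exp(f_y'))."""
    logits = np.asarray(logits, dtype=float)
    log_priors = np.log(np.asarray(class_priors, dtype=float))
    return logits - _logsumexp(logits + log_priors)


def mi_estimate(logits, labels, class_priors):
    """Mean log PC-softmax score of the true label, in nats."""
    scores = log_pc_softmax(logits, class_priors)
    return scores[np.arange(len(labels)), labels].mean()


def softmax_cross_entropy(logits, labels):
    """Ordinary mean softmax cross-entropy, in nats."""
    logits = np.asarray(logits, dtype=float)
    log_probs = logits - _logsumexp(logits)
    return -log_probs[np.arange(len(labels)), labels].mean()


# Toy logits for 4 samples and 3 classes (made-up numbers).
logits = np.array([[2.0, 0.1, -1.0],
                   [0.3, 1.5, 0.2],
                   [-0.5, 0.0, 2.2],
                   [1.0, 1.0, 1.0]])
labels = np.array([0, 1, 2, 0])
uniform = np.full(3, 1.0 / 3.0)

# With uniform priors the estimate reduces to log(M) - cross-entropy,
# so minimizing cross-entropy maximizes the estimate.
print(mi_estimate(logits, labels, uniform))
print(np.log(3) - softmax_cross_entropy(logits, labels))
```

Under this assumed form, handling an imbalanced dataset only changes the `class_priors` argument; the network and its logits stay the same.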

Experimental Results

Research Questions

  • RQ1: Does cross-entropy with softmax maximize mutual information between inputs and labels, and under what conditions?
  • RQ2: Can PC-softmax provide consistent MI estimation on imbalanced data and improve classification performance?
  • RQ3: Can an information-theoretic activation map (infoCAM) better localize informative regions for labels than traditional CAM, especially in WSOL tasks?

Key Findings

  • Under a uniform label distribution, the infimum of the softmax cross-entropy matches the mutual information between inputs and labels up to sign and an additive constant, so minimizing one maximizes the other.
  • PC-softmax yields MI estimates that are competitive with the other estimators compared, including on imbalanced data.
  • On MNIST and CUB-200-2011, PC-softmax improves average per-class accuracy over softmax on imbalanced data, while accuracy on balanced data is comparable.
  • InfoCAM consistently outperforms CAM for weakly supervised object localization across multiple networks and datasets.
  • InfoCAM+ and ADL further enhance WSOL performance, with region-based PMI differences guiding localization.
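
To illustrate how region-based PMI differences can drive localization, here is a rough sketch for a CAM-style network (global average pooling followed by a bias-free linear layer). It relies on the observation that the softmax normalizer is shared by all classes, so it cancels when two PMI estimates for the same input are subtracted, leaving a difference of logits that splits across spatial locations. This is an assumed reading of the idea, not the paper's exact infoCAM definition; `pmi_difference_map`, the array shapes, and the random inputs are hypothetical.

```python
import numpy as np


def pmi_difference_map(features, weights, target):
    """Per-location score: target-class activation minus the mean over other classes.

    features: (K, H, W) feature maps from the last convolutional layer.
    weights:  (M, K) weights of the linear layer applied after global average pooling.
    target:   index of the class to localize.
    """
    features = np.asarray(features, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Standard class activation maps, one (H, W) map per class.
    cams = np.einsum("mk,khw->mhw", weights, features)
    # Average map of all non-target classes.
    others = np.delete(cams, target, axis=0).mean(axis=0)
    return cams[target] - others


# Random stand-ins for real feature maps and classifier weights.
rng = np.random.default_rng(0)
features = rng.random((8, 7, 7))
weights = rng.normal(size=(5, 8))
info_map = pmi_difference_map(features, weights, target=2)

# Sanity check: averaging the map over locations recovers the logit gap
# between the target class and the mean of the other classes.
logits = weights @ features.mean(axis=(1, 2))
gap = logits[2] - np.delete(logits, 2).mean()
print(info_map.shape)  # (7, 7)
print(np.isclose(info_map.mean(), gap))
```

A localization box would then typically be obtained by thresholding the map, as is common for CAM-based methods.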


This review was generated by AI and checked by a human editor.