QUICK REVIEW

[Paper Review] Dataset Condensation with Differentiable Siamese Augmentation

Bo Zhao, Hakan Bilen|arXiv (Cornell University)|Feb 16, 2021

Domain Adaptation and Few-Shot Learning | References: 56 | Citations: 51

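One-Line Summary

This paper introduces Differentiable Siamese Augmentation (DSA), which learns a small synthetic training set that, when used to train a network with data augmentation, approaches the performance of a model trained on the full dataset and outperforms prior methods on several benchmarks.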

ABSTRACT

In many machine learning problems, large-scale datasets have become the de-facto standard to train state-of-the-art deep networks at the price of heavy computation load. In this paper, we focus on condensing large training sets into significantly smaller synthetic sets which can be used to train deep neural networks from scratch with minimum drop in performance. Inspired from the recent training set synthesis methods, we propose Differentiable Siamese Augmentation that enables effective use of data augmentation to synthesize more informative synthetic images and thus achieves better performance when training networks with augmentations. Experiments on multiple image classification benchmarks demonstrate that the proposed method obtains substantial gains over the state-of-the-art, 7% improvements on CIFAR10 and CIFAR100 datasets. We show with only less than 1% data that our method achieves 99.6%, 94.9%, 88.5%, 71.5% relative performance on MNIST, FashionMNIST, SVHN, CIFAR10 respectively. We also explore the use of our method in continual learning and neural architecture search, and show promising results.

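Motivation and Objectives

Reduce training-set size while preserving performance by learning a compact synthetic dataset.
Use data augmentation within a principled, differentiable framework so that augmentation knowledge transfers from the real data to the synthetic data.
Develop a training procedure that jointly optimizes the synthetic data and the model parameters from scratch.
Show that the method scales across multiple architectures and datasets, and that it supports continual learning and neural architecture search.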

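Proposed Method

Adopt the Dataset Condensation (DC) framework, which matches the gradients computed on real data and on synthetic data.
Introduce Differentiable Siamese Augmentation (DSA), which applies the same differentiable transformation to both the real batch and the synthetic batch within a mini-batch.
Formulate a gradient-matching objective that minimizes the distance between the gradients, with respect to the network parameters, obtained from the augmented real data and the augmented synthetic data.
Implement the augmentations as differentiable layers so that the matching loss can be backpropagated through them to the synthetic images.
Use an outer loop over random network initializations so that the learned synthetic data can train networks from scratch regardless of the random seed.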
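
The role of the shared (Siamese) transformation can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the linear model, the mean-squared-error loss, the single scaling augmentation `scale_aug`, and the helper names. The paper works with image augmentations and deep networks. The sketch only shows that a cosine-style gradient-matching distance can be zero when real and synthetic batches share one augmentation parameter, and nonzero when the parameters differ, even if the two batches are identical.

```python
import math

def scale_aug(batch, s):
    """Toy differentiable augmentation (hypothetical): multiply every feature by s."""
    return [[s * v for v in x] for x in batch]

def mse_grad(w, xs, ys):
    """Gradient of the mean squared error of the linear model y = w.x w.r.t. w."""
    g = [0.0] * len(w)
    for x, y in zip(xs, ys):
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for j, xj in enumerate(x):
            g[j] += 2.0 * err * xj / len(xs)
    return g

def cosine_distance(g1, g2):
    """1 - cosine similarity between two gradient vectors."""
    dot = sum(a * b for a, b in zip(g1, g2))
    norm = math.sqrt(sum(a * a for a in g1)) * math.sqrt(sum(b * b for b in g2))
    return 1.0 - dot / (norm + 1e-12)

def matching_loss(w, real, syn, s_real, s_syn):
    """Distance between gradients on the augmented real and synthetic batches."""
    (x_real, y_real), (x_syn, y_syn) = real, syn
    g_real = mse_grad(w, scale_aug(x_real, s_real), y_real)
    g_syn = mse_grad(w, scale_aug(x_syn, s_syn), y_syn)
    return cosine_distance(g_real, g_syn)

w = [1.0, -1.0]
real = ([[1.0, 0.0], [0.0, 1.0]], [1.0, 1.0])
syn = ([[1.0, 0.0], [0.0, 1.0]], [1.0, 1.0])  # synthetic set equal to the real batch

# Siamese: one augmentation parameter shared by both batches.
siamese = matching_loss(w, real, syn, s_real=2.0, s_syn=2.0)
# Independent: each batch gets its own augmentation parameter.
independent = matching_loss(w, real, syn, s_real=1.0, s_syn=2.0)
print(siamese, independent)
```

In this toy setting the shared parameter gives a distance of essentially zero, while independent parameters leave a residual mismatch that no choice of synthetic data was responsible for.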

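Experimental Results

Research Questions

RQ1: Can a small synthetic dataset learned by gradient matching, together with differentiable Siamese augmentation, train networks from scratch to competitive accuracy?
RQ2: Does DSA provide consistent improvements over prior training-set condensation methods across multiple datasets and architectures?
RQ3: How do shared (Siamese) and independent augmentation affect the quality of the condensed data?
RQ4: How does the method perform in cross-architecture and cross-dataset settings?

Key Findings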

Dataset	Img/Cls	Ratio %	Coreset: Random	Coreset: Herding	Coreset: Forgetting	Synthesis: DD†	Synthesis: LD†	Synthesis: DC	Synthesis: DSA	Whole Dataset
MNIST	1	0.017	64.9 ± 3.5	89.2 ± 1.6			60.9 ± 3.2	91.7 ± 0.5	88.7 ± 0.6	99.6 ± 0.0
MNIST	10	0.17	95.1 ± 0.9	93.7 ± 0.3	68.1 ± 3.3	79.5 ± 8.1	87.3 ± 0.7	97.4 ± 0.2	97.8 ± 0.1	-
MNIST	50	0.83	97.9 ± 0.2	94.8 ± 0.2	88.2 ± 1.2	-	93.3 ± 0.3	98.8 ± 0.2	99.2 ± 0.1	-
FashionMNIST	1	0.017	51.4 ± 3.8	67.0 ± 1.9	42.0 ± 5.5	-	-	70.5 ± 0.6	70.6 ± 0.6	93.5 ± 0.1
FashionMNIST	10	0.17	73.8 ± 0.7	71.1 ± 0.7	53.9 ± 2.0	-	-	82.3 ± 0.4	84.6 ± 0.3	-
FashionMNIST	50	0.83	82.5 ± 0.7	71.9 ± 0.8	55.0 ± 1.1	-	-	83.6 ± 0.4	88.7 ± 0.2	-
SVHN	1	0.014	14.6 ± 1.6	20.9 ± 1.3	12.1 ± 1.7	-	-	31.2 ± 1.4	27.5 ± 1.4	95.4 ± 0.1
SVHN	10	0.14	35.1 ± 4.1	50.5 ± 3.3	16.8 ± 1.2	-	-	76.1 ± 0.6	79.2 ± 0.5	-
SVHN	50	0.7	70.9 ± 0.9	72.6 ± 0.8	27.2 ± 1.5	-	-	82.3 ± 0.3	84.4 ± 0.4	-
CIFAR10	1	0.02	14.4 ± 2.0	21.5 ± 1.2	13.5 ± 1.2	-	25.7 ± 0.7	28.3 ± 0.5	28.8 ± 0.7	84.8 ± 0.1
CIFAR10	10	0.2	26.0 ± 1.2	31.6 ± 0.7	23.3 ± 1.0	36.8 ± 1.2	38.3 ± 0.4	44.9 ± 0.5	52.1 ± 0.5	-
CIFAR10	50	1	43.4 ± 1.0	40.4 ± 0.6	23.3 ± 1.1	-	42.5 ± 0.4	53.9 ± 0.5	60.6 ± 0.5	-
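
The Ratio % column can be reproduced with a short calculation. This is a sketch under one assumption that the review does not state: the standard training-split sizes listed in `TRAIN_SIZE`. The ratio is the number of synthetic images (images per class times the number of classes) divided by the training-set size.

```python
# Standard training-split sizes (an assumption of this sketch; not given in the review).
TRAIN_SIZE = {"MNIST": 60000, "FashionMNIST": 60000, "SVHN": 73257, "CIFAR10": 50000}
NUM_CLASSES = 10  # each of the four datasets has 10 classes

def ratio_percent(dataset, img_per_class):
    """Synthetic-set size as a percentage of the full training set."""
    return 100.0 * img_per_class * NUM_CLASSES / TRAIN_SIZE[dataset]

for name in TRAIN_SIZE:
    print(name, [round(ratio_percent(name, k), 3) for k in (1, 10, 50)])
```

Even the largest setting, 50 images per class, uses at most 1% of the original training images.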

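DSA substantially outperforms the state of the art on CIFAR-10 and CIFAR-100, with absolute gains of about 7% in some settings.
In very-low-data regimes (under 1% of the training set), DSA retains high accuracy relative to whole-dataset training (e.g., 99.6% relative performance on MNIST with 50 images/class).
DSA-condensed data generalizes well across architectures, and synthetic sets learned with convolutional architectures transfer best to other architectures.
Ablation studies show that Siamese augmentation (a shared transformation) consistently outperforms non-Siamese, independent augmentation schemes, and that cropping yields notable gains.
Combining multiple augmentations gives the best performance across datasets, although some augmentations can hurt on noisy datasets (e.g., SVHN).
Against DC specifically, the CIFAR-10/100 gains reach roughly 7 points; on CIFAR10 the table shows DSA at 52.1 vs. DC at 44.9 with 10 images/class, and 60.6 vs. 53.9 with 50 images/class.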
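
The relative-performance figures quoted in the abstract (99.6%, 94.9%, 88.5%, 71.5%) can be checked against the table. The sketch below assumes that the final method column of the table holds the DSA results and uses the 50 images/class rows; relative performance is taken to mean condensed-set accuracy divided by whole-dataset accuracy.

```python
# (accuracy with 50 condensed images/class, whole-dataset accuracy), in percent,
# read from the table; treating the last method column as DSA is an assumption.
RESULTS = {
    "MNIST": (99.2, 99.6),
    "FashionMNIST": (88.7, 93.5),
    "SVHN": (84.4, 95.4),
    "CIFAR10": (60.6, 84.8),
}

def relative_performance(condensed_acc, whole_acc):
    """Condensed-set accuracy as a percentage of whole-dataset accuracy."""
    return 100.0 * condensed_acc / whole_acc

relative = {
    name: round(relative_performance(c, w), 1) for name, (c, w) in RESULTS.items()
}
print(relative)
```

Under these assumptions the computed values agree with the four percentages in the abstract.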

This review was generated by AI and checked by a human editor.