QUICK REVIEW

[Paper Review] Dreaming to Distill: Data-free Knowledge Transfer via DeepInversion

Hongxu Yin, Pavlo Molchanov|arXiv (Cornell University)|Dec 18, 2019

Domain Adaptation and Few-Shot Learning | References: 75 | Cited by: 48

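One-sentence summary

DeepInversion synthesizes high-fidelity, class-conditional images from a pretrained CNN without using any training data, enabling data-free knowledge transfer, pruning, and continual learning; Adaptive DeepInversion increases image diversity by exploiting teacher-student disagreement.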

ABSTRACT

We introduce DeepInversion, a new method for synthesizing images from the image distribution used to train a deep neural network. We 'invert' a trained network (teacher) to synthesize class-conditional input images starting from random noise, without using any additional information about the training dataset. Keeping the teacher fixed, our method optimizes the input while regularizing the distribution of intermediate feature maps using information stored in the batch normalization layers of the teacher. Further, we improve the diversity of synthesized images using Adaptive DeepInversion, which maximizes the Jensen-Shannon divergence between the teacher and student network logits. The resulting synthesized images from networks trained on the CIFAR-10 and ImageNet datasets demonstrate high fidelity and degree of realism, and help enable a new breed of data-free applications - ones that do not require any real images or labeled data. We demonstrate the applicability of our proposed method to three tasks of immense practical importance -- (i) data-free network pruning, (ii) data-free knowledge transfer, and (iii) data-free continual learning. Code is available at https://github.com/NVlabs/DeepInversion

Research motivation and goals

Enable data-free knowledge transfer when the training data is unavailable or private.
Introduce DeepInversion to synthesize class-conditional images from a fixed teacher network.

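Proposed method

Invert a trained CNN, starting from random noise, to synthesize class-conditional images.
Regularize the intermediate feature maps with the per-layer Batch Normalization (BN) running statistics so that they match the statistics of the training data.
Introduce a feature distribution regularization term that aligns the per-layer mean and variance of the synthesized images with the BN running statistics.
In Adaptive DeepInversion, add a competition loss based on the Jensen-Shannon divergence to encourage teacher-student disagreement and broaden image coverage.
Apply the synthesized images to data-free pruning, data-free knowledge transfer, and data-free continual learning.
Provide hardware-aware pruning by incorporating latency into the filter importance measure, so that performance is maintained with fewer resources.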
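
The feature distribution regularizer described above can be illustrated with a small sketch. It is not the authors' implementation: it assumes the term is the sum, over layers, of the L2 distance between the batch's per-channel mean and the stored running mean plus the L2 distance between the per-channel variance and the stored running variance. The names `channel_stats`, `l2_distance`, and `bn_feature_loss` are hypothetical, and plain Python lists stand in for real feature tensors.

```python
import math


def channel_stats(features):
    """Return per-channel mean and biased variance.

    `features` is a list of channels; each channel is a flat list holding
    every activation of that channel across the batch and spatial positions.
    """
    means, variances = [], []
    for values in features:
        mean = sum(values) / len(values)
        var = sum((x - mean) ** 2 for x in values) / len(values)
        means.append(mean)
        variances.append(var)
    return means, variances


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def bn_feature_loss(layers):
    """Sum of mean and variance mismatches over all layers.

    `layers` is a list of (features, running_mean, running_var) tuples,
    where the running statistics play the role of the values stored in the
    teacher's BN layers (an assumption of this sketch).
    """
    total = 0.0
    for features, running_mean, running_var in layers:
        means, variances = channel_stats(features)
        total += l2_distance(means, running_mean)
        total += l2_distance(variances, running_var)
    return total


# Toy layer with two channels: channel 0 has mean 1 and variance 1,
# channel 1 has mean 1 and variance 0.
features = [[0.0, 2.0], [1.0, 1.0]]
matched = bn_feature_loss([(features, [1.0, 1.0], [1.0, 0.0])])
mismatched = bn_feature_loss([(features, [0.0, 0.0], [1.0, 1.0])])
```

In an actual inversion loop this quantity would be computed on the teacher's intermediate activations and minimized with respect to the input image, alongside the classification loss; the toy values here only show that the loss is zero when the batch statistics match the stored ones and grows as they drift apart.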

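Experimental results

Research questions

RQ1: Can high-fidelity, class-conditional images be synthesized when the original training data is unavailable?
RQ2: To what extent can data-free synthesized images support knowledge transfer and pruning without real data?
RQ3: Does bringing the student into the loop through Adaptive DeepInversion increase image diversity and downstream task performance?
RQ4: How does data-free knowledge transfer compare with distillation that uses the original dataset on CIFAR-10 and ImageNet?
RQ5: Can data-free methods enable continual learning without the original data?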

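Main findings

DeepInversion uses the BN running statistics of a pretrained network to produce high-fidelity, class-conditional images.
Adaptive DeepInversion further improves image diversity by maximizing teacher-student disagreement.
Data-free pruning achieves competitive accuracy and latency gains while reducing the dependence on the original data.
Data-free knowledge transfer with synthesized data approaches the teacher's performance on CIFAR-10 and reaches comparable accuracy on ImageNet with minimal loss.
Data-free continual learning improves incremental performance and approaches oracle methods that have access to the original data.
Overall, the method makes data-free applications practical, with no need for real images or labels.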
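
The teacher-student disagreement that Adaptive DeepInversion maximizes can be sketched as a loss to minimize. This is a hedged illustration rather than the paper's code: it assumes the competition term is written as one minus the Jensen-Shannon divergence between the softmax outputs of the two networks, and the function names below are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) in nats; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL to the mixture distribution."""
    mixture = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)


def competition_loss(teacher_logits, student_logits):
    """1 - JS(teacher, student); minimizing it rewards disagreement.

    The "1 - JS" form is an assumption of this sketch.
    """
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    return 1.0 - js_divergence(p, q)


# Identical predictions give the largest loss; opposite predictions lower it.
agree = competition_loss([2.0, 0.0], [2.0, 0.0])
disagree = competition_loss([2.0, 0.0], [0.0, 2.0])
```

With natural logarithms the divergence is bounded above by ln 2, so the loss stays between 1 - ln 2 and 1. Images on which the student already matches the teacher receive the highest loss, which pushes the synthesis toward inputs the student has not yet learned.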

This review was generated by AI and checked by human editors.