QUICK REVIEW

[Paper Review] Improving Fast Segmentation With Teacher-student Learning

Jiafeng Xie, Bing Shuai|arXiv (Cornell University)|Oct 19, 2018

Advanced Neural Network Applications | References: 16 | Cited by: 51

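One-sentence summary

The paper uses a teacher-student learning framework that transfers zero-order and first-order knowledge from a strong teacher to a fast student, on both labeled and unlabeled data, to improve fast segmentation models without adding any inference cost.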

ABSTRACT

Recently, segmentation neural networks have been significantly improved by demonstrating very promising accuracies on public benchmarks. However, these models are very heavy and generally suffer from low inference speed, which limits their application scenarios in practice. Meanwhile, existing fast segmentation models usually fail to obtain satisfactory segmentation accuracies on public benchmarks. In this paper, we propose a teacher-student learning framework that transfers the knowledge gained by a heavy and better performed segmentation network (i.e. teacher) to guide the learning of fast segmentation networks (i.e. student). Specifically, both zero-order and first-order knowledge depicted in the fine annotated images and unlabeled auxiliary data are transferred to regularize our student learning. The proposed method can improve existing fast segmentation models without incurring extra computational overhead, so it can still process images with the same fast speed. Extensive experiments on the Pascal Context, Cityscape and VOC 2012 datasets demonstrate that the proposed teacher-student learning framework is able to significantly boost the performance of student network.

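Motivation and objectives

Real-time and resource-constrained settings need segmentation that is both fast and accurate.
Propose a teacher-student framework that regularizes the learning of a fast student with knowledge derived from the teacher.
Extend the framework to exploit unlabeled data through pseudo ground truth generated by the teacher.
Demonstrate gains on several benchmark datasets (Pascal Context, Cityscapes, VOC 2012) without increasing inference cost.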

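Proposed method

Define a fast student S and a fixed, heavyweight teacher T, and optimize L = L_S + r(S, T), where L_S is the student's supervised segmentation loss and r(S, T) is the teacher-derived regularization term.
Regularize S with zero-order knowledge through a probability loss L_p computed between the teacher and student outputs.
Regularize S with first-order knowledge through a consistency loss L_c, which aligns the boundary information in the teacher and student outputs.
Distill knowledge on the finely annotated data, and extend to unlabeled data by using teacher-generated pseudo labels as ground truth.
When unlabeled data is used, train with L = L_LabeledData + λ L_unlabeledData to jointly optimize over both data regimes.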
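The objective described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: this review does not give the exact form of L_p or L_c, so the sketch assumes a squared L2 distance between per-pixel class distributions for L_p, and for L_c a comparison of how much each pixel differs from its right and lower neighbours in the teacher versus the student. All function names and the weights `alpha`, `beta`, and `lam` are hypothetical and do not come from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def prob_map(logit_map):
    """Turn an H x W x C nested list of logits into per-pixel probabilities."""
    return [[softmax(px) for px in row] for row in logit_map]

def sq_dist(a, b):
    """Squared L2 distance between two class distributions."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def probability_loss(p_s, p_t):
    """Zero-order term (assumed form): mean squared distance between outputs."""
    h, w = len(p_s), len(p_s[0])
    return sum(sq_dist(p_s[i][j], p_t[i][j])
               for i in range(h) for j in range(w)) / (h * w)

def neighbour_diffs(p):
    """First-order signal (assumed form): distance to right and lower neighbours."""
    h, w = len(p), len(p[0])
    diffs = []
    for i in range(h):
        for j in range(w):
            for di, dj in ((0, 1), (1, 0)):
                ni, nj = i + di, j + dj
                if ni < h and nj < w:
                    diffs.append(sq_dist(p[i][j], p[ni][nj]))
    return diffs

def consistency_loss(p_s, p_t):
    """First-order term: student neighbour differences should match the teacher's."""
    d_s, d_t = neighbour_diffs(p_s), neighbour_diffs(p_t)
    if not d_s:
        return 0.0
    return sum((a - b) ** 2 for a, b in zip(d_s, d_t)) / len(d_s)

def cross_entropy(p, labels):
    """Supervised loss L_S: mean negative log-probability of the labelled class."""
    h, w = len(p), len(p[0])
    return -sum(math.log(max(p[i][j][labels[i][j]], 1e-12))
                for i in range(h) for j in range(w)) / (h * w)

def pseudo_labels(p_t):
    """Teacher-generated pseudo ground truth: arg-max class per pixel."""
    return [[max(range(len(px)), key=px.__getitem__) for px in row] for row in p_t]

def student_loss(p_s, p_t, labels, alpha=1.0, beta=1.0):
    """L = L_S + r(S, T), with r assumed to be alpha * L_p + beta * L_c."""
    return (cross_entropy(p_s, labels)
            + alpha * probability_loss(p_s, p_t)
            + beta * consistency_loss(p_s, p_t))

def total_loss(labeled, unlabeled, lam=1.0):
    """L = L_LabeledData + lam * L_unlabeledData, using pseudo labels when unlabeled."""
    p_s, p_t, y = labeled
    u_s, u_t = unlabeled
    return (student_loss(p_s, p_t, y)
            + lam * student_loss(u_s, u_t, pseudo_labels(u_t)))
```

Because the teacher is fixed and only appears inside the loss, nothing in this sketch changes the student's forward pass, which is consistent with the claim that inference cost stays the same. A real implementation would compute these terms on batched tensors in a deep learning framework rather than on nested lists.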

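Experimental results

Research questions

RQ1: Can knowledge from a teacher network improve fast segmentation models without increasing inference cost?
RQ2: Does combining zero-order (probability) and first-order (consistency) knowledge improve student learning more than either one alone?
RQ3: Can unlabeled data further improve performance through teacher-generated supervision, without manual annotation?
RQ4: How does the method perform on standard segmentation benchmarks and with different teacher/student backbones?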

Key findings

Model	mIoU (%)	Speed (FPS)
ResNet-101-DeepLab-v2 (teacher)	48.5	16.7
MobileNet-1.0-DeepLab-v2	40.9	46.5
MobileNet-1.0-DeepLab-v2 (Lp)	42.3	46.5
MobileNet-1.0-DeepLab-v2 (Lp+Lc)	42.8	46.5
MobileNet-1.0-DeepLab-v2 (Lp+Lc+UnlabeledData)	43.8	46.5
FCN-8s	37.8	N/A
ParseNet	40.4	N/A
UoA-Context + CRF	43.3	< 1
DAG-RNN	42.6	9.8
DAG-RNN + CRF	43.7	< 1

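On Pascal Context, the enhanced MobileNet-1.0-DeepLab-v2 (with L_p, L_c, and unlabeled data) reaches 43.8% mIoU at 46.5 FPS, compared with 40.9% mIoU for the baseline.
In the ablation study, L_p alone raises mIoU to 42.3% and adding L_c raises it to 42.8%; unlabeled data adds a further 1.0 point, for 43.8% mIoU.
With a fixed high-capacity teacher (ResNet-101 DeepLab-v2) and a MobileNet-based student, the student reaches 71.9% mIoU on the Cityscapes validation set (up from 67.3%) while keeping fast inference (20.6 FPS).
On the VOC 2012 validation set, the enhanced MobileNet-1.0-DeepLab-v2 reaches 69.6% mIoU, 2.3 points above the baseline student.
Across the three datasets, the framework consistently improves student performance with no additional computational overhead.
The size of the gain generally grows with the teacher-student performance gap, which suggests that the knowledge transfer is effective.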
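As a quick check on the Pascal Context figures, the per-step gains and the speed ratio can be recomputed directly from the table. The numbers below are copied from the table above; the variable names are illustrative only.

```python
# Pascal Context ablation rows (mIoU in %), copied from the table above.
ablation = [
    ("baseline", 40.9),
    ("+ L_p", 42.3),
    ("+ L_p + L_c", 42.8),
    ("+ L_p + L_c + unlabeled data", 43.8),
]

def step_gains(rows):
    """mIoU gain in points of each row over the row before it."""
    return [(name, round(miou - prev, 1))
            for (_, prev), (name, miou) in zip(rows, rows[1:])]

gains = step_gains(ablation)

# Overall improvement of the full method over the baseline student.
total_gain = round(ablation[-1][1] - ablation[0][1], 1)

# Teacher figures from the same table.
teacher_miou, teacher_fps = 48.5, 16.7
student_fps = 46.5

# How many times faster the student runs than the teacher.
speedup = round(student_fps / teacher_fps, 2)

# Share of the teacher-baseline mIoU gap recovered by the full method, in %.
gap_closed = round(total_gain / (teacher_miou - ablation[0][1]) * 100, 1)

for name, gain in gains:
    print(f"{name}: +{gain} points")
print(f"total: +{total_gain} points, speedup: {speedup}x, gap closed: {gap_closed}%")
```

The steps add 1.4, 0.5, and 1.0 points, for 2.9 points in total. The student runs about 2.78 times faster than the teacher and recovers roughly 38% of the 7.6-point mIoU gap between the baseline student and the teacher.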

This review was generated by AI and checked by a human editor.