QUICK REVIEW

[Paper Review] Dirty Pixels: Optimizing Image Classification Architectures for Raw Sensor Data

Steven Diamond, Vincent Sitzmann | arXiv (Cornell University) | Jan 23, 2017

Image and Signal Denoising Methods | References: 45 | Cited by: 68

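ONE-SENTENCE SUMMARY

This paper proposes an end-to-end differentiable architecture that jointly optimizes denoising, deblurring, and image classification on raw sensor data, substantially improving classification accuracy in low light and under noise. Unlike conventional approaches, it learns a processing pipeline tailored to the classification task, preserving fine detail at the cost of more noise and artifacts.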

ABSTRACT

Real-world sensors suffer from noise, blur, and other imperfections that make high-level computer vision tasks like scene segmentation, tracking, and scene understanding difficult. Making high-level computer vision networks robust is imperative for real-world applications like autonomous driving, robotics, and surveillance. We propose a novel end-to-end differentiable architecture for joint denoising, deblurring, and classification that makes classification robust to realistic noise and blur. The proposed architecture dramatically improves the accuracy of a classification network in low light and other challenging conditions, outperforming alternative approaches such as retraining the network on noisy and blurry images and preprocessing raw sensor inputs with conventional denoising and deblurring algorithms. The architecture learns denoising and deblurring pipelines optimized for classification whose outputs differ markedly from those of state-of-the-art denoising and deblurring methods, preserving fine detail at the cost of more noise and artifacts. Our results suggest that the best low-level image processing for computer vision is different from existing algorithms designed to produce visually pleasing images. The principles used to design the proposed architecture easily extend to other high-level computer vision tasks and image formation models, providing a general framework for integrating low-level and high-level image processing.

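RESEARCH MOTIVATION AND OBJECTIVES

Address the challenge of robust image classification under real-world sensor degradations such as noise and blur.
Develop a joint optimization framework that integrates low-level image restoration with high-level classification in an end-to-end differentiable manner.
Overcome the limitations of conventional denoising and deblurring algorithms, which are optimized for visual quality rather than classification performance.
Show that the best low-level processing for computer vision differs fundamentally from methods designed for perceptual quality.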

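PROPOSED METHOD

Trains the architecture end to end through a differentiable pipeline, jointly optimizing image classification and restoration of the raw sensor input.
Integrates learnable denoising and deblurring modules that are optimized jointly with the classification head during backpropagation.
Designs the restoration components to preserve the fine-grained semantic detail that classification depends on, even when this introduces perceptible artifacts.
Uses a unified loss function that combines a classification cross-entropy loss with a reconstruction loss on the raw image data.
Trains on sensor data with realistic noise and blur, making the model robust to low light and degraded conditions.
The framework is general and can be adapted to other high-level vision tasks and image formation models.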
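The joint training described above can be illustrated with a deliberately tiny sketch. It is not the authors' implementation: the 1-D toy "images", the box-blur-plus-Gaussian-noise sensor model, the 3-tap learnable filter standing in for the restoration modules, the logistic classifier standing in for the classification network, and every name (`simulate_sensor`, `restore`, `recon_weight`, and so on) are assumptions made for illustration. What it does show is the mechanism: one loss, combining cross-entropy and reconstruction error, sends gradients through the classifier back into the restoration filter.

```python
import math
import random

N = 8  # length of the toy 1-D "image" (assumption for illustration)

def make_sample(rng):
    """Toy data: class 1 carries a one-pixel detail, class 0 is flat."""
    label = rng.randint(0, 1)
    clean = [0.5] * N
    if label == 1:
        clean[N // 2] += 1.0
    return clean, label

def simulate_sensor(clean, rng, noise_std=0.3):
    """Assumed degradation: 3-tap box blur, then additive Gaussian noise."""
    padded = [clean[0]] + list(clean) + [clean[-1]]
    blurred = [sum(padded[i:i + 3]) / 3.0 for i in range(len(clean))]
    return [b + rng.gauss(0.0, noise_std) for b in blurred]

def restore(raw, kernel):
    """Learnable restoration stage: a 3-tap filter with zero padding."""
    padded = [0.0] + list(raw) + [0.0]
    return [sum(kernel[j] * padded[i + j] for j in range(3))
            for i in range(len(raw))]

def forward(params, raw, clean, label, recon_weight):
    """Return (total loss, restored signal, predicted probability)."""
    kernel, weights, bias = params
    restored = restore(raw, kernel)
    logit = sum(w * r for w, r in zip(weights, restored)) + bias
    prob = 1.0 / (1.0 + math.exp(-logit))
    # Binary cross-entropy written via softplus for numerical stability.
    softplus = max(logit, 0.0) + math.log1p(math.exp(-abs(logit)))
    cls_loss = softplus - label * logit
    rec_loss = sum((r - c) ** 2 for r, c in zip(restored, clean)) / len(clean)
    return cls_loss + recon_weight * rec_loss, restored, prob

def gradients(params, raw, clean, label, recon_weight):
    """Backpropagate the unified loss through classifier and restoration."""
    kernel, weights, bias = params
    _, restored, prob = forward(params, raw, clean, label, recon_weight)
    n = len(raw)
    d_logit = prob - label
    d_weights = [d_logit * r for r in restored]
    # The restored signal receives gradient from BOTH loss terms.
    d_restored = [d_logit * w + recon_weight * 2.0 * (r - c) / n
                  for w, r, c in zip(weights, restored, clean)]
    padded = [0.0] + list(raw) + [0.0]
    d_kernel = [sum(d_restored[i] * padded[i + j] for i in range(n))
                for j in range(3)]
    return d_kernel, d_weights, d_logit

def sgd_step(params, grads, lr):
    """Plain gradient descent update on all parameters at once."""
    kernel, weights, bias = params
    d_kernel, d_weights, d_bias = grads
    return ([k - lr * g for k, g in zip(kernel, d_kernel)],
            [w - lr * g for w, g in zip(weights, d_weights)],
            bias - lr * d_bias)

def train(steps=2000, lr=0.05, recon_weight=0.1, seed=0):
    """Jointly train the restoration filter and the classifier."""
    rng = random.Random(seed)
    params = ([0.0, 1.0, 0.0], [0.0] * N, 0.0)  # identity filter, zero classifier
    for _ in range(steps):
        clean, label = make_sample(rng)
        raw = simulate_sensor(clean, rng)
        grads = gradients(params, raw, clean, label, recon_weight)
        params = sgd_step(params, grads, lr)
    return params
```

Calling `train()` returns the learned filter taps, classifier weights, and bias. Setting `recon_weight=0.0` leaves a purely classification-driven objective, in which the restoration filter is free to trade visual fidelity for accuracy; the weighting between the two terms is an assumption here, since this summary does not state one.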

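EXPERIMENTAL RESULTS

Research Questions

RQ1: Does joint end-to-end optimization of denoising, deblurring, and classification improve robustness in low light and under noise compared with separate processing or fine-tuning?
RQ2: How does a classification network trained with a jointly optimized restoration pipeline perform compared with fine-tuning on noisy data or using conventional preprocessing?
RQ3: What trade-off exists between perceptual image quality and classification accuracy in low-level image processing for computer vision?
RQ4: To what extent do the restored features that are optimal for classification differ from those produced by state-of-the-art image restoration methods?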

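Key Findings

The proposed architecture achieves significantly higher classification accuracy than fine-tuning the network on noisy and blurry images.
It also outperforms preprocessing with conventional denoising and deblurring algorithms, even though those algorithms produce outputs of higher visual quality.
The learned restoration pipeline introduces more noise and artifacts than standard methods but preserves the fine-grained semantic detail that classification depends on.
The results indicate that the best low-level processing for computer vision is not the same as processing designed for visual quality, challenging the assumption that perceptual fidelity should be the primary goal.
The framework is general and extends to other high-level vision tasks and image formation models.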
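Comparing pipelines on both visual quality and classification performance requires two numbers per pipeline. The sketch below assumes PSNR as the image-quality score (this summary does not name a metric) and plain top-1 accuracy as the classification score; the function names and the flat-list image representation are illustrative choices, not taken from the paper.

```python
import math

def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)

def accuracy(predicted, actual):
    """Fraction of predicted labels that match the ground-truth labels."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)
```

Under the findings above, a jointly trained pipeline would be expected to score lower on a quality measure such as `psnr` while scoring higher on `accuracy` than a conventional denoise-then-classify baseline; the sketch only defines the two measurements and reports no values.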


This review was generated by AI and reviewed by a human editor.