QUICK REVIEW

[Paper Review] End-to-end Alternating Optimization for Blind Super Resolution

Zhengxiong Luo, Yan Huang|arXiv (Cornell University)|May 14, 2021

Advanced Image Processing Techniques | References: 50 | Citations: 30

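One-line summary

This paper proposes the Deep Alternating Network (DAN), an end-to-end unfolded framework for blind super resolution that jointly estimates the blur kernel and restores the super-resolved image. It improves the compatibility between estimation and restoration and achieves state-of-the-art results at higher speed.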

ABSTRACT

Previous methods decompose the blind super-resolution (SR) problem into two sequential steps: i) estimating the blur kernel from given low-resolution (LR) image and ii) restoring the SR image based on the estimated kernel. This two-step solution involves two independently trained models, which may not be well compatible with each other. A small estimation error of the first step could cause a severe performance drop of the second one. While on the other hand, the first step can only utilize limited information from the LR image, which makes it difficult to predict a highly accurate blur kernel. Towards these issues, instead of considering these two steps separately, we adopt an alternating optimization algorithm, which can estimate the blur kernel and restore the SR image in a single model. Specifically, we design two convolutional neural modules, namely Restorer and Estimator. Restorer restores the SR image based on the predicted kernel, and Estimator estimates the blur kernel with the help of the restored SR image. We alternate these two modules repeatedly and unfold this process to form an end-to-end trainable network. In this way, Estimator utilizes information from both LR and SR images, which makes the estimation of the blur kernel easier. More importantly, Restorer is trained with the kernel estimated by Estimator, instead of the ground-truth kernel, thus Restorer could be more tolerant to the estimation error of Estimator. Extensive experiments on synthetic datasets and real-world images show that our model can largely outperform state-of-the-art methods and produce more visually favorable results at a much higher speed. The source code is available at https://github.com/greatlog/DAN.git.
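
For reference, the alternation described in the abstract can be written compactly. This is a sketch under the usual blind SR degradation model. The symbols are notational assumptions rather than anything stated in the abstract: x is the HR image, y the LR image, k the blur kernel, the down arrow denotes downsampling by a factor s, n is noise, and phi is an image prior. The choice of norm is left unspecified.

```latex
\begin{align}
\mathbf{y} &= (\mathbf{x} \otimes \mathbf{k})\downarrow_s + \mathbf{n}, \\
\mathbf{x}_{i+1} &= \arg\min_{\mathbf{x}} \bigl\| \mathbf{y} - (\mathbf{x} \otimes \mathbf{k}_i)\downarrow_s \bigr\| + \phi(\mathbf{x})
  \;\approx\; \mathrm{Restorer}(\mathbf{y}, \mathbf{k}_i), \\
\mathbf{k}_{i+1} &= \arg\min_{\mathbf{k}} \bigl\| \mathbf{y} - (\mathbf{x}_{i+1} \otimes \mathbf{k})\downarrow_s \bigr\|
  \;\approx\; \mathrm{Estimator}(\mathbf{y}, \mathbf{x}_{i+1}).
\end{align}
```

Each argmin is replaced by a learned module, so unfolding a fixed number of rounds yields a single network that can be trained end to end.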

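Motivation and objectives

Motivate blind super resolution, where the blur kernel is unknown and varies from image to image.
Propose an end-to-end architecture that alternates between kernel estimation and SR restoration within a single model.
Improve the compatibility between kernel estimation and SR restoration and limit error propagation.
Demonstrate superior quantitative and qualitative performance on synthetic and real-world images while improving speed.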

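Proposed method

Introduce two CNN modules: the Estimator (kernel estimator) and the Restorer (SR reconstructor).
Formulate blind SR as an alternating optimization problem and unfold it into a trainable network (DAN) whose parameters are shared across iterations.
Use dual-path conditional blocks (DPCB) and dual-path conditional groups (DPCG) to fuse the basic input and the conditional input efficiently, without heavy concatenation.
Predict the full blur kernel through a Softmax, which constrains the kernel to sum to 1.
Train end to end with supervision on the final iteration only; intermediate results are left unconstrained, which encourages convergence.
In practice, use four fixed alternating iterations, initialize the kernel with a Dirac delta, and reshape and PCA-reduce the kernel before feeding it to the model.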
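
The unfolded alternation described above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: `toy_restorer` and `toy_estimator` are hypothetical stand-ins for the two CNN modules, the 5x5 kernel size is an arbitrary choice, and the reshaping and PCA reduction of the kernel are omitted.

```python
import math

def dirac_kernel(size):
    """Return a size x size kernel with all of its mass at the center."""
    kernel = [[0.0] * size for _ in range(size)]
    kernel[size // 2][size // 2] = 1.0
    return kernel

def softmax_kernel(scores):
    """Map unconstrained scores to a non-negative kernel that sums to 1."""
    peak = max(v for row in scores for v in row)  # subtract the max for stability
    exps = [[math.exp(v - peak) for v in row] for row in scores]
    total = sum(sum(row) for row in exps)
    return [[e / total for e in row] for row in exps]

def alternate(lr, restorer, estimator, kernel_size=5, num_iters=4):
    """Run a fixed number of Restorer/Estimator rounds from a Dirac kernel."""
    kernel = dirac_kernel(kernel_size)
    sr = None
    for _ in range(num_iters):
        sr = restorer(lr, kernel)                   # restore SR from the current kernel
        kernel = softmax_kernel(estimator(lr, sr))  # re-estimate the kernel from LR and SR
    return sr, kernel

# Hypothetical stand-ins so the sketch runs; the real modules are CNNs.
def toy_restorer(lr, kernel):
    return [row[:] for row in lr]  # identity "restoration"

def toy_estimator(lr, sr, size=5):
    center = size // 2
    # Scores that peak at the center, loosely resembling a Gaussian blur.
    return [[-((i - center) ** 2 + (j - center) ** 2) / 2.0
             for j in range(size)] for i in range(size)]

lr_image = [[0.1, 0.2], [0.3, 0.4]]
sr_image, est_kernel = alternate(lr_image, toy_restorer, toy_estimator)
print(round(sum(sum(row) for row in est_kernel), 6))
```

Because the Softmax is applied to the Estimator's raw scores, every predicted kernel is non-negative and normalized regardless of what the Estimator outputs, and the Restorer only ever sees kernels of that form.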

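Experimental results

Research questions

RQ1: Can an end-to-end network jointly estimate the blur kernel and perform blind SR more effectively than two-step methods?
RQ2: Does information from both the LR and SR images help the Estimator predict better kernels, and does joint training make the Restorer tolerant of the Estimator's errors?
RQ3: Do architectural innovations such as the Dual-Path Conditional Block improve the performance and efficiency of the Estimator and the Restorer?
RQ4: How does supervising the Estimator with the full kernel, as opposed to a kernel in the reduced space, affect final SR quality?
RQ5: How do the DAN variants compare under isotropic Gaussian blur and irregular blur degradations?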

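Key findings

The end-to-end DAN with alternating optimization substantially outperforms state-of-the-art two-step blind SR methods on synthetic data (IKC in particular) and also performs well on real images.
DANv1 outperforms IKC by 3.22 dB on Urban100 at scale factor 3, which shows the value of end-to-end training.
DANv2, which features DPCB and improved Estimator supervision, improves further: at scale factor 4, DANv2 outperforms DANv1 by 1.19 dB.
The dual-path design speeds up inference and stabilizes training, yielding faster and more robust kernel and HR estimation.
The Estimator is supervised with the full kernel (rather than in the reduced space), and a Softmax guarantees that the kernel elements sum to 1, which improves kernel realism and convergence.
The model effectively uses information from both the LR and SR images to estimate the kernel, making the overall system more tolerant of estimation errors.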
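
To put the dB gaps above in perspective, a PSNR difference can be converted to a ratio of mean squared errors. The sketch below assumes the reported gaps are PSNR differences measured with the same peak value for both methods, which this review does not state explicitly; the function names are illustrative.

```python
import math

def psnr(mse, peak=1.0):
    """PSNR in dB for a given mean squared error and peak signal value."""
    return 10.0 * math.log10(peak ** 2 / mse)

def mse_ratio_from_gain(gain_db):
    """Factor by which the MSE shrinks when PSNR rises by gain_db."""
    return 10.0 ** (gain_db / 10.0)

for gain in (3.22, 1.19):
    ratio = mse_ratio_from_gain(gain)
    print(f"{gain} dB -> MSE lower by a factor of {ratio:.2f}")
```

Under that assumption, a 3.22 dB gain corresponds to roughly halving the mean squared error, and a 1.19 dB gain to a reduction of about a quarter.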


This review was generated by AI and checked by a human editor.