QUICK REVIEW

[Paper Review] AnomalyDiffusion: Few-Shot Anomaly Image Generation with Diffusion Model

Teng Hu, Jiangning Zhang|arXiv (Cornell University)|Dec 10, 2023

Anomaly Detection Techniques and Applications|Cited by 8

One-Line Summary

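AnomalyDiffusion is a diffusion-based few-shot anomaly image generation framework. It uses Spatial Anomaly Embedding to disentangle anomaly appearance from location, and Adaptive Attention Re-weighting to generate authentic anomaly image-mask pairs in which the anomaly is accurately aligned with its mask, which improves downstream anomaly inspection tasks.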

ABSTRACT

Anomaly inspection plays an important role in industrial manufacture. Existing anomaly inspection methods are limited in their performance due to insufficient anomaly data. Although anomaly generation methods have been proposed to augment the anomaly data, they either suffer from poor generation authenticity or inaccurate alignment between the generated anomalies and masks. To address the above problems, we propose AnomalyDiffusion, a novel diffusion-based few-shot anomaly generation model, which utilizes the strong prior information of latent diffusion model learned from large-scale dataset to enhance the generation authenticity under few-shot training data. Firstly, we propose Spatial Anomaly Embedding, which consists of a learnable anomaly embedding and a spatial embedding encoded from an anomaly mask, disentangling the anomaly information into anomaly appearance and location information. Moreover, to improve the alignment between the generated anomalies and the anomaly masks, we introduce a novel Adaptive Attention Re-weighting Mechanism. Based on the disparities between the generated anomaly image and normal sample, it dynamically guides the model to focus more on the areas with less noticeable generated anomalies, enabling generation of accurately-matched anomalous image-mask pairs. Extensive experiments demonstrate that our model significantly outperforms the state-of-the-art methods in generation authenticity and diversity, and effectively improves the performance of downstream anomaly inspection tasks. The code and data are available in https://github.com/sjtuplayer/anomalydiffusion.

Motivation and Objectives

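Improve anomaly inspection in manufacturing, where anomaly data is scarce.
Propose a diffusion-based method that generates authentic and diverse anomaly samples from a small number of examples.
Disentangle anomaly appearance from location to control the type and placement of anomalies.
Improve the alignment between generated anomalies and masks to support localization and classification tasks.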

Proposed Method

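Use a pre-trained Latent Diffusion Model (LDM) to transfer prior knowledge from large-scale data, enabling few-shot anomaly generation.
Introduce Spatial Anomaly Embedding, which conditions generation on a combination of an anomaly embedding (appearance) and a spatial embedding (location).
Apply masked textual inversion to learn anomaly appearance tokens while focusing on the anomalous regions.
Encode anomaly masks into spatial embeddings with a dedicated spatial encoder shared across categories.
Incorporate Adaptive Attention Re-weighting, which during denoising dynamically emphasizes regions where the generated anomaly is less noticeable, improving mask alignment.
Generate anomalies on normal samples through a diffusion process that blends the anomalous region with the normal content according to the mask.
Optionally generate anomaly masks via a learned mask embedding to supplement the training data.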
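
The two mask-driven steps above (mask-guided blending and adaptive re-weighting) can be illustrated with a minimal sketch. This is not the authors' implementation: it works on plain 2D lists instead of latent tensors, the function names `blend_with_mask` and `reweight_map` and the parameter `eps` are hypothetical, and the inverse-squared-difference weighting is an assumed stand-in for whatever exact formula the paper uses. It only shows the idea that pixels inside the mask that still look like the normal sample receive more weight.

```python
def blend_with_mask(anomaly, normal, mask):
    """Keep anomaly values where mask is 1 and normal values where mask is 0."""
    return [
        [m * a + (1.0 - m) * n for a, n, m in zip(row_a, row_n, row_m)]
        for row_a, row_n, row_m in zip(anomaly, normal, mask)
    ]


def reweight_map(generated, normal, mask, eps=1e-6):
    """Weight masked pixels inversely to how much they differ from the normal sample.

    Assumption (illustrative only): weight is proportional to
    1 / (squared difference + eps). Weights are 0 outside the mask and
    average to 1 inside it, so less noticeable anomaly pixels get weights above 1.
    """
    scores = {}
    for i, row in enumerate(mask):
        for j, m in enumerate(row):
            if m > 0:
                diff = (generated[i][j] - normal[i][j]) ** 2
                scores[(i, j)] = 1.0 / (diff + eps)

    weights = [[0.0] * len(row) for row in mask]
    total = sum(scores.values())
    for (i, j), score in scores.items():
        # Normalize so the mean weight over masked pixels is 1.
        weights[i][j] = len(scores) * score / total
    return weights


# Toy 2x2 example: the top row is the anomaly mask.
normal_img = [[0.0, 0.0], [0.0, 0.0]]
generated_img = [[1.0, 0.1], [0.5, 0.0]]
anomaly_mask = [[1, 1], [0, 0]]

attention_weights = reweight_map(generated_img, normal_img, anomaly_mask)
blended_img = blend_with_mask(generated_img, normal_img, anomaly_mask)
```

In the toy example, the pixel at row 0, column 1 barely differs from the normal image, so it receives a larger weight than the pixel at row 0, column 0, and the blended image keeps the generated values only in the masked top row.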

Experimental Results

Research Questions

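RQ1: Can authentic and diverse anomaly images that align well with the given masks be generated from scarce anomaly examples?
RQ2: Does disentangling anomaly appearance from location improve the controllability of generated anomalies?
RQ3: Can Adaptive Attention Re-weighting improve the alignment between generated anomalies and anomaly masks during diffusion synthesis?
RQ4: Do the generated anomaly samples improve downstream anomaly inspection tasks such as detection, localization, and classification?
RQ5: Is mask generation via learned embeddings effective for augmenting training data for anomaly tasks?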

Key Findings

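AnomalyDiffusion outperforms state-of-the-art anomaly generation models in generation authenticity and diversity on the MVTec dataset.
The Adaptive Attention Re-weighting mechanism yields high-quality anomaly images that are tightly aligned with the input masks.
The generated anomaly data substantially improves downstream anomaly inspection performance, including pixel-level localization.
On MVTec, a simple U-Net trained on the generated data achieves a pixel-level localization AUROC of 99.1% and an AP of 81.4%.
A large number of anomaly image-mask pairs can be generated from a few real anomaly samples, and the Spatial Anomaly Embedding allows control over anomaly placement and appearance.
When used to train a classifier, the generated anomalies yield strong anomaly category classification performance, surpassing several baselines.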
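
For readers unfamiliar with the pixel-level AUROC and AP figures quoted above, the following sketch shows how such metrics are commonly computed from per-pixel anomaly scores and binary ground-truth labels. It is not the paper's evaluation code: the function names `pixel_auroc` and `pixel_average_precision` are hypothetical, inputs are assumed to be flattened lists, and the AP routine assumes no tied scores.

```python
def pixel_auroc(scores, labels):
    """AUROC via the Mann-Whitney U statistic; tied scores share an average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs both anomalous and normal pixels")
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def pixel_average_precision(scores, labels):
    """Mean precision at each anomalous pixel, scanning by descending score.

    Assumes no tied scores; ties would need threshold-level grouping.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("AP needs at least one anomalous pixel")
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / n_pos


# Toy example: four pixels, two of them anomalous (label 1).
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
toy_auroc = pixel_auroc(toy_scores, toy_labels)
toy_ap = pixel_average_precision(toy_scores, toy_labels)
```

In practice both metrics would be computed over all pixels of all test images, with predicted anomaly maps and ground-truth masks flattened into the two lists.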

This review was generated by AI and checked by a human editor.