QUICK REVIEW

[Paper Review] Inpaint Anything: Segment Anything Meets Image Inpainting

Changyuan Yu, Runseng Feng|arXiv (Cornell University)|Apr 13, 2023

Generative Adversarial Networks and Image Synthesis | Cited by 59

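One-Sentence Summary

This paper introduces Inpaint Anything (IA), a mask-free inpainting pipeline that combines SAM with state-of-the-art inpainting models and AIGC models to remove, fill, or replace content using simple clicks and text prompts.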

ABSTRACT

Modern image inpainting systems, despite the significant progress, often struggle with mask selection and holes filling. Based on Segment-Anything Model (SAM), we make the first attempt to the mask-free image inpainting and propose a new paradigm of "clicking and filling", which is named as Inpaint Anything (IA). The core idea behind IA is to combine the strengths of different models in order to build a very powerful and user-friendly pipeline for solving inpainting-related problems. IA supports three main features: (i) Remove Anything: users could click on an object and IA will remove it and smooth the "hole" with the context; (ii) Fill Anything: after certain objects removal, users could provide text-based prompts to IA, and then it will fill the hole with the corresponding generative content via driving AIGC models like Stable Diffusion; (iii) Replace Anything: with IA, users have another option to retain the click-selected object and replace the remaining background with the newly generated scenes. We are also very willing to help everyone share and promote new projects based on our Inpaint Anything (IA). Our codes are available at https://github.com/geekyutao/Inpaint-Anything.

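Motivation and Objectives

Advance a mask-free inpainting paradigm by leveraging segmentation and generative models.
Propose a user-friendly workflow for object removal, content filling, and background replacement.
Show that combining foundation models improves the flexibility and accessibility of inpainting tasks.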

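Proposed Method

Use the Segment Anything Model (SAM) to obtain accurate object masks from simple clicks.
Remove objects with an advanced inpainting model (e.g., LaMa) applied to the refined mask.
Incorporate AIGC models (e.g., Stable Diffusion) to generate fill or background-replacement content from text prompts.
Provide three workflows: Remove Anything, Fill Anything, and Replace Anything.
Apply mask refinement (e.g., dilation) and fidelity-preserving resizing to improve results.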
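
The mask-handling steps above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names (`dilate`, `invert`, `inpaint_region`), the square dilation window, and the choice to dilate only in the remove and fill modes are assumptions made for illustration. In practice the mask would come from SAM and the dilation would typically be done with an image library.

```python
from typing import List

# A binary mask as nested lists: 1 = pixel to regenerate, 0 = pixel to keep.
Mask = List[List[int]]


def dilate(mask: Mask, radius: int = 1) -> Mask:
    """Grow the masked region by `radius` pixels using a square window."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                # Mark every neighbour inside the window, clipped to the image.
                for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                    for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                        out[ny][nx] = 1
    return out


def invert(mask: Mask) -> Mask:
    """Swap object and background so the background becomes the target."""
    return [[1 - v for v in row] for row in mask]


def inpaint_region(object_mask: Mask, mode: str, radius: int = 1) -> Mask:
    """Return the region to regenerate for a given mode (hypothetical helper).

    Assumption: "remove" and "fill" regenerate the (dilated) object,
    while "replace" keeps the object and regenerates everything else.
    """
    if mode in ("remove", "fill"):
        return dilate(object_mask, radius)
    if mode == "replace":
        return invert(object_mask)
    raise ValueError(f"unknown mode: {mode}")


# Usage: a 5x5 mask with a single selected pixel in the centre.
clicked = [[0] * 5 for _ in range(5)]
clicked[2][2] = 1
remove_region = inpaint_region(clicked, "remove")
replace_region = inpaint_region(clicked, "replace")
```

Dilating the mask before removal gives the inpainting model a small margin around the object, so pixels on the object's boundary are regenerated rather than left behind.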

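Experimental Results

Research Questions

RQ1: Can SAM-generated masks be used effectively in a mask-free inpainting workflow with minimal user input?
RQ2: How can object removal, content generation, and background replacement be integrated into a single IA pipeline?
RQ3: What practical considerations (mask refinement, resolution handling, prompts) are needed to achieve high-quality inpainting outputs across diverse images?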

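Key Findings

IA provides three functional modes, Remove Anything, Fill Anything, and Replace Anything, through a click-plus-prompt interface.
The pipeline combines SAM, LaMa, and Stable Diffusion to produce high-quality inpainting results across diverse content and resolutions.
Mask refinement (dilation) and fidelity-preserving resizing improve the alignment and detail of inpainted regions.
Experiments on COCO, the LaMa test set, and mobile-phone photos show that IA is general and robust across a range of aspect ratios and image sizes (up to 2K).
The method demonstrates effective mask-free inpainting by using existing foundation models in a compositional workflow.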
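
One way to read "fidelity-preserving resizing" is to scale an image to fit a generative model's working resolution while keeping its aspect ratio. The sketch below only computes the target size. The helper name `fit_size`, the 512-pixel limit, and the snap to multiples of 8 are assumptions for illustration, not values taken from the review.

```python
from typing import Tuple


def fit_size(width: int, height: int,
             max_side: int = 512, multiple: int = 8) -> Tuple[int, int]:
    """Scale (width, height) so the long side is at most `max_side`,
    keeping the aspect ratio, then round each side down to a multiple
    of `multiple`. Images already small enough are not upscaled."""
    long_side = max(width, height)
    if long_side > max_side:
        # Integer arithmetic avoids floating-point rounding surprises.
        width = width * max_side // long_side
        height = height * max_side // long_side

    def snap(v: int) -> int:
        return max(multiple, v // multiple * multiple)

    return snap(width), snap(height)


# Usage: a 2K-wide landscape photo and a small image that needs no scaling.
large = fit_size(2048, 1024)
small = fit_size(400, 300)
```

Rounding down keeps the result within the limit at the cost of a small change in aspect ratio for sizes that are not already multiples of 8.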

This review was generated by AI and checked by a human editor.