QUICK REVIEW

[Paper Review] A Survey on Generative Modeling with Limited Data, Few Shots, and Zero Shot

Milad Abdollahzadeh, Guimeng Liu | arXiv (Cornell University) | Jul 26, 2023

Machine Learning in Healthcare | Citations: 10

One-Line Summary

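A comprehensive survey of generative modeling under data constraints (GM-DC), detailing the tasks, data constraints, approaches, challenges, and future directions across GANs, VAEs, and diffusion models.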

ABSTRACT

Generative modeling in machine learning aims to synthesize new data samples that are statistically similar to those observed during training. While conventional generative models such as GANs and diffusion models typically assume access to large and diverse datasets, many real-world applications (e.g. in medicine, satellite imaging, and artistic domains) operate under limited data availability and strict constraints. In this survey, we examine Generative Modeling under Data Constraint (GM-DC), which includes limited-data, few-shot, and zero-shot settings. We present a unified perspective on the key challenges in GM-DC, including overfitting, frequency bias, and incompatible knowledge transfer, and discuss how these issues impact model performance. To systematically analyze this growing field, we introduce two novel taxonomies: one categorizing GM-DC tasks (e.g. unconditional vs. conditional generation, cross-domain adaptation, and subject-driven modeling), and another organizing methodological approaches (e.g. transfer learning, data augmentation, meta-learning, and frequency-aware modeling). Our study reviews over 230 papers, offering a comprehensive view across generative model types and constraint scenarios. We further analyze task-approach-method interactions using a Sankey diagram and highlight promising directions for future work, including adaptation of foundation models, holistic evaluation frameworks, and data-centric strategies for sample selection. This survey provides a timely and practical roadmap for researchers and practitioners aiming to advance generative modeling under limited data. Project website: https://sutd-visual-computing-group.github.io/gmdc-survey/.

Motivation and Objectives

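Examine the background and motivation for GM-DC in domains where data is hard to acquire (e.g., healthcare).
Introduce two taxonomies, one for GM-DC tasks and one for GM-DC approaches, and analyze how they interact.
Highlight trends, gaps, and future directions in GM-DC research.
Provide an organized summary of GM-DC research and a project website with an interactive visualization of the research landscape.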

Proposed Method

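Review and synthesize the GM-DC literature across GANs, VAEs, and diffusion models.
Propose a taxonomy of GM-DC tasks (uGM-1 through cGM-3, IGM, SGM) and map each task to its corresponding data constraints.
Propose a taxonomy of approaches (Transfer Learning, Data Augmentation, Network Architectures, Multi-Task Objectives, Frequency Exploitation, Meta-Learning, and others).
Analyze the interaction between GM-DC tasks and approaches using visualizations such as a Sankey diagram and charts.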
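
The task-approach interaction analysis can be pictured as counting how many methods fall under each (task, approach) pair; those counts are the link weights a Sankey diagram would draw. The following is a minimal sketch under stated assumptions, not the survey's actual pipeline: the task and approach names come from the taxonomies listed above, while the `records` list, the method names, the pairings, and the `sankey_links` helper are all hypothetical placeholders made up for illustration.

```python
from collections import Counter

# Hypothetical (method, task, approach) records. These pairings are
# illustrative placeholders only and are NOT taken from the survey.
records = [
    ("method_a", "uGM-1", "Transfer Learning"),
    ("method_b", "uGM-1", "Data Augmentation"),
    ("method_c", "cGM-3", "Transfer Learning"),
    ("method_d", "uGM-1", "Transfer Learning"),
    ("method_e", "SGM", "Meta-Learning"),
]


def sankey_links(records):
    """Count how many methods connect each (task, approach) pair."""
    counts = Counter((task, approach) for _, task, approach in records)
    links = [
        {"source": task, "target": approach, "value": n}
        for (task, approach), n in counts.items()
    ]
    # Sort by descending weight, then by names, so the output is stable.
    return sorted(links, key=lambda l: (-l["value"], l["source"], l["target"]))


links = sankey_links(records)
for link in links:
    print(f'{link["source"]} -> {link["target"]}: {link["value"]}')
```

Each resulting dictionary is one Sankey link (source node, target node, weight), a shape that most plotting libraries can consume after mapping node names to indices.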

Figure 1. Research Landscape of GM-DC. The figure shows the interaction between GM-DC tasks and approaches (main and sub categories), and GM-DC methods. Tasks are defined in our proposed taxonomy in Tab. 2, and approaches in our proposed approaches taxonomy table. An interactive versi

Experimental Results

Research Questions

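RQ1: Which GM-DC tasks have been studied, and how are they defined across unconditional, conditional, and cross-domain settings?
RQ2: Which data-constraint regimes (limited data, LD; few-shot, FS; zero-shot, ZS) are prevalent, and how do they influence methodological choices?
RQ3: Which GM-DC approaches make effective use of knowledge transfer, data augmentation, architecture design, and meta-learning under data constraints?
RQ4: What are the main challenges and open problems in GM-DC, and which directions look most promising for future research?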

Key Findings

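GM-DC research spans multiple generative model families (GANs, VAEs, and diffusion models) and covers a range of data-constraint settings.
Two detailed taxonomies (GM-DC tasks and GM-DC approaches) organize the literature and expose how tasks and methods interact.
Transfer learning, data augmentation, and text- or language-guided adaptation are the core GM-DC strategies under limited data.
Meta-learning and multi-task objectives emerge as effective techniques for adapting to unseen domains and classes.
The survey highlights domain proximity, cross-domain adaptation, and knowledge transfer as key factors shaping GM-DC outcomes.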

Figure 2. Overall publications statistics in GM-DC. GM-DC Publications (Left): GM-DC publication trends indicate rising interest in this area. We remark that the previous survey (Li et al., 2022c) only covers ~27% of publications discussed in our survey. Publication Venues (Right): The dis

This review was generated by AI and checked by a human editor.