QUICK REVIEW

[Paper Review] Brain Imaging Generation with Latent Diffusion Models

Walter Hugo Lopez Pinaya, Petru-Daniel Tudosiu|arXiv (Cornell University)|Sep 15, 2022

Radiomics and Machine Learning in Medical Imaging | Cited by 26

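One-Sentence Summary

The paper trains latent diffusion models (LDMs) to generate high-resolution 3D brain MRIs conditioned on covariates, evaluates them with FID and diversity metrics, and releases a synthetic brain dataset of 100,000 samples.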

ABSTRACT

Deep neural networks have brought remarkable breakthroughs in medical image analysis. However, due to their data-hungry nature, the modest dataset sizes in medical imaging projects might be hindering their full potential. Generating synthetic data provides a promising alternative, allowing to complement training datasets and conducting medical image research at a larger scale. Diffusion models recently have caught the attention of the computer vision community by producing photorealistic synthetic images. In this study, we explore using Latent Diffusion Models to generate synthetic images from high-resolution 3D brain images. We used T1w MRI images from the UK Biobank dataset (N=31,740) to train our models to learn about the probabilistic distribution of brain images, conditioned on covariables, such as age, sex, and brain structure volumes. We found that our models created realistic data, and we could use the conditioning variables to control the data generation effectively. Besides that, we created a synthetic dataset with 100,000 brain images and made it openly available to the scientific community.

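Motivation and Objectives

Motivate synthetic data generation as a way to overcome the limited size of medical imaging datasets and privacy constraints.
Develop high-resolution 3D brain image generation with latent diffusion models in a way that scales to large datasets.
Condition generation on covariates (age, sex, ventricular volume, and brain volume) to produce realistic, controllable brain images.
Compare the LDM with GAN-based baselines on the realism and diversity of the generated images.
Make a large synthetic brain MRI dataset openly available to the community.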

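Proposed Method

Train a latent diffusion model on T1-weighted MRIs from the UK Biobank (N=31,740), operating on a 20×28×20 latent representation.
Compress the images with an autoencoder and train the diffusion model in the latent space, using a 1000-step forward process with a fixed variance schedule.
Condition image generation on age, sex, ventricular volume, and brain volume through concatenation and cross-attention (hybrid conditioning).
Evaluate sample quality with FID (computed on Med3D features) and diversity with MS-SSIM and 4-G-R-SSIM; compare against LSGAN and VAE-GAN baselines.
Apply DDIM to speed up sampling, reducing the number of steps from 1000 to 50 with minimal loss in quality.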
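The forward process and the DDIM step reduction described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the linear schedule shape and the endpoints `BETA_START` and `BETA_END` are assumptions (the summary only states 1000 steps and a fixed variance schedule), the random array stands in for an autoencoder latent, and `q_sample` and `ddim_timesteps` are hypothetical helper names.

```python
import numpy as np

# Assumed values for illustration only: the summary gives T = 1000 and a fixed
# variance schedule, but not its shape or endpoints.
T = 1000
BETA_START, BETA_END = 1e-4, 2e-2

betas = np.linspace(BETA_START, BETA_END, T)   # fixed (non-learned) variances
alphas_cumprod = np.cumprod(1.0 - betas)       # cumulative product, often written abar_t


def q_sample(z0, t, rng):
    """Draw z_t ~ q(z_t | z_0) = N(sqrt(abar_t) * z_0, (1 - abar_t) * I)."""
    noise = rng.standard_normal(z0.shape)
    abar = alphas_cumprod[t]
    return np.sqrt(abar) * z0 + np.sqrt(1.0 - abar) * noise, noise


def ddim_timesteps(num_train_steps, num_sample_steps):
    """Evenly spaced subsequence of the training timesteps, in descending order."""
    stride = num_train_steps // num_sample_steps
    return np.arange(0, num_train_steps, stride)[::-1]


rng = np.random.default_rng(0)
z0 = rng.standard_normal((20, 28, 20))   # stand-in for an encoded MRI latent
zt, eps = q_sample(z0, t=500, rng=rng)   # noised latent at an intermediate step
steps = ddim_timesteps(T, 50)            # the 50 timesteps a DDIM sampler would visit
```

The sampler itself (the learned denoising network and the DDIM update rule) is omitted; the sketch only shows why working on a 20×28×20 latent and visiting 50 of the 1000 timesteps makes generation much cheaper.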

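Experimental Results

Research Questions

RQ1: Can latent diffusion models generate high-resolution 3D brain MRIs that match the distribution of real data?
RQ2: To what extent do the conditioning variables (age, sex, ventricular volume, and brain volume) control the generated brain images?
RQ3: Does the LDM outperform GAN-based baselines in the realism and diversity of synthesized 3D brain MRIs?
RQ4: How feasible is it to scale high-resolution brain image generation by working with latent representations?
RQ5: What are the uses and impact for the community of openly releasing a large synthetic brain MRI dataset?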

Key Findings

Model	FID ↓	MS-SSIM ↓	4-G-R-SSIM ↓
LSGAN	0.0231	0.9997	0.9969
VAE-GAN	0.1576	0.9671	0.8719
LDM	0.0076	0.6555	0.3883
LDM + DDIM	0.0080	0.6704	0.3957
Real images	0.0005	0.6536	0.3909
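The FID column compares feature statistics of generated and real images. As a rough sketch of the underlying quantity, the Fréchet distance between two Gaussians fitted to feature sets can be computed as below. The features here are random placeholders rather than Med3D outputs, `frechet_distance` is a hypothetical helper name, and the scale of the result will not match the table values.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Fréchet distance between Gaussians fitted to two (n_samples, n_features) arrays."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # tr(sqrt(cov_a @ cov_b)) equals the sum of the square roots of the eigenvalues
    # of the product, which are real and non-negative for PSD covariances.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)


rng = np.random.default_rng(0)
real_feats = rng.standard_normal((500, 8))   # placeholder features, not Med3D outputs
shifted_feats = real_feats + 1.0             # same covariance, mean shifted by 1 per dimension

same = frechet_distance(real_feats, real_feats)        # identical sets: distance near 0
shifted = frechet_distance(real_feats, shifted_feats)  # only the mean term contributes
```

A lower value means the two feature distributions are closer, which is why the table marks FID with a downward arrow and why the row for real images sits near zero.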

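LDMs produced high-quality brain MRIs with sharp detail and realistic textures, and outperformed the GAN baselines in unconditional generation.
DDIM sampling substantially speeds up generation (50 steps versus 1000) with only a small loss in quality.
Conditioning on ventricular and brain volume gave effective control: the input ventricular volume and the volume measured by SynthSeg on the generated images were highly correlated (r = 0.972).
Age conditioning showed a fairly strong correlation between the input age and the age predicted from the generated images (r = 0.692).
Extrapolating the conditioning variables illustrated what the model had learned: for example, enlarged ventricles or signs of neurodegeneration appeared when ventricular or brain volume values were set outside the training range.
A synthetic dataset of 100,000 brain images was openly released for research use.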
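The conditioning results above are reported as correlations between the value requested at generation time and the value measured on the generated image. A minimal sketch of that check follows, using toy data only: the arrays are synthetic and do not reproduce the paper's measurements, and `pearson_r` is a hypothetical helper name.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


# Toy data only: 'requested' stands in for conditioning values fed to the model,
# 'measured' for values re-estimated from the generated images (for example by a
# segmentation tool). Neither array comes from the paper.
rng = np.random.default_rng(0)
requested = rng.uniform(0.0, 1.0, size=200)
measured = requested + rng.normal(0.0, 0.05, size=200)

r = pearson_r(requested, measured)
```

A value of r close to 1 indicates that the generated anatomy tracks the requested covariate closely; a lower value, as reported for age, indicates weaker but still present control.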


This review was generated by AI and checked by human editors.