QUICK REVIEW

[Paper Review] Scaling laws for decoding images from brain activity

Hubert Banville, Yohann Benchetrit|ArXiv.org|Jan 25, 2025

Neural Networks and Applications|Cited by 3

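One-Sentence Summary

This paper systematically compares single-trial image decoding across four non-invasive neuroimaging modalities (EEG, MEG, 3T fMRI, 7T fMRI) on 8 public datasets, in order to derive scaling laws with respect to the amount of data and the number of subjects.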

ABSTRACT

Generative AI has recently propelled the decoding of images from brain activity. How do these approaches scale with the amount and type of neural recordings? Here, we systematically compare image decoding from four types of non-invasive devices: electroencephalography (EEG), magnetoencephalography (MEG), high-field functional Magnetic Resonance Imaging (3T fMRI) and ultra-high field (7T) fMRI. For this, we evaluate decoding models on the largest benchmark to date, encompassing 8 public datasets, 84 volunteers, 498 hours of brain recording and 2.3 million brain responses to natural images. Unlike previous work, we focus on single-trial decoding performance to simulate real-time settings. This systematic comparison reveals three main findings. First, the most precise neuroimaging devices tend to yield the best decoding performances, when the size of the training sets are similar. However, the gain enabled by deep learning - in comparison to linear models - is obtained with the noisiest devices. Second, we do not observe any plateau of decoding performance as the amount of training data increases. Rather, decoding performance scales log-linearly with the amount of brain recording. Third, this scaling law primarily depends on the amount of data per subject. However, little decoding gain is observed by increasing the number of subjects. Overall, these findings delineate the path most suitable to scale the decoding of images from non-invasive brain recordings.

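Research Motivation and Objectives

Assess how decoding image embeddings from brain activity scales with the amount of data and the type of recording device.
Compare single-trial decoding performance across EEG, MEG, 3T fMRI, and 7T fMRI on a unified benchmark.
Determine how the amount of training data, the number of subjects, and test-time averaging affect decoding performance.
Measure decoding performance on latent image embeddings, and assess image retrieval and reconstruction.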

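Proposed Method

Use two brain-to-image architectures (one deep learning module for M/EEG, one for fMRI), plus a ridge linear baseline, to predict image embeddings from brain activity.
Train with a combination of a CLIP-like retrieval loss and a reconstruction loss, so that brain signals are mapped to image embeddings.
Evaluate on eight public datasets, measuring single-trial performance as the Pearson correlation between predicted and true embeddings.
Analyze scaling laws by varying the number of training trials and the number of subjects, as well as recording time and test-time averaging.
Reconstruct images by feeding the decoded embeddings into a pretrained diffusion generator.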
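
The training objective and the evaluation metric listed above can be sketched in a few lines. This is a minimal illustration in plain Python, not the authors' implementation. The function names, the `weight` and `temperature` parameters, and the one-directional form of the contrastive term are all assumptions made here; this review does not state how the two losses are weighted, nor over which axis the Pearson correlation is averaged.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def l2_normalize(v):
    # Assumes a non-zero vector; zero vectors are not handled in this sketch.
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def retrieval_loss(pred, target, temperature=0.1):
    """CLIP-like contrastive (InfoNCE) loss, one direction only.

    Each predicted embedding should be more similar to its own image
    embedding than to the other image embeddings in the batch.
    `temperature` is a hypothetical value, not taken from the paper.
    """
    pred_n = [l2_normalize(p) for p in pred]
    targ_n = [l2_normalize(t) for t in target]
    total = 0.0
    for i, p in enumerate(pred_n):
        logits = [dot(p, t) / temperature for t in targ_n]
        m = max(logits)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_sum_exp - logits[i]  # cross-entropy with label i
    return total / len(pred)

def reconstruction_loss(pred, target):
    """Mean squared error between predicted and true embeddings."""
    n_values = sum(len(p) for p in pred)
    squared = sum((a - b) ** 2 for p, t in zip(pred, target) for a, b in zip(p, t))
    return squared / n_values

def combined_loss(pred, target, weight=0.5, temperature=0.1):
    """Weighted sum of both terms; `weight` is an assumed mixing factor."""
    return (weight * retrieval_loss(pred, target, temperature)
            + (1.0 - weight) * reconstruction_loss(pred, target))

def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)
```

A batch whose predictions line up with the right images gets a lower retrieval loss than the same batch with the predictions shuffled, which is the behavior the contrastive term is meant to reward.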

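Experimental Results

Research Questions

RQ1: How does single-trial image decoding performance scale with the amount of brain data for EEG, MEG, 3T fMRI, and 7T fMRI?
RQ2: With similar amounts of training data, which neuroimaging modality gives the best decoding performance, and how does deep learning amplify or reduce the differences?
RQ3: How does increasing the number of subjects affect decoding performance, and is there a point of diminishing returns?
RQ4: How does test-time averaging affect decoding performance for each device?
RQ5: Do the decoded image embeddings support image retrieval and reconstruction, and how do these capabilities differ across devices?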

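Key Findings

Decoding performance peaks earlier for EEG and MEG and later for fMRI, owing to the slower timing of the hemodynamic response that fMRI measures.
Deep learning decoders substantially outperform the linear baseline, especially for noisier devices such as EEG and MEG.
Decoding performance scales log-linearly with the amount of brain recording; 7T fMRI shows the strongest scaling and the best overall performance.
Gains from additional data come mainly from more data per subject; adding more subjects yields limited improvement.
Test-time averaging yields consistent gains, with diminishing marginal returns as the number of repetitions grows.
Image retrieval and reconstruction are feasible with every device; reconstruction quality improves when embeddings are averaged over repetitions and subjects, and 7T fMRI gives the best reconstructions.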
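
Two of the findings above, log-linear scaling and the diminishing returns of test-time averaging, can be made concrete with a short sketch. The function names are hypothetical and the numbers are arbitrary toy values chosen to lie on an exact line; they are not results from the paper. The noise formula assumes independent, equal-variance noise across repetitions, which is a simplification.

```python
import math

def fit_log_linear(amounts, scores):
    """Least-squares fit of: score = intercept + slope * log10(amount)."""
    xs = [math.log10(a) for a in amounts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def average_repeats(repeats):
    """Element-wise mean of predicted embeddings for repeats of one image."""
    n = len(repeats)
    return [sum(values) / n for values in zip(*repeats)]

def noise_std_after_averaging(sigma, n_repeats):
    """Noise level left after averaging n independent repeats."""
    return sigma / math.sqrt(n_repeats)

# Toy values only (NOT from the paper): each tenfold increase in recording
# time adds a constant amount to the score.
toy_hours = [1.0, 10.0, 100.0]
toy_scores = [0.1, 0.2, 0.3]
intercept, slope = fit_log_linear(toy_hours, toy_scores)
```

Under a log-linear law, each fixed gain in performance costs a constant multiple of additional data rather than a constant number of hours. Under the independence assumption, averaging 4 repetitions halves the noise and averaging 16 only halves it again, consistent with the diminishing returns reported above.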

This review was generated by AI and reviewed by human editors.