QUICK REVIEW

[Paper Review] FEAST: Fully Connected Expressive Attention for Spatial Transcriptomics

Taejin Jeong, Joohyeok Kim|arXiv (Cornell University)|Mar 26, 2026
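Single-cell and spatial transcriptomics | Cited by 0
One-sentence summary

FEAST models tissue as a fully connected graph and uses negative-aware attention together with off-grid pseudo-spots to predict spatial gene expression from whole slide images, achieving state-of-the-art results and producing interpretable attention maps.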

ABSTRACT

Spatial Transcriptomics (ST) provides spatially-resolved gene expression, offering crucial insights into tissue architecture and complex diseases. However, its prohibitive cost limits widespread adoption, leading to significant attention on inferring spatial gene expression from readily available whole slide images. While graph neural networks have been proposed to model interactions between tissue regions, their reliance on pre-defined sparse graphs prevents them from considering potentially interacting spot pairs, resulting in a structural limitation in capturing complex biological relationships. To address this, we propose FEAST (Fully connected Expressive Attention for Spatial Transcriptomics), an attention-based framework that models the tissue as a fully connected graph, enabling the consideration of all pairwise interactions. To better reflect biological interactions, we introduce negative-aware attention, which models both excitatory and inhibitory interactions, capturing essential negative relationships that standard attention often overlooks. Furthermore, to mitigate the information loss from truncated or ignored context in standard spot image extraction, we introduce an off-grid sampling strategy that gathers additional images from intermediate regions, allowing the model to capture a richer morphological context. Experiments on public ST datasets show that FEAST surpasses state-of-the-art methods in gene expression prediction while providing biologically plausible attention maps that clarify positive and negative interactions. Our code is available at https://github.com/starforTJ/FEAST.

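Motivation and Objectives

  • Improve gene expression prediction in spatial transcriptomics by modeling interactions between all spots rather than relying on a sparse graph.
  • Develop the FEAST framework with negative-aware attention to capture both positive and negative biological relationships.
  • Mitigate the information loss caused by patch-based spot extraction through off-grid sampling and a hierarchical attention design.
  • Provide interpretable attention maps and validate performance against state-of-the-art baselines on public ST datasets.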

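Proposed Method

  • Model the tissue as a fully connected graph by applying self-attention over all spot pairs.
  • Introduce negative-aware attention to learn both positive and negative interactions.
  • Add a static positional bias to guide interactions between local and global attention heads.
  • Propose off-grid sampling to create pseudo-spots that capture intermediate morphological context.
  • Use two-stage local (k-NN) and global attention blocks to contain the computational cost of the additional spots.
  • Train with a mean squared error loss and evaluate with MSE, MAE, and PCC on cross-validated datasets.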
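To make the difference from standard softmax attention concrete, here is a minimal NumPy sketch of attention over all spot pairs with signed weights and a distance-based bias. This summary does not give FEAST's actual formulation, so everything specific here is an assumption for illustration: the function name `signed_attention`, the rule "sign of the raw score times a softmax over score magnitudes", and the bias `-bias_scale * distance`.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def signed_attention(feats, coords, w_q, w_k, w_v, bias_scale=1.0):
    """Attention over ALL spot pairs with weights that may be negative.

    Hypothetical formulation (not taken from the paper):
      - magnitude: softmax over |score| plus a static distance bias
      - sign: sign of the raw query-key score
    """
    q, k, v = feats @ w_q, feats @ w_k, feats @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])            # (n_spots, n_spots)

    # Static positional bias: distant spot pairs get a lower magnitude.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    bias = -bias_scale * dist

    weights = np.sign(scores) * softmax(np.abs(scores) + bias, axis=-1)
    return weights @ v, weights

rng = np.random.default_rng(0)
feats = rng.normal(size=(6, 8))        # 6 spots, 8-dim image features
coords = rng.uniform(size=(6, 2))      # spot positions on the slide
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
out, weights = signed_attention(feats, coords, w_q, w_k, w_v)
print(weights.round(2))                # mixed-sign weight matrix
```

With plain softmax every weight is non-negative, so a spot can only add information from other spots. In this sketch the absolute weights of each row still sum to one, but a negative entry subtracts the other spot's value vector, which is one simple way to represent an inhibitory relationship.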
Figure 1 : Conceptual overview of our proposed framework, illustrating how it addresses key challenges of prior ST methods. Existing approaches are limited by sparse graphs that miss potential interactions, standard positive-only relationships in attention that omit inhibitory interactions, and spar
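Off-grid sampling can be pictured as placing extra "pseudo-spots" between measured spots and cropping image patches there as well. The sketch below is only one plausible reading: the midpoint rule, the `max_dist` threshold, and the helper names `midpoint_pseudo_spots` and `crop_patch` are hypothetical and do not come from the paper or its repository. Coordinates are assumed to be in pixel units, given as (x, y).

```python
import numpy as np

def midpoint_pseudo_spots(coords, max_dist):
    """Return midpoints of all spot pairs no farther apart than max_dist.

    Hypothetical placement rule for pseudo-spots between neighbouring spots.
    """
    coords = np.asarray(coords, dtype=float)
    i, j = np.triu_indices(len(coords), k=1)           # each unordered pair once
    keep = np.linalg.norm(coords[i] - coords[j], axis=1) <= max_dist
    return (coords[i[keep]] + coords[j[keep]]) / 2.0

def crop_patch(image, center_xy, size):
    """Crop a size x size patch around center_xy from an H x W x C image.

    The image is zero-padded so patches near the border keep their shape.
    """
    half = size // 2
    x, y = np.round(center_xy).astype(int)
    padded = np.pad(image, ((half, half), (half, half), (0, 0)), mode="constant")
    return padded[y:y + size, x:x + size]

# A 2 x 2 grid of spots with unit spacing: only the four grid edges qualify,
# the two diagonals are longer than max_dist.
grid = [(0, 0), (1, 0), (0, 1), (1, 1)]
pseudo = midpoint_pseudo_spots(grid, max_dist=1.1)
print(pseudo)
```

Pseudo-spots have no measured expression, so in a setup like this they would serve only as additional context tokens for the attention layers, not as training targets.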

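Experimental Results

Research Questions

  • RQ1: Can fully connected attention capture the relevant spot-to-spot interactions in spatial transcriptomics better than sparse-graph GNNs?
  • RQ2: Do negative-aware attention and off-grid pseudo-spots improve gene expression prediction and interpretability over baseline methods?
  • RQ3: How does hierarchical attention balance performance and computation across its local and global stages?
  • RQ4: Can FEAST produce more biologically plausible attention maps that distinguish excitatory from inhibitory interactions?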

Main Findings

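| Method | ST-Net MSE | ST-Net MAE | ST-Net PCC | Her2ST MSE | Her2ST MAE | Her2ST PCC | SCC MSE | SCC MAE | SCC PCC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet+FCN | 0.1999 | 0.3448 | 0.5221 | 0.6623 | 0.6385 | 0.4629 | 0.6103 | 0.6290 | 0.4619 |
| BLEEP | 0.3756 | 0.4736 | 0.0784 | 0.7426 | 0.6591 | 0.2747 | 0.6079 | 0.6013 | 0.4176 |
| HisToGene | 0.3054 | 0.4336 | 0.1211 | 0.9452 | 0.7739 | 0.2062 | 0.3095 | 0.4367 | 0.1225 |
| Hist2ST | 0.3811 | 0.4822 | 0.1525 | 0.7843 | 0.7286 | 0.2479 | 1.0190 | 0.7639 | 0.3003 |
| THItoGene | 0.2925 | 0.4111 | 0.3666 | 0.8436 | 0.7069 | 0.3445 | 0.6798 | 0.6442 | 0.3897 |
| TRIPLEX | 0.1472 | 0.2943 | 0.2320 | 0.8982 | 0.6946 | 0.3927 | 0.4891 | 0.5356 | 0.5416 |
| MERGE | 0.1347 | 0.2834 | 0.6795 | 0.6422 | 0.6255 | 0.5037 | 0.5353 | 0.5838 | 0.5512 |
| FEAST (Ours) | 0.1177 | 0.2639 | 0.7155 | 0.5761 | 0.5782 | 0.5524 | 0.4501 | 0.5239 | 0.5811 |
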
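  • FEAST achieves state-of-the-art performance across three ST datasets, ranking first on seven of the nine evaluation metrics.
  • On Her2ST, FEAST reaches an MSE of 0.5761 and a PCC of 0.5524, surpassing the previous best results.
  • On SCC, FEAST attains a PCC of 0.5811, the highest among the evaluated methods.
  • Qualitatively, the attention maps produced by FEAST distinguish positive from negative interactions.
  • Off-grid sampling (pseudo-spots) markedly improves prediction for target spots when the surrounding context is sparse.
  • Ablation studies show that negative-aware attention and off-grid sampling both contribute to the performance gains.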
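The "seven of nine" claim can be checked directly against the results table, and the three metrics are simple to state in code. The sketch below copies the table values as reported. The metric helpers are generic definitions, and averaging the Pearson correlation per gene is an assumption, since this summary does not say how PCC is aggregated.

```python
import numpy as np

def mse(pred, true):
    return float(np.mean((pred - true) ** 2))

def mae(pred, true):
    return float(np.mean(np.abs(pred - true)))

def mean_gene_pcc(pred, true):
    """Pearson correlation per gene (column), averaged over genes.

    Per-gene averaging is an assumption; a constant column would divide by zero.
    """
    p = pred - pred.mean(axis=0)
    t = true - true.mean(axis=0)
    denom = np.sqrt((p ** 2).sum(axis=0) * (t ** 2).sum(axis=0))
    return float(np.mean((p * t).sum(axis=0) / denom))

COLUMNS = ["ST-Net MSE", "ST-Net MAE", "ST-Net PCC",
           "Her2ST MSE", "Her2ST MAE", "Her2ST PCC",
           "SCC MSE", "SCC MAE", "SCC PCC"]

# Values copied from the results table in this review.
RESULTS = {
    "ResNet+FCN":   [0.1999, 0.3448, 0.5221, 0.6623, 0.6385, 0.4629, 0.6103, 0.6290, 0.4619],
    "BLEEP":        [0.3756, 0.4736, 0.0784, 0.7426, 0.6591, 0.2747, 0.6079, 0.6013, 0.4176],
    "HisToGene":    [0.3054, 0.4336, 0.1211, 0.9452, 0.7739, 0.2062, 0.3095, 0.4367, 0.1225],
    "Hist2ST":      [0.3811, 0.4822, 0.1525, 0.7843, 0.7286, 0.2479, 1.0190, 0.7639, 0.3003],
    "THItoGene":    [0.2925, 0.4111, 0.3666, 0.8436, 0.7069, 0.3445, 0.6798, 0.6442, 0.3897],
    "TRIPLEX":      [0.1472, 0.2943, 0.2320, 0.8982, 0.6946, 0.3927, 0.4891, 0.5356, 0.5416],
    "MERGE":        [0.1347, 0.2834, 0.6795, 0.6422, 0.6255, 0.5037, 0.5353, 0.5838, 0.5512],
    "FEAST (Ours)": [0.1177, 0.2639, 0.7155, 0.5761, 0.5782, 0.5524, 0.4501, 0.5239, 0.5811],
}

def best_per_column(results, columns):
    """Lower is better for MSE and MAE, higher is better for PCC."""
    best = {}
    for c, name in enumerate(columns):
        pick = max if name.endswith("PCC") else min
        best[name] = pick(results, key=lambda method: results[method][c])
    return best

best = best_per_column(RESULTS, COLUMNS)
feast_wins = [name for name, method in best.items() if method == "FEAST (Ours)"]
print(f"FEAST is best on {len(feast_wins)} of {len(COLUMNS)} columns")
print({name: method for name, method in best.items() if method != "FEAST (Ours)"})
```

The two columns where FEAST is not first are SCC MSE and SCC MAE, where HisToGene reports the lowest errors (0.3095 and 0.4367) despite a much lower PCC of 0.1225.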
Figure 2 : The overall architecture of FEAST. The framework first extracts features and spot distance ( $\mathbf{B}_{h}$ ) from the input WSI. The features are then processed through $L$ stacked hierarchical attention layers. Each layer consists of two stages: (1) a FEAST Block applied to local $k$
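A rough sketch of the two-stage idea, assuming the local stage restricts each spot to its k nearest neighbours and the global stage then attends over all spots. How FEAST actually wires the two stages is not spelled out in this summary, so `knn_mask` and `masked_attention` are hypothetical helpers, and plain softmax attention without learned projections stands in for the FEAST block to keep the example short.

```python
import numpy as np

def knn_mask(coords, k):
    """Boolean (n, n) mask: True where column j is among row i's k nearest spots."""
    dist = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
    idx = np.argsort(dist, axis=1)[:, :k]   # normally includes the spot itself (distance 0)
    mask = np.zeros_like(dist, dtype=bool)
    np.put_along_axis(mask, idx, True, axis=1)
    return mask

def masked_attention(feats, mask):
    """Softmax attention in which masked-out pairs receive zero weight."""
    scores = feats @ feats.T / np.sqrt(feats.shape[1])
    scores = np.where(mask, scores, -np.inf)
    scores -= scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ feats, weights

rng = np.random.default_rng(0)
n_spots, k = 200, 8
coords = rng.uniform(size=(n_spots, 2))
feats = rng.normal(size=(n_spots, 16))

# Stage 1: local attention restricted to each spot's k nearest neighbours.
local_out, _ = masked_attention(feats, knn_mask(coords, k))
# Stage 2: global attention over all spot pairs.
global_out, _ = masked_attention(local_out, np.ones((n_spots, n_spots), dtype=bool))

print("pairs with non-zero weight, local stage: ", n_spots * k)
print("pairs with non-zero weight, global stage:", n_spots * n_spots)
```

This dense version still computes every score before masking, so it shows the sparsity pattern rather than the saving. An implementation that gathers only the k neighbours per spot would score n·k pairs in the local stage instead of n², which matters once pseudo-spots increase n.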


This review was generated by AI and reviewed by a human editor.