QUICK REVIEW

[Paper Review] CrossLLM-Mamba: Multimodal State Space Fusion of LLMs for RNA Interaction Prediction

Rabeya Tus Sadia, Qiang Ye|arXiv (Cornell University)|Feb 23, 2026

RNA and protein synthesis mechanisms|Cited by 0

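One-Sentence Summary

CrossLLM-Mamba reframes RNA interaction prediction as a bidirectional state-space alignment task, using BiMamba encoders to fuse BioLLM representations dynamically across modalities, and reports state-of-the-art results on RNA–protein, RNA–small molecule, and RNA–RNA interactions.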

ABSTRACT

Accurate prediction of RNA-associated interactions is essential for understanding cellular regulation and advancing drug discovery. While Biological Large Language Models (BioLLMs) such as ESM-2 and RiNALMo provide powerful sequence representations, existing methods rely on static fusion strategies that fail to capture the dynamic, context-dependent nature of molecular binding. We introduce CrossLLM-Mamba, a novel framework that reformulates interaction prediction as a state-space alignment problem. By leveraging bidirectional Mamba encoders, our approach enables deep "crosstalk" between modality-specific embeddings through hidden state propagation, modeling interactions as dynamic sequence transitions rather than static feature overlaps. The framework maintains linear computational complexity, making it scalable to high-dimensional BioLLM embeddings. We further incorporate Gaussian noise injection and Focal Loss to enhance robustness against hard-negative samples. Comprehensive experiments across three interaction categories, RNA-protein, RNA-small molecule, and RNA-RNA, demonstrate that CrossLLM-Mamba achieves state-of-the-art performance. On the RPI1460 benchmark, our model attains an MCC of 0.892, surpassing the previous best by 5.2%. For binding affinity prediction, we achieve Pearson correlations exceeding 0.95 on riboswitch and repeat RNA subtypes. These results establish state-space modeling as a powerful paradigm for multi-modal biological interaction prediction.

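Motivation and Objectives

Improve RNA-associated interaction prediction beyond static fusion by modeling the dynamic crosstalk between modalities.
Propose a bidirectional state-space fusion framework (BiMamba) that lets information flow continuously between modality embeddings.
Maintain linear computational complexity so the approach scales to high-dimensional BioLLM embeddings.
Improve robustness to hard negatives and class imbalance through Gaussian noise injection and Focal Loss.
Demonstrate generalization across three interaction categories: RNA–protein, RNA–RNA, and RNA–small molecule.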
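
To make the role of Focal Loss concrete, here is a minimal sketch of the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). The values gamma = 2.0 and alpha = 0.25 are common defaults from the focal loss literature and are assumptions here; this review does not state the settings used in the paper. The function name `focal_loss` is hypothetical.

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for one predicted probability p and a label y in {0, 1}.

    gamma and alpha are assumed defaults, not values reported for CrossLLM-Mamba.
    """
    p_t = p if y == 1 else 1.0 - p              # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    p_t = min(max(p_t, eps), 1.0)               # keep log() finite
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

# A confidently correct ("easy") positive versus a poorly scored ("hard") positive.
easy = focal_loss(0.95, 1)
hard = focal_loss(0.30, 1)
print(easy, hard)
```

The factor (1 - p_t)^gamma shrinks the contribution of well-classified samples, so the hard example dominates the loss. With gamma = 0 the expression reduces to alpha-weighted cross-entropy.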

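Proposed Method

Encode proteins with ESM-2, RNA with RiNALMo, and small molecules with MoleBERT to obtain modality-specific embeddings.
Project the embeddings into a shared latent space with a linear projection plus Gaussian noise: X_A = W_A E_A + b_A + N(0, sigma^2) and X_B = W_B E_B + b_B + N(0, sigma^2).
Apply a bidirectional Mamba (BiMamba) encoder to each modality to capture forward and backward context, producing X_enc.
Build the Cross-Mamba fusion sequence by stacking [X_A_enc, X_B_enc], then process it with a BiMamba mixer to model the interaction flow (S_mixed).
Aggregate with global mean pooling and predict the interaction probability or binding affinity with a multilayer perceptron; classification uses Focal Loss, and affinity regression uses a combined MSE-Pearson loss.
Cross-modal fusion runs in linear time, avoiding the quadratic scaling of cross-attention.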
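
The data flow in the steps above can be sketched in NumPy. This is a toy illustration, not the authors' implementation, and it rests on several assumptions:

- A fixed linear recurrence h_t = a * h_{t-1} + B x_t stands in for the selective Mamba block.
- Forward and backward passes are combined by summation.
- The two encoded sequences are stacked along the sequence axis.
- A single linear layer replaces the MLP head.
- The combined loss takes the form lam * MSE + (1 - lam) * (1 - r).

All function names and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def project(E, W, b, sigma=0.0):
    """X = E W^T + b, plus Gaussian noise N(0, sigma^2) when sigma > 0."""
    X = E @ W.T + b
    if sigma > 0:
        X = X + rng.normal(0.0, sigma, size=X.shape)
    return X

def scan(X, a, B):
    """Toy linear recurrence h_t = a * h_{t-1} + B x_t, one O(L) pass."""
    h = np.zeros(B.shape[0])
    out = np.empty((X.shape[0], B.shape[0]))
    for t in range(X.shape[0]):
        h = a * h + B @ X[t]
        out[t] = h
    return out

def bi_scan(X, a, B_fwd, B_bwd):
    """Forward scan plus a scan over the reversed sequence (summation is an assumption)."""
    return scan(X, a, B_fwd) + scan(X[::-1], a, B_bwd)[::-1]

def init_params(d_A, d_B, d):
    """Random stand-in parameters; a trained model would learn these."""
    g = lambda *shape: rng.normal(0.0, 0.3, size=shape)
    return {"W_A": g(d, d_A), "b_A": np.zeros(d),
            "W_B": g(d, d_B), "b_B": np.zeros(d),
            "enc_A": (g(d, d), g(d, d)), "enc_B": (g(d, d), g(d, d)),
            "mix": (g(d, d), g(d, d)),
            "w_out": g(d), "b_out": 0.0, "a": 0.9}

def cross_fuse(E_A, E_B, params, sigma=0.0):
    """Project, encode each modality, stack, mix, mean-pool, and score."""
    X_A = project(E_A, params["W_A"], params["b_A"], sigma)
    X_B = project(E_B, params["W_B"], params["b_B"], sigma)
    X_A_enc = bi_scan(X_A, params["a"], *params["enc_A"])
    X_B_enc = bi_scan(X_B, params["a"], *params["enc_B"])
    stacked = np.concatenate([X_A_enc, X_B_enc], axis=0)  # sequence axis (assumed)
    S_mixed = bi_scan(stacked, params["a"], *params["mix"])
    pooled = S_mixed.mean(axis=0)                          # global mean pooling
    logit = pooled @ params["w_out"] + params["b_out"]     # linear head, not an MLP
    return 1.0 / (1.0 + np.exp(-logit))

def mse_pearson_loss(pred, target, lam=0.5):
    """Assumed combined form: lam * MSE + (1 - lam) * (1 - Pearson r)."""
    mse = np.mean((pred - target) ** 2)
    r = np.corrcoef(pred, target)[0, 1]
    return lam * mse + (1.0 - lam) * (1.0 - r)

d_A, d_B, d = 8, 6, 4
params = init_params(d_A, d_B, d)
E_A = rng.normal(size=(5, d_A))   # e.g. 5 RNA positions
E_B = rng.normal(size=(7, d_B))   # e.g. 7 protein positions
prob = cross_fuse(E_A, E_B, params)
print(float(prob))
```

Each scan visits every position once, so the cost grows linearly with the combined length L_A + L_B. A cross-attention layer would instead score all L_A x L_B position pairs.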

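Experimental Results

Research Questions

RQ1: Can bidirectional state-space fusion of BioLLM embeddings capture dynamic RNA interaction crosstalk better than static fusion methods?
RQ2: Does BiMamba-based cross-modal fusion keep linear scaling on high-dimensional BioLLM embeddings while achieving state-of-the-art performance?
RQ3: How robust is CrossLLM-Mamba to hard negatives and class imbalance in RNA interaction datasets?
RQ4: How well does the framework generalize across RNA–protein, RNA–RNA, and RNA–small molecule interaction modalities?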

Key Findings

Method	MCC	ACC	F1	Precision	Recall	AUC–ROC
RPISeq-RF [14]	0.570	0.780	0.780	0.790	0.780	0.790
IPMiner [15,16]	0.520	0.760	0.770	0.720	0.830	0.801
CFRP [3]	0.630	0.810	0.820	0.830	0.780	0.834
RPITER [17]	0.412	0.690	0.510	0.610	0.480	0.720
LPI-CSFFR [6]	0.600	0.830	0.840	0.780	0.910	0.820
RNAincoder [22]	0.760	0.880	0.840	0.810	0.940	0.915
BioLLMNet [1]	0.848	0.923	0.925	0.888	0.966	0.948
CrossLLM-Mamba (Ours)	0.892	0.935	0.933	0.901	0.971	0.957

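On RNA–protein interaction (RPI1460), CrossLLM-Mamba achieves an MCC of 0.892 and an ACC of 0.935, surpassing the previous best MCC of 0.848 (BioLLMNet).
On RPI1460, F1 = 0.933, Precision = 0.901, Recall = 0.971, and AUC-ROC = 0.957.
For RNA–small molecule binding affinity, the model's Pearson correlation exceeds 0.95 on both the riboswitch (0.9562) and repeat RNA (0.9521) subtypes.
On the plant miRNA–lncRNA RNA–RNA transfer task, CrossLLM-Mamba reaches its highest accuracy, 75%, on MTR-ATH (trained on M. truncatula, tested on A. thaliana) and outperforms some baselines in four of the six transfer settings.
Ablation studies indicate that Cross-Mamba fusion clearly outperforms concatenation, that bidirectionality raises MCC by about 2.7%, and that Gaussian noise and Focal Loss help with generalization and hard negatives.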
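
For readers who want to reproduce the headline metrics on their own predictions, the sketch below computes MCC from confusion-matrix counts and the Pearson correlation from paired values, using only the standard library. The helper names are hypothetical. The last lines check the abstract's "5.2%" figure against the table: the MCC gap over BioLLMNet is 0.044 in absolute terms, and about 5.19% relative to 0.848, so the quoted number is consistent with a relative improvement.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def pearson(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def relative_gain(new, old):
    """Relative improvement of new over old, in percent."""
    return (new - old) / old * 100.0

# MCC values taken from the RPI1460 table above.
gain = relative_gain(0.892, 0.848)
print(round(gain, 2))
```

MCC uses all four confusion-matrix cells, which is why it is often preferred over accuracy when positive and negative pairs are imbalanced.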

This review was generated by AI and reviewed by a human editor.