QUICK REVIEW

[Paper Review] RPNT: Robust Pre-trained Neural Transformer -- A Pathway for Generalized Motor Decoding

Hao Fang, Ryan A. Canfield|arXiv (Cornell University)|Jan 25, 2026
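EEG and Brain-Computer Interfaces|Cited by 0
One-Sentence Summary

RPNT introduces a robust pretrained neural transformer with multidimensional rotary positional embedding and context-based attention. It performs strongly in generalized motor decoding across sessions, subjects, tasks, and recording sites, and is validated on microelectrode and Neuropixel datasets.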

ABSTRACT

Brain decoding aims to interpret and translate neural activity into behaviors. As such, it is imperative that decoding models are able to generalize across variations, such as recordings from different brain sites, distinct sessions, different types of behavior, and a variety of subjects. Current models can only partially address these challenges and warrant the development of pretrained neural transformer models capable of adapting and generalizing. In this work, we propose RPNT - Robust Pretrained Neural Transformer, designed to achieve robust generalization through pretraining, which in turn enables effective finetuning given a downstream task. In particular, RPNT's unique components include 1) Multidimensional rotary positional embedding (MRoPE) to aggregate experimental metadata such as site coordinates, session name and behavior types; 2) Context-based attention mechanism via convolution kernels operating on global attention to learn local temporal structures for handling non-stationarity of neural population activity; 3) Robust self-supervised learning (SSL) objective with uniform causal masking strategies and contrastive representations. We pretrained two separate versions of RPNT on distinct datasets: a) Multi-session, multi-task, and multi-subject microelectrode benchmark; b) Multi-site recordings using high-density Neuropixel 1.0 probes. The datasets include recordings from the dorsal premotor cortex (PMd) and from the primary motor cortex (M1) regions of nonhuman primates (NHPs) as they performed reaching tasks. After pretraining, we evaluated the generalization of RPNT in cross-session, cross-type, cross-subject, and cross-site downstream behavior decoding tasks. Our results show that RPNT consistently achieves and surpasses the decoding performance of existing decoding models in all tasks.

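Research Motivation and Objectives

  • Address the non-stationarity and recording variability of neural decoding across sessions, sites, subjects, and behaviors.
  • Develop a pretraining-finetuning pipeline that generalizes robustly from neural spikes to motor decoding.
  • Design transformer components tailored to neural data (MRoPE, context-based attention, and self-supervised learning with uniform causal masking).
  • Demonstrate improved cross-session, cross-subject, cross-type, and cross-site decoding performance over existing baselines.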

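Proposed Method

  • Introduce multidimensional rotary positional embedding (MRoPE) to encode experimental metadata such as site coordinates, session name, and behavior type, along with temporal position, inside the transformer.
  • Implement a context-based attention mechanism that applies learnable convolution kernels to global attention, capturing local temporal structure and handling non-stationarity.
  • Pretrain RPNT with a robust self-supervised learning (SSL) objective that combines uniform causal masking with contrastive representations.
  • Train two RPNT variants on two distinct neural datasets (a microelectrode benchmark and Neuropixel recordings), then evaluate downstream decoding via finetuning on cross-session, cross-type, cross-subject, and cross-site tasks.
  • During SSL pretraining, use a Poisson reconstruction loss under an autoregressive objective with causal masking, plus an auxiliary site-invariance loss.
  • Provide interpretable attention maps that yield data-driven insight into the neural encoding of motor variables.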
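
To make the MRoPE idea more concrete, here is a minimal sketch of a rotary embedding extended to several metadata axes. This summary does not specify how RPNT combines the axes, so the chunk-per-axis layout, the function names (`rope_rotate`, `multi_axis_rope`), the `base` frequency, and the encoding of a session as an integer index are all assumptions made for illustration, not the paper's formulation.

```python
import math

def rope_rotate(vec, position, base=10000.0):
    """Standard rotary embedding: rotate consecutive pairs of `vec`
    by angles proportional to one scalar `position`."""
    d = len(vec)
    assert d % 2 == 0, "rotary embedding needs an even number of features"
    out = []
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)  # lower frequency for later pairs
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out

def multi_axis_rope(vec, positions, base=10000.0):
    """Hypothetical multi-axis variant: split `vec` into one equal chunk per
    metadata axis (e.g. time bin, site coordinate, session index) and rotate
    each chunk by that axis's position."""
    n_axes = len(positions)
    chunk = len(vec) // n_axes
    assert chunk * n_axes == len(vec) and chunk % 2 == 0
    out = []
    for k, pos in enumerate(positions):
        out.extend(rope_rotate(vec[k * chunk:(k + 1) * chunk], pos, base))
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Illustrative query/key vectors with 12 features and 3 axes:
# (time bin, site coordinate, session index). Values are made up.
q = [0.1 * i for i in range(1, 13)]
k = [0.05 * (13 - i) for i in range(1, 13)]
score_a = dot(multi_axis_rope(q, (3, 1.0, 2)), multi_axis_rope(k, (1, 0.5, 2)))
score_b = dot(multi_axis_rope(q, (13, 4.0, 7)), multi_axis_rope(k, (11, 3.5, 7)))
```

In this sketch `score_a` and `score_b` agree up to floating-point error, because the rotated dot product depends only on the per-axis offsets between query and key (here 2, 0.5, and 0), not on their absolute positions. That relative-position property is what makes rotary schemes attractive for comparing tokens recorded at different sites or sessions.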
Figure 1: Overall illustration of the pretraining and finetuning workflow for generalized motor decoding. (A) Experimental setup for data collection while NHPs performed reaching tasks. (B) Preparation of pretraining data. (C) and (D) overall schemes for SSL and SFT, respectively. (E) Illustration of
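
The SSL objective described under the proposed method (a Poisson reconstruction loss with causal masking) can be sketched as follows. This is a dependency-free illustration under stated assumptions: the helper names `causal_mask` and `poisson_nll`, the use of predicted log-rates, and the per-bin `scored` flags are hypothetical, and the sketch omits the contrastive term and the auxiliary site-invariance loss, whose exact forms this summary does not give.

```python
import math

def causal_mask(n_steps):
    """Lower-triangular attention mask: step i may attend to steps j <= i."""
    return [[j <= i for j in range(n_steps)] for i in range(n_steps)]

def poisson_nll(log_rates, counts, scored):
    """Mean Poisson negative log-likelihood over the bins flagged in `scored`.

    log_rates: predicted log firing rate per bin (model output, assumed)
    counts:    observed spike counts per bin (non-negative integers)
    scored:    booleans marking which bins contribute to the loss
    """
    total, n = 0.0, 0
    for log_rate, count, use in zip(log_rates, counts, scored):
        if not use:
            continue
        # -log P(count | rate) = rate - count * log(rate) + log(count!)
        total += math.exp(log_rate) - count * log_rate + math.lgamma(count + 1)
        n += 1
    if n == 0:
        raise ValueError("no bins selected for the loss")
    return total / n

# Made-up spike counts for four time bins; the second bin is not scored.
counts = [2, 0, 3, 1]
scored = [True, False, True, True]
good = [math.log(2), 0.0, math.log(3), 0.0]  # rates equal to the counts
flat = [0.0, 0.0, 0.0, 0.0]                  # rate of 1 spike per bin everywhere
loss_good = poisson_nll(good, counts, scored)
loss_flat = poisson_nll(flat, counts, scored)
```

Here `loss_good` is lower than `loss_flat`, since the Poisson likelihood for each bin is maximized when the predicted rate equals the observed count. A spike-count likelihood of this kind is a common choice for binned neural data because counts are non-negative integers.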

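Experimental Results

Research Questions

  • RQ1: Can RPNT generalize motor decoding robustly to unseen brain sites, sessions, behaviors, and subjects?
  • RQ2: Do the proposed architectural components (MRoPE, context-based attention) and the SSL strategy improve on existing neural decoders in cross-domain settings?
  • RQ3: How do RPNT's pretraining and finetuning schemes (FS-SFT and Full-SFT) compare with existing baselines across diverse datasets?
  • RQ4: What downstream decoding gains does RPNT deliver on cross-site Neuropixel data and on cross-session benchmarks?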

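Key Findings

  • RPNT outperforms baseline models in all three generalization scenarios on the public benchmark (cross-session, cross-subject, cross-task).
  • Trained from scratch on single sessions, RPNT reaches R^2 of 0.9647±0.0026 (C-CO), 0.9103±0.0182 (T-CO), and 0.8356±0.0914 (T-RT).
  • Pretrained RPNT followed by few-shot or full finetuning consistently beats the baselines: FS-SFT reaches 0.9801±0.0060 (C-CO), 0.9431±0.0103 (T-CO), and 0.8515±0.1071 (T-RT); Full-SFT reaches 0.9894±0.0037, 0.9626±0.0059, and 0.8778±0.1005, respectively.
  • On cross-site Neuropixel data, RPNT scores 0.6358±0.0311 from scratch and 0.6612±0.0328 with pretraining; pretrained RPNT also shows strong few-shot performance (for example, at a 10% training ratio).
  • Ablations show that MRoPE outperforms other positional encodings and that context-based attention yields a notable gain (about 5%) over standard attention.
  • Functional connectivity can be inferred from the spatial attention maps, offering data-driven insight into motor encoding.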
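
For readers unfamiliar with the metric behind the numbers above, the following sketch shows how a coefficient of determination (R^2) and a "mean ± spread" summary could be computed. It rests on assumptions: this summary does not say whether the ± values are standard deviations or standard errors, nor how R^2 is aggregated over output dimensions, so the single-output `r2_score` and the sample standard deviation in `summarize` are illustrative choices, and the example values are made up.

```python
import statistics

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot for one output."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def summarize(scores):
    """Mean and sample standard deviation across runs or sessions (assumed)."""
    return statistics.mean(scores), statistics.stdev(scores)

# Made-up hand velocities for one decoded dimension.
velocity_true = [0.0, 1.0, 2.0, 3.0]
velocity_pred = [0.1, 0.9, 2.2, 2.8]
r2 = r2_score(velocity_true, velocity_pred)

# Made-up per-session scores, only to show the summary format.
mean_r2, std_r2 = summarize([0.95, 0.96, 0.97])
```

An R^2 of 1 means the decoded trajectory matches the recorded one exactly, while 0 means the decoder does no better than predicting the mean. By this reading, the reported Neuropixel scores (0.6358 from scratch, 0.6612 pretrained) differ by 0.0254 in absolute terms.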
Figure 2: A schematic of components in RPNT. Components in black indicate standard transformer signal flow (i.e., no masking and standard attention mechanism). Our novel proposed components include MRoPE (green), context-based attention (cyan), and uniform random masking strategy (pink). MRoPE incorp

This review was generated by AI and reviewed by human editors.