QUICK REVIEW

[Paper Review] Tiny Recursive Reasoning with Mamba-2 Attention Hybrid

Wenlong Wang, Fergal Reid|arXiv (Cornell University)|Feb 12, 2026

Advanced Graph Neural Networks | Cited by 0

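One-Sentence Summary

This paper replaces the Transformer blocks in the tiny recursive reasoning model (TRM) with Mamba-2 hybrid operators and reports gains on ARC-AGI-1 at pass@2 and at higher K, with pass@1 at parity, supporting Mamba-2 as a viable recursive operator in small models.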

ABSTRACT

Recent work on recursive reasoning models like TRM demonstrates that tiny networks (7M parameters) can achieve strong performance on abstract reasoning tasks through latent recursion -- iterative refinement in hidden representation space without emitting intermediate tokens. This raises a natural question about operator choice: Mamba-2's state space recurrence is itself a form of iterative refinement, making it a natural candidate for recursive reasoning -- but does introducing Mamba-2 into the recursive scaffold preserve reasoning capability? We investigate this by replacing the Transformer blocks in TRM with Mamba-2 hybrid operators while maintaining parameter parity (6.83M vs 6.86M parameters). On ARC-AGI-1, we find that the hybrid improves pass@2 (the official metric) by +2.0% (45.88% vs 43.88%) and consistently outperforms at higher K values (+4.75% at pass@100), whilst maintaining pass@1 parity. This suggests improved candidate coverage -- the model generates correct solutions more reliably -- with similar top-1 selection. Our results validate that Mamba-2 hybrid operators preserve reasoning capability within the recursive scaffold, establishing SSM-based operators as viable candidates in the recursive operator design space and taking a first step towards understanding the best mixing strategies for recursive reasoning.

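Motivation and Objectives

Investigate whether Mamba-2's state space recurrence can replace the Transformer blocks in a tiny model without losing reasoning capability.
Evaluate the effect of Mamba-2 hybrid operators on an abstract reasoning benchmark (ARC-AGI-1) and on other tasks (Sudoku, Maze).
Characterize how operator choice affects the trade-off between candidate coverage and top-1 selection in latent recursive reasoning.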

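Proposed Method

Keep the TRM recursive structure unchanged (3 outer cycles, 4–6 inner cycles, and the same hidden states z_H and z_L).
Replace the per-step Transformer block with a Mamba-2 hybrid stack in two variants: TR-mamba2attn (Mamba-2 → Mamba-2 → Attention → MLP) and TR-mamba2mlpt (Mamba-2 → Mamba-2 → MLP-t).
Match the parameter count to the original TRM-attn (about 6.83M vs 6.86M) to isolate the effect of the operator.
Use post-normalization (RMSNorm) to stabilize the recursive computation.
Evaluate on ARC-AGI-1, Sudoku-Extreme, and Maze-30x30-Hard with pass@K (K ∈ {1, 2, 5, 10, 100, 1000}) and, where available, exact accuracy.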
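
The recursive structure and post-normalization listed above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. `toy_operator` is a hypothetical stand-in for the Mamba-2 hybrid stack, the way the input embedding and the two states are summed before each update is an assumption, and the RMSNorm here omits the learned gain. The sketch only shows how normalizing after the residual add keeps the state magnitude bounded across repeated applications.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]


def rms_norm(x: Vector, eps: float = 1e-6) -> Vector:
    """RMSNorm without a learned gain: divide by the root mean square."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / rms for v in x]


def add(a: Vector, b: Vector) -> Vector:
    """Element-wise sum of two equal-length vectors."""
    return [p + q for p, q in zip(a, b)]


def post_norm_block(x: Vector, operator: Callable[[Vector], Vector]) -> Vector:
    """Post-normalization: apply the operator, add the residual, then normalize."""
    return rms_norm(add(x, operator(x)))


def toy_operator(x: Vector) -> Vector:
    """Hypothetical stand-in for the Mamba-2 hybrid stack (not the real operator)."""
    return [2.0 * v + 0.1 for v in x]


def recursive_refine(
    x_emb: Vector,
    z_h: Vector,
    z_l: Vector,
    outer_cycles: int = 3,
    inner_cycles: int = 4,
) -> Tuple[Vector, Vector]:
    """Refine z_L several times per outer cycle, then update z_H once.

    Assumption: states and input are combined by summation before each update.
    """
    for _ in range(outer_cycles):
        for _ in range(inner_cycles):
            z_l = post_norm_block(add(add(z_l, z_h), x_emb), toy_operator)
        z_h = post_norm_block(add(z_h, z_l), toy_operator)
    return z_h, z_l


if __name__ == "__main__":
    z_h, z_l = recursive_refine([0.5, -1.0, 2.0, 0.0], [0.0] * 4, [0.0] * 4)
    print(z_h, z_l)
```

Without the normalization step, this toy operator would roughly triple the state magnitude on every application. With it, the root mean square of each state stays close to 1 no matter how many cycles are run.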

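Experimental Results

Research Questions

RQ1: Do Mamba-2 hybrid operators preserve reasoning capability within a TRM-style recursive scaffold?
RQ2: Do Mamba-2 hybrid operators improve coverage at larger K (pass@K) without sacrificing top-1 accuracy?
RQ3: How do Mamba-2 hybrid operators compare with the attention-based TRM on other reasoning tasks such as Sudoku and Maze?
RQ4: What is the trade-off between coverage and selection when Mamba-2 hybrid operators are used for recursive reasoning?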

Key Findings

Model	Parameters	pass@1 (%)	pass@2 (%)	pass@5 (%)	pass@10 (%)	pass@100 (%)	pass@1000 (%)
TRM-attn	6.83M	40.75	43.88	49.25	52.13	60.50	65.50
TR-mamba2attn	6.86M	40.50	45.88	51.88	54.50	65.25	69.75
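
As a quick check on the table above, the per-K gaps between the two models can be recomputed directly from the reported values. The sketch below only does arithmetic on the table rows. The variable names are illustrative, and the gaps are in percentage points.

```python
# Values copied from the table above (ARC-AGI-1, pass@K in percent).
K_VALUES = [1, 2, 5, 10, 100, 1000]
TRM_ATTN = [40.75, 43.88, 49.25, 52.13, 60.50, 65.50]
TR_MAMBA2ATTN = [40.50, 45.88, 51.88, 54.50, 65.25, 69.75]


def deltas(baseline, hybrid):
    """Hybrid minus baseline at each K, in percentage points."""
    return [round(h - b, 2) for b, h in zip(baseline, hybrid)]


DELTAS = deltas(TRM_ATTN, TR_MAMBA2ATTN)

# Relative parameter gap between the two models (6.83M vs 6.86M), in percent.
PARAM_GAP_PCT = round((6.86 - 6.83) / 6.83 * 100, 2)

if __name__ == "__main__":
    for k, d in zip(K_VALUES, DELTAS):
        print(f"pass@{k}: {d:+.2f} pp")
    print(f"parameter gap: {PARAM_GAP_PCT}%")
```

The gap is -0.25 points at pass@1, +2.00 at pass@2, and +4.75 at pass@100, with a parameter difference under half a percent.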

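On ARC-AGI-1, the hybrid operator improves pass@2 by 2.0 percentage points (45.88% vs 43.88%).
At higher K the hybrid leads consistently, reaching +4.75 percentage points at pass@100, while pass@1 stays at parity (-0.25 percentage points).
Sudoku-Extreme favors the MLP-t variants: TRM-mlp-t reaches 87.4% accuracy and TR-mamba2mlpt 84.2%, both higher than the attention-based models.
Maze-30x30-Hard is unstable: TR-mamba2attn reaches 80.6%, while the MLP-t variant fails (0.0%), indicating that the effect of the hybrid operator is task-dependent.
The ARC-AGI-1 results indicate that Mamba-2 hybrid operators improve candidate coverage without lowering top-1 accuracy.
Post-normalization is highlighted as necessary for stability in the unrolled recursion.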
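
The distinction between candidate coverage and top-1 selection in these findings can be made concrete with a minimal sketch of pass@K. It assumes each task yields a best-first ranked list of candidate outputs, and that a task counts as solved at K when any of its top K candidates is correct. The data layout, the toy values, and the name `pass_at_k` are hypothetical, and the paper's actual candidate ranking procedure is not described in this review.

```python
from typing import Sequence


def pass_at_k(ranked_correct: Sequence[Sequence[bool]], k: int) -> float:
    """Percentage of tasks with at least one correct candidate in the top k.

    ranked_correct[i][j] is True when the j-th ranked candidate for task i
    matches the target output.
    """
    if not ranked_correct:
        raise ValueError("need at least one task")
    if k < 1:
        raise ValueError("k must be >= 1")
    solved = sum(1 for flags in ranked_correct if any(flags[:k]))
    return 100.0 * solved / len(ranked_correct)


# Hypothetical toy data: four tasks, three ranked candidates each.
MODEL_A = [
    [True, False, False],
    [False, False, False],
    [False, False, False],
    [False, False, False],
]
MODEL_B = [
    [True, False, False],
    [False, True, False],   # correct answer generated, but ranked second
    [False, False, True],   # correct answer generated, but ranked third
    [False, False, False],
]

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, pass_at_k(MODEL_A, k), pass_at_k(MODEL_B, k))
```

In this toy data both models tie at pass@1, while the second model pulls ahead as K grows. This is the pattern described above: similar top-1 selection with better coverage among lower-ranked candidates.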

This review was generated by AI and reviewed by a human editor.