QUICK REVIEW

[Paper Walkthrough] Test-time Adaptive Hierarchical Co-enhanced Denoising Network for Reliable Multimodal Classification

Shu Shen, C. L. Philip Chen|arXiv (Cornell University)|Jan 12, 2026

Image and Signal Denoising Methods | Cited by 0

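ONE-SENTENCE SUMMARY

TAHCD jointly removes modality-specific and cross-modality noise at both the global and instance levels, and introduces test-time cooperative enhancement to adapt to unseen noise, improving the robustness of multimodal classification.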

ABSTRACT

Reliable learning of multimodal data (e.g., multi-omics) is a widely concerning issue, especially in safety-critical applications such as medical diagnosis. However, low-quality data induced by multimodal noise poses a major challenge in this domain, causing existing methods to suffer from two key limitations. First, they struggle to handle heterogeneous data noise, hindering robust multimodal representation learning. Second, they exhibit limited adaptability and generalization when encountering previously unseen noise. To address these issues, we propose Test-time Adaptive Hierarchical Co-enhanced Denoising Network (TAHCD). On one hand, TAHCD introduces the Adaptive Stable Subspace Alignment and Sample-Adaptive Confidence Alignment to reliably remove heterogeneous noise. They account for noise at both global and instance levels and enable joint removal of modality-specific and cross-modality noise, achieving robust learning. On the other hand, TAHCD introduces Test-Time Cooperative Enhancement, which adaptively updates the model in response to input noise in a label-free manner, thus improving generalization. This is achieved by collaboratively enhancing the joint removal process of modality-specific and cross-modality noise across global and instance levels according to sample noise. Experiments on multiple benchmarks demonstrate that the proposed method achieves superior classification performance, robustness, and generalization compared with state-of-the-art reliable multimodal learning approaches.

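RESEARCH MOTIVATION AND OBJECTIVES

Advance robust multimodal learning under heterogeneous noise (modality-specific and cross-modality) and previously unseen noise.
Develop a framework that denoises at both the global and instance levels to make learned representations more reliable.
Enable label-free test-time adaptation to improve generalization to new noise patterns.
Provide a mechanism by which global and instance-level denoising cooperatively enhance each other to improve robustness.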

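PROPOSED METHOD

Adaptive Stable Subspace Alignment (ASSA) builds a stable subspace by applying a learnable mask over the principal axes, and removes global noise by enforcing inter-class orthogonality and aligning features with their subspace projections.
Sample-Adaptive Confidence Alignment (SACA) uses priors estimated from the globally denoised features to guide instance-level noise removal through a confidence-aware asymmetric relaxed alignment.
Test-Time Cooperative Enhancement (TTCE) iteratively uses instance-level noise to refine the global denoising and the priors, adapting to unseen noise without labels.
Instance-wise and modality-wise noise experts produce sample-level masks that remove modality-specific and cross-modality noise.
A reconstruction-based feedback loop (L_re) feeds instance-level noise information back into global denoising to improve the handling of unseen noise.
A final fusion strategy weights the modality-specific and cross-modality denoised features by their confidence scores before classification.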
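
The sample-level masking and confidence-weighted fusion steps listed above can be pictured with a small sketch. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, confidence is assumed to be the maximum softmax probability, and the fusion is assumed to be a normalized weighted average. The paper may define each of these differently.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def confidence(logits):
    """Assumed confidence score: the maximum softmax probability."""
    return max(softmax(logits))

def apply_mask(features, mask):
    """Element-wise soft mask in [0, 1]; entries near 0 are treated as noise."""
    return [f * m for f, m in zip(features, mask)]

def fuse(specific, cross, conf_specific, conf_cross):
    """Assumed fusion rule: confidence-normalized weighted average."""
    total = conf_specific + conf_cross
    w_s, w_c = conf_specific / total, conf_cross / total
    return [w_s * s + w_c * c for s, c in zip(specific, cross)]

# Toy example: one sample, two hypothetical expert masks.
features = [1.0, 2.0, 3.0, 4.0]
specific = apply_mask(features, [1.0, 1.0, 0.0, 1.0])  # modality-specific denoising
cross = apply_mask(features, [1.0, 0.0, 1.0, 1.0])     # cross-modality denoising
fused = fuse(specific, cross, conf_specific=0.75, conf_cross=0.25)
```

In this toy setting, a dimension that either expert masks out is only partially suppressed in the fused vector, in proportion to how confident the other branch is.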

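EXPERIMENTAL RESULTS

Research Questions

RQ1: Can joint global and instance-level denoising robustly remove modality-specific and cross-modality noise from multimodal data?
RQ2: Does test-time cooperative enhancement improve generalization to unseen noise without label guidance?
RQ3: How do ASSA and SACA work together to avoid over-suppressing useful modality information during denoising?
RQ4: Can the proposed framework achieve state-of-the-art performance across diverse noisy multimodal benchmarks?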

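Key Findings

TAHCD outperforms existing reliable multimodal learning methods in classification performance under a variety of noise conditions.
ASSA and SACA together mitigate modality-specific and cross-modality noise at the global and instance levels while preserving complementary modality information.
TTCE adapts to unseen noise without labels, iteratively improving denoising and generalization.
The method shows strong robustness and generalization on several benchmarks (BRCA, ROSMAP, CUB, FOOD101) under various noise settings.
The proposed confidence-aware asymmetric relaxed alignment effectively focuses learning on low-confidence modalities to correct their noise while avoiding over-suppression of useful information.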
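
The label-free, iterative adaptation attributed to TTCE can be illustrated with a toy update loop. The sketch below is a generic stand-in, not the paper's algorithm: it assumes a per-dimension global soft mask, an instance-level noise estimate that is already available for the test sample, and a hypothetical squared reconstruction loss playing the role of L_re. No labels appear anywhere in the update.

```python
def reconstruction_loss(mask, x, noise):
    """Hypothetical stand-in for L_re: squared error between the globally
    masked features and the features with instance-level noise subtracted."""
    return sum((m * xi - (xi - ni)) ** 2 for m, xi, ni in zip(mask, x, noise))

def adapt_mask(mask, x, noise, lr=0.05, steps=50):
    """Label-free update: gradient descent on the loss, mask clipped to [0, 1]."""
    mask = list(mask)  # copy so the caller's mask is left untouched
    for _ in range(steps):
        for i, (xi, ni) in enumerate(zip(x, noise)):
            # d/d mask_i of (mask_i * x_i - (x_i - n_i))^2
            grad = 2.0 * (mask[i] * xi - (xi - ni)) * xi
            mask[i] = min(1.0, max(0.0, mask[i] - lr * grad))
    return mask

# Toy test sample with an assumed instance-level noise estimate.
x = [2.0, 1.0, 4.0]
noise = [0.0, 1.0, 2.0]
initial_mask = [1.0, 1.0, 1.0]

adapted_mask = adapt_mask(initial_mask, x, noise)
loss_before = reconstruction_loss(initial_mask, x, noise)
loss_after = reconstruction_loss(adapted_mask, x, noise)
```

Here the mask stays open on the clean dimension, closes on the dimension that is pure noise, and settles near one half on the dimension that is half noise, which is the kind of sample-driven refinement of global denoising that the summary describes.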

This walkthrough was generated by AI and reviewed by a human editor.