QUICK REVIEW

[Paper Review] Test-time Adaptive Hierarchical Co-enhanced Denoising Network for Reliable Multimodal Classification

Shu Shen, C. L. Philip Chen | arXiv (Cornell University) | 2026-01-12

Image and Signal Denoising Methods | Citations: 0

One-Line Summary

TAHCD jointly removes modality-specific and cross-modality noise at both the global and instance levels, and introduces test-time cooperative enhancement to adapt to unseen noise, improving the robustness of multimodal classification.

ABSTRACT

Reliable learning of multimodal data (e.g., multi-omics) is a widely concerning issue, especially in safety-critical applications such as medical diagnosis. However, low-quality data induced by multimodal noise poses a major challenge in this domain, causing existing methods to suffer from two key limitations. First, they struggle to handle heterogeneous data noise, hindering robust multimodal representation learning. Second, they exhibit limited adaptability and generalization when encountering previously unseen noise. To address these issues, we propose Test-time Adaptive Hierarchical Co-enhanced Denoising Network (TAHCD). On one hand, TAHCD introduces the Adaptive Stable Subspace Alignment and Sample-Adaptive Confidence Alignment to reliably remove heterogeneous noise. They account for noise at both global and instance levels and enable joint removal of modality-specific and cross-modality noise, achieving robust learning. On the other hand, TAHCD introduces Test-Time Cooperative Enhancement, which adaptively updates the model in response to input noise in a label-free manner, thus improving generalization. This is achieved by collaboratively enhancing the joint removal process of modality-specific and cross-modality noise across global and instance levels according to sample noise. Experiments on multiple benchmarks demonstrate that the proposed method achieves superior classification performance, robustness, and generalization compared with state-of-the-art reliable multimodal learning approaches.

Motivation and Objectives

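Establishes the need for robust multimodal learning under heterogeneous noise (modality-specific and cross-modality) and previously unseen noise.
Develops a framework that denoises at both the global and instance levels to make the learned representations more reliable.
Enables label-free adaptation at test time to improve generalization to new noise patterns.
Provides a cooperative enhancement mechanism between global and instance-level denoising to improve robustness.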

Proposed Method

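Adaptive Stable Subspace Alignment (ASSA) builds a stable subspace through a learnable mask over the principal axes, and removes global noise by enforcing inter-class orthogonality and subspace projection alignment.
Sample-Adaptive Confidence Alignment (SACA) guides instance-level noise removal by aligning confidence on a per-sample basis, using biases estimated from the globally denoised features.
Test-Time Cooperative Enhancement (TTCE) iteratively uses instance-level noise to refine the global denoising and its priors, allowing the model to adapt to unseen noise without labels.
Instance-level, modality-specific noise experts generate sample-level masks for removing modality-specific noise and cross-modality noise.
A reconstruction-based feedback loop (L_re) feeds instance-level noise information back into global denoising to improve the handling of unseen noise.
A final fusion strategy weights the modality-specific and cross-modality denoised features by their confidence scores before classification.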
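
To make two of these components more concrete, the following is a minimal sketch of (a) projecting a feature vector onto a masked set of principal axes and (b) fusing denoised features with confidence weights. It is an illustration under stated assumptions, not the authors' implementation: the function names `project_onto_masked_subspace` and `confidence_weighted_fusion` are hypothetical, the mask is fixed here rather than learned, the axes are assumed to be orthonormal, and the softmax normalization of the confidence scores is an assumption (the review only says that features are weighted by confidence).

```python
import math


def project_onto_masked_subspace(x, axes, mask):
    """Project x onto orthonormal axes, scaling each axis by its mask weight.

    x:    feature vector (list of floats)
    axes: orthonormal basis vectors, e.g. principal axes (assumed given)
    mask: one weight in [0, 1] per axis; 0 drops the axis, 1 keeps it
          (fixed here; in ASSA the mask is described as learnable)
    """
    out = [0.0] * len(x)
    for axis, m in zip(axes, mask):
        # Coordinate of x along this axis.
        coeff = sum(xi * ai for xi, ai in zip(x, axis))
        for i, ai in enumerate(axis):
            out[i] += m * coeff * ai
    return out


def confidence_weighted_fusion(features, confidences):
    """Fuse feature vectors using softmax-normalized confidence scores.

    The softmax is an assumption made for this sketch.
    """
    peak = max(confidences)  # subtract the max for numerical stability
    exps = [math.exp(c - peak) for c in confidences]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(features[0])
    fused = [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]
    return fused, weights


# Toy example: the third axis is treated as noise and masked out.
axes = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
mask = [1.0, 1.0, 0.0]
noisy = [2.0, -1.0, 5.0]
denoised = project_onto_masked_subspace(noisy, axes, mask)  # [2.0, -1.0, 0.0]

# Fuse the denoised vector with a second (all-zero) branch; the first branch
# has the higher confidence score and therefore the larger weight.
fused, weights = confidence_weighted_fusion([denoised, [0.0, 0.0, 0.0]], [2.0, 0.0])
```

In this toy setup the mask removes the component along the third axis entirely, and the fused vector is pulled toward the higher-confidence branch. A trained model would learn the mask and confidence scores rather than fix them by hand.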

Experimental Results

Research Questions

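RQ1: Can joint denoising at the global and instance levels robustly remove both modality-specific and cross-modality noise from multimodal data?
RQ2: Does test-time cooperative enhancement improve generalization to unseen noise without labeled guidance?
RQ3: How do ASSA and SACA interact to remove noise while avoiding over-suppression of informative modality content?
RQ4: Can the proposed framework achieve state-of-the-art performance on a range of noisy multimodal benchmarks?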

Key Findings

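TAHCD achieves better classification performance than state-of-the-art reliable multimodal learning methods across a range of noise conditions.
Together, ASSA and SACA mitigate modality-specific and cross-modality noise at the global and instance levels while preserving complementary modality information.
TTCE enables label-free adaptation to unseen noise, progressively improving denoising and generalization over iterations.
The method shows strong robustness and generalization on multiple benchmarks (BRCA, ROSMAP, CUB, FOOD101) under various noise settings.
The proposed confidence-based asymmetric slack alignment focuses learning on low-confidence modalities, correcting noise without over-suppressing useful information.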
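
The review does not give the form of the asymmetric slack alignment, so the following is only a generic illustration of how an asymmetric, slack-tolerant penalty can concentrate learning on low-confidence modalities. The hinge form, the function name `asymmetric_slack_penalty`, the `target` and `slack` values, and the modality names are all assumptions made for this sketch and are not taken from the paper.

```python
def asymmetric_slack_penalty(confidence, target, slack=0.1):
    """Hinge-style penalty (hypothetical form, not from the paper).

    Returns 0 when confidence is at least target - slack, and grows
    linearly as confidence falls further below that threshold. Modalities
    that are already confident therefore contribute no penalty.
    """
    return max(0.0, (target - slack) - confidence)


# Hypothetical per-modality confidence scores for one sample.
modal_conf = {"modality_a": 0.9, "modality_b": 0.4}
target = 0.8

penalties = {
    name: asymmetric_slack_penalty(conf, target)
    for name, conf in modal_conf.items()
}
# modality_a is above the threshold and gets zero penalty;
# modality_b is below it and gets a positive penalty.
```

The asymmetry is the point of the design: only the under-confident modality receives a training signal, so a confident modality is not pushed further and its useful content is less likely to be suppressed.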

This review was generated by AI and reviewed by a human editor.