QUICK REVIEW

[Paper Review] PDD: Manifold-Prior Diverse Distillation for Medical Anomaly Detection

Xijun Lu, Hongying Liu|arXiv (Cornell University)|Mar 7, 2026
Anomaly Detection Techniques and Applications|Citations: 0
One-Line Summary

PDD unifies dual teachers into a high-dimensional manifold and distills to dual students with diverse learning paths, achieving state-of-the-art medical anomaly detection across multiple datasets.

ABSTRACT

Medical image anomaly detection faces unique challenges due to subtle, heterogeneous anomalies embedded in complex anatomical structures. Through systematic Grad-CAM analysis, we reveal that discriminative activation maps fail on medical data, unlike their success on industrial datasets, motivating the need for manifold-level modeling. We propose PDD (Manifold-Prior Diverse Distillation), a framework that unifies dual-teacher priors into a shared high-dimensional manifold and distills this knowledge into dual students with complementary behaviors. Specifically, frozen VMamba-Tiny and wide-ResNet50 encoders provide global contextual and local structural priors, respectively. Their features are unified through a Manifold Matching and Unification (MMU) module, while an Inter-Level Feature Adaption (InA) module enriches intermediate representations. The unified manifold is distilled into two students: one performs layer-wise distillation via InA for local consistency, while the other receives skip-projected representations through a Manifold Prior Affine (MPA) module to capture cross-layer dependencies. A diversity loss prevents representation collapse while maintaining detection sensitivity. Extensive experiments on multiple medical datasets demonstrate that PDD significantly outperforms existing state-of-the-art methods, achieving improvements of up to 11.8%, 5.1%, and 8.5% in AUROC on HeadCT, BrainMRI, and ZhangLab datasets, respectively, and 3.4% in F1 max on the Uni-Medical dataset, establishing new state-of-the-art performance in medical image anomaly detection. The implementation will be released at https://github.com/OxygenLu/PDD

Motivation and Objectives

  • Motivate the need for manifold-level modeling in medical anomaly detection due to diffuse, heterogeneous anomalies in medical images.
  • Introduce a dual-teacher framework to fuse global and local priors from heterogeneous backbones.
  • Develop a dual-student distillation scheme with manifold-aware modules and a diversity loss to prevent representation collapse.
  • Demonstrate strong results across multiple medical datasets and analyze components via ablations.

Proposed Method

  • Two frozen teachers (VMamba-Tiny and wide-ResNet50) provide global contextual and local structural priors.
  • Inter-Level Feature Adaption (InA) fuses shallow features from both backbones.
  • Manifold Matching and Unification (MMU) aligns heterogeneous manifolds into a unified space.
  • Dual students learn with: (1) layer-wise distillation via InA (local consistency); (2) skip-projected manifold features via MPA (cross-layer dependencies).
  • Manifold Prior Affine (MPA) injects prior knowledge through MLP-based affine transforms with skip connections.
  • Diversity loss prevents collapse by encouraging different representations at low-dimensional layers while maintaining similarity at high-dimensional layers.
  • Overall objective combines knowledge distillation loss, prior-guided reconstruction loss, and a diversity loss with learnable weights.
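
The last two bullets can be illustrated with a minimal sketch. The review does not give the exact formulas, so everything below is an assumption made for illustration: cosine similarity as the agreement measure, `num_shallow` as the split between "low-dimensional" and "high-dimensional" layers, and an uncertainty-style form for the learnable weights. The names `diversity_loss`, `total_objective`, and `log_vars` are hypothetical and do not come from the paper or its repository.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)  # epsilon avoids division by zero

def diversity_loss(feats_a, feats_b, num_shallow):
    """Assumed form: the two students should differ at the first
    `num_shallow` levels and agree at the remaining (deeper) levels.
    feats_a / feats_b are per-level feature vectors, shallow to deep."""
    terms = []
    for level, (fa, fb) in enumerate(zip(feats_a, feats_b)):
        sim = cosine_similarity(fa, fb)
        if level < num_shallow:
            terms.append(max(sim, 0.0))  # shallow: penalize agreement
        else:
            terms.append(1.0 - sim)      # deep: penalize disagreement
    return sum(terms) / len(terms)

def total_objective(l_kd, l_rec, l_div, log_vars):
    """Assumed weighting: sum_i exp(-s_i) * L_i + s_i, where each s_i
    would be a trainable scalar in a real training loop."""
    losses = (l_kd, l_rec, l_div)
    return sum(math.exp(-s) * loss + s for loss, s in zip(losses, log_vars))

# Toy features: students disagree at level 0 and agree at level 1.
student_a = [[1.0, 0.0], [0.5, 0.5]]
student_b = [[0.0, 1.0], [0.5, 0.5]]
l_div = diversity_loss(student_a, student_b, num_shallow=1)
print(l_div, total_objective(1.0, 2.0, l_div, [0.0, 0.0, 0.0]))
```

In this toy case the diversity term is close to zero because the students already differ at the shallow level and match at the deep level; making the shallow features identical raises it. In practice the features would be tensors and the weights would be optimized jointly with the students.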
Figure 1 : Grad-CAM visualization of frozen Vmamba and ResNet across medical and industrial images. Within each group, feature maps progress from low-dimensional to high-dimensional feature representations (top to bottom). At the same feature dimension, the dispersed and aggregated activation patter

Experimental Results

Research Questions

  • RQ1: Can dual heterogeneous backbones be unified into a common manifold to better model normal anatomy for medical anomaly detection?
  • RQ2: Does dual-student, diversity-aware distillation improve robustness and detection of subtle medical anomalies compared to single-teacher approaches?
  • RQ3: How do intra-backbone fusion (InA) and manifold-unification (MMU) contribute to performance on diverse medical datasets?
  • RQ4: What is the effect of prior-affine injection (MPA) and diversity regularization on anomaly localization and false positives?

Key Findings

| Method | HeadCT AUROC (%) | ZhangLab AUROC (%) | BrainMRI AUROC (%) | CheXpert AUROC (%) | Notes |
|---|---|---|---|---|---|
| f-AnoGAN | 82.6 | 75.5 | 77.1 | 65.8 | Baseline methods comparison |
| CutPaste | 73.0 | 73.3 | 67.0 | 65.5 | Industrial/medical baseline |
| RD4AD | 74.3 | 87.5 | 80.9 | 71.9 | Knowledge-distillation baseline |
| SQUID | 75.4 | 87.6 | 74.7 | 78.1 | Medical UAD baseline |
| SIMSID | 74.9 | 91.1 | 81.5 | 79.7 | Self-supervised baseline |
| Skip-TS | 85.7 | 79.2 | 88.2 | 68.7 | Skip/teacher-student baseline |
| Ours | 97.5 | 94.0 | 96.7 | 79.1 | PDD with dual-teacher and dual-student distillation |

  • PDD achieves state-of-the-art AUROC on HeadCT (97.5), ZhangLab (94.0), and BrainMRI (96.7).
  • PDD reaches competitive CheXpert performance (79.1% AUROC), slightly below SIMSID (79.7%), the best baseline in the table above.
  • On Uni-Medical, PDD attains the best F1 max across brain, liver, and retinal categories and the top AP in retinal.
  • Ablations show the value of the dual-teacher MMU/InA fusion, the addition of MPA, and the dual-student design; the full model achieves 94.0% AUROC on ZhangLab and 96.6% F1 on BrainMRI.
  • Diversity loss plus teacher-student alignment yields the strongest anomaly detection and localization performance.
  • PDD demonstrates fewer false positives on normal samples compared to baselines in qualitative analyses.
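
As a quick check on the AUROC figures, the margin of PDD over the strongest baseline per dataset can be recomputed from the comparison table in this review. The sketch below uses only the numbers listed there; the table may not include every baseline the paper compares against, so these margins describe this summary rather than the paper's full results. The names `BASELINES`, `PDD`, and `margins_over_best_baseline` are hypothetical helpers introduced for illustration.

```python
# AUROC values (%) copied from the comparison table in this review.
DATASETS = ("HeadCT", "ZhangLab", "BrainMRI", "CheXpert")
BASELINES = {
    "f-AnoGAN": (82.6, 75.5, 77.1, 65.8),
    "CutPaste": (73.0, 73.3, 67.0, 65.5),
    "RD4AD":    (74.3, 87.5, 80.9, 71.9),
    "SQUID":    (75.4, 87.6, 74.7, 78.1),
    "SIMSID":   (74.9, 91.1, 81.5, 79.7),
    "Skip-TS":  (85.7, 79.2, 88.2, 68.7),
}
PDD = (97.5, 94.0, 96.7, 79.1)

def margins_over_best_baseline():
    """Map each dataset to (best baseline name, PDD minus that baseline)."""
    result = {}
    for i, dataset in enumerate(DATASETS):
        # Pick the baseline with the highest AUROC on this dataset.
        best_method, best_scores = max(BASELINES.items(), key=lambda kv: kv[1][i])
        result[dataset] = (best_method, round(PDD[i] - best_scores[i], 1))
    return result

for dataset, (method, margin) in margins_over_best_baseline().items():
    print(f"{dataset}: best baseline {method}, PDD margin {margin:+.1f}")
```

On the listed numbers, PDD leads on HeadCT, ZhangLab, and BrainMRI and trails SIMSID by 0.6 points on CheXpert.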
Figure 2 : Overview of the proposed PDD framework. The framework employs a dual-teacher and dual-student architecture. The teachers consist of frozen VMamba-Tiny and frozen wide-ResNet50 encoders, whose intermediate features are fused via the InA module (shown in (b)) to obtain $f_{b}^{i}$ . The two

This review was generated by AI and checked by a human editor.