QUICK REVIEW

[Paper Review] Context-aware Graph Causality Inference for Few-Shot Molecular Property Prediction

Van Thuy Hoang, O. Lee|arXiv (Cornell University)|Jan 16, 2026

Advanced Graph Neural Networks | Cited by 0

ONE-SENTENCE SUMMARY

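CaMol introduces a context-aware causal framework that uses a context graph, learnable atom masking, and backdoor adjustment to identify the causal substructures behind a property in few-shot molecular property prediction, improving both accuracy and interpretability.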

ABSTRACT

Molecular property prediction is becoming one of the major applications of graph learning in Web-based services, e.g., online protein structure prediction and drug discovery. A key challenge arises in few-shot scenarios, where only a few labeled molecules are available for predicting unseen properties. Recently, several studies have used in-context learning to capture relationships among molecules and properties, but they face two limitations in: (1) exploiting prior knowledge of functional groups that are causally linked to properties and (2) identifying key substructures directly correlated with properties. We propose CaMol, a context-aware graph causality inference framework, to address these challenges by using a causal inference perspective, assuming that each molecule consists of a latent causal structure that determines a specific property. First, we introduce a context graph that encodes chemical knowledge by linking functional groups, molecules, and properties to guide the discovery of causal substructures. Second, we propose a learnable atom masking strategy to disentangle causal substructures from confounding ones. Third, we introduce a distribution intervener that applies backdoor adjustment by combining causal substructures with chemically grounded confounders, disentangling causal effects from real-world chemical variations. Experiments on diverse molecular datasets showed that CaMol achieved superior accuracy and sample efficiency in few-shot tasks, showing its generalizability to unseen properties. Also, the discovered causal substructures were strongly aligned with chemical knowledge about functional groups, supporting the model interpretability.

RESEARCH MOTIVATION AND OBJECTIVES

Motivate few-shot molecular property prediction (MPP) and the need to leverage functional-group causality.
Propose CaMol to discover causal substructures by integrating chemical priors via a context graph.
Disentangle causal substructures from confounders using learnable atom masking and a distribution-based backdoor intervention.
Align discovered substructures with chemical knowledge to improve interpretability and transferability.

PROPOSED METHOD

Construct a context graph encoding functional groups, molecules, and properties within each episode.
Decompose molecules into BRICS-based functional groups and learn contextual representations via a GNN encoder.
Introduce a learnable atom masking mechanism to separate causal substructures C from confounding substructures S.
Apply a distribution intervention with backdoor adjustment to estimate P(Y|do(C)) by marginalizing over S using chemically grounded confounders.
Optimize a total loss combining causal prediction loss, KL divergence to a uniform prior over S, and a variance/invariance term across interventional subgraphs.
Use MAML-style meta-training with inner-loop causal updates and outer-loop evaluation to encourage few-shot generalization.
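
The masking and intervention steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`split_causal_confounder`, `backdoor_predict`, `kl_to_uniform`), the linear classifier `W`, the sum pooling, and the uniform weighting over a small "bank" of confounder embeddings are all hypothetical choices made for the example. The mask logits are fixed random values here, whereas in the paper the mask is learned, and the direction of the KL term is likewise an assumption.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def split_causal_confounder(atom_feats, mask_logits):
    """Soft atom mask m in (0, 1): causal part m*h, confounding part (1-m)*h."""
    m = sigmoid(mask_logits)[:, None]                   # one weight per atom
    causal = (m * atom_feats).sum(axis=0)               # pooled embedding of C
    confounder = ((1.0 - m) * atom_feats).sum(axis=0)   # pooled embedding of S
    return causal, confounder

def backdoor_predict(causal, confounder_bank, W):
    """Approximate P(Y | do(C)) = sum_s P(Y | C, s) P(s).

    Assumption: P(s) is uniform over the confounders in `confounder_bank`.
    """
    probs = [softmax(W @ np.concatenate([causal, s])) for s in confounder_bank]
    return np.mean(probs, axis=0)

def kl_to_uniform(p, eps=1e-12):
    """KL(p || uniform); it is zero when p carries no label information."""
    k = p.shape[-1]
    return float(np.sum(p * (np.log(p + eps) - np.log(1.0 / k))))

# Toy molecule: 5 atoms with 4-dimensional features (random placeholders).
rng = np.random.default_rng(0)
atom_feats = rng.normal(size=(5, 4))
mask_logits = rng.normal(size=5)          # would be learned in practice

c, s = split_causal_confounder(atom_feats, mask_logits)

# Confounders taken from this molecule plus three placeholder embeddings
# standing in for confounders drawn from other molecules.
bank = [s] + [rng.normal(size=4) for _ in range(3)]
W = rng.normal(size=(2, 8))               # hypothetical binary classifier

p_do = backdoor_predict(c, bank, W)       # interventional prediction

# Prediction from the confounder alone (causal slot zeroed out); the KL term
# penalizes it for deviating from an uninformative uniform distribution.
p_s = softmax(W @ np.concatenate([np.zeros(4), s]))
print(p_do, kl_to_uniform(p_s))
```

Averaging predictions over several confounders while holding C fixed is what makes the estimate interventional rather than observational: a prediction that changes sharply when only S is swapped signals that the model is relying on the confounding part of the molecule.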

Figure 1: (a) The seen properties are relevant to the unseen property prediction. (b) The causal substructures vary and depend on molecular property prediction tasks.

EXPERIMENTAL RESULTS

RESEARCH QUESTIONS

RQ1: How can a context graph bridging functional groups, molecules, and properties improve few-shot molecular property prediction?
RQ2: Can learnable atom masking effectively disentangle causal substructures from confounding substructures in molecular graphs?
RQ3: Does backdoor-adjusted distribution intervention improve robustness to confounders across molecules and properties?
RQ4: Do discovered causal substructures align with chemical knowledge and enhance model interpretability?

KEY FINDINGS

CaMol achieves superior accuracy across six MoleculeNet datasets in few-shot settings versus strong baselines.
Discovered causal substructures show strong alignment with known functional groups and support interpretability.
The framework demonstrates strong sample efficiency, particularly on high-diversity and imbalanced datasets (e.g., MUV, PCBA).
Backdoor-adjusted causal inference with context guidance yields more robust predictions than models relying on molecule–property relations alone.
The approach provides faithful, model-consistent explanations for predicted properties.

Figure 2: Causal relationships between variables in MPP.

This review was generated by AI and reviewed by human editors.