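[Paper Digest] Context-aware Graph Causality Inference for Few-Shot Molecular Property Prediction
CaMol introduces a context-aware causal framework that uses a context graph, atom masking, and backdoor adjustment to identify causal substructures for few-shot molecular property prediction, improving both accuracy and interpretability.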
Molecular property prediction is becoming one of the major applications of graph learning in Web-based services, e.g., online protein structure prediction and drug discovery. A key challenge arises in few-shot scenarios, where only a few labeled molecules are available for predicting unseen properties. Recently, several studies have used in-context learning to capture relationships among molecules and properties, but they face two limitations in: (1) exploiting prior knowledge of functional groups that are causally linked to properties and (2) identifying key substructures directly correlated with properties. We propose CaMol, a context-aware graph causality inference framework, to address these challenges by using a causal inference perspective, assuming that each molecule consists of a latent causal structure that determines a specific property. First, we introduce a context graph that encodes chemical knowledge by linking functional groups, molecules, and properties to guide the discovery of causal substructures. Second, we propose a learnable atom masking strategy to disentangle causal substructures from confounding ones. Third, we introduce a distribution intervener that applies backdoor adjustment by combining causal substructures with chemically grounded confounders, disentangling causal effects from real-world chemical variations. Experiments on diverse molecular datasets showed that CaMol achieved superior accuracy and sample efficiency in few-shot tasks, showing its generalizability to unseen properties. Also, the discovered causal substructures were strongly aligned with chemical knowledge about functional groups, supporting the model interpretability.
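For readers less familiar with backdoor adjustment, the generic identity from causal inference is written out below. Here C denotes the causal substructure, S the confounding substructure, and Y the property label. This is the textbook form, assuming S satisfies the backdoor criterion and takes finitely many values; the exact estimator used in CaMol may differ.

```latex
P\big(Y \mid do(C)\big) \;=\; \sum_{s} P\big(Y \mid C,\, S = s\big)\, P(S = s)
```

Intuitively, the prediction is averaged over confounders drawn independently of C, so a spurious correlation between C and S cannot drive the result.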
Research Motivation and Objectives
- Motivate few-shot molecular property prediction (MPP) and the need to leverage functional-group causality.
- Propose CaMol to discover causal substructures by integrating chemical priors via a context graph.
- Disentangle causal substructures from confounders using learnable atom masking and a distribution-based backdoor intervention.
- Align discovered substructures with chemical knowledge to improve interpretability and transferability.
Proposed Method
- Construct a context graph encoding functional groups, molecules, and properties within each episode.
- Decompose molecules into BRICS-based functional groups and learn contextual representations via a GNN encoder.
- Introduce a learnable atom masking mechanism to separate causal substructures C from confounding S.
- Apply a distribution intervention with backdoor adjustment to estimate P(Y|do(C)) by marginalizing over S using chemically grounded confounders.
- Optimize a total loss combining causal prediction loss, KL divergence to a uniform prior over S, and a variance/invariance term across interventional subgraphs.
- Use MAML-style meta-training with inner-loop causal updates and outer-loop evaluation to encourage few-shot generalization.
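The masking and intervention steps listed above can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the function names (`soft_atom_mask`, `split_readout`, `predict`, `backdoor_adjusted`), the sigmoid mask, the sum pooling, and the toy logistic classifier are all assumptions made for illustration. In practice the per-atom logits and features would come from a trained GNN encoder, and the confounder bank from chemically grounded substructures.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def soft_atom_mask(atom_logits):
    """Map hypothetical per-atom logits to soft mask weights in (0, 1)."""
    return [sigmoid(z) for z in atom_logits]


def split_readout(atom_features, mask):
    """Masked sum pooling: weight m goes to the causal part C, 1 - m to the confounding part S."""
    dim = len(atom_features[0])
    causal = [sum(m * h[d] for m, h in zip(mask, atom_features)) for d in range(dim)]
    confounder = [sum((1.0 - m) * h[d] for m, h in zip(mask, atom_features)) for d in range(dim)]
    return causal, confounder


def predict(causal, confounder, w_c, w_s, bias=0.0):
    """Toy logistic classifier standing in for P(Y=1 | C, S)."""
    score = bias
    score += sum(w * x for w, x in zip(w_c, causal))
    score += sum(w * x for w, x in zip(w_s, confounder))
    return sigmoid(score)


def backdoor_adjusted(causal, confounder_bank, w_c, w_s, prior=None):
    """Estimate P(Y=1 | do(C)) by averaging P(Y=1 | C, s) over a finite bank of confounders."""
    if prior is None:
        # Assumption: uniform prior over the confounder bank.
        prior = [1.0 / len(confounder_bank)] * len(confounder_bank)
    return sum(p * predict(causal, s, w_c, w_s) for p, s in zip(prior, confounder_bank))


# Toy molecule: three atoms with 2-dimensional features (made-up numbers).
atom_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mask = soft_atom_mask([4.0, -4.0, 0.0])
causal, confounder = split_readout(atom_features, mask)

# Confounder representations borrowed from other (hypothetical) molecules.
confounder_bank = [[0.0, 0.0], [1.0, 1.0], [2.0, -1.0]]
p_do = backdoor_adjusted(causal, confounder_bank, w_c=[1.0, -1.0], w_s=[0.5, 0.5])
print(f"Estimated P(Y=1 | do(C)) = {p_do:.3f}")
```

Because the two mask weights for each atom sum to one, the causal and confounding readouts together recover the unmasked pooled representation. If the classifier ignores the confounder (all-zero `w_s`), the adjusted estimate coincides with the plain prediction.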

Experimental Results
Research Questions
- RQ1: How can a context graph bridging functional groups, molecules, and properties improve few-shot molecular property prediction?
- RQ2: Can learnable atom masking effectively disentangle causal substructures from confounding substructures in molecular graphs?
- RQ3: Does backdoor-adjusted distribution intervention improve robustness to confounders across molecules and properties?
- RQ4: Do discovered causal substructures align with chemical knowledge and enhance model interpretability?
Key Findings
- CaMol achieves superior accuracy across six MoleculeNet datasets in few-shot settings versus strong baselines.
- Discovered causal substructures show strong alignment with known functional groups and support interpretability.
- The framework demonstrates strong sample efficiency, particularly on high-diversity and imbalanced datasets (e.g., MUV, PCBA).
- Backdoor-adjusted causal inference with context guidance yields more robust predictions than models relying on molecule–property relations alone.
- The approach provides faithful, model-consistent explanations for predicted properties.

This digest was generated by AI and reviewed by a human editor.