QUICK REVIEW

[Paper Review] Molecular Representations in Implicit Functional Space via Hyper-Networks

Zehong Wang, Xiaolong Han | arXiv (Cornell University) | Jan 29, 2026
Machine Learning in Materials Science | Citations: 0
One-Sentence Summary

The paper proposes MolField, a function-space representation for molecules where each molecule is modeled as a continuous 3D field via canonical implicit neural representations, learned by a function-space hyper-network to enable task-agnostic learning across dynamics and properties.

ABSTRACT

Molecular representations fundamentally shape how machine learning systems reason about molecular structure and physical properties. Most existing approaches adopt a discrete pipeline: molecules are encoded as sequences, graphs, or point clouds, mapped to fixed-dimensional embeddings, and then used for task-specific prediction. This paradigm treats molecules as discrete objects, despite their intrinsically continuous and field-like physical nature. We argue that molecular learning can instead be formulated as learning in function space. Specifically, we model each molecule as a continuous function over three-dimensional (3D) space and treat this molecular field as the primary object of representation. From this perspective, conventional molecular representations arise as particular sampling schemes of an underlying continuous object. We instantiate this formulation with MolField, a hyper-network-based framework that learns distributions over molecular fields. To ensure physical consistency, these functions are defined over canonicalized coordinates, yielding invariance to global SE(3) transformations. To enable learning directly over functions, we introduce a structured weight tokenization and train a sequence-based hyper-network to model a shared prior over molecular fields. We evaluate MolField on molecular dynamics and property prediction. Our results show that treating molecules as continuous functions fundamentally changes how molecular representations generalize across tasks and yields downstream behavior that is stable to how molecules are discretized or queried.

Motivation and Objectives

  • Motivate a move from discrete molecule representations (sequences, graphs, point clouds) to continuous molecular fields in function space.
  • Define a canonical, SE(3)-invariant molecular function as the primary representation object.
  • Develop MolField with canonical implicit neural representations (C-INR), structured weight tokenization (SWT), and a function-space hyper-network (FSHN).
  • Show that function-space representations generalize across tasks and are robust to discretization and querying schemes.
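
The SE(3)-invariance objective above can be made concrete with a small numerical check. The sketch below is only an illustration, not the paper's construction: it assumes a PCA-based frame with a third-moment sign fix as a stand-in for the frame the paper builds from rotation-equivariant features, and the names `canonical_frame`, `to_canonical`, `toy_field`, and `random_rotation` are hypothetical. The idea is that a field evaluated in canonical coordinates should return the same values when the molecule and the query points are rotated and translated together.

```python
import numpy as np

def canonical_frame(X):
    """Return (centroid, Q) for atom coordinates X of shape (n_atoms, 3).

    Hypothetical stand-in: the rows of Q are PCA axes of the centered atoms.
    """
    c = X.mean(axis=0)
    Y = X - c
    # Eigenvectors of the 3x3 scatter matrix; eigh returns ascending eigenvalues.
    _, V = np.linalg.eigh(Y.T @ Y)
    V = V[:, ::-1].copy()  # reorder columns to descending variance
    # Eigenvectors are defined only up to sign; fix the first two axes with
    # the sign of the third moment of the projected atoms.
    for k in range(2):
        if np.sum((Y @ V[:, k]) ** 3) < 0:
            V[:, k] = -V[:, k]
    # Third axis from a cross product, so the frame is right-handed.
    V[:, 2] = np.cross(V[:, 0], V[:, 1])
    return c, V.T

def to_canonical(x, X):
    """Map query points x of shape (m, 3) into the canonical frame of X."""
    c, Q = canonical_frame(X)
    return (x - c) @ Q.T

def toy_field(u):
    """Arbitrary anisotropic scalar function standing in for a learned field."""
    return np.sin(u[:, 0]) + u[:, 1] ** 2 * u[:, 2]

def random_rotation(rng):
    """Proper rotation matrix from the QR factorization of a Gaussian matrix."""
    R, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(R) < 0:
        R[:, 0] = -R[:, 0]
    return R

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(6, 3))  # toy "molecule" (random atom positions)
    x = rng.normal(size=(4, 3))  # query points
    R, t = random_rotation(rng), rng.normal(size=3)
    before = toy_field(to_canonical(x, X))
    after = toy_field(to_canonical(x @ R.T + t, X @ R.T + t))
    print("max |difference|:", np.abs(before - after).max())
```

A PCA frame of this kind is ill-defined when eigenvalues coincide or the third moment vanishes, which happens for symmetric molecules, so a practical canonicalization would need a more careful tie-breaking rule than this sketch provides.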

Proposed Method

  • Represent molecules as continuous functions over 3D space using C-INR to ensure SE(3) invariance via canonical coordinates.
  • Construct a canonical frame Q(X) from rotation-equivariant features to map queries into a fixed canonical coordinate system.
  • Expose C-INR parameters through Structured Weight Tokenization to enable transformer-based processing of function parameters.
  • Train a Function Space Hyper-Network that generates C-INR parameters conditioned on a latent variable z, modeling a distribution over molecular functions end-to-end.
  • Train with task-specific losses (MD: SDF + Eikonal; Property: aggregated INR tokens for regression; Generation: density matching) and backpropagate through the INR and hyper-network.
  • At inference, generate molecular functions in function space and query for downstream tasks without per-instance optimization.
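
The molecular dynamics objective in the list above names a signed-distance (SDF) term and an Eikonal term without spelling them out. A common way to pair the two in the implicit-surface literature is sketched below. The L1 data term, the weight \lambda, and the sampling distributions \mathcal{P} and \mathcal{P}' are assumptions made here for illustration, not details confirmed by this review.

```latex
\mathcal{L}_{\mathrm{MD}}
  = \mathbb{E}_{x \sim \mathcal{P}}
      \big[\, \lvert f_\theta(x) - s(x) \rvert \,\big]
  + \lambda \, \mathbb{E}_{x \sim \mathcal{P}'}
      \big[\, \big( \lVert \nabla_x f_\theta(x) \rVert_2 - 1 \big)^2 \,\big]
```

Here f_\theta denotes the implicit network queried in canonical coordinates and s(x) the reference signed distance at x. The second term encourages a unit-norm gradient, which is a property of any true signed distance function wherever it is differentiable.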

Experimental Results

Research Questions

  • RQ1: Can molecules be effectively represented as SE(3)-invariant continuous functions over 3D space rather than discrete structures?
  • RQ2: Does learning distributions over molecular functions via a hyper-network improve generalization across molecular dynamics and property prediction tasks?
  • RQ3: Does a canonical implicit representation combined with structured tokenization enable robust, discretization-agnostic learning?
  • RQ4: Can function-space representations improve data efficiency and long-horizon predictive performance in molecular dynamics?
  • RQ5: Is there a measurable link between INR reconstruction fidelity and downstream property prediction accuracy?

Key Findings

  • MolField achieves the best average performance in molecular dynamics surface reconstruction across multiple trajectories.
  • MolField yields lower MAE for spatial-property targets on QM9, notably HOMO-related and polarizability-related properties, while remaining competitive on other targets.
  • Ablation studies show removing C-INR, SWT, or FSHN components degrades performance, underscoring the importance of joint design.
  • MolField demonstrates improved data efficiency, sustaining performance with less training data due to the amortized function-space prior.
  • Long-horizon predictions in MolField are more accurate and stable than per-trajectory implicit nets, indicating better temporal generalization.
  • Function fidelity (INR reconstruction loss) correlates with downstream property errors, and pretraining INR on molecular generation strengthens this relationship.

This review was generated by AI and checked by a human editor.