QUICK REVIEW

[Paper Review] MolecularRNN: Generating realistic molecular graphs with optimized properties

Mariya Popova, Mykhailo Shvets|arXiv (Cornell University)|May 30, 2019
Computational Drug Discovery Methods | 67 citations
TL;DR

MolecularRNN is a graph-recurrent model that generates realistic molecular graphs, achieves 100% validity with valency-based rejection sampling, and optimizes properties via policy gradient reinforcement learning.

ABSTRACT

Designing new molecules with a set of predefined properties is a core problem in modern drug discovery and development. There is a growing need for de-novo design methods that would address this problem. We present MolecularRNN, the graph recurrent generative model for molecular structures. Our model generates diverse realistic molecular graphs after likelihood pretraining on a big database of molecules. We perform an analysis of our pretrained models on large-scale generated datasets of 1 million samples. Further, the model is tuned with policy gradient algorithm, provided a critic that estimates the reward for the property of interest. We show a significant distribution shift to the desired range for lipophilicity, drug-likeness, and melting point outperforming state-of-the-art works. With the use of rejection sampling based on valency constraints, our model yields 100% validity. Moreover, we show that invalid molecules provide a rich signal to the model through the use of structure penalty in our reinforcement learning pipeline.

Motivation & Objective

  • Develop a graph-based generator for molecular structures that directly models atoms as nodes and bonds as edges.
  • Ensure chemical validity through valency-based constraints during inference and training.
  • Enable optimization of molecular properties (e.g., logP, QED, melting point) via reinforcement learning with a critic.
  • Demonstrate scalability by large-scale generation and compare against state-of-the-art methods.
  • Provide extensive empirical analysis across diverse datasets to benchmark generation quality and property shifts.

Proposed method

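  • Extend GraphRNN to molecular graphs by predicting atom types (C_i^π) and bond orders (entries of the adjacency vector S_i^π take values in {0,1,2,3}, with 0 denoting no bond and 1 to 3 denoting single, double, and triple bonds).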
  • Use BFS node ordering to reduce complexity and generate graphs with NodeRNN and EdgeRNN components.
  • Apply valency-based rejection sampling during edge sampling: a sampled bond is rejected if it would exceed the allowed valency of either atom it connects.
  • Pretrain by likelihood maximization on large molecular datasets (ChEMBL, ZINC, MOSES) to learn a realistic distribution over molecules.
  • Optionally, apply a structural penalty to invalid molecules in the reinforcement learning pipeline, so that valency violations also provide a training signal.
  • Optimize generated molecules with policy gradient reinforcement learning using a critic to estimate property-based rewards (e.g., penalized logP, QED, melting point).
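The valency-based rejection step can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the simplified valency table, and the hand-written bond probabilities (which in MolecularRNN would come from EdgeRNN) are all assumptions made for illustration.

```python
import random

# Simplified maximum valencies for a few common atoms (illustrative assumption).
MAX_VALENCY = {"C": 4, "N": 3, "O": 2, "F": 1}


def sample_bond_order(probs, free_new, free_old, rng, max_tries=100):
    """Sample a bond order in {0, 1, 2, 3}, rejecting valency violations.

    probs: four weights for no bond, single, double, and triple bonds.
    free_new / free_old: unused valency on each of the two endpoint atoms.
    """
    limit = min(free_new, free_old)
    for _ in range(max_tries):
        order = rng.choices([0, 1, 2, 3], weights=probs, k=1)[0]
        if order <= limit:
            return order  # accepted: neither atom exceeds its valency
    return 0  # fall back to "no bond", which never violates valency


def connect_new_atom(new_atom, prev_atoms, used_valency, edge_probs, rng):
    """Sample bonds from a new atom to each earlier atom under valency limits.

    used_valency is updated in place as bonds are accepted.
    """
    used_new = 0
    bonds = []
    for j, atom in enumerate(prev_atoms):
        order = sample_bond_order(
            edge_probs[j],
            MAX_VALENCY[new_atom] - used_new,
            MAX_VALENCY[atom] - used_valency[j],
            rng,
        )
        used_new += order
        used_valency[j] += order
        bonds.append(order)
    return bonds


rng = random.Random(0)
prev_atoms = ["C", "O"]
used = [3, 1]  # each earlier atom has exactly one free valency left
probs = [[0.1, 0.3, 0.3, 0.3], [0.1, 0.3, 0.3, 0.3]]
bonds = connect_new_atom("N", prev_atoms, used, probs, rng)
```

Because every accepted bond respects the remaining valency of both endpoints, the partial graph stays valency-consistent at each step, which is the mechanism behind the 100% validity figure reported above.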

Experimental results

Research questions

  • RQ1: Can molecular graphs be generated directly with node/edge type predictions to produce valid, diverse, and novel molecules?
  • RQ2: Does valency-based rejection sampling guarantee 100% validity during inference without sacrificing diversity or quality?
  • RQ3: Can policy-gradient-based optimization shift the distribution of generated molecules toward desirable properties (logP, QED, melting point)?
  • RQ4: What is the impact of using a structural penalty during training on validity and chemical realism?
  • RQ5: How does MolecularRNN compare to state-of-the-art graph- and SMILES-based generators on large-scale benchmarks?

Key findings

  • 100% validity achieved with valency-based rejection sampling during inference.
  • Unsupervised likelihood pretraining on large datasets yields high validity, uniqueness, novelty, and internal diversity across 1 million samples.
  • MolecularRNN achieves competitive validity/uniqueness/novelty compared with GCPN and JT-VAE on 30k samples.
  • Policy-gradient optimization shifts property distributions toward target ranges for penalized logP and QED, outperforming baselines.
  • Melting temperature optimization demonstrates the model can optimize a property not directly derivable from the graph, via a learned predictor as a critic.
  • The structural penalty applied to invalid molecules provides a training signal during reinforcement learning that improves validity and chemical realism.
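The interaction between policy-gradient optimization and the structural penalty can be sketched as a REINFORCE-style surrogate loss. This is a rough illustration under stated assumptions, not the paper's exact formulation: the discount factor, the penalty value of -1.0, and all function names are hypothetical, and the property score stands in for whatever the critic returns.

```python
import math


def discounted_returns(final_reward, num_steps, gamma=0.97):
    """Spread a terminal reward over steps: G_t = gamma^(T-1-t) * reward."""
    return [final_reward * gamma ** (num_steps - 1 - t) for t in range(num_steps)]


def molecule_reward(is_valid, property_score, structure_penalty=-1.0):
    """Use the critic's score for valid molecules, a fixed penalty otherwise."""
    return property_score if is_valid else structure_penalty


def reinforce_loss(log_probs, final_reward, gamma=0.97):
    """Surrogate loss -sum_t log pi(a_t) * G_t for one generated molecule."""
    returns = discounted_returns(final_reward, len(log_probs), gamma)
    return -sum(lp * g for lp, g in zip(log_probs, returns))


# Log-probabilities of the actions taken while generating one molecule.
log_probs = [math.log(0.5), math.log(0.25), math.log(0.8)]

loss_valid = reinforce_loss(log_probs, molecule_reward(True, 2.0))
loss_invalid = reinforce_loss(log_probs, molecule_reward(False, 2.0))
```

Minimizing this loss raises the probability of action sequences that earned a positive reward and lowers it for sequences that ended in an invalid structure, so invalid samples still contribute a gradient instead of being discarded.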


This review was created by AI and reviewed by human editors.