QUICK REVIEW

[Paper Review] DISCOVER: A Physics-Informed, GPU-Accelerated Symbolic Regression Framework

Udaykumar Gajera, Mohsen Sotoudeh | arXiv (Cornell University) | Jan 27, 2026
Machine Learning in Materials Science | Citations: 0
ONE-LINE SUMMARY

DISCOVER is an open-source Python-native symbolic regression framework that incorporates physics-informed constraints and optional GPU acceleration to enable scalable, interpretable model discovery in physics, chemistry, and materials science.

ABSTRACT

Symbolic Regression (SR) enables the discovery of interpretable mathematical relationships from experimental and simulation data. These relationships are often termed descriptors: fundamental material properties that correlate directly with a desired or undesired functional property of the material. Although established approaches such as Sure Independence Screening and Sparsifying Operator (SISSO) have successfully identified low-dimensional descriptors within large feature spaces, many existing SR tools integrate poorly with modern Python workflows, offer limited control over the symbolic search space, or struggle with the computational demands of large-scale studies. This paper introduces DISCOVER (Data-Informed Symbolic Combination of Operators for Variable Equation Regression), an open-source symbolic regression package developed to address these challenges through a modular, physics-motivated design. DISCOVER allows users to guide the symbolic search using domain knowledge, constrain the feature space explicitly, and use optional GPU acceleration to improve computational efficiency in data-intensive workflows, enabling reproducible and scalable SR. The software is intended for applications in computational physics, computational chemistry, and materials science, where interpretability, physical consistency, and execution time are especially important, and it complements general-purpose SR frameworks by emphasizing the discovery of physically meaningful models.

MOTIVATION AND OBJECTIVES

  • Enable guided discovery of interpretable symbolic expressions from data in scientific domains.
  • Incorporate domain knowledge through physics-informed constraints and dimensional analysis.
  • Provide a modular, Python-native design with optional GPU acceleration for large-scale studies.

PROPOSED METHOD

  • Generates candidate symbolic expressions from user-provided features and operator libraries.
  • Evaluates expressions against target data to identify sparse, parsimonious models.
  • Implements multiple sparsifying search strategies (e.g., OMP, MIQP, Simulated Annealing).
  • Enforces physics-informed constraints via a configuration-based interface and dimensional analysis with the pint library.
  • Supports GPU acceleration on NVIDIA CUDA and Apple Metal, with CPU execution as a fallback.
  • Frames the search as an L0-regularized least-squares problem to find a sparse descriptor vector.
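
The steps listed above (generate candidates, score them against the target, select a sparse subset) can be sketched in a few lines. The block below is only an illustration under stated assumptions, not DISCOVER's implementation or API: the function names `build_candidates`, `solve_least_squares`, and `omp` are hypothetical, the operator library is reduced to squares and pairwise products, no intercept is fitted, and plain Python lists are used to keep the example dependency-free. It treats the L0-regularized problem (minimize ||y - Dc||^2 subject to at most k nonzero entries in c) with a greedy Orthogonal Matching Pursuit, one of the search strategies the review names.

```python
import itertools
import math


def build_candidates(features):
    """Expand primary features with a tiny operator library (square, product).

    `features` maps a feature name to its list of sample values.
    Returns a dict mapping expression strings to evaluated columns.
    """
    cands = dict(features)
    for name, col in features.items():
        cands[f"({name})^2"] = [v * v for v in col]
    for (a, ca), (b, cb) in itertools.combinations(features.items(), 2):
        cands[f"({a}*{b})"] = [x * y for x, y in zip(ca, cb)]
    return cands


def solve_least_squares(columns, y):
    """Least-squares coefficients via the normal equations and Gauss-Jordan.

    Assumes the selected columns are linearly independent.
    """
    k = len(columns)
    # Augmented matrix [A^T A | A^T y].
    m = [
        [sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
        + [sum(p * t for p, t in zip(columns[i], y))]
        for i in range(k)
    ]
    for i in range(k):
        pivot = max(range(i, k), key=lambda r: abs(m[r][i]))
        m[i], m[pivot] = m[pivot], m[i]
        for r in range(k):
            if r != i:
                factor = m[r][i] / m[i][i]
                m[r] = [a - factor * b for a, b in zip(m[r], m[i])]
    return [m[i][k] / m[i][i] for i in range(k)]


def correlation(col, residual):
    """Absolute inner product with the residual, normalized by column length."""
    norm = math.sqrt(sum(v * v for v in col))
    if norm == 0.0:
        return 0.0
    return abs(sum(c * r for c, r in zip(col, residual))) / norm


def omp(cands, y, n_terms):
    """Greedy OMP: add the best-correlated candidate, then refit all terms."""
    selected, coefs, residual = [], [], list(y)
    for _ in range(min(n_terms, len(cands))):
        remaining = [name for name in cands if name not in selected]
        best = max(remaining, key=lambda name: correlation(cands[name], residual))
        selected.append(best)
        cols = [cands[name] for name in selected]
        coefs = solve_least_squares(cols, y)
        residual = [
            t - sum(c * col[i] for c, col in zip(coefs, cols))
            for i, t in enumerate(y)
        ]
    return dict(zip(selected, coefs)), residual


# Usage: recover a one-term descriptor from synthetic data.
feats = {"a": [1.0, 2.0, 3.0, 4.0, 5.0], "b": [2.0, 1.0, 4.0, 3.0, 6.0]}
cands = build_candidates(feats)
y = [2.0 * v for v in cands["(a*b)"]]
model, residual = omp(cands, y, n_terms=1)
print(model)
```

Greedy selection is cheap but not guaranteed to find the globally sparsest model when candidates are strongly correlated, which is presumably why exact (MIQP) and stochastic (simulated annealing) alternatives are offered alongside it.
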
Figure 1: Overview of the DISCOVER workflow, illustrating iterative feature generation, physics-informed screening, and sparse model selection.

EXPERIMENTAL RESULTS

RESEARCH QUESTIONS

  • RQ1: How can user-defined physical constraints and dimensional consistency guide the symbolic regression search to produce physically meaningful models?
  • RQ2: What is the impact of hardware acceleration on the efficiency of constrained symbolic regression for large feature spaces?
  • RQ3: Can DISCOVER balance predictive accuracy with model interpretability through configurable sparsity and operator constraints?

KEY FINDINGS

  • Provides a Python-native SR framework that supports physics-informed constraints and hardware-accelerated computation.
  • Offers modular search strategies including heuristic, optimization-based, and stochastic approaches for sparse model discovery.
  • Integrates dimensional consistency via the pint library to prune physically invalid expressions early in the search.
  • Demonstrates scalable symbolic regression workflows on CPUs and GPUs for data-intensive, science-oriented applications.
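
To make the idea of dimensional pruning concrete, here is a minimal sketch of how physically invalid combinations can be discarded before any fitting takes place. According to the review, DISCOVER delegates unit handling to the pint library; this example deliberately does not use pint and instead represents each dimension as a hypothetical tuple of exponents over (length, mass, time). The names `Dim`, `mul`, `div`, `add`, and `generate_valid` are illustrative assumptions and do not correspond to DISCOVER's interface.

```python
from typing import Dict, Optional, Tuple

# Exponents of (length, mass, time); e.g. a velocity would be (1, 0, -1).
Dim = Tuple[int, ...]


def mul(a: Dim, b: Dim) -> Dim:
    """Multiplying quantities adds their dimension exponents."""
    return tuple(x + y for x, y in zip(a, b))


def div(a: Dim, b: Dim) -> Dim:
    """Dividing quantities subtracts their dimension exponents."""
    return tuple(x - y for x, y in zip(a, b))


def add(a: Dim, b: Dim) -> Optional[Dim]:
    """Addition is only meaningful for identical dimensions; None means invalid."""
    return a if a == b else None


def generate_valid(features: Dict[str, Dim]) -> Dict[str, Dim]:
    """Build pairwise expressions, keeping only dimensionally consistent ones."""
    valid: Dict[str, Dim] = {}
    names = sorted(features)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            valid[f"({a}*{b})"] = mul(features[a], features[b])
            valid[f"({a}/{b})"] = div(features[a], features[b])
            total = add(features[a], features[b])
            if total is not None:  # skip sums such as length + mass
                valid[f"({a}+{b})"] = total
    return valid


# Usage: two lengths and one mass.
features = {
    "bond_length": (1, 0, 0),
    "radius": (1, 0, 0),
    "mass": (0, 1, 0),
}
for expr, dim in generate_valid(features).items():
    print(expr, dim)
```

Rejecting a sum at generation time means its column is never evaluated or scored, so the saving compounds as expressions are combined over further iterations.
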


This review was generated by AI and checked by a human editor.