QUICK REVIEW

[Paper Review] Information-Theoretic Multi-Model Fusion for Target-Oriented Adaptive Sampling in Materials Design

Yixuan Zhang, Zhiyuan Li|TUbilio (Technical University of Darmstadt)|Feb 3, 2026
Machine Learning in Materials Science | Citations: 0
One-Sentence Summary

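An information-theoretic framework for target-oriented adaptive sampling with multi-model fusion navigates high-dimensional, data-scarce materials design problems by focusing on entropy-reducing trajectories rather than modeling the full response surface.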

ABSTRACT

Target-oriented discovery under limited evaluation budgets requires making reliable progress in high-dimensional, heterogeneous design spaces where each new measurement is costly, whether experimental or high-fidelity simulation. We present an information-theoretic framework for target-oriented adaptive sampling that reframes optimization as trajectory discovery: instead of approximating the full response surface, the method maintains and refines a low-entropy information state that concentrates search on target-relevant directions. The approach couples data, model beliefs, and physics/structure priors through dimension-aware information budgeting, adaptive bootstrapped distillation over a heterogeneous surrogate reservoir, and structure-aware candidate manifold analysis with Kalman-inspired multi-model fusion to balance consensus-driven exploitation and disagreement-driven exploration. Evaluated under a single unified protocol without dataset-specific tuning, the framework improves sample efficiency and reliability across 14 single- and multi-objective materials design tasks spanning candidate pools from $600$ to $4 \times 10^6$ and feature dimensions from $10$ to $10^3$, typically reaching top-performing regions within 100 evaluations. Complementary 20-dimensional synthetic benchmarks (Ackley, Rastrigin, Schwefel) further demonstrate robustness to rugged and multimodal landscapes.
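
To make the phrase "low-entropy information state" concrete, the following is a minimal sketch and not the authors' formulation. It assumes the information state can be represented as a probability distribution over a handful of candidates, obtained here with a softmax over hypothetical surrogate scores; the names `softmax`, `shannon_entropy`, and the score values are illustrative assumptions.

```python
import math

def softmax(scores, temperature=1.0):
    """Turn candidate scores into a probability distribution (a toy 'information state')."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def shannon_entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# Hypothetical surrogate scores for five candidates (higher = closer to the target).
scores = [0.1, 0.4, 0.35, 0.9, 0.2]

# A high temperature gives a broad belief; a low temperature gives a concentrated one.
broad = softmax(scores, temperature=10.0)
focused = softmax(scores, temperature=0.05)

h_broad = shannon_entropy(broad)
h_focused = shannon_entropy(focused)
print(f"broad belief entropy:   {h_broad:.4f} nats")
print(f"focused belief entropy: {h_focused:.4f} nats")
```

The entropy of the broad belief is close to its upper bound of log(5) nats, while the focused belief puts almost all mass on one candidate and has entropy near zero. Under these assumptions, "refining a low-entropy state" corresponds to moving from the first kind of distribution toward the second.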

Motivation and Objectives

  • Frame optimization as trajectory discovery by maintaining a low-entropy information state toward target directions.
  • Integrate data, model beliefs, and physics priors through dimension-aware information budgeting.
  • Use a heterogeneous surrogate reservoir with adaptive bootstrapped distillation.
  • Apply structure-aware candidate analysis and Kalman-inspired multi-model fusion for balanced exploration/exploitation.
  • Demonstrate robustness under a unified data-scarce protocol across diverse materials design tasks.
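
The bootstrapping mentioned in the list above can be illustrated with a generic sketch. It is not the paper's adaptive bootstrapped distillation: it only shows plain bootstrap resampling, where each round trains on a sample drawn with replacement and scores the model on the points left out (the out-of-bag set). The one-dimensional least-squares surrogate, the toy data, and all function names are assumptions made for illustration.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b; returns None if x has no spread."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:
        return None
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx

def r_squared(y_true, y_pred):
    """Coefficient of determination; returns None when y_true is constant."""
    mean = sum(y_true) / len(y_true)
    ss_tot = sum((y - mean) ** 2 for y in y_true)
    if ss_tot == 0.0:
        return None
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1.0 - ss_res / ss_tot

def oob_r2_scores(xs, ys, n_rounds=20, seed=0):
    """Fit on bootstrap samples and score each fit on its out-of-bag points."""
    rng = random.Random(seed)
    n = len(xs)
    scores = []
    for _ in range(n_rounds):
        in_bag = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        in_bag_set = set(in_bag)
        oob = [i for i in range(n) if i not in in_bag_set]
        if len(oob) < 2:
            continue  # too few held-out points to score this round
        model = fit_line([xs[i] for i in in_bag], [ys[i] for i in in_bag])
        if model is None:
            continue
        a, b = model
        score = r_squared([ys[i] for i in oob], [a * xs[i] + b for i in oob])
        if score is not None:
            scores.append(score)
    return scores

# Toy, noise-free data: a linear surrogate should generalize perfectly here.
xs = [i / 10 for i in range(30)]
ys = [2.0 * x + 1.0 for x in xs]
scores = oob_r2_scores(xs, ys)
print(f"rounds scored: {len(scores)}, worst OOB R^2: {min(scores):.6f}")
```

Because the held-out points are never seen during fitting, the out-of-bag score gives a generalization estimate without a separate validation set, which matters when every evaluation is costly.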

Proposed Method

  • Dimension-aware capacity alignment to estimate effective intrinsic dimension and adapt hyperparameters.
  • Target-conditioned surrogate shaping with heterogeneous models trained via importance sampling on high-value regions.
  • Structure-aware candidate organization to group candidates and estimate redundancy and feasible variation without explicit embedding.
  • Multi-source fusion using Kalman-like logic to arbitrate exploitation and disagreement-driven exploration (KF and rKF).
  • Out-of-bag diagnostics for model generalization and calibration, including R^2 and ELPD metrics.
  • Information-theoretic objective: maximize mutual information between data/model/physics triplet and the design target.
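
The review does not spell out the KF and rKF update rules, so the following is only a sketch of the general idea behind Kalman-like fusion. It assumes each surrogate reports a mean and a variance for a candidate, treats the estimates as independent, and combines them by precision weighting, which is what a sequence of scalar Kalman measurement updates reduces to under a flat prior. The `disagreement` term and the `kappa` exploration bonus are hypothetical choices, not taken from the paper.

```python
import math

def fuse(means, variances):
    """Precision-weighted fusion of independent Gaussian estimates of one quantity.

    Returns (fused_mean, fused_variance, disagreement), where disagreement is the
    precision-weighted spread of the individual means around the fused mean.
    """
    precisions = [1.0 / v for v in variances]
    total = sum(precisions)
    fused_mean = sum(p * m for p, m in zip(precisions, means)) / total
    fused_var = 1.0 / total
    disagreement = sum(
        p * (m - fused_mean) ** 2 for p, m in zip(precisions, means)
    ) / total
    return fused_mean, fused_var, disagreement

def score(means, variances, kappa=1.0):
    """Hypothetical acquisition score: consensus value plus a disagreement bonus."""
    fused_mean, _, disagreement = fuse(means, variances)
    return fused_mean + kappa * math.sqrt(disagreement)

# Three models that agree: the fused estimate is confident and disagreement is zero.
agree = fuse([1.0, 1.0, 1.0], [0.5, 1.0, 2.0])

# Two equally confident models that disagree: same fused mean, large disagreement.
split = fuse([0.0, 2.0], [1.0, 1.0])

print("agree:", agree)
print("split:", split)
print("score for the split candidate:", score([0.0, 2.0], [1.0, 1.0]))
```

In this toy setup both candidates have a fused mean of 1.0, but only the second earns an exploration bonus. That is one simple way to read "consensus-driven exploitation and disagreement-driven exploration": rank by the fused mean when models agree, and raise the priority of candidates where they do not.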

Experimental Results

Research Questions

  • RQ1: Can target-oriented adaptive sampling reduce evaluations by concentrating search on target-relevant trajectories in high-dimensional spaces?
  • RQ2: Does a heterogeneous model ensemble with dimension-aware budgeting improve reliability and sample efficiency compared to single-model Bayesian optimization in data-scarce regimes?
  • RQ3: How do structure-aware candidate analysis and Kalman-inspired fusion balance exploitation and exploration across diverse landscapes?
  • RQ4: Is the proposed framework robust across single- and multi-objective materials design tasks with large candidate pools and high feature dimensionality?
  • RQ5: What are the internal information dynamics (entropy, bandwidth allocation, model disagreement) that accompany convergence to target regions?

Key Findings

  • The framework improves sample efficiency and reliability across 14 single- and multi-objective materials design tasks with candidate pools from 600 to 4,000,000 and feature dimensions from 10 to 10^3.
  • Typically reaches top-performing regions within 100 evaluations.
  • Complementary 20-dimensional synthetic benchmarks (Ackley, Rastrigin, Schwefel) demonstrate robustness to rugged and multimodal landscapes.
  • Under a unified protocol without dataset-specific tuning, it avoids heavy problem-specific acquisition engineering and hyperparameter tuning.
  • The approach integrates physical priors, model diversity, and uncertainty into an information-centric control loop to enable adaptive sampling decisions.
  • Results indicate consistent convergence patterns across smooth, multimodal, and deceptive landscapes, with deliberate exploration followed by contraction toward target manifolds.
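
For readers unfamiliar with the synthetic benchmarks named above, here are the three functions in their commonly used textbook forms. The constants (Ackley with a = 20, b = 0.2, c = 2π; Schwefel with offset 418.9829 per dimension) are the standard ones and are assumed here; the review does not state which variants or search domains the paper used.

```python
import math

def ackley(x, a=20.0, b=0.2, c=2.0 * math.pi):
    """Ackley function: nearly flat outer region with one deep basin at the origin."""
    d = len(x)
    mean_sq = sum(xi * xi for xi in x) / d
    mean_cos = sum(math.cos(c * xi) for xi in x) / d
    return -a * math.exp(-b * math.sqrt(mean_sq)) - math.exp(mean_cos) + a + math.e

def rastrigin(x):
    """Rastrigin function: a quadratic bowl overlaid with a regular grid of local minima."""
    return 10.0 * len(x) + sum(
        xi * xi - 10.0 * math.cos(2.0 * math.pi * xi) for xi in x
    )

def schwefel(x):
    """Schwefel function: deceptive, with the global minimum far from the next-best minima."""
    return 418.9829 * len(x) - sum(
        xi * math.sin(math.sqrt(abs(xi))) for xi in x
    )

dim = 20  # matches the 20-dimensional setting mentioned in the review

# Known global minimizers: the origin for Ackley and Rastrigin,
# and approximately 420.9687 in every coordinate for Schwefel.
print("ackley at origin:   ", ackley([0.0] * dim))
print("rastrigin at origin:", rastrigin([0.0] * dim))
print("schwefel at optimum:", schwefel([420.968746] * dim))
```

All three are minimization problems whose global minimum value is zero, up to floating-point and constant-rounding error. Ackley and Rastrigin are multimodal around a central optimum, while Schwefel is the deceptive case: its best region lies near the edge of the usual search domain.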

This review was generated by AI and checked by a human editor.