QUICK REVIEW

[Paper Review] Information-Theoretic Multi-Model Fusion for Target-Oriented Adaptive Sampling in Materials Design

Yixuan Zhang, Zhiyuan Li|TUbilio (Technical University of Darmstadt)|Feb 3, 2026

Machine Learning in Materials Science|Cited by 0

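One-Sentence Summary

An information-theoretic framework for target-oriented adaptive sampling uses multi-model fusion to navigate high-dimensional, data-scarce materials design problems, focusing on entropy-reducing trajectories rather than modeling the global response surface.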

ABSTRACT

Target-oriented discovery under limited evaluation budgets requires making reliable progress in high-dimensional, heterogeneous design spaces where each new measurement is costly, whether experimental or high-fidelity simulation. We present an information-theoretic framework for target-oriented adaptive sampling that reframes optimization as trajectory discovery: instead of approximating the full response surface, the method maintains and refines a low-entropy information state that concentrates search on target-relevant directions. The approach couples data, model beliefs, and physics/structure priors through dimension-aware information budgeting, adaptive bootstrapped distillation over a heterogeneous surrogate reservoir, and structure-aware candidate manifold analysis with Kalman-inspired multi-model fusion to balance consensus-driven exploitation and disagreement-driven exploration. Evaluated under a single unified protocol without dataset-specific tuning, the framework improves sample efficiency and reliability across 14 single- and multi-objective materials design tasks spanning candidate pools from $600$ to $4 \times 10^6$ and feature dimensions from $10$ to $10^3$, typically reaching top-performing regions within 100 evaluations. Complementary 20-dimensional synthetic benchmarks (Ackley, Rastrigin, Schwefel) further demonstrate robustness to rugged and multimodal landscapes.

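Research Motivation and Goals

Reframe optimization as trajectory discovery by maintaining a low-entropy information state that concentrates search on target-relevant directions.
Couple data, model beliefs, and physics priors through dimension-aware information budgeting.
Apply adaptive bootstrapped distillation over a heterogeneous surrogate reservoir.
Apply structure-aware candidate analysis and Kalman-inspired multi-model fusion to balance exploration and exploitation.
Demonstrate robustness across diverse materials design tasks under a single unified, data-scarce protocol.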

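Proposed Method

Dimension-aware capacity alignment, which estimates the effective intrinsic dimension and adapts hyperparameters accordingly.
Target-conditioned surrogate shaping, which uses importance sampling with heterogeneous models trained on high-value regions.
Structure-aware candidate organization, which groups candidates and estimates redundancy and feasible variability without an explicit embedding.
Multi-source fusion with Kalman-like logic (KF and rKF) to arbitrate between exploitation and disagreement-driven exploration.
Out-of-bag diagnostics for model generalization and calibration, including R^2 and ELPD metrics.
Information-theoretic objective: maximize the mutual information between the data/model/physics triplet and the design target.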
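
To make the fusion step more concrete, here is a minimal sketch of inverse-variance (Kalman-style) fusion over several surrogate predictions, with across-model disagreement used as an exploration bonus. This is an illustration under stated assumptions, not the paper's KF/rKF procedure, whose details are not given in this review. The function names `fuse_predictions` and `acquisition`, the weight `kappa`, and the toy numbers are all hypothetical.

```python
import numpy as np

def fuse_predictions(means, variances):
    """Inverse-variance fusion of per-model predictions.

    means, variances: arrays of shape (n_models, n_candidates).
    Returns the fused mean, the fused variance, and the across-model
    standard deviation of the means (used here as a disagreement signal).
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    precisions = 1.0 / variances                    # confident models weigh more
    fused_var = 1.0 / precisions.sum(axis=0)
    fused_mean = fused_var * (precisions * means).sum(axis=0)
    disagreement = means.std(axis=0)
    return fused_mean, fused_var, disagreement

def acquisition(means, variances, kappa=1.0):
    """Hypothetical score: consensus (fused mean) plus kappa * disagreement."""
    fused_mean, _, disagreement = fuse_predictions(means, variances)
    return fused_mean + kappa * disagreement

# Toy example: two surrogate models scoring three candidates (assumed values).
means = np.array([[1.0, 2.0, 0.5],
                  [1.2, 0.0, 0.6]])
variances = np.array([[0.1, 0.5, 0.2],
                      [0.1, 0.5, 0.2]])

fused_mean, fused_var, disagreement = fuse_predictions(means, variances)
exploit_pick = int(np.argmax(acquisition(means, variances, kappa=0.0)))
explore_pick = int(np.argmax(acquisition(means, variances, kappa=1.0)))
print(exploit_pick, explore_pick)
```

In this toy setup, `kappa=0.0` selects the candidate the models agree is good, while `kappa=1.0` shifts the choice toward the candidate where the models disagree most, which is the trade-off the method is described as arbitrating.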

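Experimental Results

Research Questions

RQ1: Can target-oriented adaptive sampling reduce the number of evaluations by concentrating search on target-relevant trajectories in high-dimensional spaces?
RQ2: In data-scarce settings, does a heterogeneous model ensemble with dimension-aware budgeting improve reliability and sample efficiency compared with single-model Bayesian optimization?
RQ3: How do structure-aware candidate analysis and Kalman-inspired fusion balance exploitation and exploration across diverse landscapes?
RQ4: Is the proposed framework robust on single- and multi-objective materials design tasks with large candidate pools and high feature dimensions?
RQ5: What do the internal information dynamics (entropy, bandwidth allocation, model disagreement) look like as the search converges to the target region?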
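
One simple way to track the entropy quantity mentioned in RQ5 is the Shannon entropy of a normalized belief over the candidate pool. The sketch below assumes a softmax belief over surrogate scores; the function name `belief_entropy` and the `temperature` parameter are hypothetical, and the paper may define its information state differently.

```python
import numpy as np

def belief_entropy(scores, temperature=1.0):
    """Shannon entropy (in nats) of a softmax belief over candidate scores."""
    z = np.asarray(scores, dtype=float) / temperature
    z = z - z.max()                       # shift for numerical stability
    p = np.exp(z)
    p /= p.sum()
    # The tiny constant guards log(0) when a probability underflows to zero.
    return float(-np.sum(p * np.log(p + 1e-300)))

# A flat belief over 8 candidates has maximal entropy, log(8).
flat = belief_entropy(np.zeros(8))
# As one candidate's score pulls ahead, the belief concentrates and entropy drops.
mild = belief_entropy([0.0, 0.0, 0.0, 1.0])
sharp = belief_entropy([0.0, 0.0, 0.0, 10.0])
print(flat, mild, sharp)
```

Logged once per sampling round, a decreasing value of this kind would indicate that the search is concentrating on a smaller set of candidates.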

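Key Findings

The framework improves sample efficiency and reliability across 14 single- and multi-objective materials design tasks, with candidate pools ranging from 600 to 4,000,000 and feature dimensions from 10 to 10^3.
It typically reaches top-performing regions within 100 evaluations.
Complementary 20-dimensional synthetic benchmarks (Ackley, Rastrigin, Schwefel) demonstrate robustness to rugged and multimodal landscapes.
It runs under a single unified protocol without dataset-specific tuning, avoiding extensive problem-specific acquisition-function engineering and hyperparameter tuning.
The method integrates physics priors, model diversity, and uncertainty into an information-centric control loop for adaptive sampling decisions.
Results show consistent convergence patterns on smooth, multimodal, and deceptive landscapes: deliberate exploration first, followed by contraction onto the target manifold.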
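
For reference, the three synthetic benchmarks named above can be written in a few lines. The sketch below uses the standard textbook forms (minimization, global minimum value 0); the exact variants, search domains, and any shifts or rotations used in the paper are not stated in this review, so treat these definitions as assumptions.

```python
import numpy as np

def ackley(x):
    """Ackley function; global minimum 0 at x = 0."""
    x = np.asarray(x, dtype=float)
    d = x.size
    term1 = -20.0 * np.exp(-0.2 * np.sqrt(np.sum(x**2) / d))
    term2 = -np.exp(np.sum(np.cos(2.0 * np.pi * x)) / d)
    return term1 + term2 + 20.0 + np.e

def rastrigin(x):
    """Rastrigin function; global minimum 0 at x = 0."""
    x = np.asarray(x, dtype=float)
    return 10.0 * x.size + np.sum(x**2 - 10.0 * np.cos(2.0 * np.pi * x))

def schwefel(x):
    """Schwefel function; global minimum near 0 at x_i = 420.9687."""
    x = np.asarray(x, dtype=float)
    return 418.9829 * x.size - np.sum(x * np.sin(np.sqrt(np.abs(x))))

dim = 20  # matches the 20-dimensional setting mentioned above
at_origin = (ackley(np.zeros(dim)), rastrigin(np.zeros(dim)))
at_schwefel_opt = schwefel(np.full(dim, 420.9687))

# A random point is typically far from optimal, which is what makes
# these rugged, multimodal landscapes hard under a small budget.
rng = np.random.default_rng(0)
x_rand = rng.uniform(-5.0, 5.0, size=dim)
print(at_origin, at_schwefel_opt, rastrigin(x_rand))
```

All three have many local minima in 20 dimensions, so reaching a near-optimal region within a fixed evaluation budget is a meaningful test of an adaptive sampler.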


This review was generated by AI and reviewed by a human editor.