QUICK REVIEW

[Paper Review] Curriculum learning for data-driven modeling of dynamical systems

Michele Alessandro Bucci, Onofrio Semeraro|arXiv (Cornell University)|Dec 15, 2021
Simulation Techniques and Applications | Cited by 1
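One-sentence summary

This paper proposes a curriculum learning strategy, built on entropy-based structuring of the training data, for data-driven modeling of dynamical systems, with the aim of improving generalization and prediction accuracy when data are limited. By training first on simple, low-entropy trajectories (for example, trajectories near unstable fixed points) and then moving gradually to complex, chaotic regions, the method achieves robust long-term predictions even with scarce data and outperforms standard training strategies.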

ABSTRACT

The reliable prediction of the temporal behavior of complex systems is key in numerous scientific fields. This strong interest is however hindered by modeling issues: often, the governing equations describing the physics of the system under consideration are not accessible or, if known, their solution might require a computational time incompatible with the prediction time constraints. Not surprisingly, approximating complex systems in a generic functional format and informing it ex-nihilo from available observations has become common practice in the age of machine learning, as illustrated by the numerous successful examples based on deep neural networks. However, generalizability of the models, margins of guarantee and the impact of data are often overlooked or examined mainly by relying on prior knowledge of the physics. We tackle these issues from a different viewpoint, by adopting a curriculum learning strategy. In curriculum learning, the dataset is structured such that the training process starts from simple samples towards more complex ones in order to favor convergence and generalization. The concept has been developed and successfully applied in robotics and control of systems. Here, we apply this concept for the learning of complex dynamical systems in a systematic way. First, leveraging insights from the ergodic theory, we assess the amount of data sufficient for a-priori guaranteeing a faithful model of the physical system and thoroughly investigate the impact of the training set and its structure on the quality of long-term predictions. Based on that, we consider entropy as a metric of complexity of the dataset; we show how an informed design of the training set based on the analysis of the entropy significantly improves the resulting models in terms of generalizability, and provide insights on the amount and the choice of data required for an effective data-driven modeling.

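Motivation and objectives

  • Address the challenge of reliable long-term prediction of complex dynamical systems when data are scarce or expensive to obtain.
  • Investigate whether ordering the data by a complexity metric improves generalization and convergence in data-driven modeling.
  • Use ergodic theory and Kac's lemma as theoretical bounds to determine the minimum amount of data needed for a faithful model.
  • Assess how the initial memory state of recurrent models such as LSTMs affects prediction performance.
  • Give practitioners of data-driven physical modeling evidence-based best practices.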

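Proposed method

  • Using ergodic theory and Kac's lemma, the authors estimate theoretically the minimum amount of data needed for a faithful model, based on the attractor dimension and the system dynamics.
  • Entropy is introduced as a complexity metric for ordering and structuring the training data: low-complexity, low-entropy trajectories (for example, those near unstable fixed points) come first, followed by high-entropy, chaotic regions.
  • A curriculum learning strategy is applied to an LSTM neural network by organizing the data in order of increasing entropy, so that learning progresses from simple to complex dynamics.
  • Training is evaluated systematically under different data sampling strategies, including short trajectories starting from the fixed points and full trajectories on the attractor.
  • The effect of LSTM memory initialization is analyzed by comparing random initialization with initialization from fixed-point trajectories.
  • The method is validated on the Lorenz '63 system, a canonical chaotic dynamical system, using time-series prediction and an estimate of the model's dimension as evaluation criteria.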
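The entropy-ordered curriculum described above can be illustrated with a minimal sketch. It is not the authors' implementation: the entropy here is assumed to be a Shannon entropy over a coarse histogram of visited states, which stands in for whatever estimator the paper uses. The RK4 integrator, the time step, the bin ranges, the perturbation sizes, and the trajectory names are likewise hypothetical choices made for illustration.

```python
import numpy as np

# Classical Lorenz '63 parameters.
SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0


def lorenz63(state):
    """Right-hand side of the Lorenz '63 equations."""
    x, y, z = state
    return np.array([SIGMA * (y - x), x * (RHO - z) - y, x * y - BETA * z])


def integrate(initial_state, n_steps, dt=0.01):
    """Fixed-step RK4 integration; returns an array of shape (n_steps + 1, 3)."""
    traj = np.empty((n_steps + 1, 3))
    traj[0] = initial_state
    for i in range(n_steps):
        s = traj[i]
        k1 = lorenz63(s)
        k2 = lorenz63(s + 0.5 * dt * k1)
        k3 = lorenz63(s + 0.5 * dt * k2)
        k4 = lorenz63(s + dt * k3)
        traj[i + 1] = s + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return traj


def shannon_entropy(traj, bins=20):
    """Shannon entropy of the occupied cells of a coarse 3D histogram.

    Assumption: this histogram estimate is only a stand-in complexity measure.
    The ranges below are hand-picked to cover the attractor.
    """
    ranges = [(-25.0, 25.0), (-30.0, 30.0), (0.0, 55.0)]
    counts, _ = np.histogramdd(traj, bins=bins, range=ranges)
    p = counts[counts > 0] / counts.sum()
    return float(np.sum(p * np.log(1.0 / p)))


# One of the two unstable fixed points: (c, c, rho - 1) with c = sqrt(beta * (rho - 1)).
c = np.sqrt(BETA * (RHO - 1.0))
fixed_point = np.array([c, c, RHO - 1.0])

# Hypothetical candidate training trajectories of equal length (501 samples each).
trajectories = {
    "near_fixed_point": integrate(fixed_point + 1e-3, 500),
    "spiral_out": integrate(fixed_point + 1.0, 500),
    # Discard a transient, then keep a segment lying on the attractor.
    "attractor": integrate(np.array([1.0, 1.0, 1.0]), 2000)[-501:],
}

# Curriculum: present trajectories to the learner in order of increasing entropy.
curriculum = sorted(trajectories, key=lambda name: shannon_entropy(trajectories[name]))
for name in curriculum:
    print(name, round(shannon_entropy(trajectories[name]), 3))
```

In a training loop, the sorted names would set the order in which batches are drawn, for example by fitting the LSTM on the first stage and then adding later stages one at a time. The trajectory that barely leaves the fixed point occupies a single histogram cell, so it comes first in the ordering.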

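Experimental results

Research questions

  • RQ1: Based on the theoretical bounds from ergodic theory, what is the minimum amount of data needed for a faithful model?
  • RQ2: Can organizing the training data by entropy improve generalization and predictive performance, especially when data are limited?
  • RQ3: How does the initial state of the LSTM memory affect the model's ability to generalize beyond the training data?
  • RQ4: Does a curriculum learning strategy based on trajectory complexity, measured by entropy, outperform standard random sampling or full-trajectory training for long-term prediction?
  • RQ5: Can short trajectories starting from unstable fixed points serve as an effective, data-efficient starting point for learning complex dynamics?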

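Key findings

  • As predicted by Kac's lemma, the minimum amount of data needed for a faithful model grows exponentially with the attractor dimension; insufficient data leads to poor generalization and model failure.
  • Training on low-entropy trajectories, such as those starting from unstable fixed points, markedly improves long-term prediction compared with random sampling or sampling that covers the whole attractor.
  • The entropy-ordered curriculum yields accurate models even when the amount of data is below the theoretical estimate, which mitigates the limits imposed by data scarcity.
  • LSTM models whose memory is initialized from fixed-point trajectories generalize poorly, whereas random initialization gives better and more consistent performance.
  • The results indicate that overfitting is a major risk in data-driven modeling of chaotic systems, and that the apparently high predictability reported in earlier studies may stem from data bias rather than model capability.
  • The results give empirical and theoretical support for entropy-based data structuring as a principled and efficient data strategy for data-driven modeling of dynamical systems.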
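The exponential dependence on the attractor dimension can be made explicit with the standard statement of Kac's lemma. The following is a sketch under textbook assumptions (an ergodic invariant measure $\mu$, sampling time $\tau_s$, and a small-ball scaling governed by a dimension $D$); the exact bound and constants used in the paper may differ.

```latex
% Kac's lemma: mean return time to a set A with \mu(A) > 0,
% for an ergodic system sampled every \tau_s.
\langle \tau_A \rangle = \frac{\tau_s}{\mu(A)}

% For a ball of radius \epsilon centered on the attractor, the invariant
% measure scales with the attractor dimension D:
\mu\bigl(B_\epsilon\bigr) \sim \epsilon^{D}

% Hence the time (and so the amount of data) needed to revisit every
% \epsilon-neighborhood grows exponentially with D:
\langle \tau_{B_\epsilon} \rangle \sim \tau_s \, \epsilon^{-D}
  = \tau_s \exp\!\bigl(D \ln(1/\epsilon)\bigr)
```

At a fixed resolution $\epsilon < 1$, each additional unit of dimension multiplies the required observation time by a factor $1/\epsilon$.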

This review was generated by AI and checked by a human editor.