QUICK REVIEW

[Paper Review] A Machine Learning Framework for Solving High-Dimensional Mean Field Game and Mean Field Control Problems

Lars Ruthotto, Stanley Osher|arXiv (Cornell University)|Dec 4, 2019

Stochastic processes and financial applications | References: 95 | Cited by: 200

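One-Sentence Summary

The paper proposes a mesh-free machine learning framework that solves high-dimensional mean field game (MFG) and mean field control (MFC) problems by combining Lagrangian dynamics with a neural network parameterization of the potential function. Because the Hamilton-Jacobi-Bellman (HJB) equation is enforced along the characteristics and the potential is represented by a tailored neural network, no spatial discretization is needed; the method approximately solves instances in up to 100 dimensions and is validated against an Eulerian solver in two dimensions, which shows that it scales beyond the reach of traditional grid-based methods.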

ABSTRACT

Mean field games (MFG) and mean field control (MFC) are critical classes of multi-agent models for efficient analysis of massive populations of interacting agents. Their areas of application span topics in economics, finance, game theory, industrial engineering, crowd motion, and more. In this paper, we provide a flexible machine learning framework for the numerical solution of potential MFG and MFC models. State-of-the-art numerical methods for solving such problems utilize spatial discretization that leads to a curse-of-dimensionality. We approximately solve high-dimensional problems by combining Lagrangian and Eulerian viewpoints and leveraging recent advances from machine learning. More precisely, we work with a Lagrangian formulation of the problem and enforce the underlying Hamilton-Jacobi-Bellman (HJB) equation that is derived from the Eulerian formulation. Finally, a tailored neural network parameterization of the MFG/MFC solution helps us avoid any spatial discretization. Our numerical results include the approximate solution of 100-dimensional instances of optimal transport and crowd motion problems on a standard work station and a validation using an Eulerian solver in two dimensions. These results open the door to much-anticipated applications of MFG and MFC models that were beyond reach with existing numerical methods.

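Motivation and Objectives

Address the curse of dimensionality in mean field game and control problems, which restricts traditional grid-based numerical methods to low-dimensional settings.
Develop a scalable, mesh-free numerical framework for high-dimensional potential MFG and MFC problems.
Make complex real-world applications, such as optimal transport and crowd motion, solvable in high-dimensional spaces.
Provide a flexible machine learning formulation that casts MFG/MFC problems as differentiable optimization tasks.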

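Proposed Method

Formulate the MFG/MFC problem in the Lagrangian frame by tracking particle trajectories along the characteristic curves derived from the potential function.
Parameterize the potential function in space and time with a neural network, which gives a high-dimensional representation without a spatial grid.
Enforce the HJB equation along the characteristics by penalizing its violation during neural network training.
Compute the Jacobian determinant of the transformation from the Laplacian of the potential, so that the density evolves without an Eulerian discretization.
Satisfy the continuity equation implicitly by pushing the initial density forward along the characteristics.
Train the neural network with a loss function that combines the HJB violation, the terminal-condition mismatch, and regularization terms to promote stability and accuracy.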
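
The steps above can be sketched on a toy problem. The following is a minimal NumPy sketch, not the authors' implementation. It assumes a quadratic Hamiltonian H(p) = |p|^2 / 2, no interaction cost (F = 0), and the sign convention -dPhi/dt + H(grad Phi) = F for the HJB equation. It also replaces the neural network with a hypothetical closed-form potential Phi(x, t) = |x|^2 / (2 (C - t)), chosen only because it satisfies that HJB equation exactly. The names `C`, `rhs`, and `integrate`, the fixed-step RK4 integrator, and the step count are illustrative choices. Under these assumptions the characteristics follow dz/dt = -grad Phi, the log-determinant of the flow's Jacobian follows dl/dt = -Laplacian Phi, and the transport cost and the absolute HJB residual are accumulated along each trajectory.

```python
import numpy as np

C = 2.0  # hypothetical constant of the toy potential; must exceed the final time T


def grad_phi(z, t):
    """Gradient of the toy potential Phi(x, t) = |x|^2 / (2 (C - t))."""
    return z / (C - t)


def lap_phi(z, t):
    """Laplacian of the toy potential: d / (C - t), one value per sample."""
    return np.full(z.shape[0], z.shape[1] / (C - t))


def dt_phi(z, t):
    """Time derivative of the toy potential."""
    return np.sum(z**2, axis=1) / (2.0 * (C - t) ** 2)


def rhs(state, t):
    """Right-hand side for the augmented state (z, l, transport cost, HJB residual)."""
    z = state[:, :-3]
    g = grad_phi(z, t)
    kinetic = 0.5 * np.sum(g**2, axis=1)       # H(grad Phi), equal to the running cost here
    dz = -g                                    # characteristic: dz/dt = -grad Phi
    dl = -lap_phi(z, t)                        # log-determinant: dl/dt = -Laplacian Phi
    dres = np.abs(dt_phi(z, t) - kinetic)      # |HJB residual| under the assumption F = 0
    return np.column_stack([dz, dl, kinetic, dres])


def integrate(z0, T=1.0, n_steps=8):
    """Fixed-step RK4 over [0, T]; returns z(T), l(T), transport cost, HJB penalty."""
    state = np.column_stack([z0, np.zeros((z0.shape[0], 3))])
    h = T / n_steps
    for k in range(n_steps):
        t = k * h
        k1 = rhs(state, t)
        k2 = rhs(state + 0.5 * h * k1, t + 0.5 * h)
        k3 = rhs(state + 0.5 * h * k2, t + 0.5 * h)
        k4 = rhs(state + h * k3, t + h)
        state = state + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return state[:, :-3], state[:, -3], state[:, -2], state[:, -1]


rng = np.random.default_rng(0)
z0 = rng.standard_normal((64, 100))            # 64 samples in 100 dimensions
zT, logdet, cost, hjb = integrate(z0)
print(zT.shape, logdet[0], cost.mean(), hjb.max())
```

For this potential the trajectories contract linearly toward the origin and the HJB residual vanishes up to rounding. In a trained model the gradient, Laplacian, and time derivative would instead come from the network, and the accumulated residual would enter the loss as a penalty. The density along a trajectory can be recovered from the change-of-variables relation rho(z(x, t), t) = rho_0(x) * exp(-l(x, t)).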

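Experimental Results

Research Questions

RQ1: Can a neural-network-based framework solve high-dimensional mean field game and control problems without relying on a spatial grid?
RQ2: In low dimensions, how does the proposed Lagrangian machine learning framework compare with a traditional Eulerian solver in accuracy and scalability?
RQ3: How well does the method handle nonlinear characteristics and complex dynamics in high-dimensional MFG problems?
RQ4: How does penalizing HJB violations during training improve convergence and solution accuracy?
RQ5: Can the framework be applied to real-world problems such as optimal transport and crowd motion, in particular in 100-dimensional spaces?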

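Key Findings

The framework solves 100-dimensional instances of the dynamic optimal transport problem on a standard workstation, which demonstrates scalability beyond grid-based methods.
For two-dimensional problems, the results agree closely with those of a provably convergent Eulerian solver, which validates the accuracy of the proposed method.
In the crowd motion problem, the method reaches similar objective function values for dimensions d ∈ {2, 10, 50, 100}, which indicates stable performance as the dimension grows.
Penalizing HJB violations during training yields more accurate solutions at lower computational cost, which supports the design of the loss function.
The trajectories learned in the crowd motion problem follow curved paths that avoid high-cost regions, which confirms that the model captures complex, congestion-aware dynamics.
The framework solves MFG problems with nonlinear characteristics using relatively simple neural networks, which suggests robustness and practical potential.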
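
The scalability finding can be put in perspective with a rough count. The sketch below compares the number of nodes on a tensor-product grid with the number of coordinates stored for a fixed batch of trajectory samples. The 32 nodes per axis and the 4096 samples are hypothetical values picked for illustration, not settings reported in the paper.

```python
def grid_unknowns(points_per_axis: int, dim: int) -> int:
    """Number of nodes on a tensor-product grid: grows exponentially in dim."""
    return points_per_axis ** dim


def sample_coordinates(num_samples: int, dim: int) -> int:
    """Number of stored coordinates for a batch of samples: grows linearly in dim."""
    return num_samples * dim


for d in (2, 10, 50, 100):
    print(f"d={d:3d}  grid nodes={grid_unknowns(32, d):.3e}  "
          f"sample coordinates={sample_coordinates(4096, d)}")
```

Even at 32 nodes per axis, a grid in 10 dimensions already has 2^50 nodes, while the sample batch grows only linearly with the dimension. The accuracy attainable with a given number of samples is a separate question that this count does not address.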

This review was generated by AI and checked by a human editor.