QUICK REVIEW

[Paper Review] LQR through the Lens of First Order Methods: Discrete-time Case

Jingjing Bu, Afshin Mesbahi|arXiv (Cornell University)|Jul 21, 2019

Adaptive Dynamic Programming Control | References: 18 | Citations: 76

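One-line summary

This paper recasts discrete-time LQR as a real-valued optimization over stabilizing feedback gains and analyzes the gradient, natural gradient, and quasi-Newton flows together with their discretizations, including the structured (sparsity) case.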

ABSTRACT

We consider the Linear-Quadratic-Regulator (LQR) problem in terms of optimizing a real-valued matrix function over the set of feedback gains. Such a setup facilitates examining the implications of a natural initial-state independent formulation of LQR in designing first order algorithms. It is shown that this cost function is smooth and coercive, and provide an alternate means of noting its gradient dominated property. In the process, we provide a number of analytic observations on the LQR cost when directly analyzed in terms of the feedback gain. We then examine three types of well-posed flows for LQR: gradient flow, natural gradient flow and the quasi-Newton flow. The coercive property suggests that these flows admit unique solutions while gradient dominated property indicates that the corresponding Lyapunov functionals decay at an exponential rate; we also prove that these flows are exponentially stable in the sense of Lyapunov. We then discuss the forward Euler discretization of these flows, realized as gradient descent, natural gradient descent and the quasi-Newton iteration. We present stepsize criteria for gradient descent and natural gradient descent, guaranteeing that both algorithms converge linearly to the global optima. An optimal stepsize for the quasi-Newton iteration is also proposed, guaranteeing a $Q$-quadratic convergence rate--and in the meantime--recovering the Hewer algorithm.

Research motivation and objectives

Motivate solving LQR directly over the set of stabilizing feedback gains, using an initial-state independent cost formulation.
Establish smoothness, coercivity, and gradient-dominated properties of the LQR cost over feedback gains.
Develop and analyze three flow dynamics (gradient, natural gradient, quasi-Newton) and their forward Euler discretizations.
Provide convergence guarantees (linear and quadratic) and stepsize criteria for unstructured and structured (sparsity) LQR synthesis.
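
The coercivity objective above can be made concrete with a small numerical sketch. It assumes the control convention u = -K x, a closed loop A_K = A - B K, and the cost f(K) = trace(X_K Sigma), where X_K solves the Lyapunov equation X = A_K^T X A_K + Q + K^T R K. These are standard LQR conventions adopted here for illustration; the review does not state them, and the helper names `dlyap` and `lqr_cost` are hypothetical. The scalar example shows the cost growing without bound as the gain approaches the edge of the stabilizing set.

```python
import numpy as np

def dlyap(M, W):
    """Solve X = M^T X M + W by vectorization (suitable for small systems only)."""
    n = M.shape[0]
    # With row-major flattening, vec(M^T X M) = kron(M^T, M^T) @ vec(X).
    lhs = np.eye(n * n) - np.kron(M.T, M.T)
    return np.linalg.solve(lhs, W.reshape(-1)).reshape(n, n)

def lqr_cost(K, A, B, Q, R, Sigma):
    """f(K) = trace(X_K Sigma), assuming u = -K x; +inf if K is not stabilizing."""
    A_K = A - B @ K
    if np.max(np.abs(np.linalg.eigvals(A_K))) >= 1.0:
        return np.inf
    X = dlyap(A_K, Q + K.T @ R @ K)
    return float(np.trace(X @ Sigma))

# Scalar illustration: a = 1.2, b = 1, so a - b*k is stable only for 0.2 < k < 2.2.
A = np.array([[1.2]])
B = np.array([[1.0]])
Q = np.eye(1)
R = np.eye(1)
Sigma = np.eye(1)

# Move the gain toward the lower boundary k = 0.2 of the stabilizing interval.
gaps = [1e-1, 1e-2, 1e-4]
costs = [lqr_cost(np.array([[0.2 + g]]), A, B, Q, R, Sigma) for g in gaps]
outside = lqr_cost(np.array([[0.1]]), A, B, Q, R, Sigma)
print(costs, outside)
```

The printed costs increase as the gap to the boundary shrinks, and the gain outside the stabilizing interval is assigned an infinite cost by this sketch's convention.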

Proposed method

Define a cost function J_x0(K) for a fixed initial state, then aggregate over multiple linearly independent initial states to obtain a differentiable objective f(K) over the stabilizing gains.
Show f(K) is smooth, coercive, real-analytic on the stabilizing set, and gradient dominated, enabling global convergence results.
Derive and analyze three flows (gradient flow, natural gradient flow, quasi-Newton flow) in continuous time and their forward Euler discretizations (gradient descent, natural gradient descent, quasi-Newton iteration).
Provide Lyapunov-based step size selection and establish linear convergence to the global optimum for unstructured LQR, with quadratic convergence for quasi-Newton.
Extend the framework to structured LQR synthesis using projected gradient descent and discuss sublinear convergence to a first-order stationary point.
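
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's algorithm: it assumes u = -K x, uses the standard gradient expression 2 (R K - B^T X A_K) Y with Y solving Y = A_K Y A_K^T + Sigma, and replaces the paper's stepsize criteria with a simple backtracking line search. The sparsity pattern is imposed with a 0/1 `mask`, which amounts to projecting the gradient onto the structured subspace. All function names and the example system are hypothetical.

```python
import numpy as np

def dlyap(M, W):
    """Solve X = M^T X M + W by vectorization (small systems only)."""
    n = M.shape[0]
    lhs = np.eye(n * n) - np.kron(M.T, M.T)
    return np.linalg.solve(lhs, W.reshape(-1)).reshape(n, n)

def cost_and_grad(K, A, B, Q, R, Sigma):
    """Return f(K) and its gradient; (inf, None) if K is not stabilizing."""
    A_K = A - B @ K
    if np.max(np.abs(np.linalg.eigvals(A_K))) >= 1.0:
        return np.inf, None
    X = dlyap(A_K, Q + K.T @ R @ K)   # X = A_K^T X A_K + Q + K^T R K
    Y = dlyap(A_K.T, Sigma)           # Y = A_K Y A_K^T + Sigma
    grad = 2.0 * (R @ K - B.T @ X @ A_K) @ Y
    return float(np.trace(X @ Sigma)), grad

def gradient_descent(K0, A, B, Q, R, Sigma, mask=None, iters=500, tol=1e-9):
    """(Projected) gradient descent with backtracking; K0 * mask must be stabilizing."""
    mask = np.ones_like(K0) if mask is None else mask
    K = K0 * mask
    f, g = cost_and_grad(K, A, B, Q, R, Sigma)
    for _ in range(iters):
        g = g * mask                  # keep only the allowed entries
        if np.linalg.norm(g) < tol:
            break
        t = 1.0
        while t > 1e-16:
            K_new = K - t * g
            f_new, g_new = cost_and_grad(K_new, A, B, Q, R, Sigma)
            # Sufficient decrease; a non-stabilizing trial has f_new = inf and is rejected.
            if f_new <= f - 0.5 * t * np.sum(g * g):
                break
            t *= 0.5
        else:
            break                     # line search failed, stop
        K, f, g = K_new, f_new, g_new
    return K, f

# Hypothetical example: A is stable, so K0 = 0 is a stabilizing starting gain.
A = np.array([[0.9, 0.3], [0.0, 0.8]])
B, Q, R, Sigma = np.eye(2), np.eye(2), np.eye(2), np.eye(2)
K0 = np.zeros((2, 2))
f0, g0 = cost_and_grad(K0, A, B, Q, R, Sigma)

# Central finite-difference check of one gradient entry.
h = 1e-6
dK = np.zeros((2, 2))
dK[0, 1] = h
fd = (cost_and_grad(K0 + dK, A, B, Q, R, Sigma)[0]
      - cost_and_grad(K0 - dK, A, B, Q, R, Sigma)[0]) / (2 * h)

K_full, f_full = gradient_descent(K0, A, B, Q, R, Sigma)
K_diag, f_diag = gradient_descent(K0, A, B, Q, R, Sigma, mask=np.eye(2))
print(f0, f_full, f_diag)
```

The backtracking rule rejects any trial gain that leaves the stabilizing set, so the iterates stay feasible without an explicit constraint. In the structured run the diagonal-only gain can do no better than the unstructured one, which matches the weaker stationary-point guarantee described above.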

Experimental results

Research questions

RQ1: Can LQR synthesis be effectively formulated as optimization over stabilizing feedback gains with an initial-state independent cost?
RQ2: What are the analytic properties (smoothness, coercivity, gradient dominance) of the LQR cost in this formulation?
RQ3: Do gradient, natural gradient, and quasi-Newton flows converge to the global LQR optimum, and at what rates?
RQ4: How do the discretizations (gradient descent, natural gradient descent, quasi-Newton iteration) perform with appropriate step sizes?
RQ5: How can the approach be extended to structured (sparsity-constrained) LQR synthesis, and what are the convergence guarantees under projection?

Key findings

The cost function is smooth, coercive, and gradient dominated over its effective domain.
The flows are exponentially stable in the sense of Lyapunov and converge to the global optimum.
Discrete-time updates via gradient descent, natural gradient descent, and quasi-Newton iterations achieve linear or quadratic convergence under suitable step sizes.
Natural gradient descent yields a sequence of value matrices that is monotonically nonincreasing in the positive semidefinite ordering.
A formalism for structured (sparsity-pattern) LQR via projected gradient descent is developed, with a sublinear convergence guarantee to a first-order stationary point.
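
The natural gradient and quasi-Newton findings can be illustrated numerically. The sketch below assumes u = -K x and the standard identity that the gradient multiplied by Y^{-1} equals 2 (R K - B^T X A_K), so Y never has to be formed. The natural gradient stepsize 1 / (2 lambda_max(R + B^T X B)) is an illustrative choice, not the paper's criterion. The quasi-Newton stepsize 1/2 reduces the update to the Hewer form K+ = (R + B^T X B)^{-1} B^T X A, consistent with the abstract. Function names and the example system are hypothetical.

```python
import numpy as np

def dlyap(M, W):
    """Solve X = M^T X M + W by vectorization (small systems only)."""
    n = M.shape[0]
    lhs = np.eye(n * n) - np.kron(M.T, M.T)
    return np.linalg.solve(lhs, W.reshape(-1)).reshape(n, n)

def value_matrix(K, A, B, Q, R):
    """X_K solving X = A_K^T X A_K + Q + K^T R K, with A_K = A - B K stable."""
    A_K = A - B @ K
    if np.max(np.abs(np.linalg.eigvals(A_K))) >= 1.0:
        raise ValueError("K is not stabilizing")
    return dlyap(A_K, Q + K.T @ R @ K)

def natural_gradient_step(K, X, A, B, R, eta):
    """K - eta * grad f(K) Y^{-1}; the factor Y cancels analytically."""
    E = R @ K - B.T @ X @ (A - B @ K)
    return K - 2.0 * eta * E

def quasi_newton_step(K, X, A, B, R, eta=0.5):
    """K - eta * (R + B^T X B)^{-1} grad f(K) Y^{-1}."""
    M = R + B.T @ X @ B
    E = R @ K - B.T @ X @ (A - B @ K)
    return K - 2.0 * eta * np.linalg.solve(M, E)

# Hypothetical example: A is stable, so K = 0 is a stabilizing starting gain.
A = np.array([[0.9, 0.3], [0.0, 0.8]])
B = np.array([[1.0, 0.0], [0.5, 1.0]])
Q, R = np.eye(2), np.eye(2)

# Natural gradient descent, recording the value matrix at every iterate.
K_ng, values = np.zeros((2, 2)), []
for _ in range(200):
    X = value_matrix(K_ng, A, B, Q, R)
    values.append(X)
    eta = 1.0 / (2.0 * np.linalg.eigvalsh(R + B.T @ X @ B).max())
    K_ng = natural_gradient_step(K_ng, X, A, B, R, eta)

# Quasi-Newton iteration with stepsize 1/2 (the Hewer update).
K_qn = np.zeros((2, 2))
for _ in range(20):
    X = value_matrix(K_qn, A, B, Q, R)
    K_qn = quasi_newton_step(K_qn, X, A, B, R)

# At convergence, K_qn should be a fixed point of the Hewer map.
X = value_matrix(K_qn, A, B, Q, R)
hewer_gain = np.linalg.solve(R + B.T @ X @ B, B.T @ X @ A)

# Smallest eigenvalue of X_k - X_{k+1} over the natural gradient run.
min_decrease = min(
    np.linalg.eigvalsh(values[i] - values[i + 1]).min()
    for i in range(len(values) - 1)
)
print(K_ng, K_qn, min_decrease)
```

A nonnegative `min_decrease` (up to rounding) indicates that each value matrix is dominated by its predecessor in the positive semidefinite ordering. Both iterations should also agree on the final gain in this example.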

This review was generated by AI and checked by a human editor.