QUICK REVIEW

[Paper Review] A LEVENBERG-MARQUARDT METHOD FOR NONSMOOTH REGULARIZED LEAST SQUARES

Aleksandr Y. Aravkin, Robert Baraldi|arXiv (Cornell University)|Jan 1, 2022

Sparse and Compressive Sensing Techniques | Cited by 1

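One-Sentence Summary

This paper proposes a Levenberg-Marquardt method for nonsmooth regularized least-squares problems, in which the objective is the sum of a smooth nonlinear least-squares term and a general nonsmooth regularizer. It establishes global convergence and an O(ϵ⁻²) worst-case iteration complexity under mild assumptions, and its experiments show fewer outer iterations than a proximal-gradient method and a quasi-Newton method on three test problems.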

ABSTRACT

We develop a Levenberg-Marquardt method for minimizing the sum of a smooth nonlinear least-squares term $f(x) = \frac{1}{2} \|F(x)\|_2^2$ and a nonsmooth term $h$. Both $f$ and $h$ may be nonconvex. Steps are computed by minimizing the sum of a regularized linear least-squares model and a model of $h$ using a first-order method such as the proximal gradient method. We establish global convergence to a first-order stationary point of both a trust-region and a regularization variant of the Levenberg-Marquardt method under the assumptions that $F$ and its Jacobian are Lipschitz continuous and $h$ is proper and lower semi-continuous. In the worst case, both methods perform $O(\epsilon^{-2})$ iterations to bring a measure of stationarity below $\epsilon \in (0, 1)$. We report numerical results on three examples: a group-lasso basis-pursuit denoise example, a nonlinear support vector machine, and parameter estimation in neuron firing. For those examples to be implementable, we describe in detail how to evaluate proximal operators for separable $h$ and for the group lasso with trust-region constraint. In all cases, the Levenberg-Marquardt methods perform fewer outer iterations than a proximal-gradient method with adaptive step length and a quasi-Newton trust-region method, neither of which exploit the least-squares structure of the problem. Our results also highlight the need for more sophisticated subproblem solvers than simple first-order methods.
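
For readers who prefer formulas, the problem and the two step subproblems described in the abstract can be written roughly as follows. This is a sketch in notation assumed here, not taken from this page: $J(x)$ denotes the Jacobian of $F$, $\sigma > 0$ a regularization parameter, $\Delta > 0$ a trust-region radius, and $\psi(s; x)$ a model of $h(x + s)$.

```latex
% Problem: smooth nonlinear least-squares term plus nonsmooth term
\min_{x} \; f(x) + h(x), \qquad f(x) = \tfrac{1}{2} \|F(x)\|_2^2 .

% Regularization variant: step s minimizes a regularized linear
% least-squares model plus a model of h (sigma is an assumed symbol)
\min_{s} \; \tfrac{1}{2} \|J(x)\, s + F(x)\|_2^2
          + \tfrac{1}{2} \sigma \|s\|_2^2 + \psi(s; x),
\qquad \psi(s; x) \approx h(x + s) .

% Trust-region variant: same linearized model, step confined to a ball
% (Delta is an assumed symbol; the norm may be a general norm)
\min_{s} \; \tfrac{1}{2} \|J(x)\, s + F(x)\|_2^2 + \psi(s; x)
\quad \text{subject to} \quad \|s\| \le \Delta .
```

The only difference between the two variants in this sketch is how the step size is controlled: a quadratic penalty on $s$ in the first, an explicit bound on $\|s\|$ in the second.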

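Research Motivation and Goals

Develop a Levenberg-Marquardt method tailored to regularized least-squares problems whose regularizer may be nonconvex and nonsmooth.
Establish global convergence and worst-case complexity bounds for both the regularization and the trust-region variants of the method.
Demonstrate the method's efficiency on practical problems: group-lasso basis pursuit denoising, a nonlinear SVM, and parameter estimation in neuron firing.
Highlight the trade-off, when exploiting the least-squares structure, between expensive proximal-operator evaluations and a reduced number of outer iterations.
Show the need for subproblem solvers more sophisticated than first-order methods.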

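Proposed Method

The Levenberg-Marquardt framework comes in two variants: a regularization variant (LM) and a trust-region variant (LMTR).
Each step is computed by minimizing the sum of a regularized linear least-squares model and a model of the nonsmooth term h, using a first-order method such as the proximal gradient method.
Subproblems are solved with the proximal gradient method or a quadratic regularization method; convergence is guaranteed when F and its Jacobian are Lipschitz continuous.
A general trust-region norm is allowed, which gives flexibility in formulating the subproblem while preserving the convergence guarantees.
Convergence is derived in terms of a stationarity measure tied to the proximal gradient step.
The worst-case complexity is O(ϵ⁻²) iterations to bring the stationarity measure below ϵ ∈ (0,1), matching the bound for the smooth case.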
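
The regularization variant described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes h(x) = λ∥x∥₁ (so the proximal operator is soft-thresholding), uses h itself as its own model, solves each subproblem with a plain proximal gradient loop, and updates the regularization parameter with arbitrary factors. The names `lm_l1`, `soft_threshold`, `sigma`, and `eta` are hypothetical and chosen only for this example.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (componentwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def lm_l1(F, J, x0, lam, sigma=1.0, eta=1e-4,
          max_outer=100, max_inner=200, tol=1e-8):
    """Illustrative regularized LM loop for 0.5*||F(x)||^2 + lam*||x||_1.

    Assumptions (not from the paper): h is the l1 norm, the inner solver is
    proximal gradient with a fixed step 1/L, and sigma is halved on accepted
    steps and multiplied by 4 on rejected ones.
    """
    x = np.asarray(x0, dtype=float).copy()

    def objective(z):
        Fz = F(z)
        return 0.5 * np.dot(Fz, Fz) + lam * np.abs(z).sum()

    for _ in range(max_outer):
        Fx, Jx = F(x), J(x)
        # Lipschitz constant of the gradient of the smooth part of the model.
        L = np.linalg.norm(Jx, 2) ** 2 + sigma
        s = np.zeros_like(x)
        # Inner loop: proximal gradient on
        #   0.5*||Jx s + Fx||^2 + 0.5*sigma*||s||^2 + lam*||x + s||_1.
        for _ in range(max_inner):
            grad = Jx.T @ (Jx @ s + Fx) + sigma * s
            s_next = soft_threshold(x + s - grad / L, lam / L) - x
            converged = np.linalg.norm(s_next - s) <= tol
            s = s_next
            if converged:
                break
        if np.linalg.norm(s) <= tol:
            break  # the step is negligible: treat x as approximately stationary
        # Compare actual decrease with the decrease predicted by the model.
        r = Jx @ s + Fx
        model_value = 0.5 * np.dot(r, r) + lam * np.abs(x + s).sum()
        predicted = objective(x) - model_value
        if predicted <= 0.0:
            break  # no predicted decrease left at working precision
        rho = (objective(x) - objective(x + s)) / predicted
        if rho >= eta:
            x = x + s                       # accept the step
            sigma = max(sigma / 2.0, 1e-8)  # and relax the regularization
        else:
            sigma *= 4.0                    # reject and regularize more
    return x


if __name__ == "__main__":
    # Toy check with F(x) = x - b, whose exact solution is soft_threshold(b, lam).
    b = np.array([3.0, -0.5, 1.5])
    print(lm_l1(lambda z: z - b, lambda z: np.eye(3), np.zeros(3), lam=1.0))
```

The trust-region variant would replace the quadratic penalty on `s` with a constraint on its norm, which changes the proximal operator that the inner loop has to evaluate.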

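Experimental Results

Research Questions

RQ1: Can the Levenberg-Marquardt method be extended effectively to problems with nonsmooth, nonconvex regularizers while retaining convergence and complexity bounds?
RQ2: How does exploiting the least-squares structure of f(x) = ½∥F(x)∥²₂ affect the number of outer iterations, compared with standard proximal or quasi-Newton methods?
RQ3: With the Levenberg-Marquardt method, what is the trade-off between the cost of solving subproblems (for example, expensive proximal operators) and the reduction in outer iterations?
RQ4: Can the method retain global convergence and its complexity bound under assumptions weaker than Lipschitz continuity and lower semicontinuity?
RQ5: What role do subproblem solver quality and inexact evaluations play in overall efficiency?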

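Key Findings

LM and LMTR perform fewer outer iterations than the proximal-gradient method with adaptive step length and the quasi-Newton trust-region method on all three test problems.
On the group-lasso basis pursuit problem, LMTR needs only 24 outer iterations, whereas R2 and TR need 1359 and 267, respectively.
On the nonlinear SVM example, LMTR reaches the lowest final objective value (117.69) using only 24 outer iterations.
On the FitzHugh-Nagumo inverse problem, LMTR uses the fewest objective evaluations (1420) and achieves the best fit to the data with the correct sparsity pattern.
Despite their more expensive inner iterations, LM and LMTR decrease the objective faster per iteration than the other methods, particularly in the early iterations.
The results point to proximal-operator evaluation cost as the main bottleneck, underscoring the need for more efficient subproblem solvers.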


This review was generated by AI and reviewed by human editors.