QUICK REVIEW

[Paper Review] Partially Lazy Gradient Descent for Smoothed Online Learning

Naram Mhaisen, George Iosifidis|arXiv (Cornell University)|Jan 22, 2026

Advanced Bandit Algorithms Research | Citations: 0

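One-Line Summary

The paper introduces $k$-lazyGD, an online learning algorithm that interpolates between greedy Online Gradient Descent and lazy GD. Using an FTRL-based analysis and an ensemble method, it shows that switching costs can be kept small while still achieving the optimal dynamic regret in Smoothed Online Convex Optimization (SOCO).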

ABSTRACT

We introduce $k$-lazyGD, an online learning algorithm that bridges the gap between greedy Online Gradient Descent (OGD, for $k=1$) and lazy GD/dual-averaging (for $k=T$), creating a spectrum between reactive and stable updates. We analyze this spectrum in Smoothed Online Convex Optimization (SOCO), where the learner incurs both hitting and movement costs. Our main contribution is establishing that laziness is possible without sacrificing hitting performance: we prove that $k$-lazyGD achieves the optimal dynamic regret $\mathcal{O}(\sqrt{(P_T+1)T})$ for any laziness slack $k$ up to $\Theta(\sqrt{T/P_T})$, where $P_T$ is the comparator path length. This result formally connects the allowable laziness to the comparator's shifts, showing that $k$-lazyGD can retain the inherently small movements of lazy methods without compromising tracking ability. We base our analysis on the Follow the Regularized Leader (FTRL) framework, and derive a matching lower bound. Since the slack depends on $P_T$, an ensemble of learners with various slacks is used, yielding a method that is provably stable when it can be, and agile when it must be.

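Motivation and Objectives

Motivate and define SOCO (Smoothed Online Convex Optimization) together with its hitting and movement costs.
Bridge greedy GD and lazy GD through a tunable laziness parameter $k$, forming a spectrum of update rules.
Show, within a principled framework, that partial laziness can achieve the optimal dynamic regret without sacrificing hitting performance.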
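To make the two SOCO cost components concrete, the following is a minimal sketch of how hitting cost, movement cost, and the comparator path length $P_T$ could be computed for a finished run. The function names `soco_costs` and `path_length` are hypothetical, and the Euclidean norm for movement and the usual definition $P_T = \sum_t \|u_{t+1} - u_t\|$ are assumptions, since the review does not state the norm used in the paper.

```python
import numpy as np

def soco_costs(decisions, loss_fns):
    """Return (hitting cost, movement cost) for a sequence of decisions.

    decisions: array-like of shape (T, d), the learner's points x_1..x_T.
    loss_fns:  list of T callables; loss_fns[t](x) returns a float.
    Assumption: movement is measured with the Euclidean norm.
    """
    decisions = np.asarray(decisions, dtype=float)
    # Hitting cost: sum of the per-round losses at the played points.
    hitting = sum(f(x) for f, x in zip(loss_fns, decisions))
    # Movement (switching) cost: total distance travelled between rounds.
    movement = np.linalg.norm(np.diff(decisions, axis=0), axis=1).sum()
    return float(hitting), float(movement)

def path_length(comparators):
    """P_T = sum_t ||u_{t+1} - u_t|| for a comparator sequence u_1..u_T."""
    comparators = np.asarray(comparators, dtype=float)
    return float(np.linalg.norm(np.diff(comparators, axis=0), axis=1).sum())
```

A learner that never moves has zero movement cost but may pay a large hitting cost when the comparator drifts, which is the tension the laziness parameter is meant to balance.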

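Proposed Method

Formulate $k$-lazyGD within the Follow the Regularized Leader (FTRL) framework using a pruning-based gradient-history mechanism.
Accumulate gradients incrementally within phases of length $k$, and introduce pruning via a counter $n_t$ and a normal-cone-based analysis.
Derive a universal lower bound indicating that the optimal dynamic regret can be retained for laziness up to $k^* = \Theta(\sqrt{T/P_T})$.
Provide the matching upper bound, together with an ensemble meta-learning scheme (over multiple values of $k$ and $\sigma$) that adapts to the unknown comparator path length $P_T$.
Use an FTRL reduction with a non-standard $g_t^I$ term that captures the effect of pruning, and prove that it is equivalent to the $k$-lazyGD update.
Establish the staleness and stability properties of lazy iterates, and show how they connect to an improved trade-off between switching cost and dynamic regret.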
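The review does not spell out the exact $k$-lazyGD update, so the following is only an illustrative sketch of one way to interpolate between the two endpoints named in the abstract, not the paper's rule. It assumes a Euclidean ball as the feasible set and a constant step size `sigma`, and it "prunes" the accumulated state by replacing it with its projection every `k` rounds. With `k=1` this reduces to greedy projected OGD, and with `k` larger than the horizon it reduces to lazy GD (dual averaging with a constant step). The names `project_ball`, `k_lazy_sketch`, and `switching_cost` are hypothetical.

```python
import numpy as np

def project_ball(y, radius=1.0):
    """Euclidean projection onto the ball of the given radius."""
    norm = np.linalg.norm(y)
    return y if norm <= radius else y * (radius / norm)

def k_lazy_sketch(grads, k, sigma, radius=1.0):
    """Illustrative interpolation between greedy and lazy GD.

    Assumption (not taken from the paper): an unprojected state y
    accumulates -sigma * g_t, and every k rounds y is pruned by
    replacing it with its projection onto the feasible ball.
    Returns the iterates x_1..x_{T+1} as an array of shape (T+1, d).
    """
    grads = np.asarray(grads, dtype=float)
    y = np.zeros(grads.shape[1])
    xs = [project_ball(y, radius)]
    for t, g in enumerate(grads, start=1):
        y = y - sigma * g              # accumulate without projecting
        if t % k == 0:                 # end of a k-round phase: prune
            y = project_ball(y, radius)
        xs.append(project_ball(y, radius))
    return np.array(xs)

def switching_cost(xs):
    """Total Euclidean movement of a sequence of iterates."""
    return float(np.linalg.norm(np.diff(xs, axis=0), axis=1).sum())

# Three gradients push the iterate to the boundary, then one reverses.
grads = [[-1.0], [-1.0], [-1.0], [1.0]]
greedy = k_lazy_sketch(grads, k=1, sigma=1.0)
lazy = k_lazy_sketch(grads, k=10, sigma=1.0)
print(greedy.ravel(), switching_cost(greedy))
print(lazy.ravel(), switching_cost(lazy))
```

In this toy run the greedy iterate leaves the boundary as soon as the gradient reverses, while the fully lazy iterate stays put because its accumulated state is still outside the ball. That is one simple way to see the staleness of lazy iterates and their smaller movement.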

Figure 1: Switching in example ($i$, top), showing staleness, and example ($ii$, bottom), showing stability. Left: switching cost. Right: snapshots over 4 (top) and 2 (bottom) rounds: greedy updates move continuously, whereas lazy updates remain still or move minimally.

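Experimental Results

Research Questions

RQ1: What is the maximum level of laziness ($k$) that can be tolerated in SOCO without losing the optimal dynamic regret?
RQ2: Can partially lazy updates ($k$-lazyGD) match the hitting performance of greedy updates while keeping the stability of lazy updates?
RQ3: How does the level of laziness relate to the comparator path length $P_T$ and the horizon $T$?
RQ4: Can an ensemble/meta-learning approach that adapts to the unknown $P_T$ maintain order-optimal regret?
RQ5: What are the basic theoretical guarantees (lower and upper bounds) for $k$-lazyGD within the FTRL framework?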

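Key Findings

$k$-lazyGD attains dynamic regret of order $\mathcal{O}(\sqrt{(P_T+1)T})$ for any laziness up to $k^* = \Theta(\sqrt{T/P_T})$.
A universal lower bound shows that excessive laziness leads to linear dynamic regret, which motivates the identified threshold.
$k$-lazyGD is cast as an instance of FTRL with a pruning rule, and this FTRL update is proved equivalent to the proposed $k$-lazyGD iteration.
An ensemble of $k$-lazyGD experts over a grid of $\sigma$ and $k$ values provides adaptivity and maintains the same order-optimal regret bound against the comparator.
The analysis formalizes the staleness and stability of the iterates within the SOCO framework, showing that a larger $k$ reduces movement without hurting hitting performance.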
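As a rough numerical illustration of the threshold and of why an ensemble is needed, the sketch below evaluates $k^* = c\sqrt{T/P_T}$ for a known $P_T$ and builds a candidate grid of slacks for the case where $P_T$ is unknown. The constant `c`, the handling of $P_T = 0$, and the geometric (doubling) grid are all assumptions made for illustration. The review only states that the ensemble runs over a grid of $k$ and $\sigma$ values, not how that grid is chosen.

```python
import math

def laziness_threshold(T, P_T, c=1.0):
    """Evaluate k* = c * sqrt(T / P_T), clipped to the range [1, T].

    The constant c is a placeholder; the Theta(.) bound hides it.
    """
    if P_T <= 0:
        # Static comparator: treated here as allowing full laziness (k = T).
        return T
    return max(1, min(T, math.floor(c * math.sqrt(T / P_T))))

def slack_grid(T):
    """Hypothetical geometric grid of slacks 1, 2, 4, ..., capped at T."""
    grid, k = [], 1
    while k < T:
        grid.append(k)
        k *= 2
    grid.append(T)
    return grid

print(laziness_threshold(10_000, 4))   # slack allowed for a slowly moving comparator
print(slack_grid(10))                  # candidate slacks when P_T is unknown
```

A doubling grid keeps the number of experts logarithmic in $T$ while ensuring that some candidate is within a factor of two of any target slack, which is a common design choice for this kind of ensemble.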


This review was generated by AI and checked by a human editor.