QUICK REVIEW

[Paper Review] Flow caching for autoregressive video generation

Yuexiao Ma, Xuzhe Zheng|arXiv (Cornell University)|Feb 11, 2026

Video Coding and Compression Technologies | Citations: 0

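One-line summary

FlowCache introduces chunkwise caching and KV cache compression to accelerate autoregressive video generation, achieving speedups of 2.38× on MAGI-1 and 6.7× on SkyReels-V2 with minimal quality degradation.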

ABSTRACT

Autoregressive models, often built on Transformer architectures, represent a powerful paradigm for generating ultra-long videos by synthesizing content in sequential chunks. However, this sequential generation process is notoriously slow. While caching strategies have proven effective for accelerating traditional video diffusion models, existing methods assume uniform denoising across all frames, an assumption that breaks down in autoregressive models where different video chunks exhibit varying similarity patterns at identical timesteps. In this paper, we present FlowCache, the first caching framework specifically designed for autoregressive video generation. Our key insight is that each video chunk should maintain independent caching policies, allowing fine-grained control over which chunks require recomputation at each timestep. We introduce a chunkwise caching strategy that dynamically adapts to the unique denoising characteristics of each chunk, complemented by a joint importance-redundancy optimized KV cache compression mechanism that maintains fixed memory bounds while preserving generation quality. Our method achieves remarkable speedups of 2.38 times on MAGI-1 and 6.7 times on SkyReels-V2, with negligible quality degradation (VBench: 0.87 increase and 0.79 decrease respectively). These results demonstrate that FlowCache successfully unlocks the potential of autoregressive models for real-time, ultra-long video generation, establishing a new benchmark for efficient video synthesis at scale. The code is available at https://github.com/mikeallen39/FlowCache.

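Motivation and objectives

Accelerate autoregressive video generation by addressing heterogeneous denoising across video chunks.
Propose a chunkwise caching policy that manages recomputation independently for each video chunk.
Propose a KV cache compression scheme that jointly accounts for importance and redundancy to fit a memory budget without degrading quality.
Provide a theoretical and empirical analysis of caching dynamics in autoregressive video generation.
Demonstrate state-of-the-art speedups on representative models while preserving video quality.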

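Proposed method

Define, for each video chunk, the relative L1 distance between consecutive timesteps as a measure of reusability.
Establish theoretically that the relative L1 distance increases monotonically as denoising progresses (Theorem 1).
Propose FlowCache, which assigns each video chunk an independent caching policy based on its denoising state.
Implement KV cache compression that jointly optimizes importance and redundancy, selecting diverse and relevant earlier KV entries (Eqs. 9–12).
Evaluate FlowCache on MAGI-1 and SkyReels-V2, with ablations showing the benefits of chunkwise reuse and KV compression.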
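
The per-chunk reuse decision described above can be sketched as follows. This is a minimal illustration, not the authors' implementation. It makes several assumptions: the distance is measured on each chunk's model input, drift is accumulated until it crosses a fixed threshold (a TeaCache-style rule), and the names `ChunkCache`, `relative_l1`, `threshold`, and `toy_model` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

Vector = List[float]


def relative_l1(current: Sequence[float], previous: Sequence[float], eps: float = 1e-8) -> float:
    """Relative L1 distance: ||current - previous||_1 / (||previous||_1 + eps)."""
    diff = sum(abs(c - p) for c, p in zip(current, previous))
    norm = sum(abs(p) for p in previous)
    return diff / (norm + eps)


@dataclass
class ChunkCache:
    """Cache state for ONE video chunk (hypothetical structure)."""
    threshold: float                         # assumed reuse threshold for this chunk
    accumulated: float = 0.0                 # drift accumulated since the last full compute
    prev_input: Optional[Vector] = None      # chunk input seen at the previous timestep
    cached_output: Optional[Vector] = None   # output of the last full compute

    def step(self, chunk_input: Vector, compute: Callable[[Vector], Vector]) -> Tuple[Vector, bool]:
        """Run one denoising step for this chunk; return (output, recomputed)."""
        if self.prev_input is not None and self.cached_output is not None:
            self.accumulated += relative_l1(chunk_input, self.prev_input)
            if self.accumulated < self.threshold:
                # Drift is still small: reuse the cached output for this chunk only.
                self.prev_input = list(chunk_input)
                return self.cached_output, False
        # First step, or accumulated drift reached the threshold: run the model.
        self.cached_output = compute(chunk_input)
        self.prev_input = list(chunk_input)
        self.accumulated = 0.0
        return self.cached_output, True


def toy_model(x: Vector) -> Vector:
    """Stand-in for the expensive per-chunk forward pass (assumption)."""
    return [2.0 * v for v in x]


# Two chunks at the same timesteps: chunk 0 barely changes, chunk 1 changes a lot.
caches = [ChunkCache(threshold=0.1), ChunkCache(threshold=0.1)]
inputs_per_step = [
    ([1.0, 1.0], [1.0, 1.0]),
    ([1.01, 1.0], [1.5, 0.5]),
]
decisions = [
    [cache.step(x, toy_model)[1] for cache, x in zip(caches, step_inputs)]
    for step_inputs in inputs_per_step
]
print(decisions)  # [[True, True], [False, True]]
```

Because each chunk owns its own `ChunkCache`, the second step reuses the cached output for chunk 0 while recomputing chunk 1. A single policy shared across all chunks cannot make that distinction.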

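Experimental results

Research questions

RQ1: Can independent chunkwise caching policies improve the acceleration of autoregressive video generation without degrading quality?
RQ2: How should the KV cache be compressed to balance memory usage and temporal consistency in long videos?
RQ3: How does chunk-level heterogeneity in denoising trajectories affect caching strategies?
RQ4: Do FlowCache's theoretical insights translate into empirical speedups across different autoregressive video models?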

Key findings

Model	Method	PFLOPs ↓	Speedup ↑	Latency (s) ↓	VBench ↑	LPIPS ↓	SSIM ↑	PSNR ↑
MAGI-1	Vanilla	306	1×	2873	77.06%	-	-	-
MAGI-1	TeaCache-slow	294	1.12×	2579	77.50%	0.8160	0.1138	13.26
MAGI-1	TeaCache-fast	225	1.44×	1998	70.11%	0.8160	0.1138	8.94
MAGI-1	FlowCache-slow	161	1.86×	1546	78.96%	0.3160	0.6497	22.34
MAGI-1	FlowCache-fast	140	2.38×	1209	77.93%	0.4311	0.5140	19.27
SkyReels-V2	Vanilla	113	1×	1540	83.84%	-	-	-
SkyReels-V2	TeaCache-slow	58	1.89×	814	82.67%	0.1472	0.7501	21.96
SkyReels-V2	TeaCache-fast	49	2.2×	686	80.06%	0.3063	0.6121	18.39
SkyReels-V2	FlowCache-slow	36	5.88×	262	83.12%	0.1225	0.789	23.74
SkyReels-V2	FlowCache-fast	28	6.7×	230	83.05%	0.1467	0.7635	22.95
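
As a quick arithmetic check, the FlowCache speedup figures in the table appear consistent with the ratio of vanilla latency to method latency rather than with the PFLOPs ratio. The short sketch below recomputes them from the latency column. The dictionary layout and the helper name `speedup` are illustrative choices, not part of the paper.

```python
# Latency values in seconds, copied from the table above.
latency_s = {
    "MAGI-1": {"Vanilla": 2873, "FlowCache-slow": 1546, "FlowCache-fast": 1209},
    "SkyReels-V2": {"Vanilla": 1540, "FlowCache-slow": 262, "FlowCache-fast": 230},
}


def speedup(model: str, method: str) -> float:
    """Speedup of `method` relative to the vanilla baseline of the same model."""
    return latency_s[model]["Vanilla"] / latency_s[model][method]


for model, methods in latency_s.items():
    for method in methods:
        if method != "Vanilla":
            print(f"{model} {method}: {speedup(model, method):.2f}x")
# MAGI-1 FlowCache-slow: 1.86x
# MAGI-1 FlowCache-fast: 2.38x
# SkyReels-V2 FlowCache-slow: 5.88x
# SkyReels-V2 FlowCache-fast: 6.70x
```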

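FlowCache achieves a 2.38× speedup on MAGI-1 with a 0.87-point VBench increase.
FlowCache achieves a 6.7× speedup on SkyReels-V2 with a 0.79-point VBench decrease.
Chunkwise reuse preserves quality better than TeaCache-style uniform caching.
KV cache compression reduces memory and compute with almost no quality degradation.
Across both models, FlowCache delivers significant efficiency gains with minimal perceptual degradation.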
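
To give a feel for how a fixed-budget KV cache can trade importance against redundancy, here is a small sketch. It does not reproduce the paper's Eqs. 9–12. It swaps in a generic greedy selection in the spirit of maximal marginal relevance, and the scoring rule, the `redundancy_weight` parameter, and the function names are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two key vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_kv_entries(
    keys: Sequence[Sequence[float]],
    importance: Sequence[float],
    budget: int,
    redundancy_weight: float = 1.0,
) -> List[int]:
    """Greedily keep at most `budget` entries that are important but not redundant.

    Score of a candidate = importance - redundancy_weight * (max cosine
    similarity to any entry already kept). This is an assumed scoring rule.
    """
    selected: List[int] = []
    remaining = set(range(len(keys)))
    while remaining and len(selected) < budget:
        def score(i: int) -> float:
            redundancy = max((cosine(keys[i], keys[j]) for j in selected), default=0.0)
            return importance[i] - redundancy_weight * redundancy

        best = max(sorted(remaining), key=score)  # sorted() makes ties deterministic
        selected.append(best)
        remaining.remove(best)
    return sorted(selected)


# Entries 0 and 1 are near-duplicates; entry 2 is less important but distinct.
keys = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0]]
importance = [0.9, 0.85, 0.5]
kept = select_kv_entries(keys, importance, budget=2)
print(kept)  # [0, 2]
```

Selecting purely by importance would keep entries 0 and 1, which carry almost the same information. Penalizing similarity to entries already kept retains the distinct entry 2 instead, while the budget keeps the cache size fixed.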


This review was generated by AI and checked by a human editor.