QUICK REVIEW

[Paper Review] Overfitting for Fun and Profit: Instance-Adaptive Data Compression

Ties van Rozendaal, Iris A. M. Huijben|TU/e Research Portal|Jan 21, 2021

Video Coding and Compression Technologies | References: 30 | Cited by: 28

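One-Sentence Summary

The paper proposes full-model instance-adaptive neural data compression: the entire model is finetuned on the I-frames of a single video and the quantized model updates are signaled to the decoder, which improves PSNR by about 1 dB over encoder-only finetuning at the same bitrate.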

ABSTRACT

Neural data compression has been shown to outperform classical methods in terms of $RD$ performance, with results still improving rapidly. At a high level, neural compression is based on an autoencoder that tries to reconstruct the input instance from a (quantized) latent representation, coupled with a prior that is used to losslessly compress these latents. Due to limitations on model capacity and imperfect optimization and generalization, such models will suboptimally compress test data in general. However, one of the great strengths of learned compression is that if the test-time data distribution is known and relatively low-entropy (e.g. a camera watching a static scene, a dash cam in an autonomous car, etc.), the model can easily be finetuned or adapted to this distribution, leading to improved $RD$ performance. In this paper we take this concept to the extreme, adapting the full model to a single video, and sending model updates (quantized and compressed using a parameter-space prior) along with the latent representation. Unlike previous work, we finetune not only the encoder/latents but the entire model, and - during finetuning - take into account both the effect of model quantization and the additional costs incurred by sending the model updates. We evaluate an image compression model on I-frames (sampled at 2 fps) from videos of the Xiph dataset, and demonstrate that full-model adaptation improves $RD$ performance by ~1 dB, with respect to encoder-only finetuning.

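Research Motivation and Goals

Improve rate-distortion (RD) performance by adapting the entire compression model to a single data instance.
Extend the RD loss to include the model update cost and the quantization overhead.
Show that full-model finetuning with a spike-and-slab prior lowers the bitrate while improving distortion on I-frames.
Analyze how model updates are distributed across parameter groups and how quantization affects performance.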

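Proposed Method

Construct a combined RD and model-rate loss L_RDM that includes a model update cost term M derived from the model prior p(delta).
Use a spike-and-slab prior to encourage sparsity and lower the cost of signaling zero updates.
Quantize the model updates delta with bin width t, and use Straight-Through Estimation for the gradients during finetuning.
Entropy-code the latents z and the quantized updates [delta] under the priors p_theta(z) and p([delta]).
Finetune the global model on the I-frames of a single video (full-model adaptation) and amortize the model rate cost over the frames of that video.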
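
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation. Several details are assumptions made for the example: the spike and the slab are both modeled as zero-mean Gaussians mixed with a weight `alpha`, the spike width is set to `t / 6` so that nearly all of its mass falls in the zero bin, `beta` weights both rate terms in the loss, and every numeric default is hypothetical. Gradient handling (Straight-Through Estimation) is left out because it needs an autodiff framework.

```python
import math


def gaussian_cdf(x, sigma):
    """CDF of a zero-mean Gaussian with standard deviation sigma."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))


def quantize(delta, t):
    """Snap each update to the centre of its bin of width t.

    Exact ties follow Python's round-half-to-even behaviour.
    """
    return [t * round(d / t) for d in delta]


def bin_probability(q, t, sigma_slab, sigma_spike, alpha):
    """Mass of the width-t bin centred at q under the spike-and-slab mixture.

    The mixture form (slab + alpha * spike) / (1 + alpha) is an assumption
    of this sketch.
    """
    lo, hi = q - t / 2.0, q + t / 2.0
    slab = gaussian_cdf(hi, sigma_slab) - gaussian_cdf(lo, sigma_slab)
    spike = gaussian_cdf(hi, sigma_spike) - gaussian_cdf(lo, sigma_spike)
    return (slab + alpha * spike) / (1.0 + alpha)


def model_rate_bits(delta, t, sigma_slab=0.05, alpha=1000.0):
    """Bits needed to signal the quantized updates (the term M).

    sigma_slab and alpha are hypothetical values chosen for illustration.
    """
    sigma_spike = t / 6.0  # assumption: spike mass sits inside the zero bin
    bits = 0.0
    for q in quantize(delta, t):
        p = bin_probability(q, t, sigma_slab, sigma_spike, alpha)
        bits += -math.log2(max(p, 1e-300))  # floor avoids log2(0) far out
    return bits


def rdm_loss(latent_bits, distortion, model_bits, beta):
    """Combined loss: distortion plus beta-weighted latent and model rates.

    Applying beta to both rate terms is an assumption of this sketch.
    """
    return distortion + beta * (latent_bits + model_bits)


def amortized_model_bpp(model_bits, num_frames, height, width):
    """Model rate spread over every pixel of every frame in the video."""
    return model_bits / (num_frames * height * width)
```

Under these hypothetical settings with `t = 0.005`, a zero update costs well under one bit while an update of 0.01 costs more than ten bits, which is the sparsity pressure the spike-and-slab prior is meant to create. Because the updates are sent once per video, `amortized_model_bpp` shrinks as the number of frames grows.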

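Experimental Results

Research Questions

RQ1: Does full-model finetuning on a single video instance give better RD performance than encoder-only finetuning or latent-only adaptation?
RQ2: Once the model update cost and quantization-aware training are included, how feasible and how beneficial is instance-adaptive compression?
RQ3: When adapting to I-frames, how are model updates distributed across parameter groups, and how does the spike-and-slab prior affect the signaling cost?
RQ4: What RD gains can be achieved under different β settings, and how do they evolve during finetuning?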

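Key Findings

On Xiph-5N 2fps I-frames, full-model instance-adaptive finetuning yields an RD gain of about 1 dB over encoder-only finetuning at the same bitrate.
Accounting for the model update cost and for quantization is necessary; ignoring them can lead to degraded performance or unbounded growth in bitrate.
The spike-and-slab prior lowers the cost of signaling zero updates and promotes sparsity, which guides which parameters get updated.
Most of the RD gain is achieved early in finetuning and persists; at higher bitrates, more effective finetuning yields a larger potential rate reduction.
Bit-allocation analysis shows that updates are typically quantized and limited by the quantizer, while zero updates incur a small static cost.
Encoder-only finetuning performs comparably to direct latent optimization in these experiments, suggesting a small amortization gap.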

This review was generated by AI and checked by a human editor.