QUICK REVIEW

[Paper Review] Overfitting for Fun and Profit: Instance-Adaptive Data Compression

Ties van Rozendaal, Iris A. M. Huijben | TU/e Research Portal | 2021-01-21

Video Coding and Compression Technologies | References: 30 | Citations: 28

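One-Line Summary

This paper proposes full-model instance-adaptive neural data compression: the entire model is finetuned on the I-frames of a single video and the quantized model updates are signaled alongside the latents, which yields roughly a 1 dB PSNR improvement at the same bitrate compared with encoder-only finetuning.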

ABSTRACT

Neural data compression has been shown to outperform classical methods in terms of $RD$ performance, with results still improving rapidly. At a high level, neural compression is based on an autoencoder that tries to reconstruct the input instance from a (quantized) latent representation, coupled with a prior that is used to losslessly compress these latents. Due to limitations on model capacity and imperfect optimization and generalization, such models will suboptimally compress test data in general. However, one of the great strengths of learned compression is that if the test-time data distribution is known and relatively low-entropy (e.g. a camera watching a static scene, a dash cam in an autonomous car, etc.), the model can easily be finetuned or adapted to this distribution, leading to improved $RD$ performance. In this paper we take this concept to the extreme, adapting the full model to a single video, and sending model updates (quantized and compressed using a parameter-space prior) along with the latent representation. Unlike previous work, we finetune not only the encoder/latents but the entire model, and - during finetuning - take into account both the effect of model quantization and the additional costs incurred by sending the model updates. We evaluate an image compression model on I-frames (sampled at 2 fps) from videos of the Xiph dataset, and demonstrate that full-model adaptation improves $RD$ performance by ~1 dB, with respect to encoder-only finetuning.

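Motivation and Objectives

Adapt the full compression model to a single data instance in order to improve RD performance.
Extend the RD loss so that it includes the cost of sending model updates and the overhead of quantizing them.
Demonstrate that full-model finetuning with a spike-and-slab prior improves I-frame distortion while reducing bitrate.
Analyze how model updates are distributed across parameters and how quantization affects performance.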
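As a rough sketch of the extended objective, the loss can be written as below. The symbols are assumptions introduced for illustration: $\theta_0$ denotes the pretrained parameters, $\delta$ the model update, $[\delta]$ its quantized version, $p([\delta])$ the model prior over quantized updates, and $\beta$ the rate-distortion trade-off weight. The exact form and weighting used in the paper may differ.

$$
\mathcal{L}_{RDM}(\delta) = \mathcal{L}_{RD}(\theta_0 + [\delta]) + \beta \, M([\delta]), \qquad M([\delta]) = -\log_2 p([\delta])
$$

Under this reading, an update to a parameter is only worthwhile when the RD improvement it brings outweighs the bits needed to signal it.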

Proposed Method

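Formulate a combined RD and model-rate loss L_RDM that adds a model-update cost term M, derived from a model prior p(delta) over the model update delta.
Use a spike-and-slab prior to encourage sparsity and to reduce the cost of signaling zero updates.
During finetuning, quantize delta with bin width t and use straight-through estimation for the gradients.
Entropy-code the latent variables z and the quantized update [delta] under the priors p_theta(z) and p([delta]).
Finetune the global model on the I-frames of a single video (full-model adaptation) and amortize the model-rate cost over the many frames of that video.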
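The quantization and model-rate steps above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the two-Gaussian form of the spike-and-slab mixture, the helper names, and the values of `sigma_slab`, `sigma_spike`, and `alpha` are all assumptions chosen for the example, and the straight-through gradient is only described in a comment because no autodiff framework is used here.

```python
import math

def normal_cdf(x, sigma):
    """CDF of a zero-mean Gaussian with standard deviation sigma."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

def quantize(delta, t):
    """Round every update to the nearest multiple of the bin width t.

    During finetuning, a straight-through estimator would treat this
    rounding as the identity in the backward pass (not shown here).
    """
    return [t * round(d / t) for d in delta]

def bin_probability(q, t, sigma_slab, sigma_spike, alpha):
    """Mass of the bin centred at q under an assumed spike-and-slab mixture."""
    lo, hi = q - t / 2.0, q + t / 2.0
    slab = normal_cdf(hi, sigma_slab) - normal_cdf(lo, sigma_slab)
    spike = normal_cdf(hi, sigma_spike) - normal_cdf(lo, sigma_spike)
    return (slab + alpha * spike) / (1.0 + alpha)

def model_rate_bits(delta_q, t, sigma_slab=0.05, alpha=1000.0):
    """Bits needed to entropy-code the quantized updates under the prior."""
    sigma_spike = t / 6.0  # assumption: the spike fits inside the zero bin
    total = 0.0
    for q in delta_q:
        p = bin_probability(q, t, sigma_slab, sigma_spike, alpha)
        total -= math.log2(max(p, 1e-300))  # floor avoids log2(0) in the tails
    return total

t = 0.01  # illustrative bin width
delta = [0.004, 0.023, -0.016, 0.0]
delta_q = quantize(delta, t)
zero_bits = model_rate_bits([0.0], t)
nonzero_bits = model_rate_bits([0.02], t)
print(delta_q, zero_bits, nonzero_bits)
```

With these illustrative settings, a zero update costs a small fraction of a bit while a nonzero update costs more than ten bits, which is the mechanism by which such a prior pushes most updates to zero.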

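Experimental Results

Research Questions

RQ1: Can full-model finetuning on a single video instance improve RD performance over encoder-only finetuning or latent-only adaptation?
RQ2: How does introducing the model-update cost and quantization-aware training affect the practicality and benefit of instance-adaptive compression?
RQ3: When adapting to I-frames, how are updates distributed across parameter groups, and how does the spike-and-slab prior affect the signaling cost?
RQ4: What RD gains are achievable for different β settings, and how do they evolve during finetuning?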

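Key Results

Full-model instance-adaptive finetuning yields roughly a 1 dB RD gain at the same bitrate over encoder-only finetuning on Xiph-5N 2fps I-frames.
Accounting for the model-update cost and for quantization is essential; ignoring them can make the bitrate overhead worse or even unbounded.
The spike-and-slab prior lowers the cost of signaling zero updates and promotes sparsity, which guides the choice of which parameters to update.
Most of the RD gain is obtained early in finetuning and persists; at higher bitrates, finetuning is more effective and the reduction in latent rate is larger.
The bit-allocation analysis shows that updates are often quantized and bounded by the quantizer's range, and that zero updates incur a small static cost.
Encoder-only finetuning performs competitively with direct latent optimization, which suggests a small amortization gap in these experiments.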
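The rate accounting behind these results can be illustrated with a short sketch: the model update is sent once per video, so its cost is spread over all frames. The function name and every number below are hypothetical and are not taken from the paper's experiments.

```python
def total_bits_per_pixel(latent_bits_per_frame, model_bits, height, width):
    """Total rate in bits per pixel: per-frame latent bits plus one model update."""
    num_pixels = len(latent_bits_per_frame) * height * width
    return (sum(latent_bits_per_frame) + model_bits) / num_pixels

# Hypothetical numbers: the same model update amortized over 10 and 100 frames.
short_video = total_bits_per_pixel([1000.0] * 10, 5000.0, 10, 10)
long_video = total_bits_per_pixel([1000.0] * 100, 5000.0, 10, 10)
print(short_video, long_video)
```

The per-pixel overhead of the model update shrinks as the number of frames grows, which is why instance adaptation becomes more attractive for longer videos.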


This review was generated by AI and reviewed by a human editor.