[Paper Summary] An End-to-End Joint Learning Scheme of Image Compression and Quality Enhancement with Improved Entropy Minimization
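JointIQ-Net jointly optimizes an image compression module and a quality enhancement module arranged in a cascade, and uses a Gaussian mixture model together with global context for improved entropy minimization, reaching state-of-the-art results that surpass VVC Intra in both PSNR and MS-SSIM.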
Recently, learned image compression methods have been actively studied. Among them, entropy-minimization based approaches have achieved superior results compared to conventional image codecs such as BPG and JPEG2000. However, the quality enhancement and rate-minimization are conflictively coupled in the process of image compression. That is, maintaining high image quality entails less compression and vice versa. However, by jointly training separate quality enhancement in conjunction with image compression, the coding efficiency can be improved. In this paper, we propose a novel joint learning scheme of image compression and quality enhancement, called JointIQ-Net, as well as entropy model improvement, thus achieving significantly improved coding efficiency against the previous methods. Our proposed JointIQ-Net combines an image compression sub-network and a quality enhancement sub-network in a cascade, both of which are end-to-end trained in a combined manner within the JointIQ-Net. Also, the JointIQ-Net benefits from improved entropy-minimization that newly adopts a Gaussian Mixture Model (GMM) and further exploits global context to estimate the probabilities of latent representations. In order to show the effectiveness of our proposed JointIQ-Net, extensive experiments have been performed, and showed that the JointIQ-Net achieves a remarkable performance improvement in coding efficiency in terms of both PSNR and MS-SSIM, compared to the previous learned image compression methods and the conventional codecs such as VVC Intra (VTM 7.1), BPG, and JPEG2000. To the best of our knowledge, this is the first end-to-end optimized image compression method that outperforms VTM 7.1 (Intra), the latest reference software of the VVC standard, in terms of the PSNR and MS-SSIM.
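Research Motivation and Goals
- Improve coding efficiency by jointly optimizing image compression and quality enhancement.
- Develop an end-to-end framework into which any quality enhancement network can be integrated in cascade with the compression network.
- Improve entropy modeling with a Gaussian mixture model and global context information.
- Show that joint training outperforms separate training in rate-distortion performance.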
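Proposed Method
- Proposes JointIQ-Net: a cascade of an image compression sub-network and a quality enhancement sub-network (GRDN), trained jointly end to end.
- Adopts an improved entropy model that places a Gaussian mixture model (GMM) prior on the latent representation y_hat, with its parameters estimated by a context-aware model estimator f.
- Introduces a global context c''' for estimating the GMM parameters, obtained from a dedicated global context extraction module with the MPRM improvement.
- Uses a hyperprior z_hat and an autoregressive model of y_hat conditioned on z_hat (y_hat|z_hat), together with a density-convolution trick to handle quantization.
- Trains with the combined loss L = R + lambda D, where R is the rate given by the learned priors and D is the distortion measured on the final enhanced output.
- Allows flexible integration of any quality enhancement network; in the experiments, GRDN is cascaded with the image compression network to form JointIQ-Net.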
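The entropy model and training loss listed above can be illustrated with a small sketch. It is not the authors' implementation: it assumes integer rounding as the quantizer, so that convolving each Gaussian component with a unit-width uniform density gives a probability mass equal to a CDF difference, and it counts only the rate of y_hat (the rate of the hyperprior z_hat is left out). The function names `gaussian_cdf`, `gmm_likelihood`, `rate_bits`, `mse`, and `rd_loss` are hypothetical, and in the real network the mixture weights, means, and scales would come from the model estimator f rather than being passed in by hand.

```python
import math


def gaussian_cdf(x, mean, scale):
    """CDF of a Gaussian with the given mean and standard deviation."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))


def gmm_likelihood(y_hat, weights, means, scales, floor=1e-9):
    """Probability mass of one integer-quantized latent.

    Assumption: each mixture component is convolved with U(-0.5, 0.5),
    so its mass on y_hat is CDF(y_hat + 0.5) - CDF(y_hat - 0.5).
    """
    mass = 0.0
    for w, mu, sigma in zip(weights, means, scales):
        upper = gaussian_cdf(y_hat + 0.5, mu, sigma)
        lower = gaussian_cdf(y_hat - 0.5, mu, sigma)
        mass += w * (upper - lower)
    # Floor the mass so that -log2 stays finite for very unlikely symbols.
    return max(mass, floor)


def rate_bits(latents, params):
    """Estimated bits for all latents: sum of -log2 p(y_hat)."""
    return sum(-math.log2(gmm_likelihood(y, *p)) for y, p in zip(latents, params))


def mse(original, output):
    """Mean squared error between two equally long pixel sequences."""
    return sum((a - b) ** 2 for a, b in zip(original, output)) / len(original)


def rd_loss(latents, params, original, enhanced, lam):
    """L = R + lambda * D, with D measured on the enhanced output."""
    return rate_bits(latents, params) + lam * mse(original, enhanced)


# Usage: two latents, each with its own two-component mixture.
params = [
    ([0.3, 0.7], [-1.0, 2.0], [1.0, 2.0]),
    ([0.5, 0.5], [0.0, 0.0], [0.5, 3.0]),
]
loss = rd_loss([2, 0], params, original=[1.0, 2.0], enhanced=[1.0, 4.0], lam=0.5)
```

Because D is computed on the output of the enhancement network rather than on the raw decoder output, the gradient of the distortion term reaches both sub-networks, which is what distinguishes joint training from training the two parts separately.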
Experimental Results
Research Questions
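- RQ1: Does end-to-end joint optimization of image compression and quality enhancement achieve better rate-distortion performance than separately trained components?
- RQ2: Does a Gaussian mixture model prior combined with global context improve entropy estimation and coding efficiency?
- RQ3: How does the joint scheme perform in PSNR and MS-SSIM relative to VVC Intra, BPG, JPEG2000, and previous learned methods?
- RQ4: What are the relative contributions of GRDN, global context, MPRM, and GMM to overall performance?
- RQ5: Does the proposed global context mechanism effectively capture non-local dependencies to improve coding efficiency?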
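Key Findings
- JointIQ-Net outperforms previous learned methods and conventional codecs in PSNR and MS-SSIM on the Kodak PhotoCD test.
- It is reported to exceed VVC Intra (VTM 7.1) in both PSNR and MS-SSIM, which would make it the first learned image compression method to reach this level.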
- GMM-based priors with an enhanced estimator and global context provide coding gains over single-Gaussian models.
- A cascade with GRDN as the quality-enhancement module yields the best performance among tested configurations.
- Ablation studies show significant gains from GRDN and GMM; global context yields additional improvements, while MPRM aids higher bitrates.
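Since the findings above are stated in terms of PSNR, a minimal sketch of how that metric is commonly computed may help in reading them. It assumes 8-bit pixel values (peak of 255) and flat sequences of pixels; the function name `psnr` is hypothetical and the paper's exact evaluation protocol is not reproduced here.

```python
import math


def psnr(reference, reconstruction, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally long pixel sequences."""
    err = sum((r - x) ** 2 for r, x in zip(reference, reconstruction)) / len(reference)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / err)


# Usage: a reconstruction that is off by one gray level at every pixel.
value = psnr([100, 100, 100, 100], [101, 99, 101, 99])
```

Higher values mean the reconstruction is closer to the original; codecs are usually compared by the PSNR they reach at the same bitrate.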
This summary was generated by AI and reviewed by a human editor.