[Paper Review] Compression Artifacts Removal Using Convolutional Neural Networks
This paper trains large deep CNNs with residual learning and skip connections to remove JPEG compression artifacts, achieving state-of-the-art results versus AR-CNN, SA-DCT, and spp across PSNR, PSNR-B, and SSIM on standard datasets.
This paper shows that it is possible to train large and deep convolutional neural networks (CNN) for JPEG compression artifacts reduction, and that such networks can provide significantly better reconstruction quality compared to previously used smaller networks as well as to any other state-of-the-art methods. We were able to train networks with 8 layers in a single step and in relatively short time by combining residual learning, skip architecture, and symmetric weight initialization. We provide further insights into convolution networks for JPEG artifact reduction by evaluating three different objectives, generalization with respect to training dataset size, and generalization with respect to JPEG quality level.
Motivation & Objective
- Show that JPEG artifact removal benefits from convolutional networks larger than the small architectures used previously.
- Develop and evaluate deep FCN architectures with residual and skip connections for artifact removal.
- Investigate how initialization, learning objectives, and training strategies affect convergence and performance.
- Assess generalization across JPEG quality levels and training dataset sizes.
Proposed method
- Use fully convolutional networks (L4 and L8) with 4 and 8 layers respectively.
- Adopt residual learning by predicting image residuals rather than direct mappings.
- Incorporate a skip architecture by concatenating early-layer activations to deeper layers.
- Experiment with three objectives: direct mapping, residual learning, and an edge-preserving (Sobel-based) loss.
- Center filters during initialization to achieve symmetric weight initialization and enable higher learning rates.
- Train on BSDS500 (400 images) and evaluate on LIVE1 and BSDS500 validation sets using PSNR, PSNR-B, and SSIM.
- Compare against SOTA methods (AR-CNN, SA-DCT, spp) and analyze generalization across JPEG qualities and dataset sizes.
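The residual-learning objective and the PSNR metric listed above can be illustrated with a minimal sketch. It assumes images are flat lists of pixel values in the 0 to 255 range; the helper names (`residual_target`, `reconstruct`, `psnr`) and the toy pixel values are hypothetical and do not come from the paper. The "network" here is a stand-in that recovers half of the true residual, not a trained CNN.

```python
import math

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)

def residual_target(clean, compressed):
    """Training target under residual learning: clean minus compressed input."""
    return [c - d for c, d in zip(clean, compressed)]

def reconstruct(compressed, predicted_residual):
    """Restored image: the compressed input plus the predicted residual."""
    return [d + r for d, r in zip(compressed, predicted_residual)]

# Toy 1-D "image" rows; the values are made up for illustration.
clean = [52.0, 55.0, 61.0, 66.0]
compressed = [50.0, 56.0, 60.0, 70.0]

target = residual_target(clean, compressed)
# Hypothetical imperfect predictor: recovers half of the true residual.
predicted = [0.5 * t for t in target]
restored = reconstruct(compressed, predicted)

print("PSNR of compressed input:", round(psnr(clean, compressed), 2), "dB")
print("PSNR of restored output: ", round(psnr(clean, restored), 2), "dB")
```

Because the output is formed as input plus residual, a predictor that outputs all zeros already reproduces the compressed image, so the network only has to model the (typically small) correction. That is one common intuition for why residual objectives tend to converge faster than direct mapping.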
Experimental results
Research questions
- RQ1: Can large, deep CNNs surpass previous state-of-the-art methods for JPEG artifact removal?
- RQ2: What is the impact of residual learning versus direct mapping versus edge-preserving loss on reconstruction quality?
- RQ3: How well do networks generalize across different JPEG quality levels and training data sizes?
- RQ4: Do network architecture (L4 vs L8) and initialization affect training speed and performance?
- RQ5: What are the trade-offs in computation speed and parameter count for practical deployment?
Key findings
- The L8 residual network outperforms all compared methods on LIVE1 and BSDS500 in PSNR, PSNR-B, and SSIM across the tested JPEG qualities (Tables 3 and 4).
- Residual learning converges faster than direct mapping, which makes it feasible to train deeper networks (e.g., 8 layers) within a reasonable number of iterations (250k) (Figure 6, Table 5).
- L4, the smaller network, generalizes well when trained on the 400-image BSDS500 set, often outperforming competing methods while being more efficient.
- Edge-preserving loss did not noticeably improve results for L4 compared to residual learning (Table 5).
- Processing speed on a GTX 780 with cuDNN: L4 processes 1 MPx in 220 ms and L8 processes 1 MPx in 1052 ms; L4 requires ~140k FLOPs per pixel and L8 ~440k FLOPs per pixel.
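To put the timing figures above in perspective, the following is a small worked calculation using only the numbers quoted in this review. The variable names are illustrative, and the per-frame estimate assumes that runtime scales linearly with pixel count, which the review does not state.

```python
# Figures quoted in the review (GTX 780 with cuDNN).
ms_per_mpx = {"L4": 220.0, "L8": 1052.0}
ops_per_pixel = {"L4": 140e3, "L8": 440e3}

# Throughput in megapixels per second.
throughput = {name: 1000.0 / ms for name, ms in ms_per_mpx.items()}

# How much slower L8 is, versus how many more operations it needs per pixel.
time_ratio = ms_per_mpx["L8"] / ms_per_mpx["L4"]
ops_ratio = ops_per_pixel["L8"] / ops_per_pixel["L4"]

# Assumption: runtime is linear in pixel count (not stated in the review).
full_hd_mpx = 1920 * 1080 / 1e6
full_hd_ms = {name: ms * full_hd_mpx for name, ms in ms_per_mpx.items()}

for name in ("L4", "L8"):
    print(f"{name}: {throughput[name]:.2f} MPx/s, "
          f"~{full_hd_ms[name]:.0f} ms per 1920x1080 frame")
print(f"L8/L4 runtime ratio: {time_ratio:.2f}, operation ratio: {ops_ratio:.2f}")
```

By these figures L8 is roughly 4.8 times slower than L4 while needing roughly 3.1 times as many operations per pixel, so the measured runtime grows faster than the operation count alone would suggest.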
This review was created by AI and reviewed by human editors.