QUICK REVIEW

[Paper Review] Mitigating Artifacts in Pre-quantization Based Scientific Data Compressors with Quantization-aware Interpolation

Pu Jiao, Sheng Di | arXiv (Cornell University) | Feb 23, 2026

Distributed and Parallel Computing Systems | Cited by: 0

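One-Sentence Summary

The paper introduces a post-decompression, quantization-aware interpolation that mitigates the artifacts of pre-quantization based data compressors, improving data quality without changing compression throughput.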

ABSTRACT

Error-bounded lossy compression has been regarded as a promising way to address the ever-increasing amount of scientific data in today's high-performance computing systems. Pre-quantization, a critical technique to remove sequential dependency and enable high parallelism, is widely used to design and develop high-throughput error-controlled data compressors. Despite the extremely high throughput of pre-quantization based compressors, they generally suffer from low data quality with medium or large user-specified error bounds. In this paper, we investigate the artifacts generated by pre-quantization based compressors and propose a novel algorithm to mitigate them. Our contributions are fourfold: (1) We carefully characterize the artifacts in pre-quantization based compressors to understand the correlation between the quantization index and compression error; (2) We propose a novel quantization-aware interpolation algorithm to improve the decompressed data; (3) We parallelize our algorithm in both shared-memory and distributed-memory environments to obtain high performance; (4) We evaluate our algorithm and validate it with two leading pre-quantization based compressors using five real-world datasets. Experiments demonstrate that our artifact mitigation algorithm can effectively improve the quality of decompressed data produced by pre-quantization based compressors while maintaining their high compression throughput.

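Motivation and Objectives

Characterize the artifacts produced by pre-quantization based compressors to understand how the compression error relates to the quantization index.
Develop a post-decompression artifact mitigation method that does not affect compression throughput.
Design a quantization-aware interpolation algorithm that can be applied to multiple pre-quantization based compressors.
Parallelize artifact mitigation in both shared-memory and distributed-memory environments.
Validate the method on real-world datasets with leading pre-quantization based compressors.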
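
To make the artifacts behind these objectives concrete, the following minimal Python sketch shows pre-quantization in isolation. It assumes the common linear scheme in which every value is rounded, independently of its neighbors, to the nearest multiple of 2 × eb, where eb is the user-specified error bound. The function names and the sample data are illustrative only, and the compressors discussed here may differ in detail.

```python
def prequantize(data, eb):
    """Map each value to an integer quantization index.

    Every value is handled on its own, so there is no dependency
    between neighboring points (assumed linear scheme, bin width 2*eb).
    """
    step = 2.0 * eb
    return [round(x / step) for x in data]


def dequantize(indices, eb):
    """Reconstruct values as the centers of their quantization bins."""
    step = 2.0 * eb
    return [q * step for q in indices]


# Hypothetical smooth ramp and error bound, chosen only for illustration.
data = [0.00, 0.04, 0.09, 0.13, 0.18, 0.22, 0.27, 0.31]
eb = 0.1

indices = prequantize(data, eb)
recon = dequantize(indices, eb)
errors = [x - r for x, r in zip(data, recon)]

print(indices)  # [0, 0, 0, 1, 1, 1, 1, 2]
print([round(e, 2) for e in errors])
```

In this sketch the smooth ramp is reconstructed as a staircase. The error stays within eb, but it climbs toward +eb just before the index changes and flips toward -eb just after it, which is the kind of index-dependent sign and magnitude pattern the characterization step is concerned with.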

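Proposed Method

Characterize the correlation between the quantization index and the decompression error, including how the sign and magnitude of the error behave near quantization boundaries.
Develop a quantization-aware interpolation algorithm that uses a Euclidean distance transform (EDT) to compute distances to quantization boundaries and to sign-flip boundaries, then performs two-point interpolation with inverse distance weighting.
Apply the compensation after decompression by adding the interpolated error correction to the data, improving data quality without touching the compression stage.
Parallelize the framework with OpenMP for shared memory and with MPI for distributed memory, including optimizations near processor boundaries.
Demonstrate applicability to multiple pre-quantization based compressors, such as cuSZ and cuSZp2, on five real-world datasets.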
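
The steps above can be illustrated with a simplified 1D sketch in Python. This is not the authors' implementation: it assumes a linear quantizer with bin width 2 × eb, places each quantization boundary half a sample beyond the end of a plateau, assumes the error there is roughly ±eb pointing toward the neighboring index, and assumes the sign flip lies midway between two boundaries of opposite sign. Plateaus without a sign flip (local extrema and array ends) are left uncompensated. All function names are hypothetical.

```python
import math


def sign(v):
    return (v > 0) - (v < 0)


def boundary_scan(q, reverse=False):
    """One directional pass of a 1D distance transform.

    For each sample, return (distance, error_sign) of the closest
    quantization boundary on the left (or on the right if reverse=True),
    or (inf, 0) when there is none. error_sign points toward the index
    of the neighboring plateau.
    """
    n = len(q)
    order = range(n - 1, -1, -1) if reverse else range(n)
    offset = 1 if reverse else -1  # neighbor on the scanned side
    out = [(math.inf, 0)] * n
    dist, sgn = math.inf, 0
    for i in order:
        j = i + offset
        if 0 <= j < n and q[j] != q[i]:
            dist, sgn = 0.5, sign(q[j] - q[i])  # boundary is half a sample away
        else:
            dist += 1.0
        out[i] = (dist, sgn)
    return out


def compensate(q, eb):
    """Dequantize, then add an interpolated error estimate to each value."""
    left = boundary_scan(q)
    right = boundary_scan(q, reverse=True)
    out = []
    for qi, (dl, sl), (dr, sr) in zip(q, left, right):
        value = qi * 2.0 * eb
        if sl * sr == -1:  # opposite signs: the error crosses zero in between
            d_b, s = (dl, sl) if dl <= dr else (dr, sr)  # nearest boundary
            d_f = (dl + dr) / 2.0 - d_b  # distance to the assumed sign flip
            # Two-point inverse distance weighting between the boundary
            # (error s*eb, weight 1/d_b) and the sign flip (error 0,
            # weight 1/d_f), written in a form that is safe when d_f == 0.
            value += s * eb * d_f / (d_b + d_f)
        out.append(value)
    return out


def rmse(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Hypothetical linear ramp, used only to exercise the sketch.
eb = 0.1
data = [0.02 * i + 0.005 for i in range(40)]
q = [round(x / (2.0 * eb)) for x in data]
plain = [qi * 2.0 * eb for qi in q]
fixed = compensate(q, eb)
print(rmse(data, plain), rmse(data, fixed))
```

The correction uses only the quantization indices and the error bound, both of which are available after decompression, so the compression stage is left untouched. In this sketch the correction magnitude never exceeds eb, but nothing here establishes that the compensated values still satisfy the original error bound.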

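Experimental Results

Research Questions

RQ1: What are the dominant artifact patterns in pre-quantization based scientific data compressors?
RQ2: Can post-decompression quantization-aware interpolation mitigate artifacts without changing compression throughput?
RQ3: How can EDT and sign propagation be used to interpolate accurately and compensate decompressed data in multiple dimensions?
RQ4: Does the proposed method generalize to leading pre-quantization based compressors and real-world datasets?
RQ5: How large are the quantitative gains from artifact mitigation, both in data quality (e.g., SSIM/PSNR) and in effective compression ratio?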

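Key Findings

Artifact mitigation significantly improves the data quality of pre-quantization based compressors.
The method improves the SSIM of decompressed data by up to 108.33%.
At matched SSIM, it achieves compression ratio gains of up to 1.17× for cuSZ and 1.34× for cuSZp.
Quantization-aware interpolation preserves high throughput because it operates only on decompressed data and leaves the compression pipeline unmodified.
The framework supports both shared-memory and distributed-memory parallelization to maintain scalability and performance.
The improved PSNR also makes it possible to relax the error bound in high-throughput scientific workflows.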
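
For readers who want to reproduce this kind of comparison on their own data, here is a minimal Python sketch of the two pieces of arithmetic involved. It assumes the value-range form of PSNR that is common in scientific data compression; this review does not state which formula the paper uses, and both function names are illustrative.

```python
import math


def psnr(original, decompressed):
    """PSNR in dB using the value range of the original data as the peak.

    Assumes the original data is not constant (value range > 0).
    """
    value_range = max(original) - min(original)
    mse = sum((a - b) ** 2 for a, b in zip(original, decompressed)) / len(original)
    if mse == 0.0:
        return math.inf  # identical arrays
    return 20.0 * math.log10(value_range) - 10.0 * math.log10(mse)


def improvement_factor(percent):
    """Convert a relative improvement in percent to a multiplicative factor."""
    return 1.0 + percent / 100.0


# Hypothetical data with a uniform error of 0.1 over a value range of 4.
original = [0.0, 1.0, 2.0, 3.0, 4.0]
decompressed = [0.1, 1.1, 2.1, 3.1, 4.1]
print(psnr(original, decompressed))

# A relative improvement of 108.33% means the metric slightly more than doubles.
print(improvement_factor(108.33))
```

Under this definition, PSNR depends only on the value range and the mean squared error, so any reduction in error after compensation raises it directly.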


This review was generated by AI and checked by human editors.