QUICK REVIEW

[Paper Review] Why Not to Use Zero Imputation? Correcting Sparsity Bias in Training Neural Networks

Joonyoung Yi, Juhyuk Lee|arXiv (Cornell University)|Apr 30, 2020

Domain Adaptation and Few-Shot Learning | References: 49 | Cited by: 3

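One-Sentence Summary

This paper identifies the variable sparsity problem (VSP), in which a model's output varies with the missing rate of its input, as a key reason neural networks perform poorly when missing data is handled by zero imputation. It proposes Sparsity Normalization (SN), which corrects this input-level sparsity bias and improves accuracy and training stability on a range of benchmarks.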

ABSTRACT

Handling missing data is one of the most fundamental problems in machine learning. Among many approaches, the simplest and most intuitive way is zero imputation, which treats the value of a missing entry simply as zero. However, many studies have experimentally confirmed that zero imputation results in suboptimal performances in training neural networks. Yet, none of the existing work has explained what brings such performance degradations. In this paper, we introduce the variable sparsity problem (VSP), which describes a phenomenon where the output of a predictive model largely varies with respect to the rate of missingness in the given input, and show that it adversarially affects the model performance. We first theoretically analyze this phenomenon and propose a simple yet effective technique to handle missingness, which we refer to as Sparsity Normalization (SN), that directly targets and resolves the VSP. We further experimentally validate SN on diverse benchmark datasets, to show that debiasing the effect of input-level sparsity improves the performance and stabilizes the training of neural networks.

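Motivation and Objectives

Identify the root cause of the performance degradation that neural networks suffer when missing data is handled by zero imputation.
Formalize the variable sparsity problem (VSP), the phenomenon in which a model's output changes markedly with the missing rate of its input.
Propose a method that directly corrects input-level sparsity bias in neural network training.
Validate the proposed method on diverse benchmark datasets.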
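
To make the VSP concrete, here is a minimal sketch, not taken from the paper, of how zero imputation ties the output of a single linear unit to the missing rate. The all-ones weights, the uniform feature distribution, and the names `zero_impute`, `linear`, and `mean_output` are assumptions chosen only for illustration.

```python
import random

def zero_impute(x, mask):
    """Replace unobserved entries (mask == 0) with zero."""
    return [xi * mi for xi, mi in zip(x, mask)]

def linear(x, w):
    """A single linear unit without bias: the dot product of w and x."""
    return sum(wi * xi for wi, xi in zip(w, x))

def mean_output(observed_rate, n_features=100, n_samples=2000, seed=0):
    """Average output of a fixed linear unit over random inputs and masks."""
    rng = random.Random(seed)
    w = [1.0] * n_features  # hypothetical fixed weights
    total = 0.0
    for _ in range(n_samples):
        # Features drawn uniformly from [0.5, 1.5], so each has mean 1.0.
        x = [rng.uniform(0.5, 1.5) for _ in range(n_features)]
        # Each entry is observed independently with probability observed_rate.
        mask = [1 if rng.random() < observed_rate else 0 for _ in range(n_features)]
        total += linear(zero_impute(x, mask), w)
    return total / n_samples

# In expectation the output is observed_rate * n_features, so it shrinks
# as more entries go missing even though the features themselves are unchanged.
dense = mean_output(0.9)
sparse = mean_output(0.3)
print(dense, sparse)
```

Under these assumptions the average output tracks the fraction of observed entries rather than anything about the feature values, which is the dependence the VSP describes.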

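Proposed Method

Introduce the variable sparsity problem (VSP) as a theoretical framework that explains why performance is unstable across missing rates.
Propose Sparsity Normalization (SN), which normalizes input features to reduce the effect of the sparsity caused by missing values.
Apply SN during training by rescaling features according to the missingness pattern, which stabilizes gradients and predictions.
The method acts directly on the input layer and requires no complex architectural changes or additional parameters.
SN is simple and efficient by design and fully compatible with standard neural network training pipelines.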
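
This summary does not state the SN formula, so the following is only a sketch under a stated assumption: that SN multiplies the zero-imputed input by a constant `k` and divides it by the number of observed entries. The function name `sparsity_normalize`, the parameter `k`, and the handling of fully missing inputs are hypothetical choices for illustration.

```python
def sparsity_normalize(x, mask, k):
    """Zero-impute x, then rescale by k / (number of observed entries).

    Assumed form of SN; `k` is a hypothetical constant, for example the
    average number of observed entries per sample in the training set.
    """
    n_observed = sum(mask)
    if n_observed == 0:
        # Assumption: an input with nothing observed maps to all zeros.
        return [0.0] * len(x)
    scale = k / n_observed
    return [xi * mi * scale for xi, mi in zip(x, mask)]

x = [1.0, 3.0, 1.0, 3.0]
dense = sparsity_normalize(x, [1, 1, 1, 1], k=4)   # all four entries observed
sparse = sparsity_normalize(x, [1, 1, 0, 0], k=4)  # only the first two observed
print(dense, sum(dense))
print(sparse, sum(sparse))
```

With this rescaling the sum of the normalized input equals `k` times the mean of the observed values, so it no longer grows with the number of observed entries. In the usage above both inputs sum to the same value even though one has half of its entries missing.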

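Experimental Results

Research Questions

RQ1: Why does zero imputation lead to suboptimal performance when training neural networks?
RQ2: How do changes in the missing rate of the input data affect a model's generalization and prediction stability?
RQ3: Can a simple normalization technique effectively correct the bias introduced by data sparsity?
RQ4: Does debiasing input-level sparsity improve training stability and model accuracy across diverse datasets?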

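Key Findings

The variable sparsity problem (VSP) is identified as the main cause of performance degradation in models that use zero imputation.
Sparsity Normalization (SN) reduces the effect of input-level sparsity, making model predictions more stable and more accurate.
By correcting the bias caused by missing data, SN improves model performance on diverse benchmark datasets.
The method stabilizes training dynamics, most notably under high or variable missing rates.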


This review was generated by AI and reviewed by human editors.