QUICK REVIEW

[Paper Review] The Missing Covariate Indicator Method is Nearly Valid Almost Always

Xu, Gang, Mingyang Song|arXiv (Cornell University)|Oct 30, 2021

Statistical Methods in Epidemiology | Cited by 33

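One-sentence summary

The paper derives when the Missing Covariate Indicator Method (MCIM) is biased and shows that it is nearly valid in typical epidemiologic settings; bias is largely avoided when missingness is independent of the outcome and the covariate is a risk factor for the outcome but not a confounder.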

ABSTRACT

Background: Although the missing covariate indicator method (MCIM) has been shown to be biased under extreme conditions, the degree and determinants of bias have not been formally assessed. We derived the formula for the relative bias in the MCIM and systematically investigated conditions under which bias arises. We found that the extent of bias is independent of both the disease rate and the exposure-outcome association, but it is a function of 5 parameters: exposure and covariate prevalences, covariate missingness proportion, and associations of covariate with exposure and outcome. The MCIM was unbiased when the missing covariate is a risk factor for the outcome but not a confounder. The average median relative bias was zero across each of the parameters over a wide range of values considered. Our simulation study demonstrated that the mean and median of relative bias of MCIM was comparable to that of the no missingness method, which used the full sample with complete information for all variables, as long as the missingness of covariate is independent of the outcome. When missingness was no greater than 50%, less than 5% of the scenarios considered had relative bias greater than 10%. In several analyses of the Harvard cohort studies, the MCIM produced materially the same results as the multiple imputation method. In conclusion, the MCIM is nearly valid almost always in settings typically encountered in epidemiology and its continued use is recommended, unless the covariate is missing in an extreme proportion or acts as a strong confounder.

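Research motivation and objectives

Assess the bias of the Missing Covariate Indicator Method (MCIM) under different missing-data conditions.
Derive a formula for the relative bias and identify its key determinants.
Evaluate MCIM against the complete-information analysis and multiple imputation in simulations and real cohort analyses.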

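Proposed method

Express the relative bias of MCIM as a function of five parameters: exposure prevalence, covariate prevalence, covariate missingness proportion, and the associations of the covariate with the exposure and with the outcome.
Characterize the case in which MCIM is unbiased (the covariate is a risk factor for the outcome but not a confounder).
Run a simulation study comparing MCIM across scenarios with the no-missingness method, which uses the full sample with complete information on all variables.
Apply MCIM to analyses of the Harvard cohort studies and compare the results with multiple imputation.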
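
The indicator method itself is simple to state in code. The sketch below is illustrative only and is not taken from the paper: it assumes missing covariate values are represented as `None`, replaces them with a constant (0 here), and adds a 0/1 missingness indicator so that both columns can enter the regression alongside the exposure. The function names `mcim_encode` and `mcim_design_rows` and the toy data are hypothetical.

```python
def mcim_encode(covariate, fill_value=0.0):
    """Return (filled, indicator) for a covariate where None marks a missing value.

    filled    : covariate with missing entries replaced by fill_value
    indicator : 1 where the covariate was missing, 0 where it was observed
    """
    filled = [fill_value if c is None else float(c) for c in covariate]
    indicator = [1 if c is None else 0 for c in covariate]
    return filled, indicator


def mcim_design_rows(exposure, covariate):
    """Build design-matrix rows: [intercept, exposure, filled covariate, missing indicator]."""
    filled, indicator = mcim_encode(covariate)
    return [
        [1.0, float(x), c, float(r)]
        for x, c, r in zip(exposure, filled, indicator)
    ]


# Toy data (hypothetical): four subjects, covariate missing for two of them.
exposure = [1, 0, 1, 0]
smoking = [1, None, 0, None]
rows = mcim_design_rows(exposure, smoking)
for row in rows:
    print(row)
```

Every subject stays in the analysis; the indicator column absorbs the average outcome difference between subjects with and without an observed covariate value.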

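Experimental results

Research questions

RQ1: Under what conditions does MCIM introduce bias, and how large can that bias be?
RQ2: How do the five parameters (exposure and covariate prevalences, missingness proportion, and the covariate-exposure and covariate-outcome associations) affect the bias?
RQ3: Is MCIM unbiased when the missing covariate is a risk factor for the outcome but not a confounder?
RQ4: How does MCIM perform relative to the complete-data method and multiple imputation in simulated and real data?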

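Main findings

The bias of MCIM is a function of the five parameters and is independent of both the disease rate and the exposure-outcome association.
MCIM is unbiased when the missing covariate is a risk factor for the outcome but not a confounder.
The average median relative bias was zero across each parameter over the wide range of values considered.
In simulations, the mean and median relative bias of MCIM were comparable to those of the no-missingness method, as long as covariate missingness was independent of the outcome.
When missingness was no greater than 50%, fewer than 5% of the scenarios considered had a relative bias above 10%.
In the Harvard cohort analyses, MCIM produced materially the same results as multiple imputation.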
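
The contrast between a covariate that is only a risk factor and one that is a confounder can be reproduced in a small simulation. The sketch below is not the paper's simulation design. As a simplifying assumption it uses a continuous outcome fitted by ordinary least squares, with hypothetical parameter values (covariate prevalence 0.3, missingness proportion 0.3, true exposure effect 1.0) and missingness generated independently of everything else. Relative bias is taken here as (estimate - truth) / truth.

```python
import numpy as np


def simulate_relative_bias(confounding, n=200_000, miss_prop=0.3, seed=0):
    """Relative bias of the MCIM exposure estimate in a toy linear model.

    confounding: how much the covariate raises the exposure probability;
                 0.0 makes the covariate a risk factor but not a confounder.
    """
    rng = np.random.default_rng(seed)
    true_beta = 1.0                                   # assumed true exposure effect

    c = (rng.random(n) < 0.3).astype(float)           # binary covariate
    p_x = 0.3 + confounding * c                       # exposure depends on c if confounding > 0
    x = (rng.random(n) < p_x).astype(float)           # binary exposure
    y = true_beta * x + 2.0 * c + rng.normal(size=n)  # outcome; c is a risk factor

    r = (rng.random(n) < miss_prop).astype(float)     # missingness, independent of y
    c_filled = np.where(r == 1, 0.0, c)               # MCIM: fill missing values with 0

    # Regress y on intercept, exposure, filled covariate and missing indicator.
    design = np.column_stack([np.ones(n), x, c_filled, r])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return (coef[1] - true_beta) / true_beta


rb_risk_factor_only = simulate_relative_bias(confounding=0.0)
rb_confounder = simulate_relative_bias(confounding=0.4)
print("risk factor only:", rb_risk_factor_only)
print("confounder:      ", rb_confounder)
```

In this toy setup the relative bias stays close to zero when the covariate does not affect the exposure, and becomes clearly positive when it does, because subjects with a missing covariate contribute an exposure contrast that is not adjusted for the covariate.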

This review was generated by AI and checked by a human editor.