Skip to main content
QUICK REVIEW

[论文解读] Regularized calibrated estimation of propensity scores with model misspecification and high-dimensional data

Zhiqiang Tan|arXiv (Cornell University)|Oct 23, 2017
Advanced Causal Inference Techniques参考文献 2被引用 17
一句话总结

本文提出了一种正则化校准估计量用于倾向得分,可在模型误设和高维协变量条件下改进逆概率加权(IPW)方法。通过使用一种新颖的Fisher评分下降算法,最小化带有Lasso惩罚的校准损失,该方法降低了IPW估计量的均方误差,并实现了高维一致性,在模拟和实证应用中优于最大似然估计和标准正则化方法。

ABSTRACT

Propensity score methods are widely used for estimating treatment effects from observational studies. A popular approach is to estimate propensity scores by maximum likelihood based on logistic regression, and then apply inverse probability weighted estimators or extensions to estimate treatment effects. However, a challenging issue is that such inverse probability weighting methods including doubly robust methods can perform poorly even when the logistic model appears adequate as examined by conventional techniques. In addition, there is increasing difficulty to appropriately estimate propensity scores when dealing with a large number of covariates. To address these issues, we study calibrated estimation as an alternative to maximum likelihood estimation for fitting logistic propensity score models. We show that, with possible model misspecification, minimizing the expected calibration loss underlying the calibrated estimators involves reducing both the expected likelihood loss and a measure of relative errors which controls the mean squared errors of inverse probability weighted estimators. Furthermore, we propose a regularized calibrated estimator by minimizing the calibration loss with a Lasso penalty. We develop a novel Fisher scoring descent algorithm for computing the proposed estimator, and provide a high-dimensional analysis of the resulting inverse probability weighted estimators of population means, leveraging the control of relative errors for calibrated estimation. We present a simulation study and an empirical application to demonstrate the advantages of the proposed methods compared with maximum likelihood and regularization.

研究动机与目标

  • 解决倾向得分模型即使在传统准则下拟合良好时,逆概率加权(IPW)方法性能仍不佳的问题。
  • 克服在协变量数量较大或与样本量相当的高维设定下,最大似然估计方法的局限性。
  • 开发一种理论基础坚实的正则化校准估计量,在模型误设和高维协变量条件下保持一致性和效率。
  • 对所得IPW估计量进行高维渐近分析,通过相对误差度量控制均方误差。
  • 提出一种新颖的Fisher评分下降算法,以高效计算正则化校准估计量。

提出的方法

  • 将校准估计表述为最小化校准损失,以强制处理组与总体样本协变量均值之间的平衡。
  • 在校准损失中引入Lasso惩罚,以实现变量选择并处理高维协变量。
  • 开发一种新颖的Fisher评分下降算法,以迭代方式计算正则化校准估计量,确保收敛性和计算可行性。
  • 校准损失同时最小化期望似然损失和一种相对误差度量,从而控制IPW估计量的均方误差。
  • 利用对相对误差的控制,推导出所得IPW估计量在高维渐近理论下的结果,适用于总体均值的估计。
  • 使用估计方程强制在处理组中实现加权协变量平衡,从而增强对模型误设的鲁棒性。

实验结果

研究问题

  • RQ1当存在模型误设时,校准估计是否能为倾向得分建模提供相对于最大似然估计的正式优势?
  • RQ2如何在p ≈ n或p > n的高维设定下扩展和分析校准估计?
  • RQ3模型误设对逆概率加权估计量性能有何影响?校准估计能否缓解这一问题?
  • RQ4正则化校准估计量能否在IPW估计中实现高维一致性并改善均方误差控制?
  • RQ5何种算法框架能够实现在高维设定下正则化校准估计量的高效计算?

主要发现

  • 校准估计量同时降低了期望似然损失和一种相对误差度量,后者可直接控制逆概率加权估计量的均方误差。
  • 所提出的正则化校准估计量在适当的正则性条件下,对IPW估计量的总体均值实现了高维一致性。
  • 新颖的Fisher评分下降算法确保了即使在高维设定下,正则化校准估计量的计算也稳定且高效。
  • 模拟研究和实证应用表明,所提出方法在偏差和均方误差方面优于最大似然估计和标准正则化方法。
  • 当倾向得分模型和结果回归模型均被误设时,该方法仍能保持一致性和鲁棒性,而双重稳健方法则不然。
  • 理论分析表明,该估计量通过控制相对误差来限制IPW估计量的均方误差,这在高维和误设设定下至关重要。

更好的研究,从现在开始

从论文设计到论文写作,大幅缩短您的研究时间。

无需绑定信用卡

本解读由 AI 生成,并经人工编辑审核。