QUICK REVIEW

[Paper Review] An Introduction to Proximal Causal Learning

Eric J. Tchetgen Tchetgen, Andrew Ying|arXiv (Cornell University)|Sep 23, 2020
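Advanced Causal Inference Techniques | References: 19 | Cited by: 35
One-Sentence Summary

This paper develops proximal causal learning, which uses proxies of unmeasured confounders to identify causal effects from observational data, and proposes the proximal g-formula and the proximal g-computation algorithm.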

ABSTRACT

A standard assumption for causal inference from observational data is that one has measured a sufficiently rich set of covariates to ensure that within covariate strata, subjects are exchangeable across observed treatment values. Skepticism about the exchangeability assumption in observational studies is often warranted because it hinges on investigators' ability to accurately measure covariates capturing all potential sources of confounding. Realistically, confounding mechanisms can rarely if ever, be learned with certainty from measured covariates. One can therefore only ever hope that covariate measurements are at best proxies of true underlying confounding mechanisms operating in an observational study, thus invalidating causal claims made on basis of standard exchangeability conditions. Causal learning from proxies is a challenging inverse problem which has to date remained unresolved. In this paper, we introduce a formal potential outcome framework for proximal causal learning, which while explicitly acknowledging covariate measurements as imperfect proxies of confounding mechanisms, offers an opportunity to learn about causal effects in settings where exchangeability on the basis of measured covariates fails. Sufficient conditions for nonparametric identification are given, leading to the proximal g-formula and corresponding proximal g-computation algorithm for estimation. These may be viewed as generalizations of Robins' foundational g-formula and g-computation algorithm, which account explicitly for bias due to unmeasured confounding. Both point treatment and time-varying treatment settings are considered, and an application of proximal g-computation of causal effects is given for illustration.

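Motivation and Objectives

  • Motivate and formalize the challenge of unmeasured confounding in observational causal inference.
  • Introduce a proximal framework that classifies proxies into three types to achieve identifiability.
  • Derive nonparametric identification results via the proximal g-formula under completeness conditions.
  • Develop estimation methods based on the proximal g-computation algorithm for both point treatment and time-varying treatment settings.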

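Proposed Method

  • Define the proxy types (type a: common causes of treatment and outcome; type b: treatment-inducing confounding proxies; type c: outcome-inducing confounding proxies).
  • Propose a proximal identification strategy that replaces exchangeability with proxies and completeness conditions.
  • Derive the proximal g-formula as a generalization of Robins' g-formula to settings with unmeasured confounding.
  • Introduce the outcome confounding bridge function h(a,x,w), which identifies causal effects by solving an integral equation.
  • Show that estimation can proceed via the proximal g-computation algorithm, with nonparametric identification holding under certain completeness assumptions.
  • Extend the framework to time-varying treatments and longitudinal data, with corresponding proxy-based completeness and bridge function conditions.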
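
The bridge function condition and the resulting proximal g-formula can be written out as follows. The summary above only names h(a,x,w); the other symbols are notation assumed here for illustration: Y is the outcome, A the treatment, X the type a covariates, Z the type b (treatment-inducing) proxies, W the type c (outcome-inducing) proxies, and Y^a the potential outcome under treatment level a.

```latex
% Outcome confounding bridge function: h solves a Fredholm integral equation
\begin{align}
E[Y \mid A = a, Z = z, X = x] &= \int h(a, x, w)\, dF(w \mid a, z, x), \\
% Proximal g-formula: average the bridge function over the observed (W, X)
\beta(a) = E[Y^{a}] &= \int h(a, x, w)\, dF(w, x) = E\big[h(a, X, W)\big].
\end{align}
```

For comparison, Robins' g-formula under exchangeability given X is \beta(a) = \int E[Y \mid A = a, X = x]\, dF(x). The proximal version replaces the outcome regression with the bridge function and averages over W as well as X.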

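Results

Research Questions

  • RQ1: How can causal effects be identified when unmeasured confounding causes exchangeability to fail?
  • RQ2: Can different types of proxies recover causal effects through the proximal g-formula?
  • RQ3: Which completeness and bridge function conditions are required for nonparametric identification in the point treatment and longitudinal settings?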

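Key Findings

  • Classifying proxies into types a, b, and c makes it possible to identify causal effects when exchangeability conditional on the observed covariates fails.
  • The proximal g-formula is derived as a generalization of Robins' g-formula that handles unmeasured confounding through proxies.
  • The outcome confounding bridge function h(a,x,w) solves a Fredholm integral equation, and its solution identifies the causal effect.
  • Under completeness conditions, the causal effect parameter beta(a) is nonparametrically identified as a function of the observed data via the proximal g-formula.
  • The method extends to time-varying treatments and longitudinal data, providing proximal identification results for complex longitudinal settings.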
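
To make the point treatment case concrete, here is a minimal sketch of proximal g-computation under an assumed linear bridge function h(a,x,w) = b0 + ba*a + bx*x + bw*w. In that case the procedure reduces to two least-squares regressions. The simulated data-generating process, the variable names (u, z, w), and the helper functions are all hypothetical, chosen only for illustration; this is not the paper's own application.

```python
import numpy as np


def ols(design, target):
    """Least-squares coefficients for target ~ design (design holds the intercept)."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


def proximal_2sls(y, a, x, z, w):
    """Two-stage estimate of the treatment coefficient under a linear bridge function."""
    ones = np.ones(len(y))
    # Stage 1: regress the outcome-inducing proxy W on (1, A, Z, X).
    stage1 = np.column_stack([ones, a, z, x])
    w_hat = stage1 @ ols(stage1, w)
    # Stage 2: regress Y on (1, A, X, W_hat); the coefficient on A estimates ba.
    stage2 = np.column_stack([ones, a, x, w_hat])
    return ols(stage2, y)[1]


def naive_ols(y, a, x):
    """Standard adjustment for X only, ignoring the unmeasured confounder."""
    design = np.column_stack([np.ones(len(y)), a, x])
    return ols(design, y)[1]


def simulate(n, seed=0, effect=2.0):
    """Hypothetical linear-Gaussian model with an unmeasured confounder u."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)                       # unmeasured confounder
    x = rng.normal(size=n)                       # type a: measured common cause
    z = u + rng.normal(size=n)                   # type b: treatment-inducing proxy
    w = u + rng.normal(size=n)                   # type c: outcome-inducing proxy
    a = u + x + rng.normal(size=n)               # treatment depends on u and x
    y = effect * a + u + x + rng.normal(size=n)  # outcome; z and w have no direct effect
    return y, a, x, z, w


if __name__ == "__main__":
    y, a, x, z, w = simulate(20_000)
    print("naive OLS estimate:    ", naive_ols(y, a, x))
    print("proximal 2SLS estimate:", proximal_2sls(y, a, x, z, w))
```

In this simulated setup the true effect is 2. The naive regression is biased upward because u drives both treatment and outcome, while the two-stage estimate should land close to the truth. With a linear bridge function, beta(a) - beta(a') = ba*(a - a'), so the stage 2 coefficient on the treatment is the causal contrast per unit of treatment. Nonlinear bridge functions would need a different way of solving the integral equation.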

This review was generated by AI and reviewed by a human editor.