QUICK REVIEW

[Paper Review] When Should We (Not) Interpret Linear IV Estimands as LATE?

Tymon Słoczyński|arXiv (Cornell University)|Nov 13, 2020

Advanced Causal Inference Techniques | Cited by 2

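One-Sentence Summary

This paper challenges the standard interpretation of the linear IV estimand as a weighted average of conditional local average treatment effects (LATEs). It shows that, under weak monotonicity, the conventional (noninteracted) IV specification can place negative weights on some conditional LATEs, because it implicitly restricts the effects of the instrument to be homogeneous. It also shows that the interacted specification of Angrist and Imbens (1995) avoids negative weights when strong monotonicity fails, which makes it a more robust basis for causal inference when the instrument's effects vary with covariates.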

ABSTRACT

In this paper I revisit the interpretation of the linear instrumental variables (IV) estimand as a weighted average of conditional local average treatment effects (LATEs). I focus on a situation in which additional covariates are required for identification while the reduced-form and first-stage regressions may be misspecified due to an implicit homogeneity restriction on the effects of the instrument. I show that the weights on some conditional LATEs are negative and the IV estimand is no longer interpretable as a causal effect under a weaker version of monotonicity, i.e. when there are compliers but no defiers at some covariate values and defiers but no compliers elsewhere. The problem of negative weights disappears in the interacted specification of Angrist and Imbens (1995), which avoids misspecification and seems to be underused in applied work. I illustrate my findings in an application to the causal effects of pretrial detention on case outcomes. In this setting, I reject the stronger version of monotonicity, demonstrate that the interacted instruments are sufficiently strong for consistent estimation using the jackknife methodology, and present several estimates that are economically and statistically different, depending on whether the interacted instruments are used.

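Motivation and Objectives

Revisit the causal interpretability of linear IV and 2SLS estimands when covariates are required for identification.
Examine whether the standard IV estimand remains a valid causal summary under weak monotonicity when the effects of the instrument are assumed to be homogeneous.
Compare the conventional IV specification with the interacted specification of Angrist and Imbens (1995) in terms of performance and interpretability when the model is misspecified.
Give applied researchers practical guidance on when to use interacted instruments and how to estimate the interacted specification consistently despite many-instruments bias.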

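Proposed Method

Derive, within the conditional LATE framework, the weights in the linear IV estimand as functions of the first-stage and reduced-form effects of the instrument.
Compare the weights in the standard IV specification (which assumes homogeneous instrument effects) with those in the interacted specification of Angrist and Imbens (1995) (which allows heterogeneous instrument effects).
Show analytically that, under weak monotonicity, negative weights can arise in the standard IV specification but not in the interacted specification.
Use jackknife IV estimators (such as FEJIV, UJIVE, and IJIVE) to correct for many-instruments bias in the interacted specification.
Apply the approach to a pretrial detention dataset, test the monotonicity assumptions, and compare estimates across specifications.
Use a bootstrap-based test to assess whether the estimates from the noninteracted and interacted specifications differ significantly.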
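
The weight comparison above can be made concrete in a simplified special case. The following is a sketch under stated assumptions, not the paper's general result: it assumes a binary instrument Z, a binary treatment D, and a discrete covariate X that enters every regression through a full set of indicators. The symbols e(x), \pi(x), and \tau(x) are notation introduced here for illustration: e(x) = P(Z = 1 | X = x), \pi(x) = E[D | Z = 1, X = x] - E[D | Z = 0, X = x] is the conditional first-stage effect, and \tau(x) is the conditional LATE (for compliers where \pi(x) > 0, for defiers where \pi(x) < 0), so the conditional reduced-form effect is \pi(x)\tau(x).

```latex
\begin{align*}
\beta_{\text{noninteracted}}
  &= \frac{E\big[(Z - e(X))\,Y\big]}{E\big[(Z - e(X))\,D\big]}
   = \frac{E\big[e(X)\,(1 - e(X))\,\pi(X)\,\tau(X)\big]}
          {E\big[e(X)\,(1 - e(X))\,\pi(X)\big]}, \\[4pt]
\beta_{\text{interacted}}
  &= \frac{E\big[\pi(X)\,(Z - e(X))\,Y\big]}{E\big[\pi(X)\,(Z - e(X))\,D\big]}
   = \frac{E\big[e(X)\,(1 - e(X))\,\pi(X)^{2}\,\tau(X)\big]}
          {E\big[e(X)\,(1 - e(X))\,\pi(X)^{2}\big]}.
\end{align*}
```

In the first line the weight on \tau(x) is proportional to e(x)(1 - e(x))\pi(x), so it takes the sign of \pi(x). If the instrument raises treatment at some covariate values and lowers it at others, some weights are negative. In the second line the weight is proportional to \pi(x)^2, which is never negative.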

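Results

Research Questions

RQ1: Under what conditions does the standard linear IV estimand fail to be interpretable as a causal effect because it places negative weights on some conditional LATEs?
RQ2: Under weak monotonicity, how does the choice between standard IV and the interacted specification of Angrist and Imbens (1995) affect the interpretability and validity of the estimand?
RQ3: With many instruments, can modern jackknife estimators (such as FEJIV) still deliver consistent estimates where conventional 2SLS is biased?
RQ4: In the empirical application, do the estimates from the noninteracted and interacted IV specifications differ in a statistically significant way?
RQ5: When strong monotonicity fails but weak monotonicity holds, to what extent does the interacted specification outperform standard IV?

Key Findings

Under weak monotonicity, the standard IV estimand can place negative weights on some conditional LATEs, in which case it cannot be interpreted as a causal effect.
By contrast, the interacted specification of Angrist and Imbens (1995) guarantees nonnegative weights under the same assumption, so the estimand remains interpretable.
In the pretrial detention application, several estimates differ both economically and statistically depending on whether the interacted instruments are used.
For the incarceration-length outcome, a bootstrap test rejects at the 5% significance level the null hypothesis that the noninteracted and interacted estimates are equal, indicating a substantive difference.
The FEJIV estimator corrects the many-instruments bias in the interacted specification and yields reliable estimates where 2SLS breaks down.
The weak-identification pre-test of Mikusheva and Sun (2022) proves useful for judging whether consistent estimation of the interacted specification is feasible.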
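
The negative-weight finding can be illustrated numerically. The sketch below is not the paper's code or data: it assumes a binary instrument, a binary treatment, and a single binary covariate, and it uses the textbook saturated-covariate weights (proportional to Var(Z|X)·π(X) without interactions and to Var(Z|X)·π(X)² with them, where π(X) is the conditional first-stage effect). The function names `iv_weights` and `tsls`, and all numbers in the design, are hypothetical. The simulation uses plain 2SLS with only two instruments rather than the jackknife estimators discussed in the paper.

```python
import numpy as np

def iv_weights(p_x, e_x, pi_x):
    """Population weights on conditional LATEs for the two specifications.

    p_x  : P(X = x) for each covariate cell
    e_x  : P(Z = 1 | X = x), the instrument propensity score
    pi_x : first-stage effect of Z on D in each cell (negative in defier cells)
    """
    p_x, e_x, pi_x = (np.asarray(a, dtype=float) for a in (p_x, e_x, pi_x))
    var_z = e_x * (1.0 - e_x)               # Var(Z | X = x)
    noninteracted = p_x * var_z * pi_x      # sign follows pi_x
    interacted = p_x * var_z * pi_x ** 2    # never negative
    return noninteracted / noninteracted.sum(), interacted / interacted.sum()

def tsls(y, d, exog, instruments):
    """Plain 2SLS coefficient on d; exog enters both stages."""
    first = np.column_stack([exog, instruments])
    d_hat = first @ np.linalg.lstsq(first, d, rcond=None)[0]
    second = np.column_stack([d_hat, exog])
    return np.linalg.lstsq(second, y, rcond=None)[0][0]

# Hypothetical design (assumption): compliers only at X = 0, defiers only at X = 1.
p_x, e_x = [0.5, 0.5], [0.5, 0.5]
pi_x = [0.6, -0.2]
tau_x = np.array([1.0, 2.0])                # conditional LATEs, both positive

w_non, w_int = iv_weights(p_x, e_x, pi_x)
beta_non = float(w_non @ tau_x)             # noninteracted estimand
beta_int = float(w_int @ tau_x)             # interacted estimand

# Simulated check of the same design.
rng = np.random.default_rng(0)
n = 200_000
x = rng.integers(0, 2, n)
z = rng.integers(0, 2, n)
u = rng.random(n)
# X = 0: 60% compliers (D = Z); X = 1: 20% defiers (D = 1 - Z); the rest never take up.
d = np.where(x == 0, z * (u < 0.6), (1 - z) * (u < 0.2)).astype(float)
y = tau_x[x] * d + x + rng.normal(size=n)

exog = np.column_stack([np.ones(n), x])
est_non = tsls(y, d, exog, z[:, None])
est_int = tsls(y, d, exog, np.column_stack([z * (x == 0), z * (x == 1)]))

print("noninteracted weights:", w_non, "estimand:", beta_non, "estimate:", est_non)
print("interacted weights:   ", w_int, "estimand:", beta_int, "estimate:", est_int)
```

In this design the noninteracted weights are 1.5 and -0.5, so the estimand is 0.5, which lies below both conditional LATEs (1 and 2). The interacted weights are 0.9 and 0.1, giving an estimand of 1.1 inside that range. The simulated 2SLS estimates should land close to these values, up to sampling noise.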

This review was generated by AI and checked by a human editor.