QUICK REVIEW

[Paper review] Logs with zeros? Some problems and solutions

Jiafeng Chen, Jonathan Roth|arXiv (Cornell University)|Dec 12, 2022

Advanced Causal Inference Techniques | Cited by 40

One-sentence summary

The paper shows that ATEs for log-like transformations of outcomes that can equal zero are arbitrarily scale-dependent and cannot be interpreted as unit-invariant percentage effects; it outlines alternative approaches, establishes a trilemma, and provides empirical illustrations.

ABSTRACT

When studying an outcome $Y$ that is weakly-positive but can equal zero (e.g. earnings), researchers frequently estimate an average treatment effect (ATE) for a "log-like" transformation that behaves like $\log(Y)$ for large $Y$ but is defined at zero (e.g. $\log(1+Y)$, $\mathrm{arcsinh}(Y)$). We argue that ATEs for log-like transformations should not be interpreted as approximating percentage effects, since unlike a percentage, they depend on the units of the outcome. In fact, we show that if the treatment affects the extensive margin, one can obtain a treatment effect of any magnitude simply by re-scaling the units of $Y$ before taking the log-like transformation. This arbitrary unit-dependence arises because an individual-level percentage effect is not well-defined for individuals whose outcome changes from zero to non-zero when receiving treatment, and the units of the outcome implicitly determine how much weight the ATE for a log-like transformation places on the extensive margin. We further establish a trilemma: when the outcome can equal zero, there is no treatment effect parameter that is an average of individual-level treatment effects, unit-invariant, and point-identified. We discuss several alternative approaches that may be sensible in settings with an intensive and extensive margin, including (i) expressing the ATE in levels as a percentage (e.g. using Poisson regression), (ii) explicitly calibrating the value placed on the intensive and extensive margins, and (iii) estimating separate effects for the two margins (e.g. using Lee bounds). We illustrate these approaches in three empirical applications.
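The unit-dependence claim in the abstract can be made concrete with a short derivation. This is a sketch for the specific choice $m(y) = \log(1+y)$, under regularity conditions that let the remainder vanish; the scale factor $a$ and the indicator notation are introduced here for illustration and are not taken from the text above.

```latex
% Sketch for m(y) = \log(1+y); a > 0 is a rescaling factor (notation introduced here).
\begin{align*}
m(aY) &= \mathbf{1}\{Y>0\}\,\Bigl[\log a + \log Y + \log\bigl(1 + \tfrac{1}{aY}\bigr)\Bigr], \\
E\bigl[m(aY(1)) - m(aY(0))\bigr]
  &= \log a \cdot \bigl(P(Y(1)>0) - P(Y(0)>0)\bigr) \\
  &\quad + E\bigl[\log Y(1)\,\mathbf{1}\{Y(1)>0\}\bigr]
         - E\bigl[\log Y(0)\,\mathbf{1}\{Y(0)>0\}\bigr] + o(1)
  \qquad (a \to \infty).
\end{align*}
```

The first term grows without bound in $\log a$ whenever the treatment changes the probability of a positive outcome, which is why a change of units can produce an estimate of any magnitude.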

Motivation and objectives

Motivate the use of log-like transformations when outcomes can be zero and identify the interpretational issues that arise.
Show that ATEs for log-like transformations are scale-dependent when the extensive margin is affected by treatment.
Demonstrate a trilemma: no parameter is simultaneously an average of individual effects, unit-invariant, and point-identified under zero outcomes.
Propose alternative target parameters and estimation strategies for settings with extensive and intensive margins.
Illustrate approaches through empirical applications and discuss practical implications for researchers.

Proposed approach

Define log-like transformations m(y) that behave like log(y) for large y but are defined at zero (e.g., log(1+y), arcsinh(y)).
Prove that if treatment affects the extensive margin (zero vs positive outcomes), the ATE for m(Y) can be scaled to any magnitude by rescaling units, implying unit dependence.
Establish a trilemma: no parameter can be an average of individual effects, unit-invariant, and point-identified simultaneously under zero-valued outcomes.
Discuss alternative parameters (e.g., ATE% in levels via Poisson regression, explicit calibration of margins, separate margins via bounds or additional assumptions).
Provide a blueprint for estimating alternative parameters and apply to three empirical settings (RCT, DiD, IV).
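The scale dependence described in the list above can be checked numerically. The following is a minimal sketch on hypothetical toy data (the numbers, and the helper names `ate_transformed` and `log1p_like`, are illustrative assumptions and do not come from the paper). It computes the difference in means of log(1+aY) and arcsinh(aY) for several scale factors a, then compares the shift from a further rescaling by 100 with log(100) times the extensive-margin effect.

```python
import math

# Hypothetical toy data (not from the paper): outcomes in a control and a treated group.
# Treatment moves two units from zero to a positive outcome (an extensive-margin effect).
control = [0.0, 0.0, 0.0, 0.0, 10.0, 20.0, 30.0, 40.0]
treated = [0.0, 0.0, 12.0, 15.0, 10.0, 20.0, 30.0, 40.0]


def mean(xs):
    return sum(xs) / len(xs)


def log1p_like(y):
    """The log-like transformation m(y) = log(1 + y)."""
    return math.log(1.0 + y)


def ate_transformed(y1, y0, m, scale=1.0):
    """Difference in means of m(scale * Y) between treated and control."""
    return mean([m(scale * y) for y in y1]) - mean([m(scale * y) for y in y0])


# Extensive-margin effect: change in the share of positive outcomes.
ext = mean([y > 0 for y in treated]) - mean([y > 0 for y in control])

# The estimate keeps growing as the units of Y are rescaled.
for a in (1, 100, 10_000):
    est_log1p = ate_transformed(treated, control, log1p_like, a)
    est_asinh = ate_transformed(treated, control, math.asinh, a)
    print(f"scale={a:>6}: log(1+Y) ATE={est_log1p:.3f}, arcsinh(Y) ATE={est_asinh:.3f}")

# At large scales, multiplying Y by 100 shifts the estimate by about log(100) * ext.
shift = (ate_transformed(treated, control, log1p_like, 10_000)
         - ate_transformed(treated, control, log1p_like, 100))
print(f"observed shift={shift:.4f}, log(100) * ext={math.log(100) * ext:.4f}")
```

Because the units with positive outcomes in both groups are identical in this toy example, the whole shift comes from the two units that cross from zero to positive, which isolates the role of the extensive margin.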

Figure 1: Change from multiplying outcome by 100 versus extensive margin effect

Results

Research questions

RQ1: How do log-like transformations behave when outcomes can be zero under treatment vs control?
RQ2: Can an ATE for log-like transformations be interpreted as a unit-invariant percentage effect when zeros and extensive margins are present?
RQ3: What alternative causal parameters and estimation strategies are sensible when zero-valued outcomes and margins exist?
RQ4: How do these alternatives perform in empirical applications with intensive and extensive margins?

Key findings

ATEs for log-like transformations are arbitrarily sensitive to the units of Y when the extensive margin is affected.
There is no treatment effect parameter that is an average of individual effects, unit-invariant, and point-identified when zeros are possible.
Researchers can instead use level-based percentages (e.g., ATE% in levels via Poisson regression) or explicitly calibrate the value on intensive vs extensive margins, or estimate separate margins.
Sensitivity analyses show that rescaling the outcome by a large factor changes the ATE for a log-like transformation by roughly the log of that factor multiplied by the treatment effect on the extensive margin.
The paper provides empirical illustrations showing substantial scaling sensitivity in American Economic Review papers that use arcsinh(Y) or log(1+Y).
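Two of the alternatives named in the findings above can be sketched in a few lines. This is an illustration on hypothetical toy data, not the paper's implementation; the function names `ate_pct` and `lee_bounds_log` are made up here. The first function estimates the ATE in levels as a percentage, E[Y(1)]/E[Y(0)] - 1, by a ratio of sample means, which is what a Poisson regression of Y on a treatment dummy with an intercept recovers as exp(coefficient) - 1. The second is a simplified Lee-style trimming bound on the effect on log(Y) for units that would be positive in both arms, assuming treatment never turns a positive outcome into zero.

```python
import math

# Hypothetical toy data (not from the paper), as if from a randomized experiment.
control = [0.0, 0.0, 0.0, 0.0, 10.0, 20.0, 30.0, 40.0]
treated = [0.0, 0.0, 12.0, 15.0, 10.0, 20.0, 30.0, 40.0]


def mean(xs):
    return sum(xs) / len(xs)


def ate_pct(y1, y0):
    """ATE in levels as a percentage: E[Y(1)] / E[Y(0)] - 1, via sample means."""
    return mean(y1) / mean(y0) - 1.0


def lee_bounds_log(y1, y0):
    """Simplified trimming bounds on the log(Y) effect for always-positive units.

    Assumes monotonicity (treatment only moves units from zero to positive) and
    that the treated group has the larger share of positive outcomes.
    """
    pos1 = sorted(y for y in y1 if y > 0)
    pos0 = [y for y in y0 if y > 0]
    share1 = len(pos1) / len(y1)
    share0 = len(pos0) / len(y0)
    # Keep only as many treated positives as the control positive share implies.
    keep = round(len(pos1) * share0 / share1)
    if keep < 1:
        raise ValueError("not enough positive outcomes to form bounds")
    base = mean([math.log(y) for y in pos0])
    lower = mean([math.log(y) for y in pos1[:keep]]) - base   # trim the largest values
    upper = mean([math.log(y) for y in pos1[-keep:]]) - base  # trim the smallest values
    return lower, upper


# The percentage effect in levels does not change when Y is measured in other units.
rescaled = ate_pct([100 * y for y in treated], [100 * y for y in control])
print(f"ATE% = {ate_pct(treated, control):.3f}, after rescaling by 100 = {rescaled:.3f}")

lower, upper = lee_bounds_log(treated, control)
print(f"bounds on intensive-margin log effect: [{lower:.3f}, {upper:.3f}]")
```

The ratio of means is unit-invariant and point-identified but is not an average of individual-level effects, while the trimming bounds are unit-invariant averages for a subpopulation that are only set-identified; both outcomes are consistent with the trilemma stated above.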

Figure 2: Density of bribe amount in Sequeira (2016)


This review was generated by AI and reviewed by human editors.