QUICK REVIEW

[Paper Review] Demystifying Double Robustness: A Comparison of Alternative Strategies for Estimating a Population Mean from Incomplete Data

Joseph Kang, Joseph L. Schafer | Apr 18, 2008

Advanced Causal Inference Techniques | References: 44 | Citations: 141

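In a nutshell

This paper compares doubly robust (DR) estimators with other methods for estimating a population mean from incomplete data. DR estimators tolerate misspecification of one of their two models in theory, but the paper shows that they are not consistently better than simple regression-based imputation when both models are moderately misspecified. The study finds that inverse-probability weighting methods are sensitive to small estimated propensity scores. In the simulations, no DR method improved on basic regression imputation, which casts doubt on the assumption that combining two wrong models yields a better estimate.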

ABSTRACT

When outcomes are missing for reasons beyond an investigator's control, there are two different ways to adjust a parameter estimate for covariates that may be related both to the outcome and to missingness. One approach is to model the relationships between the covariates and the outcome and use those relationships to predict the missing values. Another is to model the probabilities of missingness given the covariates and incorporate them into a weighted or stratified estimate. Doubly robust (DR) procedures apply both types of model simultaneously and produce a consistent estimate of the parameter if either of the two models has been correctly specified. In this article, we show that DR estimates can be constructed in many ways. We compare the performance of various DR and non-DR estimates of a population mean in a simulated example where both models are incorrect but neither is grossly misspecified. Methods that use inverse-probabilities as weights, whether they are DR or not, are sensitive to misspecification of the propensity model when some estimated propensities are small. Many DR methods perform better than simple inverse-probability weighting. None of the DR methods we tried, however, improved upon the performance of simple regression-based prediction of the missing values. This study does not represent every missing-data problem that will arise in practice. But it does demonstrate that, in at least some settings, two wrong models are not better than one.
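
To make the double-robustness claim concrete, here is a sketch of the standard augmented inverse-probability-weighted form of a DR estimate and its large-sample bias. The notation is an assumption introduced for this note, not the paper's: $r_i$ indicates that $y_i$ is observed, $m(x)$ and $\pi(x)$ are the true outcome regression and response probability, and $m^*(x)$ and $\pi^*(x)$ are the limits of the fitted working models, treated here as fixed functions. Missingness is assumed to be ignorable given $x$.

```latex
\hat{\mu}_{DR} = \frac{1}{n}\sum_{i=1}^{n}\left[ m^*(x_i) + \frac{r_i\,\{y_i - m^*(x_i)\}}{\pi^*(x_i)} \right]

E\!\left[\hat{\mu}_{DR}\right] - \mu
  = E\!\left[ m^*(x) - m(x) + \frac{\pi(x)\,\{m(x) - m^*(x)\}}{\pi^*(x)} \right]
  = E\!\left[ \frac{\{\pi(x) - \pi^*(x)\}\,\{m(x) - m^*(x)\}}{\pi^*(x)} \right]
```

The bias is a product of the two model errors, so it vanishes when either working model is correct. When both are wrong, the product is divided by $\pi^*(x)$, so small fitted propensities can magnify it.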

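Motivation and objectives

To assess the practical, finite-sample performance of doubly robust estimators when both the outcome model and the missingness model are moderately misspecified.
To compare DR estimators with alternatives such as inverse-probability weighting and regression-based imputation.
To investigate whether combining two misspecified models improves estimation over using a single model.
To clarify the conditions under which DR methods offer a meaningful advantage in realistic missing-data settings.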

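Methods

Runs a simulation study comparing several strategies for estimating a population mean when outcomes are missing at random (MAR).
Applies DR estimators that combine an outcome regression model with a propensity score model; the estimate is consistent if either model is correctly specified.
Uses inverse-probability weighting (IPW) and model-assisted survey estimation techniques as benchmarks.
Uses regression-based imputation (predicting the missing values from covariates) as the baseline method.
Evaluates performance by bias and mean squared error (MSE) over repeated simulations.
Examines a range of DR estimators, including augmented inverse-probability weighting and generalized regression estimators.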
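
The estimators compared above can be sketched in a few lines. The following is a minimal illustration, not the paper's simulation design: the data-generating process, the function names, the use of the true response probabilities in place of a fitted propensity model, and the two deliberately misspecified working models (a constant propensity and a constant outcome prediction) are all assumptions made for this example. It shows the textbook double-robustness property, where the augmented IPW (AIPW) estimate stays close to the true mean when exactly one working model is wrong.

```python
import math
import random


def imputation_mean(m_hat):
    """Regression imputation: average the outcome-model predictions over all units."""
    return sum(m_hat) / len(m_hat)


def ipw_mean(y, r, pi_hat):
    """Normalized IPW: weighted mean of observed outcomes, weights 1 / pi_hat."""
    num = sum(yi / p for yi, ri, p in zip(y, r, pi_hat) if ri)
    den = sum(1.0 / p for ri, p in zip(r, pi_hat) if ri)
    return num / den


def aipw_mean(y, r, m_hat, pi_hat):
    """AIPW: imputation mean plus an inverse-probability-weighted residual term."""
    n = len(y)
    resid = sum((yi - mi) / p for yi, ri, mi, p in zip(y, r, m_hat, pi_hat) if ri)
    return sum(m_hat) / n + resid / n


def demo(n=20000, seed=0):
    """Hypothetical setup: y = 1 + 2x + noise, so the true population mean is 1."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [1.0 + 2.0 * xi + rng.gauss(0.0, 1.0) for xi in x]
    pi_true = [1.0 / (1.0 + math.exp(-(0.5 + xi))) for xi in x]
    r = [rng.random() < p for p in pi_true]  # True means y is observed

    # Outcome model: least-squares line fitted on respondents only.
    xs = [xi for xi, ri in zip(x, r) if ri]
    ys = [yi for yi, ri in zip(y, r) if ri]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sxy / sxx
    m_line = [my + slope * (xi - mx) for xi in x]  # correctly specified
    m_const = [my] * n                             # wrong: ignores x
    pi_const = [len(ys) / n] * n                   # wrong: ignores x

    return {
        "respondent_mean": my,
        "imputation": imputation_mean(m_line),
        "ipw_true_pi": ipw_mean(y, r, pi_true),
        "aipw_wrong_pi": aipw_mean(y, r, m_line, pi_const),
        "aipw_wrong_m": aipw_mean(y, r, m_const, pi_true),
    }


if __name__ == "__main__":
    for name, value in demo().items():
        print(f"{name:>16}: {value:.3f}")
```

In this toy setup each AIPW variant has one correct working model, so both should land near 1 while the unadjusted respondent mean does not. The paper's argument concerns the harder case where both models are wrong, which this sketch does not reproduce.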

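Experimental results

Research questions

RQ1: Does double robustness translate into practical benefit when both the outcome model and the missingness model are moderately misspecified?
RQ2: Under model misspecification, do DR estimators outperform inverse-probability weighting and regression imputation?
RQ3: Are DR estimators that are built on IPW sensitive to small estimated propensity scores?
RQ4: Under dual misspecification, does any particular DR method clearly outperform simple regression imputation?
RQ5: Under what conditions does combining two misspecified models yield a better estimate than using a single model?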

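Key findings

When both models were moderately misspecified, the DR estimators did not consistently achieve a lower mean squared error (MSE) than simple regression-based imputation.
Inverse-probability weighting was strongly affected by small estimated propensity scores, which made the estimates unstable.
The performance of the DR estimators varied with the implementation, but no implementation showed a clear improvement over regression imputation.
Even when one of the two models was correct, the DR estimators did not consistently outperform the other methods, which suggests that their practical advantage under moderate misspecification is limited.
The study challenges the assumption that two wrong models are better than one, and shows that correct model specification remains critical for reliable estimation.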
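
The instability of inverse-probability weights can be illustrated with made-up numbers. The sketch below is not taken from the paper: the ten hypothetical respondents, the function names, and the use of Kish's effective sample size as a diagnostic are assumptions chosen for illustration. It shows how one small estimated propensity can dominate a normalized IPW mean.

```python
def ipw_weights(pi_hat):
    """Inverse-probability weights for respondents."""
    return [1.0 / p for p in pi_hat]


def normalized_ipw_mean(y, pi_hat):
    """Weighted mean of observed outcomes with weights 1 / pi_hat."""
    w = ipw_weights(pi_hat)
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)


def effective_sample_size(pi_hat):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    w = ipw_weights(pi_hat)
    return sum(w) ** 2 / sum(wi * wi for wi in w)


def max_weight_share(pi_hat):
    """Fraction of the total weight carried by the single largest weight."""
    w = ipw_weights(pi_hat)
    return max(w) / sum(w)


# Hypothetical respondents: nine typical units and one with a tiny propensity.
y_obs = [10.0] * 9 + [20.0]

if __name__ == "__main__":
    for small in (0.05, 0.02, 0.01):
        pi_hat = [0.5] * 9 + [small]
        print(
            f"pi={small}: mean={normalized_ipw_mean(y_obs, pi_hat):.2f}, "
            f"ess={effective_sample_size(pi_hat):.2f}, "
            f"max share={max_weight_share(pi_hat):.2f}"
        )
```

Moving the single small propensity from 0.05 to 0.01 shifts the estimate from roughly 15.3 to roughly 18.5, and at 0.01 that one respondent carries more than 80% of the total weight. A small error in one fitted propensity can therefore move the whole estimate.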

This review was generated by AI and checked by a human editor.