QUICK REVIEW

[Paper Review] Reliable Decision Support using Counterfactual Models

Peter Schulam, Suchi Saria|arXiv (Cornell University)|Mar 30, 2017

Complex Systems and Decision Making | Cited by 94

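One-Sentence Summary

This paper proposes the Counterfactual Gaussian Process (CGP), which predicts counterfactual outcomes under sequences of actions from observational time series data, corrects the bias introduced by the action policy that generated the training data, and enables reliable risk assessment and "what if?" reasoning for individualized treatment planning.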

ABSTRACT

Decision-makers are faced with the challenge of estimating what is likely to happen when they take an action. For instance, if I choose not to treat this patient, are they likely to die? Practitioners commonly use supervised learning algorithms to fit predictive models that help decision-makers reason about likely future outcomes, but we show that this approach is unreliable, and sometimes even dangerous. The key issue is that supervised learning algorithms are highly sensitive to the policy used to choose actions in the training data, which causes the model to capture relationships that do not generalize. We propose using a different learning objective that predicts counterfactuals instead of predicting outcomes under an existing action policy as in supervised learning. To support decision-making in temporal settings, we introduce the Counterfactual Gaussian Process (CGP) to predict the counterfactual future progression of continuous-time trajectories under sequences of future actions. We demonstrate the benefits of the CGP on two important decision-support tasks: risk prediction and "what if?" reasoning for individualized treatment planning.

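Motivation and Objectives

Motivation: show that standard supervised learning is unreliable for decision support because its predictions are biased by the action policy in the training data.
Propose counterfactual prediction as the learning objective to improve generalization across policies.
Introduce and formalize the Counterfactual Gaussian Process (CGP) for continuous-time trajectories under sequences of actions.
Learn the CGP from observational trajectories using marked point processes, with an adjusted maximum likelihood objective.
Demonstrate the CGP on reliable risk prediction and on "what if?" reasoning for individualized treatment planning.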
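
The policy sensitivity described above can be illustrated with a small worked example. This is a minimal sketch, not taken from the paper: the outcome probabilities in `P_DEATH`, the two policies, and the function names are all hypothetical, and it assumes severity is the only factor that drives treatment decisions. It computes the risk a supervised model would converge to under two different treatment policies and compares it with the risk under a fixed action.

```python
# Hypothetical outcome model (illustrative numbers only):
# probability of death given (severity, treated).
P_DEATH = {
    ("mild", 0): 0.10, ("mild", 1): 0.05,
    ("severe", 0): 0.60, ("severe", 1): 0.20,
}


def supervised_risk(severity, p_treat):
    """P(death | severity) under a training policy that treats with
    probability p_treat. A supervised model fit to data from that policy
    learns this policy-averaged quantity."""
    return (p_treat * P_DEATH[(severity, 1)]
            + (1.0 - p_treat) * P_DEATH[(severity, 0)])


def counterfactual_risk(severity, action):
    """P(death | severity) if the given action is taken.
    It does not depend on the policy that produced the training data."""
    return P_DEATH[(severity, action)]


# Two hypothetical hospitals: A usually treats severe patients, B rarely does.
policy_a = {"mild": 0.1, "severe": 0.9}
policy_b = {"mild": 0.1, "severe": 0.1}

risk_a = supervised_risk("severe", policy_a["severe"])
risk_b = supervised_risk("severe", policy_b["severe"])
untreated = counterfactual_risk("severe", 0)

print(f"supervised risk learned under policy A: {risk_a:.2f}")
print(f"supervised risk learned under policy B: {risk_b:.2f}")
print(f"counterfactual risk if untreated:       {untreated:.2f}")
```

Under policy A the supervised risk for a severe patient looks low only because most such patients were treated. Reading that number as "what happens if we do not treat" would understate the danger, which is the failure mode the abstract warns about.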

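Proposed Method

Model the counterfactual Y[a] for each action a in a set C, within a continuous-time framework.
Represent irregularly sampled time series containing actions and outcomes as marked point processes (MPPs).
Parameterize the outcome model with a Gaussian process (GP) conditioned on the history and actions, and use an event/action model to capture timing and action choice.
Derive an adjusted maximum likelihood objective that accounts for the action policy through the MPP intensity and history (Equation 3).
Impose the continuous-time no unmeasured confounders (NUC) assumption and the non-informative measurement times assumption to link the CGP to the target counterfactual (Assumptions 3 and 4).
Estimate the CGP parameters by maximizing the likelihood of the observed trajectories, then use the CGP to predict counterfactual trajectories Y[s][a] for decision-support tasks.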
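
The prediction step can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the paper's implementation: it uses a single GP with an RBF kernel and fixed hyperparameters (the paper estimates parameters by maximum likelihood), and it assumes an additive, exponentially decaying treatment response. The names `rbf_kernel`, `treatment_effect`, and `predict_counterfactual`, and all numeric values, are hypothetical.

```python
import numpy as np


def rbf_kernel(t1, t2, length_scale=2.0, variance=1.0):
    """Squared-exponential covariance between two sets of time points."""
    diff = t1[:, None] - t2[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)


def treatment_effect(times, action_times, strength=-1.0, decay=3.0):
    """Assumed response: each action adds strength * exp(-dt / decay)
    for times at or after the action, and nothing before it."""
    effect = np.zeros_like(times, dtype=float)
    for t_a in action_times:
        dt = times - t_a
        effect += np.where(dt >= 0, strength * np.exp(-np.maximum(dt, 0.0) / decay), 0.0)
    return effect


def predict_counterfactual(t_obs, y_obs, past_actions, t_future,
                           future_actions, noise=0.1):
    """Posterior mean of the trajectory at t_future under a proposed
    sequence of future actions."""
    # Remove the effect of actions already taken to isolate the baseline.
    residual = y_obs - treatment_effect(t_obs, past_actions)
    K = rbf_kernel(t_obs, t_obs) + noise ** 2 * np.eye(len(t_obs))
    K_star = rbf_kernel(t_future, t_obs)
    baseline_mean = K_star @ np.linalg.solve(K, residual)
    # Add back the effect of past actions plus the proposed future actions.
    all_actions = list(past_actions) + list(future_actions)
    return baseline_mean + treatment_effect(t_future, all_actions)


t_obs = np.array([0.0, 1.0, 2.0, 3.0])
y_obs = np.array([1.0, 1.2, 1.5, 1.7])
t_future = np.array([4.0, 5.0, 6.0])

no_treatment = predict_counterfactual(t_obs, y_obs, [], t_future, [])
treat_at_4 = predict_counterfactual(t_obs, y_obs, [], t_future, [4.0])
print("difference between plans:", treat_at_4 - no_treatment)
```

Calling `predict_counterfactual` with different `future_actions` lists gives one predicted trajectory per candidate plan, which is the kind of "what if?" comparison the method is meant to support. In this sketch the two plans differ only by the assumed response curve, because the baseline GP is shared.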

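Experimental Results

Research Questions

RQ1: Can counterfactual learning produce predictions that are robust to the action policy used to collect the training data?
RQ2: How can counterfactual trajectories under sequences of future actions be predicted reliably in continuous time?
RQ3: In time series data with policy-driven observations, do counterfactual models provide more reliable risk assessment than standard supervised models?
RQ4: Can the CGP support "what if?" reasoning for individualized treatment planning on medical data?
RQ5: Which assumptions are needed to link a CGP learned from observational trajectories to the true counterfactual model?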

Main Findings

Baseline GP A	CGP A	Baseline GP B	CGP B	Baseline GP C	CGP C
0.000	0.000	0.083	0.001	0.162	0.128
1.000	1.000	0.857	0.998	0.640	0.562
0.853	0.872	0.832	0.872	0.806	0.829

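CGP risk predictions are stable across the policies used to generate the training data, unlike the baseline GP model, whose predictions change with the policy.
On simulated data, CGP risk scores show nearly identical rankings and AUC across the scenarios that satisfy the core assumptions, whereas the baseline GP's do not.
Violating the core assumptions (continuous-time NUC, non-informative measurement times) degrades the CGP's stability, making it behave similarly to the baseline model.
On ICU data, the CGP supports qualitative counterfactual reasoning about the effect of dialysis on creatinine and outperforms the baseline in predictive MAE (24 hours: 0.39, with the baseline higher; 24-48 hours: 0.62).
The CGP benefits from using a mixture of Gaussian processes for the outcome model to capture treatment effects and heterogeneity.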
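
The stability comparison in the findings above relies on comparing rankings and AUC between models trained under different policies. The sketch below shows one way such a comparison could be computed in plain Python. It is an assumption that Kendall's tau is a suitable rank statistic here, since this review does not name the one used, and the risk scores and labels are made up for illustration.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


def auc(scores, labels):
    """Probability that a random positive case scores above a random
    negative case, counting ties as one half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical risk scores for the same four patients from models
# trained under two different policies, plus the observed outcomes.
scores_a = [0.10, 0.40, 0.35, 0.80]
scores_b = [0.20, 0.30, 0.50, 0.90]
labels = [0, 0, 1, 1]

tau = kendall_tau(scores_a, scores_b)
auc_a = auc(scores_a, labels)
auc_b = auc(scores_b, labels)
print(f"rank agreement (tau): {tau:.3f}")
print(f"AUC under policy A: {auc_a:.2f}, under policy B: {auc_b:.2f}")
```

A model whose predictions do not depend on the training policy should give a tau close to 1 and similar AUC values across policies.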


This review was generated by AI and checked by human editors.