QUICK REVIEW

[Paper Review] Vine copula based inference of multivariate event time data

Nicole Barthel, Candida Geerdens|arXiv (Cornell University)|Mar 4, 2016

Statistical Methods and Bayesian Inference | References: 22 | Cited by: 2

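One-Sentence Summary

This paper proposes a two-stage estimation approach for fitting vine copulas to multivariate right-censored event time data: the marginal distributions are modeled first, and the dependence structure is then estimated by maximizing a likelihood in which censoring is handled through numerical integration. The method shows good finite-sample performance in a three-dimensional simulation study, and a four-dimensional application illustrates flexible, data-driven selection of a vine copula structure.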

ABSTRACT

In many studies multivariate event time data are generated from clusters having a possibly complex association pattern. Flexible models are needed to capture this dependence. Vine copulas serve this purpose. Inference methods for vine copulas are available for complete data. Event time data, however, are often subject to right-censoring. As a consequence, the existing inferential tools, e.g. likelihood estimation, need to be adapted. A two-stage estimation approach is proposed. First, the marginal distributions are modeled. Second, the dependence structure modeled by a vine copula is estimated via likelihood maximization. Due to the right-censoring single and double integrals show up in the copula likelihood expression such that numerical integration is needed for its evaluation. For the dependence modeling a sequential estimation approach that facilitates the computational challenges of the likelihood optimization is provided. A three-dimensional simulation study provides evidence for the good finite sample performance of the proposed method. Using four-dimensional mastitis data, it is shown how an appropriate vine copula model can be selected for data at hand.
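To make the role of the integrals concrete, the following is a minimal sketch of one censored likelihood contribution in three dimensions. It assumes a D-vine with order 1-2-3 and a copula placed on the marginal survival functions, so that u_j = S_j(y_j); the notation (c_{12}, c_{23}, c_{13;2}, h_{j|2}) is introduced here for illustration and is not taken from the paper.

```latex
% Three-dimensional D-vine copula density (order 1-2-3)
c(u_1,u_2,u_3) = c_{12}(u_1,u_2)\, c_{23}(u_2,u_3)\,
  c_{13;2}\bigl(h_{1|2}(u_1 \mid u_2),\, h_{3|2}(u_3 \mid u_2)\bigr),
\qquad
h_{j|2}(u_j \mid u_2) = \frac{\partial C_{j2}(u_j,u_2)}{\partial u_2}.

% Contribution of a cluster in which T_1 and T_3 are observed and
% T_2 is right-censored at y_2 (marginal density factors omitted)
L_i \;\propto\; \int_{0}^{u_2} c(u_1, v, u_3)\, \mathrm{d}v .
```

Because the censored component sits inside the arguments of the conditional pair copula c_{13;2}, this integral generally has no closed form, which is why numerical integration enters the likelihood evaluation.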

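Research Motivation and Objectives

Address the challenge of modeling complex dependence structures in multivariate event time data under right-censoring.
Develop a computationally feasible inference framework for vine copulas when the complete-data assumption fails because of censoring.
Enable flexible, high-dimensional dependence modeling in survival analysis by adapting vine copula techniques to censored data.
Provide a sequential estimation approach that eases the computational burden of likelihood optimization in censored multivariate survival models.
Demonstrate the practical value of the method through model selection and application to real four-dimensional mastitis data.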

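Proposed Method

Model the marginal distributions of the event times, parametrically or nonparametrically, before modeling the dependence structure.
Construct a vine copula to represent the multivariate dependence structure, using regular vines (R-vines) to capture complex dependence patterns flexibly.
Derive the likelihood for censored data; because of right-censoring it contains single and double integrals that must be evaluated by numerical integration.
Carry out a two-stage estimation procedure: estimate the marginal parameters first, then estimate the copula parameters by maximum likelihood using the integrated likelihood.
Use a sequential estimation approach to simplify the optimization of the high-dimensional likelihood and improve computational efficiency.
Apply model selection criteria (such as AIC or BIC) to identify the most suitable vine copula structure based on its fit to the observed data.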
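The two-stage idea above can be sketched in the simplest bivariate setting. This is a minimal illustration under stated assumptions, not the paper's implementation: the exponential margins, the Clayton copula on the survival scale, and the helper names (`simulate`, `fit_two_stage`, `clayton_dv`) are all hypothetical choices made here. In two dimensions every censored contribution has a closed form, so no numerical integration is needed; integrals only arise in higher-dimensional vines.

```python
import numpy as np
from scipy.optimize import minimize_scalar


def clayton_cdf(u, v, theta):
    """Clayton copula C(u, v)."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def clayton_dv(u, v, theta):
    """Partial derivative dC(u, v)/dv (an h-function)."""
    s = u ** -theta + v ** -theta - 1.0
    return v ** (-theta - 1.0) * s ** (-1.0 / theta - 1.0)


def clayton_pdf(u, v, theta):
    """Clayton copula density c(u, v)."""
    s = u ** -theta + v ** -theta - 1.0
    return (1.0 + theta) * (u * v) ** (-theta - 1.0) * s ** (-1.0 / theta - 2.0)


def simulate(n, theta, rates, cens_rate, rng):
    """Hypothetical data generator: Clayton survival copula, exponential
    margins, independent exponential censoring."""
    u1 = rng.uniform(size=n)
    w = rng.uniform(size=n)
    # Conditional inversion for the Clayton copula.
    u2 = ((w ** (-theta / (1.0 + theta)) - 1.0) * u1 ** -theta + 1.0) ** (-1.0 / theta)
    t = -np.log(np.column_stack([u1, u2])) / np.asarray(rates)  # S_j(t) = u_j
    c = rng.exponential(1.0 / cens_rate, size=(n, 2))
    return np.minimum(t, c), (t <= c).astype(float)


def fit_two_stage(y, delta):
    """Stage 1: censored exponential MLE per margin. Stage 2: copula MLE."""
    rates = delta.sum(axis=0) / y.sum(axis=0)    # events / total time at risk
    u = np.clip(np.exp(-rates * y), 1e-10, 1.0)  # estimated survival probabilities
    u1, u2 = u[:, 0], u[:, 1]
    d1, d2 = delta[:, 0], delta[:, 1]

    def neg_loglik(theta):
        # Four cases: both observed, only T2 censored, only T1 censored, both censored.
        ll = (d1 * d2 * np.log(clayton_pdf(u1, u2, theta))
              + d1 * (1 - d2) * np.log(clayton_dv(u2, u1, theta))  # dC/du1
              + (1 - d1) * d2 * np.log(clayton_dv(u1, u2, theta))  # dC/du2
              + (1 - d1) * (1 - d2) * np.log(clayton_cdf(u1, u2, theta)))
        return -ll.sum()

    res = minimize_scalar(neg_loglik, bounds=(0.01, 10.0), method="bounded")
    return rates, res.x
```

A call such as `fit_two_stage(*simulate(2000, 2.0, (1.0, 0.5), 0.3, np.random.default_rng(0)))` returns the estimated marginal rates and the copula parameter. Plugging the stage-one estimates into the copula likelihood keeps the second optimization one-dimensional here, which is the practical appeal of the two-stage scheme.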

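Experimental Results

Research Questions

RQ1: How can vine copulas be adapted to model multivariate event time data when observations are subject to right-censoring?
RQ2: Which estimation strategy enables efficient and accurate inference for vine copulas in the presence of censored survival data?
RQ3: How does the proposed two-stage method perform in finite samples compared with alternative approaches?
RQ4: Does the sequential estimation approach effectively reduce the computational complexity of likelihood optimization in censored vine copula models?
RQ5: Which criteria and procedures allow reliable selection of a suitable vine copula structure for real-world multivariate survival data?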

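Key Findings

The proposed two-stage estimation method shows good finite-sample performance in a three-dimensional simulation study, indicating that it remains reliable when data are limited.
Numerical integration is essential for evaluating the likelihood under right-censoring, and the method handles the resulting single and double integrals effectively.
The sequential estimation approach mitigates the computational challenges of optimizing the high-dimensional likelihood.
The method supports model selection, as shown by its application to four-dimensional mastitis data, where a suitable vine copula structure is identified.
The method supports flexible modeling of high-dimensional dependence structures and is better suited than simpler parametric models when the dependence pattern is complex.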
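The single integral mentioned in the findings can be illustrated numerically. The sketch below assumes a three-dimensional D-vine (order 1-2-3) in which only the middle event time is right-censored, so the copula part of the contribution is the trivariate copula density integrated over that component from 0 to u2 on the survival scale. Gaussian pair copulas are used purely as an illustrative assumption, because the result can then be checked against a known closed form; the function names are hypothetical and the paper may use other families.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

EPS = 1e-12  # keeps h-function values strictly inside (0, 1)


def gauss_pdf(u, v, rho):
    """Bivariate Gaussian copula density c(u, v; rho)."""
    x, y = norm.ppf(u), norm.ppf(v)
    expo = -(rho ** 2 * (x ** 2 + y ** 2) - 2.0 * rho * x * y) / (2.0 * (1.0 - rho ** 2))
    return np.exp(expo) / np.sqrt(1.0 - rho ** 2)


def gauss_h(u, v, rho):
    """h-function h(u | v) = dC(u, v)/dv of the Gaussian copula."""
    z = (norm.ppf(u) - rho * norm.ppf(v)) / np.sqrt(1.0 - rho ** 2)
    return np.clip(norm.cdf(z), EPS, 1.0 - EPS)


def dvine_pdf(u1, u2, u3, rho12, rho23, rho13_2):
    """D-vine density: c12 * c23 * c13;2 evaluated at the two h-functions."""
    return (gauss_pdf(u1, u2, rho12)
            * gauss_pdf(u2, u3, rho23)
            * gauss_pdf(gauss_h(u1, u2, rho12), gauss_h(u3, u2, rho23), rho13_2))


def contribution_middle_censored(u1, u2, u3, rho12, rho23, rho13_2):
    """Copula part of the likelihood when T1, T3 are observed and T2 is
    right-censored: integrate the density over the censored component."""
    value, _ = quad(lambda v: dvine_pdf(u1, v, u3, rho12, rho23, rho13_2), 0.0, u2)
    return value
```

For Gaussian pair copulas, integrating over the whole unit interval (u2 = 1) should reproduce the Gaussian copula density of components 1 and 3 with correlation rho12 * rho23 + rho13_2 * sqrt((1 - rho12^2)(1 - rho23^2)), which gives a convenient sanity check on the quadrature. For other pair-copula families no such shortcut exists and the numerical integral is the only route.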

This review was generated by AI and reviewed by a human editor.