QUICK REVIEW

[Paper Review] Regularized Estimation and Testing for High-Dimensional Multi-Block Vector-Autoregressive Models

Jiahe Lin, George Michailidis|arXiv (Cornell University)|Aug 19, 2017

Complex Systems and Time Series Analysis|References 45|Cited by 35

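One-Sentence Summary

This paper proposes a high-dimensional multi-block vector autoregressive (MB-VAR) model with a Granger-causal ordering between blocks, estimated by regularized maximum likelihood using an iterative algorithm with convergence guarantees. It establishes consistency of the estimators and develops a testing procedure for detecting Granger causality between blocks; performance is evaluated on synthetic data and illustrated on S&P100 and macroeconomic data (2001-2016).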

ABSTRACT

Dynamical systems comprising of multiple components that can be partitioned into distinct blocks originate in many scientific areas. A pertinent example is the interactions between financial assets and selected macroeconomic indicators, which has been studied at aggregate level---e.g. a stock index and an employment index---extensively in the macroeconomics literature. A key shortcoming of this approach is that it ignores potential influences from other related components (e.g. Gross Domestic Product) that may exert influence on the system's dynamics and structure and thus produces incorrect results. To mitigate this issue, we consider a multi-block linear dynamical system with Granger-causal ordering between blocks, wherein the blocks' temporal dynamics are described by vector autoregressive processes and are influenced by blocks higher in the system hierarchy. We derive the maximum likelihood estimator for the posited model for Gaussian data in the high-dimensional setting based on appropriate regularization schemes for the parameters of the block components. To optimize the underlying non-convex likelihood function, we develop an iterative algorithm with convergence guarantees. We establish theoretical properties of the maximum likelihood estimates, leveraging the decomposability of the regularizers and a careful analysis of the iterates. Finally, we develop testing procedures for the null hypothesis of whether a block "Granger-causes" another block of variables. The performance of the model and the testing procedures are evaluated on synthetic data, and illustrated on a data set involving log-returns of the US S&P100 component stocks and key macroeconomic variables for the 2001--16 period.

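Research Motivation and Objectives

Address the problem that aggregate-level VAR models ignore related variables outside the main system, which biases inference.
Model complex dynamical systems whose variables are partitioned into blocks with hierarchical, one-directional Granger-causal influence.
Develop a regularized maximum likelihood estimation framework for high-dimensional MB-VAR models under sparsity assumptions.
Establish theoretical properties of the estimators, including consistency and convergence rates, under high-dimensional asymptotics.
Design a formal hypothesis test of whether one block Granger-causes another, enabling Granger-causal inference in multivariate time series.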
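
To make the block structure concrete, here is a minimal two-block, lag-one version of the kind of model described above. The symbols ($X_t$, $Z_t$, $A$, $B$, $C$, $\Sigma_U$, $\Sigma_V$) and the restriction to two blocks and one lag are assumptions chosen for illustration; they are not taken from this review and may differ from the paper's notation.

```latex
\begin{aligned}
X_t &= A\,X_{t-1} + U_t, & U_t &\sim \mathcal{N}(0, \Sigma_U), \\
Z_t &= B\,X_{t-1} + C\,Z_{t-1} + V_t, & V_t &\sim \mathcal{N}(0, \Sigma_V), \\
H_0 &: B = 0 \quad \text{(block } X \text{ does not Granger-cause block } Z\text{)}.
\end{aligned}
```

In this sketch the ordering is one-directional: $X_t \in \mathbb{R}^{p_1}$ evolves on its own, while $Z_t \in \mathbb{R}^{p_2}$ depends on its own past and on the past of $X$. Stacking the two blocks gives a VAR whose transition matrix is block lower-triangular, which is what a Granger-causal ordering between blocks amounts to in this simplified case.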

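Proposed Method

Build a recursive multi-block VAR model whose blocks form a directed acyclic structure, assuming a Granger-causal ordering between them.
Apply block-wise regularization (e.g., group lasso or fused lasso) to each block's coefficient matrices to induce sparsity in the high-dimensional setting.
Develop an iterative optimization algorithm (e.g., block coordinate descent) with convergence guarantees to maximize the regularized log-likelihood.
Use the decomposability of the regularizers, together with concentration inequalities, to analyze the behavior of the iterates and derive error bounds.
Construct a likelihood-ratio test for Granger causality between blocks, based on the asymptotic distribution of the test statistic under the null hypothesis.
Use high-dimensional random matrix theory to bound the estimation error of the precision matrix and covariance estimators.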
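
The estimation idea in the list above can be sketched in a deliberately simplified form. The following Python example simulates a two-block, lag-one system and fits each block's transition matrices by lasso-penalized least squares using proximal gradient descent (ISTA). It is an illustration under stated assumptions, not the paper's algorithm: it skips the precision-matrix updates of the full regularized maximum likelihood procedure, and the function names, dimensions, penalty level `lam`, and noise model are all hypothetical choices.

```python
import numpy as np

def simulate_two_block_var(T, A, B, C, rng):
    """Simulate X_t = A X_{t-1} + U_t and Z_t = B X_{t-1} + C Z_{t-1} + V_t
    with standard normal noise (an illustrative assumption)."""
    p1, p2 = A.shape[0], C.shape[0]
    X = np.zeros((T + 1, p1))
    Z = np.zeros((T + 1, p2))
    for t in range(1, T + 1):
        X[t] = A @ X[t - 1] + rng.standard_normal(p1)
        Z[t] = B @ X[t - 1] + C @ Z[t - 1] + rng.standard_normal(p2)
    return X, Z

def soft_threshold(M, tau):
    """Entrywise soft-thresholding, the proximal operator of tau * ||.||_1."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def lasso_ista(Y, W, lam, n_iter=500):
    """Minimize (1/2n) ||Y - W Theta||_F^2 + lam ||Theta||_1 by proximal gradient."""
    n = W.shape[0]
    gram = W.T @ W / n
    cross = W.T @ Y / n
    step = 1.0 / np.linalg.eigvalsh(gram)[-1]  # 1 / largest eigenvalue of the Gram matrix
    Theta = np.zeros((W.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        grad = gram @ Theta - cross
        Theta = soft_threshold(Theta - step * grad, step * lam)
    return Theta

rng = np.random.default_rng(0)
p1, p2, T = 5, 4, 400
A = 0.5 * np.eye(p1)          # dynamics of the upstream block
C = 0.4 * np.eye(p2)          # own dynamics of the downstream block
B = np.zeros((p2, p1))        # sparse cross-block effects
B[0, 1] = 0.6
B[2, 3] = -0.6

X, Z = simulate_two_block_var(T, A, B, C, rng)

# Upstream block: regress X_t on X_{t-1} only.
A_hat = lasso_ista(X[1:], X[:-1], lam=0.1).T
# Downstream block: regress Z_t on (X_{t-1}, Z_{t-1}) jointly.
W = np.hstack([X[:-1], Z[:-1]])
Theta = lasso_ista(Z[1:], W, lam=0.1)
B_hat, C_hat = Theta[:p1].T, Theta[p1:].T

print("nonzero pattern of B_hat:")
print((np.abs(B_hat) > 1e-8).astype(int))
```

Because the ordering is one-directional, the upstream block is fitted without reference to the downstream one, while the downstream regression includes lags of both blocks. A fuller treatment would alternate these regressions with estimates of the noise precision matrices, as the iterative algorithm in the list suggests.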

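Experimental Results

Research Questions

RQ1: Can a multi-block VAR model with a hierarchical Granger-causal ordering be estimated consistently under high-dimensional asymptotics?
RQ2: Do the proposed regularized maximum likelihood estimators attain optimal convergence rates for the coefficient matrices and the precision matrices?
RQ3: Is the iterative estimation algorithm guaranteed to converge to a stationary point?
RQ4: Can the proposed testing procedure effectively detect Granger causality between blocks in the high-dimensional setting?
RQ5: How does including additional blocks (e.g., GDP, M2) affect inference about the causal relationship between financial assets and macroeconomic indicators?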

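Key Findings

The Frobenius-norm error rate of the regularized maximum likelihood estimators reaches $ O\big(\tfrac{\log(p_1 + p_2) + \log p_2}{T}\big) $, matching the error rate of the initial iterate, which indicates stable convergence.
The iterative algorithm converges globally with high probability, and the error bounds on the estimated parameters are controlled uniformly across iterations.
Under the null hypothesis, the test statistic for Granger causality between blocks asymptotically follows a chi-square distribution, which supports valid inference.
The theoretical analysis confirms that the estimators are consistent under sparsity and high-dimensional scaling, and that the regularizers promote within-block sparsity.
Empirical evaluation on synthetic data confirms that the method recovers the true causal structure and controls the type I error rate.
The application to S&P100 stock returns and macroeconomic variables (2001-2016) indicates that macroeconomic indicators have a significant Granger-causal effect on financial asset returns, with GDP and M2 showing strong predictive power.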
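
The chi-square finding above can be illustrated with the classical, low-dimensional analogue of a block Granger-causality test. The sketch below fits a two-block, lag-one model by ordinary least squares with and without the cross-block lags and forms a likelihood-ratio statistic for the hypothesis that the cross-block matrix is zero. This is not the paper's high-dimensional procedure: it uses no regularization, assumes the number of parameters is small relative to the sample size, and replaces the exact chi-square tail with the Wilson-Hilferty normal approximation to avoid extra dependencies. All function names and parameter values are hypothetical.

```python
import math
import numpy as np

def simulate_two_block_var(T, A, B, C, rng):
    """Simulate X_t = A X_{t-1} + U_t and Z_t = B X_{t-1} + C Z_{t-1} + V_t
    with standard normal noise (an illustrative assumption)."""
    p1, p2 = A.shape[0], C.shape[0]
    X = np.zeros((T + 1, p1))
    Z = np.zeros((T + 1, p2))
    for t in range(1, T + 1):
        X[t] = A @ X[t - 1] + rng.standard_normal(p1)
        Z[t] = B @ X[t - 1] + C @ Z[t - 1] + rng.standard_normal(p2)
    return X, Z

def residual_cov(Y, W):
    """Fit Y on W by least squares and return the residual covariance (divisor n)."""
    coef, *_ = np.linalg.lstsq(W, Y, rcond=None)
    R = Y - W @ coef
    return R.T @ R / Y.shape[0]

def block_granger_lr(X, Z):
    """Likelihood-ratio statistic for H0: B = 0, with p1 * p2 degrees of freedom."""
    Y = Z[1:]
    W_full = np.hstack([X[:-1], Z[:-1]])   # lags of both blocks
    W_null = Z[:-1]                        # lags of Z only
    n = Y.shape[0]
    _, logdet_full = np.linalg.slogdet(residual_cov(Y, W_full))
    _, logdet_null = np.linalg.slogdet(residual_cov(Y, W_null))
    stat = n * (logdet_null - logdet_full)
    df = X.shape[1] * Z.shape[1]
    return stat, df

def chi2_upper_tail_approx(stat, df):
    """Approximate P(chi2_df > stat) with the Wilson-Hilferty normal approximation."""
    stat = max(stat, 0.0)
    scale = 2.0 / (9.0 * df)
    z = ((stat / df) ** (1.0 / 3.0) - (1.0 - scale)) / math.sqrt(scale)
    return 0.5 * math.erfc(z / math.sqrt(2.0))

rng = np.random.default_rng(1)
p1, p2, T = 3, 2, 500
A, C = 0.5 * np.eye(p1), 0.4 * np.eye(p2)

# One data set where X drives Z, and one where it does not.
X1, Z1 = simulate_two_block_var(T, A, np.full((p2, p1), 0.5), C, rng)
X0, Z0 = simulate_two_block_var(T, A, np.zeros((p2, p1)), C, rng)

stat_alt, df = block_granger_lr(X1, Z1)
stat_null, _ = block_granger_lr(X0, Z0)
p_alt = chi2_upper_tail_approx(stat_alt, df)
p_null = chi2_upper_tail_approx(stat_null, df)

print(f"B != 0: LR = {stat_alt:.1f}, approx p = {p_alt:.3g}")
print(f"B  = 0: LR = {stat_null:.1f}, approx p = {p_null:.3g}")
```

When the cross-block matrix is nonzero, the statistic should be far out in the tail of the reference distribution; when it is zero, the statistic should look like a typical chi-square draw. In the high-dimensional regime the paper targets, unpenalized least squares is not viable, which is why a dedicated testing procedure is needed there.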


This review was generated by AI and reviewed by human editors.