QUICK REVIEW

[Paper Review] Dynamic Partition of Complex Networks

Lin F. Yang, Vladimir Braverman|arXiv (Cornell University)|May 22, 2017

Human Mobility and Location-Based Analysis | Cited by 2

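One-Sentence Summary

This paper proposes a stochastic generalized Hebbian algorithm for the online, scalable factorization and partition of large-scale implicit networks from random-walk observations. The algorithm learns a low-dimensional representation for each vertex and, when the underlying Markov process is lumpable, recovers the network partition exactly with high probability; applied to Manhattan taxi trip data, it uncovers a latent partition of the city that closely matches the traffic dynamics.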

ABSTRACT

Finding the reduced-dimensional structure is critical to understanding complex networks. Existing approaches such as spectral clustering are applicable only when the full network is explicitly observed. In this paper, we focus on the online factorization and partition of implicit large-scale networks based on observations from an associated random walk. We formulate this into a nonconvex stochastic factorization problem and propose an efficient and scalable stochastic generalized Hebbian algorithm. The algorithm is able to process dependent state-transition data dynamically generated by the underlying network and learn a low-dimensional representation for each vertex. By applying a diffusion approximation analysis, we show that the continuous-time limiting process of the stochastic algorithm converges globally to the principal components of the Markov chain and achieves a nearly optimal sample complexity. Once given the learned low-dimensional representations, we further apply clustering techniques to recover the network partition. We show that when the associated Markov process is lumpable, one can recover the partition exactly with high probability. We apply the proposed approach to model the traffic flow of Manhattan as city-wide random walks. By using our algorithm to analyze the taxi trip data, we discover a latent partition of the Manhattan city that closely matches the traffic dynamics.

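Motivation and Goals

Address the challenge of learning low-dimensional representations of large-scale networks when only partial, dependent observations from a random walk are available.
Develop an online, scalable method for network factorization and partition that does not require the full network to be observed.
Achieve global convergence to the principal components of the underlying Markov chain with nearly optimal sample complexity.
Recover the network partition exactly when the associated Markov process is lumpable.
Model and discover latent partitions in real-world systems such as urban traffic flow.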

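Proposed Method

Formulate network factorization as a nonconvex stochastic optimization problem over dependent state-transition data generated by a random walk.
Use a stochastic generalized Hebbian algorithm to update the low-dimensional vertex representations iteratively from streaming transition data.
Apply a diffusion approximation to show that the continuous-time limit of the algorithm converges globally to the principal components of the Markov chain.
Cluster the learned low-dimensional representations to recover the latent partition of the network.
Rely on the lumpability of the Markov process to guarantee exact partition recovery with high probability.
Validate the method on real taxi trip data, modeling Manhattan traffic as city-wide random walks.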
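
To make the streaming update concrete, here is a minimal Python sketch of a generalized Hebbian (Sanger-style) rule driven by one observed transition at a time. It is an illustration under stated assumptions, not the paper's exact update: the symmetrized indicator matrix used as the per-sample signal, the function names `gha_step`, `simulate_walk`, and `fit_embedding`, and the step size `eta` are all hypothetical choices made for this example.

```python
import numpy as np


def gha_step(W, i, j, eta):
    """One Sanger-style update from a single observed transition i -> j.

    W has shape (r, n): row k is the k-th component, and column v is the
    r-dimensional representation of vertex v. The per-sample matrix is
    A = (e_i e_j^T + e_j e_i^T) / 2 (an assumption made for this sketch).
    """
    r, n = W.shape
    WA = np.zeros((r, n))          # W @ A, built without forming A explicitly
    WA[:, j] += 0.5 * W[:, i]
    WA[:, i] += 0.5 * W[:, j]
    G = WA @ W.T                   # W A W^T, shape (r, r)
    # Hebbian term minus a lower-triangular decorrelation term (Sanger's rule).
    return W + eta * (WA - np.tril(G) @ W)


def simulate_walk(P, steps, rng, start=0):
    """Sample a random walk of `steps` transitions from the row-stochastic matrix P."""
    states = np.empty(steps + 1, dtype=int)
    states[0] = start
    for t in range(steps):
        states[t + 1] = rng.choice(P.shape[0], p=P[states[t]])
    return states


def fit_embedding(walk, n, r, eta=0.05, seed=0):
    """Stream the transitions of `walk` through gha_step; return an (n, r) embedding."""
    rng = np.random.default_rng(seed)
    W = 0.1 * rng.standard_normal((r, n))   # small random initialization
    for i, j in zip(walk[:-1], walk[1:]):
        W = gha_step(W, i, j, eta)
    return W.T
```

Each row of the array returned by `fit_embedding` is one vertex representation, which could then be handed to a clustering routine such as k-means. Only one transition is held in memory at a time, which is what makes this kind of update suitable for streaming data.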

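Experimental Results

Research Questions

RQ1: Can low-dimensional representations of a large-scale network be learned from streaming, dependent observations of a random walk?
RQ2: Does the proposed stochastic algorithm converge globally to the principal components of the underlying Markov chain?
RQ3: What is the sample complexity of the algorithm, and how close is it to optimal?
RQ4: Under what conditions can the network partition be recovered exactly from the learned representations?
RQ5: Can the method discover meaningful real-world partitions in complex systems such as urban traffic networks?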

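Key Findings

In the continuous-time limit, the stochastic generalized Hebbian algorithm converges globally to the principal components of the Markov chain.
A diffusion approximation analysis shows that the algorithm achieves nearly optimal sample complexity.
When the underlying Markov process is lumpable, the method recovers the true network partition exactly with high probability.
The learned low-dimensional representations capture latent community structure in the Manhattan traffic network.
The discovered partition closely matches actual traffic dynamics, as checked against real taxi trip data.
The method enables scalable, online analysis of implicit networks without requiring the full network to be observed.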
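
The lumpability condition behind the exact-recovery result can be checked directly when a transition matrix is available. The sketch below assumes the standard strong-lumpability definition from Markov chain theory (for every pair of blocks, all states in one block have the same total probability of jumping into the other); the summary above does not spell out the definition the paper uses, and the helper names `block_mass`, `is_lumpable`, and `lumped_chain` are hypothetical.

```python
import numpy as np


def block_mass(P, labels):
    """Return M with M[v, b] = probability of moving from state v into block b."""
    labels = np.asarray(labels)
    blocks = np.unique(labels)
    return np.stack([P[:, labels == b].sum(axis=1) for b in blocks], axis=1)


def is_lumpable(P, labels, tol=1e-9):
    """Check strong lumpability of the chain P with respect to the partition `labels`."""
    labels = np.asarray(labels)
    M = block_mass(P, labels)
    for a in np.unique(labels):
        rows = M[labels == a]
        # All states in block a must send identical mass to every block.
        if (rows.max(axis=0) - rows.min(axis=0)).max() > tol:
            return False
    return True


def lumped_chain(P, labels):
    """Transition matrix of the aggregated chain (meaningful only if P is lumpable)."""
    labels = np.asarray(labels)
    M = block_mass(P, labels)
    # Take one representative state per block; its row gives the block-level transitions.
    reps = [np.flatnonzero(labels == a)[0] for a in np.unique(labels)]
    return M[reps]
```

For a six-state chain with two tightly connected groups of three states, `is_lumpable` returns `True` for the grouping that matches the blocks and `False` for a partition that splits a block, which is the structural property the clustering step relies on.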

This review was generated by AI and reviewed by a human editor.