QUICK REVIEW

[Paper Review] Online Influence Maximization under Independent Cascade Model with Semi-Bandit Feedback

Zheng Wen, Branislav Kveton|arXiv (Cornell University)|May 21, 2016

Advanced Bandit Algorithms Research | References: 24 | Cited by: 51

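One-Sentence Summary

This paper proposes IMLinUCB, a computationally efficient UCB-based algorithm for online influence maximization under the independent cascade model with semi-bandit feedback. Its regret bound is polynomial in all quantities of interest and reflects both the network topology and the edge activation probabilities; the authors present it as the first theoretical guarantee for this setting, with linear generalization making the approach scalable to large networks.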

ABSTRACT

We study the online influence maximization problem in social networks under the independent cascade model. Specifically, we aim to learn the set of "best influencers" in a social network online while repeatedly interacting with it. We address the challenges of (i) combinatorial action space, since the number of feasible influencer sets grows exponentially with the maximum number of influencers, and (ii) limited feedback, since only the influenced portion of the network is observed. Under a stochastic semi-bandit feedback, we propose and analyze IMLinUCB, a computationally efficient UCB-based algorithm. Our bounds on the cumulative regret are polynomial in all quantities of interest, achieve near-optimal dependence on the number of interactions and reflect the topology of the network and the activation probabilities of its edges, thereby giving insights on the problem complexity. To the best of our knowledge, these are the first such results. Our experiments show that in several representative graph topologies, the regret of IMLinUCB scales as suggested by our upper bounds. IMLinUCB permits linear generalization and thus is both statistically and computationally suitable for large-scale problems. Our experiments also show that IMLinUCB with linear generalization can lead to low regret in real-world online influence maximization.

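Motivation and Objectives

Solve the online influence maximization problem in social networks where the activation probabilities are initially unknown.
Handle the combinatorial action space that arises because the number of feasible influencer sets grows exponentially.
Design a learning algorithm that works under limited semi-bandit feedback, where only the influenced portion of the network is observed.
Develop a method that is both statistically efficient and computationally scalable, so that it applies to large networks.
Establish theoretical regret bounds that reflect the network topology and the edge probabilities.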
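To make the setting above concrete, the following is a minimal sketch of one diffusion round under the independent cascade model. It assumes that the agent sees the outcome of every edge leaving an influenced node; the function name `simulate_cascade`, the toy graph, and the dictionary-based edge format are all hypothetical and are not taken from the paper.

```python
import random
from collections import deque


def simulate_cascade(edges, seeds, rng):
    """Run one independent-cascade diffusion (illustrative sketch).

    edges: dict mapping (u, v) -> activation probability (hypothetical format).
    seeds: iterable of source nodes chosen by the agent.
    Returns the set of influenced nodes and the observed edge outcomes.
    """
    # Build an adjacency list of outgoing edges.
    out = {}
    for (u, v), p in edges.items():
        out.setdefault(u, []).append((v, p))

    influenced = set(seeds)
    queue = deque(influenced)
    observed = {}  # (u, v) -> True/False, only for edges whose source was influenced

    while queue:
        u = queue.popleft()
        for v, p in out.get(u, []):
            fired = rng.random() < p  # each edge is tried exactly once
            observed[(u, v)] = fired
            if fired and v not in influenced:
                influenced.add(v)
                queue.append(v)
    return influenced, observed


# Toy graph: the edge (3, 4) is never observed because node 3 is never influenced.
toy_edges = {(0, 1): 1.0, (1, 2): 0.0, (3, 4): 1.0}
influenced, observed = simulate_cascade(toy_edges, [0], random.Random(0))
```

The `observed` dictionary is the semi-bandit feedback in this sketch: it covers only the part of the graph that the cascade reached, which is why the learner cannot estimate every edge after a single round.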

Proposed Method

Proposes IMLinUCB, a UCB-based algorithm that models activation probabilities through linear generalization over edge features.
Controls estimation error with self-normalized confidence intervals built from the inverse of the covariance matrix $\mathbf{M}_t$.
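Represents the edge activation probabilities with the linear model $\mathbf{w}(e) = \langle \mathbf{x}_e, \theta^* \rangle$, which enables efficient generalization.
Establishes a high-probability bound on the estimation error $\| \overline{\theta}_t - \theta^* \|_{\mathbf{M}_t^{-1}}$ using the self-normalized martingale inequality from [1].
Defines the maximum observed relevance $C_*$, a measure that captures problem complexity in terms of the network structure and the edge probabilities.
Adopts a semi-bandit feedback model in which the agent observes the edges activated during each diffusion.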
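The estimation and confidence steps listed above can be sketched as a small ridge-regression UCB over edge features. This is an illustrative sketch under stated assumptions, not the authors' implementation: the class name `EdgeLinUCB`, the parameters `sigma` and `c`, and the use of a Sherman-Morrison update to maintain the inverse matrix are choices made here for the example, and the seed-selection oracle that would consume these upper confidence bounds is left out.

```python
import math


class EdgeLinUCB:
    """Ridge-regression UCB over edge features (illustrative sketch)."""

    def __init__(self, dim, sigma=1.0, c=1.0):
        self.dim = dim
        self.sigma = sigma  # assumed noise-scale parameter
        self.c = c          # assumed confidence-width multiplier
        # M starts as the identity, so its inverse is the identity as well.
        self.M_inv = [[1.0 if i == j else 0.0 for j in range(dim)]
                      for i in range(dim)]
        self.B = [0.0] * dim  # running sum of features weighted by outcomes

    def _matvec(self, x):
        return [sum(row[j] * x[j] for j in range(self.dim)) for row in self.M_inv]

    def theta(self):
        # Point estimate: theta_bar = sigma^-2 * M^-1 * B.
        return [v / self.sigma ** 2 for v in self._matvec(self.B)]

    def ucb(self, x):
        # Estimated probability plus a confidence width, clipped to [0, 1].
        mean = sum(t * xi for t, xi in zip(self.theta(), x))
        quad = sum(xi * mi for xi, mi in zip(x, self._matvec(x)))
        return min(1.0, max(0.0, mean + self.c * math.sqrt(max(0.0, quad))))

    def update(self, x, activated):
        # Sherman-Morrison update of M^-1 for M <- M + sigma^-2 * x x^T.
        Mx = self._matvec(x)
        denom = self.sigma ** 2 + sum(xi * mi for xi, mi in zip(x, Mx))
        for i in range(self.dim):
            for j in range(self.dim):
                self.M_inv[i][j] -= Mx[i] * Mx[j] / denom
        if activated:
            for i in range(self.dim):
                self.B[i] += x[i]


# Usage: feed in every observed edge outcome from one diffusion round.
model = EdgeLinUCB(dim=2)
for _ in range(3):
    model.update([1.0, 0.0], activated=False)
bound_seen = model.ucb([1.0, 0.0])    # shrinks as the feature direction is observed
bound_unseen = model.ucb([0.0, 1.0])  # stays optimistic for an unexplored direction
```

Because the estimate lives in feature space rather than per edge, observing one edge also tightens the bound for every other edge with similar features, which is the sense in which linear generalization helps on large graphs.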

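Experimental Results

Research Questions

RQ1: Can we design a computationally efficient online learning algorithm for influence maximization under the independent cascade model with semi-bandit feedback?
RQ2: What is the information-theoretic complexity of online influence maximization, and how does it depend on the network topology and the edge probabilities?
RQ3: Can we obtain regret bounds that grow polynomially in the problem parameters and faithfully reflect the complexity of the network?
RQ4: How does linear generalization improve statistical and computational efficiency in large-scale influence maximization?
RQ5: Does the proposed algorithm achieve low regret in practice across different network topologies?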

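Key Findings

IMLinUCB achieves a cumulative regret bound that is polynomial in all quantities of interest, including the number of interactions, the network size, and the dimension of the feature space.
The bound's dependence on the number of interactions is near-optimal, matching known lower bounds for bandit settings up to logarithmic factors.
Through the maximum observed relevance $C_*$, which quantifies problem complexity, the regret bound reflects the network topology and the edge activation probabilities.
Experiments show that the regret of IMLinUCB grows as the theoretical bounds predict across a range of synthetic and real graph topologies.
Linear generalization lets IMLinUCB keep regret low on large-scale real-world influence maximization tasks, demonstrating practical scalability.
High-probability confidence intervals on the parameter estimation error, derived from a self-normalized martingale inequality, keep the estimates reliable throughout sequential learning.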


This review was generated by AI and reviewed by human editors.