QUICK REVIEW

[Paper Digest] Online Influence Maximization (Extended Version)

Siyu Lei, Silviu Maniu|arXiv (Cornell University)|Jun 3, 2015

Topic: Spam and Phishing Detection | References: 25 | Citations: 42

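ONE-SENTENCE SUMMARY

This paper proposes an Online Influence Maximization (OIM) framework for maximizing influence spread in a social network when the influence probabilities are initially unknown. Over multiple trials, the framework follows an explore-exploit strategy: it iteratively selects seed nodes and uses real feedback to update its influence probability estimates. In the reported experiments it is more effective than traditional offline IM methods when only partial information is available.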

ABSTRACT

Social networks are commonly used for marketing purposes. For example, free samples of a product can be given to a few influential social network users (or "seed nodes"), with the hope that they will convince their friends to buy it. One way to formalize marketers' objective is through influence maximization (or IM), whose goal is to find the best seed nodes to activate under a fixed budget, so that the number of people who get influenced in the end is maximized. Recent solutions to IM rely on the influence probability that a user influences another one. However, this probability information may be unavailable or incomplete. In this paper, we study IM in the absence of complete information on influence probability. We call this problem Online Influence Maximization (OIM) since we learn influence probabilities at the same time we run influence campaigns. To solve OIM, we propose a multiple-trial approach, where (1) some seed nodes are selected based on existing influence information; (2) an influence campaign is started with these seed nodes; and (3) users' feedback is used to update influence information. We adopt the Explore-Exploit strategy, which can select seed nodes using either the current influence probability estimation (exploit), or the confidence bound on the estimation (explore). Any existing IM algorithm can be used in this framework. We also develop an incremental algorithm that can significantly reduce the overhead of handling users' feedback information. Our experiments show that our solution is more effective than traditional IM methods on the partial information.
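The three-step trial loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes an independent-cascade style diffusion in which each newly activated node gets one chance to activate each inactive neighbor, it uses a simple summed-probability ranking (`select_seeds`) as a stand-in for a real IM algorithm, and it uses a smoothed success frequency as the estimate. All function and variable names are hypothetical.

```python
import random

def simulate_cascade(graph, true_prob, seeds, rng):
    """Run one campaign; return activated nodes and per-edge feedback."""
    active = set(seeds)
    frontier = list(seeds)
    feedback = []  # (u, v, success) for every activation attempt
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in graph.get(u, []):
                if v in active:
                    continue
                success = rng.random() < true_prob[(u, v)]
                feedback.append((u, v, success))
                if success:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active, feedback

def select_seeds(graph, est_prob, k):
    """Stand-in for an IM algorithm: rank nodes by summed estimated out-edge probability."""
    score = {u: sum(est_prob[(u, v)] for v in nbrs) for u, nbrs in graph.items()}
    return sorted(score, key=lambda u: (-score[u], u))[:k]

def run_oim(graph, true_prob, k, trials, seed=0):
    """Repeat select -> campaign -> update for a fixed number of trials."""
    rng = random.Random(seed)
    attempts = {e: 0 for e in true_prob}
    successes = {e: 0 for e in true_prob}
    est_prob = {e: 0.5 for e in true_prob}  # assumed uninformed starting guess
    spreads = []
    for _ in range(trials):
        seeds = select_seeds(graph, est_prob, k)                    # step (1)
        active, feedback = simulate_cascade(graph, true_prob, seeds, rng)  # step (2)
        spreads.append(len(active))
        for u, v, ok in feedback:                                   # step (3)
            attempts[(u, v)] += 1
            successes[(u, v)] += int(ok)
            # Smoothed frequency: (successes + 1) / (attempts + 2)
            est_prob[(u, v)] = (successes[(u, v)] + 1) / (attempts[(u, v)] + 2)
    return est_prob, spreads
```

Here `graph` maps each node to a list of out-neighbors and `true_prob` holds the hidden per-edge probabilities that only the simulator sees. Because `select_seeds` is isolated behind a single call, any existing IM algorithm could be swapped in at that point, which is the property the abstract highlights.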

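RESEARCH MOTIVATION AND GOALS

Address the challenge of influence maximization when the influence probabilities between users are unknown or incomplete.
Develop a framework that learns influence probabilities while influence campaigns are running.
Improve the effectiveness of influence campaigns in real-world settings that lack prior knowledge of influence probabilities.
Reduce the computational overhead of updating influence probability estimates across repeated campaigns.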

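PROPOSED METHOD

Proposes an Online Influence Maximization (OIM) framework that runs over multiple trials, each consisting of a selection phase and an execution phase.
Adopts an explore-exploit strategy: select seed nodes using the current influence probability estimates (exploit), or use confidence bounds on those estimates to probe regions of high uncertainty (explore).
Uses Bayesian updating with conjugate priors to maintain and refine the uncertainty about each influence probability.
Proposes the CB-INC algorithm, which reuses samples from earlier trials to cut the computational cost of graph updates.
Applies state-of-the-art IM algorithms (such as CELF and TIM+) in the selection phase, operating on the estimated influence probabilities.
Updates the influence probability distributions with maximum likelihood estimation (MLE), based on user feedback from each trial.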
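To make the conjugate-prior update and the confidence-bound idea concrete, here is a minimal sketch. It assumes each edge's influence probability carries a Beta(alpha, beta) belief, which is the standard conjugate prior for success/failure feedback, and it assumes the value handed to the IM algorithm is the posterior mean shifted by `theta` standard deviations. The class name `EdgeBelief`, the parameter `theta`, and the uniform Beta(1, 1) starting prior are illustrative choices, not details confirmed by this digest.

```python
from dataclasses import dataclass
from math import sqrt

@dataclass
class EdgeBelief:
    """Beta(alpha, beta) belief over one edge's influence probability."""
    alpha: float = 1.0  # assumed uniform prior
    beta: float = 1.0

    def update(self, successes: int, failures: int) -> None:
        # Conjugate update: observed activations add to alpha, failures to beta.
        self.alpha += successes
        self.beta += failures

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)

    @property
    def std(self) -> float:
        n = self.alpha + self.beta
        return sqrt(self.alpha * self.beta / (n * n * (n + 1)))

    def estimate(self, theta: float = 0.0) -> float:
        # theta = 0 uses the posterior mean (exploit);
        # theta > 0 is optimistic about uncertain edges (explore).
        return min(1.0, max(0.0, self.mean + theta * self.std))

belief = EdgeBelief()
belief.update(successes=3, failures=1)   # belief is now Beta(4, 2)
exploit_value = belief.estimate(theta=0.0)
explore_value = belief.estimate(theta=1.0)
```

As more feedback arrives the standard deviation shrinks, so the explore and exploit values converge and the choice of `theta` matters less in later trials.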

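EXPERIMENTAL RESULTS

Research Questions

RQ1: Can influence maximization be carried out effectively when the initial influence probabilities are unknown or incomplete?
RQ2: How does the explore-exploit strategy balance learning new influence probabilities against maximizing influence spread during live campaigns?
RQ3: How do incremental graph-update techniques affect computational efficiency, and do they achieve their savings without reducing influence spread?
RQ4: How do the number of trials and the budget allocation affect the performance of online influence maximization?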

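Key Findings

On the DBLP dataset, CB-INC cuts running time by up to 16 hours compared with the non-incremental CB algorithm, with a sample reuse rate of 80–99% when N ≥ 10.
Across the tested τ values, the influence spread of CB and CB-INC falls within 3–15% of the spread obtained on the true influence graph (oracle), with τ = 0.01 giving the best spread.
With CB-INC, the OIM framework's influence spread stays consistently close to the oracle, and the gap narrows as the number of trials grows.
Smaller values of k are less efficient because more trials are needed, whereas the advantage of CB-INC over CB becomes more pronounced as k grows.
Smaller τ values (such as 0.01) improve spread but reduce efficiency by 28–38%, because a stricter global check must be performed.
On the NetPHY, NetHEPT, and DBLP datasets, the framework significantly outperforms the heuristic baselines (Random, MaxDegree) in both influence spread and efficiency.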

This digest was generated by AI and reviewed by a human editor.