QUICK REVIEW

[Paper Review] Neural Factorization Machines for Sparse Predictive Analytics

Xiangnan He, Tat‐Seng Chua|arXiv (Cornell University)|Aug 16, 2017

Recommender Systems and Techniques | References: 31 | Cited by: 188

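One-Sentence Summary

NFM combines the linear second-order interactions of Factorization Machines with a neural network to model higher-order, non-linear feature interactions on sparse data; with a shallower, easier-to-train architecture, it outperforms FM and compares favorably with competing deep models.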

ABSTRACT

Many predictive tasks of web applications need to model categorical variables, such as user IDs and demographics like genders and occupations. To apply standard machine learning techniques, these categorical predictors are always converted to a set of binary features via one-hot encoding, making the resultant feature vector highly sparse. To learn from such sparse data effectively, it is crucial to account for the interactions between features. Factorization Machines (FMs) are a popular solution for efficiently using the second-order feature interactions. However, FM models feature interactions in a linear way, which can be insufficient for capturing the non-linear and complex inherent structure of real-world data. While deep neural networks have recently been applied to learn non-linear feature interactions in industry, such as the Wide&Deep by Google and DeepCross by Microsoft, the deep structure meanwhile makes them difficult to train. In this paper, we propose a novel model Neural Factorization Machine (NFM) for prediction under sparse settings. NFM seamlessly combines the linearity of FM in modelling second-order feature interactions and the non-linearity of neural network in modelling higher-order feature interactions. Conceptually, NFM is more expressive than FM since FM can be seen as a special case of NFM without hidden layers. Empirical results on two regression tasks show that with one hidden layer only, NFM significantly outperforms FM with a 7.3% relative improvement. Compared to the recent deep learning methods Wide&Deep and DeepCross, our NFM uses a shallower structure but offers better performance, being much easier to train and tune in practice.
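
To make the sparsity described in the abstract concrete, here is a minimal sketch of one-hot encoding a record with three categorical fields. The field names, vocabularies, and the helper `one_hot_indices` are hypothetical and chosen only for illustration; they do not come from the paper.

```python
def one_hot_indices(record, vocab):
    """Return the sorted positions of the non-zero entries of the
    one-hot encoded record, plus the total feature dimension.

    Each field gets its own block of positions; a record activates
    exactly one position per field, so most entries stay zero.
    """
    offsets, offset = {}, 0
    for field, values in vocab.items():
        offsets[field] = offset
        offset += len(values)
    indices = [offsets[f] + vocab[f].index(v) for f, v in record.items()]
    return sorted(indices), offset


# Hypothetical vocabularies for three categorical fields.
vocab = {
    "user_id": ["u1", "u2", "u3"],
    "gender": ["female", "male"],
    "occupation": ["student", "engineer", "teacher"],
}
record = {"user_id": "u2", "gender": "female", "occupation": "teacher"}

indices, dim = one_hot_indices(record, vocab)
print(indices, dim)
```

Only three of the eight positions are non-zero here. With real user-ID vocabularies in the millions, the vector is overwhelmingly zeros, which is why models such as FM and NFM work on the non-zero indices rather than on the dense vector.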

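Research Motivation and Goals

Improve interaction modeling for sparse categorical features while avoiding heavy feature engineering.
Introduce Bi-Interaction pooling as the neural-network equivalent of FM's second-order interactions.
Develop Neural Factorization Machines (NFM), which deepen FM's expressiveness with non-linear hidden layers.
Demonstrate NFM's effectiveness against FM, Wide&Deep, and DeepCross on real-world datasets.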
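
For reference, Bi-Interaction pooling is commonly written as below. The notation is an assumption based on the usual FM convention, since this summary does not spell out the formula: $x_i$ is the value of feature $i$, $\mathbf{v}_i$ is its $k$-dimensional embedding, $\odot$ is the element-wise product, and the squares in the second form are taken element-wise.

```latex
f_{BI}(\mathcal{V}_x)
  = \sum_{i=1}^{n} \sum_{j=i+1}^{n} x_i \mathbf{v}_i \odot x_j \mathbf{v}_j
  = \frac{1}{2} \left[ \Big( \sum_{i=1}^{n} x_i \mathbf{v}_i \Big)^{2}
      - \sum_{i=1}^{n} \big( x_i \mathbf{v}_i \big)^{2} \right]
```

The second form needs only a single pass over the non-zero features, which is where the linear time complexity comes from.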

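Proposed Method

Map each feature to a dense vector through an embedding layer.
Apply Bi-Interaction pooling in the embedding space to capture second-order feature interactions.
Stack fully connected layers on top of the Bi-Interaction output to learn higher-order interactions.
Use a prediction layer to map the final hidden representation to the target score.
With no hidden layers, the model (NFM-0) reduces exactly to FM.
Regularize the Bi-Interaction layer and the hidden layers with dropout, and apply batch normalization after the Bi-Interaction layer and each subsequent layer.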
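
The steps above can be sketched as a forward pass in NumPy. This is a simplified illustration under stated assumptions, not the authors' implementation: the ReLU activation, the global bias and first-order term (taken from the usual FM formulation), the parameter layout, and the helper names `init_params` and `nfm_forward` are all assumptions, and dropout and batch normalization are omitted for brevity.

```python
import numpy as np


def bi_interaction(embedded):
    """Bi-Interaction pooling over rows x_i * v_i of shape (n, k).

    Computes the sum of element-wise products over all pairs i < j
    as 0.5 * ((sum of rows)^2 - sum of squared rows).
    """
    sum_then_square = np.sum(embedded, axis=0) ** 2
    square_then_sum = np.sum(embedded ** 2, axis=0)
    return 0.5 * (sum_then_square - square_then_sum)


def init_params(n_features, k, hidden_sizes, seed=0):
    """Randomly initialise a hypothetical parameter dictionary."""
    rng = np.random.default_rng(seed)
    sizes = [k] + list(hidden_sizes)
    hidden = [
        (rng.normal(0.0, 0.1, (sizes[i], sizes[i + 1])), np.zeros(sizes[i + 1]))
        for i in range(len(hidden_sizes))
    ]
    return {
        "w0": 0.0,                                   # global bias (assumed)
        "w": np.zeros(n_features),                   # first-order weights (assumed)
        "V": rng.normal(0.0, 0.1, (n_features, k)),  # embedding table
        "hidden": hidden,                            # list of (W, b) pairs
        "h": rng.normal(0.0, 0.1, sizes[-1]),        # prediction-layer weights
    }


def nfm_forward(indices, values, params):
    """Score one sparse instance from its non-zero indices and values."""
    # Embedding layer: look up v_i and scale it by the feature value x_i.
    embedded = params["V"][indices] * values[:, None]
    # Bi-Interaction layer: pool the embeddings into one k-dimensional vector.
    z = bi_interaction(embedded)
    # Hidden layers: fully connected, ReLU activation (an assumption here).
    for W, b in params["hidden"]:
        z = np.maximum(0.0, z @ W + b)
    # Prediction layer plus the linear part.
    linear = params["w0"] + params["w"][indices] @ values
    return float(linear + z @ params["h"])


params = init_params(n_features=10, k=4, hidden_sizes=[8])
score = nfm_forward(np.array([1, 4, 7]), np.ones(3), params)
print(score)
```

Passing `hidden_sizes=[]` skips the hidden layers entirely, which corresponds to the NFM-0 configuration in this sketch.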

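Experimental Results

Research Questions

RQ1: Can Bi-Interaction pooling effectively capture second-order feature interactions?
RQ2: Do NFM's hidden layers improve its ability to express higher-order interactions?
RQ3: How does NFM compare with higher-order FM and with state-of-the-art deep models such as Wide&Deep and DeepCross?
RQ4: Which optimization and regularization strategies (dropout, batch normalization) help train NFM?
RQ5: Is FM a special case of the NFM framework?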

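Key Findings

With a single hidden layer, NFM significantly outperforms FM on the tested tasks, with a relative improvement of about 7.3%.
With a shallower, easier-to-train structure, NFM performs competitively with, or better than, Wide&Deep and DeepCross.
Bi-Interaction pooling models second-order interactions in linear time, which makes it easier for the subsequent layers to learn higher-order interactions.
Dropout on the Bi-Interaction layer and the hidden layers helps regularize NFM and may outperform standard L2 regularization.
Without hidden layers, NFM-0 exactly recovers FM, showing that FM is a special case of NFM.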
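
Two of these findings can be checked numerically. The sketch below runs on random data, not on the paper's datasets. It compares the linear-time pooling against an explicit sum over feature pairs, then checks that projecting the pooled vector with all-ones weights gives the FM second-order term. The all-ones weights are an assumption about how the NFM-0 reduction is set up, and all function names are illustrative.

```python
import itertools

import numpy as np


def pairwise_bi_interaction(embedded):
    """Reference version: explicit sum over pairs i < j, quadratic in n."""
    total = np.zeros(embedded.shape[1])
    for i, j in itertools.combinations(range(embedded.shape[0]), 2):
        total += embedded[i] * embedded[j]
    return total


def linear_time_bi_interaction(embedded):
    """Same quantity in one pass over the rows, linear in n."""
    return 0.5 * (embedded.sum(axis=0) ** 2 - (embedded ** 2).sum(axis=0))


def fm_second_order(embedded):
    """FM second-order term: sum over i < j of the inner products."""
    pairs = itertools.combinations(range(embedded.shape[0]), 2)
    return sum(float(embedded[i] @ embedded[j]) for i, j in pairs)


rng = np.random.default_rng(42)
embedded = rng.normal(size=(6, 4))  # rows play the role of x_i * v_i

slow = pairwise_bi_interaction(embedded)
fast = linear_time_bi_interaction(embedded)

# NFM-0 with no hidden layers and prediction weights h = (1, ..., 1).
nfm0_term = float(fast @ np.ones(embedded.shape[1]))
fm_term = fm_second_order(embedded)

print(np.allclose(slow, fast), np.isclose(nfm0_term, fm_term))
```

Both comparisons hold up to floating-point tolerance, which matches the claim that FM appears as a special case once the hidden layers are removed.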


This review was generated by AI and reviewed by human editors.