QUICK REVIEW

[Paper Review] FAT-DeepFFM: Field Attentive Deep Field-aware Factorization Machine

Junlin Zhang, Tongwen Huang|arXiv (Cornell University)|Jan 1, 2019

Recommender Systems and Techniques | Cited by 1

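One-Sentence Summary

This paper proposes FAT-DeepFFM, a deep learning model for click-through rate (CTR) prediction that integrates a field attention mechanism based on the Compose-Excitation network (CENet) into DeepFFM, so that informative features are highlighted dynamically before explicit feature interaction. On two real-world datasets the model outperforms the existing models it is compared with, and the experiments indicate that attention applied before interaction is more effective than attention applied after it.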

ABSTRACT

Click through rate (CTR) estimation is a fundamental task in personalized advertising and recommender systems. Recent years have witnessed the success of both the deep learning based model and attention mechanism in various tasks in computer vision (CV) and natural language processing (NLP). How to combine the attention mechanism with deep CTR model is a promising direction because it may ensemble the advantages of both sides. Although some CTR model such as Attentional Factorization Machine (AFM) has been proposed to model the weight of second order interaction features, we posit the evaluation of feature importance before explicit feature interaction procedure is also important for CTR prediction tasks because the model can learn to selectively highlight the informative features and suppress less useful ones if the task has many input features. In this paper, we propose a new neural CTR model named Field Attentive Deep Field-aware Factorization Machine (FAT-DeepFFM) by combining the Deep Field-aware Factorization Machine (DeepFFM) with Compose-Excitation network (CENet) field attention mechanism which is proposed by us as an enhanced version of Squeeze-Excitation network (SENet) to highlight the feature importance. We conduct extensive experiments on two real-world datasets and the experiment results show that FAT-DeepFFM achieves the best performance and obtains different improvements over the state-of-the-art methods. We also compare two kinds of attention mechanisms (attention before explicit feature interaction vs. attention after explicit feature interaction) and demonstrate that the former one outperforms the latter one significantly.

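Motivation and Objectives

Improve CTR prediction by strengthening the assessment of feature importance before explicit feature interaction in deep learning models.
Address the limitation that existing models do not sufficiently prioritize informative features in high-dimensional input spaces.
Design a new attention mechanism that adaptively weights fields according to their relevance to the prediction task.
Empirically compare the effectiveness of applying attention before versus after explicit feature interaction.
Reach state-of-the-art performance on real-world CTR prediction benchmarks by combining improved feature representations with attention.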

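Proposed Method

Propose a new field attention mechanism, the Compose-Excitation network (CENet), as an enhanced version of SENet for modeling field-level feature importance.
Integrate CENet into the Deep Field-aware Factorization Machine (DeepFFM) framework so that attention is applied before explicit feature interaction.
Use a two-stage attention mechanism: a compression step (CENet's compose phase) first aggregates global field-level information, and an excitation step then learns a dynamic attention weight for each field.
Apply the attention weights to rescale the field embeddings, then compute pairwise interactions in the factorization layer.
Introduce residual connections and nonlinear transformations in the CENet module to increase its representational capacity.
Train the model end to end for CTR prediction with stochastic gradient descent and a sigmoid cross-entropy loss.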
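
The steps listed above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the mean-pooling compress step, the bottleneck size `r`, the sigmoid output activation, the random weights, and all function names are hypothetical choices made for the example, and the residual connection is omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def field_attention(emb, w1, w2):
    """Rescale field-aware embeddings with one attention weight each.

    emb: (n, n, k) array; emb[i, j] is the embedding of field i that is
         used when field i interacts with field j (FFM convention).
    w1:  (r, n*n) weights of the bottleneck layer (hypothetical shape).
    w2:  (n*n, r) weights of the output layer (hypothetical shape).
    """
    n = emb.shape[0]
    # Compress: one scalar descriptor per embedding vector.
    # Assumption: mean pooling, as in SENet's squeeze step.
    z = emb.mean(axis=2).reshape(-1)        # (n*n,)
    # Excite: two-layer bottleneck MLP -> one weight per embedding.
    h = np.maximum(0.0, w1 @ z)             # ReLU, shape (r,)
    a = sigmoid(w2 @ h).reshape(n, n)       # assumed sigmoid output
    # Rescale the embeddings BEFORE any interaction is computed.
    return emb * a[:, :, None], a

def ffm_pairwise(emb):
    """Inner products <emb[i, j], emb[j, i]> for all field pairs i < j."""
    i, j = np.triu_indices(emb.shape[0], k=1)
    return np.sum(emb[i, j] * emb[j, i], axis=1)

rng = np.random.default_rng(0)
n, k, r = 4, 8, 3                           # fields, embedding dim, bottleneck
emb = rng.normal(size=(n, n, k))
w1 = rng.normal(size=(r, n * n))
w2 = rng.normal(size=(n * n, r))

scaled, attn = field_attention(emb, w1, w2)
interactions = ffm_pairwise(scaled)         # one value per field pair
```

In a full model the weights `w1` and `w2` would be learned jointly with the embeddings, and `interactions` would feed the deep layers together with the other inputs; here they are random so that the example stays self-contained.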

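Experimental Results

Research Questions

RQ1: Can a field-level attention mechanism improve CTR prediction by selectively emphasizing informative features before interaction?
RQ2: In deep CTR models, how does applying attention before interaction differ from applying it after interaction?
RQ3: Does the proposed CENet mechanism outperform standard attention mechanisms such as SENet on CTR modeling tasks?
RQ4: Does integrating field attention into DeepFFM reach state-of-the-art performance on real-world datasets?
RQ5: In high-dimensional CTR prediction, how much does feature importance weighting contribute to overall model performance?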
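
The difference between the two attention placements compared in RQ2 can be illustrated with a toy sketch. It is deliberately simplified and is an assumption-laden illustration, not the paper's architecture: it uses one plain embedding per field (not field-aware embeddings), element-wise products as the interaction, and fixed, hand-picked weights; the function names are hypothetical.

```python
import numpy as np

def interact(emb):
    """Element-wise products emb[i] * emb[j] for all field pairs i < j."""
    i, j = np.triu_indices(emb.shape[0], k=1)
    return emb[i] * emb[j]                  # shape (n_pairs, k)

def attention_before(emb, field_w):
    """Weight each FIELD embedding first, then compute interactions."""
    return interact(emb * field_w[:, None])

def attention_after(emb, pair_w):
    """Compute interactions first, then weight each PAIR (AFM-style)."""
    return interact(emb) * pair_w[:, None]

rng = np.random.default_rng(1)
emb = rng.normal(size=(3, 4))               # 3 fields, embedding dim 4

# Suppressing field 0 before interaction removes every pair that contains it.
before = attention_before(emb, np.array([0.0, 1.0, 1.0]))

# Suppressing only the pair (0, 1) after interaction leaves pair (0, 2) intact.
after = attention_after(emb, np.array([0.0, 1.0, 1.0]))
```

With attention before interaction there is one weight per field, and it affects every pair that field takes part in; with attention after interaction there is one weight per pair.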

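Key Findings

FAT-DeepFFM performs best among all compared models on two real-world CTR prediction datasets.
The model significantly outperforms state-of-the-art methods, with consistent gains on both AUC and logloss.
Applying attention before explicit feature interaction gives significantly better results than applying it after interaction, which supports the importance of early feature weighting.
The proposed CENet mechanism learns to highlight informative fields and suppress irrelevant ones, which improves the model's generalization.
Ablation experiments confirm that the field attention mechanism contributes significantly to the overall performance gain.
The model shows strong robustness and generalization across different data distributions, which suggests good potential for practical use.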
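
For readers unfamiliar with the two metrics named above, the following sketch shows how AUC and logloss are commonly computed. The labels and predicted probabilities are made-up toy values, not results from the paper, and the function names are hypothetical; logloss here is the same quantity as the sigmoid cross-entropy loss when predictions are sigmoid outputs.

```python
import numpy as np

def logloss(y_true, p_pred, eps=1e-15):
    """Mean binary cross-entropy; lower is better."""
    y = np.asarray(y_true, dtype=float)
    # Clip to avoid log(0) for predictions of exactly 0 or 1.
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1.0 - eps)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))

def auc(y_true, p_pred):
    """Probability that a random positive is scored above a random
    negative (ties count as one half); higher is better."""
    y = np.asarray(y_true)
    p = np.asarray(p_pred, dtype=float)
    pos, neg = p[y == 1], p[y == 0]
    # All positive-vs-negative score differences; fine for small inputs,
    # but quadratic in memory for large ones.
    diff = pos[:, None] - neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

# Toy data (hypothetical): 1 = click, 0 = no click.
y = [1, 0, 1, 0]
p = [0.9, 0.2, 0.6, 0.7]

toy_auc = auc(y, p)          # 3 of the 4 positive/negative pairs are ordered correctly
toy_logloss = logloss(y, p)
```

For datasets of realistic size, a rank-based AUC implementation such as the one in a standard machine learning library would be a better choice than the pairwise version shown here.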


This review was generated by AI and checked by human editors.