QUICK REVIEW

[Paper Digest] FCN: Fusing Exponential and Linear Cross Network for Click-Through Rate Prediction

Honghao Li, Yiwen Zhang|arXiv (Cornell University)|Jul 18, 2024

Radiomics and Machine Learning in Medical Imaging | Cited by 5

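One-Sentence Summary

This paper proposes DCNv3 and SDCNv3, explicit feature interaction networks whose feature-crossing order grows exponentially. They use Self-Mask to filter noise and Tri-BCE for supervision, and they reach state-of-the-art results on six CTR datasets without relying on the implicit interactions of a DNN.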

ABSTRACT

As an important modeling paradigm in click-through rate (CTR) prediction, the Deep & Cross Network (DCN) and its derivative models have gained widespread recognition primarily due to their success in a trade-off between computational cost and performance. This paradigm employs a cross network to explicitly model feature interactions with linear growth, while leveraging deep neural networks (DNN) to implicitly capture higher-order feature interactions. However, these models still face several key limitations: (1) The performance of existing explicit feature interaction methods lags behind that of implicit DNN, resulting in overall model performance being dominated by the DNN; (2) While these models claim to capture high-order feature interactions, they often overlook potential noise within these interactions; (3) The learning process for different interaction network branches lacks appropriate supervision signals; and (4) The high-order feature interactions captured by these models are often implicit and non-interpretable due to their reliance on DNN. To address the identified limitations, this paper proposes a novel model, called Fusing Cross Network (FCN), along with two sub-networks: Linear Cross Network (LCN) and Exponential Cross Network (ECN). FCN explicitly captures feature interactions with both linear and exponential growth, eliminating the need to rely on implicit DNN. Moreover, we introduce the Self-Mask operation to filter noise layer by layer and reduce the number of parameters in the cross network by half. To effectively train these two cross networks, we propose a simple yet effective loss function called Tri-BCE, which provides tailored supervision signals for each network. We evaluate the effectiveness, efficiency, and interpretability of FCN on six benchmark datasets. Furthermore, by integrating LCN and ECN, FCN achieves a new state-of-the-art performance.

Research Motivation and Goals

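Advance interpretable CTR models built on explicit feature interactions, moving beyond the implicit interactions of conventional DNN-based models.
Propose Deep Crossing (DCNv3), which grows the cross order exponentially to achieve truly deep crossing.
Propose Shallow & Deep Cross Network v3 (SDCNv3) to fuse low-order and high-order explicit interactions.
Introduce Self-Mask to filter noise and reduce the number of parameters in the cross network.
Develop the Tri-BCE loss to provide adaptive supervision signals for the sub-networks.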
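
The difference between linear and exponential growth of the cross order can be illustrated with a small counting sketch. It assumes that a linear cross layer multiplies the current state by the first-order input (adding one to the highest interaction order), while an exponential cross layer multiplies the current state by itself (doubling it). The function name `max_order` is hypothetical and used only for illustration.

```python
def max_order(num_layers, exponential):
    """Highest feature-interaction order after stacking `num_layers` cross layers."""
    order = 1  # the embedded input is first-order
    for _ in range(num_layers):
        # Crossing with the input adds 1; crossing the state with itself doubles.
        order = order * 2 if exponential else order + 1
    return order


for layers in range(1, 5):
    print(layers, max_order(layers, False), max_order(layers, True))
```

Under these assumptions, four layers reach order 5 with linear growth and order 16 with exponential growth.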

Proposed Method

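Embed the multi-field categorical inputs and reshape them into two views; chunking lets the views share the Cross & Masked vectors.
Define Deep Crossing (DCNv3), which grows the cross order exponentially by using, at every layer, a cross vector concatenated with its masked counterpart.
Introduce Self-Mask, Mask(c_l) = c_l ⊙ max(0, LayerNorm(c_l)), to filter noise and halve the number of parameters.
Develop Shallow & Deep Cross Network v3 (SDCNv3), which combines shallow and deep explicit crossing with Self-Mask and a parallel fusion scheme.
Propose the Tri-BCE loss, L_Tri = L + w_D L_D + w_S L_S, with adaptive weights w_D = max(0, L_D − L) and w_S = max(0, L_S − L).
Provide a complexity comparison showing the advantages of the explicit-only DCNv3/SDCNv3 in parameter count and computational cost.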
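
A minimal pure-Python sketch of the Self-Mask formula and of a cross layer that concatenates a cross vector with its masked copy may help make these steps concrete. The Self-Mask expression follows the formula given above. The residual layer form `x_next = x_base * [c || Mask(c)] + x_prev`, the weight shape (D/2 rows, D columns), the LayerNorm without learned scale or shift, and the names `cross_layer` and `run_cross_network` are all assumptions made for illustration, not the authors' implementation.

```python
import math


def layer_norm(v, eps=1e-5):
    """Zero-mean, unit-variance normalization (no learned scale or shift)."""
    mean = sum(v) / len(v)
    var = sum((x - mean) ** 2 for x in v) / len(v)
    return [(x - mean) / math.sqrt(var + eps) for x in v]


def self_mask(c):
    """Mask(c) = c * max(0, LayerNorm(c)), elementwise."""
    return [ci * max(0.0, ni) for ci, ni in zip(c, layer_norm(c))]


def cross_layer(x_base, x_prev, W, b):
    """Assumed layer form: x_next = x_base * [c || Mask(c)] + x_prev.

    W has D/2 rows and D columns, so c has D/2 entries and the
    concatenation [c || Mask(c)] has D entries again.
    """
    c = [sum(w * x for w, x in zip(row, x_prev)) + bi for row, bi in zip(W, b)]
    gate = c + self_mask(c)  # list concatenation, length D
    return [xb * g + xp for xb, g, xp in zip(x_base, gate, x_prev)]


def run_cross_network(x1, layers, exponential):
    """Stack layers; the exponential variant crosses the state with itself."""
    x = x1
    for W, b in layers:
        x = cross_layer(x if exponential else x1, x, W, b)
    return x


x1 = [1.0, 2.0, 3.0, 4.0]
layers = [([[0.1, 0.0, 0.0, 0.0], [0.0, 0.1, 0.0, 0.0]], [0.0, 0.0])] * 2
linear_out = run_cross_network(x1, layers, exponential=False)
exponential_out = run_cross_network(x1, layers, exponential=True)
```

Because the weight matrix maps D inputs to only D/2 outputs and the other half of the gate is derived by masking, this layout uses half the weights of a full D-by-D cross layer, which is one plausible reading of the parameter reduction described above.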

Experimental Results

Research Questions

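RQ1: Do DCNv3 and SDCNv3 outperform other CTR models on large-scale datasets?
RQ2: Are DCNv3 and SDCNv3 more efficient than competing CTR models?
RQ3: Does SDCNv3 provide interpretability and noise-filtering capability?
RQ4: How do different configurations affect model performance and training?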

Key Findings

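SDCNv3 achieves the best performance on all six datasets and ranks first on the PapersWithCode benchmarks for Criteo, KDD12, and KKBox.
DCNv3 shows strong explicit-interaction performance relative to strong baselines, with measurable improvements in Logloss and AUC on Avazu and Criteo.
SDCNv3 reaches state-of-the-art results using explicit feature interactions alone, which highlights the effectiveness of explicit crossing under Tri-BCE supervision.
Tri-BCE provides adaptive supervision signals for the sub-networks, improving training dynamics and final prediction quality.
The model shows a favorable efficiency-complexity trade-off: Self-Mask reduces the parameter count, and the design avoids heavy implicit DNN components.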
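
The adaptive weighting in Tri-BCE can be sketched in a few lines of pure Python. The loss formula follows the definition in the method section (L_Tri = L + w_D L_D + w_S L_S with w_D = max(0, L_D − L) and w_S = max(0, L_S − L)). Taking the final prediction as the mean of the two sub-network predictions is an assumption, as are the function names `bce` and `tri_bce`; how gradients are handled through the weights is not specified in this digest and is not modeled here.

```python
import math


def bce(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy over a batch of labels and probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def tri_bce(y_true, p_deep, p_shallow):
    """L_Tri = L + w_D * L_D + w_S * L_S with adaptive, non-negative weights."""
    # Assumption: the fused prediction is the mean of the two branches.
    p_final = [(d + s) / 2 for d, s in zip(p_deep, p_shallow)]
    loss = bce(y_true, p_final)
    loss_d = bce(y_true, p_deep)
    loss_s = bce(y_true, p_shallow)
    # A branch gets extra supervision only when it is worse than the fusion.
    w_d = max(0.0, loss_d - loss)
    w_s = max(0.0, loss_s - loss)
    return loss + w_d * loss_d + w_s * loss_s


labels = [1, 0]
deep_probs = [0.9, 0.1]      # confident, accurate branch
shallow_probs = [0.6, 0.4]   # weaker branch
print(tri_bce(labels, deep_probs, shallow_probs))
```

In this toy batch the deep branch already beats the fused prediction, so its weight is zero, while the weaker shallow branch receives an additional penalty proportional to how far it lags behind.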

This digest was generated by AI and reviewed by human editors.