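[Paper Explained] AutoDis: Automatic Discretization for Embedding Numerical Features in CTR Prediction
AutoDis is an end-to-end differentiable framework that automatically discretizes numerical features for click-through rate (CTR) prediction. It uses learnable meta-embeddings and a differentiable aggregation mechanism to optimize the discretization rule jointly with the CTR model. By learning discretization and representation together, the method increases model capacity and performance, and it outperforms state-of-the-art (SOTA) methods on both public and industrial datasets.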
Learning sophisticated feature interactions is crucial for Click-Through Rate (CTR) prediction in recommender systems. Various deep CTR models follow an Embedding & Feature Interaction paradigm. The majority focus on designing network architectures in the Feature Interaction module to better model feature interactions, while the Embedding module, serving as a bottleneck between data and the Feature Interaction module, has been overlooked. The common methods for numerical feature embedding are Normalization and Discretization. The former shares a single embedding for intra-field features and the latter transforms the features into categorical form through various discretization approaches. However, the first approach suffers from low capacity and the second one limits performance as well because the discretization rule cannot be optimized with the ultimate goal of the CTR model. To fill the gap of representing numerical features, in this paper, we propose AutoDis, a framework that discretizes features in numerical fields automatically and is optimized with CTR models in an end-to-end manner. Specifically, we introduce a set of meta-embeddings for each numerical field to model the relationship among the intra-field features and propose an automatic differentiable discretization and aggregation approach to capture the correlations between the numerical features and meta-embeddings. Comprehensive experiments on two public and one industrial datasets are conducted to validate the effectiveness of AutoDis over the SOTA methods.
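To make the two baseline approaches named in the abstract concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not code from the paper: the field, the embedding values, the min-max scaling, and the bucket boundaries are all hypothetical.

```python
import bisect

# Hypothetical single embedding shared by every value of one numerical field.
field_embedding = [0.5, -0.2, 0.1, 0.3]


def normalization_embed(x, lo, hi):
    """Normalization: scale x to [0, 1] (min-max scaling is an assumption),
    then multiply the one shared field embedding by the scaled value."""
    x_norm = (x - lo) / (hi - lo)
    return [x_norm * e for e in field_embedding]


# Hypothetical hand-picked boundaries, fixed before training, and one
# embedding per resulting bucket (3 boundaries give 4 buckets).
boundaries = [10.0, 50.0, 100.0]
bucket_embeddings = [
    [0.1, 0.0, 0.0, 0.0],
    [0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0],
    [0.0, 0.0, 0.0, 0.1],
]


def discretization_embed(x):
    """Discretization: turn x into a categorical bucket id, then look up
    that bucket's embedding. The boundaries receive no gradient."""
    bucket = bisect.bisect_right(boundaries, x)
    return bucket_embeddings[bucket]
```

In this sketch every value of the field shares one embedding direction under normalization, while under discretization the nearby values 49.9 and 50.0 fall into different buckets because the boundaries are chosen by hand rather than learned with the model.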
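Motivation and Objectives
- Address the performance bottleneck caused by fixed or non-optimized discretization rules in CTR models.
- Learn the discretization strategy for numerical features jointly with the CTR model, end to end, so that it yields better representations.
- Overcome the failure of conventional embedding modules to exploit relationships among the features within a numerical field.
- Design a differentiable framework that jointly optimizes discretization and feature interaction to improve CTR prediction.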
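Proposed Method
- Introduce a set of meta-embeddings for each numerical field to model relationships among the features within that field.
- Propose a differentiable discretization mechanism that learns binning rules through gradient-based optimization.
- Use a differentiable aggregation layer that maps each numerical feature onto the meta-embeddings according to the learned discretization.
- Enable end-to-end training by backpropagating gradients through the discretization and aggregation components.
- Use a learnable routing mechanism that assigns numerical values to discrete bins in a differentiable way.
- Integrate the whole pipeline into a deep CTR model so that discretization and feature interaction are optimized jointly.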
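The steps above can be sketched as a forward pass for a single numerical field. This is a hedged illustration, not the authors' implementation: the single linear scoring layer, the temperature-scaled softmax, and every name and constant below (`NUM_BUCKETS`, `TEMPERATURE`, `score_weights`, and so on) are assumptions, and the parameters are random rather than trained.

```python
import math
import random

random.seed(0)

NUM_BUCKETS = 4    # number of meta-embeddings for this field (assumed)
EMB_DIM = 3        # embedding dimension (assumed)
TEMPERATURE = 0.5  # softmax temperature (assumed)

# These would be learnable parameters in a real model; random here.
score_weights = [random.uniform(-1, 1) for _ in range(NUM_BUCKETS)]
score_biases = [random.uniform(-1, 1) for _ in range(NUM_BUCKETS)]
meta_embeddings = [
    [random.uniform(-1, 1) for _ in range(EMB_DIM)]
    for _ in range(NUM_BUCKETS)
]


def softmax(logits, temperature):
    """Numerically stable softmax with temperature scaling."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_discretize(x):
    """Route scalar x to the buckets as a probability distribution
    instead of a hard bucket id, so the mapping stays differentiable."""
    logits = [w * x + b for w, b in zip(score_weights, score_biases)]
    return softmax(logits, TEMPERATURE)


def aggregate(probs):
    """Aggregate the meta-embeddings as a probability-weighted sum."""
    return [
        sum(p * emb[d] for p, emb in zip(probs, meta_embeddings))
        for d in range(EMB_DIM)
    ]


def embed_numerical(x):
    """Soft discretization followed by aggregation."""
    return aggregate(soft_discretize(x))


probs = soft_discretize(0.3)
embedding = embed_numerical(0.3)
```

Because every operation here is smooth in `x` and in the parameters, an automatic-differentiation framework could pass the CTR loss gradient back through `aggregate` and `soft_discretize`, which is what lets the bin assignment be trained together with the rest of the model. This sketch runs the forward pass only.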
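Experimental Results
Research Questions
- RQ1: Does automatic, differentiable discretization of numerical features improve CTR prediction performance?
- RQ2: What advantage does end-to-end optimization of the discretization rule have over hand-crafted or fixed binning strategies?
- RQ3: How much do meta-embeddings improve the modeling of relationships among numerical features within a field?
- RQ4: Does the proposed framework generalize well across diverse datasets, including industrial-scale data?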
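Key Findings
- AutoDis achieves state-of-the-art performance on two public CTR prediction datasets and one industrial dataset.
- The model significantly outperforms conventional normalization and fixed-discretization baselines in CTR prediction accuracy.
- Ablation studies confirm that both the meta-embeddings and the differentiable discretization component contribute to the performance gains.
- End-to-end training with differentiable discretization yields better feature representations and stronger model generalization.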
This explanation was generated by AI and reviewed by a human editor.