QUICK REVIEW

[Paper Review] Generative Job Recommendations with Large Language Model

Zhi Zheng, Zhaopeng Qiu|arXiv (Cornell University)|Jul 5, 2023

Radiomics and Machine Learning in Medical Imaging | Cited by 15

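One-Sentence Summary

The paper proposes GIRL, a three-stage training framework that uses an LLM to generate personalized job descriptions (JDs) from a CV, then trains a reward model and applies PPO-based reinforcement learning to align the output with recruiter preferences, and also uses the generated JDs to strengthen conventional discriminative job recommendation.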

ABSTRACT

The rapid development of online recruitment services has encouraged the utilization of recommender systems to streamline the job seeking process. Predominantly, current job recommendations deploy either collaborative filtering or person-job matching strategies. However, these models tend to operate as "black-box" systems and lack the capacity to offer explainable guidance to job seekers. Moreover, conventional matching-based recommendation methods are limited to retrieving and ranking existing jobs in the database, restricting their potential as comprehensive career AI advisors. To this end, here we present GIRL (GeneratIve job Recommendation based on Large language models), a novel approach inspired by recent advancements in the field of Large Language Models (LLMs). We initially employ a Supervised Fine-Tuning (SFT) strategy to instruct the LLM-based generator in crafting suitable Job Descriptions (JDs) based on the Curriculum Vitae (CV) of a job seeker. Moreover, we propose to train a model which can evaluate the matching degree between CVs and JDs as a reward model, and we use Proximal Policy Optimization (PPO)-based Reinforcement Learning (RL) method to further fine-tune the generator. This aligns the generator with recruiter feedback, tailoring the output to better meet employer preferences. In particular, GIRL serves as a job seeker-centric generative model, providing job suggestions without the need of a candidate set. This capability also enhances the performance of existing job recommendation models by supplementing job seeking features with generated content. With extensive experiments on a large-scale real-world dataset, we demonstrate the substantial effectiveness of our approach. We believe that GIRL introduces a paradigm-shifting approach to job recommendation systems, fostering a more personalized and comprehensive job-seeking experience.

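Research Motivation and Objectives

Motivate the need for explainable, generation-based job recommendation that goes beyond conventional discriminative methods.
Propose a three-step training pipeline (SFT, reward modeling, reinforcement learning) that trains an LLM to generate job descriptions from CVs.
Show how the generated JDs serve both as explanations and as a way to improve downstream recommendation performance.
Show that generation-enhanced recommendation can outperform baseline discriminative models in cold-start scenarios, with the effect more pronounced for the Dot predictor.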

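Proposed Method

Formulate generative job recommendation as generating a JD J' from a CV C with an LLM-based generator G.
Step 1: Supervised Fine-Tuning (SFT), which uses matched CV-JD pairs and a prompt template T to teach G to generate suitable JDs.
Step 2: Reward Model Training (RMT), which uses matched and mismatched pairs to train a model U that predicts how well a CV matches a JD from the recruiter's perspective.
Step 3: Reinforcement Learning from Recruiter Feedback (RLRF), which uses PPO to align G with recruiter preferences; the combined reward consists of U(C, J') and a KL-divergence term.
Generation-enhanced recommendation: combine the generated JD with conventional encoders, feeding the embedding of J' as an additional feature into an MLP or dot-product predictor to improve ranking.
Key equations: (1) the SFT loss -log Pr(J | C, T, G), i.e. the negative log-likelihood of the matched JD given the CV and the prompt template; (2) the ranking objective L_rmt = log sigma(U(C, J^+) - U(C, J^-)), which is maximized (equivalently, its negative is minimized) so that matched pairs score higher than mismatched ones; (3)-(8) the PPO-based actor-critic updates, including the KL divergence and advantage estimation.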
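The ranking objective of Step 2 and the KL-regularized reward of Step 3 can be sketched in a few lines. This is a minimal illustration rather than the paper's implementation: the function names, the weight `beta`, and the use of summed per-token log-probability differences as the KL estimate are all assumptions made for the example.

```python
import math
from typing import Sequence


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def ranking_loss(score_pos: float, score_neg: float) -> float:
    """Negative of log sigma(U(C, J+) - U(C, J-)); smaller is better.

    score_pos and score_neg stand for the reward model's outputs
    U(C, J+) and U(C, J-) on a matched and a mismatched JD.
    """
    return -log_sigmoid(score_pos - score_neg)


def kl_shaped_reward(
    match_score: float,
    logp_policy: Sequence[float],
    logp_reference: Sequence[float],
    beta: float = 0.1,  # hypothetical penalty weight, not from the paper
) -> float:
    """Reward U(C, J') minus a KL penalty against the SFT reference model.

    The KL term is estimated (an assumption of this sketch) as the sum of
    per-token log-probability differences on the generated JD.
    """
    kl_estimate = sum(p - r for p, r in zip(logp_policy, logp_reference))
    return match_score - beta * kl_estimate


if __name__ == "__main__":
    # A correctly ordered pair gives a smaller loss than a reversed one.
    print(ranking_loss(2.0, 0.0), ranking_loss(0.0, 2.0))
    # Drifting away from the reference model lowers the reward.
    print(kl_shaped_reward(1.0, [-0.5, -1.0], [-1.0, -2.0]))
```

The KL term keeps the RL-tuned generator close to the SFT model, so a high matching score cannot be bought by drifting into text the reference model finds very unlikely.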

实验结果

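Research Questions

RQ1: Can the LLM-based generator produce high-quality JDs for job seekers?
RQ2: Do the generated JDs improve the performance of discriminative job recommendation models?
RQ3: Are the proposed SFT, reward modeling, and RL training steps effective?
RQ4: How do different generation settings (such as the number of generated JDs) affect performance and cost?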

Key Findings

Model	AUC (↑)	LogLoss (↓)
Base (MLP)	0.6349	0.4043
GIRL-SFT (MLP)	0.6438	0.3973 (+1.7%)
GIRL (MLP)	0.6476	0.3908 (+3.3%)
Base (Dot)	0.6258	0.4964
GIRL-SFT (Dot)	0.6291	0.3688 (+20.3%)
GIRL (Dot)	0.6436	0.3567 (+28.1%)
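The percentages in the LogLoss column appear to be relative reductions against the corresponding Base row; that reading is an assumption, since the table does not define them. Recomputing them from the listed values reproduces +1.7%, +3.3%, and +28.1%, while GIRL-SFT (Dot) comes out near 25.7% rather than the printed 20.3%, so one of the two numbers in that row may have been mis-transcribed. A short check:

```python
def relative_reduction(base: float, value: float) -> float:
    """Relative LogLoss reduction versus the base model, in percent."""
    return 100.0 * (base - value) / base


# LogLoss values copied from the table above.
LOGLOSS = {
    "MLP": {"Base": 0.4043, "GIRL-SFT": 0.3973, "GIRL": 0.3908},
    "Dot": {"Base": 0.4964, "GIRL-SFT": 0.3688, "GIRL": 0.3567},
}

if __name__ == "__main__":
    for predictor, rows in LOGLOSS.items():
        base = rows["Base"]
        for model in ("GIRL-SFT", "GIRL"):
            gain = relative_reduction(base, rows[model])
            print(f"{model} ({predictor}): +{gain:.1f}%")
```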

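GIRL and GIRL-SFT outperform the baselines in generation quality, as judged by ChatGPT on three criteria: level of detail, relevance, and conciseness.
In discriminative recommendation, GIRL (with RL) achieves higher AUC and lower LogLoss with both the MLP and Dot predictors (e.g., an AUC of 0.6476 with MLP and 0.6436 with Dot).
RL-based fine-tuning (GIRL) outperforms SFT alone (GIRL-SFT) in both generation quality and recommendation performance.
Augmentation with generated JDs is especially beneficial under cold-start conditions, and the gains are larger for the Dot-based predictor.
Increasing the number of generated JDs improves performance up to a point, after which computational cost keeps rising while the gains diminish.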
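To make the MLP and Dot predictors in these findings concrete, here is a minimal sketch of scoring a CV-JD pair with generated-JD embeddings as an extra feature. The fusion rule (adding a weighted mean of the generated embeddings to the CV embedding), the weight `alpha`, the one-hidden-layer MLP, and all function names are assumptions for illustration; the summary does not specify the paper's exact architecture.

```python
import math
from typing import List, Sequence

Vector = List[float]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def mean_embedding(vectors: Sequence[Vector]) -> Vector:
    """Average the embeddings of the k generated JDs."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def dot_predictor(cv: Vector, jd: Vector, generated: Sequence[Vector],
                  alpha: float = 0.5) -> float:
    """Dot-product score between an augmented seeker vector and the JD.

    alpha is a hypothetical mixing weight; alpha = 0 ignores the
    generated JDs and falls back to a plain CV-JD dot product.
    """
    gen = mean_embedding(generated)
    seeker = [c + alpha * g for c, g in zip(cv, gen)]
    return sigmoid(sum(s * j for s, j in zip(seeker, jd)))


def mlp_predictor(cv: Vector, jd: Vector, generated: Sequence[Vector],
                  w_hidden: Sequence[Vector], w_out: Vector) -> float:
    """One-hidden-layer MLP over the concatenation [cv; jd; mean(generated)]."""
    features = list(cv) + list(jd) + mean_embedding(generated)
    hidden = [max(0.0, sum(w * x for w, x in zip(row, features)))
              for row in w_hidden]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)))


if __name__ == "__main__":
    cv, jd = [1.0, 0.0], [0.0, 1.0]          # toy embeddings with no overlap
    generated = [[0.0, 2.0], [0.0, 0.0]]     # embeddings of two generated JDs
    print(dot_predictor(cv, jd, generated, alpha=0.0))  # no augmentation
    print(dot_predictor(cv, jd, generated, alpha=0.5))  # with augmentation
```

In this toy case the CV embedding carries no signal about the target JD, and the generated JDs supply it, which is one intuition for why augmentation could matter most in cold-start settings. Averaging over k generated JDs also shows the cost side of the last finding: each additional JD requires another generation pass.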


This review was generated by AI and reviewed by a human editor.