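[Paper Summary] Enhancing Traffic Incident Management with Large Language Models: A Hybrid Machine Learning Approach for Severity Classification
This paper combines large language model (LLM) features with traditional machine learning models to classify traffic incident severity, comparing multiple LLMs and ML algorithms on three international datasets. With LLM features, Random Forest and XGBoost often match or improve on traditional feature engineering.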
This research showcases the innovative integration of Large Language Models into machine learning workflows for traffic incident management, focusing on the classification of incident severity using accident reports. By leveraging features generated by modern language models alongside conventional data extracted from incident reports, our research demonstrates improvements in the accuracy of severity classification across several machine learning algorithms. Our contributions are threefold. First, we present an extensive comparison of various machine learning models paired with multiple large language models for feature extraction, aiming to identify the optimal combinations for accurate incident severity classification. Second, we contrast traditional feature engineering pipelines with those enhanced by language models, showcasing the superiority of language-based feature engineering in processing unstructured text. Third, our study illustrates how merging baseline features from accident reports with language-based features can improve the severity classification accuracy. This comprehensive approach not only advances the field of incident management but also highlights the cross-domain application potential of our methodology, particularly in contexts requiring the prediction of event outcomes from unstructured textual data or features translated into textual representation. Specifically, our novel methodology was applied to three distinct datasets originating from the United States, the United Kingdom, and Queensland, Australia. This cross-continental application underlines the robustness of our approach, suggesting its potential for widespread adoption in improving incident management processes globally.
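Research Motivation and Objectives
- Assess whether LLM-extracted features from full-text incident reports can outperform traditional feature engineering in severity classification.
- Evaluate how combinations of LLMs and machine learning models perform in incident severity prediction.
- Determine whether combining baseline features with NLP-derived features improves prediction accuracy.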
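Proposed Method
- Convert accident reports into a full-text representation that combines column names and values.
- Use various LLMs (e.g., BERT variants, XLNet, RoBERTa, ALBERT) to extract numerical features from the full-text descriptions.
- Train and compare ML models (XGBoost, LightGBM, Random Forest, KNN) on baseline, NLP, and combined feature sets.
- Balance the datasets through equal sampling and remove zero-variance features.
- Evaluate with cross-validation, using F1 score, accuracy, precision, and recall as metrics.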
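The data preparation steps in this list can be illustrated with a minimal standard-library sketch. It is not the authors' code: the function names, the "column: value" text format, and the choice to downsample every class to the size of the smallest one are assumptions made for illustration, since the summary only says that column names and values are combined and that the datasets are balanced by equal sampling.

```python
import random
from collections import defaultdict


def row_to_text(row):
    """Serialize one report as 'column: value' sentences (assumed format)."""
    return ". ".join(f"{col}: {val}" for col, val in row.items()) + "."


def balance_by_equal_sampling(rows, label_key, seed=0):
    """Downsample every class to the size of the smallest class (assumption)."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)
    n = min(len(group) for group in by_label.values())
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    balanced = []
    for label in sorted(by_label, key=str):
        balanced.extend(rng.sample(by_label[label], n))
    return balanced


def drop_zero_variance(matrix):
    """Remove feature columns that hold the same value in every row.

    Returns the filtered matrix and the indices of the columns that were kept.
    """
    if not matrix:
        return matrix, []
    keep = [j for j in range(len(matrix[0])) if len({r[j] for r in matrix}) > 1]
    return [[r[j] for j in keep] for r in matrix], keep
```

In a fuller pipeline, the strings produced by `row_to_text` would be passed to a language model to obtain numerical features, and `drop_zero_variance` would be applied to the resulting feature matrix before training.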
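Experimental Results
Research Questions
- RQ1: Do LLM-derived features improve incident severity classification performance compared with traditional feature engineering?
- RQ2: Which combinations of LLM and ML model give the best severity predictions?
- RQ3: Does combining baseline features with NLP features outperform either feature set alone?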
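The three feature sets compared in these questions can be sketched as follows. The summary does not state how the baseline and NLP features are merged, so row-wise concatenation is an assumption here, and `combine_features` and `build_feature_sets` are hypothetical helper names.

```python
def combine_features(baseline, nlp):
    """Concatenate baseline and NLP feature vectors report by report (assumed merge)."""
    if len(baseline) != len(nlp):
        raise ValueError("both feature sets must describe the same reports")
    return [list(b) + list(n) for b, n in zip(baseline, nlp)]


def build_feature_sets(baseline, nlp):
    """Return the three inputs to compare: baseline only, NLP only, and combined."""
    return {
        "baseline": baseline,
        "nlp": nlp,
        "combined": combine_features(baseline, nlp),
    }
```

Each entry of the returned dictionary would then be used to train and cross-validate the same classifier, so that differences in F1 score can be attributed to the feature set rather than to the model.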
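Key Findings
- When combined with traditional features, LLM features tend to improve or match performance, particularly with tree-based models.
- No clear difference emerged between language models on this task, suggesting that the accident narratives used contain limited discriminative information.
- On the Queensland data, the highest F1 score was 0.65, obtained with combined features (report + NLP) using RandomForest with GPT-2 features.
- On the Queensland data, XGBoost with BERT features achieved a competitive F1 score of 0.56.
- Overall, combining LLM features with baseline features tends to outperform either feature set alone in severity classification.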
This summary was generated by AI and reviewed by a human editor.