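[Paper Review] NEAR REAL-TIME MAP BUILDING WITH MULTI-CLASS IMAGE SET LABELLING AND CLASSIFICATION OF ROAD CONDITIONS USING CONVOLUTIONAL NEURAL NETWORKS
This paper presents a near real-time mapping system that uses convolutional neural networks (CNNs) to classify road conditions in images from traffic cameras across North America. It evaluates six deep learning models: VGG-16, ResNet50, Xception, InceptionResNetV2, EfficientNet-B0, and EfficientNet-B4. EfficientNet-B4 reached a validation accuracy of 90.6% after 6 epochs, while EfficientNet-B0 still achieved 90.3% with half the execution time, which suggests that scalable real-time deployment for mapping dynamic road conditions is feasible.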
Weather is an important factor affecting transportation and road safety. In this paper, we leverage state-of-the-art convolutional neural networks in labelling images taken by street and highway cameras located across North America. Road camera snapshots were used in experiments with multiple deep learning frameworks to classify images by road condition. The training data for these experiments used images labelled as dry, wet, snow/ice, poor, and offline. The experiments tested different configurations of six convolutional neural networks (VGG-16, ResNet50, Xception, InceptionResNetV2, EfficientNet-B0 and EfficientNet-B4) to assess their suitability to this problem. The precision, accuracy, and recall were measured for each framework configuration. In addition, the training sets were varied both in overall size and by size of individual classes. The final training set included 47,000 images labelled using the five aforementioned classes. The EfficientNet-B4 framework was found to be most suitable to this problem, achieving validation accuracy of 90.6%, although EfficientNet-B0 achieved an accuracy of 90.3% with half the execution time. It was observed that VGG-16 with transfer learning proved to be very useful for data acquisition and pseudo-labelling with limited hardware resources throughout this project. The EfficientNet-B4 framework was then placed into a real-time production environment, where images could be classified in real time on an ongoing basis. The classified images were then used to construct a map showing real-time road conditions at various camera locations across North America. The choice of these frameworks and our analysis take into account unique requirements of real-time map building functions. A detailed analysis of the process of semi-automated dataset labelling using these frameworks is also presented in this paper.
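The abstract reports precision, accuracy, and recall for each configuration. As a reminder of how those three metrics relate in a five-class setting, here is a minimal sketch in plain Python. The function name `classification_metrics` and the toy labels are illustrative assumptions, not taken from the paper, and the paper's own evaluation code may differ.

```python
from collections import Counter

# The five classes named in the abstract.
CLASSES = ["dry", "wet", "snow/ice", "poor", "offline"]


def classification_metrics(y_true, y_pred, classes=CLASSES):
    """Return overall accuracy plus per-class precision and recall."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    true_positives = Counter()
    predicted_count = Counter(y_pred)  # how often each class was predicted
    actual_count = Counter(y_true)     # how often each class actually occurs
    for truth, prediction in zip(y_true, y_pred):
        if truth == prediction:
            true_positives[truth] += 1

    accuracy = sum(true_positives.values()) / len(y_true)
    per_class = {}
    for name in classes:
        # Precision: of the images predicted as this class, how many were right.
        precision = (
            true_positives[name] / predicted_count[name]
            if predicted_count[name] else 0.0
        )
        # Recall: of the images truly in this class, how many were found.
        recall = (
            true_positives[name] / actual_count[name]
            if actual_count[name] else 0.0
        )
        per_class[name] = {"precision": precision, "recall": recall}
    return accuracy, per_class


# Toy labels, invented purely for illustration.
y_true = ["dry", "dry", "wet", "snow/ice", "poor", "offline"]
y_pred = ["dry", "wet", "wet", "snow/ice", "poor", "offline"]
accuracy, per_class = classification_metrics(y_true, y_pred)
```

In this toy case one dry image is misread as wet, so recall for "dry" and precision for "wet" both drop to 0.5 while the other classes stay at 1.0. Per-class figures expose this kind of confusion, which a single accuracy number can hide when the classes are imbalanced.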
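Research Motivation and Objectives
- Develop a scalable, near real-time system that maps road conditions across North America using publicly available traffic camera images.
- Evaluate several state-of-the-art CNN architectures on multi-class road condition classification (dry, wet, snow/ice, poor, offline).
- Optimise model selection for real-time deployment based on accuracy, inference speed, and hardware efficiency.
- Enable semi-automated dataset labelling under limited computing resources through transfer learning and pseudo-labelling.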
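Proposed Method
- Train six deep CNN architectures (VGG-16, ResNet50, Xception, InceptionResNetV2, EfficientNet-B0, and EfficientNet-B4) on a dataset of 47,000 road camera snapshots.
- Apply data augmentation and transfer learning, in particular using VGG-16 for initial pseudo-labelling when computing resources are constrained.
- Use a rectified Adam optimizer with a base initial learning rate of 0.0001 and a categorical cross-entropy loss for the multi-class task.
- Use five-fold cross-validation with a 90/10 split between training and validation sets: 42,606 images for training and 4,736 for validation.
- Deploy the best-performing model, EfficientNet-B4, in a real-time processing pipeline that classifies incoming images and streams the results to a geographic map visualisation system.
- Store the classification output in CSV files and a PostgreSQL database for integration with the real-time mapping system.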
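The pseudo-labelling and CSV storage steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The 0.9 confidence threshold, the `cam_00x.jpg` file names, the CSV column names, and the helper names `pseudo_label` and `write_labels_csv` are all hypothetical, and the probability vectors stand in for the softmax output of a classifier such as VGG-16.

```python
import csv
import io

# The five classes used in the paper, in an assumed output order.
CLASSES = ["dry", "wet", "snow/ice", "poor", "offline"]


def pseudo_label(probabilities, threshold=0.9):
    """Return (label, confidence) if the top class clears the threshold, else None.

    The threshold value is an assumption for illustration.
    """
    if len(probabilities) != len(CLASSES):
        raise ValueError("expected one probability per class")
    best = max(range(len(CLASSES)), key=lambda i: probabilities[i])
    confidence = probabilities[best]
    if confidence >= threshold:
        return CLASSES[best], confidence
    return None


def write_labels_csv(predictions, threshold=0.9):
    """Write confident predictions as CSV text.

    Returns the CSV text and the list of image ids left for manual review.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["image_id", "label", "confidence"])  # hypothetical columns
    needs_review = []
    for image_id, probabilities in predictions:
        result = pseudo_label(probabilities, threshold)
        if result is None:
            needs_review.append(image_id)
        else:
            label, confidence = result
            writer.writerow([image_id, label, f"{confidence:.2f}"])
    return buffer.getvalue(), needs_review


# Invented classifier outputs: two confident predictions and one ambiguous one.
predictions = [
    ("cam_001.jpg", [0.95, 0.03, 0.01, 0.01, 0.00]),
    ("cam_002.jpg", [0.40, 0.35, 0.15, 0.05, 0.05]),
    ("cam_003.jpg", [0.02, 0.02, 0.94, 0.01, 0.01]),
]
csv_text, needs_review = write_labels_csv(predictions)
```

Only the predictions that clear the threshold are accepted automatically, and the ambiguous image is routed to a human. That division of work is what makes the labelling semi-automated. In a deployed pipeline the same rows could be inserted into a PostgreSQL table instead of a CSV buffer.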
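Experimental Results
Research Questions
- RQ1: Which deep learning architecture achieves the highest accuracy when classifying road conditions from traffic camera images?
- RQ2: How do model inference speed and parameter count affect real-time deployment in a large-scale mapping system?
- RQ3: To what extent can transfer learning and pseudo-labelling reduce manual labelling costs in road condition classification?
- RQ4: How do dataset size and class imbalance affect model generalisation and performance?
- RQ5: Can publicly available traffic camera feeds be used to build a unified, cross-jurisdictional real-time road condition monitoring system?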
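Key Findings
- EfficientNet-B4 reached the highest validation accuracy, 90.6%, after 6 epochs, outperforming the other models on classification.
- EfficientNet-B0 achieved 90.3% accuracy with an inference time of only 600 ms, a favourable balance between speed and accuracy.
- With limited hardware resources, VGG-16 with transfer learning proved highly effective for initial data acquisition and pseudo-labelling.
- Model performance kept improving as the training set grew larger and more diverse, which indicates a clear benefit from scaling the data.
- The Xception, InceptionResNetV2, and EfficientNet frameworks performed well when sufficient hardware resources were available.
- The final system generated a real-time map of North American road conditions from 782 classified images, demonstrating that the end-to-end pipeline is feasible.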
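The speed and accuracy trade-off in these findings can be framed as a simple selection rule: pick the most accurate model that fits a latency budget. The sketch below uses only the two accuracy figures and the "half the execution time" relationship reported above. The `Candidate` class, the `pick_model` helper, and the budget values are illustrative assumptions, not part of the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    val_accuracy: float   # validation accuracy reported in the summary
    relative_time: float  # execution time relative to EfficientNet-B4


CANDIDATES = [
    Candidate("EfficientNet-B4", 0.906, 1.0),
    Candidate("EfficientNet-B0", 0.903, 0.5),  # "half the execution time"
]


def pick_model(candidates, max_relative_time):
    """Return the most accurate candidate within the time budget, or None."""
    eligible = [c for c in candidates if c.relative_time <= max_relative_time]
    if not eligible:
        return None
    return max(eligible, key=lambda c: c.val_accuracy)


# With no effective time constraint the more accurate model is selected;
# with a tighter (hypothetical) budget the faster model is selected.
unconstrained = pick_model(CANDIDATES, max_relative_time=1.0)
tight_budget = pick_model(CANDIDATES, max_relative_time=0.6)
```

Because the two models differ by only 0.3 percentage points of validation accuracy, a deployment that must process many camera feeds per refresh cycle could reasonably prefer EfficientNet-B0, while the paper's production system used EfficientNet-B4.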
This review was generated by AI and checked by a human editor.