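[Paper Summary] FPCNet: Fast Pavement Crack Detection Network Based on Encoder-Decoder Architecture
FPCNet adds Multi-Dilation (MD) and SE-Upsampling (SEU) modules to an encoder-decoder framework for pixel-level pavement crack detection. The modules supply multi-context features and adaptive feature weighting, and the network reaches state-of-the-art accuracy and speed.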
Timely, accurate and automatic detection of pavement cracks is necessary for making cost-effective decisions concerning road maintenance. Conventional crack detection algorithms focus on the design of single or multiple crack features and classifiers. However, complicated topological structures, varying degrees of damage and oil stains make the design of crack features difficult. In addition, the contextual information around a crack is not investigated extensively in the design process. Accordingly, these design features have limited discriminative adaptability and cannot fuse effectively with the classifiers. To solve these problems, this paper proposes a deep learning network for pavement crack detection. Using the Encoder-Decoder structure, crack characteristics with multiple contexts are automatically learned, and end-to-end crack detection is achieved. Specifically, we first propose the Multi-Dilation (MD) module, which can synthesize the crack features of multiple context sizes via dilated convolution with multiple rates. The crack MD features obtained in this module can describe cracks of different widths and topologies. Next, we propose the SE-Upsampling (SEU) module, which uses the Squeeze-and-Excitation learning operation to optimize the MD features. Finally, the above two modules are integrated to develop the fast crack detection network, namely, FPCNet. This network continuously optimizes the MD features step-by-step to realize fast pixel-level crack detection. Experiments are conducted on challenging public CFD datasets and G45 crack datasets involving various crack types under different shooting conditions. The distinct performance and speed improvements over all the datasets demonstrate that the proposed method outperforms other state-of-the-art crack detection methods.
Research Motivation and Objectives
- Motivate accurate, automatic pavement crack detection for maintenance decision-making under diverse crack topologies and imaging conditions.
- Develop an end-to-end deep learning network that leverages contextual crack information across multiple scales.
- Address limitations of single-context FCNs by enabling multi-context feature learning and adaptive fusion.
Proposed Method
- Introduce the Multi-Dilation (MD) module that fuses crack features from multiple context sizes using dilated convolutions with rates {1,2,3,4} and a global pooling path.
- Develop the SE-Upsampling (SEU) module to fuse upsampled MD features with encoder features via addition and channel-wise weighting through Squeeze-and-Excitation.
- Embed the MD and SEU modules in an Encoder-Decoder architecture to form FPCNet for end-to-end pixel-level crack detection.
- Use a 1x1 convolution followed by a sigmoid to produce per-pixel crack probability maps.
- Train with a binary cross-entropy plus Dice loss, using data augmentation and SGD with momentum for robust learning.
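The training objective in the last bullet can be sketched in plain Python. This is a minimal illustration of a binary cross-entropy plus soft Dice loss over flattened pixel lists, not the authors' implementation. The function name `bce_dice_loss`, the equal weighting of the two terms, and the `eps` and `smooth` constants are assumptions chosen for illustration.

```python
import math


def bce_dice_loss(probs, targets, dice_weight=1.0, eps=1e-7, smooth=1.0):
    """Binary cross-entropy plus soft Dice loss over flattened pixels.

    probs   : predicted crack probabilities in [0, 1] (sigmoid outputs)
    targets : ground-truth labels, 1 for crack and 0 for background
    dice_weight, eps and smooth are illustrative defaults (assumptions),
    not values taken from the paper.
    """
    if not probs or len(probs) != len(targets):
        raise ValueError("probs and targets must be non-empty and equal in length")

    # Mean binary cross-entropy; probabilities are clamped to avoid log(0).
    bce = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        bce -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    bce /= len(probs)

    # Soft Dice coefficient: overlap between prediction and ground truth.
    intersection = sum(p * t for p, t in zip(probs, targets))
    dice = (2.0 * intersection + smooth) / (sum(probs) + sum(targets) + smooth)

    return bce + dice_weight * (1.0 - dice)


if __name__ == "__main__":
    labels = [1, 0, 1, 0]
    print(bce_dice_loss([0.9, 0.1, 0.8, 0.2], labels))  # confident and correct
    print(bce_dice_loss([0.5, 0.5, 0.5, 0.5], labels))  # uninformative
```

The cross-entropy term scores each pixel independently, while the Dice term measures overlap with the crack region as a whole. That combination is often used when foreground pixels are rare, as is typical for thin cracks.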
Experimental Results
Research Questions
- RQ1: Can a multi-context feature representation improve crack detection across varying crack widths and topologies?
- RQ2: Does adaptive weighting of crack features via SE-Learning improve pixel-level segmentation over simple feature concatenation?
- RQ3: Is end-to-end FPCNet faster and more accurate than existing FCN-based crack detectors across public datasets?
- RQ4: How does FPCNet perform on challenging crack types and imaging conditions (oil stains, shadows, low contrast) compared to state-of-the-art methods?
Key Findings
| Method | Tolerance margin (pixels) | Precision | Recall | F1 score |
|---|---|---|---|---|
| CrackForest [28] | 5 | 82.28% | 89.44% | 85.71% |
| MFCD [36] | 5 | 89.90% | 89.47% | 88.04% |
| Method [37] | 2 | 90.70% | 84.60% | 87.00% |
| Method [18] | 2 | 91.19% | 94.81% | 92.44% |
| FCN [21] | 2 | 97.29% | 94.56% | 95.90% |
| FPCNet | 2 | 97.48% | 96.39% | 96.93% |
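The second column of the table gives a tolerance margin in pixels, so a predicted crack pixel does not have to coincide exactly with an annotated one to count as correct. The summary does not spell out the matching rule, so the sketch below shows one common convention as an assumption: a pixel is matched if a pixel of the other set lies within a square window of the given margin. The function names are hypothetical.

```python
def within_margin(pixel, others, margin):
    """True if any pixel in `others` lies within `margin` rows and columns."""
    row, col = pixel
    return any(
        (row + dr, col + dc) in others
        for dr in range(-margin, margin + 1)
        for dc in range(-margin, margin + 1)
    )


def tolerant_prf(pred, truth, margin):
    """Precision, recall and F1 for crack pixels with a tolerance margin.

    pred, truth : iterables of (row, col) crack pixel coordinates
    margin      : tolerance in pixels (square window, an assumed convention)
    """
    pred, truth = set(pred), set(truth)
    matched_pred = sum(within_margin(p, truth, margin) for p in pred)
    matched_truth = sum(within_margin(t, pred, margin) for t in truth)

    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_truth / len(truth) if truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Annotated crack: a vertical line in column 5.
    truth = [(r, 5) for r in range(10)]
    # Prediction: the same line shifted by one pixel, plus one stray pixel.
    pred = [(r, 6) for r in range(10)] + [(0, 20)]
    print(tolerant_prf(pred, truth, margin=0))
    print(tolerant_prf(pred, truth, margin=2))
```

With a margin of 0 the shifted line scores zero, while a margin of 2 accepts it and penalizes only the stray pixel. For the same reason, rows evaluated at different margins are not strictly comparable.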
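- On the CFD dataset, FPCNet achieves an F1 score of 96.93%, which is 4.49 percentage points above Method [18] (92.44%) and 1.03 points above FCN [21] (95.90%), the strongest baseline in the table.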
- FPCNet attains higher precision and recall than FCN and several state-of-the-art methods on CFD (Precision 97.48%, Recall 96.39%).
- FPCNet runs at 14.7 FPS (67.9 ms per image), about 5.7x faster than the compared Method [18] under similar hardware.
- SEU fusion reduces redundant feature usage and emphasizes crack-relevant cues via channel-wise weights, improving robustness to noise and varying crack widths.
- On the G45 dataset, FPCNet delivers strong per-type F1 scores: Transverse 97.51%, Longitudinal 95.76%, Block 90.71%, Alligator 94.75%.
- MD module enables multi-context learning with low additional cost, maintaining speed while expanding context without larger filters.
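The point about expanding context without larger filters follows from the standard formula for the spatial extent of a dilated kernel, k + (k - 1)(d - 1). The sketch below evaluates it for the dilation rates {1,2,3,4} listed for the MD module. The 3x3 kernel size is an assumption, since this summary does not state the kernel size used in the paper.

```python
def effective_kernel_size(kernel_size, dilation):
    """Spatial extent covered by one dilated convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


if __name__ == "__main__":
    kernel_size = 3  # assumed; not stated in this summary
    for rate in (1, 2, 3, 4):  # dilation rates listed for the MD module
        extent = effective_kernel_size(kernel_size, rate)
        weights = kernel_size * kernel_size
        print(f"rate {rate}: covers {extent}x{extent} pixels with {weights} weights")
```

Under that assumption the four branches cover 3x3, 5x5, 7x7 and 9x9 neighborhoods while each kernel still holds only nine weights per channel pair, which is why the added context costs little.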
This summary was generated by AI and reviewed by a human editor.