QUICK REVIEW

[Paper Review] Reading Car License Plates Using Deep Convolutional Neural Networks and LSTMs

Hui Li, Chunhua Shen|arXiv (Cornell University)|Jan 21, 2016

Vehicle License Plate Recognition | References: 18 | Cited by: 177

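One-Line Summary

This paper proposes a cascaded CNN-based system for detecting license plates in natural scenes, along with two recognition approaches: a segmentation-based CNN and a segmentation-free sequence labelling method using BRNN-LSTM-CTC. The segmentation-free method outperforms the segmentation-based baseline and achieves state-of-the-art recognition accuracy.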

ABSTRACT

In this work, we tackle the problem of car license plate detection and recognition in natural scene images. Inspired by the success of deep neural networks (DNNs) in various vision applications, here we leverage DNNs to learn high-level features in a cascade framework, which lead to improved performance on both detection and recognition. Firstly, we train a 37-class convolutional neural network (CNN) to detect all characters in an image, which results in a high recall, compared with conventional approaches such as training a binary text/non-text classifier. False positives are then eliminated by the second plate/non-plate CNN classifier. Bounding box refinement is then carried out based on the edge information of the license plates, in order to improve the intersection-over-union (IoU) ratio. The proposed cascade framework extracts license plates effectively with both high recall and precision. Last, we propose to recognize the license characters as a sequence labelling problem. A recurrent neural network (RNN) with long short-term memory (LSTM) is trained to recognize the sequential features extracted from the whole license plate via CNNs. The main advantage of this approach is that it is segmentation free. By exploring context information and avoiding errors caused by segmentation, the RNN method performs better than a baseline method of combining segmentation and deep CNN classification; and achieves state-of-the-art recognition accuracy.

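Motivation and Objectives

Develop a cascaded CNN framework that detects license plates in complex natural images with high recall and precision.
Improve recognition by proposing both a segmentation-based CNN method and a segmentation-free sequence labelling approach using BRNN-LSTM-CTC.
Improve robustness through data augmentation and by incorporating Local Binary Pattern (LBP) features into the CNN input.
Refine license plate bounding boxes using edge-based features to improve IoU and reduce false detections.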

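Proposed Method

Train a 4-layer, 37-class character CNN to generate text saliency maps for detection at multiple scales.
Use a separate 4-layer plate/non-plate CNN to remove false positives and verify license plates.
Refine bounding boxes with edge-based projection to tighten plate localization.
Develop a 9-layer CNN with LBP-augmented input to recognize characters in a conventional segmentation-based, OCR-style pipeline.
Propose a segmentation-free recognition method that concatenates multi-level CNN features into a sequence of 256-D vectors and decodes the plate string using a BRNN with LSTM and CTC.
Develop a sequence labelling pipeline in which the pre-trained 9-layer CNN extracts sliding-window features across the whole plate, followed by BRNN-LSTM-CTC decoding.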
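
The final decoding step of the segmentation-free pipeline can be illustrated with a minimal sketch of greedy (best-path) CTC decoding. This is not the authors' code: the alphabet ordering, the choice of index 36 as the blank label, and the function name `ctc_greedy_decode` are assumptions made for illustration, and the review does not state which CTC decoding variant the paper uses. The input stands in for the per-window scores that the BRNN-LSTM would emit.

```python
import numpy as np

# Assumed label set: 26 letters + 10 digits, plus one CTC blank (37 outputs).
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
BLANK = len(ALPHABET)  # index 36 is reserved for the blank (assumption)


def ctc_greedy_decode(probs: np.ndarray) -> str:
    """Decode a (T, 37) array of per-window scores into a plate string.

    Best-path decoding: take the top label at each time step, merge
    consecutive repeats, then drop blanks.
    """
    best = probs.argmax(axis=1)
    chars = []
    prev = BLANK
    for idx in best:
        # Emit a character only when the label changes and is not blank.
        if idx != prev and idx != BLANK:
            chars.append(ALPHABET[idx])
        prev = idx
    return "".join(chars)


if __name__ == "__main__":
    # Toy path: A A blank A B blank blank 1 (one-hot scores per window).
    path = [0, 0, 36, 0, 1, 36, 36, 27]
    print(ctc_greedy_decode(np.eye(37)[path]))
```

The blank label is what lets the decoder distinguish a genuinely repeated character (two "A" labels separated by a blank) from one character spanning several sliding windows.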

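Experimental Results

Research Questions

RQ1: Can a multi-stage CNN cascade detect license plates in cluttered natural scenes with high recall and precision?
RQ2: Does the segmentation-free sequence labelling method using BRNN-LSTM-CTC outperform segmentation-based recognition on license plates?
RQ3: Does incorporating Local Binary Pattern (LBP) features into the CNN input improve recognition accuracy?
RQ4: How effective is edge-based bounding box refinement at improving IoU and reducing false detections?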
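
Since RQ4 is phrased in terms of IoU, a short sketch of the standard intersection-over-union computation for axis-aligned boxes may help. The `(x1, y1, x2, y2)` box format and the function name `iou` are assumptions for illustration; the review does not say how the paper represents boxes.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2).

    Assumes x1 < x2 and y1 < y2 for each box.
    """
    # Corners of the overlapping rectangle, if any.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    detected = (0, 0, 2, 2)
    ground_truth = (1, 1, 3, 3)
    print(iou(detected, ground_truth))
```

A tighter box around the plate raises this ratio, which is why refinement that trims background from a loose detection improves IoU.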

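Key Findings

The cascade framework achieves high recall and precision for license plate detection in natural environments.
The 9-layer CNN recognizer with LBP features improves character classification performance.
The segmentation-free BRNN-LSTM-CTC approach reads the whole plate without character segmentation and achieves state-of-the-art recognition accuracy.
Bounding box refinement using edge information improves localization quality (IoU) and reduces falsely detected plate crops.
The experiments evaluate both the segmentation-based and the segmentation-free recognition approaches and indicate that the two are complementary in improving overall license plate detection and recognition (LPDR) performance.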
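
For readers unfamiliar with the LBP features mentioned above, here is a minimal sketch of the basic 3x3 Local Binary Pattern operator. The review does not specify which LBP variant the paper uses or how the LBP map is combined with the image, so the neighbour ordering, the `>=` comparison, and the function name `lbp_3x3` are assumptions.

```python
import numpy as np


def lbp_3x3(gray: np.ndarray) -> np.ndarray:
    """Basic 8-neighbour LBP codes for the interior pixels of a 2-D image.

    Each neighbour that is >= the centre pixel sets one bit of an 8-bit code.
    The output is 2 pixels smaller than the input in each dimension.
    """
    g = gray.astype(np.int32)
    h, w = g.shape
    center = g[1:-1, 1:-1]
    # Neighbour offsets, clockwise from top-left (ordering is an assumption).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = g[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        code |= (neighbor >= center).astype(np.int32) << bit
    return code.astype(np.uint8)


if __name__ == "__main__":
    patch = np.array([[9, 0, 0],
                      [0, 5, 0],
                      [0, 0, 0]])
    print(lbp_3x3(patch))  # only the top-left neighbour exceeds the centre
```

One plausible way to use such a map, again an assumption here, is to stack it with the grayscale character image as an extra input channel, since LBP codes depend on local intensity ordering and are therefore less sensitive to illumination changes.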


This review was generated by AI and checked by a human editor.