QUICK REVIEW

[Paper Review] Segmentation-free Vehicle License Plate Recognition using ConvNet-RNN

Teik Koon Cheang, Yong Shean Chong|arXiv (Cornell University)|Jan 23, 2017

Vehicle License Plate Recognition|References: 12|Citations: 51

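One-line summary

A segmentation-free VLPR approach that uses a ConvNet for feature extraction and an RNN for sequence modeling, processes whole license plate images end to end, and outperforms a sliding window method.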

ABSTRACT

While vehicle license plate recognition (VLPR) is usually done with a sliding window approach, it can have limited performance on datasets with characters that are of variable width. This can be solved by hand-crafting algorithms to prescale the characters. While this approach can work fairly well, the recognizer is only aware of the pixels within each detector window, and fails to account for other contextual information that might be present in other parts of the image. A sliding window approach also requires training data in the form of presegmented characters, which can be more difficult to obtain. In this paper, we propose a unified ConvNet-RNN model to recognize real-world captured license plate photographs. By using a Convolutional Neural Network (ConvNet) to perform feature extraction and using a Recurrent Neural Network (RNN) for sequencing, we address the problem of sliding window approaches being unable to access the context of the entire image by feeding the entire image as input to the ConvNet. This has the added benefit of being able to perform end-to-end training of the entire model on labelled, full license plate images. Experimental results comparing the ConvNet-RNN architecture to a sliding window-based approach shows that the ConvNet-RNN architecture performs significantly better.

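Motivation and objectives

Motivated by VLPR on real-world datasets in which character widths vary.
Overcome the limitations of sliding window methods, which depend on presegmented characters.
Propose an end-to-end ConvNet-RNN architecture that takes the entire image as input for recognition.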
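
The variable-width problem can be made concrete with a small sketch. The plate layout, pixel widths, window width, and stride below are all made-up values for illustration (none come from the paper); the helper names `char_spans` and `window_purity` are hypothetical. The sketch lays characters of different widths side by side and reports, for each fixed-width window, which character dominates it and how many characters it overlaps.

```python
# Hypothetical plate layout: (character, width in pixels). The widths are
# invented for illustration; narrow glyphs take less space than wide ones.
PLATE = [("W", 14), ("1", 6), ("M", 14), ("I", 5), ("8", 10)]

def char_spans(plate):
    """Return (char, start, end) pixel spans, laid out left to right."""
    spans, x = [], 0
    for ch, width in plate:
        spans.append((ch, x, x + width))
        x += width
    return spans

def window_purity(spans, window_width, stride):
    """For each fixed-width window, return a tuple of
    (window start, dominant char, fraction of window it covers, chars overlapped)."""
    total_width = spans[-1][2]
    report = []
    for start in range(0, total_width - window_width + 1, stride):
        end = start + window_width
        overlaps = {}  # character index -> pixels of that character in the window
        for idx, (_, s, e) in enumerate(spans):
            covered = max(0, min(end, e) - max(start, s))
            if covered:
                overlaps[idx] = covered
        best = max(overlaps, key=overlaps.get)
        report.append((start, spans[best][0], overlaps[best] / window_width, len(overlaps)))
    return report

report = window_purity(char_spans(PLATE), window_width=10, stride=5)
for start, dominant, purity, n_chars in report:
    print(f"window@{start:2d}: dominant={dominant} purity={purity:.2f} chars={n_chars}")
```

With these invented widths, only two of the eight windows contain a single character; the rest straddle two or three. That is the situation in which hand-crafted prescaling, or a model that sees the whole plate, becomes attractive.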

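Proposed method

Uses a convolutional neural network (ConvNet) to extract features from the full license plate image.
Applies a recurrent neural network (RNN) to perform sequence modeling on the extracted features.
Enables end-to-end training of the whole ConvNet-RNN model on labelled, full license plate images.
Does not depend on presegmented characters or hand-crafted prescaling.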
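
The data flow described above can be sketched in a few lines of plain Python. This is a toy forward pass under stated assumptions, not the authors' architecture: the review does not say how features are handed to the RNN, so the sketch assumes one plausible wiring (a single whole-image feature vector fed to a simple Elman RNN at every step). The alphabet, `MAX_LEN`, layer sizes, and all function names are hypothetical, and the weights are random and untrained.

```python
import math
import random

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"  # assumed character set
MAX_LEN = 7  # hypothetical maximum plate length

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def conv_features(image, kernels):
    """Toy ConvNet: 3x3 valid convolution + ReLU per kernel, then global
    average pooling. Every output feature depends on the whole plate image."""
    h, w = len(image), len(image[0])
    feats = []
    for k in kernels:
        total = 0.0
        for y in range(h - 2):
            for x in range(w - 2):
                act = sum(k[dy][dx] * image[y + dy][x + dx]
                          for dy in range(3) for dx in range(3))
                total += max(0.0, act)
        feats.append(total / ((h - 2) * (w - 2)))
    return feats

def rnn_decode(feats, w_in, w_rec, w_out, steps):
    """Toy Elman RNN: the same image features drive every step, and each
    step emits one character by taking the argmax over the alphabet."""
    hidden = [0.0] * len(w_rec)
    drive = matvec(w_in, feats)  # input contribution, identical at each step
    chars = []
    for _ in range(steps):
        recur = matvec(w_rec, hidden)
        hidden = [math.tanh(a + b) for a, b in zip(drive, recur)]
        logits = matvec(w_out, hidden)
        chars.append(ALPHABET[logits.index(max(logits))])
    return "".join(chars)

def init_params(rng, n_kernels=4, hidden=8):
    return {
        "kernels": [rand_matrix(3, 3, rng) for _ in range(n_kernels)],
        "w_in": rand_matrix(hidden, n_kernels, rng),
        "w_rec": rand_matrix(hidden, hidden, rng),
        "w_out": rand_matrix(len(ALPHABET), hidden, rng),
    }

def recognize(image, params):
    """Whole image in, character string out; no character segmentation step."""
    feats = conv_features(image, params["kernels"])
    return rnn_decode(feats, params["w_in"], params["w_rec"], params["w_out"], MAX_LEN)

rng = random.Random(0)
image = [[rng.random() for _ in range(32)] for _ in range(12)]  # stand-in for a grayscale plate crop
print(recognize(image, init_params(rng)))  # untrained weights, so the string is meaningless
```

The point of the sketch is the interface: training pairs would be (full plate image, label string), with no per-character boxes. A real model would also need a way to handle plates shorter than `MAX_LEN`, for example an end-of-sequence or blank symbol, which is omitted here.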

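Experimental results

Research questions

RQ1: Can ConvNet-RNN recognize license plate characters without presegmented components?
RQ2: Does processing the whole image with its context improve recognition over sliding window detection?
RQ3: Is end-to-end training of segmentation-free VLPR feasible on real-world plate photographs?

Key findings

The ConvNet-RNN architecture processes full license plate images and enables end-to-end training.
On the evaluated data, ConvNet-RNN performs significantly better than the sliding window approach.
Using contextual information from the whole image improves recognition results over conventional methods.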
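
This review does not state which metric underlies the comparison. As a hedged illustration only, the sketch below shows two measures commonly used for sequence recognizers: exact-match plate accuracy and position-wise character accuracy. The function names are hypothetical, and the strings are made up; they are not results from the paper.

```python
def plate_accuracy(predictions, targets):
    """Fraction of plates whose predicted string matches the label exactly."""
    return sum(p == t for p, t in zip(predictions, targets)) / len(targets)

def char_accuracy(predictions, targets):
    """Fraction of label characters predicted correctly at the same position.
    Missing characters count as errors; extra predicted characters are ignored."""
    correct = total = 0
    for p, t in zip(predictions, targets):
        total += len(t)
        correct += sum(a == b for a, b in zip(p, t))
    return correct / total

# Made-up strings for illustration only.
targets = ["WMX1234", "BKL88", "PJA7012"]
predictions = ["WMX1234", "BKL8B", "PJA701"]

print(f"plate accuracy: {plate_accuracy(predictions, targets):.3f}")
print(f"char accuracy:  {char_accuracy(predictions, targets):.3f}")
```

The two numbers can diverge sharply: a single wrong character fails the whole plate under exact match, so a segmentation-free model is usually judged most strictly by the plate-level figure.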


This review was generated by AI and checked by a human editor.