QUICK REVIEW

[Paper Review] End to End Learning for Self-Driving Cars

Mariusz Bojarski, Davide Testa | arXiv (Cornell University) | Apr 25, 2016

Advanced Neural Network Applications | References: 5 | Citations: 3,101

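One-Line Summary

A CNN maps raw pixels from a single front-facing camera directly to steering commands. Trained on a minimal amount of human driving data, it steers a car on diverse roads both in simulation and in on-road tests.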

ABSTRACT

We trained a convolutional neural network (CNN) to map raw pixels from a single front-facing camera directly to steering commands. This end-to-end approach proved surprisingly powerful. With minimum training data from humans the system learns to drive in traffic on local roads with or without lane markings and on highways. It also operates in areas with unclear visual guidance such as in parking lots and on unpaved roads. The system automatically learns internal representations of the necessary processing steps such as detecting useful road features with only the human steering angle as the training signal. We never explicitly trained it to detect, for example, the outline of roads. Compared to explicit decomposition of the problem, such as lane marking detection, path planning, and control, our end-to-end system optimizes all processing steps simultaneously. We argue that this will eventually lead to better performance and smaller systems. Better performance will result because the internal components self-optimize to maximize overall system performance, instead of optimizing human-selected intermediate criteria, e.g., lane detection. Such criteria understandably are selected for ease of human interpretation which doesn't automatically guarantee maximum system performance. Smaller networks are possible because the system learns to solve the problem with the minimal number of processing steps. We used an NVIDIA DevBox and Torch 7 for training and an NVIDIA DRIVE(TM) PX self-driving car computer also running Torch 7 for determining where to drive. The system operates at 30 frames per second (FPS).

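Motivation and Objectives

Demonstrate that end-to-end learning can control a vehicle from raw image input alone, without hand-crafted features.
Show that a CNN can learn internal road representations and a driving policy from limited labeled data.
Evaluate performance across diverse driving scenarios, including highways, local roads, and unpaved roads.
Assess the feasibility of data augmentation and simulation for improving robustness before on-road testing.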

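Proposed Method

Train a 9-layer CNN that maps YUV image input from a single center camera to an inverse turning radius (1/r) output.
Use as the loss the mean squared error between the network output and the human driver's steering command (or the adjusted steering command for off-center and rotated images).
Augment the training data with artificial shifts and rotations to teach the network to recover from deviations.
Collect data across diverse roads, lighting, and weather conditions, covering highways and local roads with and without lane markings.
Validate in two stages: simulation using pre-recorded video, followed by on-road tests using the DRIVE PX in-vehicle computer.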
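
The labeling, augmentation, and loss steps above can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the function names, the list-of-rows image format, the sign convention (negative for left turns, positive for right turns), and especially the `gain_per_px` label correction are assumptions made for illustration, and rotation augmentation is omitted.

```python
import math


def inverse_turning_radius(radius_m):
    """Convert a signed turning radius in metres to a 1/r steering label.

    Driving straight corresponds to an infinite radius, which maps to 0.0,
    so the label has no singularity. Assumed sign convention:
    negative = left turn, positive = right turn.
    """
    if radius_m == 0:
        raise ValueError("turning radius must be non-zero")
    return 0.0 if math.isinf(radius_m) else 1.0 / radius_m


def shift_image(image, shift_px, fill=0):
    """Shift every row of a 2-D image (list of rows) horizontally.

    Positive shift_px moves content to the right, negative to the left.
    Vacated pixels are filled with `fill`.
    """
    width = len(image[0])
    shift_px = max(-width, min(width, shift_px))  # clamp to the image width
    shifted = []
    for row in image:
        if shift_px >= 0:
            shifted.append([fill] * shift_px + row[: width - shift_px])
        else:
            shifted.append(row[-shift_px:] + [fill] * (-shift_px))
    return shifted


def adjusted_label(label, shift_px, gain_per_px=0.001):
    """Hypothetical label correction for a shifted image.

    Content shifted to the right looks like a car displaced to the left,
    so the label is nudged toward a right turn. The linear gain is an
    illustrative assumption, not a value from the paper.
    """
    return label + gain_per_px * shift_px


def mean_squared_error(predictions, targets):
    """Mean squared error between predicted and target steering labels."""
    if not predictions or len(predictions) != len(targets):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(predictions)


# Usage: build one augmented training pair and score a dummy prediction.
frame = [[10, 20, 30, 40], [50, 60, 70, 80]]
label = inverse_turning_radius(math.inf)  # driving straight
aug_frame = shift_image(frame, 1)
aug_label = adjusted_label(label, 1)
loss = mean_squared_error([0.0, 0.0], [label, aug_label])
```

In a real pipeline the shift would come from a viewpoint transformation of camera images, and the label correction would be derived from vehicle geometry rather than a fixed per-pixel gain.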

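Experimental Results

Research Questions

RQ1: Can end-to-end learning map front-camera input to steering without explicit road or lane-marking detection?
RQ2: How well does the learned policy generalize across road types, weather, and lighting conditions?
RQ3: How much do data augmentation and simulation contribute to robustness before real road tests?
RQ4: What level of autonomy is achievable in simulation and on real roads?
RQ5: How does the end-to-end approach compare with modular, hand-engineered perception and control pipelines?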

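Key Findings

With only the steering angle as the training signal, the CNN learns useful road features and driving behavior without ever being trained explicitly to detect road outlines.
The system runs at 30 FPS on NVIDIA hardware and was trained using about 72 hours of driving data.
In on-road tests, the system steered autonomously for approximately 98% of a typical drive in Monmouth County, New Jersey, and completed 10 miles on a multi-lane highway with zero interventions.
The car could drive on highways, local roads, and residential streets in sunny, cloudy, rainy, and snowy conditions, including on unpaved roads and in parking lots.
Simulation experiments estimate autonomy by counting human interventions and charging a 6-second penalty for each one, providing a metric before on-road testing.
Visualizing the CNN's internal state shows that early feature maps respond to the outline of the road in an unpaved-road image but look like noise in a scene without a road, suggesting representations learned without explicit supervision.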
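
The simulation autonomy estimate mentioned above can be written out as a short calculation. This is a minimal sketch assuming the metric is the share of elapsed time not charged to interventions, with each intervention costing a fixed 6-second penalty; the function and parameter names are illustrative.

```python
def autonomy_percent(num_interventions, elapsed_seconds, penalty_seconds=6.0):
    """Estimate autonomy as a percentage of elapsed driving time.

    Each human intervention is charged `penalty_seconds` of non-autonomous
    time (6 seconds by default, following the penalty described above).
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    if num_interventions < 0:
        raise ValueError("num_interventions must be non-negative")
    penalized = num_interventions * penalty_seconds
    return (1.0 - penalized / elapsed_seconds) * 100.0


# Usage: 10 interventions during a 600-second simulated drive
# cost 60 seconds in total, i.e. about 90% autonomy.
score = autonomy_percent(10, 600)
```

Because the penalty is a fixed charge per intervention, the value can turn negative if interventions are very frequent, so it is best read as a relative score for comparing runs.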

This review was generated by AI and checked by a human editor.