QUICK REVIEW

[Paper Review] The Convolutional Tsetlin Machine

Ole‐Christoffer Granmo, Sondre Glimsdal|arXiv (Cornell University)|May 23, 2019

Machine Learning and Algorithms | References: 43 | Citations: 48

One-Line Summary

The Convolutional Tsetlin Machine (CTM) extends the interpretable Tsetlin Machine to image data by using clause-based convolution filters with location-aware patches, achieving competitive accuracy on MNIST, Kuzushiji-MNIST, Fashion-MNIST, and 2D Noisy XOR.

ABSTRACT

Convolutional neural networks (CNNs) have obtained astounding successes for important pattern recognition tasks, but they suffer from high computational complexity and the lack of interpretability. The recent Tsetlin Machine (TM) attempts to address this lack by using easy-to-interpret conjunctive clauses in propositional logic to solve complex pattern recognition problems. The TM provides competitive accuracy in several benchmarks, while keeping the important property of interpretability. It further facilitates hardware-near implementation since inputs, patterns, and outputs are expressed as bits, while recognition and learning rely on straightforward bit manipulation. In this paper, we exploit the TM paradigm by introducing the Convolutional Tsetlin Machine (CTM), as an interpretable alternative to CNNs. Whereas the TM categorizes an image by employing each clause once to the whole image, the CTM uses each clause as a convolution filter. That is, a clause is evaluated multiple times, once per image patch taking part in the convolution. To make the clauses location-aware, each patch is further augmented with its coordinates within the image. The output of a convolution clause is obtained simply by ORing the outcome of evaluating the clause on each patch. In the learning phase of the TM, clauses that evaluate to 1 are contrasted against the input. For the CTM, we instead contrast against one of the patches, randomly selected among the patches that made the clause evaluate to 1. Accordingly, the standard Type I and Type II feedback of the classic TM can be employed directly, without further modification. The CTM obtains a peak test accuracy of 99.4% on MNIST, 96.31% on Kuzushiji-MNIST, 91.5% on Fashion-MNIST, and 100.0% on the 2D Noisy XOR Problem, which is competitive with results reported for simple 4-layer CNNs, BinaryConnect, Logistic Circuits and an FPGA-accelerated Binary CNN.
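
The convolution step described in the abstract (evaluate a clause on every patch, augment each patch with its coordinates, OR the outcomes) can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: the function names, the thermometer encoding of the coordinates, the stride of 1, and the representation of a clause as two lists of literal indices are all assumptions made for this example.

```python
import numpy as np

def thermometer(value, size):
    """Encode an integer in 0..size as `size` bits: bit t is 1 when value > t.
    (Assumed encoding for patch coordinates.)"""
    return (value > np.arange(size)).astype(np.uint8)

def extract_patches(image, w):
    """Return one flat bit vector per w-by-w patch of a binary 2-D image (stride 1).
    Each vector holds the patch bits followed by the encoded row and column."""
    height, width = image.shape
    n_rows, n_cols = height - w + 1, width - w + 1
    patches = []
    for r in range(n_rows):
        for c in range(n_cols):
            bits = image[r:r + w, c:c + w].ravel()
            pos = np.concatenate([thermometer(r, n_rows - 1),
                                  thermometer(c, n_cols - 1)])
            patches.append(np.concatenate([bits, pos]))
    return np.array(patches, dtype=np.uint8)

def eval_clause(patch, include_pos, include_neg):
    """Conjunction: included plain literals must be 1, included negated literals 0."""
    pos = np.asarray(include_pos, dtype=int)
    neg = np.asarray(include_neg, dtype=int)
    return bool(np.all(patch[pos] == 1) and np.all(patch[neg] == 0))

def conv_clause_output(image, w, include_pos, include_neg):
    """OR of the clause outcome over all patches, plus indices of matching patches."""
    patches = extract_patches(image, w)
    matches = [i for i, p in enumerate(patches)
               if eval_clause(p, include_pos, include_neg)]
    return int(len(matches) > 0), matches

# Hypothetical 3x3 binary image, 2x2 window -> 4 patches of 4 pixel bits + 2 position bits.
image = np.array([[0, 0, 0],
                  [0, 1, 1],
                  [0, 1, 0]], dtype=np.uint8)

# Clause: top-left pixel is 1 AND bottom-right pixel is 0.
print(conv_clause_output(image, 2, [0], [3]))   # (1, [3])
# Same pixel test, but restricted to the top row of patches (row bit must be 0).
print(conv_clause_output(image, 2, [0], [4]))   # (0, [])
```

The second call shows what location awareness buys: the pattern exists in the image, but not at the position the clause asks for, so the clause outputs 0.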

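Motivation and Objectives

Introduce the Convolutional Tsetlin Machine (CTM) as an interpretable alternative to CNNs.
Adapt the TM learning rules so that they operate on image patches through convolution-like filtering.
Demonstrate the recognition capability and learning performance of the CTM on standard benchmarks and a 2D XOR task.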

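Proposed Method

Represent images as binary inputs and define clause-based convolution filters of size W×W×Z×2.
Augment each image patch with its encoded position so that clauses become location-aware.
Evaluate each clause on every patch and aggregate the outcomes with OR to produce one clause output per image.
Update the Tsetlin Automata inside each clause with the Type I and Type II feedback of the classic TM; to fit the CTM setting, the feedback is contrasted against one patch selected at random from those that activated the clause.
Optionally incorporate integer clause weights and take a weighted majority vote across clauses.
Demonstrate parallelizable, hardware-friendly processing built on bit-level inputs and simple bit manipulation.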
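
The patch-selection and weighted-voting steps in the list above can be sketched as below. This is a hedged illustration under stated assumptions: `pick_feedback_patch` and `weighted_vote` are hypothetical helper names, clause polarity is assumed to be +1 or -1, and the Type I / Type II automaton updates themselves are not shown. The review does not say how a clause that matched no patch is handled, so the sketch simply returns `None` in that case.

```python
import random

def pick_feedback_patch(matching_patches, rng=random):
    """Pick one matching patch index uniformly at random.
    Returns None when the clause matched no patch (handling of that case is an assumption)."""
    if not matching_patches:
        return None
    return rng.choice(matching_patches)

def weighted_vote(clause_outputs, polarities, weights):
    """Sum weight * polarity over the clauses whose output is 1.
    polarities: +1 for clauses voting for the class, -1 for clauses voting against it."""
    return sum(w * p
               for out, p, w in zip(clause_outputs, polarities, weights)
               if out == 1)

# Hypothetical example: a clause matched patches 3, 7 and 9; feedback uses one of them.
rng = random.Random(0)
chosen = pick_feedback_patch([3, 7, 9], rng)
print(chosen in (3, 7, 9))   # True

# Four clauses with integer weights; the second clause did not fire.
print(weighted_vote([1, 0, 1, 1], [+1, +1, -1, +1], [3, 5, 2, 1]))   # 2
```

With all weights set to 1, `weighted_vote` reduces to a plain majority vote, which is why a single weighted clause can stand in for several identical unweighted ones.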

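Experimental Results

Research Questions

RQ1: Can the CTM achieve competitive accuracy on image classification while remaining interpretable?
RQ2: How can the TM learning feedback (Type I and Type II) be applied in a convolutional, patch-based setting?
RQ3: How do location awareness and per-patch clause outputs affect recognition performance?
RQ4: How does clause weighting affect the accuracy and computational efficiency of the CTM?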

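Key Findings

The CTM reaches peak test accuracies of 99.4% on MNIST, 96.31% on Kuzushiji-MNIST, 91.5% on Fashion-MNIST, and 100.0% on 2D Noisy XOR, placing it alongside competitive baselines.
The CTM's computation grows linearly with the number of clauses and the number of image patches, and it benefits from parallelizable updates.
Incorporating position information turns the filters into location-aware patterns suited to image tasks.
Clause weighting further improves performance and efficiency, since a single weighted vote can replace multiple clauses.
On the selected datasets, the CTM achieves results competitive with simple CNNs, BinaryConnect, Logistic Circuits, and an FPGA-accelerated Binary CNN.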
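
To make the linear-growth finding concrete, the sketch below counts patches and clause-on-patch evaluations per image. The sliding-window formula is the standard one for an unpadded convolution; the 10×10 window and the 1000-clause figure are illustrative values chosen for this example, not numbers taken from the review.

```python
def num_patches(height, width, w, stride=1):
    """Number of w-by-w windows that fit in a height-by-width image at the given stride."""
    return ((height - w) // stride + 1) * ((width - w) // stride + 1)

def clause_evaluations(n_clauses, height, width, w, stride=1):
    """Clause-on-patch evaluations per image: linear in both clauses and patches."""
    return n_clauses * num_patches(height, width, w, stride)

# A 28x28 image (the MNIST image size) with a hypothetical 10x10 window, stride 1.
print(num_patches(28, 28, 10))                # 361
# Doubling the (hypothetical) clause count doubles the work.
print(clause_evaluations(1000, 28, 28, 10))   # 361000
print(clause_evaluations(2000, 28, 28, 10))   # 722000
```

Each clause-on-patch evaluation is independent of the others, which is what makes the workload straightforward to parallelize.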

This review was generated by AI and checked by a human editor.