QUICK REVIEW

[Paper Review] The Greedy and Recursive Search for Morphological Productivity

Caleb Belth, Sarah R. Payne|arXiv (Cornell University)|May 12, 2021

Language Development and Disorders | References: 37 | Citations: 38

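One-Sentence Summary

This paper introduces ATP, a greedy, recursive abductive model that uses the Tolerance Principle to discover productive morphological rules. It evaluates productivity over a small vocabulary and matches human developmental patterns in English and German. It outperforms neural baselines on the main tasks, and with limited training data its wug-test productions align with human data.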

ABSTRACT

As children acquire the knowledge of their language's morphology, they invariably discover the productive processes that can generalize to new words. Morphological learning is made challenging by the fact that even fully productive rules have exceptions, as in the well-known case of English past tense verbs, which features the -ed rule against the irregular verbs. The Tolerance Principle is a recent proposal that provides a precise threshold of exceptions that a productive rule can withstand. Its empirical application so far, however, requires the researcher to fully specify rules defined over a set of words. We propose a greedy search model that automatically hypothesizes rules and evaluates their productivity over a vocabulary. When the search for broader productivity fails, the model recursively subdivides the vocabulary and continues the search for productivity over narrower rules. Trained on psychologically realistic data from child-directed input, our model displays developmental patterns observed in child morphology acquisition, including the notoriously complex case of German noun pluralization. It also produces responses to nonce words that, despite receiving only a fraction of the training data, are more similar to those of human subjects than current neural network models' responses are.

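Motivation and Objectives

Demonstrate a computational approach that automatically hypothesizes productive morphological rules from limited child-directed data.
Show that recursive subdivision can discover narrower productive rules when a broad rule fails.
Compare ATP's developmental trajectory and accuracy on English and German morphology against human data and neural network baselines.
Evaluate ATP's wug-test productions against human performance under realistic training conditions.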

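Proposed Method

The paper proposes Abduction of Tolerable Productivity (ATP), a recursive abductive search that builds a decision tree mapping lemmas and features to inflected forms.
At each split, select the feature that maximizes consistency, measured by the count of the most frequent suffix within the resulting subset.
Iteratively add features whose productive suffix passes the Tolerance Principle (TP).
Base case: stop when the most frequent suffix passes the TP or no features remain, and memorize the exceptions.
To inflect a form, traverse the learned tree; when no productive rule applies, fall back to nearest-neighbor lookup over memorized forms.
Code and data: the ATP implementation and usage instructions are available online.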
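The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' released implementation. It assumes the standard statement of the Tolerance Principle, under which a rule over N items is productive when its exceptions number at most N / ln N. The names `Item`, `Node`, `tolerates`, `build`, and `inflect`, the feature labels, and the toy German-like nouns are all hypothetical. The nearest-neighbor fallback is omitted, so `inflect` returns `None` where no productive rule applies.

```python
import math
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class Item:
    lemma: str
    features: FrozenSet[str]  # hypothetical feature labels, e.g. "fem"
    suffix: str


@dataclass
class Node:
    suffix: Optional[str] = None          # productive suffix at a leaf, if any
    exceptions: Dict[str, str] = field(default_factory=dict)  # memorized forms
    feature: Optional[str] = None         # feature tested at an internal node
    yes: Optional["Node"] = None
    no: Optional["Node"] = None


def tolerates(n: int, e: int) -> bool:
    """Tolerance Principle: a rule over n items tolerates e exceptions."""
    if n < 2:  # ln(1) == 0, so treat tiny sets as exception-free only
        return e == 0
    return e <= n / math.log(n)


def build(items: List[Item], features: FrozenSet[str]) -> Node:
    counts = Counter(it.suffix for it in items)
    top, top_n = counts.most_common(1)[0]
    productive = tolerates(len(items), len(items) - top_n)
    # Base case: the top suffix passes the TP, or no features remain.
    if productive or not features:
        memorized = {it.lemma: it.suffix for it in items
                     if not productive or it.suffix != top}
        return Node(suffix=top if productive else None, exceptions=memorized)

    # Greedy step: pick the feature whose subset has the largest
    # count for its most frequent suffix.
    def score(f: str) -> int:
        sub = Counter(it.suffix for it in items if f in it.features)
        return max(sub.values(), default=0)

    best = max(sorted(features), key=score)
    rest = features - {best}
    yes = [it for it in items if best in it.features]
    no = [it for it in items if best not in it.features]
    if not yes or not no:  # feature does not split the items; drop it
        return build(items, rest)
    return Node(feature=best, yes=build(yes, rest), no=build(no, rest))


def inflect(node: Node, lemma: str, features: FrozenSet[str]) -> Optional[str]:
    while node.feature is not None:  # walk down to a leaf
        node = node.yes if node.feature in features else node.no
    if lemma in node.exceptions:
        return lemma + node.exceptions[lemma]
    return lemma + node.suffix if node.suffix is not None else None


def item(lemma: str, suffix: str, *feats: str) -> Item:
    return Item(lemma, frozenset(feats), suffix)


# Toy data (illustrative only): six nouns ending in schwa take "n",
# four others take "e", and one takes "s".
data = [
    item("Blume", "n", "final_schwa", "fem"),
    item("Katze", "n", "final_schwa", "fem"),
    item("Lampe", "n", "final_schwa", "fem"),
    item("Tasche", "n", "final_schwa", "fem"),
    item("Schule", "n", "final_schwa", "fem"),
    item("Name", "n", "final_schwa", "masc"),
    item("Hund", "e", "masc"),
    item("Tisch", "e", "masc"),
    item("Tag", "e", "masc"),
    item("Brief", "e", "masc"),
    item("Auto", "s", "neut"),
]
all_features = frozenset().union(*(it.features for it in data))
tree = build(data, all_features)
```

On this toy vocabulary no single suffix is tolerable over all 11 nouns (5 exceptions against a threshold of about 4.59), so the sketch splits on the hypothetical `final_schwa` feature and finds a productive rule in each half, memorizing `Auto` as an exception. Calling `inflect(tree, "Bral", frozenset({"masc"}))` then applies the narrower "e" rule to a nonce word.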

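Experimental Results

Research Questions

RQ1: Can a greedy, recursive search using the Tolerance Principle automatically hypothesize productive morphological rules from limited data?
RQ2: Does subdividing the vocabulary into narrower groups improve the discovery of productive rules for complex morphology (e.g., German plurals)?
RQ3: When ATP is trained on child-directed speech, how closely do its learned productive rules and wug-test productions match human data and neural network baselines?
RQ4: Does ATP's developmental behavior match the acquisition order observed in children for the English past tense, the English plural -s, German plurals, and the present participle?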

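Key Findings

ATP discovers productive suffix rules for English and German morphology in an order close to the acquisition order observed in child studies.
ATP outperforms the ED neural model on the English past tense and on German pluralization across multiple dataset sizes.
At a realistic training size (400 words), ATP's wug-test productions correlate with human data more closely than those of the neural baselines.
ATP remains accurate on German with or without gender information, indicating robust extraction of phonological rules.
ATP yields transparent decision trees that explicitly represent the learned rules (e.g., the English past-tense -ed rule and the five German suffixes).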


This review was generated by AI and checked by a human editor.