QUICK REVIEW

[Paper Review] BiTimeBERT: Extending Pre-Trained Language Representations with Bi-Temporal Information

Jiexin Wang, Adam Jatowt | arXiv (Cornell University) | Apr 27, 2022

Topic Modeling | Citations: 22

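One-Line Summary

BiTimeBERT continues pre-training a Transformer encoder on a two-decade news corpus with two objectives, time-aware masking and document date prediction, to produce time-aware language representations that improve on standard BERT for time-related tasks.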

ABSTRACT

Time is an important aspect of documents and is used in a range of NLP and IR tasks. In this work, we investigate methods for incorporating temporal information during pre-training to further improve the performance on time-related tasks. Compared with common pre-trained language models like BERT which utilize synchronic document collections (e.g., BookCorpus and Wikipedia) as the training corpora, we use long-span temporal news article collection for building word representations. We introduce BiTimeBERT, a novel language representation model trained on a temporal collection of news articles via two new pre-training tasks, which harnesses two distinct temporal signals to construct time-aware language representations. The experimental results show that BiTimeBERT consistently outperforms BERT and other existing pre-trained models with substantial gains on different downstream NLP tasks and applications for which time is of importance (e.g., the accuracy improvement over BERT is 155% on the event time estimation task).

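Motivation and Objectives

Investigate the benefits of incorporating temporal information into pre-trained language models.
Develop BiTimeBERT, trained on a temporal news collection with two new pre-training objectives.
Assess how the timestamp and content-time signals affect time-sensitive NLP/IR tasks.
Evaluate BiTimeBERT on diverse time-related downstream tasks and compare it with baselines.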

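Proposed Method

Uses a Transformer encoder initialized from BERT base and continues pre-training on the NYT news corpus (1987–2007).
Introduces Time-Aware Masked Language Modeling (TAMLM), which masks temporal expressions first and then other tokens.
Replaces next sentence prediction with document dating (DD), which predicts the document's timestamp at a chosen granularity.
Optionally examines TIR as an alternative, TAMLM-based objective.
Evaluates downstream tasks using accuracy (ACC) and MAE at multiple granularities.
Draws on two temporal signals during pre-training: the document timestamp and the content time (temporal expressions).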
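The masking order described for TAMLM (temporal expressions first, then other tokens) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the regex-based detector, the choice to mask every detected temporal token, the `other_mask_rate` value, and the function name `time_aware_mask` are all assumptions made for the example.

```python
import random
import re

# Assumption: a toy detector that treats four-digit years and month names as
# temporal expressions; a real pipeline would rely on a temporal tagger.
TEMPORAL_PATTERN = re.compile(
    r"^(1[89]\d{2}|20\d{2}|January|February|March|April|May|June|July|"
    r"August|September|October|November|December)$"
)

MASK = "[MASK]"


def time_aware_mask(tokens, other_mask_rate=0.15, seed=0):
    """Mask temporal tokens first, then a random share of the remaining tokens."""
    rng = random.Random(seed)
    masked = list(tokens)
    labels = [None] * len(tokens)  # original token at each masked position

    # Step 1: mask every token the toy detector flags as temporal (assumption).
    for i, token in enumerate(tokens):
        if TEMPORAL_PATTERN.match(token):
            labels[i] = token
            masked[i] = MASK

    # Step 2: mask a random sample of the tokens that are still unmasked.
    other_idx = [i for i in range(len(tokens)) if labels[i] is None]
    n_other = round(len(other_idx) * other_mask_rate)
    for i in rng.sample(other_idx, n_other):
        labels[i] = tokens[i]
        masked[i] = MASK

    return masked, labels


tokens = "The summit took place in March 1994 in Geneva".split()
masked, labels = time_aware_mask(tokens)
print(masked)
```

In this sketch the model would be trained to recover `labels` at the masked positions, so the temporal tokens are always part of the prediction target while ordinary tokens are only sampled.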

Figure 1. An illustration of BiTimeBERT training, which includes the TAMLM and DD tasks.
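One way to picture the DD task is as classification over time buckets, where the bucket width is the chosen granularity. The sketch below builds such labels for the 1987–2007 span stated in this review; treating DD as a classification problem, and the helper names `dating_label` and `num_classes`, are assumptions for illustration only.

```python
from datetime import date

# Corpus span as stated in the review (NYT, 1987-2007).
START_YEAR = 1987
END_YEAR = 2007


def dating_label(timestamp: date, granularity: str = "month") -> int:
    """Map a document timestamp to a class index at the given granularity."""
    if not START_YEAR <= timestamp.year <= END_YEAR:
        raise ValueError("timestamp outside the corpus span")
    year_offset = timestamp.year - START_YEAR
    if granularity == "year":
        return year_offset
    if granularity == "month":
        return year_offset * 12 + (timestamp.month - 1)
    if granularity == "day":
        return (timestamp - date(START_YEAR, 1, 1)).days
    raise ValueError(f"unknown granularity: {granularity}")


def num_classes(granularity: str) -> int:
    """Number of output classes a dating head would need (assumed design)."""
    return dating_label(date(END_YEAR, 12, 31), granularity) + 1


print(dating_label(date(1994, 3, 15), "year"))   # 7
print(dating_label(date(1994, 3, 15), "month"))  # 86
print(num_classes("month"))                      # 252
```

Finer granularities give the model a more precise target but many more classes, which is the trade-off the granularity setting controls in this sketch.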

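Experimental Results

Research Questions

RQ1: How do the timestamp and content-time signals affect pre-trained language representations?
RQ2: Do time-aware pre-training objectives improve performance on highly time-sensitive NLP/IR tasks?
RQ3: How much does temporal granularity affect downstream task performance?
RQ4: Can BiTimeBERT generalize to long-range temporal tasks beyond its pre-training window?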
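To make the granularity question concrete, the snippet below computes accuracy (ACC) and mean absolute error (MAE) for the same predictions at month and at year granularity. The month-index encoding (months counted from a start date), the helper names, and the numbers are all made up for illustration and are not results from the paper.

```python
def accuracy(pred, gold):
    """Share of predictions that hit the gold time bucket exactly."""
    return sum(p == g for p, g in zip(pred, gold)) / len(gold)


def mae(pred, gold):
    """Mean absolute distance between predicted and gold buckets."""
    return sum(abs(p - g) for p, g in zip(pred, gold)) / len(gold)


def to_year(month_index):
    """Coarsen a month index (months since a start date) to a year index."""
    return month_index // 12


# Hypothetical month-level predictions and gold labels.
pred_month = [86, 90, 120]
gold_month = [86, 88, 132]

pred_year = [to_year(m) for m in pred_month]
gold_year = [to_year(m) for m in gold_month]

print("month ACC:", accuracy(pred_month, gold_month), "MAE:", mae(pred_month, gold_month))
print("year  ACC:", accuracy(pred_year, gold_year), "MAE:", mae(pred_year, gold_year))
```

A prediction that misses by two months counts as wrong at month granularity but correct at year granularity, which is why scores at different granularities are not directly comparable.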

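Key Findings

BiTimeBERT outperforms BERT and BERT-NYT on multiple time-related tasks, regardless of granularity.
On event occurrence time estimation, BiTimeBERT improves substantially over the baselines, most notably when information from the top-1 document is used.
BiTimeBERT achieves competitive or state-of-the-art results compared with SOTA time estimation methods that rely on complex multi-stage pipelines.
Incorporating content time through TAMLM and timestamps through DD strengthens time-aware representations, especially when task data is limited.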

Figure 2. Example of the replacement procedure in TIR task.


This review was generated by AI and checked by a human editor.