QUICK REVIEW

[Paper Review] Self-Supervised Contrastive Pre-Training For Time Series via Time-Frequency Consistency

Xiang Zhang, Ziyuan Zhao|ArXiv.org|Jun 17, 2022

EEG and Brain-Computer Interfaces | Cited by 127

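One-Sentence Summary

The paper introduces time-frequency consistency (TF-C), a principle for self-supervised pre-training on time series that aligns the time-based and frequency-based embeddings of the same example, improving transfer to unseen target datasets.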

ABSTRACT

Pre-training on time series poses a unique challenge due to the potential mismatch between pre-training and target domains, such as shifts in temporal dynamics, fast-evolving trends, and long-range and short-cyclic effects, which can lead to poor downstream performance. While domain adaptation methods can mitigate these shifts, most methods need examples directly from the target domain, making them suboptimal for pre-training. To address this challenge, methods need to accommodate target domains with different temporal dynamics and be capable of doing so without seeing any target examples during pre-training. Relative to other modalities, in time series, we expect that time-based and frequency-based representations of the same example are located close together in the time-frequency space. To this end, we posit that time-frequency consistency (TF-C) -- embedding a time-based neighborhood of an example close to its frequency-based neighborhood -- is desirable for pre-training. Motivated by TF-C, we define a decomposable pre-training model, where the self-supervised signal is provided by the distance between time and frequency components, each individually trained by contrastive estimation. We evaluate the new method on eight datasets, including electrodiagnostic testing, human activity recognition, mechanical fault detection, and physical status monitoring. Experiments against eight state-of-the-art methods show that TF-C outperforms baselines by 15.4% (F1 score) on average in one-to-one settings (e.g., fine-tuning an EEG-pretrained model on EMG data) and by 8.4% (precision) in challenging one-to-many settings (e.g., fine-tuning an EEG-pretrained model for either hand-gesture recognition or mechanical fault prediction), reflecting the breadth of scenarios that arise in real-world applications. Code and datasets: https://github.com/mims-harvard/TFC-pretraining.

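Motivation and Objectives

Motivate pre-training for time series that stays robust under domain shift between the pre-training data and the target data.
Propose TF-C as a generalizable pre-training principle that requires no target-domain data during pre-training.
Develop a decomposable model with a time encoder, a frequency encoder, and cross-space projectors to enforce TF-C.
Introduce time-based and frequency-based contrastive losses, plus a triplet-style consistency loss that ties the two representations together.
Demonstrate transfer gains over state-of-the-art baselines across diverse datasets and tasks.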

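Proposed Method

Define two parallel encoders, one for the time domain and one for the frequency domain, each followed by a cross-space projector into a shared time-frequency space.
Apply a bank of time-based augmentations, together with a frequency augmentation strategy that perturbs the spectrum.
Use NT-Xent-style contrastive losses to align the time-based representations and the frequency-based representations separately.
Introduce a triplet-inspired consistency loss that pulls the time-based and frequency-based representations of the same example close together.
Combine the losses into the TF-C objective, which balances the contrastive terms against the consistency term: L_TF-C = λ(L_T + L_F) + (1−λ)L_C.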
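
The loss terms listed above can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the authors' implementation (their code is at the repository linked in the abstract). The function names, the temperature `tau`, the margin `delta`, the use of Euclidean distance, and the hinge form of the consistency term are all assumptions made for this example; only the final weighted combination follows the formula stated in this review.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def euclidean(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nt_xent(originals, augmented, tau=0.2):
    """NT-Xent-style loss over one batch of embeddings.

    originals[i] and augmented[i] form the positive pair; every other
    embedding in the combined batch acts as a negative. `tau` is an
    assumed temperature value.
    """
    views = originals + augmented
    n = len(originals)
    total = 0.0
    for i, anchor in enumerate(views):
        pos = (i + n) % (2 * n)  # index of the other view of the same sample
        sims = {k: cosine(anchor, views[k]) / tau
                for k in range(2 * n) if k != i}
        log_denominator = math.log(sum(math.exp(s) for s in sims.values()))
        total += log_denominator - sims[pos]
    return total / (2 * n)


def consistency_loss(z_t, z_f, z_t_aug, z_f_aug, delta=1.0):
    """Triplet-style consistency term for one sample (assumed hinge form).

    The distance between the original time and frequency embeddings
    should be smaller, by at least `delta`, than any distance that
    involves an augmented view.
    """
    s_tf = euclidean(z_t, z_f)
    other_pairs = [
        euclidean(z_t, z_f_aug),
        euclidean(z_t_aug, z_f),
        euclidean(z_t_aug, z_f_aug),
    ]
    return sum(max(0.0, s_tf - s + delta) for s in other_pairs)


def tfc_objective(loss_t, loss_f, loss_c, lam=0.5):
    """L_TF-C = lam * (L_T + L_F) + (1 - lam) * L_C, as stated above."""
    return lam * (loss_t + loss_f) + (1.0 - lam) * loss_c
```

With `lam` close to 1 the objective is dominated by the two contrastive terms; with `lam` close to 0 it is dominated by the consistency term. A practical implementation would compute the same quantities on batched tensors in a deep learning framework rather than on Python lists.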

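Experimental Results

Research Questions

RQ1: Can time-based and frequency-based representations of the same time series be aligned in a shared latent space without target-domain data?
RQ2: Do frequency-domain augmentations and the TF-C consistency objective improve transfer to unseen target datasets?
RQ3: How does TF-C perform against state-of-the-art self-supervised baselines across diverse time series tasks?
RQ4: Does TF-C pre-training help in both one-to-one and one-to-many transfer settings?
RQ5: What effect does the proposed frequency perturbation have on the robustness of the learned representations?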

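Key Findings

In one-to-one transfer settings, TF-C outperforms the baselines by 15.4% (F1 score) on average.
In the more challenging one-to-many transfer settings, TF-C outperforms the baselines by 8.4% (precision).
The approach transfers well across all eight datasets, which include EEG, EMG, ECG, gait, and vibration signals.
A four-component architecture (a time encoder, a frequency encoder, and two cross-space projectors) embeds inputs into a shared time-frequency space.
Frequency-domain augmentation is both effective and novel for contrastive learning on time series.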
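
To make the idea of frequency-domain augmentation concrete, the sketch below perturbs the amplitude spectrum of a signal by randomly removing some components and boosting some weak ones. It is a hedged illustration, not the paper's procedure: the helper names, the parameters `remove_prob`, `add_prob`, and `alpha`, and their default values are hypothetical, and the naive O(n^2) DFT is used only to keep the example free of dependencies.

```python
import cmath
import math
import random


def amplitude_spectrum(signal):
    """Amplitudes of the first n//2 + 1 DFT bins of a real-valued signal."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        spectrum.append(abs(coeff))
    return spectrum


def perturb_spectrum(amplitudes, remove_prob=0.1, add_prob=0.1,
                     alpha=0.5, rng=None):
    """Return a perturbed copy of an amplitude spectrum.

    Each bin is zeroed with probability `remove_prob` (removing a
    component). Otherwise, a bin weaker than `alpha * peak` is raised
    to `alpha * peak` with probability `add_prob` (adding a component).
    All parameter names and defaults are illustrative assumptions.
    """
    if rng is None:
        rng = random.Random()
    peak = max(amplitudes)
    perturbed = []
    for amp in amplitudes:
        if rng.random() < remove_prob:
            perturbed.append(0.0)
        elif amp < alpha * peak and rng.random() < add_prob:
            perturbed.append(alpha * peak)
        else:
            perturbed.append(amp)
    return perturbed


# Usage: a pure sine with 3 cycles over 32 samples peaks at bin 3.
signal = [math.sin(2 * math.pi * 3 * t / 32) for t in range(32)]
spectrum = amplitude_spectrum(signal)
augmented = perturb_spectrum(spectrum, rng=random.Random(0))
```

The augmented spectrum can be fed to a frequency encoder as a second view of the same sample, in the same way a jittered or scaled copy serves as a second view in the time domain.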

This review was generated by AI and checked by a human editor.