QUICK REVIEW

[Paper Review] Geodesic Optimization for Predictive Shift Adaptation on EEG data

Apolline Mellot, Antoine Collas|arXiv (Cornell University)|Jul 4, 2024

Functional Brain Connectivity Studies | Cited by 5

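One-sentence summary

GOPSA learns domain-specific geodesic transports on the SPD manifold and jointly aligns covariance-based EEG features and predictive shifts, which improves cross-site age prediction from resting-state EEG.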

ABSTRACT

Electroencephalography (EEG) data is often collected from diverse contexts involving different populations and EEG devices. This variability can induce distribution shifts in the data $X$ and in the biomedical variables of interest $y$, thus limiting the application of supervised machine learning (ML) algorithms. While domain adaptation (DA) methods have been developed to mitigate the impact of these shifts, such methods struggle when distribution shifts occur simultaneously in $X$ and $y$. As state-of-the-art ML models for EEG represent the data by spatial covariance matrices, which lie on the Riemannian manifold of Symmetric Positive Definite (SPD) matrices, it is appealing to study DA techniques operating on the SPD manifold. This paper proposes a novel method termed Geodesic Optimization for Predictive Shift Adaptation (GOPSA) to address test-time multi-source DA for situations in which source domains have distinct $y$ distributions. GOPSA exploits the geodesic structure of the Riemannian manifold to jointly learn a domain-specific re-centering operator representing site-specific intercepts and the regression model. We performed empirical benchmarks on the cross-site generalization of age-prediction models with resting-state EEG data from a large multi-national dataset (HarMNqEEG), which included $14$ recording sites and more than $1500$ human participants. Compared to state-of-the-art methods, our results showed that GOPSA achieved significantly higher performance on three regression metrics ($R^2$, MAE, and Spearman's $ρ$) for several source-target site combinations, highlighting its effectiveness in tackling multi-source DA with predictive shifts in EEG data analysis. Our method has the potential to combine the advantages of mixed-effects modeling with machine learning for biomedical applications of EEG, such as multicenter clinical trials.

Motivation and objectives

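Address distribution shifts in both the input data X (covariance matrices) and the target variable y in EEG-based prediction models.
Develop a multi-source, test-time domain adaptation method that does not require retraining on the target domain.
Formulate a Riemannian mixed-effects model whose domain-specific geodesic intercepts are learned by parallel transport along the geodesic from each domain mean to the identity.
Demonstrate improved cross-site generalization for age prediction from EEG covariance data.
Provide a framework that combines mixed-effects modeling with machine learning for multicenter EEG analysis.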

Proposed method

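Represent EEG covariance matrices as points on the SPD manifold S_d^{++} and compute geodesic distances with the affine-invariant Riemannian metric.
Define a Riemannian mixed-effects model that learns domain-specific geodesic intercepts via parallel transport along the geodesic from each domain mean to the identity.
Learn a shared linear regression in the tangent space after the Riemannian logarithmic map, and optimize at training time the domain weights α_k that control how far each domain is transported.
At training time, jointly optimize the domain transports of all K source domains and the ridge regression coefficients (Algorithm 1).
At test time, adapt to a new target domain by optimizing the transport parameter γ_T so that the predictions agree with the target mean y_T (Algorithm 2).
The key equations are the transport feature map φ(Σ_i, Σ̄_k, α) = uvec(log_I(PT(Σ_i, Σ̄_k, α))), where Σ̄_k is the mean of domain k, and the optimization of γ_S and γ_T under ridge-regularized regression.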
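The transport-and-project step listed above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation: the closed form PT(Σ, Σ̄_k, α) = Σ̄_k^{-α/2} Σ Σ̄_k^{-α/2}, the link α = sigmoid(γ), the √2 scaling inside uvec, and every function name (`sym_fun`, `phi`, `airm_distance`, `fit_gamma_target`) are assumptions introduced here. A coarse grid search stands in for whatever optimizer Algorithm 2 actually uses.

```python
import numpy as np


def sym_fun(S, fun):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    w, V = np.linalg.eigh(S)
    return (V * fun(w)) @ V.T


def parallel_transport(S, mean_k, alpha):
    """Assumed closed form: mean_k^(-alpha/2) @ S @ mean_k^(-alpha/2).

    alpha = 0 leaves S unchanged; alpha = 1 fully re-centers domain k,
    so that its mean is mapped to the identity.
    """
    W = sym_fun(mean_k, lambda w: w ** (-alpha / 2.0))
    return W @ S @ W


def uvec(S):
    """Vectorize the upper triangle; off-diagonals are scaled by sqrt(2)
    so the Euclidean norm of the vector equals the Frobenius norm of S."""
    rows, cols = np.triu_indices(S.shape[0])
    return np.where(rows == cols, 1.0, np.sqrt(2.0)) * S[rows, cols]


def phi(S, mean_k, alpha):
    """Transport S, then map it to the tangent space at the identity."""
    return uvec(sym_fun(parallel_transport(S, mean_k, alpha), np.log))


def airm_distance(A, B):
    """Affine-invariant geodesic distance ||log(A^-1/2 B A^-1/2)||_F."""
    return np.linalg.norm(phi(B, A, 1.0))


def fit_gamma_target(covs, mean_T, y_mean_T, beta, intercept,
                     grid=np.linspace(-6.0, 6.0, 241)):
    """Hypothetical test-time step: choose gamma_T on a grid so that the
    mean prediction of the frozen regressor matches the target mean."""
    def gap(gamma):
        alpha = 1.0 / (1.0 + np.exp(-gamma))  # assumed link: alpha = sigmoid(gamma)
        feats = np.array([phi(S, mean_T, alpha) for S in covs])
        return (np.mean(feats @ beta + intercept) - y_mean_T) ** 2
    return min(grid, key=gap)
```

With `alpha = 1`, `phi(mean_k, mean_k, 1.0)` is the zero vector, because the domain mean is mapped to the identity and log(I) = 0. With `alpha = 0`, the map reduces to the plain tangent-space embedding uvec(log Σ). Intermediate values move each domain only part of the way along the geodesic, which is what lets the transport act as a learned, domain-specific intercept.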

Figure 1: Joint shift in $X$ and $y$ distributions on the HarMNqEEG dataset [31]. Subset of mean PSDs (A) and age distributions (B) from three recording sites used for the empirical benchmarks.

Experimental results

Research questions

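RQ1: How can domain adaptation be performed on EEG data when shifts occur in both the input covariance matrices and the outcome variable?
RQ2: Does a geodesic-based multi-source, test-time adaptation method on the SPD manifold outperform existing re-centering and domain adaptation baselines on EEG-based regression tasks?
RQ3: For cross-site EEG age prediction, what is the benefit of learning domain-specific geodesic intercepts (parallel transport) while sharing a global regression model?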

Key findings

Site combination	Dummy	No DA	Re-center	DO Intercept	GOPSA
Ba,Cho,G,S	0.53 ± 0.02	0.63 ± 0.02	0.52 ± 0.02	0.75 ± 0.02	0.78 ± 0.01
Be,Chb,S	0.58 ± 0.02	0.73 ± 0.01	0.43 ± 0.02	0.69 ± 0.02	0.72 ± 0.02
Ba,Co,G	0.63 ± 0.02	0.64 ± 0.02	0.42 ± 0.02	0.71 ± 0.01	0.74 ± 0.01
Cu03,M,R,S	0.63 ± 0.02	0.63 ± 0.01	0.46 ± 0.02	0.76 ± 0.01	0.76 ± 0.02
Ba,Be,Cho,Co,Cu90,G,R	0.77 ± 0.02	0.79 ± 0.01	0.44 ± 0.03	0.86 ± 0.01	0.87 ± 0.01
Mean	0.63 ± 0.02	0.68 ± 0.01	0.45 ± 0.02	0.75 ± 0.01	0.78 ± 0.01

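Across the HarMNqEEG site combinations, GOPSA achieves higher regression performance than several baselines, including Dummy DO, No DA, Re-center, and DO Intercept.
For Spearman's ρ, it reaches a mean of 0.78 ± 0.01 across site combinations, above the other methods, including DO Intercept (0.75 ± 0.02).
For R^2, it reaches a mean of 0.61 ± 0.02, higher than DO Intercept (0.58 ± 0.02).
For MAE, its mean is 8.25 ± 0.19, better than Re-center (8.55 ± 0.18).
Between specific site pairs as well, GOPSA consistently outperforms the baseline methods, indicating that it handles predictive shifts in multi-site EEG data effectively.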
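For reference, the three metrics reported above can be computed as follows. This is a minimal NumPy sketch with helper names introduced here for illustration. The ages and predictions in the usage lines are made-up toy values, not data from the paper.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y_true, y_pred):
    """Average absolute deviation, in the units of y (years for age)."""
    return np.mean(np.abs(np.asarray(y_true, float) - np.asarray(y_pred, float)))


def average_ranks(x):
    """Ranks starting at 1; tied values share their average rank."""
    x = np.asarray(x, float).ravel()
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(len(x))
    ranks[order] = np.arange(1, len(x) + 1)
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    rank_sums = np.bincount(inverse, weights=ranks)
    return rank_sums[inverse] / counts[inverse]


def spearman_rho(y_true, y_pred):
    """Spearman's rho: Pearson correlation of the rank-transformed values."""
    return np.corrcoef(average_ranks(y_true), average_ranks(y_pred))[0, 1]


# Toy usage with hypothetical ages (years) and predictions.
ages = [20, 35, 50, 65]
preds = [25, 30, 55, 60]
print(mean_absolute_error(ages, preds), r2_score(ages, preds), spearman_rho(ages, preds))
```

Spearman's ρ depends only on the ordering of the predictions, so a method can rank participants well while still carrying a site-level offset that hurts R^2 and MAE. This is one reason to report all three metrics together.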

Figure 2: Normalized performance of the different methods on several source-target combinations for three metrics: Spearman’s $\rho$ $\uparrow$ (left), $R^{2}$ score $\uparrow$ (middle) and Mean Absolute Error $\downarrow$ (right). As a large variability in the score values was present between the s

This review was generated by AI and checked by a human editor.