QUICK REVIEW

[Paper Review] Continuous Telemonitoring of Heart Failure using Personalised Speech Dynamics

Yue Pan, Xingyao Wang | arXiv (Cornell University) | Feb 23, 2026

Phonocardiography and Auscultation Techniques | Citations: 0

One-Line Summary

The paper introduces Longitudinal Intra-Patient Tracking (LIPT) with a Personalised Sequential Encoder (PSE) to monitor heart failure via speech, outperforming cross-sectional methods in HF trajectory detection and deterioration prediction.

ABSTRACT

Remote monitoring of heart failure (HF) via speech signals provides a non-invasive and cost-effective solution for long-term patient management. However, substantial inter-individual heterogeneity in vocal characteristics often limits the accuracy of traditional cross-sectional classification models. To address this, we propose a Longitudinal Intra-Patient Tracking (LIPT) scheme designed to capture the trajectory of relative symptomatic changes within individuals. Central to this framework is a Personalised Sequential Encoder (PSE), which transforms longitudinal speech recordings into context-aware latent representations. By incorporating historical data at each timestamp, the PSE facilitates a holistic assessment of the clinical trajectory rather than modelling discrete visits independently. Experimental results from a cohort of 225 patients demonstrate that the LIPT paradigm significantly outperforms the classic cross-sectional approaches, achieving a recognition accuracy of 99.7% for clinical status transitions. The model's high sensitivity was further corroborated by additional follow-up data, confirming its efficacy in predicting HF deterioration and its potential to secure patient safety in remote, home-based settings. Furthermore, this work addresses the gap in existing literature by providing a comprehensive analysis of different speech task designs and acoustic features. Taken together, the superior performance of the LIPT framework and PSE architecture validates their readiness for integration into long-term telemonitoring systems, offering a scalable solution for remote heart failure management.

Motivation and Objectives

Address inter-individual variability in speech-based HF assessment.
Develop a longitudinal tracking framework to monitor intra-patient HF trajectories.
Design a Personalised Sequential Encoder to encode continuous speech histories.
Validate the approach on a cohort of hospitalized HF patients and follow-up data.

Proposed Method

Extract global and frame-level acoustic features from speech tasks.
Apply statistical screening to identify HF-relevant features (HF-voice A/B).
Propose Longitudinal Intra-Patient Tracking (LIPT) with a Personalised Sequential Encoder (PSE) to model intra-patient trajectories.
Train and compare cross-sectional and longitudinal models (XGBoost and FNN) for HF state transition detection.
Evaluate across multiple speech tasks (vowels, short sentences, long sentence) and analyze task effectiveness.
Validate the approach on decompensated vs post-treatment states and on follow-up rehospitalisation data.
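The idea of tracking relative changes within one patient, rather than classifying each visit in isolation, can be illustrated with a small sketch. This is not the paper's Personalised Sequential Encoder, which is a learned model whose architecture is not described in this review. The function `history_aware_features` and the two toy patients are hypothetical: each visit is simply paired with its deviation from the mean of that patient's earlier visits, so the representation at every timestamp depends on history.

```python
import numpy as np


def history_aware_features(visits):
    """Encode one patient's visits so each row also reflects earlier visits.

    visits: array-like of shape (T, D), one row per recording session,
            ordered in time, for a SINGLE patient.
    Returns an array of shape (T, 2 * D): the current features followed by
    their deviation from the mean of all earlier visits (zeros at the
    first visit, where no history exists yet).
    """
    visits = np.asarray(visits, dtype=float)
    n_visits, n_feats = visits.shape
    encoded = np.zeros((n_visits, 2 * n_feats))
    for t in range(n_visits):
        encoded[t, :n_feats] = visits[t]
        if t > 0:
            # Only past visits are used, so the encoding stays causal.
            baseline = visits[:t].mean(axis=0)
            encoded[t, n_feats:] = visits[t] - baseline
    return encoded


# Two hypothetical patients with different habitual voices but the same
# relative change between visits (values are made up for illustration).
patient_a = [[100.0, 0.20], [110.0, 0.25]]
patient_b = [[200.0, 0.50], [210.0, 0.55]]
enc_a = history_aware_features(patient_a)
enc_b = history_aware_features(patient_b)
```

In this toy case the raw features of the two patients differ widely, while the deviation columns at the second visit match. That is the intuition behind intra-patient tracking: inter-individual differences cancel out and only the within-patient change remains.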

Experimental Results

Research Questions

RQ1: Can longitudinal modelling outperform traditional cross-sectional approaches for HF status estimation from speech?
RQ2: Which speech tasks and feature sets yield the strongest signals for HF trajectory tracking?
RQ3: How effective is the Personalised Sequential Encoder at capturing intra-patient temporal dynamics?
RQ4: How well does the LIPT/PSE approach generalise to follow-up data including rehospitalisation prediction?

Key Findings

LIPT significantly outperforms cross-sectional approaches across architectures, e.g., accuracy improved from around 69% (cross-sectional) to up to 81.8% (longitudinal FNN) for selected feature sets.
RASTA frame-level features achieve very high performance; combining RASTA with selected global features yields sensitivity around 99.8% and specificity around 99.7%.
PSE with frame-level RASTA features reaches macro-F1 of 99.5% (decompensated to post-treatment) and 99.7% precision, indicating strong detection of HF trajectory changes.
In follow-up evaluation, RASTA-based models effectively identify rehospitalisation with AUROC up to 0.94, though stable cases show higher false-positive rates requiring calibration.
Longer, more comprehensive speech tasks (counting 1–60) provide the best intra-patient longitudinal information, while vowels offer clinical practicality.
The study supports the feasibility of personalised speech modelling for scalable remote HF monitoring and highlights directions for calibration and broader data.
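The findings above quote sensitivity, specificity, precision, and macro-F1 side by side. For readers comparing these numbers, the following minimal sketch shows how they relate for a two-class problem. The function `binary_metrics`, the label coding (1 for the class treated as positive, 0 for the other), and the toy labels are assumptions for illustration and are not taken from the paper.

```python
def binary_metrics(y_true, y_pred):
    """Return common two-class metrics from 0/1 labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def safe_div(num, den):
        # Avoid ZeroDivisionError when a class is absent.
        return num / den if den else 0.0

    f1_pos = safe_div(2 * tp, 2 * tp + fp + fn)
    f1_neg = safe_div(2 * tn, 2 * tn + fn + fp)
    return {
        "sensitivity": safe_div(tp, tp + fn),  # recall on the positive class
        "specificity": safe_div(tn, tn + fp),  # recall on the negative class
        "precision": safe_div(tp, tp + fp),
        "accuracy": safe_div(tp + tn, len(pairs)),
        "macro_f1": (f1_pos + f1_neg) / 2,     # unweighted mean of per-class F1
    }


# Toy labels, made up for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

Macro-F1 weights both classes equally, so it is less affected by class imbalance than accuracy. Specificity is the metric that would capture the higher false-positive rate reported for stable follow-up cases.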

This review was generated by AI and checked by a human editor.