QUICK REVIEW

[Paper Review] A Method for Analysis of Patient Speech in Dialogue for Dementia Detection

Saturnino Luz, Sofia de la Fuente García|arXiv (Cornell University)|Nov 25, 2018

Speech and dialogue systems|33 references|34 citations

TL;DR

This paper proposes a content-free dialogue analysis method using additive logistic regression (Real Adaboost) on spontaneous speech features—such as speech rate and turn-taking patterns—from patient-interviewer dialogues to detect Alzheimer’s-type dementia. Despite relying solely on low-level interaction features, the model achieves 86.5% accuracy, demonstrating that non-invasive, low-cost mental health monitoring tools are feasible using naturally occurring speech data.

ABSTRACT

We present an approach to automatic detection of Alzheimer's type dementia based on characteristics of spontaneous spoken language dialogue consisting of interviews recorded in natural settings. The proposed method employs additive logistic regression (a machine learning boosting method) on content-free features extracted from dialogical interaction to build a predictive model. The model training data consisted of 21 dialogues between patients with Alzheimer's and interviewers, and 17 dialogues between patients with other health conditions and interviewers. Features analysed included speech rate, turn-taking patterns and other speech parameters. Despite relying solely on content-free features, our method obtains overall accuracy of 86.5%, a result comparable to those of state-of-the-art methods that employ more complex lexical, syntactic and semantic features. While further investigation is needed, the fact that we were able to obtain promising results using only features that can be easily extracted from spontaneous dialogues suggests the possibility of designing non-invasive and low-cost mental health monitoring tools for use at scale.

Motivation & Objective

To develop a low-cost, non-invasive method for early Alzheimer’s-type dementia detection using spontaneous speech in natural dialogue settings.
To investigate whether content-free features of dialogue interaction (such as speech rate and turn-taking) can effectively differentiate dementia patients from non-demented individuals.
To reduce reliance on complex, hard-to-acquire linguistic features (e.g., lexical, syntactic) by focusing on interaction dynamics in real-world dialogues.
To enable scalable, deployable mental health monitoring tools by leveraging easily extractable speech parameters from everyday conversations.
To contribute a new framework for dementia detection based on dialogue structure rather than narrative or monologue-based speech tasks.

Proposed method

The method employs additive logistic regression (Real Adaboost), a machine learning boosting algorithm, to classify patients as having Alzheimer’s-type dementia (ATD) or non-ATD based on dialogue features.
Features extracted include speech rate, turn-taking patterns, and other prosodic and interactional parameters from spontaneous dialogues, with no reliance on lexical or semantic content.
The model is trained on 21 ATD patient dialogues and 17 non-ATD patient dialogues from the Carolina Conversations Collection (CCC), using leave-one-out cross-validation (LOOCV).
Vocalisation graphs are used to represent and analyze dialogue interaction patterns, enabling the modeling of speaker turn transitions and speech dynamics.
The approach avoids complex speech recognition by focusing on robust, low-level dialogue features that are stable even in noisy, real-world settings.
Performance is evaluated using standard metrics: overall accuracy, micro and macro F-measure, with comparisons to alternative classifiers like SVM, random forests, and C4.5.
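The content-free features and vocalisation graphs described above can be illustrated with a minimal sketch. It assumes each dialogue is available as a list of (speaker, start, end, word_count) tuples. That input format, the function names, the toy timings, and the "pause" state with its 0.2-second threshold are all hypothetical choices made for illustration. They are not the authors' implementation or the CCC data format.

```python
from collections import defaultdict

# Hypothetical input: one tuple per vocalisation,
# (speaker, start_seconds, end_seconds, word_count). Toy values only.
dialogue = [
    ("interviewer", 0.0, 2.0, 6),
    ("patient", 2.0, 6.0, 8),
    ("interviewer", 6.0, 7.0, 3),
    ("patient", 8.0, 10.0, 4),
]

def speech_rate(turns, speaker):
    """Words per second over the speaker's total vocalisation time."""
    words = sum(w for s, _, _, w in turns if s == speaker)
    seconds = sum(end - start for s, start, end, _ in turns if s == speaker)
    return words / seconds if seconds > 0 else 0.0

def vocalisation_graph(turns, min_pause=0.2):
    """Estimate transition probabilities between dialogue states.

    Silences of at least `min_pause` seconds between vocalisations are
    modelled as a separate "pause" state (an assumption of this sketch).
    """
    states = []
    prev_end = None
    for speaker, start, end, _ in turns:
        if prev_end is not None and start - prev_end >= min_pause:
            states.append("pause")
        states.append(speaker)
        prev_end = end

    # Count how often each state is followed by each other state.
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1

    # Normalise each row of counts into probabilities.
    return {
        a: {b: n / sum(nxt.values()) for b, n in nxt.items()}
        for a, nxt in counts.items()
    }

patient_rate = speech_rate(dialogue, "patient")
graph = vocalisation_graph(dialogue)
```

Neither function looks at what was said: both rely only on who vocalised, when, and for how long, which is the sense in which such features are content-free. In a pipeline along the lines the review describes, values like these would be flattened into one feature vector per dialogue before classification.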

Experimental results

Research questions

RQ1: Can content-free dialogue features such as speech rate and turn-taking patterns reliably distinguish Alzheimer’s-type dementia patients from non-demented individuals in spontaneous conversations?
RQ2: How does a machine learning model based solely on interaction dynamics compare in performance to models using rich linguistic features like syntax and semantics?
RQ3: To what extent can low-cost, non-invasive tools based on spontaneous speech be effective for early dementia detection in real-world settings?
RQ4: Does the use of dialogue-level features extracted from naturalistic interviews yield comparable accuracy to more complex, content-dependent approaches?

Key findings

The proposed method achieved an overall accuracy of 86.5% using only content-free dialogue features, outperforming logistic regression and matching or exceeding other classifiers like SVM and random forests.
The micro F-measure was 0.878, and the macro F-measure was 0.76, reported across both the positive (ATD) and negative (non-ATD) classes.
Real Adaboost achieved the highest accuracy among all tested algorithms, slightly outperforming SVM (83.7%) and random forests (81.1%).
The results are comparable to state-of-the-art methods that use complex lexical, syntactic, and semantic features, despite relying only on easily extractable prosodic and interactional features.
The study demonstrates that dialogue interaction patterns—such as turn-taking and speech rate—can serve as robust, non-invasive biomarkers for early dementia detection.
The findings support the feasibility of developing scalable, low-cost mental health monitoring systems using spontaneous speech data collected in natural environments.
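For readers less familiar with the reported metrics, the following is a minimal sketch of how overall accuracy and micro- and macro-averaged F-measures can be computed from per-dialogue predictions. The function name `f_measures` and the toy labels are invented for illustration. They are not the paper's data and do not reproduce its figures.

```python
def f_measures(y_true, y_pred, labels=("ATD", "non-ATD")):
    """Return (accuracy, micro_f, macro_f) for single-label predictions."""
    tp = {c: 0 for c in labels}  # true positives per class
    fp = {c: 0 for c in labels}  # false positives per class
    fn = {c: 0 for c in labels}  # false negatives per class
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    def f1(t, p, n):
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when there are no counts.
        denom = 2 * t + p + n
        return 2 * t / denom if denom else 0.0

    accuracy = sum(tp.values()) / len(y_true)
    # Micro-averaging pools the counts of all classes before computing F1.
    micro_f = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro-averaging computes F1 per class, then takes the unweighted mean.
    macro_f = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return accuracy, micro_f, macro_f

# Toy labels, invented for illustration only (not the paper's data).
y_true = ["ATD", "ATD", "ATD", "ATD", "non-ATD", "non-ATD"]
y_pred = ["ATD", "ATD", "ATD", "non-ATD", "non-ATD", "non-ATD"]
accuracy, micro_f, macro_f = f_measures(y_true, y_pred)
```

Because the macro average weights each class equally regardless of size, it is the more sensitive of the two to weak performance on the smaller class.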

This review was created by AI and reviewed by human editors.