Youxiang Zhu, Nana Lin, Kiran Sandilya Balivada, Daniel Haehn, Xiaohui Liang
Detecting dementia from picture descriptions is a challenging text classification task in which powerful Large Language Models (LLMs) have not yet outperformed Pre-trained Language Models (PLMs), with previous PLM-based studies achieving notable accuracy (>80%). The difficulty lies in the scarcity of explicit features for detection, making it hard even for humans to distinguish healthy from dementia-affected speech transcripts. Additionally, LLM-extracted features are not task-oriented, resulting in low classification effectiveness. In this paper, we present an accurate and interpretable classification approach based on Adversarial Text Generation (ATG), a novel decoding strategy for dementia detection. We further develop a comprehensive set of instructions corresponding to various tasks and use them to guide ATG, achieving a best accuracy of 85%. We find that dementia detection can be related to tasks such as assessing attention to detail, language, and clarity, with specific features covering the environment, characters, and other picture content, as well as language-related features. Future work includes incorporating multi-modal LLMs to interpret both speech and picture information.
Accepted by the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024).