Exploiting Relevance of Speech to Sleepiness Detection via Attention Mechanism (ICC 2023)

Bang Tran, Youxiang Zhu, James W. Schwoebel, and Xiaohui Liang

Excessive sleepiness during critical tasks and jobs can lead to adverse outcomes, such as work accidents and car crashes. Detecting and monitoring sleepiness levels can help prevent these events. In this paper, we propose an attention-based sleepiness detection method using HuBERT embeddings and eGeMAPS features of human speech. Specifically, we propose an attention-based convolutional neural network (CNN) model that achieves 82.57% accuracy in sleepiness detection using HuBERT embeddings plus age and gender as inputs. We also show that the embedded attention layers significantly improve detection accuracy across different input configurations. We then explore the attention weights from the attention layers and observe that the long and semantically different responses from the “Picture description,” “Microphone test,” and “Free speech” tasks are more relevant to sleepiness detection when the model is trained with HuBERT only, whereas the short and semantically similar responses from the “Sustained phonation” and “Diadochokinetic” tasks are more relevant when the model is trained with HuBERT plus age and gender. The attention mechanism enables our model to take all responses as one input, which simplifies data pre-processing and identifies the speech responses most relevant to sleepiness detection.
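
To give a feel for how attention weights can indicate which responses matter, here is a minimal sketch of attention pooling over per-response embeddings. It is not the paper's model: the dot-product scoring, the function names `softmax` and `attention_pool`, the fixed `query` vector, and the two-dimensional toy embeddings are all assumptions made for illustration. In a trained network the scoring parameters would be learned, and the embeddings would come from HuBERT rather than being written by hand.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    responses: Sequence[Sequence[float]],
    query: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Pool per-response embeddings into one vector using attention weights.

    responses: one embedding per speech response (hypothetical toy vectors).
    query: a scoring vector; learned in a real model, fixed here (assumption).
    Returns (pooled_embedding, attention_weights).
    """
    # Score each response by its dot product with the query vector.
    scores = [sum(q * x for q, x in zip(query, r)) for r in responses]
    # Normalize the scores so the weights are positive and sum to 1.
    weights = softmax(scores)
    # The pooled embedding is the weighted sum of the response embeddings.
    dim = len(responses[0])
    pooled = [sum(w * r[d] for w, r in zip(weights, responses)) for d in range(dim)]
    return pooled, weights


# Task names come from the abstract; the numbers are made up for illustration.
tasks = ["Picture description", "Sustained phonation", "Free speech"]
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]

pooled, weights = attention_pool(embeddings, query)

# Rank tasks by attention weight, highest first.
ranked = sorted(zip(tasks, weights), key=lambda pair: pair[1], reverse=True)
```

Because every response is weighted and summed into a single vector, all responses can be passed to the model together, and inspecting `weights` after training shows which tasks the model relied on.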

Accepted by IEEE International Conference on Communications (ICC) 2023.
