Keywords: classification; k-nearest neighbor; listening effort; physiological measures; social context

MeSH: Humans; Pupil / physiology; Speech Intelligibility / physiology; Speech Perception / physiology; Middle Aged; Aged

Source: DOI: 10.1177/23312165241232551 (PDF via PubMed)

Abstract:
In daily life, both acoustic factors and social context can affect listening effort investment. In laboratory settings, information about listening effort has been deduced from pupil and cardiovascular responses independently. The extent to which these measures can jointly predict listening-related factors is unknown. Here we combined pupil and cardiovascular features to predict acoustic and contextual aspects of speech perception. Data were collected from 29 adults (mean = 64.6 years, SD = 9.2) with hearing loss. Participants performed a speech perception task at two individualized signal-to-noise ratios (corresponding to 50% and 80% of sentences correct) and in two social contexts (the presence and absence of two observers). Seven features were extracted per trial: baseline pupil size, peak pupil dilation, mean pupil dilation, interbeat interval, blood volume pulse amplitude, pre-ejection period, and pulse arrival time. These features were used to train k-nearest neighbor classifiers to predict task demand, social context, and sentence accuracy. K-fold cross-validation on the group-level data revealed above-chance classification accuracies: task demand, 64.4%; social context, 78.3%; and sentence accuracy, 55.1%. However, classification accuracies diminished when the classifiers were trained and tested on data from different participants. Individually trained classifiers (one per participant) performed better than group-level classifiers: 71.7% (SD = 10.2) for task demand, 88.0% (SD = 7.5) for social context, and 60.0% (SD = 13.1) for sentence accuracy. We demonstrated that classifiers trained on group-level physiological data to predict aspects of speech perception generalized poorly to novel participants. Individually calibrated classifiers hold more promise for future applications.
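The abstract outlines a concrete analysis design: seven physiological features per trial, k-nearest neighbor classifiers, group-level k-fold cross-validation, cross-participant testing, and individually trained classifiers. Below is a minimal sketch of that design in Python with scikit-learn. It is not the authors' code; the feature names, number of neighbors, fold count, and array layout are illustrative assumptions.

```python
# Sketch of the three evaluation schemes described in the abstract.
# Assumes X is a (n_trials, 7) NumPy feature matrix, y a label vector
# (e.g., task demand, social context, or sentence accuracy), and
# participant_ids a vector mapping each trial to its participant.
import numpy as np
from sklearn.model_selection import (cross_val_score, StratifiedKFold,
                                     LeaveOneGroupOut)
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Seven per-trial features named in the abstract (column order is an assumption).
FEATURES = [
    "baseline_pupil_size", "peak_pupil_dilation", "mean_pupil_dilation",
    "interbeat_interval", "blood_volume_pulse_amplitude",
    "pre_ejection_period", "pulse_arrival_time",
]

def knn_pipeline(k=5):
    # Standardize features so kNN distances are not dominated by unit scale.
    return make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))

def group_level_accuracy(X, y, n_splits=5, k=5):
    """k-fold CV accuracy with all trials pooled (folds may mix participants)."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    return cross_val_score(knn_pipeline(k), X, y, cv=cv).mean()

def cross_participant_accuracy(X, y, participant_ids, k=5):
    """Leave-one-participant-out: train on all other participants' trials."""
    logo = LeaveOneGroupOut()
    return cross_val_score(knn_pipeline(k), X, y, cv=logo,
                           groups=participant_ids).mean()

def per_participant_accuracy(X, y, participant_ids, n_splits=5, k=5):
    """One individually trained classifier per participant; mean and SD."""
    accs = [group_level_accuracy(X[participant_ids == pid],
                                 y[participant_ids == pid], n_splits, k)
            for pid in np.unique(participant_ids)]
    return float(np.mean(accs)), float(np.std(accs))
```

Running the same feature matrix through these three schemes reproduces the comparison the abstract reports: pooled k-fold folds share participants between train and test, leave-one-participant-out measures generalization to novel participants, and the per-participant loop corresponds to individually calibrated classifiers.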