Keywords: audiovisual, cross-cultural, illusion, multisensory, speech

Source: DOI:10.3389/fnins.2024.1421713

Abstract:
In the McGurk effect, visual speech from the face of the talker alters the perception of auditory speech. The diversity of human languages has prompted many intercultural studies of the effect in both Western and non-Western cultures, including native Japanese speakers. Studies of large samples of native English speakers have shown that the McGurk effect is characterized by high variability in the susceptibility of different individuals to the illusion and in the strength of different experimental stimuli to induce the illusion. The noisy encoding of disparity (NED) model of the McGurk effect uses principles from Bayesian causal inference to account for this variability, separately estimating the susceptibility and sensory noise for each individual and the strength of each stimulus. To determine whether variation in McGurk perception is similar between Western and non-Western cultures, we applied the NED model to data collected from 80 native Japanese-speaking participants. Fifteen different McGurk stimuli that varied in syllable content (unvoiced auditory "pa" + visual "ka" or voiced auditory "ba" + visual "ga") were presented interleaved with audiovisual congruent stimuli. The McGurk effect was highly variable across stimuli and participants, with the percentage of illusory fusion responses ranging from 3 to 78% across stimuli and from 0 to 91% across participants. Despite this variability, the NED model accurately predicted perception, predicting fusion rates for individual stimuli with 2.1% error and for individual participants with 2.4% error. Stimuli containing the unvoiced pa/ka pairing evoked more fusion responses than the voiced ba/ga pairing. Model estimates of sensory noise were correlated with participant age, with greater sensory noise in older participants. The NED model of the McGurk effect offers a principled way to account for individual and stimulus differences when examining the McGurk effect in different cultures.
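The abstract names three kinds of NED parameters (a susceptibility and a sensory noise per participant, a strength per stimulus) but does not give the model equation. The sketch below shows one way such parameters could combine into a predicted fusion rate. It rests on an assumption that is not stated in the abstract: that the audiovisual disparity of a stimulus is encoded with Gaussian noise and that a fusion percept occurs when the encoded disparity falls below the participant's threshold. The function name, the parameter names, and the numeric values are all hypothetical and chosen only for illustration.

```python
import math


def fusion_probability(stimulus_disparity, threshold, sensory_noise):
    """Predicted probability of a fusion response (illustrative sketch).

    Assumption (not taken from the abstract): the encoded disparity is
    drawn from a Gaussian centered on `stimulus_disparity` with standard
    deviation `sensory_noise`, and fusion occurs when the encoded value
    is below the participant's `threshold` (their susceptibility).
    """
    if sensory_noise <= 0:
        raise ValueError("sensory_noise must be positive")
    z = (threshold - stimulus_disparity) / sensory_noise
    # Standard normal CDF evaluated at z, written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical values: one participant, a weaker and a stronger stimulus.
# Under this parameterization a lower disparity means a stronger stimulus.
participant = {"threshold": 1.0, "sensory_noise": 0.5}
for disparity in (0.5, 1.5):
    p = fusion_probability(disparity, **participant)
    print(f"disparity={disparity}: predicted fusion rate = {p:.3f}")
```

Under this assumed form, a stimulus whose disparity equals the participant's threshold is fused half the time, and larger sensory noise pulls every prediction toward 50%, which is how a single participant can respond inconsistently to the same stimulus.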