speech intelligibility

  • Article Type: Journal Article
    Communication is a fundamental aspect of human interaction, yet many individuals must speak in less-than-ideal acoustic environments daily. Adapting their speech to ensure intelligibility in these varied settings can impose a significant cognitive burden. Understanding this burden on talkers has significant implications for the design of public spaces and workplace environments, as well as speaker training programs. The aim of this study was to examine how room acoustics and speaking style affect cognitive load through self-rating of mental demand and pupillometry. Nineteen adult native speakers of American English were instructed to read sentences in both casual and clear speech (a technique known to enhance intelligibility) across three levels of reverberation (0.05 s, 1.2 s, and 1.83 s at 500-1000 Hz). Our findings revealed that speaking style consistently affects the cognitive load on talkers more than room acoustics across the tested reverberation range. Specifically, pupillometry data suggested that speaking in clear speech elevates the cognitive load comparably to speaking in a room with long reverberation, challenging the conventional view of clear speech as an 'easy' strategy for improving intelligibility. These results underscore the importance of accounting for talkers' cognitive load when optimizing room acoustics and developing speech production training.

  • Article Type: Journal Article
    OBJECTIVE: To descriptively compare and contrast intervention techniques for preschool children with features of developmental language disorder (outcome: oral vocabulary) and speech sound disorder (outcome: speech comprehensibility) and analyse them in relation to effectiveness and theory.
    DESIGN: This is a systematic review with narrative synthesis. The process was supported by an expert steering group consisting of relevant professionals and people with lived experience.
    DATA SOURCES: Ovid Emcare, MEDLINE Complete, CINAHL, APA PsycINFO, ERIC, and Communication Source were searched from January 2012 onward. Relevant earlier studies were obtained from an initial published review (up to January 2012).
    ELIGIBILITY CRITERIA: Interventions for preschool children (80% aged 2:0-5:11 years) with idiopathic speech or language needs; outcomes relating to either oral vocabulary or speech comprehensibility.
    METHODS: Searches were conducted on 27 January 2023. Two independent researchers screened at abstract and full-text levels. Data regarding intervention content (eg, techniques) and format/delivery (eg, dosage, location) were extracted. Data were synthesised narratively according to the methods of Campbell et al.
    RESULTS: 24 studies were included: 18 for oral vocabulary and 6 for speech comprehensibility. There were 11 randomised controlled trials, 2 cohort studies and 11 case series. Similarities included a focus on input-related techniques and similar therapy activities. Speech studies were more likely to be professional-led and clinic-led, rather than at home and through a parent. Analysis was restricted by heterogeneity in study design and terminology, as well as gaps within intervention reporting. Information deemed important to the expert steering group was missing.
    CONCLUSIONS: Similarities and differences between intervention techniques for oral vocabulary and speech comprehensibility have been identified and synthesised. However, analysis of effectiveness was limited due to issues with study design and heterogeneity within studies. This has implications for the progression of the evidence base within the field.
    REGISTRATION: CRD42022373931.

  • Article Type: Journal Article
    There is broad consensus that listening effort is an important outcome for measuring hearing performance. However, there remains debate on the best ways to measure listening effort. This study sought to measure neural correlates of listening effort using functional near-infrared spectroscopy (fNIRS) in experienced adult hearing aid users. The study evaluated impacts of amplification and signal-to-noise ratio (SNR) on cerebral blood oxygenation, with the expectation that easier listening conditions would be associated with less oxygenation in the prefrontal cortex. Thirty experienced adult hearing aid users repeated sentence-final words from low-context Revised Speech Perception in Noise Test sentences. Participants repeated words at a hard SNR (individual SNR-50) or easy SNR (individual SNR-50 + 10 dB), while wearing hearing aids fit to prescriptive targets or without wearing hearing aids. In addition to assessing listening accuracy and subjective listening effort, prefrontal blood oxygenation was measured using fNIRS. As expected, easier listening conditions (i.e., easy SNR, with hearing aids) led to better listening accuracy, lower subjective listening effort, and lower oxygenation across the entire prefrontal cortex compared to harder listening conditions. Listening accuracy and subjective listening effort were also significant predictors of oxygenation.

  • Article Type: Journal Article
    During continuous speech perception, endogenous neural activity becomes time-locked to acoustic stimulus features, such as the speech amplitude envelope. This speech-brain coupling can be decoded using non-invasive brain imaging techniques, including electroencephalography (EEG). Neural decoding may provide clinical use as an objective measure of stimulus encoding by the brain-for example during cochlear implant listening, wherein the speech signal is severely spectrally degraded. Yet, interplay between acoustic and linguistic factors may lead to top-down modulation of perception, thereby complicating audiological applications. To address this ambiguity, we assess neural decoding of the speech envelope under spectral degradation with EEG in acoustically hearing listeners (n = 38; 18-35 years old) using vocoded speech. We dissociate sensory encoding from higher-order processing by employing intelligible (English) and non-intelligible (Dutch) stimuli, with auditory attention sustained using a repeated-phrase detection task. Subject-specific and group decoders were trained to reconstruct the speech envelope from held-out EEG data, with decoder significance determined via random permutation testing. Whereas speech envelope reconstruction did not vary by spectral resolution, intelligible speech was associated with better decoding accuracy in general. Results were similar across subject-specific and group analyses, with less consistent effects of spectral degradation in group decoding. Permutation tests revealed possible differences in decoder statistical significance by experimental condition. In general, while robust neural decoding was observed at the individual and group level, variability within participants would most likely prevent the clinical use of such a measure to differentiate levels of spectral degradation and intelligibility on an individual basis.
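The envelope-reconstruction ("backward model") approach described above is commonly implemented as a regularized linear regression from EEG channels to the stimulus envelope. The sketch below is a minimal illustration of that idea on synthetic data, assuming NumPy; the variable names, the ridge penalty, and the synthetic mixing are illustrative assumptions, not the study's actual pipeline (which would also include time-lagged features and cross-validation).

```python
import numpy as np

def train_envelope_decoder(eeg, envelope, lam=1.0):
    # Backward model: learn channel weights w so that eeg @ w
    # approximates the speech envelope (ridge regression).
    # eeg: (n_samples, n_channels); envelope: (n_samples,)
    n_ch = eeg.shape[1]
    return np.linalg.solve(eeg.T @ eeg + lam * np.eye(n_ch), eeg.T @ envelope)

def decoding_accuracy(eeg, envelope, w):
    # Decoding accuracy as the Pearson correlation between the
    # reconstructed and the true envelope (the usual metric).
    reconstruction = eeg @ w
    return np.corrcoef(reconstruction, envelope)[0, 1]

# Synthetic demo: EEG channels as noisy mixtures of the envelope.
rng = np.random.default_rng(0)
envelope = rng.standard_normal(2000)
mixing = rng.standard_normal(8)
eeg = np.outer(envelope, mixing) + 0.5 * rng.standard_normal((2000, 8))

w = train_envelope_decoder(eeg, envelope)
r = decoding_accuracy(eeg, envelope, w)
```

In practice, decoder significance is then assessed exactly as the abstract describes: by retraining or re-evaluating on randomly permuted envelope/EEG pairings and comparing the true correlation against that null distribution.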

  • Article Type: Journal Article
    People with single-sided deafness (SSD) or asymmetric hearing loss (AHL) have particular difficulty understanding speech in noisy listening situations and in sound localization. The objective of this multicenter study is to evaluate the effect of a cochlear implant (CI) in adults with single-sided deafness (SSD) or asymmetric hearing loss (AHL), particularly regarding sound localization and speech intelligibility with additional interest in electric-acoustic pitch matching. A prospective longitudinal study at 7 European tertiary referral centers was conducted including 19 SSD and 16 AHL subjects undergoing cochlear implantation. Sound localization accuracy was investigated in terms of root mean square error and signed bias before and after implantation. Speech recognition in quiet and speech reception thresholds in noise for several spatial configurations were assessed preoperatively and at several post-activation time points. Pitch perception with CI was tracked using pitch matching. Data up to 12 months post activation were collected. In both SSD and AHL subjects, CI significantly improved sound localization for sound sources on the implant side, and thus overall sound localization. Speech recognition in quiet with the implant ear improved significantly. In noise, a significant head shadow effect was found for SSD subjects only. However, the evaluation of AHL subjects was limited by the small sample size. No uniform development of pitch perception with the implant ear was observed. The benefits shown in this study confirm and expand the existing body of evidence for the effectiveness of CI in SSD and AHL. Particularly, improved localization was shown to result from increased localization accuracy on the implant side.
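The two localization metrics named above, root mean square error and signed bias, can be computed directly from response and target angles. The sketch below uses hypothetical azimuth values purely for illustration; the function name and data are assumptions, not the study's materials.

```python
import math

def localization_metrics(responses_deg, targets_deg):
    # RMS error: overall localization accuracy (always >= 0).
    # Signed bias: mean of (response - target); a nonzero value
    # reveals a systematic shift toward one side.
    errors = [r - t for r, t in zip(responses_deg, targets_deg)]
    rms = math.sqrt(sum(e * e for e in errors) / len(errors))
    bias = sum(errors) / len(errors)
    return rms, bias

# Hypothetical trial data: responses pulled toward the midline.
rms, bias = localization_metrics([-50, -20, 15, 40], [-60, -30, 30, 60])
```

Reporting both quantities is informative because a listener can have a small bias yet a large RMS error (inconsistent responses), or the reverse (consistent but shifted responses).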

  • Article Type: Journal Article
    Band importance functions for speech-in-noise recognition, typically determined in the presence of steady background noise, indicate a negligible role for extended high frequencies (EHFs; 8-20 kHz). However, recent findings indicate that EHF cues support speech recognition in multi-talker environments, particularly when the masker has reduced EHF levels relative to the target. This scenario can occur in natural auditory scenes when the target talker is facing the listener, but the maskers are not. In this study, we measured the importance of five bands from 40 to 20 000 Hz for speech-in-speech recognition by notch-filtering the bands individually. Stimuli consisted of a female target talker recorded from 0° and a spatially co-located two-talker female masker recorded either from 0° or 56.25°, simulating a masker either facing the listener or facing away, respectively. Results indicated peak band importance in the 0.4-1.3 kHz band and a negligible effect of removing the EHF band in the facing-masker condition. However, in the non-facing condition, the peak was broader and EHF importance was higher and comparable to that of the 3.3-8.3 kHz band in the facing-masker condition. These findings suggest that EHFs contain important cues for speech recognition in listening conditions with mismatched talker head orientations.
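The band-removal manipulation above can be illustrated with a crude FFT-domain notch that zeroes all bins inside a band. This is a simplified assumption for demonstration; the study's actual filters (slopes, windowing) are not specified here, and the demo tones are hypothetical.

```python
import numpy as np

def notch_band(signal, fs, f_lo, f_hi):
    # Crude notch: zero every FFT bin whose frequency lies in
    # [f_lo, f_hi], then transform back. Illustration only.
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    spectrum[(freqs >= f_lo) & (freqs <= f_hi)] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))

# Demo: a 1 kHz tone survives an 8-20 kHz (EHF) notch; a 10 kHz tone does not.
fs = 44100
t = np.arange(4410) / fs  # 100 ms, so bin spacing is exactly 10 Hz
x = np.sin(2 * np.pi * 1000 * t) + np.sin(2 * np.pi * 10000 * t)
y = notch_band(x, fs, 8000, 20000)
mag = np.abs(np.fft.rfft(y))
```

Measuring recognition with each band notched in turn, as the study does, then yields the band importance function: the drop in intelligibility attributable to removing that band.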

  • Article Type: Journal Article
    No abstract available.

  • Article Type: Journal Article
    Evidence accumulates that the cerebellum's role in the brain is not restricted to motor functions. Rather, cerebellar activity seems to be crucial for a variety of tasks that rely on precise event timing and prediction. Due to its complex structure and importance in communication, human speech requires a particularly precise and predictive coordination of neural processes to be successfully comprehended. Recent studies proposed that the cerebellum is indeed a major contributor to speech processing, but how this contribution is achieved mechanistically remains poorly understood. The current study aimed to reveal a mechanism underlying cortico-cerebellar coordination and demonstrate its speech-specificity. In a reanalysis of magnetoencephalography data, we found that activity in the cerebellum aligned to rhythmic sequences of noise-vocoded speech, irrespective of its intelligibility. We then tested whether these "entrained" responses persist, and how they interact with other brain regions, when a rhythmic stimulus stopped and temporal predictions had to be updated. We found that only intelligible speech produced sustained rhythmic responses in the cerebellum. During this "entrainment echo," but not during rhythmic speech itself, cerebellar activity was coupled with that in the left inferior frontal gyrus, and specifically at rates corresponding to the preceding stimulus rhythm. This finding represents evidence for specific cerebellum-driven temporal predictions in speech processing and their relay to cortical regions.

  • Article Type: Journal Article
    While listening, we commonly participate in simultaneous activities. For instance, at receptions people often stand while engaging in conversation. It is known that listening and postural control are associated with each other. Previous studies focused on the interplay of listening and postural control when the speech identification task had rather high cognitive control demands. This study aimed to determine whether listening and postural control interact when the speech identification task requires minimal cognitive control, i.e., when words are presented without background noise, or a large memory load. This study included 22 young adults, 27 middle-aged adults, and 21 older adults. Participants performed a speech identification task (auditory single task), a postural control task (posture single task) and combined postural control and speech identification tasks (dual task) to assess the effects of multitasking. The difficulty levels of the listening and postural control tasks were manipulated by altering the level of the words (25 or 30 dB SPL) and the mobility of the platform (stable or moving). The sound level was increased for adults with a hearing impairment. In the dual-task, listening performance decreased, especially for middle-aged and older adults, while postural control improved. These results suggest that even when cognitive control demands for listening are minimal, interaction with postural control occurs. Correlational analysis revealed that hearing loss was a better predictor than age of speech identification and postural control.

  • Article Type: Journal Article
    Speech-recognition tests are widely used in both clinical and research audiology. The purpose of this study was the development of a novel speech-recognition test that combines concepts of different speech-recognition tests to reduce training effects and allows for a large set of speech material. The new test consists of four different words per trial in a meaningful construct with a fixed structure, the so-called phrases. Various free databases were used to select the words and to determine their frequency. Highly frequent nouns were grouped into thematic categories and combined with related adjectives and infinitives. After discarding inappropriate and unnatural combinations, and eliminating duplications of (sub-)phrases, a total number of 772 phrases remained. Subsequently, the phrases were synthesized using a text-to-speech system. The synthesis significantly reduces the effort compared to recordings with a real speaker. After excluding outliers, measured speech-recognition scores for the phrases with 31 normal-hearing participants at fixed signal-to-noise ratios (SNR) revealed speech-recognition thresholds (SRT) for each phrase varying up to 4 dB. The median SRT was -9.1 dB SNR and thus comparable to existing sentence tests. The psychometric function's slope of 15 percentage points per dB is also comparable and enables efficient use in audiology. In summary, the principle of creating speech material in a modular system has many potential applications.
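The reported SRT (-9.1 dB SNR) and slope (15 percentage points per dB at the SRT) fully determine a psychometric function once a shape is assumed. The sketch below assumes a standard logistic shape, which is a common modeling choice but not stated by the abstract; the function name and defaults are illustrative.

```python
import math

def psychometric(snr_db, srt_db=-9.1, slope=0.15):
    # Logistic psychometric function: proportion correct vs. SNR.
    # slope is the gradient at the SRT in proportion correct per dB;
    # a logistic 1/(1+exp(-k*x)) has midpoint gradient k/4, so k = 4*slope.
    k = 4.0 * slope
    return 1.0 / (1.0 + math.exp(-k * (snr_db - srt_db)))

# At the SRT, recognition is 50% correct by definition.
p_mid = psychometric(-9.1)
```

A steep slope is what makes such a test efficient: around the SRT, a 1 dB change in SNR moves the score by about 15 percentage points, so adaptive procedures converge quickly on the threshold.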