Speech Acoustics

  • Article type: Journal Article
    The current study compared temporal and spectral acoustic contrast between vowel segments produced by speakers with dysarthria across three speech tasks: interactive, solo habitual, and solo clear.
    Nine speakers with dysarthria secondary to amyotrophic lateral sclerosis participated in the study. Each speaker was paired with a typical interlocutor over videoconferencing software. The speakers produced the vowels /i, ɪ, ɛ, æ/ in /h/-vowel-/d/ words. For the solo tasks, speakers read the stimuli aloud in both their habitual and clear speaking styles. For the interactive task, speakers produced a target stimulus for their interlocutor to select from among the four possibilities. We measured the duration difference between long and short vowels and the F1/F2 Euclidean distance between adjacent vowels, and we also determined how well the vowels could be classified based on their acoustic characteristics.
    Temporal contrast between long and short vowels was higher in the interactive task than in both solo tasks. Spectral distance between adjacent vowel pairs was also higher for some pairs in the interactive task than in the habitual speech task. Finally, vowel classification accuracy was highest in the interactive task.
    Overall, we found evidence that individuals with dysarthria produced vowels with greater acoustic contrast in structured interactions than they did in solo tasks. Furthermore, the speech adjustments they made to the vowel segments differed from those observed in solo speech.
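    The F1/F2 Euclidean distance between adjacent vowels mentioned in this abstract can be illustrated with a minimal sketch. The formant values below are hypothetical placeholders, not data from the study, and the function names are invented for illustration; the study's own measurement procedure is not described in the abstract.

```python
import math

# Hypothetical mean (F1, F2) values in Hz, for illustration only.
formants = {
    "i": (300.0, 2300.0),
    "ɪ": (400.0, 2000.0),
    "ɛ": (550.0, 1800.0),
    "æ": (700.0, 1700.0),
}


def f1f2_distance(a, b):
    """Euclidean distance between two (F1, F2) points, in Hz."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def adjacent_distances(order, table):
    """Distance for each pair of neighbouring vowels in the given order."""
    return {
        f"{v1}-{v2}": f1f2_distance(table[v1], table[v2])
        for v1, v2 in zip(order, order[1:])
    }


distances = adjacent_distances(["i", "ɪ", "ɛ", "æ"], formants)
for pair, d in distances.items():
    print(f"{pair}: {d:.1f} Hz")
```

    A larger distance for a pair means the two vowels are further apart in the F1/F2 plane, which is the sense of "spectral contrast" used above.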

  • Article type: Journal Article
    OBJECTIVE: The aim was to describe the acoustic, auditory-perceptive, and subjective voice changes under the Lombard effect (LE) in adductor laryngeal dystonia (AdLD) patients.
    METHODS: Subjective perception of vocal effort (OMNI Vocal Effort Scale, OMNI-VES), Maximum Phonation Time (MPT), and the perceptual severity of dysphonia (GRBAS scale) were assessed in a condition of stillness and under LE in 10 AdLD patients and in 10 patients with typical voice. Speakers were asked to produce the sustained vowel /a/ and to read a phonetically balanced text aloud. Using Praat software, the following acoustic parameters were analyzed: Mean Pitch (Hz), Minimum and Maximum Intensity (dB), the Fraction of Locally Unvoiced Frames, the Number of Voice Breaks, the Degree of Voice Breaks (%), and the Cepstral Peak Prominence-Smoothed (CPPS) (dB).
    RESULTS: Under LE, the AdLD group showed a decrease in both the G and S parameters of GRBAS and in subjective effort, and mean MPT increased significantly; in the controls there were no significant changes. In both groups under LE, the pitch and intensity of the sustained vowel /a/ increased significantly, consistent with LE. In the AdLD group, the mean gain in OMNI-VES score and the mean gain in each parameter of the speech analysis were significantly greater than those of the controls.
    CONCLUSIONS: Auditory feedback deprivation obtained under LE improves subjective, perceptual-auditory, and acoustic parameters of AdLD patients. These findings encourage further research to provide new knowledge of the role of the auditory system in the pathogenesis of AdLD and to develop new therapeutic strategies.
    LEVEL OF EVIDENCE: 4. Laryngoscope, 134:3754-3760, 2024.

  • Article type: Journal Article
    The aim of this study was to determine (a) the diagnostic accuracy of acoustic measures of glottal stop production (GSP; intensity differences, slopes, complete voicing cessation) in distinguishing between unilateral vocal fold paresis/paralysis (UVFP) patients and controls; (b) whether acoustic measures of GSP correlated significantly with an acoustic measure of voice disorder severity, the Acoustic Voice Quality Index (AVQI); and (c) whether acoustic measures from another type of voicing cessation, voiceless consonant production, also differed significantly between groups.
    Ninety-seven patients with unilateral paresis/paralysis and 35 controls with normal laryngostroboscopic signs produced two sets of five repeated [i] and four repeated [isi]. Tokens were randomized by type between groups and analyzed blinded using a customized Praat program that computed intensity differences and slopes between vowel maxima and glottal stop minima for inter-[i] tokens and between vowel maxima and voiceless consonant minima for intra-[isi] tokens. The number of voicing cessations for inter-[i] tokens was obtained.
    Onset and offset intensity differences and number of voicing cessations from inter-[i] tokens had the greatest areas under the curve (.854, .856, and .835, respectively). Correlation coefficients were significant (p < .01) between AVQI and all GSP acoustic measures, with weak/medium effect sizes. No significant differences were found between controls and participants with UVFP for acoustic measures from intra-[isi].
    Acoustic GSP measures demonstrated good diagnostic accuracy and some relationship to severity of voice disorder. The absence of significant differences in acoustic measures for medial voiceless fricative consonants between controls and participants with UVFP suggested that voicing cessation for voiceless fricatives differs from voicing cessation for GSP.
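    The "area under the curve" reported in this abstract is the area under the ROC curve, which equals the probability that a randomly chosen case scores higher on the measure than a randomly chosen control (ties count as half). The sketch below computes it by direct pairwise comparison. The function name and the scores are hypothetical, and which group is expected to score higher on a given measure is an assumption of the example, not something stated in the abstract.

```python
def roc_auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (case, control) pairs in which the case
    has the higher score, counting ties as half a win.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical scores, for illustration only.
print(roc_auc([3.0, 4.0], [1.0, 2.0]))  # perfectly separated groups
print(roc_auc([1.0, 3.0], [2.0, 4.0]))  # mostly overlapping groups
```

    A value of 0.5 means the measure separates the groups no better than chance; values near 1.0 (or near 0.0, if the direction is reversed) indicate good separation.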

  • Article type: Journal Article
    This paper evaluates an innovative framework for spoken dialect density prediction on children's and adults' African American English. A speaker's dialect density is defined as the frequency with which dialect-specific language characteristics occur in their speech. Rather than treating the presence or absence of a target dialect in a user's speech as a binary decision, a classifier is trained to predict the level of dialect density, providing a higher degree of specificity in downstream tasks. For this, self-supervised learning representations from HuBERT, handcrafted grammar-based features extracted from ASR transcripts, prosodic features, and other feature sets are tested as the input to an XGBoost classifier. The classifier is then trained to assign dialect density labels to short recorded utterances. High dialect density level classification accuracy is achieved for child and adult speech, with robust performance across age and regional varieties of the dialect. Additionally, this work is used as a basis for analyzing which acoustic and grammatical cues affect machine perception of dialect.

  • Article type: Journal Article
    OBJECTIVE: To identify factors that influence vocal habits during online meetings (OMs).
    METHODS: A prospective trial of forty participants without any known hearing or vocal cord disorders. Subjects participated in an OM divided into six randomly ordered sections, with alterations in audio/speaking equipment and language: the computer's speaker-microphone, a single earbud, two earbuds, or headphones; with/without video; native-language speaking (Hebrew) versus second-language speaking (English). Each section included free speech, sustained phonation, and a standardized passage. Participants ranked their vocal effort for each section. Three blinded raters independently scored the voice using the GRBAS scale, and acoustic analyses were performed.
    RESULTS: No significant difference in self-reported vocal effort was demonstrated between sections. Second-language speaking resulted in significantly increased intensity (p < 0.0001), frequency (p = 0.015), GRBAS (p = 0.008), and strain (p < 0.0001) scores. Using the computer\'s speaker/microphone resulted in significantly higher strain (p < 0.0001). Using headphones, single or two earbuds resulted in lower intensity and a lower strain score. No differences were detected between OMs with or without video.
    CONCLUSIONS: Using the computer's microphone/speaker or speaking in a second language during OMs may result in vocal habits associated with vocal trauma.

  • Article type: Journal Article
    A comprehensive examination of the acoustics of Contemporary Standard Bulgarian vowels is lacking to date, and this article aims to fill that gap. Six acoustic variables (the first three formant frequencies, duration, mean f0, and mean intensity) of 11 615 vowel tokens from 140 speakers were analysed using linear mixed models, multivariate analysis of variance, and linear discriminant analysis. The vowel system, which comprises six phonemes in stressed position, [ε a ɔ i ɤ u], was examined from four angles. First, vowels in pretonic syllables were compared to other unstressed vowels, and no spectral or durational differences were found, contrary to an oft-repeated claim that pretonic vowels reduce less. Second, comparisons of stressed and unstressed vowels revealed significant differences in all six variables for the non-high vowels [ε a ɔ]. No spectral or durational differences were found in [i ɤ u], which disproves another received view that high vowels are lowered when unstressed. Third, non-high vowels were compared with their high counterparts; the height contrast was completely neutralized in unstressed [a-ɤ] and [ɔ-u], while [ε-i] remained distinct. Last, the acoustic correlates of vowel contrasts were examined, and it was demonstrated that only F1 and F2 frequencies and duration were systematically employed in differentiating vowel phonemes.

  • Article type: Journal Article
    Patients with schizophrenia have a flat and monotonous intonation. The purpose of the study was to find the variables of flat speech that differed between patients and healthy controls in Danish.
    We compared 18 drug-naïve patients with schizophrenia (5 men, 13 women) with 18 controls, aged 18-35 years, all of whom had grown up in Copenhagen speaking modern Standard Danish (rigsdansk). To elicit spontaneous speech, we used two tasks that place different demands on the speaker: retelling a film clip and telling a story from pictures in a book. A linguist used the computer program Praat to extract the phonetic linguistic parameters.
    We found different results for the two elicitation tasks (Task 1: retelling a film clip; Task 2: telling a story from pictures in a book). Intensity variation was higher in the controls in Task 1, and pitch variation was higher in the controls in Task 2. We found higher intensity variation in the stresses in the controls in Task 1, and fewer syllables between stresses in the controls. We also found higher F1 variation in the patient group in Tasks 1 and 2, and higher F2 variation in the control group in both tasks.
    The results varied between patients and controls, but the task demands also made a difference. Further research is needed to elucidate the possibilities of acoustic measures in diagnostics or linguistic treatment related to schizophrenia.

  • Article type: Journal Article
    Previous research has shown that /l/ in Spanish displays patterns of articulatory variability that are determined by a complex interaction of phonetic, phonological and dialectal factors. In this study, we report the results of an experiment using Ultrasound Tongue Imaging (UTI) that tests /l/-articulations in a dialectal cross-section of Spanish speakers. We show that lengthening of /l/ in phrase-edge contexts is accompanied by articulatory distinctions (e.g. root/dorsum retraction) for some speakers, whereas others produce lengthened realisations of /l/ in these contexts without observable differences in tongue position. We also find acoustic evidence for reduction in utterance-medial intervocalic and preconsonantal environments (duration, intensity, and F1 frequency measures are discussed). However, articulatory correlates of reduction are not consistently observed across speakers in these contexts. As well as relating the results to prosodically-driven strengthening and reduction patterns, our findings are of relevance to debates about resyllabification in Spanish. Specifically, we argue that our results cannot be straightforwardly accommodated under a phonological analysis which assumes that word-final consonants regularly resyllabify across word boundaries prevocalically.

  • Article type: Journal Article
    Esophageal (ES) speech, tracheoesophageal (TE) speech, and the electrolarynx (EL) are common methods of communication following the removal of the larynx. Our recent study demonstrated that intelligibility may increase for Cantonese alaryngeal speakers using clear speech (CS) compared to their everyday "habitual speech" (HS), but the reason is still unclear [Hui, Cox, Huang, Chen, and Ng (2022). Folia Phoniatr. Logop. 74, 103-111]. The purpose of this study was to assess the acoustic characteristics of vowels and tones produced by Cantonese alaryngeal speakers using HS and CS. Thirty-one alaryngeal speakers (9 EL, 10 ES, and 12 TE speakers) read The North Wind and the Sun passage in HS and CS. Vowel formants, vowel space area (VSA), speaking rate, pitch, and intensity were examined, and their relationship to intelligibility was evaluated. Statistical models suggest that larger VSAs significantly improved intelligibility, but slower speaking rate did not. Vowel and tonal contrasts did not differ between HS and CS for any of the three groups, but the amount of information encoded in fundamental frequency and intensity differences between high and low tones correlated positively with intelligibility for the TE and ES groups, respectively. Continued research is needed to understand the effects of different speaking conditions on improving the acoustic and perceptual characteristics of Cantonese alaryngeal speech.
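    Vowel space area is commonly computed as the area of the polygon whose vertices are the mean (F1, F2) positions of the corner vowels. The sketch below uses the shoelace formula for that polygon. The choice of corner vowels and the formant values are hypothetical; the abstract does not say which vowels or which procedure this study used.

```python
def polygon_area(points):
    """Area of a simple polygon via the shoelace formula.

    `points` is a list of (x, y) vertices given in order around the
    polygon (clockwise or counter-clockwise).
    """
    if len(points) < 3:
        raise ValueError("a polygon needs at least three vertices")
    twice_area = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to the first vertex
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical corner-vowel means as (F1, F2) in Hz, for illustration only.
corner_vowels = [(300.0, 2300.0), (700.0, 1700.0), (300.0, 800.0)]
vsa = polygon_area(corner_vowels)
print(f"VSA = {vsa:.0f} Hz^2")  # 300000 Hz^2 for these placeholder values
```

    The vertices must be listed in order around the perimeter; listing them in a crossing order would give the wrong area.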

  • Article type: Journal Article
    An evaluation was made of the impact of orthognathic surgery (OS) on speech, addressing in particular the effects of skeletal and airway changes on voice resonance characteristics and articulatory function. A prospective study was carried out involving 29 consecutive patients subjected to OS. Preoperative, short-term postoperative, and long-term postoperative evaluations were made of anatomical changes (skeletal and airway measurements), speech evolution (assessed objectively by acoustic analysis: fundamental frequency, local jitter, and local shimmer of each vowel, and formants F1 and F2 of vowel /a/), and articulatory function (use of compensatory musculature, point of articulation, and speech intelligibility). These were also assessed subjectively by means of a visual analogue scale. Articulatory function after OS showed immediate improvement and had progressed further at one year of follow-up. This improvement correlated significantly with the anatomical changes, and was also notably perceived by the patients. On the other hand, although a slight modification in vocal resonance was reported and seen to correlate with anatomical changes of the tongue, hyoid bone, and airway, it was not subjectively perceived by the patients. In conclusion, the results demonstrated that OS had beneficial effects on articulatory function and produced subjectively imperceptible changes in a patient's voice. Patients subjected to OS, apart from benefitting from improved articulatory function, should not be afraid that they will not recognise their voice after treatment.
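    Local jitter, one of the acoustic measures listed in this abstract, is usually defined as the mean absolute difference between consecutive glottal periods divided by the mean period. The sketch below implements that standard definition on a list of period durations. The period values are hypothetical, and the abstract does not state which software or settings the study used.

```python
def local_jitter(periods):
    """Local jitter as a ratio (multiply by 100 for percent).

    `periods` is a sequence of consecutive glottal period durations
    in seconds. Jitter is the mean absolute difference between
    neighbouring periods, divided by the mean period.
    """
    if len(periods) < 2:
        raise ValueError("need at least two periods")
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return mean_diff / mean_period


# Hypothetical periods around 10 ms (about 100 Hz), for illustration only.
example_periods = [0.0100, 0.0101, 0.0099, 0.0100]
print(f"local jitter = {100 * local_jitter(example_periods):.2f} %")
```

    Perfectly regular periods give a jitter of zero; larger values indicate more cycle-to-cycle variation in fundamental period.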