Speech Acoustics

  • Article type: Journal Article
    The Kam language has undergone historical tonal splits, resulting in a complex tonal system. However, little is known about the acoustic characteristics associated with aspiration-based tone splitting. This study investigates the acoustic cues related to tonal registers and laryngeal configurations in Donglei Kam, a dialect of Southern Kam. Sixteen native speakers of Donglei Kam participated, producing lexical tones. Statistical analyses examined the acoustic distinctions between tonal registers, using measurements of voice onset time, spectral tilt, noise, and energy. The results indicated that Donglei Kam retained a two-way aspiration contrast, albeit with a trend toward gradual loss. Additionally, breathy voice was detected in the Ciyin tonal register, characterized by elevated spectral tilt values and spectral noise throughout the vowels. Moreover, machine learning classifiers effectively identified tonal registers from voice-quality data, suggesting that the phonation contrast between breathy and modal voice could contribute to the tonal split alongside pitch contrast. In summary, these findings enhance our understanding of the acoustic implementation of breathiness in Kam and offer insights into the role of laryngeal contrast in tonal splits.

  • Article type: Journal Article
    Abnormal speech prosody has been widely reported in individuals with autism. Many studies of children and adults with autism spectrum disorder who speak a non-tonal language have shown deficits in using prosodic cues to mark focus. However, focus marking by autistic children who speak a tonal language has rarely been examined. Cantonese-speaking children may face additional difficulties because tonal languages require them to use prosodic cues to achieve multiple functions simultaneously, such as lexical contrast and focus marking. This study addresses this research gap by acoustically evaluating the use of Cantonese speech prosody to mark information structure by Cantonese-speaking children with and without autism spectrum disorder. We designed speech production tasks to elicit natural broad and narrow focus production among these children in sentences with different tone combinations. Acoustic correlates of prosodic focus marking, such as f0, duration, and intensity of each syllable, were analyzed to examine the effects of participant group, focus condition, and lexical tone. Our results showed differences in focus marking patterns between Cantonese-speaking children with and without autism spectrum disorder. The autistic children not only showed insufficient on-focus expansion in terms of f0 range and duration when marking focus, but also produced less distinctive tone shapes in general. There was no evidence that prosodic complexity (i.e., sentences with single tones or combinations of tones) significantly affected focus marking in these autistic children or their typically developing (TD) peers.

  • Article type: Journal Article
    Unilateral vocal cord paralysis is frequently observed in patients who undergo thyroid surgery. This study explored the correlation between acoustic voice analysis (an objective measure) and the Voice Handicap Index (VHI, a self-assessment tool). One hundred and forty patients who had thyroid surgery with or without postoperative unilateral vocal cord paralysis (PVCP and NPVCP) were included. The patients were evaluated with the VHI and Dysphonia Severity Index (DSI) tools. VHI scores were significantly higher in PVCP patients than in NPVCP patients. Jitter (%) and shimmer (%) were significantly increased, whereas DSI was significantly decreased, in PVCP patients. Receiver operating characteristic (ROC) curve analysis revealed that VHI scores were associated with the diagnosis of PVCP, with the VHI total score yielding an area under the curve (AUC) of 0.81. Among acoustic parameters, DSI was strongly associated with PVCP (AUC = 0.82, 95% CI = 0.75 to 0.89). Moreover, we found a correlation between VHI scores and voice acoustic parameters. Among them, DSI had a moderate correlation with functional and VHI scores, with R values of 0.41 and 0.49, respectively. VHI scores and acoustic parameters were associated with the diagnosis of PVCP.
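
The AUC figures reported above summarize how well a single score separates the two patient groups. As a rough illustration only, the sketch below computes an AUC from its rank interpretation: the probability that a randomly chosen PVCP patient scores higher than a randomly chosen NPVCP patient, with ties counted as one half. The function name `roc_auc` and the score lists are hypothetical and invented for this example; they are not data or code from the study.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the share of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical VHI total scores, made up purely for illustration.
pvcp_scores = [42, 55, 38, 61, 47]
npvcp_scores = [12, 30, 45, 8, 22]

auc = roc_auc(pvcp_scores, npvcp_scores)
print(f"AUC = {auc:.2f}")
```

For a measure such as DSI, where lower values accompany the condition, the scores can be negated before calling the function so that "higher" still means "more indicative of PVCP". The pairwise loop is quadratic in sample size, which is acceptable for cohorts of a few hundred patients.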

  • Article type: Journal Article
    The quality of speech input influences the efficiency of L1 and L2 acquisition. This study examined modifications in infant-directed speech (IDS) and foreigner-directed speech (FDS) in Standard Mandarin, a tonal language, and explored how IDS and FDS features were manifested in disyllabic words and in a longer discourse. The study aimed to determine which characteristics of IDS and FDS were enhanced in comparison with adult-directed speech (ADS), and how IDS and FDS differed when measured on a common set of acoustic parameters. For words, tone-bearing vowel duration, the mean and range of fundamental frequency (F0), and the lexical tone contours were enhanced in IDS and FDS relative to ADS, except for the dipping Tone 3, which exhibited an unexpected lowering in FDS but no modification in IDS when compared with ADS. For the discourse, IDS and FDS emphasized different aspects of temporal and F0 enhancement: the mean F0 was higher in IDS, whereas the total discourse duration was greater in FDS. These findings add to the growing literature on L1 and L2 speech input characteristics and their role in language acquisition.

  • Article type: Journal Article
    This study investigates the effect of detoxification on the acoustic features of Mandarin speech. Speech recordings were collected from 66 male abstinent heroin users with different durations of drug detoxification: early abstinent users with a detoxification duration of less than 2 years, sustained abstinent users with 2 years of detoxification, and long-term abstinent users with a detoxification duration of more than 2 years. The acoustic analyses showed that early abstinent users exhibited lower loudness and relative energies of F1, F2, and F3, higher H1-A3, fewer loudness peaks per second, and a longer average duration of unvoiced segments, compared with the sustained and long-term abstinent users. The findings suggest that detoxification may lead to a rehabilitation process in the speech production of abstinent heroin users (e.g., less vocal hoarseness). This study offers insights into the effect of detoxification on speech production and provides a theoretical basis for the speech rehabilitation and detoxification treatment of heroin users.

  • Article type: Journal Article
    This study investigated how 40 Chinese learners of English as a foreign language (EFL learners) differed from 40 native English speakers in the production of four English tense-lax contrasts, /i-ɪ/, /u-ʊ/, /ɑ-ʌ/, and /æ-ε/, by examining acoustic measurements of duration, the first three formant frequencies, and the slope of the first formant movement (F1 slope). The dynamic formant trajectory was modeled using discrete cosine transform coefficients to capture the time-varying properties of formant trajectories. A discriminant analysis was employed to illustrate the extent to which Chinese EFL learners relied on different acoustic parameters. This study found that: (1) Chinese EFL learners overemphasized durational differences and weakened spectral differences for the /i-ɪ/, /u-ʊ/, and /ɑ-ʌ/ pairs, although they maintained sufficient spectral differences for /æ-ε/. In contrast, native English speakers predominantly used spectral differences across all four pairs; (2) in non-low tense-lax contrasts, unlike native English speakers, Chinese EFL learners failed to exhibit different F1 slope values, indicating non-nativelike tongue-root placement during articulation. The findings underscore the contribution of dynamic spectral patterns to the differentiation between English tense and lax vowels, and reveal the influence of precise articulatory gestures on the realization of the tense-lax contrast.
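
Modeling a formant trajectory with discrete cosine transform (DCT) coefficients compresses a time-varying track into a few numbers. The sketch below is one minimal way to do this, assuming a DCT-II scaled so that the zeroth coefficient equals the track mean; the abstract does not state which DCT variant, scaling, or number of coefficients the authors used, so those choices, the function name `dct_coefficients`, and the sample F1 values are all assumptions made for illustration.

```python
import math


def dct_coefficients(track, n_coeffs=3):
    """First n_coeffs DCT-II coefficients of a trajectory.

    Scaling is chosen so that coefficient 0 is the mean of the track;
    coefficient 1 then reflects overall tilt and coefficient 2 curvature.
    """
    n = len(track)
    coeffs = []
    for k in range(n_coeffs):
        total = sum(
            x * math.cos(math.pi * k * (i + 0.5) / n)
            for i, x in enumerate(track)
        )
        scale = 1.0 / n if k == 0 else 2.0 / n
        coeffs.append(scale * total)
    return coeffs


# Hypothetical F1 samples (Hz) at equally spaced points through a vowel.
f1_track = [420.0, 450.0, 480.0, 500.0, 510.0, 505.0, 490.0]

c0, c1, c2 = dct_coefficients(f1_track)
print(f"mean-like c0 = {c0:.1f}, tilt-like c1 = {c1:.1f}, curvature-like c2 = {c2:.1f}")
```

With this sign convention, a track that rises over the vowel yields a negative first coefficient, because the first cosine basis function falls from positive to negative across the window. Coefficients like these could then serve as inputs to a discriminant analysis alongside duration.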

  • Article type: Journal Article
    Speakers can place prosodic prominence at any location within a sentence, generating focus prosody that lets listeners perceive new information. This study aimed to investigate age-related changes in the bottom-up processing of focus perception in Jianghuai Mandarin by clarifying the perceptual cues and the auditory processing abilities involved in identifying focus locations. Young, middle-aged, and older speakers of Jianghuai Mandarin completed a focus identification task and an auditory perception task. The results showed that increasing age led to a decrease in listeners' accuracy in identifying focus locations, with all participants performing worst when dynamic pitch cues were inaccessible. Auditory processing abilities did not predict focus perception performance in young and middle-aged listeners but accounted significantly for the variance in older adults' performance. These findings suggest that age-related deterioration in focus perception can be largely attributed to a decline in auditory processing of perceptual cues. Poor ability to extract frequency modulation cues may be the most important underlying psychoacoustic factor in older adults' difficulties in perceiving focus prosody in Jianghuai Mandarin. The results contribute to our understanding of the bottom-up mechanisms involved in linguistic prosody processing in aging adults, particularly in tonal languages.

  • Article type: Journal Article
    Spatial separation and fundamental frequency (F0) separation are effective cues for improving the intelligibility of target speech in multi-talker scenarios. Previous studies predominantly focused on spatial configurations within the frontal hemifield, overlooking the ipsilateral side and the entire median plane, where localization confusion often occurs. This study investigated the impact of spatial and F0 separation on intelligibility under these underexplored spatial configurations. Speech reception thresholds were measured in three experiments for scenarios involving two to four talkers, either in the ipsilateral horizontal plane or in the entire median plane, using monotonized speech with varying F0s as stimuli. The results revealed that spatial separation in symmetrical positions (front-back symmetry in the ipsilateral horizontal plane, or front-back and up-down symmetry in the median plane) contributes positively to intelligibility. Both target direction and relative target-masker separation influence the masking release attributed to spatial separation. When the number of talkers exceeds two, the masking release from spatial separation diminishes. Nevertheless, F0 separation remains a remarkably effective cue and could even facilitate spatial separation in improving intelligibility. Further analysis indicated that current intelligibility models have difficulty accurately predicting intelligibility in the scenarios explored in this study.

  • Article type: Journal Article
    This study examined whether the "Three Bears Passage" (TB), a standard Mandarin reading passage, could elicit significant vocal range variations in individuals with voice disorders. The relative sensitivity of TB versus another existing standard reading passage, "Passage in Mandarin" (PM), for differentiating between individuals with and without voice disorders was also evaluated.
    Forty-two individuals with normal voice and 30 individuals with voice disorders participated in the study. Maximum fundamental frequency (f0), minimum f0, mean f0, f0 range, maximum vocal intensity, minimum intensity, mean intensity, and intensity range of all participants reading the two passages aloud were measured with Praat to construct speech range profiles (SRPs).
    A significantly larger vocal range was found for TB than for PM in individuals with voice disorders, including significantly higher maximum f0, mean f0, maximum intensity, and mean intensity, and significantly larger f0 range and intensity range. A significantly more limited vocal range was observed in individuals with voice disorders than in those without, with more obviously restricted SRPs when reading TB aloud than when reading PM. Receiver operating characteristic analysis suggested that TB was more sensitive than PM in distinguishing between individuals with and without voice disorders.
    Our findings support the potential of TB as a standard clinical assessment tool for evaluating pathological changes in vocal range. Future studies should explore whether therapeutic approaches based on the passage or variations of it could be developed to overcome functional limitations and restrictions in vocal range for specific voice disorders.
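
The f0 and intensity measures listed above can be derived from frame-by-frame tracks once these have been extracted (the study used Praat for measurement). The sketch below shows one plausible way to summarize such tracks. It assumes unvoiced frames are coded as 0 Hz and also reports the f0 range in semitones as 12·log2(max/min); both conventions, the function name `range_summary`, and the sample values are assumptions for illustration and are not taken from the study.

```python
import math


def range_summary(f0_hz, intensity_db):
    """Summarize f0 and intensity tracks into range-profile measures.

    Frames with f0 == 0 are treated as unvoiced and ignored for the
    f0 statistics (an assumed convention); intensity uses all frames.
    """
    voiced = [f for f in f0_hz if f > 0]
    f0_min, f0_max = min(voiced), max(voiced)
    int_min, int_max = min(intensity_db), max(intensity_db)
    return {
        "f0_min": f0_min,
        "f0_max": f0_max,
        "f0_mean": sum(voiced) / len(voiced),
        "f0_range_hz": f0_max - f0_min,
        # Semitone range: 12 semitones per doubling of frequency.
        "f0_range_st": 12.0 * math.log2(f0_max / f0_min),
        "int_min": int_min,
        "int_max": int_max,
        "int_mean": sum(intensity_db) / len(intensity_db),
        "int_range_db": int_max - int_min,
    }


# Hypothetical per-frame values, invented for illustration.
f0_track = [110.0, 0.0, 180.0, 220.0, 0.0, 150.0]
intensity_track = [58.0, 45.0, 66.0, 71.0, 48.0, 63.0]

summary = range_summary(f0_track, intensity_track)
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

Expressing the f0 range in semitones makes ranges comparable across speakers with different habitual pitch, which a raw difference in Hz does not.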

  • Article type: Journal Article
    The current study investigated English prosodic focus marking by autistic and typically developing (TD) Cantonese trilingual children, and examined how it differed from that of native English-speaking children.
    Forty-eight participants were recruited, with 16 speakers in each of the three groups (Cantonese-speaking autistic [CASD], Cantonese-speaking TD [CTD], and English-speaking TD [ETD] children), and prompt questions were designed to elicit the desired focus type (i.e., broad, narrow, and contrastive focus). Mean duration, mean fundamental frequency (F0), F0 range, mean intensity, and F0 curves were used as the acoustic correlates for linear mixed-effects model fitting and functional data analyses in relation to groups and focus conditions (i.e., broad, narrow, and contrastive pre-, on-, and post-focus).
    The CTD group showed post-focus compression (PFC) patterns by reducing mean duration, narrowing F0 range, and lowering mean F0, F0 curve, and mean intensity for words under both narrow and contrastive post-focus conditions, while the CASD group only shortened mean duration and lowered F0 curves. However, neither the CTD group nor the CASD group showed much evidence of on-focus expansion (OFE). The ETD group marked OFE by increasing mean duration, mean F0, and mean intensity, and by raising the F0 curve, for words under on-focus conditions.
    The CTD group used more acoustic cues than the CASD group for PFC. The ETD group differed from the CASD and CTD groups in the use of OFE. Furthermore, both the CASD and CTD groups showed positive first-language transfer in the use of duration and intensity and, potentially, successful acquisition of the use of F0 for prosodic focus marking. Meanwhile, the fact that the Cantonese-speaking and English-speaking groups differed in the use of OFE, but not PFC, might indicate that Cantonese-speaking children acquire PFC before OFE.