speech

  • Article type: Journal Article
    BACKGROUND: Speech dysfunction is one of the earliest motor manifestations of Parkinson's disease (PD) and can be measured with a smartphone.
    OBJECTIVE: The aim was to develop a fully automated and noise-resistant smartphone-based system that can unobtrusively screen for prodromal parkinsonian speech disorder in subjects with isolated rapid eye movement sleep behavior disorder (iRBD) in a real-world scenario.
    METHODS: This cross-sectional study assessed regular, everyday voice call data from individuals with iRBD compared to early PD patients and healthy controls via a developed smartphone application. The participants also performed an active, regular reading of a short passage on their smartphone. Smartphone data were continuously collected for up to 3 months after the standard in-person assessments at the clinic.
    RESULTS: A total of 3525 calls, yielding 5990 minutes of preprocessed speech, were extracted from 72 participants: 21 iRBD patients, 26 PD patients, and 25 controls. With an area under the curve of 0.85 between iRBD patients and controls, the combination of passive and active smartphone data provided an evaluation comparable to, or even more sensitive than, laboratory examination with a high-quality microphone. The features most sensitive to prodromal neurodegeneration in iRBD were imprecise vowel articulation during phone calls (P = 0.03) and monopitch in reading (P = 0.05). Eighteen minutes of speech, corresponding to approximately nine calls, was sufficient to obtain the best screening sensitivity.
    CONCLUSIONS: We consider the developed tool widely applicable to deep longitudinal digital phenotyping data with future applications in neuroprotective trials, deep brain stimulation optimization, neuropsychiatry, speech therapy, population screening, and beyond. © 2024 The Author(s). Movement Disorders published by Wiley Periodicals LLC on behalf of International Parkinson and Movement Disorder Society.
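    The area under the curve reported above can be read as the probability that a randomly chosen iRBD participant receives a higher classifier score than a randomly chosen control. The following is a minimal sketch of that pairwise definition only; the function name and the scores are hypothetical and are not data or code from the study.

```python
def auc_from_scores(pos_scores, neg_scores):
    """Return the AUC as P(positive score > negative score).

    Ties between a positive and a negative score count as half a win,
    which matches the Mann-Whitney U formulation of the AUC.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical classifier scores, invented purely for illustration.
irbd_scores = [0.9, 0.7, 0.6, 0.4]
control_scores = [0.5, 0.3, 0.2, 0.1]

auc = auc_from_scores(irbd_scores, control_scores)
print(auc)  # 15 of the 16 pairs are ordered correctly
```

    An AUC of 0.5 would mean the scores do not separate the groups at all, and 1.0 would mean perfect separation.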

  • Article type: Journal Article
    Emotions in speech are expressed in various ways, and speech emotion recognition (SER) models may perform poorly on unseen corpora whose emotional factors differ from those expressed in the training databases. To construct an SER model that is robust to unseen corpora, regularization approaches and metric losses have been studied. In this paper, we propose an SER method that incorporates the relative difficulty and labeling reliability of each training sample. Inspired by the Proxy-Anchor loss, we propose a novel loss function that gives higher gradients to the samples in a given minibatch whose emotion labels are more difficult to estimate. Because annotators may label an emotion based on an expression that resides in the conversational context or in another modality but is not apparent in the given speech utterance, some emotion labels may be unreliable, and these unreliable labels may affect the proposed loss function more severely. We therefore propose applying label smoothing to the samples misclassified by a pre-trained SER model. Experimental results showed that SER performance on unseen corpora improved when the proposed loss function was combined with label smoothing on the misclassified data.
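    The selective label smoothing step described above can be sketched as follows. This is a minimal illustration under stated assumptions: it uses the standard uniform label-smoothing formula, and the function names, the value of `epsilon`, and the toy labels are all hypothetical. It does not implement the proposed Proxy-Anchor-inspired loss, whose details the abstract does not give.

```python
def smooth_label(num_classes, label, epsilon):
    """Uniform label smoothing: (1 - epsilon) * one_hot + epsilon / num_classes."""
    off_value = epsilon / num_classes
    target = [off_value] * num_classes
    target[label] += 1.0 - epsilon
    return target


def build_targets(labels, pretrained_preds, num_classes, epsilon=0.1):
    """Smooth only the samples that a pre-trained model misclassified.

    Correctly classified samples keep a hard one-hot target (epsilon = 0),
    on the assumption that their labels are more likely to be reliable.
    """
    targets = []
    for label, pred in zip(labels, pretrained_preds):
        eps = epsilon if label != pred else 0.0
        targets.append(smooth_label(num_classes, label, eps))
    return targets


# Toy example with 4 emotion classes: the second sample is misclassified.
targets = build_targets(labels=[0, 2], pretrained_preds=[0, 1],
                        num_classes=4, epsilon=0.2)
for t in targets:
    print([round(v, 2) for v in t])
```

    The resulting soft targets would then be used in place of one-hot labels when training the final model, so that a possibly unreliable label contributes a weaker training signal.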

  • Article type: Journal Article
    Background: Previous studies have assessed the capability of PRAAT for acoustic voice analysis in total laryngectomized (TL) patients, although this software was designed for acoustic analysis of laryngeal voice. Recently, we have witnessed the development of specialized acoustic analysis software, Tracheoesophageal Voice Analysis (TEVA). This study aims to compare the analysis with both programs in TL patients. Methods: Observational analytical study of 34 TL patients where a quantitative acoustic analysis was performed for stable phonation with vowels [a] and [i] as well as spectrographic characterization using the TEVA and PRAAT software. Results: The Voice Handicap Index (VHI-10) showed a mean score of 11.29 ± 11.16 points, categorized as a moderate handicap. TEVA analysis found lower values in the fundamental frequency vs. PRAAT (p < 0.05). A significant increase in shimmer values was observed with TEVA (>20%). No significant differences were found between spectrographic analysis with TEVA and PRAAT. Conclusions: Tracheoesophageal speech is an alaryngeal voice, characterized by a higher degree of irregularity and noise compared to laryngeal speech. Consequently, it necessitates a more tailored approach using objective assessment tools adapted to these distinct features, like TEVA, that are designed specifically for TL patients. This study provides statistical evidence supporting its reliability and suitability for the evaluation and tracking of tracheoesophageal speakers.

  • Article type: Journal Article
    Speech alterations have been reported in manifest Huntington's disease (HD) and in premanifest mutation carriers (preHD). The aim of our study was to explore these alterations in preHD and whether they can be used as biomarkers. Thirteen preHD mutation carriers performed a reading task, a sustained phonation task, and syllable repetition tasks at baseline and after 21 months, and also underwent clinical examination and MRI. Syllable repetition capacity and the self-chosen velocity of single-syllable repetition differed significantly between time points. There were no changes in clinical ratings or MRI volumetry. Speech measurements might be sensitive tools for monitoring subclinical changes in preHD.

  • Article type: Journal Article
    Findings from language sample analyses can provide efficient and effective indicators of cognitive impairment in older adults.
    This study used newly automated core lexicon analyses of Cookie Theft picture descriptions to assess differences in typical use across three groups.
    Participants included adults without diagnosed cognitive impairments (Control), adults diagnosed with Alzheimer's disease (ProbableAD), and adults diagnosed with mild cognitive impairment (MCI). Cookie Theft picture descriptions were transcribed and analyzed using CLAN.
    Results showed that the ProbableAD group used significantly fewer core lexicon words overall than the MCI and Control groups. For core lexicon content words (nouns, verbs), however, both the MCI and ProbableAD groups produced significantly fewer words than the Control group. The groups did not differ in their use of core lexicon function words. The ProbableAD group was also slower to produce most of the core lexicon words than the MCI and Control groups. The MCI group was slower than the Control group for only two of the core lexicon content words. All groups mentioned a core lexicon word in the top left quadrant of the picture early in the description. The ProbableAD group was then significantly slower than the other groups to mention a core lexicon word in the other quadrants.
    This standard and simple-to-administer task reveals group differences in overall core lexicon scores and the amount of time until the speaker produces the key items. Clinicians and researchers can use these tools for both early assessment and measurement of change over time.

  • Article type: Journal Article
    In the McGurk effect, visual speech from the face of the talker alters the perception of auditory speech. The diversity of human languages has prompted many intercultural studies of the effect in both Western and non-Western cultures, including native Japanese speakers. Studies of large samples of native English speakers have shown that the McGurk effect is characterized by high variability in the susceptibility of different individuals to the illusion and in the strength of different experimental stimuli to induce the illusion. The noisy encoding of disparity (NED) model of the McGurk effect uses principles from Bayesian causal inference to account for this variability, separately estimating the susceptibility and sensory noise for each individual and the strength of each stimulus. To determine whether variation in McGurk perception is similar between Western and non-Western cultures, we applied the NED model to data collected from 80 native Japanese-speaking participants. Fifteen different McGurk stimuli that varied in syllable content (unvoiced auditory "pa" + visual "ka" or voiced auditory "ba" + visual "ga") were presented interleaved with audiovisual congruent stimuli. The McGurk effect was highly variable across stimuli and participants, with the percentage of illusory fusion responses ranging from 3 to 78% across stimuli and from 0 to 91% across participants. Despite this variability, the NED model accurately predicted perception, predicting fusion rates for individual stimuli with 2.1% error and for individual participants with 2.4% error. Stimuli containing the unvoiced pa/ka pairing evoked more fusion responses than the voiced ba/ga pairing. Model estimates of sensory noise were correlated with participant age, with greater sensory noise in older participants. The NED model of the McGurk effect offers a principled way to account for individual and stimulus differences when examining the McGurk effect in different cultures.

  • Article type: Journal Article
    OBJECTIVE: This study aimed to compare the influence of four different maxillary removable orthodontic retainers on speech.
    METHODS: Eligibility criteria for sample selection were: subjects aged 20-40 years with acceptable occlusion who were native speakers of Portuguese. The volunteers (n=21) were divided into four groups, randomized with a 1:1:1:1 allocation ratio. The four groups used, in random order, the four types of retainers full-time for 21 days each, with a washout period of 7 days. The removable maxillary retainers were: conventional wraparound, wraparound with an anterior hole, U-shaped wraparound, and thermoplastic retainer. Three volunteers were excluded. The final sample comprised 18 subjects (11 male; 7 female) with a mean age of 27.08 years (SD=4.65). Speech was evaluated in recordings of vocal excerpts made before, immediately after, and 21 days after the installation of each retainer, with auditory-perceptual and acoustic analysis of formant frequencies F1 and F2 of the vowels. Repeated measures ANOVA and Friedman with Tukey tests were used for statistical comparison.
    RESULTS: Speech changes increased immediately after conventional wraparound and thermoplastic retainer installation, and reduced after 21 days, but not to normal levels. However, this increase was statistically significant only for the wraparound with anterior hole and the thermoplastic retainer. Formant frequencies of vowels were altered at initial time, and the changes remained in conventional, U-shaped and thermoplastic appliances after three weeks.
    CONCLUSIONS: The thermoplastic retainer was more harmful to speech than the wraparound appliances. The conventional and U-shaped retainers interfered less with speech. The three-week period was not sufficient for speech adaptation.

  • Article type: Journal Article
    Background: Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder whose etiology has not yet been clearly determined. There is increasing evidence that synaptic and dendritic changes are involved in the etiology of ASD. The aim of this study is to determine whether serum Thrombospondin-1 and Thrombospondin-2 differ between ASD patients and healthy controls. This study also investigates possible correlations between the clinical symptomatology of ASD and serum Thrombospondin-1 and Thrombospondin-2 levels. Method: A total of 44 children with ASD and 21 healthy controls under 6 years of age were included in the study. Symptom severity and behavioral problems among children with ASD were evaluated using the Childhood Autism Rating Scale and the Aberrant Behavior Checklist. Serum levels of Thrombospondin-1 and Thrombospondin-2 were measured using commercial enzyme-linked immunosorbent assay kits. Result: No statistically significant differences were found between the two groups in serum Thrombospondin-1 and Thrombospondin-2 levels. In addition, no correlation was found between Thrombospondin-2 levels and the clinical symptomatology and severity of ASD. However, the Thrombospondin-1 level was found to be negatively correlated with the total score of the Childhood Autism Rating Scale and with the inappropriate speech and stereotype subscale scores of the Aberrant Behavior Checklist. Conclusion: Thrombospondin-1 might have a potential role in the etiopathogenesis of ASD. Further studies are required to clearly elucidate the association between Thrombospondin-1 and ASD.

  • Article type: Journal Article
    Diagnostic tests for Parkinsonism based on speech samples have shown promising results. Although abnormal auditory feedback integration during speech production and impaired rhythmic organization of speech are known in Parkinsonism, these aspects have not been incorporated into diagnostic tests. This study aimed to identify Parkinsonism using a novel speech behavioral test that involved rhythmically repeating syllables under different auditory feedback conditions. The study included 30 individuals with Parkinson's disease (PD) and 30 healthy subjects. Participants were asked to rhythmically repeat the PA-TA-KA syllable sequence, both whispering and speaking aloud under various listening conditions. The results showed that individuals with PD had difficulties in whispering and articulating under altered auditory feedback conditions, exhibited delayed speech onset, and demonstrated inconsistent rhythmic structure across trials compared to controls. These parameters were then fed into a supervised machine-learning algorithm to differentiate between the two groups. The algorithm achieved an accuracy of 85.4%, a sensitivity of 86.5%, and a specificity of 84.3%. This pilot study highlights the potential of the proposed behavioral paradigm as an objective and accessible (both in cost and time) test for identifying individuals with Parkinson's disease.
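    The three reported figures are mutually consistent: with equal group sizes (30 PD, 30 controls), accuracy is the mean of sensitivity and specificity. The sketch below only checks that arithmetic from the rates given in the abstract; the function name is hypothetical, and how the study aggregated results (for example, across cross-validation folds) is not stated and is not assumed here.

```python
def accuracy_from_rates(sensitivity, specificity, n_pos, n_neg):
    """Overall accuracy implied by sensitivity, specificity, and group sizes.

    Expected true positives = sensitivity * n_pos,
    expected true negatives = specificity * n_neg.
    """
    true_pos = sensitivity * n_pos
    true_neg = specificity * n_neg
    return (true_pos + true_neg) / (n_pos + n_neg)


# Rates and group sizes as reported in the abstract.
acc = accuracy_from_rates(sensitivity=0.865, specificity=0.843,
                          n_pos=30, n_neg=30)
print(round(acc * 100, 1))  # 85.4
```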

  • Article type: Dataset
    Many research articles have explored the impact of surgical interventions on voice and speech evaluations, but advances are limited by the lack of publicly accessible datasets. To address this, a comprehensive corpus of 107 Spanish Castilian speakers was recorded, including control speakers and patients who underwent upper airway surgeries such as Tonsillectomy, Functional Endoscopic Sinus Surgery, and Septoplasty. The dataset contains 3,800 audio files, averaging 35.51 ± 5.91 recordings per patient. This resource enables systematic investigation of the effects of upper respiratory tract surgery on voice and speech. Previous studies using this corpus have shown no relevant changes in key acoustic parameters for sustained vowel phonation, consistent with initial hypotheses. However, the analysis of speech recordings, particularly nasalised segments, remains open for further research. Additionally, this dataset facilitates the study of the impact of upper airway surgery on speaker recognition and identification methods, and testing of anti-spoofing methodologies for improved robustness.