speech

  • Article type: Journal Article
    Human speech evolution is not just about having a speech-ready brain and vocal apparatus.

  • Article type: Journal Article
    When a person listens to natural speech, the relation between features of the speech signal and the corresponding evoked electroencephalogram (EEG) is indicative of neural processing of the speech signal. Using linguistic representations of speech, we investigate the differences in neural processing between speech in a native and foreign language that is not understood. We conducted experiments using three stimuli: a comprehensible language, an incomprehensible language, and randomly shuffled words from a comprehensible language, while recording the EEG signal of native Dutch-speaking participants. We modeled the neural tracking of linguistic features of the speech signals using a deep-learning model in a match-mismatch task that relates EEG signals to speech, while accounting for lexical segmentation features reflecting acoustic processing. The deep learning model effectively classifies coherent versus nonsense languages. We also observed significant differences in tracking patterns between comprehensible and incomprehensible speech stimuli within the same language. It demonstrates the potential of deep learning frameworks in measuring speech understanding objectively.
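The core of the match-mismatch task described above is deciding which of two candidate speech segments corresponds to a given EEG recording. Below is a minimal, purely illustrative sketch of that decision using plain correlation; the study itself uses a deep-learning model, and all signals here are synthetic.

```python
# Minimal sketch of a match-mismatch decision: given an EEG-derived
# signal and two candidate speech-feature segments, pick the one that
# correlates more strongly. (Illustrative only -- the study uses a
# deep-learning model, not plain correlation.)
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def match_mismatch(eeg, matched, mismatched):
    """Return True if the matched segment wins the comparison."""
    return pearson(eeg, matched) > pearson(eeg, mismatched)

random.seed(0)
speech = [random.gauss(0, 1) for _ in range(500)]
noise = [random.gauss(0, 0.5) for _ in range(500)]
eeg = [s + e for s, e in zip(speech, noise)]   # EEG "tracks" the speech
shuffled = random.sample(speech, len(speech))  # mismatched segment

print(match_mismatch(eeg, speech, shuffled))  # expected: True
```

In practice the decision is made over many short windows and the per-window accuracy serves as a proxy for how strongly the EEG tracks the linguistic features.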

  • Article type: Journal Article
    OBJECTIVE: Sound pressure and exhaled flow have been identified as important factors associated with higher particle emissions. The aim of this study was to assess how different vocalizations affect particle generation independently of other factors.
    METHODS: Experimental study.
    METHODS: Thirty-three experienced singers repeated two different sentences at normal loudness and in a whisper. The first sentence consisted mainly of consonants such as /k/ and /t/ as well as open vowels, while the second sentence also included the /s/ sound and contained primarily closed vowels. Particle emission was measured using a condensation particle counter (CPC, 3775 TSI Inc.) and an aerodynamic particle sizer (APS, 3321 TSI Inc.). The CPC measured particle number concentration for particles larger than 4 nm and mainly reflects the number of particles smaller than 0.5 µm, since these particles dominate the total number concentration. The APS measured particle size distribution and number concentration in the size range of 0.5-10 µm, and the data were divided into >1 µm and <1 µm particle size ranges. Generalized linear mixed-effects models were constructed to assess the factors affecting particle generation.
    RESULTS: Whispering produced more particles than speaking, and sentence 1 produced more particles than sentence 2 while speaking. Sound pressure level had an effect on particle production independently of vocalization. The effect of exhaled airflow was not statistically significant.
    CONCLUSIONS: Based on our results, the type of vocalization has a significant effect on particle production independently of other factors such as sound pressure level.

  • Article type: Journal Article
    BACKGROUND: Brain-computer interfaces can enable communication for people with paralysis by transforming cortical activity associated with attempted speech into text on a computer screen. Communication with brain-computer interfaces has been restricted by extensive training requirements and limited accuracy.
    METHODS: A 45-year-old man with amyotrophic lateral sclerosis (ALS) with tetraparesis and severe dysarthria underwent surgical implantation of four microelectrode arrays into his left ventral precentral gyrus 5 years after the onset of the illness; these arrays recorded neural activity from 256 intracortical electrodes. We report the results of decoding his cortical neural activity as he attempted to speak in both prompted and unstructured conversational contexts. Decoded words were displayed on a screen and then vocalized with the use of text-to-speech software designed to sound like his pre-ALS voice.
    RESULTS: On the first day of use (25 days after surgery), the neuroprosthesis achieved 99.6% accuracy with a 50-word vocabulary. Calibration of the neuroprosthesis required 30 minutes of cortical recordings while the participant attempted to speak, followed by subsequent processing. On the second day, after 1.4 additional hours of system training, the neuroprosthesis achieved 90.2% accuracy using a 125,000-word vocabulary. With further training data, the neuroprosthesis sustained 97.5% accuracy over a period of 8.4 months after surgical implantation, and the participant used it to communicate in self-paced conversations at a rate of approximately 32 words per minute for more than 248 cumulative hours.
    CONCLUSIONS: In a person with ALS and severe dysarthria, an intracortical speech neuroprosthesis reached a level of performance suitable to restore conversational communication after brief training. (Funded by the Office of the Assistant Secretary of Defense for Health Affairs and others; BrainGate2 ClinicalTrials.gov number, NCT00912041.).
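Decoding accuracy for a speech neuroprosthesis is commonly reported as 100% minus the word error rate, i.e., word-level edit distance normalized by reference length. The abstract does not specify the exact scoring used, so the following is a generic sketch with hypothetical sentences.

```python
# Generic word-error-rate scoring sketch for a speech decoder
# (illustrative; not necessarily the exact metric used in the study).
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edits needed to turn ref[:i] into hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution
    return dp[-1][-1] / len(ref)

wer = word_error_rate("i want to speak with my family",
                      "i want to speak with the family")
print(f"accuracy: {100 * (1 - wer):.1f}%")  # one substitution in 7 words
```

A 97.5% accuracy in this metric corresponds to a word error rate of 2.5% over the evaluated vocabulary.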

  • Article type: Journal Article
    The consensus in the scientific literature is that each child undergoes a unique linguistic development path, albeit with shared developmental stages. Some children excel or lag behind their peers in language skills. Consequently, a key challenge in language acquisition research is pinpointing factors influencing individual differences in language development.
    We observed children longitudinally from 3 to 24 months of life to explore early predictors of vocabulary size. Based on the productive vocabulary size of children at 24 months, 30 children met our sample selection criteria: 10 late talkers and 10 early talkers, whom we compared with 10 typical talkers. We evaluated interactive behaviors at 3, 6, 9 and 12 months, considering vocal production, gaze at the mother's face, and gestural production during mother-child interactions, and we considered mothers' reports of children's actions and gestures and receptive-vocabulary size at 15 and 18 months.
    Results indicated early precursors of language outcome at 24 months, identifiable as early as 3 months in vocal productions, 6 months for gaze at the mother's face, and 12 months for gestural productions.
    Our research highlights both theoretical and practical implications. Theoretically, identifying the early indicators of belonging to the group of late or early talkers underscores the significant role of this developmental period for future studies. On a practical note, our findings emphasize the crucial need for early investigations to identify predictors of vocabulary development before the typical age at which lexical delay is identified.

  • Article type: Journal Article
    We sought to derive an objective measure of psychomotor slowing from speech analytics during a psychiatric interview to avoid the potential burden of dedicated neurophysiological testing. Speech latency, which reflects response time between speakers, shows promise in the literature. Speech data were obtained from 274 subjects with a diagnosis of bipolar I depression enrolled in a randomized, double-blind, 6-week phase 2 clinical trial. Audio recordings of structured Montgomery-Åsberg Depression Rating Scale (MADRS) interviews at 6 time points were examined (k = 1,352). We evaluated speech latencies, and other aspects of speech, for temporal stability, convergent validity, sensitivity/responsivity to clinical change, and generalization across seven socio-linguistically diverse countries. Speech latency was minimally associated with demographic features, and explained nearly a third of the variance in depression (categorically defined). Speech latency significantly decreased as depression symptoms improved over time, explaining nearly 20% of the variance in depression remission. Classification accuracy for differentiating people with versus without concurrent depression was high (AUCs > 0.85) both cross-sectionally and longitudinally. Results replicated across countries. Other speech features offered a modest incremental contribution. Neurophysiological speech parameters with face validity can be derived from psychiatric interviews without the added patient burden of additional testing.
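The cross-sectional classification result above (AUC > 0.85) can be illustrated with the rank-based (Mann-Whitney) definition of AUC applied to a single latency feature. The latency values below are hypothetical, and the study's actual pipeline is more elaborate.

```python
# Sketch: AUC for separating depressed vs. remitted interviews from a
# single speech-latency value (hypothetical numbers; illustrative only).
def roc_auc(scores_pos, scores_neg):
    """Probability that a positive case scores above a negative one
    (ties count half) -- the Mann-Whitney definition of AUC."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Longer response latencies (seconds) assumed for the depressed group.
depressed = [2.1, 1.8, 2.6, 1.9, 2.4]
remitted = [1.1, 0.9, 1.6, 1.3, 2.0]
print(roc_auc(depressed, remitted))  # 0.92
```

An AUC of 0.5 would indicate chance-level separation; values above 0.85, as reported, indicate strong discrimination from latency alone.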

  • Article type: Journal Article
    OBJECTIVE: The Corollary Discharge (CD) mechanism inhibits self-generated speech sound perception, appearing disrupted in schizophrenia and potentially contributing to Anomalous Self-Experiences (ASEs). However, it remains unclear if this alteration and its correlation with ASEs extend to other psychotic disorders.
    METHODS: Electroencephalography was used to study the N1 Event-Related Potential (ERP) as an index of CD-mediated suppression in the auditory cortex across thirty-five participants with schizophrenia, twenty-six with bipolar disorder, and thirty healthy controls. Auditory N1 was elicited by two conditions: real-time listening to self-pronounced vowels while speaking through connected microphone and earphones (listen/talk -or talk condition in previous literature-) and passive listening to the same previously recorded self-uttered vowels (listen/no talk -or listen condition-).
    RESULTS: N1 ERP amplitude was lower in the listen/talk condition compared to listen/no talk across all groups. However, N1 suppression was significantly reduced in schizophrenia, with bipolar patients showing intermediate attenuation between both groups (i.e., non-significantly different from controls). Furthermore, N1 suppression inversely correlated with ASEs severity only in schizophrenia.
    CONCLUSIONS: Dysfunction of the CD mechanism may be a defining feature of schizophrenia, where it is connected to ASEs.
    CONCLUSIONS: These results corroborate previous findings linking auditory N1 ERP suppression with disrupted CD mechanism in schizophrenia, but not in bipolar disorder.
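The suppression effect above can be expressed as a percent reduction of N1 magnitude in the listen/talk condition relative to listen/no talk. A trivial sketch with hypothetical amplitude magnitudes (N1 is a negative-going potential; absolute magnitudes are used here):

```python
# Suppression index sketch: percent reduction of N1 magnitude when
# listening to one's own live speech versus passive playback.
# All amplitude values are hypothetical.
def n1_suppression(amp_listen_no_talk, amp_listen_talk):
    """Percent reduction of N1 magnitude in the listen/talk condition."""
    return 100.0 * (amp_listen_no_talk - amp_listen_talk) / amp_listen_no_talk

controls = n1_suppression(5.0, 2.5)       # strong suppression
schizophrenia = n1_suppression(5.0, 4.0)  # reduced suppression
print(controls, schizophrenia)  # 50.0 20.0
```

Under this index, the pattern reported above corresponds to a larger value in controls, an intermediate value in bipolar disorder, and the smallest value in schizophrenia.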

  • Article type: Journal Article
    Evidence accumulates that the cerebellum's role in the brain is not restricted to motor functions. Rather, cerebellar activity seems to be crucial for a variety of tasks that rely on precise event timing and prediction. Due to its complex structure and importance in communication, human speech requires a particularly precise and predictive coordination of neural processes to be successfully comprehended. Recent studies proposed that the cerebellum is indeed a major contributor to speech processing, but how this contribution is achieved mechanistically remains poorly understood. The current study aimed to reveal a mechanism underlying cortico-cerebellar coordination and demonstrate its speech-specificity. In a reanalysis of magnetoencephalography data, we found that activity in the cerebellum aligned to rhythmic sequences of noise-vocoded speech, irrespective of its intelligibility. We then tested whether these "entrained" responses persist, and how they interact with other brain regions, when a rhythmic stimulus stopped and temporal predictions had to be updated. We found that only intelligible speech produced sustained rhythmic responses in the cerebellum. During this "entrainment echo," but not during rhythmic speech itself, cerebellar activity was coupled with that in the left inferior frontal gyrus, and specifically at rates corresponding to the preceding stimulus rhythm. This finding represents evidence for specific cerebellum-driven temporal predictions in speech processing and their relay to cortical regions.
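Entrainment strength at a stimulus rhythm is often quantified as spectral power at the stimulation frequency. A crude, self-contained stand-in for such an analysis, evaluating a single DFT bin on a synthetic 3 Hz signal (the actual MEG analyses are far more involved):

```python
# Sketch: measuring response strength at a stimulus rhythm by taking
# one DFT coefficient at the target frequency. Signals are synthetic.
import cmath
import math

def power_at(signal, freq_hz, fs):
    """Power of `signal` at `freq_hz`, from a single DFT coefficient."""
    n = len(signal)
    coeff = sum(x * cmath.exp(-2j * math.pi * freq_hz * k / fs)
                for k, x in enumerate(signal))
    return abs(coeff / n) ** 2

fs = 100                          # sampling rate, Hz
t = [k / fs for k in range(400)]  # 4 s of data
rhythm = [math.sin(2 * math.pi * 3 * x) for x in t]  # 3 Hz "entrained" signal

print(power_at(rhythm, 3, fs) > power_at(rhythm, 5, fs))  # expected: True
```

A sustained "entrainment echo" would show elevated power at the preceding stimulus rate (here 3 Hz) even after the rhythmic stimulation has stopped.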

  • DOI:
    Article type: Lecture
    No abstract available.

  • Article type: Journal Article
    Identifying hate speech (HS) is a central concern within online contexts. Current methods are insufficient for efficient preemptive HS identification. In this study, we present the results of an analysis of automatic HS identification applied to popular alt-right YouTube videos.
    This essay describes methodological challenges of automatic HS detection. The case study concerns data on a formative segment of contemporary radical right discourse. Our purpose is twofold: (1) to outline an interdisciplinary mixed-methods approach for using automated identification of HS, bridging the gap between technical research (such as machine learning, deep learning, and natural language processing, NLP) and traditional empirical research; and, regarding alt-right discourse and HS, (2) to ask: what are the challenges in identifying HS in popular alt-right YouTube videos?
    The results indicate that effective and consistent identification of HS communication necessitates qualitative interventions to avoid arbitrary or misleading applications. Binary approaches of hate/non-hate speech tend to force the rationale for designating content as HS. A context-sensitive qualitative approach can remedy this by bringing into focus the indirect character of these communications. The results should interest researchers within the social sciences and humanities adopting automatic sentiment analysis and those analysing HS and radical right discourse.
    Automatic identification or moderation of HS cannot account for an evolving context of indirect signification. This study exemplifies a process whereby automatic hate speech identification could be utilised effectively. Several methodological steps are needed for a useful outcome, with both technical quantitative processing and qualitative analysis being vital to achieve meaningful results. With regard to the alt-right YouTube material, the main challenge is indirect framing. Identification demands orientation in the broader discursive context, and the adaptation towards indirect expressions renders moderation and suppression ethically and legally precarious.
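The point that binary approaches "force the rationale" can be made concrete with a toy lexicon-based flagger: explicit terms are caught, while indirect framing passes untouched. The word list and example sentences below are placeholders, not taken from the study's data.

```python
# Toy illustration of why binary keyword matching struggles with
# indirect framing: a naive lexicon-based flagger (the word list is
# a placeholder) misses coded or contextual expressions entirely.
FLAGGED_TERMS = {"vermin", "subhuman"}  # placeholder lexicon

def naive_flag(text):
    """True if any lexicon term appears as a whole word."""
    words = {w.strip(".,!?\"'").lower() for w in text.split()}
    return bool(words & FLAGGED_TERMS)

explicit = "They are vermin."
indirect = "You know exactly what needs to be done about those people."

print(naive_flag(explicit), naive_flag(indirect))  # True False
```

The second sentence is exactly the kind of indirect signification the study argues requires context-sensitive qualitative analysis rather than binary classification.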
