Voice

  • Article type: Journal Article
    The Web has become an essential resource but is not yet accessible to everyone. Assistive technologies and innovative, intelligent frameworks, for example, those using conversational AI, help overcome some exclusions. However, some users still experience barriers. This paper shows how a human-centered approach can shed light on technology limitations and gaps. It reports on a three-step process (focus group, co-design, and preliminary validation) that we adopted to investigate how people with speech impairments, e.g., dysarthria, browse the Web and how barriers can be reduced. The methodology helped us identify challenges and create new solutions, i.e., patterns for Web browsing, by combining voice-based conversational AI, customized for impaired speech, with techniques for the visual augmentation of web pages. While current trends in AI research focus on more and more powerful large models, participants remarked how current conversational systems do not meet their needs, and how it is important to consider each person's specificity for a technology to be called inclusive.

  • Article type: Journal Article
    Music is ubiquitous, both in its instrumental and vocal forms. While speech perception at birth has been at the core of an extensive corpus of research, the origins of the ability to discriminate instrumental or vocal melodies are still not well investigated. In previous studies comparing vocal and musical perception, the vocal stimuli were mainly related to speaking, including language, and not to the non-language singing voice. In the present study, to better compare a melodic instrumental line with the voice, we used singing as a comparison stimulus, to reduce the dissimilarities between the two stimuli as much as possible, separating language perception from vocal musical perception. Forty-five newborns (10 full-term-born infants and 35 preterm infants at term-equivalent age; mean gestational age at test = 40.17 weeks, SD = 0.44) were scanned using functional magnetic resonance imaging while listening to five melodies played by a musical instrument (flute) or sung by a female voice. To examine the dynamic task-based effective connectivity, we employed a psychophysiological interaction of co-activation patterns (PPI-CAPs) analysis, using the auditory cortices as the seed region, to investigate moment-to-moment changes in task-driven modulation of cortical activity during an fMRI task. Our findings reveal condition-specific, dynamically occurring patterns of co-activation (PPI-CAPs). During the vocal condition, the auditory cortex co-activates with the sensorimotor and salience networks, while during the instrumental condition, it co-activates with the visual cortex and the superior frontal cortex. Our results show that the vocal stimulus elicits sensorimotor aspects of the auditory perception and is processed as a more salient stimulus, while the instrumental condition activated higher-order cognitive and visuo-spatial networks. Common neural signatures for both auditory stimuli were found in the precuneus and posterior cingulate gyrus. Finally, this study adds knowledge on the dynamic brain connectivity underlying the newborn's capability of early and specialized auditory processing, highlighting the relevance of dynamic approaches to study brain function in newborn populations.

  • Article type: Dataset
    Many research articles have explored the impact of surgical interventions on voice and speech evaluations, but advances are limited by the lack of publicly accessible datasets. To address this, a comprehensive corpus of 107 Spanish Castilian speakers was recorded, including control speakers and patients who underwent upper airway surgeries such as Tonsillectomy, Functional Endoscopic Sinus Surgery, and Septoplasty. The dataset contains 3,800 audio files, averaging 35.51 ± 5.91 recordings per patient. This resource enables systematic investigation of the effects of upper respiratory tract surgery on voice and speech. Previous studies using this corpus have shown no relevant changes in key acoustic parameters for sustained vowel phonation, consistent with initial hypotheses. However, the analysis of speech recordings, particularly nasalised segments, remains open for further research. Additionally, this dataset facilitates the study of the impact of upper airway surgery on speaker recognition and identification methods, and testing of anti-spoofing methodologies for improved robustness.

  • Article type: Journal Article
    How we produce and perceive voice is constrained by laryngeal physiology and biomechanics. Such constraints may present themselves as principal dimensions in the voice outcome space that are shared among speakers. This study attempts to identify such principal dimensions in the voice outcome space and the underlying laryngeal control mechanisms in a three-dimensional computational model of voice production. A large-scale voice simulation was performed with parametric variations in vocal fold geometry and stiffness, glottal gap, vocal tract shape, and subglottal pressure. Principal component analysis was applied to data combining both the physiological control parameters and voice outcome measures. The results showed three dominant dimensions accounting for at least 50% of the total variance. The first two dimensions describe respiratory-laryngeal coordination in controlling the energy balance between low- and high-frequency harmonics in the produced voice, and the third dimension describes control of the fundamental frequency. The dominance of these three dimensions suggests that voice changes along these principal dimensions are likely to be more consistently produced and perceived by most speakers than other voice changes, and thus are more likely to have emerged during evolution and be used to convey important personal information, such as emotion and larynx size.

  • Article type: Journal Article
    OBJECTIVE: Characterization of psychotherapy as the "talking cure" de-emphasizes the importance of an active listener on the curative effect of talking. We test whether the working alliance and its benefits emerge from expression of voice, per se, or whether active listening is needed. We examine the role of listening in a social identity model of working alliance.
    METHODS: University student participants in a laboratory experiment spoke about stress management to another person (a confederate student) who either did or did not engage in active listening. Participants reported their perceptions of alliance, key social-psychological variables, and well-being.
    RESULTS: Active listening led to significantly higher ratings of alliance, procedural justice, social identification, and identity leadership, compared to no active listening. Active listening also led to greater positive affect and satisfaction. Ultimately, an explanatory path model was supported in which active listening predicted working alliance through social identification, identity leadership, and procedural justice.
    CONCLUSIONS: Listening quality enhances alliance and well-being in a manner consistent with a social identity model of working alliance, and is a strategy for facilitating alliance in therapy.

  • Article type: Journal Article
    OBJECTIVE: To compare and correlate musical performance anxiety (MPA) and vocal self-perception among amateur evangelical singers, focusing on the interaction between anxiety and aspects of performance in this sample.
    METHODS: This study employed a cross-sectional and quantitative approach, involving 75 amateur gospel singers from evangelical churches, aged between 18 and 59 years. Data collection included the administration of a sample identification and characterization questionnaire, the Brazilian Portuguese version of the Kenny Music Performance Anxiety Inventory (K-MPAI), and the Singing Voice Handicap Index (S-VHI). The descriptive analysis used absolute and relative frequencies, measures of central tendency, and dispersion (mean and standard deviation [SD]). To compare the vocal self-assessment protocols and performance aspects, the Kruskal-Wallis test was applied. Spearman's correlation test was used for correlation analysis. All analyses were conducted with a significance level set at 5% (P < 0.05).
    RESULTS: Vocal warm-up and cool-down activities, vocal discomfort after performance, and vocal self-assessment were significantly associated with scores on S-VHI, and the variable "instruments louder than voices" was associated with the K-MPAI score. Participants exhibited a mean K-MPAI score of 85.12 points (SD ± 36.6), and the vocal handicap of the sample had a mean score of 45.22 (SD ± 32.3). There was no statistically significant correlation between the protocols.
    CONCLUSIONS: Incorporating vocal warm-up and cool-down activities was significantly associated with lower scores on S-VHI. Conversely, those experiencing postperformance vocal discomfort exhibited higher scores on S-VHI. Moreover, the absence of correlation between the assessment protocols suggests that while significant levels of voice handicap were observed, a direct link to MPA cannot be definitively established. Overall, these findings contribute to a nuanced understanding of the multifaceted factors shaping vocal health and performance among amateur evangelical singers, thereby guiding future research and interventions in this field.
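
The abstract above names Spearman's correlation as the test relating the two questionnaires. As a rough illustration of what that statistic measures, here is a minimal standard-library sketch: it ranks each variable (ties share the mean rank) and takes the Pearson correlation of the ranks. The function names and the score lists are hypothetical, invented for this example; they are not the study's data or the authors' analysis code, and the abstract does not say which software was used.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up questionnaire totals for six hypothetical singers.
anxiety_scores = [62, 85, 110, 47, 93, 128]
handicap_scores = [30, 12, 55, 41, 8, 70]
print(round(spearman_rho(anxiety_scores, handicap_scores), 3))
```

A value of rho near zero, as the study reports between its two protocols, means the rank order on one scale says little about the rank order on the other. A full analysis would also need a significance test, which this sketch omits.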

  • Article type: Journal Article
    The developmental trajectory of emotion recognition (ER) skills is thought to vary by nonverbal modality, with vocal ER becoming mature later than facial ER. To investigate potential neural mechanisms contributing to this dissociation at a behavioural level, the current study examined whether youth's neural functional connectivity during vocal and facial ER tasks showed differential developmental change across time. Youth ages 8-19 (n = 41) completed facial and vocal ER tasks while undergoing functional magnetic resonance imaging, at two timepoints (1 year apart; n = 36 for behavioural data, n = 28 for neural data). Partial least squares analyses revealed that functional connectivity during ER is distinguishable both by modality (with different patterns of connectivity for facial vs. vocal ER) and across time, with changes in connectivity being particularly pronounced for vocal ER. ER accuracy was greater for faces than voices, and positively associated with age; although task performance did not change appreciably across a 1-year period, changes in latent functional connectivity patterns across time predicted participants' ER accuracy at Time 2. Taken together, these results suggest that vocal and facial ER are supported by distinguishable neural correlates that may undergo different developmental trajectories. Our findings are also preliminary evidence that changes in network integration may support the development of ER skills in childhood and adolescence.

  • Article type: Journal Article
    Currently, there is no consensus on the characterization of the human voice. The objective of the present study is to describe the myoelectric behavior of the extrinsic musculature of the larynx in 146 people with normal voice (Spanish speakers), aged between 20 and 50 years old. Different vocal tasks were recorded using a surface electromyograph (SEMG). In all vocal tasks, it was observed that women had higher activation (µV) in the suprahyoid and sternocleidomastoid muscles than men, while men had higher activation in the infrahyoid muscles. SEMG is a valid procedure to help define normal vocal characteristics in the studied population, providing reference values during clinical examination. However, it is necessary to adopt a universal system of assessment tasks and standardized measurement techniques to allow for comparisons with future studies.

  • Article type: Journal Article
    People often interact with groups (i.e., ensembles) during social interactions. Given that group-level information is important in navigating social environments, we expect perceptual sensitivity to aspects of groups that are relevant for personal threat as well as social belonging. Most ensemble perception research has focused on visual ensembles, with little research looking at auditory or vocal ensembles. Across four studies, we present evidence that (i) perceivers accurately extract the sex composition of a group from voices alone, (ii) judgments of threat increase concomitantly with the number of men, and (iii) listeners' sense of belonging depends on the number of same-sex others in the group. This work advances our understanding of social cognition, interpersonal communication, and ensemble coding to include auditory information, and reveals people's ability to extract relevant social information from brief exposures to vocalizing groups.

  • Article type: Journal Article
    OBJECTIVE: To study the effects of playing the mother's recorded voice to preterm infants in the NICU on their mothers' mental health as measured by the Depression, Anxiety and Stress Scale-21 (DASS-21) questionnaire.
    METHODS: This was a pilot single-center prospective randomized controlled trial done at a level IV NICU. The trial was registered at clinicaltrials.gov (NCT04559620). Inclusion criteria were mothers of preterm infants with gestational ages between 26 and 30 weeks. The DASS-21 questionnaire was administered to all the enrolled mothers in the first week after birth, followed by recording of their voice by the music therapists. In the interventional group, recorded maternal voice was played into the infant incubator between 15 and 21 days of life. A second DASS-21 was administered between 21 and 23 days of life. The Wilcoxon rank-sum test was used to compare DASS-21 scores between the two groups, and the Wilcoxon signed-rank test was used to compare the pre- and post-intervention DASS-21 scores.
    RESULTS: Forty eligible mothers were randomized: 20 to the intervention group and 20 to the control group. The baseline maternal and neonatal characteristics were similar between the two groups. There was no significant difference in the DASS-21 scores between the two groups at baseline or after the study intervention. There was no difference in the pre- and post-interventional DASS-21 scores or its individual components in the experimental group. There was a significant decrease in the total DASS-21 score and the anxiety component of DASS-21 between weeks 1 and 4 in the control group.
    CONCLUSIONS: In this pilot randomized controlled study, recorded maternal voice played into the preterm infant's incubator did not have any effect on maternal mental health as measured by the DASS-21 questionnaire. Data obtained in this pilot study will be useful in future randomized controlled trials (RCTs) to address this important issue.
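
The between-group comparison described above uses the Wilcoxon rank-sum test. The following standard-library sketch shows the idea under stated assumptions: it pools both groups, ranks the pooled scores, sums the ranks of the first group, and converts the result to a two-sided p-value with a normal approximation that has no tie or continuity correction. The function names and the example scores are hypothetical, not trial data, and a real analysis with 20 mothers per arm would normally rely on a statistics package that offers exact p-values and tie handling.

```python
from math import erfc, sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(group_a, group_b):
    """Wilcoxon rank-sum test with a plain normal approximation.

    Returns (W, U, z, p): the rank sum of group_a, the equivalent
    Mann-Whitney U, the z-score of U, and a two-sided p-value.
    """
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    w = sum(ranks[:n_a])                 # rank sum of the first group
    u = w - n_a * (n_a + 1) / 2          # Mann-Whitney U for the first group
    mean_u = n_a * n_b / 2               # expected U under the null hypothesis
    sd_u = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - mean_u) / sd_u
    p = erfc(abs(z) / sqrt(2))           # two-sided tail of the standard normal
    return w, u, z, p


# Made-up total scores for two small hypothetical groups.
intervention = [18, 24, 30, 12, 27]
control = [20, 15, 33, 22, 10]
w, u, z, p = rank_sum_test(intervention, control)
print(f"W={w}, U={u}, z={z:.3f}, p={p:.3f}")
```

The within-group comparison in the abstract uses the signed-rank variant instead, which ranks the absolute pre/post differences of each mother rather than pooling two independent groups.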