Voice

  • Article type: Journal Article
    Robot-assisted prostate seed implantation has developed rapidly, but problems remain, such as non-intuitive visualization and complicated robot control. To make the procedure more intelligent and easier to visualize, a voice control technology for a prostate seed implantation robot in an augmented reality environment was proposed. First, MRI images of the prostate were denoised and segmented. A three-dimensional model of the prostate and its surrounding tissues was reconstructed by surface rendering. Combined with a holographic application program, an augmented reality system for prostate seed implantation was built. An improved singular value decomposition three-dimensional registration algorithm based on iterative closest point was proposed, and registration experiments verified that it could effectively improve three-dimensional registration accuracy. A fusion algorithm based on spectral subtraction and a BP neural network was proposed. The experimental results showed that the average delay of the fusion algorithm was 1.314 s and the overall response time of the integrated system was 1.5 s. The fusion algorithm could effectively improve the reliability of the voice control system, and the integrated system could meet the responsiveness requirements of prostate seed implantation.
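    The abstract names spectral subtraction as one half of its fusion algorithm but gives no implementation details. The following is a minimal sketch of textbook magnitude spectral subtraction on a single frame, not the authors' method: the function names, the `noise_magnitude` input, and the `floor` parameter are assumptions made for illustration, and the BP neural network stage is not shown.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (O(n^2)); adequate for short frames."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def spectral_subtract(frame, noise_magnitude, floor=0.0):
    """Subtract an estimated noise magnitude from every frequency bin.

    frame: samples of one analysis frame.
    noise_magnitude: per-bin noise magnitude estimate (hypothetical input,
        e.g. averaged over frames assumed to contain no speech).
    floor: lower bound on the cleaned magnitude, so it never goes negative.
    """
    cleaned = []
    for value, noise in zip(dft(frame), noise_magnitude):
        magnitude = max(abs(value) - noise, floor)
        phase = cmath.phase(value)  # the noisy phase is reused unchanged
        cleaned.append(cmath.rect(magnitude, phase))
    return idft(cleaned)


# Hypothetical 16-sample frame containing a single tone.
frame = [math.sin(2 * math.pi * 2 * t / 16) for t in range(16)]
denoised = spectral_subtract(frame, [0.5] * 16)
```

    A practical system would apply this to overlapping windowed frames and use an FFT; the naive transform here only keeps the sketch dependency-free.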

  • Article type: Journal Article
    The early screening of depression is highly beneficial for patients to obtain better diagnosis and treatment. While the effectiveness of utilizing voice data for depression detection has been demonstrated, the issue of insufficient dataset size remains unresolved. Therefore, we propose an artificial intelligence method to effectively identify depression. The wav2vec 2.0 voice-based pre-training model was used as a feature extractor to automatically extract high-quality voice features from raw audio. Additionally, a small fine-tuning network was used as a classification model to output depression classification results. Subsequently, the proposed model was fine-tuned on the DAIC-WOZ dataset and achieved excellent classification results. Notably, the model demonstrated outstanding performance in binary classification, attaining an accuracy of 0.9649 and an RMSE of 0.1875 on the test set. Similarly, impressive results were obtained in multi-classification, with an accuracy of 0.9481 and an RMSE of 0.3810. The wav2vec 2.0 model was first used for depression recognition and showed strong generalization ability. The method is simple, practical, and applicable, which can assist doctors in the early screening of depression.
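    The abstract reports accuracy and RMSE side by side. As a hedged illustration of how the two metrics are computed from labels, here is a small sketch; the labels below are hypothetical and are not data from the study, and the abstract does not say how its RMSE was derived.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical labels (1 = depressed, 0 = not depressed); not study data.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]

acc = accuracy(y_true, y_pred)
err = rmse(y_true, y_pred)
```

    When both the labels and the predictions are hard 0/1 values, each squared error is either 0 or 1, so RMSE equals the square root of the error rate, that is, sqrt(1 - accuracy).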

  • Article type: Journal Article
    Parkinson's disease (PD) is the second most common neurodegenerative disease and affects millions of people. Accurate diagnosis and subsequent treatment in the early stages can slow down disease progression. However, making an accurate diagnosis of PD at an early stage is challenging. Previous studies have revealed that even for movement disorder specialists, it was difficult to differentiate patients with PD from healthy individuals until the average modified Hoehn-Yahr staging (mH&Y) reached 1.8. Recent research has shown that dysarthria provides good indicators for computer-assisted diagnosis of patients with PD. However, few studies have focused on diagnosing patients with PD in the early stages, specifically those with mH&Y ≤ 1.5.
    We used a machine learning algorithm to analyze voice features and developed diagnostic models for differentiating between healthy controls (HCs) and patients with PD, and for differentiating between HCs and patients with mild PD (mH&Y ≤ 1.5). The models were independently validated using separate datasets.
    Our results demonstrate remarkable diagnostic performance of the model in distinguishing patients with mild PD (mH&Y ≤ 1.5) from HCs, with an area under the ROC curve of 0.93 (95% CI: 0.85-1.00), accuracy of 0.85, sensitivity of 0.95, and specificity of 0.75.
    The results of our study are helpful for screening PD in the early stages in the community and in primary medical institutions that lack movement disorder specialists and special equipment.
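    For readers less familiar with the diagnostic metrics quoted in this abstract, the sketch below shows how accuracy, sensitivity, and specificity follow from confusion-matrix counts. The counts are hypothetical (20 patients and 20 controls, chosen only so the arithmetic is easy to follow) and are not the study's data; the function name is likewise an assumption.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of patients correctly flagged
    specificity = tn / (tn + fp)  # fraction of controls correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# Hypothetical counts: 20 patients and 20 controls (not the study's data).
acc, sens, spec = diagnostic_metrics(tp=19, fn=1, tn=15, fp=5)
```

    With equally sized groups, accuracy is simply the mean of sensitivity and specificity; with unequal groups it is weighted toward the larger one.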

  • Article type: Journal Article
    Recognizing talkers' identity via speech is an important social skill in interpersonal interaction. Behavioral evidence has shown that listeners identify voices better in their native language than in a non-native language, which is known as the language familiarity effect (LFE). However, its underlying neural mechanisms remain unclear. This study therefore investigated how the LFE occurs at the neural level by employing functional near-infrared spectroscopy (fNIRS). Late unbalanced bilinguals were first asked to learn to associate strangers' voices with their identities and were then tested on recognizing the talkers' identities from their voices speaking a language that was highly familiar (i.e., native language Chinese), moderately familiar (i.e., second language English), or completely unfamiliar (i.e., Ewe) to participants. Participants identified talkers the most accurately in Chinese and the least accurately in Ewe. Talker identification was quicker in Chinese than in English and Ewe, but reaction time did not differ between the two non-native languages. At the neural level, recognizing voices speaking Chinese relative to English/Ewe produced less activity in the inferior frontal gyrus, precentral/postcentral gyrus, supramarginal gyrus, and superior temporal sulcus/gyrus, while no difference was found between English and Ewe, indicating that automatic phonological encoding in the native language facilitates voice identification. These findings shed new light on the interrelations between language ability and voice recognition, revealing that the brain activation pattern of the LFE depends on the automaticity of language processing.

  • Article type: Journal Article
    The genetic influence on human vocal pitch in tonal and non-tonal languages remains largely unknown. In tonal languages, such as Mandarin Chinese, pitch changes differentiate word meanings, whereas in non-tonal languages, such as Icelandic, pitch is used to convey intonation. We addressed this question by searching for genetic associations with interindividual variation in median pitch in a Chinese major depression case-control cohort and compared our results with a genome-wide association study from Iceland. The same genetic variant, rs11046212-T in an intron of the ABCC9 gene, was one of the most strongly associated loci with median pitch in both samples. Our meta-analysis revealed four genome-wide significant hits, including two novel associations. The discovery of genetic variants influencing vocal pitch across both tonal and non-tonal languages suggests the possibility of a common genetic contribution to the human vocal system shared in two distinct populations with languages that differ in tonality (Icelandic and Mandarin).

  • Article type: Journal Article
    Since the outbreak of COVID-19, mask-wearing has become a widespread phenomenon. Even after the pandemic, people continue to maintain the habit of wearing masks in their daily lives. While existing research has explored how mask-wearing can influence wearers' behavior in everyday life, its effects in the workplace have received less attention. Drawing on self-perception theory, this study examined the positive effect of mask-wearing in the workplace on wearers' voice behavior via psychological safety. An online experiment (N = 291) using a within-subject manipulation of wearing masks supported our hypotheses. This study uncovered the positive psychological and behavioral consequences of mask-wearing beyond its benefits in people's health conditions and everyday life.

  • Article type: Journal Article
    BACKGROUND: Abusive supervision by the nurse manager significantly influences nurses' withholding voice about patient safety. The role of impression management motivation and speak up-related climate is crucial in understanding their connection. This study aimed to explore the relationship between abusive supervision, impression management motivation, speak up-related climate, and withholding voice about patient safety.
    METHODS: This cross-sectional study employed a convenience sampling method to recruit 419 clinical nurses from Taizhou Hospital, Zhejiang Province, China, between 1 November 2022 and 31 January 2023. The study adhered to the STROBE checklist. Abusive supervision and impression management motivation were assessed using the Chinese versions of the Abusive Supervision Scale and the Impression Management Motivation Scale, respectively. Withholding voice about patient safety and speak up-related climate were identified using the Chinese version of the Speaking Up about Patient Safety Questionnaire.
    RESULTS: Nurse leaders' abusive supervision (β = 0.40, p < 0.01) and nurses' impression management motivation (β = 0.10, p < 0.01) significantly and positively influenced nurses' withholding voice about patient safety. When impression management motivation was introduced as a mediating variable, the effect of abusive supervision on nurses' withholding voice decreased (β from 0.40 to 0.38, p < 0.01). Nurses' speak up-related climate played a moderating role between abusive supervision and impression management motivation (β = 0.24, p < 0.05).
    CONCLUSIONS: Abusive supervision by nursing leaders can result in nurses withholding voice about patient safety out of self-protective impression management motives. This phenomenon inhibits nurses' subjective initiative, undermines their proactive involvement in improving patient safety, and hinders the cultivation of a culture that encourages full participation in patient safety; it warrants significant attention.

  • Article type: Journal Article
    In recent years, there has been a notable rise in the number of patients afflicted with laryngeal diseases, including cancer, trauma, and other ailments leading to voice loss. The market currently shows a pressing demand for medical and healthcare products designed to assist individuals with voice defects, which has prompted the invention of the artificial throat (AT). This user-friendly device eliminates the need for complex procedures such as phonation reconstruction surgery. In this review, we first introduce the intelligent AT, which can act not only as a sound sensor but also as a thin-film sound emitter. We then discuss the sensing principles used to detect sound, including the capacitive, piezoelectric, electromagnetic, and piezoresistive components employed in sound sensing. Following this, the development of thermoacoustic theory and the different materials used to make sound emitters are analyzed. After that, the algorithms used by the intelligent AT for speech pattern recognition are reviewed, including some classical algorithms and neural network algorithms. Finally, the outlook, challenges, and conclusions for the intelligent AT are stated. The intelligent AT presents clear advantages for patients with voice impairments and has significant social value.

  • Article type: Journal Article
    There is a lack of effective treatment for idiopathic unilateral vocal fold paralysis (IUVFP). Patients reported better phonation after laryngeal nerve stimulation during our clinical examination.
    This study aims to investigate the immediate effect of recurrent laryngeal nerve (RLN) stimulation on phonation in patients with IUVFP.
    Sixty-two patients with clinically identified IUVFP underwent RLN stimulation with needle electrodes. Laryngoscopy, acoustic analysis, and voice perception assessment were performed for quantitative comparison of vocal function and voice quality before and after the intervention.
    Laryngoscopic images showed a larger motion range of the paralyzed vocal fold (p < .01) and better glottal closure (p < .01) after RLN stimulation. Acoustic analysis revealed that the dysphonia severity index increased significantly (p < .01) while the jitter and shimmer decreased after the intervention (p < .05). According to perceptual evaluation, RLN stimulation significantly increased RBH grades in patients with IUVFP (p < .01). Furthermore, the improvement in voice perception had a moderate positive correlation with the decrease in the glottal closure.
    This study shows a short-term improvement of phonation in IUVFP patients after RLN stimulation, which provides proof-of-concept for trialing a controlled delivery of RLN stimulation and assessing the durability of any observed responses.
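    Jitter and shimmer, mentioned in the acoustic analysis above, are commonly defined as cycle-to-cycle perturbation of the pitch period and of the peak amplitude. The sketch below uses the widely cited "local" definition (mean absolute difference between consecutive cycles divided by the mean value); the abstract does not state which variant or software the authors used, and the numbers are hypothetical.

```python
def local_perturbation(values):
    """Mean absolute difference between consecutive cycles, as a percentage of the mean."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


# Hypothetical measurements for four consecutive glottal cycles.
periods_ms = [5.0, 5.1, 4.9, 5.0]   # cycle durations
amplitudes = [1.0, 0.9, 1.1, 1.0]   # peak amplitudes (arbitrary units)

jitter_percent = local_perturbation(periods_ms)    # period perturbation
shimmer_percent = local_perturbation(amplitudes)   # amplitude perturbation
```

    Lower values indicate a steadier voice, which matches the direction of change the abstract reports after stimulation.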

  • Article type: Journal Article
    BACKGROUND: Web-based health care has the potential to improve health care access and convenience for patients with limited mobility, but its success depends on active physician participation. The economic returns of internet-based health care initiatives are an important factor that can motivate physicians to continue their participation. Although several studies have examined the communication patterns and influences of web-based health consultations, the correlation between physicians' communication characteristics and their economic returns remains unexplored.
    OBJECTIVE: This study aims to investigate how the linguistic features of 2 modes of physician-patient communication, instrumental and affective, determine the physician's economic returns, measured by the honorarium their patients agree to pay per consultation. We also examined the moderating effects of communication media (web-based text messages and voice messages) and the compounding effects of different communication features on economic returns.
    METHODS: We collected 40,563 web-based consultations from 528 physicians across 4 disease specialties on a large, web-based health care platform in China. Communication features were extracted using Linguistic Inquiry and Word Count (LIWC), and we used multivariable linear regression and K-means clustering to analyze the data.
    RESULTS: We found that the use of cognitive processing language (ie, words related to insight, causation, tentativeness, and certainty) in instrumental communication and positive emotion-related words in affective communication were positively associated with the economic returns of physicians. However, the extensive use of discrepancy-related words could generate adverse effects. We also found that the use of voice messages for service delivery magnified the effects of cognitive processing language but did not moderate the effects of affective processing language. The highest economic returns were associated with consultations in which the physicians used few expressions related to negative emotion; used more terms associated with positive emotions; and later, used instrumental communication language.
    CONCLUSIONS: Our study provides empirical evidence about the relationship between physicians' communication characteristics and their economic returns. It contributes to a better understanding of patient-physician interactions from a professional-client perspective and has practical implications for physicians and web-based health care platform executives.
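    Dictionary-based tools of the kind named in the methods score a text by the share of its words that fall into predefined categories. The sketch below illustrates that idea only: the two category word lists are a hypothetical mini-dictionary invented for this example, the sample sentence is made up, and the study's actual lexicon, language handling, and preprocessing are not described in the abstract.

```python
import re

# Hypothetical mini-dictionary for illustration; real lexicons are far larger.
CATEGORIES = {
    "positive_emotion": {"good", "glad", "hope", "well"},
    "tentative": {"maybe", "perhaps", "possibly"},
}


def category_rates(text, categories=CATEGORIES):
    """Return, per category, the fraction of words in `text` found in its word list."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {name: 0.0 for name in categories}
    return {
        name: sum(word in vocab for word in words) / len(words)
        for name, vocab in categories.items()
    }


# Made-up physician reply used only to exercise the function.
rates = category_rates("Perhaps rest will help; I hope you feel well soon.")
```

    Per-consultation rates like these could then serve as predictors in a regression or as inputs to a clustering step, which is the general shape of the analysis the abstract describes.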