vocal biomarkers

  • Article type: Journal Article
    This review delves into the burgeoning field of explainable artificial intelligence (XAI) in the detection and analysis of lung diseases through vocal biomarkers. Lung diseases, often elusive in their early stages, pose a significant public health challenge. Recent advancements in AI have ushered in innovative methods for early detection, yet the black-box nature of many AI models limits their clinical applicability. XAI emerges as a pivotal tool, enhancing transparency and interpretability in AI-driven diagnostics. This review synthesizes current research on the application of XAI in analyzing vocal biomarkers for lung diseases, highlighting how these techniques elucidate the connections between specific vocal features and lung pathology. We critically examine the methodologies employed, the types of lung diseases studied, and the performance of various XAI models. The potential for XAI to aid in early detection, monitor disease progression, and personalize treatment strategies in pulmonary medicine is emphasized. Furthermore, this review identifies current challenges, including data heterogeneity and model generalizability, and proposes future directions for research. By offering a comprehensive analysis of explainable AI features in the context of lung disease detection, this review aims to bridge the gap between advanced computational approaches and clinical practice, paving the way for more transparent, reliable, and effective diagnostic tools.

  • Article type: Journal Article
    The human voice has the potential to serve as a valuable biomarker for the early detection, diagnosis, and monitoring of pediatric conditions. This scoping review synthesizes the current knowledge on the application of artificial intelligence (AI) in analyzing pediatric voice as a biomarker for health. The included studies featured voice recordings from pediatric populations aged 0-17 years, utilized feature extraction methods, and analyzed pathological biomarkers using AI models. Data from 62 studies were extracted, encompassing study and participant characteristics, recording sources, feature extraction methods, and AI models. Data from 39 models across 35 studies were evaluated for accuracy, sensitivity, and specificity. The review showed a global representation of pediatric voice studies, with a focus on developmental, respiratory, speech, and language conditions. The most frequently studied conditions were autism spectrum disorder, intellectual disabilities, asphyxia, and asthma. Mel-Frequency Cepstral Coefficients were the most utilized feature extraction method, while Support Vector Machines were the predominant AI model. The analysis of pediatric voice using AI demonstrates promise as a non-invasive, cost-effective biomarker for a broad spectrum of pediatric conditions. Further research is necessary to standardize the feature extraction methods and AI models utilized for the evaluation of pediatric voice as a biomarker for health. Standardization has significant potential to enhance the accuracy and applicability of these tools in clinical settings across a variety of conditions and voice recording types. Further development of this field has enormous potential for the creation of innovative diagnostic tools and interventions for pediatric populations globally.
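Since this review identifies Mel-frequency cepstral coefficients (MFCCs) as the most common feature extraction method, a minimal sketch of how MFCC-style features are computed may be useful. This is a simplified, self-contained implementation; frame size, mel count, and coefficient count are illustrative choices, not parameters taken from any reviewed study, and production work typically uses a library such as librosa or openSMILE.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        lo, mid, hi = bins[i - 1], bins[i], bins[i + 1]
        for j in range(lo, mid):
            fb[i - 1, j] = (j - lo) / max(mid - lo, 1)
        for j in range(mid, hi):
            fb[i - 1, j] = (hi - j) / max(hi - mid, 1)
    return fb

def mfcc(signal, sr=8000, n_fft=256, hop=128, n_mels=20, n_mfcc=12):
    window = np.hamming(n_fft)
    fb = mel_filterbank(n_mels, n_fft, sr)
    n = np.arange(n_mels)
    feats = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        frame = signal[start:start + n_fft] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        log_mel = np.log(fb @ power + 1e-10)
        # DCT-II decorrelates the log mel energies into cepstral coefficients
        cep = [np.sum(log_mel * np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels)))
               for k in range(n_mfcc)]
        feats.append(cep)
    return np.array(feats)
```

The per-frame vectors are typically summarized over a recording (e.g., mean and variance pooling) before being passed to a classifier such as the support vector machines this review found to be predominant.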

  • Article type: Journal Article
    BACKGROUND: The digital era has witnessed an escalating dependence on digital platforms for news and information, coupled with the advent of "deepfake" technology. Deepfakes, leveraging deep learning models on extensive data sets of voice recordings and images, pose substantial threats to media authenticity, potentially leading to unethical misuse such as impersonation and the dissemination of false information.
    OBJECTIVE: To counteract this challenge, this study aims to introduce the concept of innate biological processes to discern between authentic human voices and cloned voices. We propose that the presence or absence of certain perceptual features, such as pauses in speech, can effectively distinguish between cloned and authentic audio.
    METHODS: A total of 49 adult participants representing diverse ethnic backgrounds and accents were recruited. Each participant contributed voice samples for the training of up to 3 distinct voice cloning text-to-speech models and 3 control paragraphs. Subsequently, the cloning models generated synthetic versions of the control paragraphs, resulting in a data set consisting of up to 9 cloned audio samples and 3 control samples per participant. We analyzed the speech pauses caused by biological actions such as respiration, swallowing, and cognitive processes. Five audio features corresponding to speech pause profiles were calculated. Differences between authentic and cloned audio for these features were assessed, and 5 classical machine learning algorithms were implemented using these features to create a prediction model. The generalization capability of the optimal model was evaluated through testing on unseen data, incorporating a model-naive generator, a model-naive paragraph, and model-naive participants.
    RESULTS: Cloned audio exhibited significantly increased time between pauses (P<.001), decreased variation in speech segment length (P=.003), increased overall proportion of time speaking (P=.04), and decreased rates of micro- and macropauses in speech (both P=.01). Five machine learning models were implemented using these features, with the AdaBoost model demonstrating the highest performance, achieving a 5-fold cross-validation balanced accuracy of 0.81 (SD 0.05). Other models included support vector machine (balanced accuracy 0.79, SD 0.03), random forest (balanced accuracy 0.78, SD 0.04), logistic regression, and decision tree (balanced accuracies 0.76, SD 0.10 and 0.72, SD 0.06). When evaluating the optimal AdaBoost model, it achieved an overall test accuracy of 0.79 when predicting unseen data.
    CONCLUSIONS: The incorporation of perceptual, biological features into machine learning models demonstrates promising results in distinguishing between authentic human voices and cloned audio.
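The pause-profile idea in this study can be illustrated with a small sketch. Given a per-frame voiced/unvoiced decision (e.g., from a simple energy gate), it computes features analogous to those described: mean speech-segment length, variation in segment length, proportion of time speaking, and micro/macropause rates. The 0.5-second micro/macropause boundary and the exact feature definitions here are illustrative assumptions, not the authors' specifications.

```python
import statistics

def pause_profile(voiced, frame_s=0.01, macro_threshold_s=0.5):
    """voiced: per-frame booleans (True = speech); frame_s: frame duration in seconds."""
    # Collapse the frame sequence into alternating speech/pause runs
    runs = []
    cur, count = voiced[0], 0
    for v in voiced:
        if v == cur:
            count += 1
        else:
            runs.append((cur, count * frame_s))
            cur, count = v, 1
    runs.append((cur, count * frame_s))

    speech = [dur for is_speech, dur in runs if is_speech]
    pauses = [dur for is_speech, dur in runs if not is_speech]
    total = sum(dur for _, dur in runs)
    micro = sum(1 for d in pauses if d < macro_threshold_s)  # assumed threshold
    macro = len(pauses) - micro
    return {
        "mean_speech_segment_s": statistics.mean(speech) if speech else 0.0,
        "speech_segment_sd_s": statistics.pstdev(speech) if len(speech) > 1 else 0.0,
        "speaking_ratio": sum(speech) / total,
        "micropauses_per_s": micro / total,
        "macropauses_per_s": macro / total,
    }
```

Feature dictionaries like this, computed for authentic and cloned samples, would then feed the classical classifiers the study compares (AdaBoost, SVM, random forest, logistic regression, decision tree).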

  • Article type: Journal Article
    BACKGROUND: The utility of vocal biomarkers for mental health assessment has gained increasing attention. This study aims to further this line of research by introducing a novel vocal scoring system designed to provide mental fitness tracking insights to users in real-world settings.
    METHODS: A prospective cohort study with 104 outpatient psychiatric participants was conducted to validate the "Mental Fitness Vocal Biomarker" (MFVB) score. The MFVB score was derived from eight vocal features, selected based on literature review. Participants' mental health symptom severity was assessed using the M3 Checklist, which serves as a transdiagnostic tool for measuring depression, anxiety, post-traumatic stress disorder, and bipolar symptoms.
    RESULTS: The MFVB demonstrated an ability to stratify individuals by their risk of elevated mental health symptom severity. Continuous observation enhanced the MFVB's efficacy, with risk ratios improving from 1.53 (1.09-2.14, p=0.0138) for single 30-second voice samples to 2.00 (1.21-3.30, p=0.0068) for data aggregated over two weeks. A higher risk ratio of 8.50 (2.31-31.25, p=0.0013) was observed in participants who used the MFVB 5-6 times per week, underscoring the utility of frequent and continuous observation. Participant feedback confirmed the user-friendliness of the application and its perceived benefits.
    CONCLUSIONS: The MFVB is a promising tool for objective mental health tracking in real-world conditions, with potential to be a cost-effective, scalable, and privacy-preserving adjunct to traditional psychiatric assessments. User feedback suggests that vocal biomarkers can offer personalized insights and support clinical therapy and other beneficial activities that are associated with improved mental health risks and outcomes.
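Risk ratios with confidence intervals of the kind reported above can be computed from 2x2 counts with the standard Katz log-interval method. A minimal sketch follows; the counts in the usage test are made up for illustration, not the study's data.

```python
import math

def risk_ratio(events_exposed, n_exposed, events_ref, n_ref, z=1.96):
    """Risk ratio with an approximate 95% CI on the log scale (Katz method)."""
    rr = (events_exposed / n_exposed) / (events_ref / n_ref)
    se_log = math.sqrt(1 / events_exposed - 1 / n_exposed
                       + 1 / events_ref - 1 / n_ref)
    lo = math.exp(math.log(rr) - z * se_log)
    hi = math.exp(math.log(rr) + z * se_log)
    return rr, lo, hi
```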

  • Article type: Journal Article
    OBJECTIVE: This cross-sectional study aimed to investigate the potential of voice analysis as a prescreening tool for type II diabetes mellitus (T2DM) by examining the differences in voice recordings between non-diabetic and T2DM participants.
    METHODS: 60 participants diagnosed as non-diabetic (n = 30) or T2DM (n = 30) were recruited on the basis of specific inclusion and exclusion criteria in Iran between February 2020 and September 2023. Participants were matched according to their year of birth and then placed into six age categories. Using the WhatsApp application, participants recorded the translated versions of speech elicitation tasks. Seven acoustic features [fundamental frequency, jitter, shimmer, harmonic-to-noise ratio (HNR), cepstral peak prominence (CPP), voice onset time (VOT), and formant (F1-F2)] were extracted from each recording and analyzed using Praat software. Data were analyzed with Kolmogorov-Smirnov, two-way ANOVA, post hoc Tukey, binary logistic regression, and Student t tests.
    RESULTS: The comparison between groups showed significant differences in fundamental frequency, jitter, shimmer, CPP, and HNR (p < 0.05), while there were no significant differences in formant and VOT (p > 0.05). Binary logistic regression showed that shimmer was the most significant predictor of the disease group. There was also a significant difference between diabetes status and age, in the case of CPP.
    CONCLUSIONS: Participants with type II diabetes exhibited significant vocal variations compared to non-diabetic controls.
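The study extracted its features with Praat; the two measures that proved most discriminative, jitter and shimmer, have simple textbook definitions, shown here in their local (relative) variants. The period and amplitude sequences would come from pitch-period detection on a recording; the values in the usage test are synthetic.

```python
def local_jitter(periods):
    """Mean absolute difference of consecutive pitch periods, relative to the mean period."""
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    return (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))

def local_shimmer(amplitudes):
    """The same construction applied to peak amplitudes of successive periods."""
    diffs = [abs(b - a) for a, b in zip(amplitudes, amplitudes[1:])]
    return (sum(diffs) / len(diffs)) / (sum(amplitudes) / len(amplitudes))
```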

  • Article type: Journal Article
    BACKGROUND: Between 10% and 20% of people with a COVID-19 infection will develop the so-called long COVID syndrome, which is characterized by fluctuating symptoms. Long COVID has a high impact on the quality of life of affected people, who often feel abandoned by the health care system and are demanding new tools to help them manage their symptoms. New digital monitoring solutions could allow them to visualize the evolution of their symptoms and could be tools to communicate with health care professionals (HCPs). The use of voice and vocal biomarkers could facilitate the accurate and objective monitoring of persisting and fluctuating symptoms. However, to assess the needs and ensure acceptance of this innovative approach by its potential users-people with persisting COVID-19-related symptoms, with or without a long COVID diagnosis, and HCPs involved in long COVID care-it is crucial to include them in the entire development process.
    OBJECTIVE: In the UpcomingVoice study, we aimed to define the most relevant aspects of daily life that people with long COVID would like to be improved, assess how the use of voice and vocal biomarkers could be a potential solution to help them, and determine the general specifications and specific items of a digital health solution to monitor long COVID symptoms using vocal biomarkers with its end users.
    METHODS: UpcomingVoice is a cross-sectional mixed methods study and consists of a quantitative web-based survey followed by a qualitative phase based on semistructured individual interviews and focus groups. People with long COVID and HCPs in charge of patients with long COVID will be invited to participate in this fully web-based study. The quantitative data collected from the survey will be analyzed using descriptive statistics. Qualitative data from the individual interviews and the focus groups will be transcribed and analyzed using a thematic analysis approach.
    RESULTS: The study was approved by the National Research Ethics Committee of Luxembourg (number 202208/04) in August 2022 and started in October 2022 with the launch of the web-based survey. Data collection will be completed in September 2023, and the results will be published in 2024.
    CONCLUSIONS: This mixed methods study will identify the needs of people affected by long COVID in their daily lives and describe the main symptoms or problems that would need to be monitored and improved. We will determine how using voice and vocal biomarkers could meet these needs and codevelop a tailored voice-based digital health solution with its future end users. This project will contribute to improving the quality of life and care of people with long COVID. The potential transferability to other diseases will be explored, which will contribute to the deployment of vocal biomarkers in general.
    TRIAL REGISTRATION: ClinicalTrials.gov NCT05546918; https://clinicaltrials.gov/ct2/show/NCT05546918.
    INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): DERR1-10.2196/46103.

  • Article type: Journal Article
    Pain is a complex and subjective experience, and traditional methods of pain assessment can be limited by factors such as self-report bias and observer variability. Voice is frequently used to evaluate pain, occasionally in conjunction with other behaviors such as facial gestures. Compared to facial emotions, there is less available evidence linking pain with voice. This literature review synthesizes the current state of research on the use of voice recognition and voice analysis for pain detection in adults, with a specific focus on the role of artificial intelligence (AI) and machine learning (ML) techniques. We describe the previous works on pain recognition using voice and highlight the different approaches to voice as a tool for pain detection, such as a human effect or biosignal. Overall, studies have shown that AI-based voice analysis can be an effective tool for pain detection in adult patients with various types of pain, including chronic and acute pain. We highlight the high accuracy of the ML-based approaches used in studies and their limitations in terms of generalizability due to factors such as the nature of the pain and patient population characteristics. However, there are still potential challenges, such as the need for large datasets and the risk of bias in training models, which warrant further research.

  • Article type: Journal Article
    BACKGROUND: Vocal biomarker-based machine learning approaches have shown promising results in the detection of various health conditions, including respiratory diseases, such as asthma.
    OBJECTIVE: This study aimed to determine whether a respiratory-responsive vocal biomarker (RRVB) model platform initially trained on an asthma and healthy volunteer (HV) data set can differentiate patients with active COVID-19 infection from asymptomatic HVs by assessing its sensitivity, specificity, and odds ratio (OR).
    METHODS: A logistic regression model using a weighted sum of voice acoustic features was previously trained and validated on a data set of approximately 1700 patients with a confirmed asthma diagnosis and a similar number of healthy controls. The same model has shown generalizability to patients with chronic obstructive pulmonary disease, interstitial lung disease, and cough. In this study, 497 participants (female: n=268, 53.9%; <65 years old: n=467, 94%; Marathi speakers: n=253, 50.9%; English speakers: n=223, 44.9%; Spanish speakers: n=25, 5%) were enrolled across 4 clinical sites in the United States and India and provided voice samples and symptom reports on their personal smartphones. The participants included patients who are symptomatic COVID-19 positive and negative as well as asymptomatic HVs. The RRVB model performance was assessed by comparing it with the clinical diagnosis of COVID-19 confirmed by reverse transcriptase-polymerase chain reaction.
    RESULTS: The ability of the RRVB model to differentiate patients with respiratory conditions from healthy controls was previously demonstrated on validation data in asthma, chronic obstructive pulmonary disease, interstitial lung disease, and cough, with ORs of 4.3, 9.1, 3.1, and 3.9, respectively. The same RRVB model in this study in COVID-19 performed with a sensitivity of 73.2%, specificity of 62.9%, and OR of 4.64 (P<.001). Patients who experienced respiratory symptoms were detected more frequently than those who did not experience respiratory symptoms and completely asymptomatic patients (sensitivity: 78.4% vs 67.4% vs 68%, respectively).
    CONCLUSIONS: The RRVB model has shown good generalizability across respiratory conditions, geographies, and languages. Results using a data set of patients with COVID-19 demonstrate its meaningful potential to serve as a prescreening tool for identifying individuals at risk for COVID-19 infection in combination with temperature and symptom reports. Although not a COVID-19 test, these results suggest that the RRVB model can encourage targeted testing. Moreover, the generalizability of this model for detecting respiratory symptoms across different linguistic and geographic contexts suggests a potential path for the development and validation of voice-based tools for broader disease surveillance and monitoring applications in the future.
    TRIAL REGISTRATION: ClinicalTrials.gov NCT04582331.
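The model form described in the methods (a weighted sum of acoustic features passed through a logistic link) and the reported screening metrics can be sketched as follows. Only the model form comes from the abstract; the feature values, weights, and confusion-matrix counts in the usage test are hypothetical.

```python
import math

def rrvb_probability(features, weights, bias=0.0):
    """Logistic model: weighted sum of acoustic features -> probability of disease."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def screening_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and diagnostic odds ratio from a 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    odds_ratio = (tp * tn) / (fp * fn)
    return sensitivity, specificity, odds_ratio
```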

  • Article type: Journal Article
    Adolescents and young adults (AYAs) diagnosed with cancer are an age-defined population, with studies reporting up to 45% of the population experiencing psychological distress. Although it is essential to screen and monitor for psychological distress throughout AYAs' cancer journeys, many cancer centers fail to effectively implement distress screening protocols largely due to busy clinical workflow and survey fatigue. Recent advances in mobile technology and speech science have enabled flexible and engaging methods to monitor psychological distress. However, patient-centered research focusing on these methods' feasibility and acceptability remains lacking. Therefore, in this project, we aim to evaluate the feasibility and acceptability of an artificial intelligence (AI)-enabled and speech-based mobile application to monitor psychological distress among AYAs diagnosed with cancer. We use a single-arm prospective cohort design with a stratified sampling strategy. We aim to recruit 60 AYAs diagnosed with cancer and to monitor their psychological distress using an AI-enabled speech-based distress monitoring tool over a 6 month period. The primary feasibility endpoint of this study is defined by the number of participants completing four out of six monthly distress assessments, and the acceptability endpoint is defined both quantitatively using the acceptability of intervention measure and qualitatively using semi-structured interviews.

  • Article type: Journal Article
    It is widely accepted that information derived from analyzing speech (the acoustic signal) and language production (words and sentences) serves as a useful window into the health of an individual's cognitive ability. In fact, most neuropsychological testing batteries have a component related to speech and language where clinicians elicit speech from patients for subjective evaluation across a broad set of dimensions. With advances in speech signal processing and natural language processing, there has been recent interest in developing tools to detect more subtle changes in cognitive-linguistic function. This work relies on extracting a set of features from recorded and transcribed speech for objective assessments of speech and language, early diagnosis of neurological disease, and tracking of disease after diagnosis. With an emphasis on cognitive and thought disorders, in this paper we provide a review of existing speech and language features used in this domain, discuss their clinical application, and highlight their advantages and disadvantages. Broadly speaking, the review is split into two categories: language features based on natural language processing and speech features based on speech signal processing. Within each category, we consider features that aim to measure complementary dimensions of cognitive-linguistics, including language diversity, syntactic complexity, semantic coherence, and timing. We conclude the review with a proposal of new research directions to further advance the field.
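Of the cognitive-linguistic dimensions this review lists, language diversity is the easiest to make concrete. A common pair of measures is the raw type-token ratio (TTR) and the moving-average TTR (MATTR), which reduces raw TTR's sensitivity to transcript length; the window size here is an arbitrary illustrative choice.

```python
def type_token_ratio(tokens):
    """Unique word forms (types) divided by total words (tokens)."""
    return len(set(tokens)) / len(tokens)

def moving_average_ttr(tokens, window=10):
    """Average TTR over sliding windows; less length-dependent than raw TTR."""
    if len(tokens) <= window:
        return type_token_ratio(tokens)
    ttrs = [len(set(tokens[i:i + window])) / window
            for i in range(len(tokens) - window + 1)]
    return sum(ttrs) / len(ttrs)
```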