Keywords: Auscultation; Digital health; Isolation forest; Lung sounds; Random forest; Surfboard audio features extractor

MeSH: Humans; Artificial Intelligence; Auscultation / methods; Algorithms; Machine Learning; Lung

Source: DOI: 10.1016/j.compbiomed.2023.107784

Abstract:
The use of machine learning in biomedical research has surged in recent years thanks to advances in devices and artificial intelligence. Our aim is to expand this body of knowledge by applying machine learning to pulmonary auscultation signals. Despite improvements in digital stethoscopes and attempts to find synergy between them and artificial intelligence, solutions for their use in clinical settings remain scarce. Physicians continue to infer initial diagnoses with less sophisticated means, resulting in low accuracy and suboptimal patient care. Arriving at a correct preliminary diagnosis requires highly accurate auscultation diagnostics. Because auscultation is performed in large volumes, the resulting data availability opens up opportunities for more effective sound analysis. In this study, digital 6-channel auscultations of 45 patients were used in various machine learning scenarios, with the aim of distinguishing between normal and abnormal pulmonary sounds. Audio features (such as fundamental frequencies F0-4, loudness, HNR, DFA, as well as descriptive statistics of log energy, RMS and MFCC) were extracted using the Python library Surfboard. Windowing, feature aggregation, and concatenation strategies were used to prepare data for machine learning algorithms in unsupervised (fair-cut forest, outlier forest) and supervised (random forest, regularized logistic regression) settings. Evaluation was carried out using 9-fold stratified cross-validation repeated 30 times. Decision fusion by averaging the outputs for a subject was also tested and found to be helpful. Supervised models showed a consistent advantage over unsupervised ones, with random forest achieving a mean AUC ROC of 0.691 (accuracy 71.11%, Kappa 0.416, F1-score 0.675) in side-based detection and a mean AUC ROC of 0.721 (accuracy 68.89%, Kappa 0.371, F1-score 0.650) in patient-based detection.