Data analysis

  • Article type: Journal Article
    OBJECTIVE: Accurate diagnosis and quantification of polyps and symptoms are pivotal for planning the therapeutic strategy of Chronic rhinosinusitis with nasal polyposis (CRSwNP). This pilot study aimed to develop an artificial intelligence (AI)-based image analysis system capable of segmenting nasal polyps from nasal endoscopy videos.
    METHODS: Recorded nasal videoendoscopies from 52 patients diagnosed with CRSwNP between 2019 and 2022 were retrospectively analyzed. Images extracted were manually segmented on the web application Roboflow. A dataset of 342 images was generated and divided into training (80%), validation (10%), and testing (10%) sets. The Ultralytics YOLOv8.0.28 model was employed for automated segmentation.
    RESULTS: The YOLOv8s-seg model consisted of 195 layers and required 42.4 GFLOPs for operation. When tested against the validation set, the algorithm achieved a precision of 0.91, recall of 0.839, and mean average precision at 50% IoU (mAP50) of 0.949. For the segmentation task, similar metrics were observed, including a mAP ranging from 0.675 to 0.679 for IoUs between 50% and 95%.
    CONCLUSIONS: The study shows that a carefully trained AI algorithm can effectively identify and delineate nasal polyps in patients with CRSwNP. Despite certain limitations like the focus on CRSwNP-specific samples, the algorithm presents a promising complementary tool to existing diagnostic methods.
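    The IoU thresholds behind the mAP figures above compare a predicted mask with a hand-drawn one. The following is a minimal sketch of that comparison, not the study's code: the toy masks and the helper names `mask_iou` and `f1_score` are assumptions for illustration, and the F1 value is simply derived from the reported precision (0.91) and recall (0.839).

```python
import numpy as np


def mask_iou(pred: np.ndarray, truth: np.ndarray) -> float:
    """Intersection over union of two binary masks of equal shape."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        # Both masks empty: treated here as perfect agreement (a convention).
        return 1.0
    return float(np.logical_and(pred, truth).sum() / union)


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical 4x4 masks: the annotation covers 4 pixels, the prediction 6.
truth = np.zeros((4, 4), dtype=bool)
truth[1:3, 1:3] = True
pred = np.zeros((4, 4), dtype=bool)
pred[1:3, 1:4] = True

iou = mask_iou(pred, truth)   # intersection 4 pixels, union 6 pixels
f1 = f1_score(0.91, 0.839)    # derived from the reported precision and recall
print(f"IoU = {iou:.3f}, F1 = {f1:.3f}")
```

    A prediction counts as a true positive at the 50% threshold when its IoU with the annotation is at least 0.5, as it is for the toy masks here.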

  • Article type: Journal Article
    Recently, HexNAcQuest was developed to help distinguish peptides modified by HexNAc isomers, more specifically O-linked β-N-acetylglucosamine (O-GlcNAc) and O-linked α-N-acetylgalactosamine (O-GalNAc, Tn antigen). To facilitate its usage (particularly for datasets from glycoproteomics studies), herein we present a detailed protocol. It describes example cases and procedures for which users might need to use HexNAcQuest to distinguish these two modifications.

  • Article type: Journal Article
    Timely and precise detection of emerging infections is imperative for effective outbreak management and disease control. Human mobility significantly influences the spatial transmission dynamics of infectious diseases. Spatial sampling, integrating the spatial structure of the target, holds promise as an approach for testing allocation in detecting infections, and leveraging information on individuals' movement and contact behavior can enhance targeting precision. This study introduces a spatial sampling framework informed by spatiotemporal analysis of human mobility data, aiming to optimize the allocation of testing resources for detecting emerging infections. Mobility patterns, derived from clustering point-of-interest and travel data, are integrated into four spatial sampling approaches at the community level. We evaluate the proposed mobility-based spatial sampling by analyzing both actual and simulated outbreaks, considering scenarios of transmissibility, intervention timing, and population density in cities. Results indicate that leveraging inter-community movement data and initial case locations, the proposed Case Flow Intensity (CFI) and Case Transmission Intensity (CTI)-informed spatial sampling enhances community-level testing efficiency by reducing the number of individuals screened while maintaining a high accuracy rate in infection identification. Furthermore, the prompt application of CFI and CTI within cities is crucial for effective detection, especially in highly contagious infections within densely populated areas. With the widespread use of human mobility data for infectious disease responses, the proposed theoretical framework extends spatiotemporal data analysis of mobility patterns into spatial sampling, providing a cost-effective solution to optimize testing resource deployment for containing emerging infectious diseases.

  • Article type: Journal Article
    Renin-expressing cells are myoendocrine cells crucial for the maintenance of homeostasis. Renin is regulated by cAMP, p300 (histone acetyltransferase p300)/CBP (CREB-binding protein), and Brd4 (bromodomain-containing protein 4) proteins and associated pathways. However, the specific regulatory changes that occur following inhibition of these pathways are not clear.
    We treated As4.1 cells (tumoral cells derived from mouse juxtaglomerular cells that constitutively express renin) with 3 inhibitors that target different factors required for renin transcription: H-89-dihydrochloride, PKA (protein kinase A) inhibitor; JQ1, Brd4 bromodomain inhibitor; and A-485, p300/CBP inhibitor. We performed ATAC-seq, single-cell RNA sequencing, CUT&Tag, and chromatin immunoprecipitation sequencing for H3K27ac and p300 binding on biological replicates of treated and control As4.1 cells.
    In response to each inhibitor, Ren1 expression was significantly reduced and reversible upon washout. Chromatin accessibility at the Ren1 locus did not markedly change but was globally reduced at distal elements. Inhibition of PKA led to significant reductions in H3K27ac and p300 binding specifically within the Ren1 super-enhancer region. Further, we identified enriched TF (transcription factor) motifs shared across each inhibitory treatment. Finally, we identified a set of 9 genes with putative roles across each of the 3 renin regulatory pathways and observed that each displayed differentially accessible chromatin, gene expression, H3K27ac, and p300 binding at their respective loci.
    Inhibition of renin expression in cells that constitutively synthesize and release renin is regulated by an epigenetic switch from an active to poised state associated with decreased cell-cell communication and an epithelial-mesenchymal transition. This work highlights and helps define the factors necessary for renin cells to alternate between myoendocrine and contractile phenotypes.

  • Article type: Journal Article
    The meta-learning method proposed in this paper addresses the issue of small-sample regression in the application of engineering data analysis, which is a highly promising direction for research. By integrating traditional regression models with optimization-based data augmentation from meta-learning, the proposed deep neural network demonstrates excellent performance in optimizing glass fiber reinforced plastic (GFRP) for wrapping concrete short columns. When compared with traditional regression models, such as Support Vector Regression (SVR), Gaussian Process Regression (GPR), and Radial Basis Function Neural Networks (RBFNN), the meta-learning method proposed here performs better in modeling small data samples. The success of this approach illustrates the potential of deep learning in dealing with limited amounts of data, offering new opportunities in the field of material data analysis.

  • Article type: Editorial
    Recent medical literature shows that the application of artificial intelligence (AI) models in gastrointestinal pathology is an exponentially growing field, with promising models that show very high performance. Regarding inflammatory bowel disease (IBD), recent reviews demonstrate promising diagnostic and prognostic AI models. However, studies are generally at high risk of bias (especially in AI models that are image-based). The creation of specific AI models that improve diagnostic performance and allow the establishment of a general prognostic forecast in IBD is of great interest, as it may allow the stratification of patients into subgroups and, in turn, allow the creation of different diagnostic and therapeutic protocols for these patients. Regarding surgical models, predictive models of postoperative complications have shown great potential in large-scale studies. In this work, the authors present the development of a predictive algorithm for early post-surgical complications in Crohn's disease based on a Random Forest model with exceptional predictive ability for complications within the cohort. The present work, based on logical and reasoned, clinical, and applicable aspects, lays a solid foundation for future prospective work to further develop post-surgical prognostic tools for IBD. The next step is prospective, multicenter development, a collaborative path to optimize this line of research and make it applicable to our patients.

  • Article type: Journal Article
    BACKGROUND: Characterization of microbial growth is of both fundamental and applied interest. Modern platforms can automate collection of high-throughput microbial growth curves, necessitating the development of computational tools to handle and analyze these data to produce insights.
    RESULTS: To address this need, here I present a newly-developed R package: gcplyr. gcplyr can flexibly import growth curve data in common tabular formats, and reshapes it under a tidy framework that is flexible and extendable, enabling users to design custom analyses or plot data with popular visualization packages. gcplyr can also incorporate metadata and generate or import experimental designs to merge with data. Finally, gcplyr carries out model-free (non-parametric) analyses. These analyses do not require mathematical assumptions about microbial growth dynamics, and gcplyr is able to extract a broad range of important traits, including growth rate, doubling time, lag time, maximum density and carrying capacity, diauxie, area under the curve, extinction time, and more.
    CONCLUSIONS: gcplyr makes scripted analyses of growth curve data in R straightforward, streamlines common data wrangling and analysis steps, and easily integrates with common visualization and statistical analyses.
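    To make the idea of a model-free analysis concrete, here is a minimal Python sketch of three of the traits listed above: per-capita growth rate, doubling time, and area under the curve. It does not use or reproduce gcplyr's R API; the function names and the toy data (a culture that doubles every hour) are assumptions for illustration only.

```python
import numpy as np


def per_capita_growth_rate(time, density):
    """Slope of ln(density) between consecutive time points (finite differences)."""
    time = np.asarray(time, dtype=float)
    log_density = np.log(np.asarray(density, dtype=float))
    return np.diff(log_density) / np.diff(time)


def doubling_time(rate):
    """Time needed to double at a given per-capita growth rate."""
    return np.log(2) / rate


def area_under_curve(time, density):
    """Trapezoidal area under the density curve."""
    time = np.asarray(time, dtype=float)
    density = np.asarray(density, dtype=float)
    return float(np.sum((density[1:] + density[:-1]) / 2 * np.diff(time)))


# Hypothetical data: optical density doubling every hour from 0.01.
time_h = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
density = 0.01 * 2.0 ** time_h

rates = per_capita_growth_rate(time_h, density)
max_rate = float(rates.max())          # maximum per-capita growth rate, per hour
t_double = float(doubling_time(max_rate))
auc = area_under_curve(time_h, density)
print(f"max rate = {max_rate:.3f}/h, doubling time = {t_double:.2f} h, AUC = {auc:.3f}")
```

    Nothing here assumes a growth model such as the logistic equation; each trait comes directly from differences and sums over the observed points, which is what makes the analysis non-parametric.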

  • Article type: Journal Article
    OBJECTIVE: This study aimed to compare and evaluate the efficiency and accuracy of computerized adaptive testing (CAT) under two stopping rules (SEM 0.3 and 0.25) using both real and simulated data in medical examinations in Korea.
    METHODS: This study employed post-hoc simulation and real data analysis to explore the optimal stopping rule for CAT in medical examinations. The real data were obtained from the responses of 3rd-year medical students during examinations in 2020 at Hallym University College of Medicine. Simulated data were generated using estimated parameters from a real item bank in R. Outcome variables included the number of examinees passing or failing with SEM values of 0.25 and 0.30, the number of items administered, and the correlation. The consistency of the real CAT results was evaluated by examining the consistency of pass/fail decisions based on a cut score of 0.0. The efficiency of all CAT designs was assessed by comparing the average number of items administered under both stopping rules.
    RESULTS: Both SEM 0.25 and SEM 0.30 provided a good balance between accuracy and efficiency in CAT. The real data showed minimal differences in pass/fail outcomes between the 2 SEM conditions, with a high correlation (r = 0.99) between ability estimates. The simulation results confirmed these findings, indicating similar average item numbers between real and simulated data.
    CONCLUSIONS: The findings suggest that both SEM 0.25 and 0.30 are effective termination criteria in the context of the Rasch model, balancing accuracy and efficiency in CAT.
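    How an SEM stopping rule translates into test length can be sketched as follows. Under the Rasch model, an item of difficulty b contributes information p(1 - p) at ability theta, where p = 1 / (1 + exp(-(theta - b))), and the SEM is 1 / sqrt(total information). The code below is an illustration under idealized assumptions, not the study's simulation: the function names are hypothetical, and every item is assumed to be perfectly targeted (b equal to theta), which gives the shortest possible test.

```python
import math


def rasch_probability(theta: float, difficulty: float) -> float:
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def items_until_sem(theta: float, difficulties, sem_target: float):
    """Count items administered until SEM <= sem_target; None if never reached."""
    information = 0.0
    for count, difficulty in enumerate(difficulties, start=1):
        p = rasch_probability(theta, difficulty)
        information += p * (1.0 - p)            # Rasch item information
        if 1.0 / math.sqrt(information) <= sem_target:
            return count
    return None


# Idealized bank: 100 items, all with difficulty equal to the examinee's ability.
bank = [0.0] * 100
n_030 = items_until_sem(0.0, bank, 0.30)
n_025 = items_until_sem(0.0, bank, 0.25)
print(n_030, n_025)
```

    Because each perfectly targeted item adds at most 0.25 units of information, the stricter rule needs noticeably more items, which is the accuracy versus efficiency trade-off the two stopping rules balance. Real item banks are less well targeted, so actual test lengths would be longer.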

  • Article type: Journal Article
    Large language models (LLMs) have emerged as a powerful tool for biomedical researchers, demonstrating remarkable capabilities in understanding and generating human-like text. ChatGPT with its Code Interpreter functionality, an LLM connected with the ability to write and execute code, streamlines data analysis workflows by enabling natural language interactions. Using materials from a previously published tutorial, similar analyses can be performed through conversational interactions with the chatbot, covering data loading and exploration, model development and comparison, permutation feature importance, partial dependence plots, and additional analyses and recommendations. The findings highlight the significant potential of LLMs in assisting researchers with data analysis tasks, allowing them to focus on higher-level aspects of their work. However, there are limitations and potential concerns associated with the use of LLMs, such as the importance of critical thinking, privacy, security, and equitable access to these tools. As LLMs continue to improve and integrate with available tools, data science may experience a transformation similar to the shift from manual to automatic transmission in driving. The advancements in LLMs call for considering the future directions of data science and its education, ensuring that the benefits of these powerful tools are utilized with proper human supervision and responsibility.

  • Article type: Editorial
    No abstract available.