Data analysis

  • Article type: Journal Article
    OBJECTIVE: Accurate diagnosis and quantification of polyps and symptoms are pivotal for planning the treatment of chronic rhinosinusitis with nasal polyposis (CRSwNP). This pilot study aimed to develop an artificial intelligence (AI)-based image analysis system capable of segmenting nasal polyps from nasal endoscopy videos.
    METHODS: Recorded nasal videoendoscopies from 52 patients diagnosed with CRSwNP between 2019 and 2022 were retrospectively analyzed. Images extracted were manually segmented on the web application Roboflow. A dataset of 342 images was generated and divided into training (80%), validation (10%), and testing (10%) sets. The Ultralytics YOLOv8.0.28 model was employed for automated segmentation.
    RESULTS: The YOLOv8s-seg model consisted of 195 layers and required 42.4 GFLOPs for operation. When tested against the validation set, the algorithm achieved a precision of 0.91, a recall of 0.839, and a mean average precision at 50% IoU (mAP50) of 0.949. For the segmentation task, similar metrics were observed, including a mean average precision over IoU thresholds from 50% to 95% (mAP50-95) ranging from 0.675 to 0.679.
    CONCLUSIONS: The study shows that a carefully trained AI algorithm can effectively identify and delineate nasal polyps in patients with CRSwNP. Despite certain limitations like the focus on CRSwNP-specific samples, the algorithm presents a promising complementary tool to existing diagnostic methods.
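
The split and the metrics named in this abstract can be illustrated with a minimal sketch. It is not the study's pipeline (which used Roboflow and Ultralytics YOLOv8); the function names, the set-of-pixels mask representation, and the greedy matching rule are all assumptions made for illustration.

```python
import random

def split_dataset(items, fractions=(0.8, 0.1, 0.1), seed=0):
    """Shuffle items and split them into train/validation/test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * fractions[0])
    n_val = round(len(shuffled) * fractions[1])
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test

def mask_iou(pred, truth):
    """Intersection over union of two masks given as sets of (row, col) pixels."""
    union = pred | truth
    return len(pred & truth) / len(union) if union else 0.0

def precision_recall(preds, truths, iou_threshold=0.5):
    """Greedy one-to-one matching of predicted masks to ground-truth masks.

    A prediction counts as a true positive when its best remaining
    ground-truth mask overlaps it with IoU >= iou_threshold (assumed rule).
    """
    unmatched = list(truths)
    true_pos = 0
    for pred in preds:
        best = max(unmatched, key=lambda t: mask_iou(pred, t), default=None)
        if best is not None and mask_iou(pred, best) >= iou_threshold:
            true_pos += 1
            unmatched.remove(best)
    precision = true_pos / len(preds) if preds else 0.0
    recall = true_pos / len(truths) if truths else 0.0
    return precision, recall

# Hypothetical usage: 342 image indices split 80/10/10.
train, val, test = split_dataset(range(342))
print(len(train), len(val), len(test))  # 274 34 34
```

Averaging precision over recall levels at an IoU threshold of 0.5 gives mAP50; repeating that for thresholds from 0.5 to 0.95 and averaging gives the stricter 50%-95% figure.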

  • Article type: Journal Article
    Recently, HexNAcQuest was developed to help distinguish peptides modified by HexNAc isomers, more specifically O-linked β-N-acetylglucosamine (O-GlcNAc) and O-linked α-N-acetylgalactosamine (O-GalNAc, Tn antigen). To facilitate its usage (particularly for datasets from glycoproteomics studies), herein we present a detailed protocol. It describes example cases and procedures for which users might need to use HexNAcQuest to distinguish these two modifications.

  • Article type: Journal Article
    Timely and precise detection of emerging infections is imperative for effective outbreak management and disease control. Human mobility significantly influences the spatial transmission dynamics of infectious diseases. Spatial sampling, which integrates the spatial structure of the target, holds promise as an approach to allocating tests for detecting infections, and leveraging information on individuals' movement and contact behavior can enhance targeting precision. This study introduces a spatial sampling framework informed by spatiotemporal analysis of human mobility data, aiming to optimize the allocation of testing resources for detecting emerging infections. Mobility patterns, derived from clustering point-of-interest and travel data, are integrated into four spatial sampling approaches at the community level. We evaluate the proposed mobility-based spatial sampling by analyzing both actual and simulated outbreaks, considering scenarios of transmissibility, intervention timing, and population density in cities. Results indicate that, by leveraging inter-community movement data and initial case locations, the proposed Case Flow Intensity (CFI)- and Case Transmission Intensity (CTI)-informed spatial sampling enhances community-level testing efficiency by reducing the number of individuals screened while maintaining a high accuracy rate in infection identification. Furthermore, the prompt application of CFI and CTI within cities is crucial for effective detection, especially for highly contagious infections in densely populated areas. With the widespread use of human mobility data for infectious disease responses, the proposed theoretical framework extends spatiotemporal data analysis of mobility patterns into spatial sampling, providing a cost-effective solution to optimize testing resource deployment for containing emerging infectious diseases.
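
The general idea of ranking communities for testing from movement data and initial case locations can be sketched as follows. The abstract does not define CFI or CTI, so the score used here (trips out of a community multiplied by its initial case count) is a hypothetical stand-in, and every name in the code is an assumption rather than the authors' method.

```python
def flow_scores(flows, case_counts):
    """Score each community by mobility inflow from communities with cases.

    flows[(origin, destination)] is a trip count and case_counts[origin] is
    the number of initial cases. The cases * trips weighting is an
    illustrative assumption, not the published CFI or CTI definition.
    """
    scores = {}
    for (origin, destination), trips in flows.items():
        cases = case_counts.get(origin, 0)
        if cases:
            scores[destination] = scores.get(destination, 0) + cases * trips
    return scores

def select_communities(flows, case_counts, budget):
    """Choose communities to test: those with cases first, then top-scored ones."""
    scores = flow_scores(flows, case_counts)
    ranked = sorted(scores, key=lambda c: (-scores[c], c))
    seeded = sorted(c for c, n in case_counts.items() if n > 0)
    ordered = seeded + [c for c in ranked if c not in seeded]
    return ordered[:budget]

# Hypothetical usage with four communities and cases first seen in "A".
flows = {("A", "B"): 100, ("A", "C"): 10, ("D", "C"): 500, ("B", "A"): 50}
print(select_communities(flows, {"A": 2}, budget=2))  # ['A', 'B']
```

Community "C" receives more trips overall, but most of them come from a community without cases, which is why a case-weighted score ranks "B" ahead of it.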

  • Article type: Journal Article
    BACKGROUND: Renin-expressing cells are myoendocrine cells crucial for the maintenance of homeostasis. Renin is regulated by cAMP, p300 (histone acetyltransferase p300)/CBP (CREB-binding protein), and Brd4 (bromodomain-containing protein 4) proteins and associated pathways. However, the specific regulatory changes that occur following inhibition of these pathways are not clear.
    METHODS: We treated As4.1 cells (tumoral cells derived from mouse juxtaglomerular cells that constitutively express renin) with 3 inhibitors that target different factors required for renin transcription: H-89-dihydrochloride, PKA (protein kinase A) inhibitor; JQ1, Brd4 bromodomain inhibitor; and A-485, p300/CBP inhibitor. We performed ATAC-seq, single-cell RNA sequencing, CUT&Tag, and chromatin immunoprecipitation sequencing for H3K27ac and p300 binding on biological replicates of treated and control As4.1 cells.
    RESULTS: In response to each inhibitor, Ren1 expression was significantly reduced and reversible upon washout. Chromatin accessibility at the Ren1 locus did not markedly change but was globally reduced at distal elements. Inhibition of PKA led to significant reductions in H3K27ac and p300 binding specifically within the Ren1 super-enhancer region. Further, we identified enriched TF (transcription factor) motifs shared across each inhibitory treatment. Finally, we identified a set of 9 genes with putative roles across each of the 3 renin regulatory pathways and observed that each displayed differentially accessible chromatin, gene expression, H3K27ac, and p300 binding at their respective loci.
    CONCLUSIONS: Inhibition of renin expression in cells that constitutively synthesize and release renin is regulated by an epigenetic switch from an active to poised state associated with decreased cell-cell communication and an epithelial-mesenchymal transition. This work highlights and helps define the factors necessary for renin cells to alternate between myoendocrine and contractile phenotypes.

  • Article type: Journal Article
    The meta-learning method proposed in this paper addresses the issue of small-sample regression in the application of engineering data analysis, which is a highly promising direction for research. By integrating traditional regression models with optimization-based data augmentation from meta-learning, the proposed deep neural network demonstrates excellent performance in optimizing glass fiber reinforced plastic (GFRP) for wrapping concrete short columns. When compared with traditional regression models, such as Support Vector Regression (SVR), Gaussian Process Regression (GPR), and Radial Basis Function Neural Networks (RBFNN), the meta-learning method proposed here performs better in modeling small data samples. The success of this approach illustrates the potential of deep learning in dealing with limited amounts of data, offering new opportunities in the field of material data analysis.

  • Article type: Journal Article
    BACKGROUND: Characterization of microbial growth is of both fundamental and applied interest. Modern platforms can automate collection of high-throughput microbial growth curves, necessitating the development of computational tools to handle and analyze these data to produce insights.
    RESULTS: To address this need, here I present a newly developed R package: gcplyr. gcplyr can flexibly import growth curve data in common tabular formats and reshape them under a tidy framework that is flexible and extendable, enabling users to design custom analyses or plot data with popular visualization packages. gcplyr can also incorporate metadata and generate or import experimental designs to merge with data. Finally, gcplyr carries out model-free (non-parametric) analyses. These analyses do not require mathematical assumptions about microbial growth dynamics, and gcplyr is able to extract a broad range of important traits, including growth rate, doubling time, lag time, maximum density and carrying capacity, diauxie, area under the curve, extinction time, and more.
    CONCLUSIONS: gcplyr makes scripted analyses of growth curve data in R straightforward, streamlines common data wrangling and analysis steps, and easily integrates with common visualization and statistical analyses.
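
To make "model-free" concrete, the sketch below computes a few of the listed traits directly from the measured points, without fitting a growth model. It is written in Python for illustration and does not use or reproduce the gcplyr API; the function name and the returned keys are assumptions.

```python
import math

def growth_traits(times, densities):
    """Model-free summaries of one growth curve.

    times must be increasing and densities strictly positive.
    """
    steps = range(len(times) - 1)
    # Per-capita growth rate between consecutive points: d(ln N)/dt.
    rates = [
        (math.log(densities[i + 1]) - math.log(densities[i]))
        / (times[i + 1] - times[i])
        for i in steps
    ]
    max_rate = max(rates)
    # Trapezoidal area under the density curve.
    auc = sum(
        (times[i + 1] - times[i]) * (densities[i] + densities[i + 1]) / 2
        for i in steps
    )
    return {
        "max_percap_rate": max_rate,
        # Doubling time at the fastest observed growth; infinite if no growth.
        "doubling_time": math.log(2) / max_rate if max_rate > 0 else math.inf,
        "max_density": max(densities),
        "auc": auc,
    }

# Hypothetical usage: density doubles each hour for two hours, then plateaus.
print(growth_traits([0, 1, 2, 3], [1.0, 2.0, 4.0, 4.0]))
```

Real plate-reader data are noisy, so in practice the derivative is usually computed over a smoothing window rather than between single adjacent points.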

  • Article type: Journal Article
    OBJECTIVE: This study aimed to compare and evaluate the efficiency and accuracy of computerized adaptive testing (CAT) under two stopping rules (standard error of measurement [SEM] of 0.30 and 0.25) using both real and simulated data in medical examinations in Korea.
    METHODS: This study employed post-hoc simulation and real data analysis to explore the optimal stopping rule for CAT in medical examinations. The real data were obtained from the responses of 3rd-year medical students during examinations in 2020 at Hallym University College of Medicine. Simulated data were generated in R using estimated parameters from a real item bank. Outcome variables included the number of examinees passing or failing under SEM values of 0.25 and 0.30, the number of items administered, and the correlation between ability estimates. The consistency of the real CAT results was evaluated by examining the agreement of pass/fail decisions based on a cut score of 0.0. The efficiency of all CAT designs was assessed by comparing the average number of items administered under both stopping rules.
    RESULTS: Both SEM 0.25 and SEM 0.30 provided a good balance between accuracy and efficiency in CAT. The real data showed minimal differences in pass/fail outcomes between the 2 SEM conditions, with a high correlation (r = 0.99) between ability estimates. The simulation results confirmed these findings, indicating similar average numbers of items administered between real and simulated data.
    CONCLUSIONS: The findings suggest that both SEM 0.25 and 0.30 are effective termination criteria in the context of the Rasch model, balancing accuracy and efficiency in CAT.
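
How an SEM stopping rule ends a Rasch-based CAT can be shown with a small simulation. This is a sketch under stated assumptions, not the study's R code: ability is estimated by expected a posteriori (EAP) scoring on a grid with a standard normal prior, the posterior standard deviation is used as the SEM, and the item bank is randomly generated.

```python
import math
import random

GRID = [g / 10 for g in range(-40, 41)]  # ability grid from -4.0 to 4.0

def p_correct(theta, b):
    """Rasch model: probability of answering an item of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def eap_estimate(responses):
    """Posterior mean and SD of ability under a standard normal prior."""
    weights = []
    for theta in GRID:
        w = math.exp(-0.5 * theta * theta)
        for b, correct in responses:
            p = p_correct(theta, b)
            w *= p if correct else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)

def run_cat(true_theta, item_bank, sem_stop, rng):
    """Administer items until the SEM reaches sem_stop or the bank runs out."""
    remaining = list(item_bank)
    responses = []
    theta, sem = 0.0, 1.0  # prior mean and SD before any item is given
    while remaining and sem > sem_stop:
        # Under the Rasch model the most informative item has difficulty nearest theta.
        item = min(remaining, key=lambda b: abs(b - theta))
        remaining.remove(item)
        correct = rng.random() < p_correct(true_theta, item)
        responses.append((item, correct))
        theta, sem = eap_estimate(responses)
    return theta, sem, len(responses)

# Hypothetical usage: the same simulated examinee under both stopping rules.
bank = [random.Random(7).uniform(-3, 3) for _ in range(1)]  # placeholder, replaced below
bank_rng = random.Random(7)
bank = [bank_rng.uniform(-3, 3) for _ in range(300)]
for sem_stop in (0.30, 0.25):
    theta, sem, n_items = run_cat(0.5, bank, sem_stop, random.Random(1))
    print(sem_stop, round(theta, 2), round(sem, 3), n_items, theta >= 0.0)
```

Because a Rasch item contributes at most 0.25 units of information, tightening the rule from 0.30 to 0.25 necessarily costs additional items; the study's question is whether that extra length changes pass/fail decisions at the cut score of 0.0.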

  • Article type: Journal Article
    Large language models (LLMs) have emerged as a powerful tool for biomedical researchers, demonstrating remarkable capabilities in understanding and generating human-like text. ChatGPT with its Code Interpreter functionality, an LLM connected with the ability to write and execute code, streamlines data analysis workflows by enabling natural language interactions. Using materials from a previously published tutorial, similar analyses can be performed through conversational interactions with the chatbot, covering data loading and exploration, model development and comparison, permutation feature importance, partial dependence plots, and additional analyses and recommendations. The findings highlight the significant potential of LLMs in assisting researchers with data analysis tasks, allowing them to focus on higher-level aspects of their work. However, there are limitations and potential concerns associated with the use of LLMs, such as the importance of critical thinking, privacy, security, and equitable access to these tools. As LLMs continue to improve and integrate with available tools, data science may experience a transformation similar to the shift from manual to automatic transmission in driving. The advancements in LLMs call for considering the future directions of data science and its education, ensuring that the benefits of these powerful tools are utilized with proper human supervision and responsibility.
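
Permutation feature importance, one of the analyses mentioned above, can be sketched in a few lines. This is neither the tutorial's code nor chatbot output; the function names, the mean-squared-error metric, and the toy model are assumptions chosen to keep the example self-contained.

```python
import random

def mean_squared_error(model, rows, targets):
    """Average squared difference between model predictions and targets."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(model, rows, targets, n_repeats=10, seed=0):
    """Mean increase in error when one feature column is shuffled.

    rows is a list of feature lists; model maps one row to a prediction.
    """
    rng = random.Random(seed)
    baseline = mean_squared_error(model, rows, targets)
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild the rows with only feature j permuted.
            permuted = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mean_squared_error(model, permuted, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Hypothetical usage: the toy model depends only on the first feature.
rows = [[float(i), float((i * 7) % 10)] for i in range(50)]
targets = [3 * r[0] for r in rows]
print(permutation_importance(lambda r: 3 * r[0], rows, targets))
```

A feature the model ignores scores zero because shuffling it leaves every prediction unchanged, which makes the output easy to sanity-check when reviewing code an LLM has written.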

  • Article type: Journal Article
    To date, the field of transcriptomics has been characterized by rapid methods development and technological advancement, with new technologies continuously rendering older ones obsolete. This chapter traces the evolution of approaches to quantifying gene expression and provides an overall view of the current state of the field of transcriptomics, its applications to the study of the human brain, and its place in the broader emerging multiomics landscape.

  • Article type: Journal Article
    BACKGROUND: Previous studies with several limitations have comparatively analyzed the relationship between ambulatory blood pressure (BP) and self-measured BP and biomarkers of organ damage. This study extends this line of research by examining the relationship between ambulatory and self-measured BP and cardiac, renal, and atherosclerotic biomarkers in outpatients at cardiovascular risk.
    METHODS: In 1,440 practice outpatients who underwent office, ambulatory, and self-measured BP monitoring, we assessed the relationships of each BP with organ damage biomarkers including B-type natriuretic peptide (BNP), echocardiographic left ventricular mass index (LVMI), urine albumin-creatinine ratio (UACR), and brachial-ankle pulse wave velocity (baPWV).
    RESULTS: In the comparison of correlations, self-measured systolic BP (SBP) was more strongly correlated than daytime ambulatory SBP with log-transformed (Ln) BNP (n=1,435; r=0.123 vs. r=-0.093, P<0.001), LVMI (n=1,278; r=0.223 vs. r=0.094, P<0.001), Ln-UACR (n=1,435; r=0.244 vs. r=0.154, P=0.010), and baPWV (n=1,360; r=0.327 vs. r=0.115, P<0.001). In the linear regression models including office, ambulatory, and self-measured SBP, only self-measured SBP was significantly related to Ln-BNP (P=0.016) and LVMI (P<0.001). In the logistic regression models for the top quartile of LVMI, adding self-measured SBP improved the model predictability (P=0.027), but adding daytime ambulatory SBP did not. However, adding daytime ambulatory SBP improved the model predictability in the logistic model including office and self-measured SBP.
    CONCLUSIONS: Our study findings suggested that self-measured BP was associated with cardiac biomarkers independent of ambulatory BP.