software

  • Article type: Journal Article
    Long-read sequencing holds great potential for characterizing complex microbial communities, yet taxonomic profiling tools designed specifically for long reads remain lacking. We introduce Melon, a novel marker-based taxonomic profiler that capitalizes on the unique attributes of long reads. Melon employs a two-stage classification scheme to reduce computational time and is equipped with an expectation-maximization-based post-correction module to handle ambiguous reads. Melon achieves superior performance compared to existing tools in both mock and simulated samples. Using wastewater metagenomic samples, we demonstrate the applicability of Melon by showing that it provides reliable estimates of overall genome copies and species-level taxonomic profiles.
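
    The expectation-maximization idea mentioned above can be illustrated with a minimal sketch. This is not Melon's actual implementation: the function name `em_abundance`, the input layout (one list of candidate species per read), and the uniform per-species likelihood are all assumptions made for illustration. Each ambiguous read is split among its candidate species in proportion to the current abundance estimates (E-step), and abundances are then re-estimated from those fractional counts (M-step).

```python
def em_abundance(read_candidates, n_iter=50):
    """Estimate relative abundances from reads with ambiguous assignments.

    read_candidates: list of lists; each inner list holds the distinct
    species names (hypothetical labels) a read is compatible with.
    """
    species = sorted({s for cands in read_candidates for s in cands})
    # Start from a uniform abundance estimate.
    abundance = {s: 1.0 / len(species) for s in species}
    n_reads = len(read_candidates)
    for _ in range(n_iter):
        counts = {s: 0.0 for s in species}
        # E-step: split each read among its candidates by current abundance.
        for cands in read_candidates:
            total = sum(abundance[s] for s in cands)
            for s in cands:
                counts[s] += abundance[s] / total
        # M-step: re-estimate abundances from the fractional counts.
        abundance = {s: counts[s] / n_reads for s in species}
    return abundance


def resolve_read(cands, abundance):
    """Assign an ambiguous read to its most abundant candidate species."""
    return max(cands, key=lambda s: abundance[s])


# Toy input: 6 reads unique to A, 2 unique to B, 2 compatible with both.
reads = [["A"]] * 6 + [["B"]] * 2 + [["A", "B"]] * 2
estimate = em_abundance(reads)
```

    In this toy input the unambiguous reads favor species A three to one, so the two ambiguous reads are mostly credited to A and `resolve_read(["A", "B"], estimate)` picks A.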

  • Article type: Journal Article
    BACKGROUND: The utilization of long reads for single nucleotide polymorphism (SNP) phasing has become popular, providing substantial support for research on human diseases and genetic studies in animals and plants. However, due to the complexity of the linkage relationships between SNP loci and sequencing errors in the reads, the recent methods still cannot yield satisfactory results.
    RESULTS: In this study, we present a graph-based algorithm, GCphase, which utilizes the minimum cut algorithm to perform phasing. First, based on alignment between long reads and the reference genome, GCphase filters out ambiguous SNP sites and useless read information. Second, GCphase constructs a graph in which a vertex represents alleles of an SNP locus and each edge represents the presence of read support; moreover, GCphase adopts a graph minimum-cut algorithm to phase the SNPs. Next, GCphase uses two error correction steps to refine the phasing results obtained from the previous step, effectively reducing the error rate. Finally, GCphase obtains the phase block. GCphase was compared to three other methods, WhatsHap, HapCUT2, and LongPhase, on the Nanopore and PacBio long-read datasets. The code is available from https://github.com/baimawjy/GCphase .
    CONCLUSIONS: Experimental results show that GCphase under different sequencing depths of different data has the least number of switch errors and the highest accuracy compared with other methods.
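
    The graph formulation described in the RESULTS can be sketched on a toy example. This is only an illustration under stated assumptions, not GCphase's code: the read encoding (a dict from SNP index to observed allele), the co-occurrence edge weights, and the brute-force search over cuts are all hypothetical choices, and the filtering and error-correction steps are omitted. Vertices are (SNP, allele) pairs, an edge counts the reads that carry both alleles, and a phasing is a cut that places the two alleles of every SNP on opposite sides; the cut with the smallest weight contradicts the fewest read-supported links.

```python
from collections import Counter
from itertools import combinations, product


def build_allele_graph(reads):
    """Edge weight = number of reads observing both (snp, allele) vertices."""
    weights = Counter()
    for read in reads:
        nodes = sorted(read.items())  # [(snp_index, allele), ...]
        for u, v in combinations(nodes, 2):
            weights[(u, v)] += 1
    return weights


def phase_by_min_cut(reads, n_snps):
    """Brute-force the minimum-weight cut that separates each SNP's alleles.

    Returns (hap, cut) where hap[i] is the allele of SNP i on haplotype 1;
    haplotype 2 carries the complementary alleles. Only feasible for a
    handful of SNPs; a real tool would use a proper min-cut algorithm.
    """
    weights = build_allele_graph(reads)
    best_hap, best_cut = None, None
    # Fix SNP 0 to allele 0 on haplotype 1 to remove the mirror solution.
    for rest in product((0, 1), repeat=n_snps - 1):
        hap = (0,) + rest
        # An edge is cut when its two endpoints fall on different haplotypes.
        cut = sum(w for ((i, a), (j, b)), w in weights.items()
                  if (hap[i] == a) != (hap[j] == b))
        if best_cut is None or cut < best_cut:
            best_hap, best_cut = hap, cut
    return best_hap, best_cut


# Toy reads over three heterozygous SNPs; the last read contains an error.
reads = [
    {0: 0, 1: 0, 2: 1},
    {0: 0, 1: 0},
    {1: 0, 2: 1},
    {0: 1, 1: 1, 2: 0},
    {1: 1, 2: 0},
    {0: 0, 1: 1},
]
haplotype, cut_weight = phase_by_min_cut(reads, n_snps=3)
```

    Here the only link severed by the best cut is the one created by the erroneous read, which is the sense in which a minimum cut tolerates sequencing errors.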

  • Article type: Journal Article
    Vital signs are important indicators for evaluating the health status of patients. Channel state information (CSI) can sense the displacement of the chest wall caused by cardiorespiratory activity in a non-contact manner. Due to the influence of clutter, DC components, and respiratory harmonics, it is difficult to detect reliable heartbeat signals. To address this problem, this paper proposes a robust and novel method for simultaneously extracting breath and heartbeat signals using software-defined radios (SDR). Specifically, we model and analyze the signal and propose a singular value decomposition (SVD)-based clutter suppression method to enhance the vital sign signals. The DC component is estimated and compensated by the circle fitting method. Then, the heartbeat signal and respiratory signal are obtained by a modified variational mode decomposition (VMD). The experimental results demonstrate that the proposed method can accurately separate the respiratory signal and the heartbeat signal from the filtered signal. The Bland-Altman analysis shows that the proposed system is in good agreement with the medical sensors. In addition, the proposed system can accurately measure heart rate variability (HRV) within 0.5 m. In summary, our system can be used as a preferred contactless alternative to traditional contact medical sensors, and it can provide advanced patient-centered healthcare solutions.
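
    The circle-fitting step for DC compensation can be sketched as follows. The abstract does not say which circle-fitting variant the authors use, so this example assumes the algebraic (Kasa) least-squares fit; the function names and the synthetic I/Q data are hypothetical. The idea is that chest-wall motion moves the I/Q samples along an arc whose center is the DC offset, so fitting a circle and subtracting its center leaves a signal whose phase tracks the displacement.

```python
import math


def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            f = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= f * a[col][c]
    x = [0.0] * 3
    for r in (2, 1, 0):
        x[r] = (a[r][3] - sum(a[r][c] * x[c] for c in range(r + 1, 3))) / a[r][r]
    return x


def fit_circle(points):
    """Kasa fit of x^2 + y^2 + D*x + E*y + F = 0; returns (cx, cy, radius)."""
    z = [x * x + y * y for x, y in points]
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    syy = sum(y * y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sxz = sum(x * zi for (x, _), zi in zip(points, z))
    syz = sum(y * zi for (_, y), zi in zip(points, z))
    # Normal equations of the linear least-squares problem in (D, E, F).
    d, e, f = solve3([[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, len(points)]],
                     [-sxz, -syz, -sum(z)])
    cx, cy = -d / 2.0, -e / 2.0
    return cx, cy, math.sqrt(cx * cx + cy * cy - f)


def remove_dc(points):
    """Subtract the fitted circle center (the DC offset) from each sample."""
    cx, cy, _ = fit_circle(points)
    return [(x - cx, y - cy) for x, y in points]


# Synthetic noise-free I/Q arc around an assumed DC offset of (0.3, -0.2).
samples = [(0.3 + 1.5 * math.cos(0.2 + 0.01 * k),
            -0.2 + 1.5 * math.sin(0.2 + 0.01 * k)) for k in range(120)]
cx, cy, radius = fit_circle(samples)
phases = [math.atan2(q, i) for i, q in remove_dc(samples)]
```

    The algebraic fit is chosen here because it reduces to a small linear system; with noisy data a geometric fit or outlier handling may be preferable.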

  • Article type: Journal Article
    MicroRNAs (miRNAs) play crucial roles in several biological processes. Detecting their subcellular localizations is essential for deeper insight into their functions and mechanisms. Traditional methods for determining miRNA subcellular localizations are expensive, and computational methods offer an alternative way to predict them quickly. Although several computational methods have been proposed in this regard, the incomplete representations of miRNAs in these methods leave room for improvement. In this study, a novel computational method for predicting miRNA subcellular localizations, named PMiSLocMF, was developed. Because many miRNAs have multiple subcellular localizations, the method is a multi-label classifier. Several properties of miRNAs, such as miRNA sequences, miRNA functional similarity, and miRNA-disease, miRNA-drug, and miRNA-mRNA associations, were adopted to generate informative miRNA features. To this end, powerful algorithms [node2vec and graph attention auto-encoder (GATE)] and one newly designed scheme were adopted to process these properties, producing five feature types. All features were fed into self-attention and fully connected layers to make predictions. The cross-validation results indicated the high performance of PMiSLocMF, with accuracy higher than 0.83 and with average area under the receiver operating characteristic curve (AUC) and area under the precision-recall curve (AUPR) exceeding 0.90 and 0.77, respectively. This performance was better than that of all previous methods based on the same dataset. Further tests showed that using all feature types improves the performance of PMiSLocMF, and that GATE and the self-attention layer help enhance the performance. Finally, we analyzed in depth the influence of miRNA associations with diseases, drugs, and mRNAs on PMiSLocMF. The dataset and codes are available at https://github.com/Gu20201017/PMiSLocMF.

  • Article type: Journal Article
    The rapid rise in the availability and scale of scRNA-seq data calls for scalable methods for integrative analysis. Though many methods for data integration have been developed, few focus on understanding the heterogeneous effects of biological conditions across different cell populations in integrative analysis. Our proposed scalable approach, scParser, models the heterogeneous effects from biological conditions, which unveils the key mechanisms by which gene expression contributes to phenotypes. Notably, the extended scParser pinpoints biological processes in cell subpopulations that contribute to disease pathogenesis. scParser achieves favorable performance in cell clustering compared to state-of-the-art methods and has broad and diverse applicability.

  • Article type: Journal Article
    Protein-encoding circular RNAs (circRNAs) are newly identified RNA molecules characterized by intense interaction with translating ribosomes. Emerging evidence has implicated the physiological and pathological significance of these non-canonical RNAs, yet a large number of them remain unidentified. Because the tools at hand are limited, we developed CircProPlus, an automated computational pipeline for de novo detection of translated circRNAs. In comparison to the previously established CircPro, CircProPlus adjusts the overall workflow and integrates more robust tools to achieve easier accessibility and higher flexibility and productivity. In the present study, we tested the performance of CircProPlus when using different circRNA-detecting tools (i.e., CIRI2, CirComPara2) in the evaluation of the coding ability of circRNAs. Results showed that CirComPara2, a state-of-the-art algorithm, consistently outperformed CIRI2 when coupled with CircProPlus in testing real data collected from different RNA libraries and species, which highlighted its potency in data mining of circRNAs with protein-coding potential.

  • Article type: Journal Article
    Genome sequencing has become a routine task for biologists, but the challenge of gene structure annotation persists, impeding accurate genomic and genetic research. Here, we present a bioinformatics toolkit, SynGAP (Synteny-based Gene structure Annotation Polisher), which uses gene synteny information to accomplish precise and automated polishing of gene structure annotation of genomes. SynGAP offers exceptional capabilities in the improvement of gene structure annotation quality and the profiling of integrative gene synteny between species. Furthermore, an expression variation index is designed for comparative transcriptomics analysis to explore candidate genes responsible for the development of distinct traits observed in phylogenetically related species.

  • Article type: Journal Article
    BACKGROUND: Orofacial clefts are among the most common congenital malformations of the fetal face, and ultrasound is the main tool for their diagnosis. The fetal palate is difficult to view, so there is currently no unified standard for fetal palate screening, and the diagnosis of cleft palate is not included in the relevant prenatal ultrasound screening guidelines. Many prenatal diagnoses of cleft palate are missed because effective screening methods are lacking. Therefore, it is imperative to increase the display rate of the fetal palate, which would improve the detection rate and diagnostic accuracy for cleft palate. We aim to introduce fetal palate screening software based on the "sequential sector scan through the oral fissure", an effective method for fetal palate screening that was verified by our follow-up results and three-dimensional ultrasound, and to evaluate its feasibility and clinical practicability.
    METHODS: The software was designed and programmed based on the "sequential sector scan through the oral fissure" and three-dimensional ultrasound. The three-dimensional ultrasound volume data of the fetal face were imported into the software. Then, the median sagittal plane was taken as the reference interface, the anterior upper margin of the mandibular alveolar bone was selected as the fulcrum, and the interval angles and the number of layers of the sector scan were set, after which the automatic scan was performed. Thus, sequential sector-scan planes of the mandibular alveolar bone, pharynx, soft palate, hard palate, and maxillary alveolar bone were obtained in sequence to display and evaluate the palate. In addition, the feasibility and accuracy of the software in displaying and screening the fetal palate were evaluated using actual clinical cases.
    RESULTS: Full views of the normal fetal palates and the defective parts of the cleft palates were displayed, and relatively clear sequential tomographic images and continuous dynamic videos were formed after the three-dimensional volume data of 10 normal fetal palates and 10 cleft palates were imported into the software.
    CONCLUSIONS: The software can display fetal palates more directly, which might allow for a new method of fetal palate screening and cleft palate diagnosis.

  • Article type: Journal Article
    Digital reconstruction of neuronal structures from 3D neuron microscopy images is critical for the quantitative investigation of brain circuits and functions. Currently, neuron reconstructions are mainly obtained by manual or semiautomatic methods. However, these approaches are labor-intensive, especially when handling the huge volume of whole-brain microscopy imaging data. Here, we present a deep-learning-based neuron morphology analysis toolbox (DNeuroMAT) for automated analysis of neuron microscopy images, which consists of three modules: neuron segmentation, neuron reconstruction, and neuron critical point detection.

  • Article type: Journal Article
    Navigating the complex landscape of high-dimensional omics data with machine learning models presents a significant challenge. The integration of biological domain knowledge into these models has shown promise in creating more meaningful stratifications of predictor variables, leading to algorithms that are both more accurate and more generalizable. However, the availability of machine learning tools capable of incorporating such biological knowledge remains limited. Addressing this gap, we introduce BioM2, a novel R package designed for biologically informed multistage machine learning. BioM2 uniquely leverages biological information to effectively stratify and aggregate high-dimensional biological data in the context of machine learning. Demonstrating its utility with genome-wide DNA methylation and transcriptome-wide gene expression data, BioM2 has been shown to enhance predictive performance, surpassing traditional machine learning models that operate without the integration of biological knowledge. A key feature of BioM2 is its ability to rank predictor variables within biological categories, specifically Gene Ontology pathways. This functionality not only aids in the interpretability of the results but also enables a subsequent modular network analysis of these variables, shedding light on the intricate systems-level biology underpinning the predictive outcome. We have proposed a biologically informed multistage machine learning framework termed BioM2 for phenotype prediction based on omics data. BioM2 has been incorporated into the BioM2 CRAN package (https://cran.r-project.org/web/packages/BioM2/index.html).