NCBI

  • Article type: Journal Article
    Nutritional metabolic diseases in fish frequently arise in the setting of intensive aquaculture. The etiology and pathogenesis of these conditions involve energy metabolic disorders influenced by both internal genetic factors and external environmental conditions. The exploration of genes associated with nutritional and metabolic disorders has sparked considerable interest within both the aquaculture scientific community and the industry. High-throughput sequencing technology offers researchers extensive genetic information. Effectively mining, analyzing, and securely storing these data is crucial, especially for advancing disease prevention and treatment strategies. Presently, the exploration and application of gene databases concerning nutritional and metabolic disorders in fish are at a nascent stage. Therefore, this study focused on the model organism zebrafish and five primary economic fish species as the subjects of investigation. Using information from KEGG, OMIM, and existing literature, a novel gene database associated with nutritional metabolic diseases in fish was meticulously constructed. This database encompassed 4583 genes for Danio rerio, 6287 for Cyprinus carpio, 3289 for Takifugu rubripes, 3548 for Larimichthys crocea, 3816 for Oreochromis niloticus, and 5708 for Oncorhynchus mykiss. Through a comparative systems biology approach, we discerned a relatively high conservation of genes linked to nutritional metabolic diseases across these fish species, with over 54.9% of genes being conserved throughout all six species. Additionally, the analysis pinpointed the existence of 13 species-specific genes within the genomes of large yellow croaker, tilapia, and rainbow trout. These genes exhibit the potential to serve as novel candidate targets for addressing nutritional metabolic diseases.
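The conservation and species-specificity figures in the abstract above can be read as set operations over per-species gene lists. The following is a minimal sketch only, assuming genes have already been mapped to shared ortholog-group identifiers; the identifiers, the function names, and the choice of the union as the denominator are all hypothetical, since the abstract does not say how the 54.9% figure was computed.

```python
from typing import Dict, Set, Tuple


def conserved_genes(gene_sets: Dict[str, Set[str]]) -> Tuple[Set[str], float]:
    """Return the genes shared by every species and their share of the union.

    Assumption: the fraction is taken over the union of all gene sets;
    the study may have used a different denominator.
    """
    all_sets = list(gene_sets.values())
    shared = set.intersection(*all_sets)
    union = set.union(*all_sets)
    return shared, len(shared) / len(union)


def species_specific(gene_sets: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Return, for each species, the genes found in no other species."""
    result = {}
    for name, genes in gene_sets.items():
        others = set().union(
            *(g for other, g in gene_sets.items() if other != name)
        )
        result[name] = genes - others
    return result


# Hypothetical ortholog-group identifiers, not data from the study.
toy = {
    "Danio rerio": {"og1", "og2", "og3"},
    "Cyprinus carpio": {"og1", "og2", "og4"},
    "Oncorhynchus mykiss": {"og1", "og2", "og5"},
}
shared, fraction = conserved_genes(toy)
unique = species_specific(toy)
```

With real data, the ortholog mapping step dominates the result, so the identifiers passed in matter far more than the set arithmetic itself.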

  • Article type: Journal Article
    While taxonomy is an often underappreciated branch of science, it serves very important roles. Bacteriophage taxonomy has evolved from a discipline based mainly on morphology, characterized by the work of David Bradley and Hans-Wolfgang Ackermann, to the sequence-based approach that is taken today. The Bacterial Viruses Subcommittee of the International Committee on Taxonomy of Viruses (ICTV) takes a holistic approach to classifying prokaryote viruses by measuring overall DNA and protein similarity and phylogeny before making decisions about the taxonomic position of a new virus. The huge number of complete genomes being deposited with the National Center for Biotechnology Information (NCBI) and other public databases has resulted in a reassessment of the taxonomy of many viruses, and the future will see the introduction of new viral families and higher orders.

  • Article type: Journal Article
    No abstract available.

  • Article type: Journal Article
    OBJECTIVE: Pollen is a useful tool for identifying the provenance and complex ecosystems surrounding honey production in Malaysian forests. As native key pollinators in Malaysia, Apis dorsata and Heterotrigona itama forage on various plant/pollen species to collect honey. This study aims to generate a dataset that uncovers the presence of these plant/pollen species and their relative abundance in the honey of A. dorsata and H. itama. The information gathered from this study can be used to determine the geographical and botanical origin and authenticity of the honey produced by these two species.
    RESULTS: Sequence data were obtained for both A. dorsata and H. itama. The raw sequence data for A. dorsata was 5 Mb, which was assembled into 5 contigs with a size of 6,098,728 bp, an N50 of 15,534, and a GC average of 57.42. Similarly, the raw sequence data for H. itama was 6.3 Mb, which was assembled into 11 contigs with a size of 7,642,048 bp, an N50 of 17,180, and a GC average of 55.38. In the honey sample of A. dorsata, we identified five different plant/pollen species, with only one of the five species exhibiting a relative abundance of less than 1%. For H. itama, we identified seven different plant/pollen species, with only three of the species exhibiting a relative abundance of less than 1%. All of the identified plant species were native to Peninsular Malaysia, especially the East Coast area of Terengganu.
    METHODS: Our data offer valuable insights into honey's geographical and botanical origin and authenticity. Metagenomic studies could help identify the plant species that honeybees forage on and provide preliminary data for researchers studying the biological development of A. dorsata and H. itama. Identifying, from the eDNA of honey, various flowers known for their medicinal properties could support accurate origin labeling of regional honey, which is crucial for guaranteeing product authenticity to consumers.
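The N50 statistic reported in the results above summarizes assembly contiguity. As a rough sketch of the standard definition (the length of the contig at which the running total, taken from the longest contig downward, first reaches half of the assembly size), one might compute it as below. The contig lengths are hypothetical and are not taken from the study.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of an assembly.

    Contigs are visited from longest to shortest; N50 is the length of the
    contig at which the cumulative length first reaches half of the total.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable: the loop always returns")


# Hypothetical contig lengths in base pairs.
example_n50 = n50([2, 2, 2, 3, 3, 4, 8, 8])
```

Comparing `running * 2` with `total` keeps the arithmetic in integers, which avoids rounding issues for assemblies with an odd total length.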

  • Article type: Journal Article
    Autism spectrum disorder (ASD) is a developmental disability caused by differences in the brain regions. Analysis of differential expression (DE) of transcriptomic data allows for genome-wide analysis of gene expression changes related to ASD. De novo mutations may play a vital role in ASD, but the list of genes involved is still far from complete. Differentially expressed genes (DEGs) are treated as candidate biomarkers, and a small set of DEGs might be identified as biomarkers using either biological knowledge or data-driven approaches like machine learning and statistical analysis. In this study, we employed a machine learning-based approach to identify the differential gene expression between ASD and Typical Development (TD). The gene expression data of 15 ASD and 15 TD were obtained from the NCBI GEO database. Initially, we extracted the data and used a standard pipeline to pre-process the data. Further, Random Forest (RF) was used to discriminate genes between ASD and TD. We identified the top 10 prominent differential genes and compared them with the statistical test results. Our results show that the proposed RF model yields 5-fold cross-validation accuracy, sensitivity and specificity of 96.67%. Further, we obtained precision and F-measure scores of 97.5% and 96.57%, respectively. Moreover, we found 34 unique DEG chromosomal locations having influential contributions in identifying ASD from TD. We have also identified chr3:113322718-113322659 as the most significant contributing chromosomal location in discriminating ASD and TD. Our machine learning-based method of refining DE analysis is promising for finding biomarkers from gene expression profiles and prioritizing DEGs. Moreover, the top 10 gene signatures for ASD reported in our study may facilitate the development of reliable diagnosis and prognosis biomarkers for screening ASD.
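The performance figures quoted above (accuracy, sensitivity, specificity, precision, F-measure) all derive from the counts of a binary confusion matrix. The sketch below shows those standard definitions only; the counts are hypothetical, treat ASD as the positive class by assumption, and do not reproduce the study's folds or its reported percentages.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute standard binary classification metrics from confusion counts."""
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0:
        raise ValueError("metrics are undefined for these counts")
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    # F-measure is the harmonic mean of precision and sensitivity.
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
    }


# Hypothetical counts for 15 positive and 15 negative samples.
metrics = binary_metrics(tp=14, fp=0, fn=1, tn=15)
```

In a k-fold setting these values are typically computed per fold and then averaged, which is one reason averaged percentages need not correspond to any single confusion matrix.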

  • Article type: Journal Article
    Bookshelf is a database maintained by the National Center for Biotechnology Information (NCBI) at the National Library of Medicine that contains freely accessible online biomedical documents, including systematic reviews, technical reports, textbooks, and reference books. The database allows users to browse and search across all content and within individual books, and it is linked to other NCBI content. This article provides an overview of Bookshelf and demonstrates its usage in a sample search. The resources available in Bookshelf are useful for students, researchers, healthcare professionals, and librarians.

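  • Article type: Preprint
    Biological interpretation of metabolomic datasets often ends at a pathway analysis step that finds the over-represented metabolic pathways in a list of statistically significant metabolites. However, definitions of biochemical pathways and metabolite coverage vary among curated databases, leading to inaccurate and contradictory interpretations. For lists of genes, transcripts, and proteins, Gene Ontology (GO) term over-representation analysis has become a standardized approach for biological interpretation. GO terms are not limited to predefined pathways and can also include related metabolic processes that are absent from pathway databases. Although GO terms offer several advantages over traditional pathway maps, GO analysis has not yet been achieved for metabolomic datasets. To overcome this, we present a new knowledgebase and online tool, Gene Ontology Analysis by the Integrated Data Science Laboratory for Metabolomics and Exposomics (IDSL.GOA), to conduct GO over-representation analysis for a metabolite list. The IDSL.GOA knowledgebase covers 2,324 metabolic GO terms and the associated 2,818 genes, 22,264 transcripts, 20,158 proteins, 1,482 EC annotations, 2,430 reactions, and 2,212 metabolites. IDSL.GOA analysis of a case study of the older versus young female brain cortex metabolome highlighted more than 250 GO terms as significantly over-represented (FDR < 0.05). The analysis showed that nucleotide salvage processes were strongly affected in the cortex of older females. In contrast, for the same metabolite list, MetaboAnalyst and Reactome pathway analyses indicated fewer than five pathways at FDR < 0.05, none of which was related to nucleotide salvage. We show how IDSL.GOA identified key and relevant GO metabolic processes that were not mentioned by the alternative pathway analysis approaches. Overall, we suggest that metabolomics researchers should not limit the interpretation of metabolite lists to pathway maps and can also leverage GO terms. IDSL.GOA provides a powerful tool for this purpose, allowing a more comprehensive and accurate analysis of metabolite pathway data. The IDSL.GOA tool can be accessed at https://goa.idsl.me/.
    Biological interpretation of metabolomic datasets often ends at a pathway analysis step to find the over-represented metabolic pathways in the list of statistically significant metabolites. However, definitions of biochemical pathways and metabolite coverage vary among different curated databases, leading to missed interpretations. For the lists of genes, transcripts and proteins, Gene Ontology (GO) terms over-representation analysis has become a standardized approach for biological interpretation. However, GO analysis has not been achieved for metabolomic datasets. We present a new knowledgebase (KB) and the online tool, Gene Ontology Analysis by the Integrated Data Science Laboratory for Metabolomics and Exposomics (IDSL.GOA), to conduct GO over-representation analysis for a metabolite list. The IDSL.GOA KB covers 2,393 metabolic GO terms and associated 3,144 genes, 1,492 EC annotations, and 2,621 metabolites. IDSL.GOA analysis of a case study of older vs young female brain cortex metabolome highlighted 82 GO terms being significantly overrepresented (FDR <0.05). We showed how IDSL.GOA identified key and relevant GO metabolic processes that were not yet covered in other pathway databases. Overall, we suggest that interpretation of metabolite lists should not be limited to only pathway maps and can also leverage GO terms. IDSL.GOA provides a useful tool for this purpose, allowing for a more comprehensive and accurate analysis of metabolite pathway data. The IDSL.GOA tool can be accessed at https://goa.idsl.me/.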
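Over-representation analysis with an FDR cutoff, as described above, is commonly implemented as a one-sided hypergeometric test per term followed by Benjamini-Hochberg adjustment. The sketch below illustrates that common recipe under stated assumptions: the abstract does not specify which test or correction IDSL.GOA uses, and the function names and numbers here are hypothetical.

```python
from math import comb
from typing import List


def hypergeom_upper_tail(hits: int, population: int,
                         annotated: int, drawn: int) -> float:
    """P(X >= hits) when `drawn` items are sampled without replacement from
    `population` items, of which `annotated` carry the term of interest."""
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(hits, upper + 1)
    )
    return favourable / total


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity is enforced.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical example: 3 of 3 significant metabolites carry a term that
# annotates 3 of the 10 metabolites in the background set.
p_term = hypergeom_upper_tail(hits=3, population=10, annotated=3, drawn=3)
fdr = benjamini_hochberg([0.01, 0.04, 0.03])
```

The choice of background set (all measured metabolites versus all metabolites in the knowledgebase) changes `population` and `annotated`, and with them every p-value, so it is worth stating explicitly in any analysis.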

  • Article type: Journal Article
    A recent global outbreak of COVID-19 caused by the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) created a pandemic and emerged as a potential threat to humanity. The analysis of virus genetic composition has revealed that the spike protein, one of the major structural proteins, facilitates the entry of the virus to host cells.
    The spike protein has become the main target for prophylactics and therapeutics studies. Here, we compared the spike proteins of SARS-CoV-2 variants using bioinformatics tools.
    The spike protein sequences of wild-type SARS-CoV-2 and its 6 variants (D614G, alpha (B.1.1.7), beta (B.1.351), delta (B.1.617.2), gamma (P.1), and omicron (B.1.1.529)) were retrieved from the NCBI database. The ClustalX program was used for multiple sequence alignment and mutational analysis. Several online bioinformatics tools were used to predict the physiological, immunological, and structural features of the spike proteins of SARS-CoV-2 variants. A phylogenetic tree was constructed using CLC software. Statistical analysis of the data was done using jamovi 2 software.
    Multiple sequence analysis revealed that the P681R mutation in the delta variant, which changed an amino acid from histidine (H) to arginine (R), made the protein more alkaline due to arginine's high pKa value (12.5) compared to histidine's (6.0). Physicochemical properties revealed the relatively higher isoelectric point (7.34) and aliphatic index (84.65) of the delta variant compared to other variants. Statistical analysis of the isoelectric point, antigenicity, and immunogenicity of all the variants revealed significant correlation, with P values ranging from <.007 to .04. The generation of a 2D gel map showed the separation of the delta spike protein from a grouping of the other variants. The phylogenetic tree of the spike proteins showed that the delta variant was close to, and a mix of, the Rousettus bat coronavirus and MERS-CoV.
    The comparative analysis of SARS-CoV-2 variants revealed that the delta variant is more aliphatic in nature, which provides more stability to it and subsequently influences virus behavior.
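The aliphatic index mentioned above is usually computed with Ikai's formula, which weights the mole percentages of alanine, valine, isoleucine, and leucine. The sketch below assumes that formula; the abstract does not name the tool or formula the authors used, and the test sequences are toy strings, not spike protein data.

```python
def aliphatic_index(sequence: str) -> float:
    """Aliphatic index by Ikai's formula:

        AI = X(Ala) + 2.9 * X(Val) + 3.9 * (X(Ile) + X(Leu))

    where X is the mole percentage of each residue in the sequence.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")

    def mole_percent(residue: str) -> float:
        return 100.0 * seq.count(residue) / len(seq)

    return (
        mole_percent("A")
        + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L"))
    )


# Toy sequences (hypothetical, for illustration only).
ai_mixed = aliphatic_index("AVIL")
ai_none = aliphatic_index("GGGG")
```

A higher value reflects a larger relative volume of aliphatic side chains, which is the property the abstract links to stability.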

  • Article type: Journal Article
    GenBank® and the Sequence Read Archive (SRA) are comprehensive databases of publicly available DNA sequences. GenBank contains data for 480,000 named organisms, more than 176,000 within the embryophyta, obtained through submissions from individual laboratories and batch submissions from large-scale sequencing projects. SRA contains reads from next-generation sequencing studies from over 110,000 species. Daily data exchange with the European Nucleotide Archive (ENA) in Europe and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage for both databases. GenBank and SRA data are accessible through the NCBI Entrez retrieval system that integrates these data with other data at NCBI, such as genomes, taxonomy, and the biomedical literature. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. Usage scenarios for both GenBank and SRA ranging from local and cloud analyses to online analyses supported by the NCBI web-based tools are discussed. Both GenBank and SRA, along with their related retrieval and analysis services, are available from the NCBI homepage at www.ncbi.nlm.nih.gov.
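For programmatic access to the Entrez system described above, NCBI offers the E-utilities web interface. The sketch below only builds an `efetch` request URL for a nucleotide record in FASTA format and performs no network call; the helper name is hypothetical, and the accession is just an example value chosen for illustration, not one drawn from the abstract.

```python
from urllib.parse import urlencode

# Base address of the E-utilities efetch endpoint.
EFETCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession: str, database: str = "nuccore",
                     rettype: str = "fasta") -> str:
    """Build an efetch URL that requests one record as plain text."""
    params = {
        "db": database,      # Entrez database to query
        "id": accession,     # accession or UID of the record
        "rettype": rettype,  # record format, e.g. FASTA
        "retmode": "text",   # plain-text response
    }
    return f"{EFETCH_BASE}?{urlencode(params)}"


# Example accession used purely for illustration.
url = build_efetch_url("NC_045512.2")
```

The resulting URL can be passed to any HTTP client; for more than occasional use, NCBI's usage guidelines on request rates and API keys apply.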

  • Article type: Journal Article
    Accurate species-level identification of an etiological agent is crucial for disease diagnosis and management because knowing the agent's identity connects it with what is known about its host range, geographic distribution, and toxin production potential. This is particularly true in publishing peer-reviewed disease reports, where imprecise and/or incorrect identifications weaken the public knowledge base. This can be a daunting task for phytopathologists and other applied biologists who need to identify Fusarium in particular, because published and ongoing multilocus molecular systematic studies have highlighted several confounding issues. Paramount among these are: (i) this agriculturally and clinically important genus is currently estimated to comprise more than 400 phylogenetically distinct species (i.e., phylospecies), with more than 80% of these discovered within the past 25 years; (ii) approximately one-third of the phylospecies have not been formally described; (iii) morphology alone is inadequate to distinguish most of these species from one another; and (iv) the current rapid discovery of novel fusaria from pathogen surveys and accompanying impact on the taxonomic landscape is expected to continue well into the foreseeable future. To address the critical need for accurate pathogen identification, our research groups are focused on populating two web-accessible databases (FUSARIUM-ID v.3.0 and the nonredundant National Center for Biotechnology Information nucleotide collection that includes GenBank) with portions of three phylogenetically informative genes (i.e., TEF1, RPB1, and RPB2) that resolve at or near the species level in every Fusarium species. The objectives of this Special Report, and its companion in this issue (Torres-Cruz et al. 2022), are to provide a progress report on our efforts to populate these databases and to outline a set of best practices for DNA sequence-based identification of fusaria.