Scientific name

  • Article type: Journal Article
    BACKGROUND: The standardization of biological data using unique identifiers is vital for seamless data integration, comprehensive interpretation, and reproducibility of research findings, contributing to advancements in bioinformatics and systems biology. Despite being widely accepted as a universal identifier, scientific names for biological species have inherent limitations, including lack of stability, uniqueness, and convertibility, hindering their effective use as identifiers in databases, particularly in natural product (NP) occurrence databases, posing a substantial obstacle to utilizing this valuable data for large-scale research applications.
    RESULTS: To address these challenges and facilitate high-throughput analysis of biological data involving scientific names, we developed PhyloSophos, a Python package that considers the properties of scientific names and taxonomic systems to accurately map name inputs to entries within a chosen reference database. We illustrate the importance of assessing multiple taxonomic databases and considering taxonomic syntax-based pre-processing using NP occurrence databases as an example, with the ultimate goal of integrating heterogeneous information into a single, unified dataset.
    CONCLUSIONS: We anticipate PhyloSophos to significantly aid in the systematic processing of poorly digitized and curated biological data, such as biodiversity information and ethnopharmacological resources, enabling full-scale bioinformatics analysis using these valuable data resources.
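
The mapping idea described above can be sketched in a few lines of Python. This is a minimal illustration only and does not use or reproduce the PhyloSophos API. The reference table, the synonym table, the `REF:` identifiers, and the function names `preprocess` and `map_name` are all hypothetical, and the pre-processing rules (dropping `cf.`/`aff.` qualifiers, collapsing whitespace, lowercasing) are assumed examples of syntax-based clean-up.

```python
import re

# Hypothetical reference table: accepted name -> identifier (made-up IDs).
REFERENCE = {
    "panax ginseng": "REF:0001",
    "glycyrrhiza uralensis": "REF:0002",
}

# Hypothetical synonym table: synonym -> accepted name (illustrative only).
SYNONYMS = {
    "panax schinseng": "panax ginseng",
}

# Open-nomenclature qualifiers that are not part of the name itself.
QUALIFIERS = re.compile(r"\b(cf|aff)\.?\s+", flags=re.IGNORECASE)


def preprocess(name):
    """Strip qualifiers, collapse whitespace, and lowercase a raw name string."""
    name = QUALIFIERS.sub("", name)
    name = re.sub(r"\s+", " ", name).strip()
    return name.lower()


def map_name(name):
    """Return the reference identifier for a raw name, or None if unmapped."""
    key = preprocess(name)
    key = SYNONYMS.get(key, key)  # resolve a synonym to its accepted name
    return REFERENCE.get(key)


if __name__ == "__main__":
    for raw in ["  Panax   cf. ginseng ", "Panax schinseng", "Unknownus plantus"]:
        print(repr(raw), "->", map_name(raw))
```

A real workflow would load the reference and synonym tables from one or more taxonomic databases rather than hard-coding them, which is where comparing multiple databases becomes relevant.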

  • Article type: English Abstract
    The official specifications for food additives from natural sources list the species according to their scientific and Japanese names, thereby providing a unique identifier for the species. This helps to prevent the use of nonprescribed species, which might cause unexpected or unintended health hazards. However, there are cases in which the names of the source species listed in the official specifications differ from the accepted scientific names based on the latest taxonomic research. In this paper, we argue that it is more important to define scientific and Japanese names with an emphasis on traceability in order to control the range of food additive ingredients in a rational and sustainable manner. Therefore, we propose a method for ensuring traceability as well as a specific notation procedure for scientific and Japanese names. Using this method, we examined the source species for three food additives. In some cases, the range of source species expanded with the change in scientific names. Ensuring traceability is extremely important, but it is also necessary to confirm whether unexpected species are included when names are changed.

  • Article type: Journal Article
    BACKGROUND: Scientific names in biology act as universal links. They allow us to cross-reference information about organisms globally. However, variations in the spelling of scientific names greatly diminish their ability to interconnect data. Such variations may include abbreviations, annotations, misspellings, etc. Authorship is a part of a scientific name and may also differ significantly. To match all possible variations of a name, we need to divide them into their elements and classify each element according to its role. We refer to this as 'parsing' the name. Parsing categorizes a name's elements into those that are stable and those that are prone to change. Names are matched first by combining them according to their stable elements. Matches are then refined by examining their varying elements. This two-stage process dramatically improves the number and quality of matches. It is especially useful for automatic data exchange within the context of "Big Data" in biology.
    RESULTS: We introduce Global Names Parser (gnparser). It is a Java tool written in the Scala language (a language for the Java Virtual Machine) to parse scientific names. It is based on a Parsing Expression Grammar. The parser can be applied to scientific names of any complexity. It assigns a semantic meaning (such as genus name, species epithet, rank, year of publication, authorship, annotations, etc.) to all elements of a name. It is able to work with nested structures, as in the names of hybrids. gnparser performs with ≈99% accuracy and processes 30 million name-strings/hour per CPU thread. The gnparser library is compatible with Scala, Java, R, Jython, and JRuby. The parser can be used as a command-line application, a socket server, a web app, or a RESTful HTTP service. It is released under an open-source MIT license.
    CONCLUSIONS: Global Names Parser (gnparser) is a fast, high precision tool for biodiversity informaticians and biologists working with large numbers of scientific names. It can replace expensive and error-prone manual parsing and standardization of scientific names in many situations, and can quickly enhance the interoperability of distributed biological information.
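
The two-stage matching described in this abstract (group by stable elements, then refine by variable ones) can be illustrated with a toy Python sketch. It is not gnparser and does not implement its Parsing Expression Grammar. The regular expression and the helper names `parse_name`, `normalize_authorship`, and `match_names` are assumptions for illustration, and the pattern only handles simple binomials (anything after the epithet, including a subspecies epithet, is treated as authorship).

```python
import re
from collections import defaultdict

# Toy pattern: capitalized genus, lowercase epithet, remainder as authorship.
NAME_RE = re.compile(r"^\s*([A-Z][a-z]+)\s+([a-z\-]+)\s*(.*)$")


def parse_name(name_string):
    """Split a binomial into a stable part (canonical form) and a variable part."""
    m = NAME_RE.match(name_string)
    if m is None:
        return None
    genus, epithet, rest = m.groups()
    return {"canonical": f"{genus} {epithet}", "authorship": rest.strip()}


def normalize_authorship(authorship):
    """Keep only lowercase letters and digits so punctuation variants compare equal."""
    return re.sub(r"[^a-z0-9]", "", authorship.lower())


def match_names(name_strings):
    """Stage 1: group by canonical form. Stage 2: split groups by authorship."""
    groups = defaultdict(lambda: defaultdict(list))
    for s in name_strings:
        parsed = parse_name(s)
        if parsed is None:
            continue  # unparseable strings are skipped in this sketch
        key = normalize_authorship(parsed["authorship"])
        groups[parsed["canonical"]][key].append(s)
    return {canonical: dict(by_author) for canonical, by_author in groups.items()}


if __name__ == "__main__":
    names = [
        "Homo sapiens Linnaeus, 1758",
        "Homo sapiens  Linnaeus 1758",
        "Homo sapiens",
        "Pinus sylvestris L.",
    ]
    for canonical, by_author in match_names(names).items():
        print(canonical, by_author)
```

Grouping on the canonical form first keeps the spelling variants of one name together, and the authorship key then separates records that may refer to different usages of that name.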

  • Article type: Journal Article
    In conservation science, assessments of trends and priorities for actions often focus on species as the management unit. Studies on species coverage in online media are commonly conducted by using species vernacular names. However, the use of species vernacular names for web-based data search is problematic due to the high risk of mismatches in results. While the use of Latin names may produce more consistent results, it is uncertain whether a search using Latin names will produce unbiased results as compared to vernacular names. We assessed the potential of Latin names to be used as an alternative to vernacular names for data mining within the field of conservation science. Using Latin and vernacular names, we searched for species from four species groups: diurnal birds of prey, Carnivora, Primates, and marine mammals. We assessed the relationship of the results obtained within different online sources, such as Internet pages, newspapers, and social media networks. Results indicated that the search results based on Latin and vernacular names were highly correlated, and confirmed that one may be used as an alternative for the other. We also demonstrated the potential of the number of images posted on the Internet to be used as an indication of public attention towards different species.

  • Article type: Journal Article
    BACKGROUND: The scientific names of plants and animals play a major role in Life Sciences as information is indexed, integrated, and searched using scientific names. The main problem with names is their ambiguous nature, because more than one name may point to the same taxon and multiple taxa may share the same name. In addition, scientific names change over time, which makes them open to various interpretations. Applying machine-understandable semantics to these names enables efficient processing of biological content in information systems. The first step is to use unique persistent identifiers instead of name strings when referring to taxa. The most commonly used identifiers are Life Science Identifiers (LSID), which are traditionally used in relational databases, and more recently HTTP URIs, which are applied on the Semantic Web by Linked Data applications.
    RESULTS: We introduce two models for expressing taxonomic information in the form of species checklists. First, we show how species checklists are presented in a relational database system using LSIDs. Then, in order to gain a more detailed representation of taxonomic information, we introduce the meta-ontology TaxMeOn to model the same content as Semantic Web ontologies in which taxa are identified using HTTP URIs. We also explore how changes in scientific names can be managed over time.
    CONCLUSIONS: The use of HTTP URIs is preferable for presenting the taxonomic information of species checklists. Unlike an LSID, an HTTP URI identifies a taxon and also operates as a web address from which additional information about the taxon can be located. This enables the integration of biological data from different sources on the web using Linked Data principles and prevents the formation of information silos. The Linked Data approach allows a user to assemble information and evaluate the complexity of taxonomical data based on conflicting views of taxonomic classifications. Using HTTP URIs and Semantic Web technologies also facilitates the representation of the semantics of biological data and, in this way, the creation of more "intelligent" biological applications and services.
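
The core idea of referring to a taxon by a persistent identifier while its scientific name changes over time can be sketched as follows. This is a minimal illustration, not the TaxMeOn model. The `Taxon` class, the `example.org` URI, the placeholder names "Aus bus" and "Cus bus", and the years are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Taxon:
    """A taxon keyed by a persistent identifier, with a dated history of names."""
    uri: str
    names: list = field(default_factory=list)  # (year, name) tuples

    def rename(self, year, name):
        """Record that the taxon has been known by `name` since `year`."""
        self.names.append((year, name))

    def name_in(self, year):
        """Return the name in use in `year`, or None if none was recorded yet."""
        valid = [name for y, name in sorted(self.names) if y <= year]
        return valid[-1] if valid else None


if __name__ == "__main__":
    # Hypothetical identifier and placeholder names, for illustration only.
    taxon = Taxon("http://example.org/taxon/1")
    taxon.rename(1900, "Aus bus")
    taxon.rename(1990, "Cus bus")
    print(taxon.uri, taxon.name_in(1950), taxon.name_in(2000))
```

Because datasets would cite the identifier rather than the name string, a record created under the older name still resolves to the same taxon after the name changes.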