Keywords: HTTP URI; LSID; Linked data; Ontology; Scientific name; Semantic web; Species checklist; Taxonomic concept

Source: DOI:10.1186/2041-1480-5-40

Abstract:
BACKGROUND: The scientific names of plants and animals play a major role in Life Sciences as information is indexed, integrated, and searched using scientific names. The main problem with names is their ambiguous nature, because more than one name may point to the same taxon and multiple taxa may share the same name. In addition, scientific names change over time, which makes them open to various interpretations. Applying machine-understandable semantics to these names enables efficient processing of biological content in information systems. The first step is to use unique persistent identifiers instead of name strings when referring to taxa. The most commonly used identifiers are Life Science Identifiers (LSID), which are traditionally used in relational databases, and more recently HTTP URIs, which are applied on the Semantic Web by Linked Data applications.
RESULTS: We introduce two models for expressing taxonomic information in the form of species checklists. First, we show how species checklists are represented in a relational database system using LSIDs. Then, to obtain a more detailed representation of taxonomic information, we introduce the meta-ontology TaxMeOn, which models the same content as Semantic Web ontologies in which taxa are identified using HTTP URIs. We also explore how changes in scientific names can be managed over time.
CONCLUSIONS: The use of HTTP URIs is preferable for presenting the taxonomic information of species checklists. Unlike an LSID, an HTTP URI both identifies a taxon and operates as a web address from which additional information about the taxon can be retrieved. This enables the integration of biological data from different sources on the web using Linked Data principles and prevents the formation of information silos. The Linked Data approach allows a user to assemble information and evaluate the complexity of taxonomic data based on conflicting views of taxonomic classifications. Using HTTP URIs and Semantic Web technologies also facilitates the representation of the semantics of biological data and, in this way, the creation of more "intelligent" biological applications and services.
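
The contrast between the two identifier types, and the name ambiguity they are meant to resolve, can be illustrated with a minimal Python sketch. It is not taken from the paper: the identifiers under `example.org`, the placeholder name "Aus bus", and the helper names `parse_lsid`, `is_web_address`, and `taxa_for_name` are all assumptions made for illustration. The only external fact relied on is the general LSID layout `urn:lsid:authority:namespace:object[:revision]`.

```python
from urllib.parse import urlparse


def parse_lsid(lsid):
    """Split an LSID of the form urn:lsid:authority:namespace:object[:revision]."""
    parts = lsid.split(":")
    if len(parts) not in (5, 6) or [p.lower() for p in parts[:2]] != ["urn", "lsid"]:
        raise ValueError(f"not an LSID: {lsid!r}")
    return {
        "authority": parts[2],
        "namespace": parts[3],
        "object": parts[4],
        "revision": parts[5] if len(parts) == 6 else None,
    }


def is_web_address(identifier):
    """Return True if the identifier can be used directly as a web address."""
    return urlparse(identifier).scheme in ("http", "https")


# Hypothetical checklist data: one name string that refers to two different
# taxa in two checklist versions. Each taxon has its own HTTP URI.
NAME_TO_TAXA = {
    "Aus bus": [
        "http://example.org/checklist-a/taxon/1",
        "http://example.org/checklist-b/taxon/7",
    ],
}


def taxa_for_name(name):
    """Return every taxon identifier that the given name string may refer to."""
    return NAME_TO_TAXA.get(name, [])


if __name__ == "__main__":
    lsid = "urn:lsid:example.org:taxa:12345"  # hypothetical LSID
    print(parse_lsid(lsid))
    print(is_web_address(lsid))  # an LSID needs a separate resolver
    for uri in taxa_for_name("Aus bus"):
        print(uri, is_web_address(uri))
```

In this sketch the name string alone returns two candidate taxa, whereas each HTTP URI points to exactly one of them and could be requested directly for further data. The LSID carries the same identifying parts but has to go through a separate resolution step first.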