Databases, Nucleic Acid

  • Article type: Journal Article
    Cancer stemness plays an important role in cancer initiation and progression, and is the major cause of tumor invasion, metastasis, recurrence, and poor prognosis. Non-coding RNAs (ncRNAs) are a class of RNA transcripts that generally cannot encode proteins and have been demonstrated to play a critical role in regulating cancer stemness. Here, we developed the ncStem database to record manually curated and predicted ncRNAs associated with cancer stemness. In total, ncStem contains 645 experimentally verified entries, including 159 long non-coding RNAs (lncRNAs), 254 microRNAs (miRNAs), 39 circular RNAs (circRNAs), and 5 other ncRNAs. The detailed information of each entry includes the ncRNA name, ncRNA identifier, disease, reference, expression direction, tissue, species, and so on. In addition, ncStem also provides computationally predicted cancer stemness-associated ncRNAs for 33 TCGA cancers, which were prioritized using the random walk with restart (RWR) algorithm based on regulatory and co-expression networks. The total predicted cancer stemness-associated ncRNAs included 11 132 lncRNAs and 972 miRNAs. Moreover, ncStem provides tools for functional enrichment analysis, survival analysis, and cell location interrogation for cancer stemness-associated ncRNAs. In summary, ncStem provides a platform to retrieve cancer stemness-associated ncRNAs, which may facilitate research on cancer stemness and offer potential targets for cancer treatment. Database URL:
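
    The random walk with restart (RWR) prioritization mentioned above can be illustrated with a minimal sketch. This is not ncStem's implementation: the toy network, the node names, the restart probability of 0.5, and the convergence tolerance are all assumptions chosen for illustration. The walker starts from seed nodes (here, hypothetical known stemness genes), moves along weighted edges, and at each step jumps back to the seeds with a fixed probability; the steady-state visiting probability is then used to rank the remaining nodes.

```python
def random_walk_with_restart(adjacency, seeds, restart_prob=0.5,
                             tol=1e-10, max_iter=1000):
    """Return steady-state RWR scores for every node in the graph.

    adjacency: dict mapping node -> {neighbor: edge weight} (undirected graph
               stored symmetrically).
    seeds:     nodes the walker restarts from.
    """
    nodes = sorted(adjacency)
    seeds = set(seeds)
    # Restart vector p0: uniform probability over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    # Total edge weight leaving each node, used to normalize transitions.
    out_weight = {n: sum(adjacency[n].values()) for n in nodes}

    p = dict(p0)
    for _ in range(max_iter):
        # Restart term: r * p0
        nxt = {n: restart_prob * p0[n] for n in nodes}
        for src in nodes:
            mass = (1.0 - restart_prob) * p[src]
            if out_weight[src] == 0:
                # Isolated node: return its probability mass to the seeds.
                for n in nodes:
                    nxt[n] += mass * p0[n]
                continue
            # Walk term: spread mass to neighbors in proportion to edge weight.
            for dst, w in adjacency[src].items():
                nxt[dst] += mass * w / out_weight[src]
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Hypothetical toy network (names are placeholders, not ncStem data).
graph = {
    "stem_gene_1": {"stem_gene_2": 1.0, "ncRNA_a": 1.0},
    "stem_gene_2": {"stem_gene_1": 1.0, "ncRNA_a": 1.0, "ncRNA_b": 1.0},
    "ncRNA_a": {"stem_gene_1": 1.0, "stem_gene_2": 1.0},
    "ncRNA_b": {"stem_gene_2": 1.0, "ncRNA_c": 1.0},
    "ncRNA_c": {"ncRNA_b": 1.0},
}
seed_nodes = ["stem_gene_1", "stem_gene_2"]

scores = random_walk_with_restart(graph, seed_nodes)
# Rank the non-seed nodes by their proximity to the seeds.
ranked = sorted((n for n in graph if n not in seed_nodes),
                key=lambda n: scores[n], reverse=True)
print(ranked)
```

    In this toy example, a candidate connected to both seeds ranks above one that is reachable only through a longer path. A real analysis would presumably build the network from regulatory and co-expression data, as the abstract describes, and tune the restart probability.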






  • Article type: Journal Article
    The DNA Commission of the International Society for Forensic Genetics (ISFG) has developed a set of nomenclature recommendations for short tandem repeat (STR) sequences. These recommendations follow the 2016 considerations of the DNA Commission of the ISFG, incorporating the knowledge gained through research and population studies in the intervening years. While maintaining a focus on backward compatibility with the capillary electrophoresis (CE) data that currently populate national DNA databases, this report also looks to the future with the establishment of recommended minimum sequence reporting ranges to facilitate interlaboratory comparisons, automated solutions for sequence-based allele designations, a suite of resources to support bioinformatic development, guidance for characterizing new STR loci, and considerations for incorporating STR sequences and other new markers into investigative databases.






  • Article type: Journal Article
    Despite the increasing number of 3D RNA structures in the Protein Data Bank, the majority of experimental RNA structures lack thorough functional annotations. As the significance of the functional roles played by noncoding RNAs becomes increasingly apparent, comprehensive annotation of RNA function is becoming a pressing concern. In response to this need, we have developed FURNA (Functions of RNAs), the first database for experimental RNA structures that aims to provide a comprehensive repository of high-quality functional annotations. These include Gene Ontology terms, Enzyme Commission numbers, ligand-binding sites, RNA families, protein-binding motifs, and cross-references to related databases. FURNA is available at to enable quick discovery of RNA functions from their structures and sequences.






  • Article type: Journal Article
    The introduction of forensic DNA elimination databases represents a pivotal advancement in forensic science, aiming to streamline the process of distinguishing between DNA found at crime scenes and that of individuals involved in the investigation process, such as law enforcement personnel and forensic lab staff. In subsequent phases, once administrators and other stakeholders have become familiar with the database and accrued sufficient experience, expanding it to encompass first responders (including firefighters, paramedics, emergency medical technicians, and other emergency services personnel) can be contemplated. Key challenges in managing these databases include the grounds for collecting samples, ensuring the integrity of both samples and profiles, the duration of retention, access to the database, and the protocols to follow when a match is found in the database. This paper outlines the conceptual and detailed legislative framework in Hungary, where the forensic DNA elimination database was introduced in 2022.






  • Article type: Journal Article
    National forensic DNA databases are a valuable investigative tool with the potential to increase the efficacy of criminal investigations. Their unfettered expansion in recent years raises unsettling ethical issues that require close attention. DNA database expansion threatens the rights to privacy, non-discrimination, and equality, and can undermine public trust in government. This perspective piece relies on data from an international mapping study of forensic DNA databases to document the expansion of these databases, highlight the ethical issues they raise, and propose key recommendations for more responsible use of this infrastructure.






  • Article type: Journal Article
    DNA technology is the gold standard with respect to the identification of individuals from biological evidence. The technology offers the convenience of a universally similar approach and methodology for analysis across the globe. However, the technology has not realised its full potential in India due to the lack of a DNA database and lacunae in sample collection and preservation from the scene of crime and victims (especially those of sexual assault). Further, statistical interpretation of DNA results is non-existent in the majority of cases. Though the latest technologies and developments in the field of DNA analysis are being adopted and implemented, very little has been enacted practically to improve or optimise sample collection and preservation. This article discusses current casework scenarios that highlight the pitfalls and ambiguous areas in the field of DNA analysis, especially with respect to DNA databases, sampling, and statistical approaches to genetic data analysis. Possible solutions and mitigation measures are suggested.






  • Article type: Journal Article
    The relationship between different ribonucleic acids (RNAs) and tumor immunity has been widely investigated. However, a systematic description of tumor immune-related RNAs in different tumors is still lacking. We collected the relationships of tumor immune-related RNAs from the published literature and presented them in a user-friendly interface, "ImmRNA", to provide a resource to study immune-RNA-cancer regulatory relations. ImmRNA contains 49 996 curated entries. Each entry includes gene symbols, gene types, target genes, downstream effects, functions, immune cells, and other information. After rearranging and reanalyzing the data, our dataset offers the following key features: (i) the links between RNAs and immunity in cancers, (ii) the downstream effects and functions of RNAs, (iii) the immune cells and immune pathways related to RNA function, (iv) the relationship between RNAs and prognostic outcomes, and (v) the experimental methods described in each article. ImmRNA provides a valuable resource for understanding the functions of tumor immune-related RNAs. Database URL:






  • Article type: Journal Article
    In 2019, the Texas Department of Public Safety (TXDPS) Texas Ranger Division (TRD) identified approximately 3300 registered sex offenders (RSOs) from whom a "lawfully owed" DNA sample was missing from the Federal Bureau of Investigation's Combined DNA Index System (CODIS). Lawfully owed DNA (LODNA) is defined as a DNA sample from a qualifying offender who should have had their sample entered into CODIS, but for unknown reasons did not. As a result of those findings, TXDPS applied for and was awarded a grant from the Bureau of Justice Assistance's Sexual Assault Kit Initiative to collect DNA specimens from these RSOs and to perform a statewide LODNA census. TXDPS TRD sought to determine: Are the missed DNA collection problems limited to RSOs, or are they also occurring among individuals with a qualifying arrest or conviction as specified by state law? What processes are used to identify individuals who are eligible for DNA sample collection? How is an individual's DNA collection eligibility conveyed to external agencies? The TXDPS LODNA census identified 43,245 individuals who were likely eligible for DNA collection between 1995 and 2020, indicating statewide DNA collection issues. Over 4 years, collection efforts stemming from this census have yielded 5183 LODNA sample collections and 276 CODIS hits. This manuscript aims to raise awareness within other agencies of the importance of implementing best practices to ensure the collection and upload of LODNA from every eligible individual.






  • Article type: Journal Article
    The recent success of AlphaFold2 in protein structure prediction relied heavily on co-evolutionary information derived from homologous protein sequences found in the huge, integrated database of protein sequences (Big Fantastic Database). In contrast, the existing nucleotide databases were not consolidated to facilitate wider and deeper homology search. Here, we built a comprehensive database by incorporating the non-coding RNA (ncRNA) sequences from RNAcentral, the transcriptome assembly and metagenome assembly from metagenomics RAST (MG-RAST), the genomic sequences from Genome Warehouse (GWH), and the genomic sequences from MGnify, in addition to the nucleotide (nt) database and its subsets in the National Center for Biotechnology Information (NCBI). The resulting Master database of All possible RNA sequences (MARS) is 20-fold larger than NCBI's nt database or 60-fold larger than RNAcentral. The new dataset along with a new split-search strategy allows a substantial improvement in homology search over existing state-of-the-art techniques. It also yields more accurate and more sensitive multiple sequence alignments (MSAs) than manually curated MSAs from Rfam for the majority of structured RNAs mapped to Rfam. The results indicate that MARS coupled with the fully automatic homology search tool RNAcmap will be useful for improved structural and functional inference of ncRNAs and RNA language models based on MSAs. MARS is accessible at, and RNAcmap3 is accessible at






  • Article type: Journal Article
    Molecular identification of micro- and macroorganisms based on nuclear markers has revolutionized our understanding of their taxonomy, phylogeny and ecology. Today, research on the diversity of eukaryotes in global ecosystems heavily relies on nuclear ribosomal RNA (rRNA) markers. Here, we present the research community-curated reference database EUKARYOME for nuclear ribosomal 18S rRNA, internal transcribed spacer (ITS) and 28S rRNA markers for all eukaryotes, including metazoans (animals), protists, fungi and plants. It is particularly useful for the identification of arbuscular mycorrhizal fungi as it bridges the four commonly used molecular markers: the ITS1, ITS2, 18S V4-V5 and 28S D1-D2 subregions. The key benefits of this database over other annotated reference sequence databases are that it is not restricted to certain taxonomic groups and it includes all rRNA markers. EUKARYOME also offers a number of reference long-read sequences that are derived from (meta)genomic and (meta)barcoding, a unique feature that can be used for taxonomic identification and chimera control of third-generation, long-read, high-throughput sequencing data. Taxonomic assignments of rRNA genes in the database are verified based on phylogenetic approaches. The reference datasets are available in multiple formats from the project homepage,





