nucleotide sequence

  • Article type: Journal Article
    BACKGROUND: With exponential growth in biological data and computing power, bioinformatics has become an in-demand skill set in both academia and industry. Students need stronger competencies to take on bioinformatics careers, and they need to become familiar with scientific professions in data science and the academic training required to pursue them, in a field where demand outweighs supply.
    METHODS: We integrated a set of bioinformatics activities into a protein structure and function course of a graduate program. Students were given hands-on opportunities to explore bioinformatics-based analyses of biomolecular data and structural biology through a semester-long case study structured as inquiry-based bioinformatics exercises. Towards the end of the term, the students also designed and presented an assignment project documenting the unknown protein that they had identified using the bioinformatics knowledge gained during the term.
    RESULTS: The post-module survey responses and the students' performance in the lab module suggest that the module fostered in-depth knowledge of bioinformatics. Although the students had little prior knowledge of bioinformatics before taking this module, they gave positive feedback.
    CONCLUSIONS: The students became familiar with cross-indexed databases that interlink important data about proteins, enzymes, and genes. The essential skill sets honed by this research-based bioinformatics pedagogical approach will empower students to apply this knowledge in their future endeavours in the bioinformatics field.






  • Article type: Journal Article
    During the last decade, the generation and accumulation of petabase-scale high-throughput sequencing data have resulted in great challenges, including access to human data, as well as transfer, storage, and sharing of enormous amounts of data. To promote data-driven biological research, the Korean government announced that all biological data generated from government-funded research projects should be deposited at the Korea BioData Station (K-BDS), which consists of multiple databases for individual data types. Here, we introduce the Korean Nucleotide Archive (KoNA), a repository of nucleotide sequence data. As of July 2022, the Korean Read Archive in KoNA has collected over 477 TB of raw next-generation sequencing data from national genome projects. To ensure data quality and prepare for international alignment, a standard operating procedure was adopted, which is similar to that of the International Nucleotide Sequence Database Collaboration. The standard operating procedure includes quality control processes for submitted data and metadata using an automated pipeline, followed by manual examination. To ensure fast and stable data transfer, a high-speed transmission system called GBox is used in KoNA. Furthermore, the data uploaded to or downloaded from KoNA through GBox can be readily processed using a cloud computing service called Bio-Express. This seamless coupling of KoNA, GBox, and Bio-Express enhances the data experience, including submission, access, and analysis of raw nucleotide sequences. KoNA not only satisfies the unmet needs for a national sequence repository in Korea but also provides datasets to researchers globally and contributes to advances in genomics. The KoNA is available at






  • Article type: Journal Article
    OBJECTIVE: To determine the complete nucleotide sequence and conduct a phylogenetic analysis of genome variants of the Puumala virus isolated in the Saratov region.
    METHODS: The samples for the study were field material collected in the Gagarinsky (formerly Saratovsky), Engelssky, Novoburassky, and Khvalynsky districts of the Saratov region from 2019 to 2022. To specifically enrich the Puumala virus genome in the samples, PCR was used with a primer panel developed for this study. The resulting PCR products were then sequenced, and the fragments were assembled into one sequence for each segment of the virus genome. Phylogenetic trees were constructed using the maximum parsimony algorithm.
    RESULTS: Genetic variants of the Puumala virus isolated in the Saratov region have a high degree of genome similarity to each other, which indicates a common origin. According to the phylogenetic analysis, all isolated variants (except the isolates from the Khvalynsky district) form a separate branch in the cluster formed by hantaviruses from other constituent entities of the Volga Federal District. The virus variants from the Republics of Udmurtia and Tatarstan, as well as from the Samara and Ulyanovsk regions, are closest to the samples from the Saratov region.
    CONCLUSIONS: The data obtained show a pronounced territorial confinement of strains to certain regions or areas that are the natural biotopes of their carriers. This makes it possible to determine fairly accurately the territory of possible infection of patients and/or the circulation of carriers of these virus variants based on the sequence of individual segments of their genome.
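    The maximum parsimony criterion named in the methods can be illustrated with a minimal sketch. The abstract does not say which software or tree-search procedure was used, so the code below is only an assumption-laden illustration: it applies Fitch's small-parsimony algorithm to a single alignment site on a fixed binary tree, and the taxon names and bases are hypothetical.

```python
def fitch_score(tree, leaf_states):
    """Fitch small parsimony for one site.

    tree: a leaf name (str) or a 2-tuple of subtrees.
    leaf_states: dict mapping leaf name -> observed base.
    Returns (candidate ancestral states, minimum number of changes).
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left_set, left_cost = fitch_score(tree[0], leaf_states)
    right_set, right_cost = fitch_score(tree[1], leaf_states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed.
        return common, left_cost + right_cost
    # Children disagree: take the union and count one substitution.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical isolates and bases at one alignment column.
states = {"A": "G", "B": "G", "C": "A", "D": "A"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))

_, score_1 = fitch_score(tree_1, states)
_, score_2 = fitch_score(tree_2, states)
print(score_1, score_2)
```

    Under this criterion the topology with the lower total score across all sites is preferred; here tree_1 groups the isolates that share a base and therefore needs fewer changes.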






  • Article type: English Abstract
    Introduction. During the SARS-CoV-2 pandemic in Antioquia, we experienced epidemiological peaks related to the α, γ, β, λ, and δ variants, of which δ had the highest incidence and prevalence. This lineage is considered a variant of concern because of its clinical manifestations and epidemiological characteristics. A total of 253 δ sublineages have been reported in the PANGOLIN database. Identifying these sublineages through genomic analysis has made it possible to trace their evolution and propagation. Objective. To characterize the genetic diversity of the different SARS-CoV-2 δ sublineages in Antioquia and to describe their prevalence. Materials and methods. We collected sociodemographic information from 2,675 samples and obtained 1,115 genomes from the GISAID database between July 12, 2021, and January 18, 2022. Of the analyzed genomes, 515 were selected for their high coverage values (>90%) to perform phylogenetic analysis and to infer allele frequencies of mutations of interest. Results. We characterized 24 sublineages, the most prevalent of which was AY.25. Mutations of interest such as L452R, P681R, and P681H were identified in this sublineage at a frequency close to 0.99. Conclusions. This study identified that the AY.25 sublineage has a transmission advantage over the other δ sublineages. This attribute may be related to the presence of the L452R and P681R mutations, which other studies have associated with higher transmissibility, greater evasion of the immune system, and lower efficacy of drugs against SARS-CoV-2.
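    The two computational steps in the methods, filtering genomes by coverage and inferring mutation frequencies, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the record layout, the function name, and the toy genomes are hypothetical, and only the >90% coverage threshold comes from the abstract.

```python
from collections import Counter


def mutation_frequencies(genomes, min_coverage=0.90):
    """Keep genomes above the coverage threshold and return
    (number kept, {mutation: fraction of kept genomes carrying it})."""
    kept = [g for g in genomes if g["coverage"] > min_coverage]
    # Count each mutation at most once per genome.
    counts = Counter(m for g in kept for m in set(g["mutations"]))
    n = len(kept)
    freqs = {m: c / n for m, c in counts.items()} if n else {}
    return n, freqs


# Hypothetical toy records; real input would come from sequence metadata.
genomes = [
    {"id": "g1", "coverage": 0.98, "mutations": ["L452R", "P681R"]},
    {"id": "g2", "coverage": 0.95, "mutations": ["L452R", "P681R"]},
    {"id": "g3", "coverage": 0.93, "mutations": ["L452R"]},
    {"id": "g4", "coverage": 0.80, "mutations": ["P681H"]},
]

n_kept, freqs = mutation_frequencies(genomes)
print(n_kept, freqs)
```

    The low-coverage genome g4 is excluded before counting, so its mutation does not contribute to any frequency.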






  • Article type: Journal Article
    BACKGROUND: 1-methyladenosine (m1A) is a variant of methyladenosine that carries a methyl substituent at the 1st position and has a prominent role in RNA stability and human metabolites.
    OBJECTIVE: Traditional approaches, such as mass spectrometry and site-directed mutagenesis, have proved to be time-consuming and complicated.
    METHODS: The present research focused on the identification of m1A sites within RNA sequences using novel feature development mechanisms. The obtained features were used to train the ensemble models, including blending, boosting, and bagging. Independent testing and k-fold cross validation were then performed on the trained ensemble models.
    RESULTS: The proposed model outperformed the preexisting predictors and revealed optimized scores based on major accuracy metrics.
    CONCLUSIONS: For research purposes, a user-friendly webserver of the proposed model can be accessed through .
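    The k-fold cross-validation step mentioned in the methods can be sketched with the standard library alone. The abstract gives neither the value of k nor the tooling, so the function name, the contiguous (unshuffled) fold layout, and the sample counts below are assumptions for illustration only.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Fold sizes differ by at most one; every sample appears in exactly
    one test fold.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Hypothetical setting: 10 labelled sequences, 3 folds.
folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print(len(train), test)
```

    Each model would be trained on the train indices and scored on the held-out test indices, with the k scores averaged; in practice the samples are usually shuffled or stratified before splitting.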






  • Article type: Journal Article
    The aim of the study in this article is to systematise the newly introduced strains of Lactobacillus based on determining the nucleotide sequence of a particular set of their genes (loci). The primary approach employed to address this issue involves conducting a laboratory experiment. During this experiment, a thorough examination was carried out on a set of organic compounds consisting of small DNA elements from the Lactobacillus genus. The Multilocus genotyping method served as the central technique, complemented by additional molecular-biological and population methods. These additional methods were utilized to determine the extent of phylogenetic similarity among pure cultures of Lactobacillus and to classify them accordingly. The article presents the gene isolates that were used for Multilocus typing; the number of L. casei isolates suitable for Multilocus genotyping was revealed; the gene alleles that allowed classifying L. casei isolates into five sequencing types were revealed; the effectiveness of genetic typing method for Multilocus sequencing was substantiated. The article is of practical value for microbiologists and geneticists in the field of molecular biology, as well as for technologists in the food industry. With the development of applied methods in genetic systematics, it has become possible to study pure culture of Lactobacillus species. The application of modern methods of genotypic classification of Lactobacillus species will make it possible to increase the efficiency of using better and safer products in the food industry and medicine.






  • Article type: Journal Article
    Human gene research studies that describe wrongly identified nucleotide sequence reagents have been mostly identified in journals of low to moderate impact factor, where unreliable findings could be considered to have limited influence on future research. This study examined whether papers describing wrongly identified nucleotide sequences are also published in high-impact-factor cancer research journals. We manually verified nucleotide sequence identities in original Molecular Cancer articles published in 2014, 2016, 2018, and 2020, including nucleotide sequence reagents that were claimed to target circRNAs. Using keywords identified in some 2018 and 2020 Molecular Cancer papers, we also verified nucleotide sequence identities in 2020 Oncogene papers that studied miRNA(s) and/or circRNA(s). Overall, 3.8% (251/6647) and 4.0% (47/1165) nucleotide sequences that were verified in Molecular Cancer and Oncogene papers, respectively, were found to be wrongly identified. Wrongly identified nucleotide sequences were distributed across 18% (91/500) original Molecular Cancer papers, including 38% (31/82) Molecular Cancer papers from 2020, and 40% (21/52) selected Oncogene papers from 2020. Original papers with wrongly identified nucleotide sequences were therefore unexpectedly frequent in two high-impact-factor cancer research journals, highlighting the risks of employing journal impact factors or citations as proxies for research quality.
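    The proportions reported in this abstract can be rechecked directly from the stated counts. The short sketch below only repeats that arithmetic; the dictionary labels are descriptive names chosen here, not terms from the paper.

```python
def percent(part, total):
    """Percentage rounded to one decimal place."""
    return round(100 * part / total, 1)


# (wrongly identified, total verified) as stated in the abstract.
counts = {
    "Molecular Cancer sequences": (251, 6647),
    "Oncogene sequences": (47, 1165),
    "Molecular Cancer papers": (91, 500),
    "Molecular Cancer papers, 2020": (31, 82),
    "Oncogene papers, 2020": (21, 52),
}

rates = {label: percent(part, total) for label, (part, total) in counts.items()}
for label, rate in rates.items():
    print(f"{label}: {rate}%")
```

    The one-decimal values agree with the rounded percentages quoted in the abstract (3.8%, 4.0%, 18%, 38%, and 40%).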






  • Article type: Journal Article
    BACKGROUND: Vibrio species are among the autochthonous bacterial populations found in surface waters and are associated with various life-threatening extraintestinal diseases, especially in human populations with underlying illnesses and wound infections. At present, very little information exists regarding the mutational diversity of these species' virulence and resistance genes. This study evaluated variations in endonucleases and mutational diversity of the virulence and resistance genes of Vibrio isolates, harboring virulence-correlated gene (vcgCPI), dihydropteroate synthase type 1 and type II genes (Sul 1 and 11), (aadA) aminoglycoside (3'') (9) adenylyltransferase gene, (aac(3)-IIa, (aacC2)a, aminoglycoside N(3)-acetyltransferase III, and (strA) aminoglycoside 3'-phosphotransferase resistance genes.
    METHODS: The study used a combination of molecular biology techniques, bioinformatics tools, and sequence analysis.
    RESULTS: Our results revealed various nucleotide variations in virulence determinants of V. vulnificus (vcgCPI) at nucleotide positions (codon) 73-75 (A → G) and 300-302 (N → S). The aminoglycoside resistance gene (aadA) of Vibrio species shows a nucleotide difference at position 482 (A → G), while the aminoglycosides resistance gene (sul 1 and 11) showed two variable regions of nucleotide polymorphism (102 and 140). An amino acid difference accompanies the nucleotide polymorphism at position 140 (A → E). The banding patterns produced by the restriction enzymes HinP1I, MwoI, and StyD4I showed significant variations. The restriction enzyme digestion of the dihydropteroate synthase type 1 and type II genes (Sul 1 and 11) also differed significantly, while the enzymes DpnI and HinfI indicated no significant differences. The restriction enzyme NlaIV showed no band compared to reference isolates from GenBank. However, the resistance determinants show significant point nucleotide mutations that do not produce any amino acid change, with diverse polymorphic regions, as revealed in the restriction digest profile.
    CONCLUSIONS: The described virulence and resistance determinants possess specific polymorphic loci relevant to pathogenomics studies, pharmacogenomics, and the control of such water-associated strains.
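    The distinction drawn in the results, between point mutations that change an amino acid and those that do not, can be illustrated with a small sketch. The codons below are hypothetical examples chosen from the standard genetic code (the abstract does not give the actual codons), and the table is deliberately partial.

```python
# Partial standard codon table: only the codons used in this illustration.
CODON_TABLE = {"AAC": "N", "AGC": "S", "GGA": "G", "GGG": "G"}


def substitute(codon, position, base):
    """Return the codon with `base` placed at 0-based `position`."""
    return codon[:position] + base + codon[position + 1:]


def classify(codon, position, base):
    """Return (mutant codon, amino acid before, amino acid after, kind)."""
    mutant = substitute(codon, position, base)
    before, after = CODON_TABLE[codon], CODON_TABLE[mutant]
    kind = "synonymous" if before == after else "nonsynonymous"
    return mutant, before, after, kind


# A single A -> G change at the second codon position: N -> S.
missense = classify("AAC", 1, "G")
# A single A -> G change at the third codon position: G stays G.
silent = classify("GGA", 2, "G")
print(missense)
print(silent)
```

    The same nucleotide substitution (A → G) can therefore be nonsynonymous or synonymous depending on the codon and the position it hits.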






  • Article type: Journal Article
    Infectious diseases of young and adult birds with respiratory syndrome are a significant deterrent to the development of industrial poultry farming due to decreased productivity and significant mortality. The only effective method of combating viral diseases is timely and targeted vaccination, which largely depends on laboratory diagnostic results.
    This article aims to study real-time reverse transcription polymerase chain reaction (RT-PCR), which has the prospect of more effective diagnosis of vaccine strains of chicken infectious bronchitis and Newcastle disease.
    The fastest and most accurate method for the differential diagnosis of pathogens in an associative viral infection is RT-PCR. The method proposed in the article for selecting primers for amplification made it possible to use this method for the simultaneous interspecies differential diagnosis of two or more viral agents, significantly accelerating their diagnosis.
    The correlation of the nucleotide sequence obtained from sequencing to a specific virus strain is complicated by the lack of a single nomenclature mechanism for separating genetic groups.
    The results of this study will allow easy and fast typing of sequences into known and databased virus strains and avoid further confusion in the nomenclature of genetic groups in the future.






  • Article type: Journal Article
    A recombinant form of pneumolysin from Streptococcus pneumoniae was obtained. By using Vector NTI Advance 11.0 bioinformatic analysis software, specific primers were designed in order to amplify the genome fragment of strain No. 3358 S. pneumoniae serotype 19F containing the nucleotide sequence encoding the full-length pneumolysin protein. A PCR product with a molecular weight corresponding to the nucleotide sequence of the S. pneumoniae genome fragment encoding the full-length pneumolysin was obtained. An expression system for recombinant pneumolysin in E. coli was constructed. Sequencing confirmed the identity of the inserted nucleotide sequence encoding the full-length recombinant pneumolysin synthesized in E. coli M15 strain. Purification of the recombinant protein was performed by affinity chromatography using Ni-Sepharose in 8 M urea buffer solution. Confirmation of the recombinant protein was performed by immunoblotting with monoclonal antibodies to pneumolysin.





