Molecular Sequence Data

  • Article type: Evaluation Study
    BACKGROUND: Human cancers are complex ecosystems composed of cells with distinct molecular signatures. Such intratumoral heterogeneity poses a major challenge to cancer diagnosis and treatment. Recent advances in single-cell techniques such as scRNA-seq have brought unprecedented insights into cellular heterogeneity. A resulting computational challenge is to cluster high-dimensional, noisy datasets that contain substantially fewer cells than genes.
    METHODS: In this paper, we introduce conCluster, a consensus clustering framework for cancer subtype identification from single-cell RNA-seq data. Using an ensemble strategy, conCluster fuses multiple basic partitions into consensus clusters.
    RESULTS: Applied to real cancer scRNA-seq datasets, conCluster can more accurately detect cancer subtypes than the widely used scRNA-seq clustering methods. Further, we conducted co-expression network analysis for the identified melanoma subtypes.
    CONCLUSIONS: Our analysis demonstrates that these subtypes exhibit distinct gene co-expression networks and significant gene sets with different functional enrichment.
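The ensemble idea in the METHODS paragraph can be illustrated with a co-association matrix, a common way to fuse several partitions. This is a minimal sketch under stated assumptions, not conCluster's actual algorithm, which the abstract does not specify: the function names, the 0.5 threshold, and the linking rule (cells are merged when they co-cluster in more than half of the basic partitions) are all hypothetical choices made for illustration.

```python
from itertools import combinations


def co_association(partitions):
    """Fraction of basic partitions in which each pair of cells shares a cluster."""
    n = len(partitions[0])
    counts = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1.0
    k = len(partitions)
    return [[value / k for value in row] for row in counts]


def consensus_clusters(partitions, threshold=0.5):
    """Link cells whose co-association exceeds `threshold` (assumed rule),
    then return connected components as consensus cluster labels."""
    co = co_association(partitions)
    n = len(co)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if co[i][j] > threshold:
            parent[find(i)] = find(j)

    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Three toy basic partitions of six cells (made-up labels, not real data).
partitions = [
    [0, 0, 1, 1, 2, 2],
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 2, 2],
]
labels = consensus_clusters(partitions)
```

Single-linkage merging over a threshold is only one of several ways to cut a co-association matrix; hierarchical or spectral clustering on the same matrix is an equally plausible reading of "consensus clusters".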

  • Article type: Journal Article
    BACKGROUND: Oenococcus oeni is a lactic acid bacterium that is specialised for growth in the ecological niche of wine, where it is noted for its ability to perform the secondary, malolactic fermentation that is often required for many types of wine. Expanding the understanding of strain-dependent genetic variations in its small and streamlined genome is important for realising its full potential in industrial fermentation processes.
    RESULTS: Whole genome comparison was performed on 191 strains of O. oeni; from this rich source of genomic information consensus pan-genome assemblies of the invariant (core) and variable (flexible) regions of this organism were established. Genetic variation in amino acid biosynthesis and sugar transport and utilisation was found to be common between strains. Furthermore, we characterised previously-unreported intra-specific genetic variations in the natural competence of this microbe.
    CONCLUSIONS: By assembling a consensus pan-genome from a large number of strains, this study provides a tool for researchers to readily compare protein-coding genes across strains and infer functional relationships between genes in conserved syntenic regions. This establishes a foundation for further genetic, and thus phenotypic, research of this industrially-important species.

  • Article type: Journal Article
    Many food bioactive peptides with diverse functions have been discovered by studying plant proteins. We have previously identified a 68-residue insulin receptor (IR)-binding protein (mcIRBP) from Momordica charantia that exhibits hypoglycemic effects in mice via interaction with IR. By in vitro digestion, we found that mcIRBP-19, spanning residues 50-68 of mcIRBP, enhanced the binding of insulin to IR, stimulated the phosphorylation of PDK1 and Akt, induced the expression of glucose transporter 4, and stimulated both the uptake of glucose in cells and the clearance of glucose in diabetic mice. Furthermore, mcIRBP-19 homologs were present in various plants and shared β-hairpin structures and IR kinase-activating abilities similar to those of mcIRBP-19. In conclusion, our findings suggest that mcIRBP-19 is a blood glucose-lowering bioactive peptide with IR-binding potential. Moreover, we identified novel IR-binding bioactive peptides in various plants belonging to different taxonomic families.

  • Article type: Case Reports
    De novo mutations that contribute to rare Mendelian diseases, including neurological disorders, have been recently identified. Whole-exome sequencing (WES) has become a powerful tool for the identification of inherited and de novo mutations in Mendelian diseases. Two important guidelines were recently published regarding the investigation of causality of sequence variants in human disease and the interpretation of novel variants identified in human genome sequences. In this study, a family with supposed movement disorders was sequenced via WES (including the proband and her unaffected parents), and a standard investigation and interpretation of the identified variants was performed according to the published guidelines. We identified a novel de novo mutation (c.2327C > T, p.P776L) in the DYNC1H1 gene and confirmed that it was the causal variant. The phenotype of the affected twins included delayed motor milestones, pes cavus, lower limb weakness and atrophy, and a waddling gait. Electromyographic (EMG) recordings revealed typical signs of chronic denervation. Our study demonstrates the power of WES to discover de novo mutations associated with a neurological disease on the whole-exome scale, and shows that following guidelines for conducting WES studies and interpreting the identified variants is a preferable option for exploring the pathogenesis of rare neurological disorders.

  • Article type: Journal Article
    Traf2- and Nck-interacting kinase (TNIK) is a serine/threonine kinase highly expressed in the brain and enriched in the postsynaptic density of glutamatergic synapses in the mammalian brain. Accumulating genetic evidence and functional data have implicated TNIK as a risk factor for psychiatric disorders. However, the endogenous substrates of TNIK in neurons are unknown. Here, we describe a novel selective small molecule inhibitor of the TNIK kinase family. Using this inhibitor, we report the identification of endogenous neuronal TNIK substrates by immunoprecipitation with a phosphomotif antibody followed by mass spectrometry. Phosphorylation consensus sequences were defined by phosphopeptide sequence analysis. Among the identified substrates were members of the delta-catenin family including p120-catenin, δ-catenin, and armadillo repeat gene deleted in velo-cardio-facial syndrome (ARVCF), each of which is linked to psychiatric or neurologic disorders. Using p120-catenin as a representative substrate, we show TNIK-induced p120-catenin phosphorylation in cells requires intact kinase activity and phosphorylation of TNIK at T181 and T187 in the activation loop. Addition of the small molecule TNIK inhibitor or knocking down TNIK by two shRNAs reduced endogenous p120-catenin phosphorylation in cells. Together, using a TNIK inhibitor and phosphomotif antibody, we identify endogenous substrates of TNIK in neurons, define consensus sequences for TNIK, and suggest signaling pathways by which TNIK influences synaptic development and function linked to psychiatric and neurologic disorders.

  • Article type: Journal Article
    Thermostable variants of the Cellulomonas sp. NT3060 glycerol kinase have been constructed through the introduction of ancestral-consensus mutations. We produced seven mutants, each having an ancestral-consensus amino acid residue that might be present in the common ancestors of both bacteria and archaea, and that appeared most frequently at that position among 17 glycerol kinase sequences in the multiple sequence alignment. The thermal stabilities of the resulting mutants were assessed by determining their melting temperatures (Tm), defined as the temperature at which 50% of the initial catalytic activity is lost after 15 min of incubation, as well as the half-life of the catalytic activity at 60°C (t1/2). Three mutants showed increased stabilities compared to the wild-type protein. We then produced five more mutants with multiple amino acid substitutions. Some of the resulting mutants showed thermal stabilities much greater than those expected given the stabilities of the respective mutants with single mutations. Therefore, the effects of mutations are not always simply additive, and some amino acid substitutions that do not affect or only slightly improve stability when individually introduced into the protein show substantial stabilizing effects in combination with other mutations.
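The consensus half of the "ancestral-consensus" selection described above, picking the residue that appears most frequently at each alignment position, can be sketched as follows. This is an illustration only: the toy alignment is invented rather than taken from the 17 glycerol kinase sequences, the function names are hypothetical, and the ancestral-inference step of the study is not modelled. Positions are alignment columns counted from 1, which need not match residue numbering in the real protein.

```python
from collections import Counter


def consensus_sequence(alignment, gap="-"):
    """Most frequent non-gap residue in each column of an aligned set."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    consensus = []
    for column in zip(*alignment):
        counts = Counter(res for res in column if res != gap)
        # Ties are resolved by first occurrence in the column (Counter order).
        consensus.append(counts.most_common(1)[0][0] if counts else gap)
    return "".join(consensus)


def consensus_mutations(wild_type, consensus, gap="-"):
    """Candidate substitutions (e.g. 'R2K') where wild type differs from consensus."""
    return [
        f"{wt}{pos}{con}"
        for pos, (wt, con) in enumerate(zip(wild_type, consensus), start=1)
        if wt != con and gap not in (wt, con)
    ]


# Toy alignment of four made-up sequences (assumption, not study data).
alignment = ["MKTAY", "MKSAY", "MRTAY", "MKTGY"]
consensus = consensus_sequence(alignment)
candidates = consensus_mutations("MRTGY", consensus)
```

Each candidate would correspond to one single-substitution mutant; as the abstract notes, the stabilizing effects of such substitutions are not necessarily additive when combined.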

  • Article type: Journal Article
    The non-structural 4B (NS4B) protein of hepatitis C virus (HCV) is a hydrophobic protein recently implicated in the formation of the membranous web, a platform for the formation of the replication complex, and is thus a potential target for antivirals. The CLC Main Workbench was used to generate genotype-specific consensus sequences, a global consensus sequence, and a representative phylogenetic tree from NS4B protein sequences of seven different HCV genotypes reported from all over the world. The C-terminal domain (CTD) of the NS4B protein, especially the residues involved in interaction with the ER membrane, was found to be highly conserved. Other residues found to be highly conserved across all HCV genotypes included: 5 aromatic residues of the N-terminal domain (NTD) (F49, W50, W55, F57, and Y63); 3 hydrophobic leucine residues (L237, L240, and L245); 2 positively charged residues of the CTD (R248 and H250); the dimerization motif of transmembrane domain 3 (TMD3) (G143YGAG147) and its surrounding residues (F118 and F155); and the TMD1 Ser/Thr cluster residues (T87, S88, and T95) involved in hydrogen (H) bond interactions. In short, amino acids of the NTD, TMD, and CTD domains involved in the membrane association/anchoring of NS4B and the formation of the membranous web are highly conserved and can serve as potential targets for antivirals and peptide vaccines. These conserved residues formed the basis for the development of five short peptides proposed to serve as potential therapeutic targets. The phylogenetic analysis was particularly interesting for NS4B sequences of 3a Pakistani isolates. The high degree of variability prevented the clustering of Pakistani isolates with other sequences in the phylogenetic tree, revealing geographical disparity.

  • Article type: Journal Article
    Heterosexual transmission of HIV-1 is characterized by a genetic bottleneck that selects a single viral variant, the transmitted/founder (TF), during most transmission events. To assess viral characteristics influencing HIV-1 transmission, we sequenced 167 near full-length viral genomes and generated 40 infectious molecular clones (IMC) including TF variants and multiple non-transmitted (NT) HIV-1 subtype C variants from six linked heterosexual transmission pairs near the time of transmission. Consensus-like genomes sensitive to donor antibodies were selected for during transmission in these six transmission pairs. However, TF variants did not demonstrate increased viral fitness in terms of particle infectivity or viral replicative capacity in activated peripheral blood mononuclear cells (PBMC) and monocyte-derived dendritic cells (MDDC). In addition, resistance of the TF variant to the antiviral effects of interferon-α (IFN-α) was not significantly different from that of non-transmitted variants from the same transmission pair. Thus neither in vitro viral replicative capacity nor IFN-α resistance discriminated the transmission potential of viruses in the quasispecies of these chronically infected individuals. However, our findings support the hypothesis that within-host evolution of HIV-1 in response to adaptive immune responses reduces viral transmission potential.

  • Article type: Journal Article
    BACKGROUND: Proteins often recognize their interaction partners on the basis of short linear motifs located in disordered regions on the proteins' surface. Experimental techniques that study such motifs use short peptides to mimic the structural properties of interacting proteins. Continued development of these methods allows for large-scale screening, resulting in vast amounts of peptide sequences, potentially containing information on multiple protein-protein interactions. Processing of such datasets is a complex but essential task for large-scale studies investigating protein-protein interactions.
    RESULTS: The software tool presented in this article is able to rapidly identify multiple clusters of sequences carrying shared specificity motifs in massive datasets from various sources and to generate multiple sequence alignments of the identified clusters. The method was applied to a previously published smaller dataset containing distinct classes of ligands for SH3 domains, as well as to a new dataset, an order of magnitude larger, containing epitopes for several monoclonal antibodies. The software successfully identified clusters of sequences mimicking epitopes of antibody targets, as well as secondary clusters revealing that the antibodies accept some deviations from the original epitope sequences. Another test indicates that processing of even much larger datasets is computationally feasible.
    AVAILABILITY: Hammock is published under the GNU GPL v. 3 license and is freely available as a standalone program (from http://www.recamo.cz/en/software/hammock-cluster-peptides/) or as a tool for the Galaxy toolbox (from https://toolshed.g2.bx.psu.edu/view/hammock/hammock). The source code can be downloaded from https://github.com/hammock-dev/hammock/releases.
    CONTACT: muller@mou.cz
    SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

  • Article type: Journal Article
    A number of classes of proteins have been engineered for high stability using consensus sequence design methods. Here we describe the engineering of a novel albumin binding domain (ABD) three-helix bundle protein. The resulting engineered ABD molecule, called ABDCon, is expressed at high levels in the soluble fraction of Escherichia coli and is highly stable, with a melting temperature of 81.5°C. ABDCon binds human, monkey and mouse serum albumins with affinity as high as 61 pM. The solution structure of ABDCon is consistent with the three-helix bundle design and epitope mapping studies enabled a precise definition of the albumin binding interface. Fusion of a 10 kDa scaffold protein to ABDCon results in a long terminal half-life of 60 h in mice and 182 h in cynomolgus monkeys. To explore the link between albumin affinity and in vivo exposure, mutations were designed at the albumin binding interface of ABDCon yielding variants that span an 11 000-fold range in affinity. The PK properties of five such variants were determined in mice in order to demonstrate the tunable nature of serum half-life, exposure and clearance with variations in albumin binding affinity.