Pitfalls of using sequence databases for heterologous expression studies - a technical review

Abstract:

Synthesis of DNA fragments based on gene sequences that are available in public resources has become an efficient and affordable method that has gradually replaced traditional cloning efforts such as PCR cloning from cDNA. However, database entries based on genome sequencing results are prone to errors that can lead to false sequence information and, ultimately, to errors in the functional characterisation of proteins such as ion channels and transporters in heterologous expression systems. We have identified five common problems that repeatedly appear in public resources: (1) not every gene has yet been annotated; (2) not all gene annotations are necessarily correct; (3) transcripts may contain automated corrections; (4) there are mismatches between gene, mRNA and protein sequences; and (5) splicing patterns often lack experimental validation. This technical review highlights these issues and provides a strategy to bypass them, in order to avoid critical mistakes that could impact future studies of any gene/protein of interest in heterologous expression systems.
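Problem (4) lends itself to a quick consistency check before a sequence is ordered for synthesis. The following is a minimal sketch, not the review's own method: it translates a coding sequence with the standard genetic code and compares the result with the annotated protein. The function names `translate` and `check_cds` are hypothetical, and the sketch assumes a nuclear gene (standard code), a CDS that includes its stop codon, and no ambiguity codes other than those reported as `X`.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered TCAG x TCAG x TCAG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence; unknown codons (e.g. containing N) become 'X'."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds), 3)
    )


def check_cds(cds: str, annotated_protein: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means no mismatch found."""
    problems = []
    cds = cds.upper().replace("U", "T")
    remainder = len(cds) % 3
    if remainder:
        problems.append("CDS length is not a multiple of three")
        cds = cds[: len(cds) - remainder]  # translate the complete codons only

    protein = translate(cds)
    if not protein.startswith("M"):
        problems.append("CDS does not start with ATG")
    if not protein.endswith("*"):
        problems.append("CDS does not end with a stop codon")

    body = protein[:-1] if protein.endswith("*") else protein
    if "*" in body:
        problems.append("premature stop codon in CDS")

    annotated = annotated_protein.upper().rstrip("*")
    if len(body) != len(annotated):
        problems.append(
            f"length differs: translated {len(body)} aa, annotated {len(annotated)} aa"
        )
    # Position-by-position comparison over the shared length (1-based positions).
    for pos, (got, expected) in enumerate(zip(body, annotated), start=1):
        if got != expected:
            problems.append(f"residue {pos}: translated {got}, annotated {expected}")
    return problems
```

Calling `check_cds(cds, protein)` with the mRNA-derived CDS and the protein record from the same database entry returns an empty list when the two agree. Any reported discrepancy is only a prompt to go back to the underlying genomic and experimental evidence; it does not say which of the two records is wrong.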

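Abstract figure legend:

Projects involving heterologous gene expression usually follow similar steps. Initially, database research (A) is necessary to retrieve information on full or partial sequences of the gene of interest. Many genome assemblies are annotated and held in public databases, or are available for refined searches using individual sequence information. The search results need to be checked carefully and compared with existing information (B). Once the sequence has been determined, DNA synthesis by PCR (C) or commercial synthesis is necessary for further cloning procedures (D). Eventually, the DNA needs to be transfected (E) and expressed in, for example, eukaryotic cells (F). Finally, the expression of the gene of interest needs to be documented and its function analysed (G). This article is protected by copyright. All rights reserved.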