BACKGROUND: The continuing emergence of influenza viruses has highlighted the value of public databases and related bioinformatic analysis tools for investigating the transcriptomic changes caused by different influenza virus infections in human and animal models.
METHODS: We collected a large amount of transcriptome data on influenza virus-infected human and animal models from public databases (GEO and ArrayExpress), then extracted and integrated the array data and metadata. The gene expression matrix was generated through strict quality control, balancing, standardization, batch correction, and gene annotation. We then analyzed gene expression across different species, viruses, and cells/tissues, or after antibody/vaccine treatment, and imported the sample metadata and gene expression datasets into the database.
RESULTS: With careful processing and quality control, we collected 8064 samples from 103 independent datasets and constructed a comparative transcriptomics database for influenza virus, named Flu-CED (Influenza comparative expression database, https://flu.com-med.org.cn/). Using the integrated and processed transcriptomic data, we established a user-friendly website for the integration, online retrieval, visualization, and exploration of gene expression during influenza virus infection in different species, and of the biological functions in which differentially expressed genes are involved. Flu-CED supports rapid querying of single- and multi-gene expression profiles, comparative transcriptome analysis across combinations of experimental conditions, and identification and convenient retrieval of differentially expressed genes (DEGs) between comparison groups.
CONCLUSIONS: Flu-CED provides data resources and tools for analyzing gene expression in human and animal models infected with influenza virus. These can deepen our understanding of the mechanisms underlying disease occurrence and development, and may enable the prediction of key genes or therapeutic targets for use in medical research.
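To make the batch-correction and DEG-identification steps named in METHODS and RESULTS more concrete, the following is a minimal sketch and not the Flu-CED pipeline. It assumes the values are already log2-scaled, uses per-batch mean-centering as a crude stand-in for whatever batch-correction method the authors applied, and flags a gene as a DEG by a log2 fold-change threshold alone. All function names, the threshold, and the toy values are hypothetical.

```python
from statistics import mean

def center_by_batch(values, batches):
    """Subtract each batch's mean from its samples (assumed stand-in for batch correction)."""
    batch_means = {
        b: mean(v for v, vb in zip(values, batches) if vb == b)
        for b in set(batches)
    }
    return [v - batch_means[b] for v, b in zip(values, batches)]

def log2_fold_change(values, groups, case="infected", control="mock"):
    """Difference of group means; values are assumed to be log2 expression."""
    case_mean = mean(v for v, g in zip(values, groups) if g == case)
    control_mean = mean(v for v, g in zip(values, groups) if g == control)
    return case_mean - control_mean

def find_degs(matrix, batches, groups, threshold=1.0):
    """Return {gene: log2FC} for genes whose |log2FC| meets the (hypothetical) threshold."""
    degs = {}
    for gene, values in matrix.items():
        corrected = center_by_batch(values, batches)
        lfc = log2_fold_change(corrected, groups)
        if abs(lfc) >= threshold:
            degs[gene] = lfc
    return degs

# Toy data (invented for illustration): two batches, each with one mock and one infected sample.
batches = ["A", "A", "B", "B"]
groups = ["mock", "infected", "mock", "infected"]
matrix = {
    "IFIT1": [5.0, 8.0, 7.0, 10.0],    # rises on infection in both batches
    "ACTB": [12.0, 12.1, 14.0, 13.9],  # differs by batch, not by infection
}

degs = find_degs(matrix, batches, groups)
print(degs)
```

Mean-centering is only safe when every batch contains both comparison groups, as in the toy data; if batch and condition are confounded, centering removes the biological signal along with the batch effect. A real analysis would also add a statistical test with multiple-testing correction instead of relying on a fold-change cutoff.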