normalization

  • Article type: Journal Article
    The transcriptome profile is a representative phenotype-based descriptor of compounds, widely acknowledged for its ability to capture compound effects. However, batch differences are inevitable. Although sophisticated statistical methods exist, many of them presume a large sample size. How should we design a transcriptome analysis to obtain robust compound profiles, particularly for the small datasets frequently encountered in practice? This study addresses this question by investigating normalization procedures for transcriptome profiles, focusing on the baseline distribution used to derive biological responses as profiles. First, we investigated two large GeneChip datasets, comparing the impact of different normalization procedures. By evaluating the similarity between response profiles of biological replicates within each dataset and the similarity between response profiles of the same compound across datasets, we found that the baseline distribution defined by all samples within each batch under batch-corrected conditions is a good choice for large datasets. Subsequently, we conducted a simulation to explore the influence of the number of control samples on the robustness of response profiles across datasets. The results offer insights into determining a suitable number of control samples for small datasets. It is important to acknowledge that these conclusions are drawn from a limited set of datasets. Nevertheless, we believe that this study enhances our understanding of how to effectively leverage transcriptome profiles of compounds and promotes the accumulation of essential knowledge for the practical application of such profiles.
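    The idea of deriving a response profile against a per-batch baseline distribution can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `response_profiles`, the z-score form of the response, and the `baseline_mask` parameter are all assumptions introduced here. Passing no mask uses every sample in the batch as the baseline; passing a mask of control samples uses only those.

```python
import numpy as np

def response_profiles(expr, batch, baseline_mask=None):
    """Z-score each sample against a baseline distribution defined per batch.

    expr: (n_samples, n_genes) log-expression matrix.
    batch: (n_samples,) batch labels.
    baseline_mask: optional boolean (n_samples,) array marking the samples
        that define the baseline (for example, controls). If None, all
        samples in each batch define the baseline (hypothetical default).
    """
    expr = np.asarray(expr, dtype=float)
    batch = np.asarray(batch)
    if baseline_mask is None:
        baseline_mask = np.ones(len(batch), dtype=bool)
    baseline_mask = np.asarray(baseline_mask, dtype=bool)

    out = np.empty_like(expr)
    for b in np.unique(batch):
        in_batch = batch == b
        base = expr[in_batch & baseline_mask]   # baseline samples of this batch
        mu = base.mean(axis=0)
        sd = base.std(axis=0, ddof=1)           # needs at least 2 baseline samples
        sd[sd == 0] = 1.0                       # guard against constant genes
        out[in_batch] = (expr[in_batch] - mu) / sd
    return out
```

    With few control samples the per-gene mean and standard deviation of the baseline become noisy, which is one way to see why the number of controls matters for small datasets.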

  • Article type: Journal Article
    Determining criteria weights plays a crucial role in multi-criteria decision analyses. Entropy is a significant measure in information science, and several multi-criteria decision-making methods utilize the entropy weight method (EWM). In the literature, two approaches to determining entropy weights can be found. One involves normalization before calculating the entropy values, while the second does not. This paper investigates the normalization effect for entropy-based weights and Hellwig's method. To compare the influence of various normalization methods in both the EWM and Hellwig's method, a study evaluating the sustainable development of EU countries in the education area in the year 2021 was analyzed. The study used data from Eurostat related to European countries' realization of the SDG 4 goal. It is observed that vector normalization and sum normalization did not change the entropy-based weights. In the case study, the max-min normalization influenced EWM weights. At the same time, these weights had only a very weak impact on the final rankings of countries with respect to achieving the SDG 4 goal, as determined by Hellwig's method. The results are compared with the outcome obtained by Hellwig's method with equal weights. The simulation study was conducted by modifying Eurostat data to investigate how the different normalization relationships discovered among the criteria affect entropy-based weights and Hellwig's method results.
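    The observation that sum and vector normalization leave entropy-based weights unchanged, while max-min normalization does not, can be checked numerically. The sketch below uses a common textbook form of the EWM (column proportions, Shannon entropy scaled by ln m, weights proportional to 1 - entropy); the abstract does not spell out the exact variant used, so this form and the name `entropy_weights` are assumptions. The toy matrix is invented for illustration only.

```python
import numpy as np

def entropy_weights(X):
    """Entropy weights for a decision matrix X (rows: alternatives, cols: criteria).

    Assumes non-negative entries and a non-zero sum in every column.
    """
    X = np.asarray(X, dtype=float)
    m = X.shape[0]
    P = X / X.sum(axis=0)                     # column-wise proportions
    logP = np.zeros_like(P)
    np.log(P, out=logP, where=P > 0)          # treat 0 * log(0) as 0
    e = -(P * logP).sum(axis=0) / np.log(m)   # entropy per criterion, in [0, 1]
    d = 1.0 - e                               # degree of diversification
    return d / d.sum()

# Toy data, invented for illustration.
X = np.array([[2.0, 10.0, 0.5],
              [4.0, 12.0, 0.1],
              [6.0, 11.0, 0.9],
              [3.0, 15.0, 0.4]])

w_raw = entropy_weights(X)
w_sum = entropy_weights(X / X.sum(axis=0))                 # sum normalization
w_vec = entropy_weights(X / np.linalg.norm(X, axis=0))     # vector normalization
w_mm = entropy_weights((X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0)))  # max-min
```

    Sum and vector normalization only rescale each column by a positive constant, which cancels when the proportions are formed, so `w_sum` and `w_vec` equal `w_raw`. Max-min normalization also shifts each column, which changes the proportions and therefore the weights.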

  • Article type: Journal Article
    This study aimed to develop an effective virtual reality (VR)-based intervention program to improve trauma symptoms of survivors of the 2023 Kahramanmaraş earthquake.
    In line with this aim, the sample of the study consisted of 34 earthquake survivors aged 15-72 years (mean: 38.09, standard deviation (SD): 15.09) who were directly affected by the Kahramanmaraş earthquake on February 6, 2023. A five-stage intervention program (normalization, reinterpretation, creating a safe place, developing problem-focused coping strategies, and social support) was applied using VR technology to the 17 participants (mean: 36.88, SD: 13.65) who constituted the intervention group. All participants assigned to the intervention group received the intervention, which included normalization, reinterpreting the earthquake, creating a safe place, problem-focused coping, and increasing social support, one time in a standardized manner. VR technology was used in the stages of reinterpretation, creating a safe place, and problem-focused coping, while the stages of normalization and increasing social support were carried out through psychotherapeutic work involving one-to-one interaction between the researcher and the participant. The five-stage intervention program started to be implemented 51 days after the February 6 Kahramanmaraş earthquakes, and all stages of the intervention were completed within seven days. Measurements were taken from the participants at two different times: a pre-intervention pre-test and a post-intervention post-test. The 17 participants in the control group (mean: 39.29, SD: 16.75) were placed on a waiting list. Data were collected using the "Sociodemographic Information Form", "Posttraumatic Growth Inventory", "Scale for Determining the Level of Post-Earthquake Trauma" and "Ways of Coping Scale".
    Before the intervention, the groups were compared in terms of posttraumatic growth, post-earthquake trauma level, fatalistic coping, social support-seeking coping, and helplessness style coping levels, and no difference was observed between them (p>0.05). After the intervention, the posttraumatic growth and social support-seeking coping scores of the earthquake survivors who received the VR-supported intervention were significantly higher than the scores of the control group, and their post-earthquake trauma level, fatalistic coping and helplessness style coping scores were significantly lower than the control group scores (p<0.05). The within-group analyses showed that the posttraumatic growth, social support-seeking coping and problem-focused coping scores of the intervention group participants increased statistically significantly after the VR-supported intervention compared to before it, while their post-earthquake trauma level, fatalistic coping and helplessness style coping scores decreased statistically significantly (p<0.05). However, the scores of the control group participants on the other subscales of the Ways of Coping Scale, apart from the fatalistic coping subscale, did not change statistically significantly (p>0.05).
    The analysis shows that the VR-supported intervention program developed here is effective in improving the trauma symptoms of earthquake survivors. The rapid and statistically significant reduction in the trauma levels of earthquake survivors following the intervention indicates that the intervention can be applied in other trauma areas and is recommended for further studies.

  • Article type: Journal Article
    To evaluate the variability of apparent diffusion coefficient (ADC) measurements in the spleen (ADCspleen) and the paraspinal muscles (ADCmuscle) in order to identify a reference organ for normalizing the ADC from abdominal diffusion-weighted imaging (DWI).
    Two MRI scanners, with 314 abdominal exams on the GE system and 929 on the Siemens system, were used for MRI examinations including DWI (b-values, 50 and 800 s/mm²). For a subset of 73 exams on the Siemens system, a second exam was conducted. Four regions of interest (ROIs) were placed in each exam to measure the ADCspleen and the bilateral ADCmuscle. ADC variability between patients (on each scanner separately), ADC variability due to ROI placement between the two ROIs in each organ, and variability in the subset between the first and second exams were assessed.
    The ADCspleen was more scattered and variable than the ADCmuscle in the comparability (n = 929 and 314 for the two MRI scanners, respectively) and repeatability (n = 73) datasets. The Bland-Altman bias and limits of agreement (LoAs) for the ADCspleen (ICC, 0.47; CV, 0.070) and ADCmuscle (ICC, 0.67; CV, 0.023) in the repeatability dataset (n = 73) were -0.1 (-25.7% to 25.6%) and -0.3 (-8.8% to 8.1%), respectively. For the Siemens system, the Bland-Altman bias and LoAs for the ADCspleen (ICC, 0.72; CV, 0.061) and ADCmuscle (ICC, 0.53; CV, 0.030) in the comparability dataset (n = 929) were 2.1 (-20.0% to 24.2%) and 0.7 (-10.0% to 11.4%), respectively. Similar results were found for the GE system (n = 314). The CVs for the ADCmuscle measurements were lower than those of the ADCspleen in both the repeatability and the comparability analyses (all p < 0.001).
    Paraspinal muscles demonstrate better reference characteristics than the spleen in estimating ADC variability of abdominal DWI.
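    For readers unfamiliar with the agreement statistics quoted above, the following sketch computes a Bland-Altman bias with 95% limits of agreement on percentage differences, plus a coefficient of variation. The abstract does not state the exact formulas the authors used, so the percentage-difference form, the 1.96 multiplier, and both function names are assumptions made here for illustration.

```python
import numpy as np

def bland_altman_percent(a, b):
    """Bias and 95% limits of agreement for paired measurements, in percent.

    Differences are expressed relative to the pairwise mean (an assumption;
    absolute differences are an equally common choice).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    pct_diff = 100.0 * (a - b) / ((a + b) / 2.0)
    bias = pct_diff.mean()
    sd = pct_diff.std(ddof=1)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    values = np.asarray(values, dtype=float)
    return values.std(ddof=1) / values.mean()
```

    Narrower limits of agreement and a lower CV both indicate a more stable measurement, which is the sense in which the muscle ADC is the better reference in the results above.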

  • Article type: Journal Article
    High-throughput sequencing technology has enabled researchers to profile microbial communities from a variety of environments, but analysis of multivariate taxon count data remains challenging. We develop a Bayesian nonparametric (BNP) regression model with zero inflation to analyse multivariate count data from microbiome studies. A BNP approach flexibly models microbial associations with covariates, such as environmental factors and clinical characteristics. The model produces estimates for probability distributions which relate microbial diversity and differential abundance to covariates, and facilitates community comparisons beyond those provided by simple statistical tests. We compare the model to simpler models and popular alternatives in simulation studies, showing that, in addition to these community-level insights, it yields superior parameter estimates and model fit in various settings. The model's utility is demonstrated by applying it to a chronic wound microbiome data set and a Human Microbiome Project data set, where it is used to compare microbial communities present in different environments.

  • Article type: Journal Article
    White blood cells (WBCs) in the human immune system defend against infection and protect the body from external hazardous objects. They comprise neutrophils, eosinophils, basophils, monocytes, and lymphocytes, each of which accounts for a distinct percentage and performs specific functions. Traditionally, the clinical laboratory procedure for quantifying the specific types of white blood cells is an integral part of a complete blood count (CBC) test, which aids in monitoring people's health. With the advancements in deep learning, blood film images can be classified in less time and with high accuracy using various algorithms. This paper exploits a number of state-of-the-art deep learning models and their variations based on CNN architecture. A comparative study of model performance based on accuracy, F1-score, recall, precision, number of parameters, and time was conducted, and DenseNet161 was found to outperform its counterparts. In addition, advanced optimization techniques such as normalization, mixed-up augmentation, and label smoothing were also employed on DenseNet to further refine its performance.
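    Two of the techniques named above can be illustrated in a few lines of NumPy. This is a framework-free sketch, not the paper's training code: it assumes "mixed-up augmentation" refers to the standard mixup scheme (convex combinations of pairs of examples and their labels), and the function names and default values of `eps` and `alpha` are hypothetical.

```python
import numpy as np

def smooth_labels(y, num_classes, eps=0.1):
    """Label smoothing: move eps of the probability mass to a uniform prior."""
    onehot = np.eye(num_classes)[np.asarray(y)]
    return onehot * (1.0 - eps) + eps / num_classes

def mixup(x, y_soft, alpha=0.2, rng=None):
    """Mixup: blend each example with a randomly chosen partner from the batch.

    One mixing coefficient lam ~ Beta(alpha, alpha) is drawn per batch.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(len(x))
    x_mixed = lam * x + (1.0 - lam) * x[perm]
    y_mixed = lam * y_soft + (1.0 - lam) * y_soft[perm]
    return x_mixed, y_mixed

# Five WBC classes, as listed in the abstract; the batch itself is made up.
labels = np.array([0, 3, 4, 1])
images = np.random.default_rng(0).random((4, 8, 8, 3))
soft = smooth_labels(labels, num_classes=5)
mixed_images, mixed_labels = mixup(images, soft, rng=np.random.default_rng(1))
```

    Both transformations keep each label row a valid probability distribution, so the result can be fed to a standard cross-entropy loss that accepts soft targets.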

  • Article type: Journal Article
    Untargeted liquid chromatography-mass spectrometry metabolomics studies are typically performed under roughly identical experimental settings. Measurements acquired with different LC-MS protocols or following extended time intervals harbor significant variation in retention times and spectral abundances due to altered chromatographic, spectrometric, and other factors, raising many data analysis challenges. We developed a computational workflow for merging and harmonizing metabolomics data acquired under disparate LC-MS conditions. Plasma metabolite profiles were collected from two sets of maternal subjects three years apart using distinct instruments and LC-MS procedures. Metabolomics features were aligned using metabCombiner to generate lists of compounds detected across all experimental batches. We applied data set-specific normalization methods to remove interbatch and interexperimental variation in spectral intensities, enabling statistical analysis on the assembled data matrix. Bioinformatics analyses revealed large-scale metabolic changes in maternal plasma between the first and third trimesters of pregnancy and between maternal plasma and umbilical cord blood. We observed increases in steroid hormones and free fatty acids from the first trimester to term of gestation, along with decreases in amino acids coupled to increased levels in cord blood. This work demonstrates the viability of integrating nonidentically acquired LC-MS metabolomics data and its utility in unconventional metabolomics study designs.

  • Article type: Journal Article
    Radiomics is rapidly advancing in precision diagnostics and cancer treatment. However, there are several challenges that need to be addressed before translation to clinical use. This study presents an ad-hoc weighted statistical framework to explore radiomic biomarkers for a better characterization of the radiogenomic phenotypes in breast cancer. Thirty-six female patients with breast cancer were enrolled in this study. Radiomic features were extracted from MRI and PET imaging techniques for malignant and healthy lesions in each patient. To reduce within-subject bias, the ratio of radiomic features extracted from both lesions was calculated for each patient. Radiomic features were further normalized, comparing the z-score, quantile, and whitening normalization methods to reduce between-subjects bias. After feature reduction by Spearman's correlation, a methodological approach based on a principal component analysis (PCA) was applied. The results were compared and validated on twenty-seven patients to investigate the tumor grade, Ki-67 index, and molecular cancer subtypes using classification methods (LogitBoost, random forest, and linear discriminant analysis). The classification techniques achieved high area-under-the-curve values with one PC that was calculated by normalizing the radiomic features via the quantile method. This pilot study helped us to establish a robust framework of analysis to generate a combined radiomic signature, which may lead to more precise breast cancer prognosis.

  • Article type: Journal Article
    This study was carried out to determine the relationship between the fear of COVID-19 in the elderly aged 65 years and over and their levels of adaptation to the "new normal." This descriptive cross-sectional study was completed with 623 elderly individuals. It was determined that the individuals who adapted well to the "new normal" had high levels of adaptation to old age, while their levels of fear of COVID-19 were slightly above average (p < 0.01). Elderly individuals have tried to adapt to the "new normal" while also experiencing fear of COVID-19. In order to minimize the fear experienced by the elderly during COVID-19, adequate support and psychological support should be provided.

  • Article type: Journal Article
    In order to correctly decode phenotypic information from RNA-sequencing (RNA-seq) data, careful selection of the RNA-seq quantification measure is critical for inter-sample comparisons and for downstream analyses, such as differential gene expression between two or more conditions. Several methods have been proposed and continue to be used. However, a consensus has not been reached regarding the best gene expression quantification method for RNA-seq data analysis.
    In the present study, we used replicate samples from each of 20 patient-derived xenograft (PDX) models spanning 15 tumor types, for a total of 61 human tumor xenograft samples available through the NCI patient-derived model repository (PDMR). We compared the reproducibility across replicate samples based on TPM (transcripts per million), FPKM (fragments per kilobase of transcript per million fragments mapped), and normalized counts using coefficient of variation, intraclass correlation coefficient, and cluster analysis.
    Our results revealed that hierarchical clustering on normalized count data tended to group replicate samples from the same PDX model together more accurately than TPM and FPKM data. Furthermore, normalized count data were observed to have the lowest median coefficient of variation (CV), and highest intraclass correlation (ICC) values across all replicate samples from the same model and for the same gene across all PDX models compared to TPM and FPKM data.
    We provided compelling evidence for a preferred quantification measure to conduct downstream analyses of PDX RNA-seq data. To our knowledge, this is the first comparative study of RNA-seq data quantification measures conducted on PDX models, which are known to be inherently more variable than cell line models. Our findings are consistent with what others have shown for human tumors and cell lines and add further support to the thesis that normalized counts are the best choice for the analysis of RNA-seq data across samples.
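    The three quantification measures compared above differ mainly in what they divide by. The sketch below computes FPKM and TPM from a raw count matrix and gene lengths, and one widely used form of normalized counts (the median-of-ratios size-factor scheme). The abstract does not say which normalized-count method was used, so the median-of-ratios choice and all function names here are assumptions; the toy matrix is invented for illustration.

```python
import numpy as np

def fpkm(counts, lengths_bp):
    """Fragments per kilobase of transcript per million mapped fragments."""
    counts = np.asarray(counts, dtype=float)            # (genes, samples)
    lengths_kb = np.asarray(lengths_bp, dtype=float)[:, None] / 1e3
    per_million = counts.sum(axis=0) / 1e6              # library size per sample
    return counts / lengths_kb / per_million

def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then scale to 1e6."""
    counts = np.asarray(counts, dtype=float)
    lengths_kb = np.asarray(lengths_bp, dtype=float)[:, None] / 1e3
    rate = counts / lengths_kb
    return rate / rate.sum(axis=0) * 1e6

def median_of_ratios(counts):
    """Normalized counts via median-of-ratios size factors (assumed method)."""
    counts = np.asarray(counts, dtype=float)
    expressed = (counts > 0).all(axis=1)                # genes seen in every sample
    log_counts = np.log(counts[expressed])
    log_geo_mean = log_counts.mean(axis=1, keepdims=True)
    size_factors = np.exp(np.median(log_counts - log_geo_mean, axis=0))
    return counts / size_factors

# Toy example: sample 2 is sample 1 sequenced twice as deeply.
counts = np.array([[10, 20], [100, 200], [50, 100]])
lengths = np.array([1000, 2000, 500])
tpm_values = tpm(counts, lengths)
fpkm_values = fpkm(counts, lengths)
norm_counts = median_of_ratios(counts)
```

    In this toy case all three measures give identical values for the two samples, because the only difference between them is sequencing depth. The measures diverge when library composition differs between samples, which is the situation replicate-based comparisons like the one above are designed to probe.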