Keywords: aberrant splicing; RNA-seq; outlier detection; rare disease; rare disease diagnostics; rare variant

MeSH: Humans; Alternative Splicing / genetics; Introns / genetics; RNA Splicing / genetics; RNA-Seq; Algorithms

Source: DOI:10.1016/j.ajhg.2023.10.014

Abstract:
Detection of aberrantly spliced genes is an important step in RNA-seq-based rare-disease diagnostics. We recently developed FRASER, a denoising autoencoder-based method that outperformed alternative methods of detecting aberrant splicing. However, because FRASER's three splice metrics are partially redundant and tend to be sensitive to sequencing depth, we introduce here a more robust intron-excision metric, the intron Jaccard index, that combines the alternative-donor, alternative-acceptor, and intron-retention signals into a single value. Moreover, we optimized model parameters and filter cutoffs by using candidate rare splice-disrupting variants as independent evidence. On 16,213 GTEx samples, our improved algorithm, FRASER 2.0, typically called 10 times fewer splicing outliers while increasing the proportion of candidate rare splice-disrupting variants 10-fold and substantially decreasing the effect of sequencing depth on the number of reported outliers. To lower the multiple-testing correction burden, we introduce an option to select the genes to be tested for each sample instead of testing transcriptome-wide. This option can be particularly useful when prior information, such as candidate variants or genes, is available. Application to 303 rare-disease samples confirmed the relative reduction in the number of outlier calls at a slight loss of sensitivity: FRASER 2.0 recovered 22 of 26 previously identified pathogenic splicing cases with default cutoffs, and 24 when multiple-testing correction was limited to OMIM genes containing rare variants. Altogether, these methodological improvements contribute to more effective RNA-seq-based rare-disease diagnostics by drastically reducing the number of splicing-outlier calls per sample at minimal loss of sensitivity.
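To make the idea of a single intron-excision value concrete, the following is a minimal sketch of a Jaccard-style ratio for one intron. The abstract does not give the formula, so the definition used here is an assumption: the reads supporting the intron are divided by the union of all reads observed at its donor and acceptor sites (split reads to alternative partners plus unspliced reads spanning either site). The function name and argument names are hypothetical and are not taken from the FRASER package.

```python
import math


def intron_jaccard_index(split_intron, split_donor_total, split_acceptor_total,
                         nonsplit_donor, nonsplit_acceptor):
    """Jaccard-style ratio for one intron (assumed definition, see lead-in).

    split_intron:         split reads supporting exactly this intron
    split_donor_total:    all split reads using its donor site
                          (includes split_intron)
    split_acceptor_total: all split reads using its acceptor site
                          (includes split_intron)
    nonsplit_donor, nonsplit_acceptor:
                          unspliced reads spanning each splice site
    """
    # Reads of the intron itself are contained in both split totals,
    # so subtract them once to obtain the size of the union.
    union = (split_donor_total + split_acceptor_total
             + nonsplit_donor + nonsplit_acceptor
             - split_intron)
    if union == 0:
        return math.nan  # no coverage at either splice site
    return split_intron / union


# Hypothetical counts: 90 reads support the intron, 10 reads use an
# alternative acceptor, 5 use an alternative donor, and 3 + 2 unspliced
# reads span the two splice sites.
example = intron_jaccard_index(90, 100, 95, 3, 2)
fully_spliced = intron_jaccard_index(50, 50, 50, 0, 0)
```

Under this assumed definition, alternative donor usage, alternative acceptor usage, and intron retention all enlarge the denominator, so any of the three lowers the same value.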
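The effect of restricting the tested genes on the multiple-testing burden can be illustrated with a small sketch. The abstract does not name the correction procedure, so Benjamini-Hochberg is used here purely as a stand-in; the gene names, p-values, and helper functions are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def adjust_genes(gene_pvalues, genes_to_test=None):
    """Adjust p-values for all genes, or only for a preselected subset."""
    if genes_to_test is None:
        genes = list(gene_pvalues)
    else:
        genes = [g for g in gene_pvalues if g in genes_to_test]
    adjusted = benjamini_hochberg([gene_pvalues[g] for g in genes])
    return dict(zip(genes, adjusted))


# Hypothetical per-gene p-values for one sample.
pvals = {"GENE_A": 0.001}
pvals.update({f"GENE_{i}": 0.5 for i in range(1, 10)})

transcriptome_wide = adjust_genes(pvals)                  # 10 tests
candidates_only = adjust_genes(pvals, {"GENE_A", "GENE_1"})  # 2 tests
```

In this toy case the adjusted p-value of `GENE_A` is smaller when only two candidate genes are tested than when all ten are, which is the intuition behind limiting the correction to genes with prior evidence.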