Keywords: DNA coding; DNA storage; ECA-PCRAIR; biological constraint; nonspecific pairing constraint; storage density

MeSH: Algorithms; Polymerase Chain Reaction / methods; DNA / genetics; Information Storage and Retrieval / methods; DNA Primers / genetics; Base Sequence

Source: DOI:10.3390/ijms25126449

Abstract:
Polymerase Chain Reaction (PCR) amplification is widely used for retrieving information from DNA storage. During the PCR amplification process, nonspecific pairing between the 3' end of the primer and the DNA sequence can cause cross-talk in the amplification reaction, leading to the generation of interfering sequences and reduced amplification accuracy. To address this issue, we propose an efficient coding algorithm for PCR amplification information retrieval (ECA-PCRAIR). This algorithm employs variable-length scanning and pruning optimization to construct a codebook that maximizes storage density while satisfying traditional biological constraints. Subsequently, a codeword search tree is constructed based on the primer library to optimize the codebook, and a variable-length interleaver is used for constraint detection and correction, thereby minimizing the likelihood of nonspecific pairing. Experimental results demonstrate that ECA-PCRAIR can reduce the probability of nonspecific pairing between the 3' end of the primer and the DNA sequence to 2-25%, enhancing the robustness of the DNA sequences. Additionally, ECA-PCRAIR achieves a storage density of 2.14-3.67 bits per nucleotide (bits/nt), significantly improving storage capacity.
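The nonspecific pairing constraint described above can be illustrated with a minimal sketch. It is not the ECA-PCRAIR algorithm itself (the codeword search tree and variable-length interleaver are not reproduced here); it only flags places where the last `k` bases at a primer's 3' end could anneal to a stored sequence. The function names, the tail length `k=4`, and the reading of "traditional biological constraints" as GC content plus homopolymer run length are all assumptions made for illustration.

```python
# Hypothetical helpers; names, k, and the chosen constraints are assumptions.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def max_homopolymer_run(seq: str) -> int:
    """Length of the longest run of identical consecutive bases."""
    longest = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest


def nonspecific_hits(seq: str, primers: list[str], k: int = 4):
    """List (primer, position, strand) where a primer's 3' tail matches.

    A primer anneals to a strand containing the reverse complement of its
    tail, so a tail match in `seq` ("+") means the primer could bind the
    opposite strand there, and a match in the reverse complement ("-")
    means it could bind `seq` itself. Positions index the strand searched.
    """
    hits = []
    strands = (("+", seq), ("-", reverse_complement(seq)))
    for primer in primers:
        tail = primer[-k:]  # primers are written 5'->3', so the tail is the 3' end
        for label, strand in strands:
            pos = strand.find(tail)
            while pos != -1:
                hits.append((primer, pos, label))
                pos = strand.find(tail, pos + 1)  # allow overlapping matches
    return hits
```

A codebook builder could reject or rewrite any candidate sequence for which `nonspecific_hits` returns a non-empty list, alongside whatever GC-content and run-length limits are in force.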