Keywords: CDR-H3; database; repertoire

MeSH: Humans; Data Mining / methods; Drug Discovery / methods; Biological Products / immunology; Complementarity Determining Regions / genetics, immunology; Immunoglobulin Variable Region / immunology, genetics

Source: DOI:10.1080/19420862.2024.2361928

Abstract:
The naïve human antibody repertoire has theoretical access to an estimated >10^15 antibodies. Identifying subsets of this prohibitively large space where therapeutically relevant antibodies may be found is useful for the development of these agents. It was previously demonstrated that, despite the immense sequence space, different individuals can produce the same antibodies. It was also shown that therapeutic antibodies, which typically follow seemingly unnatural development processes, can arise independently in nature. To check for biases in how the sequence space is explored, we data mined public repositories to identify 220 bioprojects with a combined seven billion reads. Of these, we created a subset of human bioprojects that we make available as the AbNGS database (https://naturalantibody.com/ngs/). AbNGS contains 135 bioprojects with four billion productive human heavy variable region sequences and 385 million unique complementarity-determining region (CDR)-H3s. We find that 270,000 (0.07% of 385 million) unique CDR-H3s are highly public in that they occur in at least five of 135 bioprojects. Of 700 unique therapeutic CDR-H3s, 6% have direct matches in the small set of 270,000. This observation extends to a match between CDR-H3 and V-gene call as well. Thus, the subspace of shared ('public') CDR-H3s shows utility as a starting point for therapeutic antibody design.
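The "highly public" criterion in the abstract (a CDR-H3 occurring in at least five of 135 bioprojects) and the therapeutic match rate can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the `(bioproject_id, cdrh3)` record layout, the toy sequences, and the use of exact string equality for a "direct match" are all assumptions made for illustration.

```python
from collections import defaultdict


def public_cdrh3s(records, min_projects=5):
    """Return CDR-H3s seen in at least `min_projects` distinct bioprojects.

    `records` is an iterable of (bioproject_id, cdrh3) pairs. This layout is
    hypothetical; it is not the AbNGS schema.
    """
    projects_by_cdrh3 = defaultdict(set)
    for bioproject_id, cdrh3 in records:
        # A set counts each bioproject once, however many reads it contributes.
        projects_by_cdrh3[cdrh3].add(bioproject_id)
    return {
        seq
        for seq, projects in projects_by_cdrh3.items()
        if len(projects) >= min_projects
    }


def fraction_matched(therapeutic_cdrh3s, public_set):
    """Fraction of unique therapeutic CDR-H3s found in `public_set`.

    Assumption: a "direct match" means exact string equality.
    """
    unique = set(therapeutic_cdrh3s)
    if not unique:
        return 0.0
    return len(unique & public_set) / len(unique)


# Toy data (invented sequences), with a lowered threshold of 2 projects.
records = [
    ("P1", "ARDYW"), ("P2", "ARDYW"), ("P3", "ARDYW"),
    ("P1", "ARGGW"), ("P1", "ARGGW"),  # two reads, but only one bioproject
    ("P2", "AKDLW"), ("P3", "AKDLW"),
]
public = public_cdrh3s(records, min_projects=2)
share = fraction_matched(["ARDYW", "ARQQW"], public)

# Proportion quoted in the abstract: 270,000 of 385 million unique CDR-H3s.
public_percent = 270_000 / 385_000_000 * 100
```

Counting distinct bioprojects rather than reads is the point of the sketch: a sequence that is abundant in a single project does not count as public under this definition.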