Protein adsorption on solid surfaces is a process relevant to biological, medical, industrial, and environmental applications. Despite this wide interest and advances in measurement techniques, the complexity of protein adsorption has frustrated its accurate prediction. To address this challenge, data on protein adsorption reported over the last four decades were collected, checked for completeness and correctness, organized, and archived in an upgraded, freely accessible Biomolecular Adsorption Database, which is equivalent to a large-scale, ad hoc, crowd-sourced multifactorial experiment. The shape and physicochemical properties of the proteins present in the database were quantified on their molecular surfaces using an in-house program (ProMS) operating as an add-on to the PyMol software. Machine learning-based analysis indicated that protein adsorption on hydrophobic and hydrophilic surfaces is modulated by different sets of operational, structural, and molecular surface-based physicochemical parameters. Separately, the adsorption data for four "benchmark" proteins, i.e., lysozyme, albumin, IgG, and fibrinogen, were processed by piecewise linear regression with the protein monolayer acting as the breakpoint, using the linearized form of the Langmuir isotherm, resulting in semiempirical relationships predicting protein adsorption. These relationships, derived separately for hydrophilic and hydrophobic surfaces, described well the protein concentration on the surface as a function of the protein concentration in solution; the contact angle of the adsorbing surface; the ionic strength, pH, and temperature of the carrying fluid; and the difference between the pH and the isoelectric point of the protein.
When the semiempirical relationships derived for the benchmark proteins were applied to two other "test" proteins with known PDB structures, i.e., β-lactoglobulin and α-lactalbumin, the errors of this extrapolation were found to scale linearly with the dissimilarity between the benchmark and the test proteins. The work presented here can be used to estimate the operational parameters modulating protein adsorption in applications such as diagnostic devices, pharmaceuticals, biomaterials, and the food industry.
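
To make the linearization of the Langmuir isotherm mentioned above concrete, the following is a minimal sketch and not the authors' actual procedure. It assumes the common linearized form c/Γ = c/Γ_max + 1/(K·Γ_max), where c is the protein concentration in solution and Γ the concentration on the surface, and fits it by ordinary least squares. The function names, the parameter values, and the noise-free synthetic data are all hypothetical and serve only as illustration. The sketch covers a single linear segment; it does not reproduce the piecewise regression with a monolayer breakpoint or the additional operational variables (contact angle, ionic strength, pH, temperature).

```python
import numpy as np


def langmuir(c, gamma_max, k):
    """Langmuir isotherm: surface concentration for solution concentration c."""
    return gamma_max * k * c / (1.0 + k * c)


def fit_linearized_langmuir(c, gamma):
    """Fit c/gamma = c/gamma_max + 1/(k*gamma_max) by ordinary least squares.

    Returns the estimated (gamma_max, k).
    """
    slope, intercept = np.polyfit(c, c / gamma, 1)
    gamma_max = 1.0 / slope   # slope = 1 / gamma_max
    k = slope / intercept     # intercept = 1 / (k * gamma_max)
    return gamma_max, k


# Hypothetical, noise-free synthetic data (arbitrary units), not measured values.
c = np.linspace(0.05, 2.0, 20)
gamma = langmuir(c, gamma_max=3.0, k=5.0)

gamma_max_fit, k_fit = fit_linearized_langmuir(c, gamma)
print(f"gamma_max = {gamma_max_fit:.3f}, K = {k_fit:.3f}")
```

With real, noisy adsorption data the linearized fit weights points differently from a direct nonlinear fit, so the recovered parameters should be treated as estimates rather than exact values.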