Application of Transformers in Cheminformatics

Abstract:

By accelerating time-consuming processes, computing has become an essential part of many modern chemical pipelines. Machine learning is a class of computing methods that can discover patterns in chemical data and use this knowledge for a wide variety of downstream tasks, such as property prediction or substance generation. The complex and diverse chemical space requires machine learning architectures with substantial learning capacity. Recently, models based on the transformer architecture have revolutionized multiple domains of machine learning, including natural language processing and computer vision. Naturally, there have been ongoing efforts to adapt these techniques to the chemical domain, resulting in a surge of publications within a short period. The diversity of chemical structures, use cases, and learning models necessitates a comprehensive summary of existing work. In this paper, we review recent innovations in adapting transformers to solve learning problems in chemistry. Because chemical data is diverse and complex, we structure our discussion around chemical representations. Specifically, for each representation we highlight its strengths and weaknesses, the current progress in adapting transformer architectures to it, and future directions.
