TY - JOUR
T1 - Modern semiempirical electronic structure methods and machine learning potentials for drug discovery
T2 - Conformers, tautomers, and protonation states
AU - Zeng, Jinzhe
AU - Tao, Yujun
AU - Giese, Timothy J.
AU - York, Darrin M.
N1 - Publisher Copyright: © 2023 Author(s).
PY - 2023/3/28
Y1 - 2023/3/28
N2 - Modern semiempirical electronic structure methods have considerable promise in drug discovery as universal "force fields" that can reliably model biological and drug-like molecules, including alternative tautomers and protonation states. Herein, we compare the performance of several neglect of diatomic differential overlap-based semiempirical (MNDO/d, AM1, PM6, PM6-D3H4X, PM7, and ODM2), density-functional tight-binding based (DFTB3, DFTB/ChIMES, GFN1-xTB, and GFN2-xTB) models with pure machine learning potentials (ANI-1x and ANI-2x) and hybrid quantum mechanical/machine learning potentials (AIQM1 and QDπ) for a wide range of data computed at a consistent ωB97X/6-31G∗ level of theory (as in the ANI-1x database). This data includes conformational energies, intermolecular interactions, tautomers, and protonation states. Additional comparisons are made to a set of natural and synthetic nucleic acids from the artificially expanded genetic information system that has important implications for the design of new biotechnology and therapeutics. Finally, we examine the acid/base chemistry relevant for RNA cleavage reactions catalyzed by small nucleolytic ribozymes, DNAzymes, and ribonucleases. Overall, the hybrid quantum mechanical/machine learning potentials appear to be the most robust for these datasets, and the recently developed QDπ model performs exceptionally well, having especially high accuracy for tautomers and protonation states relevant to drug discovery.
AB - Modern semiempirical electronic structure methods have considerable promise in drug discovery as universal "force fields" that can reliably model biological and drug-like molecules, including alternative tautomers and protonation states. Herein, we compare the performance of several neglect of diatomic differential overlap-based semiempirical (MNDO/d, AM1, PM6, PM6-D3H4X, PM7, and ODM2), density-functional tight-binding based (DFTB3, DFTB/ChIMES, GFN1-xTB, and GFN2-xTB) models with pure machine learning potentials (ANI-1x and ANI-2x) and hybrid quantum mechanical/machine learning potentials (AIQM1 and QDπ) for a wide range of data computed at a consistent ωB97X/6-31G∗ level of theory (as in the ANI-1x database). This data includes conformational energies, intermolecular interactions, tautomers, and protonation states. Additional comparisons are made to a set of natural and synthetic nucleic acids from the artificially expanded genetic information system that has important implications for the design of new biotechnology and therapeutics. Finally, we examine the acid/base chemistry relevant for RNA cleavage reactions catalyzed by small nucleolytic ribozymes, DNAzymes, and ribonucleases. Overall, the hybrid quantum mechanical/machine learning potentials appear to be the most robust for these datasets, and the recently developed QDπ model performs exceptionally well, having especially high accuracy for tautomers and protonation states relevant to drug discovery.
UR - https://www.scopus.com/pages/publications/85150885139
U2 - 10.1063/5.0139281
DO - 10.1063/5.0139281
M3 - Article
C2 - 37003741
AN - SCOPUS:85150885139
SN - 0021-9606
VL - 158
JO - Journal of Chemical Physics
JF - Journal of Chemical Physics
IS - 12
M1 - 124110
ER -