Implication of heteroatom tautomer in QSAR models

For molecules with mobile H atoms, the results of quantitative structure-activity relationship (QSAR) models depend on the actual positions of the respective hydrogen atoms. Thus, to obtain reliable results, tautomerism needs to be taken into account. A new algorithm is presented that creates all tautomeric forms of a given molecule based on the mobile H-layer information in the InChI (IUPAC International Chemical Identifier) code. Unlike previously published tautomer generation methods, it requires no particular rule set. The algorithm first eliminates atoms not participating in tautomerism and then generates all possible combinations. It was applied to ca. 70,000 structures of the EINECS (European Inventory of Existing Commercial Chemical Substances) database. About 7,500 structures with mobile H atoms were detected, and around 200,000 tautomers were generated in total. Typically, the number of tautomers for a single compound is below 25, but for some molecules this number is extremely large.
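The idea of reading mobile H groups from the InChI string and enumerating hydrogen placements can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the function names `mobile_h_groups` and `enumerate_placements` are hypothetical, the parser assumes uncharged mobile H groups of the form `(H,6,7)` or `(H2,7,8)` in the first `/h` layer, and it performs neither the elimination of non-participating atoms nor any valence check or bond rearrangement, so it overcounts chemically valid tautomers.

```python
import re
from collections import Counter
from itertools import combinations_with_replacement, product

# Assumed pattern for an uncharged mobile H group, e.g. "(H,6,7)" or "(H2,7,8)".
MOBILE_H = re.compile(r"\(H(\d*)((?:,\d+)+)\)")


def mobile_h_groups(inchi):
    """Return [(number of mobile H, [atom numbers])] from the first /h layer."""
    h_layer = next(
        (part[1:] for part in inchi.split("/") if part.startswith("h")), ""
    )
    groups = []
    for count, atoms in MOBILE_H.findall(h_layer):
        atom_numbers = [int(a) for a in atoms.strip(",").split(",")]
        groups.append((int(count or 1), atom_numbers))
    return groups


def enumerate_placements(groups):
    """List every way to distribute each group's mobile H over its atoms.

    Each placement is a Counter mapping atom number -> number of mobile H.
    No valence filtering is applied (a simplification of this sketch).
    """
    per_group = [
        [Counter(combo) for combo in combinations_with_replacement(atoms, n_h)]
        for n_h, atoms in groups
    ]
    return [sum(choice, Counter()) for choice in product(*per_group)]


# 2-Pyridone / 2-hydroxypyridine: one mobile H shared by N6 and O7.
inchi = "InChI=1S/C5H5NO/c7-5-3-1-2-4-6-5/h1-4H,(H,6,7)"
groups = mobile_h_groups(inchi)
placements = enumerate_placements(groups)
```

For this example the sketch finds a single mobile H group and two placements, corresponding to the hydrogen sitting on the nitrogen or on the oxygen. Because the count grows combinatorially with the number of groups and atoms, it is easy to see how some molecules end up with very large tautomer numbers.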

Two estimation models, one for the soil sorption coefficient and one for the water solubility, were applied to the subset of compounds with tautomerism. For each substance, the variability of the results due to the different tautomeric forms was inspected. Calculation results exceeding the range of reasonable values were excluded from the study. The average variation of the soil sorption coefficient within tautomers of individual compounds was almost 0.5 logarithmic units, and differences of up to 3 orders of magnitude were obtained for particular chemicals. For the water solubility, the average variation was between 1 and 2 orders of magnitude, with maximum differences of more than ten logarithmic units.
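The per-compound variability analysis described above could be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how "variation" was measured, so the sketch uses the range (maximum minus minimum) of the predicted logarithmic values; the function name `tautomer_spread`, the plausibility bounds, and the numbers in `example` are invented placeholders, not data from the study.

```python
from statistics import mean


def tautomer_spread(predictions, lower=-2.0, upper=8.0):
    """Return {compound: max - min of log-scale predictions over its tautomers}.

    Values outside [lower, upper] are dropped as unreasonable; the bounds
    are hypothetical. Compounds with fewer than two remaining values are
    skipped because no spread can be computed for them.
    """
    spreads = {}
    for compound, values in predictions.items():
        kept = [v for v in values if lower <= v <= upper]
        if len(kept) >= 2:
            spreads[compound] = max(kept) - min(kept)
    return spreads


# Invented placeholder predictions (log units), one list per compound.
example = {
    "compound_A": [2.1, 2.6, 3.0],
    "compound_B": [1.0, 1.2, 15.0],  # 15.0 falls outside the assumed bounds
    "compound_C": [4.0],             # single tautomer value: skipped
}
spreads = tautomer_spread(example)
average_spread = mean(list(spreads.values()))
```

Using the range makes the largest disagreement between tautomers directly visible; a standard deviation would be an equally plausible choice and would give smaller numbers for the same data.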

Author information

Correspondence to T Thalheim.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Keywords

  • Average Variation
  • Water Solubility
  • Chemical Substance
  • Actual Position
  • QSAR Model