Chemical complexity mapping in QSAR models
Chemistry Central Journal volume 3, Article number: P40 (2009)
QSAR models relate features of chemical structures to target properties or effects. High-quality models are expected to be built on validated data sets. Typically, the target data are validated for accuracy and reliability. A chemical structure is assigned to each data item, and for models based on 3D geometry a more or less sophisticated geometry optimisation is performed. However, less attention is usually paid to the proper representation of the chemical identities themselves before they enter the model training set. Reported chemical names, and even registry numbers, often correspond to ambiguous chemical structures. Relevant chemical aspects include isomerism, mesomerism, and tautomerism; in addition, measured data may relate to generic compound specifications or to mixtures of defined or even undefined composition.
Within the framework of the EU projects OSIRIS and 2-FUN, a database concept is introduced to reflect these aspects of chemical complexity. One goal of this development is to provide a tool for obtaining representative data sets for QSAR development that take chemical complexity into account in an appropriate manner.
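As a rough illustration of what such a record could look like, the following Python sketch keeps a reported identity together with every structure it might stand for. The abstract does not describe the actual OSIRIS/2-FUN schema, so the class names, fields, and the filtering rule are hypothetical and serve only to make the idea concrete.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Ambiguity(Enum):
    """Kinds of structural ambiguity named in the abstract (labels are illustrative)."""
    NONE = "none"
    ISOMERISM = "isomerism"
    MESOMERISM = "mesomerism"
    TAUTOMERISM = "tautomerism"
    GENERIC = "generic specification"
    MIXTURE = "mixture"


@dataclass
class ChemicalIdentity:
    """A reported name plus all candidate structures (hypothetical schema)."""
    reported_name: str
    ambiguity: Ambiguity
    candidate_smiles: List[str] = field(default_factory=list)

    def is_unambiguous(self) -> bool:
        # Exactly one defined structure and no flagged ambiguity.
        return self.ambiguity is Ambiguity.NONE and len(self.candidate_smiles) == 1


def select_training_set(records: List[ChemicalIdentity]) -> List[ChemicalIdentity]:
    """Keep only records that map to a single defined structure."""
    return [r for r in records if r.is_unambiguous()]


records = [
    ChemicalIdentity("benzene", Ambiguity.NONE, ["c1ccccc1"]),
    # "xylene" without a locant may denote the ortho, meta, or para isomer.
    ChemicalIdentity(
        "xylene",
        Ambiguity.ISOMERISM,
        ["Cc1ccccc1C", "Cc1cccc(C)c1", "Cc1ccc(C)cc1"],
    ),
]
training_set = select_training_set(records)
```

Excluding ambiguous records is only one possible policy; a record could equally be kept with all of its candidates so that the ambiguity is carried into the modelling step.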
The importance of this approach is demonstrated by example calculations that show how uncertainties due to ambiguous chemical structures affect the output of QSAR models. This study is supported by the EU projects OSIRIS (contract No. 037017) and 2-FUN (contract No. 036976).
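One simple way to express the effect of structural ambiguity on model output is the range of predictions over all candidate structures of a single reported compound. The sketch below assumes this min/max measure; the abstract does not state which measure the authors used. The function `prediction_range`, the stand-in `toy_model`, and the numbers in `TOY_PREDICTIONS` are hypothetical placeholders, not results from the study or from any real QSAR model.

```python
from typing import Callable, Dict, Iterable


def prediction_range(
    candidates: Iterable[str], model: Callable[[str], float]
) -> Dict[str, float]:
    """Apply a model to every candidate structure and report the spread."""
    values = [model(c) for c in candidates]
    if not values:
        raise ValueError("at least one candidate structure is required")
    low, high = min(values), max(values)
    return {"min": low, "max": high, "spread": high - low}


# Placeholder values only: they stand in for the predictions a real model
# would return for three candidate structures of one reported compound.
TOY_PREDICTIONS = {"structure_A": 1.0, "structure_B": 1.5, "structure_C": 2.5}


def toy_model(structure_id: str) -> float:
    """Stand-in for a trained QSAR model (simple lookup, not a real model)."""
    return TOY_PREDICTIONS[structure_id]


result = prediction_range(TOY_PREDICTIONS, toy_model)
print(result)
```

A spread that is large compared with the model's stated prediction error would suggest that the structural ambiguity should be resolved before the compound is used for training or prediction.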
Cite this article
Thalheim, T., Ebert, RU., Kühne, R. et al. Chemical complexity mapping in QSAR models. Chemistry Central Journal 3 (Suppl 1), P40 (2009). https://doi.org/10.1186/1752-153X-3-S1-P40