Distance-dependent: characterizing virtual screening datasets

Anthes, C; Rohrer, SG; Baumann, K

doi:10.1186/1752-153X-3-S1-P19

Volume 3 Supplement 1

4th German Conference on Chemoinformatics: 22. CIC-Workshop

Poster presentation
Open access
Published: 05 June 2009

Chemistry Central Journal volume 3, Article number: P19 (2009)

Many reports evaluating ligand-based virtual screening methods show that the results depend strongly on the composition of the benchmark datasets employed. Recently, it became apparent that two causes of overoptimistic validation results need to be avoided: artificial enrichment and analogue bias. Artificial enrichment is observed when the decoy set (i.e. the background) differs significantly from the set of actives with regard to "simple" molecular properties. Analogue bias describes the over-representation of certain scaffolds in the set of actives. Both phenomena render retrieval of actives trivial.

Several techniques have been proposed in the literature to cope with these problems. Most of them use the mean of pairwise distances or the mean of pairwise similarity coefficients to characterize dataset diversity [1]. These measures depend not only on the dataset but also on the structure descriptor and the distance/similarity measure employed.
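
As an illustration of the mean pairwise distance mentioned above, the following minimal sketch computes it for two toy datasets. It rests on assumptions that are not part of the abstract: fingerprints are represented as sets of on-bit indices, the Tanimoto coefficient is used as the similarity measure, and the distance is taken as one minus the similarity. The function names and the toy data are hypothetical.

```python
from itertools import combinations


def tanimoto_similarity(a, b):
    """Tanimoto (Jaccard) coefficient of two sets of on-bit indices."""
    union = len(a | b)
    if union == 0:
        # Convention assumed here: two empty fingerprints count as identical.
        return 1.0
    return len(a & b) / union


def mean_pairwise_distance(fingerprints):
    """Mean of (1 - Tanimoto similarity) over all unordered pairs."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        raise ValueError("at least two fingerprints are required")
    return sum(1.0 - tanimoto_similarity(a, b) for a, b in pairs) / len(pairs)


# Toy data (hypothetical): three close analogues versus three unrelated compounds.
analogues = [frozenset({1, 2, 3, 4}), frozenset({1, 2, 3, 5}), frozenset({1, 2, 3, 6})]
unrelated = [frozenset({1, 2}), frozenset({3, 4}), frozenset({5, 6})]

print(round(mean_pairwise_distance(analogues), 3))  # 0.4
print(round(mean_pairwise_distance(unrelated), 3))  # 1.0
```

Changing the fingerprint or swapping the Tanimoto coefficient for another similarity measure would change both numbers, which is the dependence on descriptor and measure that the paragraph above points out.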

The goal of this study was to assess whether commonly employed measures of diversity reasonably characterize benchmark dataset composition. To this end, previously published diversity measures were compared to recently introduced spatial statistics-based figures of dataset topology [2]. The relative advantages and disadvantages of the studied figures are contrasted. Interestingly, figures based on more distant neighbours than just the nearest one performed very well. From a detailed analysis of the findings, a guideline for characterizing ligand-based virtual screening datasets is derived.
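
The abstract does not define the neighbour-based figures it refers to, so the sketch below is only a generic illustration of why looking beyond the nearest neighbour can be informative. It computes the mean distance to the k-th nearest neighbour for two hypothetical one-dimensional toy datasets; the function names, the data, and the use of an absolute difference as the distance are all assumptions made for this example and are not the figures from [2].

```python
def kth_neighbour_distances(items, distance, k=1):
    """Distance from each item to its k-th nearest neighbour (k=1 is the nearest)."""
    if not 1 <= k < len(items):
        raise ValueError("k must be between 1 and len(items) - 1")
    result = []
    for i, x in enumerate(items):
        # Distances from x to every other item, in ascending order.
        dists = sorted(distance(x, y) for j, y in enumerate(items) if j != i)
        result.append(dists[k - 1])
    return result


def mean(values):
    return sum(values) / len(values)


def abs_diff(x, y):
    """Toy distance for 1-D points (an assumption for this example)."""
    return abs(x - y)


# Hypothetical toy datasets: tight pairs far apart versus evenly spaced points.
paired = [0, 1, 10, 11, 20, 21]
even = [0, 1, 2, 3, 4, 5]

for k in (1, 2):
    print(k,
          round(mean(kth_neighbour_distances(paired, abs_diff, k)), 2),
          round(mean(kth_neighbour_distances(even, abs_diff, k)), 2))
# k=1: both means are 1.0; k=2: 9.33 versus 1.33
```

In this toy case the nearest-neighbour distances cannot tell the two datasets apart, whereas the second-nearest-neighbour distances separate the clumped set from the evenly spread one.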

References

1. Hert J, Willett P, Wilton DJ, Acklin P, Azzaoui K, Jacoby E, Schuffenhauer A: J Chem Inf Comput Sci. 2004, 44: 1177-1185.
2. Rohrer SG, Baumann K: J Chem Inf Model. 2008, 48: 704-718. 10.1021/ci700099u.

Author information

Authors and Affiliations

Institut für Pharmazeutische Chemie, Technische Universität Braunschweig, Beethovenstr. 55, 38106, Braunschweig, Germany
C Anthes, SG Rohrer & K Baumann

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Anthes, C., Rohrer, S. & Baumann, K. Distance-dependent: characterizing virtual screening datasets. Chemistry Central Journal 3 (Suppl 1), P19 (2009). https://doi.org/10.1186/1752-153X-3-S1-P19
