Assessment of complex SARs using Formal Concept Analysis of fragment combinations

Molecular fragments are widely recognized as powerful descriptors with the advantage of being chemically interpretable [1]. Several methodologies have been introduced to annotate molecular fragments with activity information based on their occurrence and/or abundance in bioactive compounds [2][3]. However, these methods typically do not address the comparison of several activity classes.

We present FragFCA, a fragment-based approach that utilizes Formal Concept Analysis [4], a data mining technique adapted from information theory. FragFCA makes it possible to generate user-defined complex SAR queries spanning multiple activity classes and incorporating potency information. Using the freely available ToscanaJ software package, we have designed a graphical interface allowing the interactive assembly of queries of varying levels of complexity [5].
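To make the underlying idea concrete, the following is a minimal sketch of Formal Concept Analysis on a fragment context. It is not the FragFCA implementation and does not use ToscanaJ; the compounds, fragment labels, and function names are all hypothetical. A formal concept is a pair (extent, intent) in which the extent is the set of all compounds sharing the fragments of the intent, and the intent is the set of all fragments shared by the compounds of the extent.

```python
from itertools import combinations

# Hypothetical toy context: compound -> set of fragments it contains.
context = {
    "cpd1": {"f1", "f2"},
    "cpd2": {"f1", "f2", "f3"},
    "cpd3": {"f2", "f3"},
    "cpd4": {"f3"},
}
all_fragments = set().union(*context.values())


def extent(fragments):
    """Compounds that contain every fragment in `fragments` (a set)."""
    return frozenset(c for c, frags in context.items() if fragments <= frags)


def intent(compounds):
    """Fragments shared by every compound in `compounds`."""
    if not compounds:
        # By convention, the empty set of compounds shares all fragments.
        return frozenset(all_fragments)
    return frozenset(set.intersection(*(context[c] for c in compounds)))


def formal_concepts():
    """Enumerate all concepts by closing every fragment subset.

    Brute force, so only suitable for very small contexts.
    """
    concepts = set()
    frags = sorted(all_fragments)
    for size in range(len(frags) + 1):
        for combo in combinations(frags, size):
            ext = extent(set(combo))
            concepts.add((ext, intent(ext)))
    return concepts


concepts = formal_concepts()
for ext, itt in sorted(concepts, key=lambda c: (-len(c[0]), sorted(c[1]))):
    print(sorted(ext), sorted(itt))
```

In this toy context, closing the query {f1} yields the intent {f1, f2}, because every compound containing f1 also contains f2. The exhaustive subset enumeration is chosen for readability only; dedicated FCA algorithms scale far better.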

Using a publicly available data set of GPCR inhibitors belonging to seven different classes with overlapping activity, we have analyzed the ability of fragment combinations to discriminate between different activities and/or potency ranges. Fragment pairs and triplets, rather than individual fragments, are shown to carry most of the selectivity information. Moreover, using FragFCA, we have been able to identify fragment combinations that successfully distinguish selective from non-selective cathepsin L inhibitors in HTS data. This suggests that FragFCA can be applied to extract fragment combinations from training sets for the prediction of compound selectivity.
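The kind of query described above can be illustrated with a small sketch that looks for fragment combinations occurring only in compounds of a single activity class. The data are invented for illustration and are not taken from the GPCR or cathepsin L sets; the function name `selective_combinations` and the selectivity criterion (all matching compounds annotated with exactly one common class) are assumptions, not the criteria used in the study.

```python
from itertools import combinations

# Hypothetical toy data: compound -> (fragments, annotated activity classes).
compounds = {
    "c1": ({"a", "b"}, {"A"}),
    "c2": ({"a", "b", "c"}, {"A"}),
    "c3": ({"a", "c"}, {"B"}),
    "c4": ({"b", "c"}, {"B"}),
    "c5": ({"a"}, {"A", "B"}),
}


def selective_combinations(data, size):
    """Map each fragment combination of the given size to its class,
    keeping only combinations whose matching compounds are all
    annotated with one and the same single activity class."""
    fragments = sorted(set().union(*(frags for frags, _ in data.values())))
    result = {}
    for combo in combinations(fragments, size):
        # Class annotations of every compound containing the whole combination.
        matching = [cls for frags, cls in data.values() if set(combo) <= frags]
        if not matching:
            continue
        observed = set.union(*matching)
        if len(observed) == 1:
            result[combo] = next(iter(observed))
    return result


for size in (1, 2, 3):
    print(size, selective_combinations(compounds, size))
```

In this constructed example no single fragment is class-specific, whereas the pair (a, b) occurs only in class A compounds. The toy data were built to show that pattern and carry no evidential weight.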


  1. Erlanson DA, McDowell RS, O'Brien T: J Med Chem. 2004, 47: 3463-3482. 10.1021/jm040031v.

  2. Schnur DM, Hermsmeier MA, Tebben AJ: J Med Chem. 2006, 49: 2000-2009. 10.1021/jm0502900.

  3. Sutherland JJ, Higgs RE, Watson I, Vieth M: J Med Chem. 2008, 51: 2689-2700. 10.1021/jm701399f.

  4. Priss U: Ann Rev Inf Sci Technol. 2006, 40: 521-543. 10.1002/aris.1440400120.

  5. Lounkine E, Auer J, Bajorath J: J Med Chem. 2008


Author information



Corresponding author

Correspondence to J Bajorath.


Lounkine, E., Auer, J. & Bajorath, J. Assessment of complex SARs using Formal Concept Analysis of fragment combinations. Chemistry Central Journal 3, P3 (2009).



  • Bioactive Compound
  • Activity Class
  • Mining Technique
  • Data Mining Technique
  • Activity Information