Comparison of various methods for validity evaluation of QSAR models
BMC Chemistry volume 16, Article number: 63 (2022)
Abstract
Background
Quantitative structure–activity relationship (QSAR) modeling is one of the most important computational tools employed in drug discovery and development. External validation is the main way to check whether a developed QSAR model can reliably predict the activity of compounds that have not yet been synthesized. Different criteria have been used for this purpose in the literature.
Methods
In this study, 44 QSAR models for biologically active compounds reported in scientific papers were collected. Various statistical parameters for the external validation of each QSAR model were calculated, and the results were discussed.
Results
The findings revealed that the coefficient of determination (r^{2}) alone could not indicate the validity of a QSAR model. The established criteria for external validation each have advantages and disadvantages, which should be considered in QSAR studies.
Conclusion
This study showed that none of these methods alone is enough to indicate the validity or invalidity of a QSAR model.
Introduction
Quantitative structure–activity relationship (QSAR) modeling is a numerical method for finding relationships between chemical structure and drug properties, i.e., biological activity, in drug discovery processes [1]. Developing a QSAR model involves several stages: (1) collecting data from the literature; (2) calculating parameters with software packages such as Dragon or by image analysis (2D-QSAR), force field calculations based on three-dimensional structures (3D-QSAR), etc.; (3) building the model with a statistical technique, e.g., multiple linear regression, artificial neural networks or partial least squares; and (4) validating the model by internal (leave-one-out and leave-many-out) and external validation [2]. Researchers should consider several critical points in QSAR studies [3]. In particular, selecting appropriate parameters for external validation remains a challenge in the literature [4, 5].
In QSAR studies, fitting a linear or nonlinear model to training data is not enough to confirm its prediction capability. In virtual screening and the design of new drug compounds, the developed model must be applied to compounds that have not yet been synthesized. A QSAR model can therefore be called acceptable only when it predicts the activity of other compounds with reasonable accuracy. For this reason, external validation (splitting the data into training and test sets) is one of the major challenges in QSAR studies [6,7,8]. Various types of cross-validation analysis, i.e., leave-one-out, leave-many-out and repeated double cross-validation, are recommended in QSAR studies, especially when the available sample size is small [9, 10]. However, external validation is one of the most common criteria for evaluating the validity of a QSAR model [11,12,13].
Different criteria and rules have been proposed for evaluating the validity of QSAR models, most of which focus on external validation [13, 14]. Five criteria proposed in reputable journals were selected for this study; details are given in the Methods section. They are highly cited, and several researchers have used them to evaluate the validity of QSAR models [15,16,17,18]. The designers of each criterion have shown its advantages over the others for external validation of QSAR models [5, 6, 19,20,21]. Some criteria have defects from a statistical viewpoint, and different results are obtained depending on the software used, e.g., for the coefficient of determination (r^{2}) of regression through origin [5]. Nevertheless, there is no comprehensive comparison of these criteria for evaluating the external validity of QSAR models. The aim of this study is to compare these criteria for the external validation of QSAR models and to identify the advantages and disadvantages of each method.
Methods
Forty-four data sets (training and test sets) composed of experimental biological activity and the corresponding calculated activity (resubstitution value for the training set), obtained from QSAR models built with various statistical approaches, were collected from published articles [22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48] indexed in the Scopus database (see Additional file 1 and Table 1). The absolute error (AE) of each datum (the absolute difference between experimental and calculated values) was calculated. External validation of these data sets was assessed with the following methods:
Proposed criteria by Golbraikh and Tropsha
I. r^{2} > 0.6, where r^{2} is the coefficient of determination between the experimental and predicted activity values based on regression analysis.
II. 0.85 < K < 1.15 or 0.85 < K' < 1.15.
K and K' are the slopes of the regression lines through the origin for experimental versus predicted activity and for predicted versus experimental activity, respectively.
III. \(\frac{r^{2}-r_{0}^{2}}{r^{2}} < 0.1\) or \(\frac{r^{2}-r_{0}^{\prime 2}}{r^{2}} < 0.1\)
\(r_{0}^{2}\) and \(r_{0}^{\prime 2}\) are the coefficients of determination for experimental versus predicted activity and for predicted versus experimental activity, respectively, based on regression through origin (linear least-squares regression without a constant term) [19].
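As an illustration, the three rules can be checked with a few lines of Python. This is a minimal sketch and not the procedure used to produce the results of this study: the function names are hypothetical, K is assumed to be the through-origin slope of experimental versus predicted values (K' the reverse), the coefficient of determination through the origin is assumed to take the 1 − SS_res/SS_tot form, and the activity values are made up.

```python
from math import fsum


def _mean(values):
    return fsum(values) / len(values)


def r_squared(x, y):
    """Squared Pearson correlation between x and y."""
    mx, my = _mean(x), _mean(y)
    sxy = fsum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = fsum((a - mx) ** 2 for a in x)
    syy = fsum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


def rto_slope(x, y):
    """Least-squares slope of y = k * x (no constant term)."""
    return fsum(a * b for a, b in zip(x, y)) / fsum(a * a for a in x)


def r0_squared(x, y):
    """Assumed form: 1 - SS_res / SS_tot for the through-origin fit y = k * x."""
    k = rto_slope(x, y)
    my = _mean(y)
    ss_res = fsum((b - k * a) ** 2 for a, b in zip(x, y))
    ss_tot = fsum((b - my) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot


def golbraikh_tropsha(experimental, predicted):
    """Evaluate rules I-III for one external (test) set."""
    r2 = r_squared(experimental, predicted)
    k = rto_slope(predicted, experimental)        # experimental = K * predicted
    k_prime = rto_slope(experimental, predicted)  # predicted = K' * experimental
    r0_2 = r0_squared(predicted, experimental)
    r0_prime_2 = r0_squared(experimental, predicted)
    rule_1 = r2 > 0.6
    rule_2 = 0.85 < k < 1.15 or 0.85 < k_prime < 1.15
    rule_3 = (r2 - r0_2) / r2 < 0.1 or (r2 - r0_prime_2) / r2 < 0.1
    return {"r2": r2, "K": k, "K_prime": k_prime,
            "rules": (rule_1, rule_2, rule_3),
            "valid": rule_1 and rule_2 and rule_3}


# Made-up activity values, for illustration only.
experimental = [5.0, 6.0, 7.0, 8.0, 9.0]
predicted = [5.1, 5.9, 7.2, 7.8, 9.1]
result = golbraikh_tropsha(experimental, predicted)
print(result["valid"], result["rules"])
```

Because rules II and III are each joined by "or", the outcome does not depend on which of the two slopes is labelled K and which K'.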
Proposed criteria by Roy based on regression through origin (RTO)
Roy and coworkers suggested \(r_{m}^{2}\), which is calculated by Eq. 1 and is one of the most widely used metrics among QSAR experts in the literature [20, 49]:
\[r_{m}^{2} = r^{2}\left(1-\sqrt{r^{2}-r_{0}^{2}}\right) \quad (1)\]
In this equation, \(r_{0}^{2}\) is computed using regression through origin (RTO), i.e., linear least-squares regression without a constant term.
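A minimal sketch of this calculation is given here, assuming the commonly cited form r_m^2 = r^2 (1 − sqrt(r^2 − r_0^2)). The function name is hypothetical and the input values are made up. The function returns None when r^2 is smaller than r_0^2, because the square root is then undefined for real numbers.

```python
from math import sqrt


def rm_squared(r2, r0_2):
    """r_m^2 from r^2 and the through-origin r_0^2 (assumed formula)."""
    diff = r2 - r0_2
    if diff < 0:
        # The square root of a negative number is not a real value.
        return None
    return r2 * (1.0 - sqrt(diff))


print(rm_squared(0.80, 0.79))  # r^2 slightly above r_0^2
print(rm_squared(0.70, 0.90))  # r^2 below r_0^2: not computable
```

A value of \(r_{m}^{2}\) close to r^{2} therefore requires r_{0}^{2} to be close to r^{2}.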
Concordance correlation coefficient (CCC)
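Gramatica and coworker [4] suggested the concordance correlation coefficient (CCC) for external validation of a QSAR model (Eq. 2):
\[\text{CCC} = \frac{2\sum_{i=1}^{n_{\text{EXT}}} (Y_{i}-\overline{Y})(Y_{i}^{\prime}-\overline{Y}^{\prime})}{\sum_{i=1}^{n_{\text{EXT}}} (Y_{i}-\overline{Y})^{2} + \sum_{i=1}^{n_{\text{EXT}}} (Y_{i}^{\prime}-\overline{Y}^{\prime})^{2} + n_{\text{EXT}}(\overline{Y}-\overline{Y}^{\prime})^{2}} \quad (2)\]
\(Y_{i}\) is the experimental value, \(\overline{Y}\) is the average of the experimental values, \(Y_{i}^{\prime}\) is the predicted activity and \(\overline{Y}^{\prime}\) is the average of the predicted activities. EXT denotes the external prediction set (test set), and \(n_{\text{EXT}}\) is the number of compounds in it. A model with CCC > 0.8 is considered valid.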
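The following sketch computes CCC in its standard form (Lin's concordance correlation coefficient) for a test set. The function name and the numerical values are illustrative assumptions, not data from this study.

```python
from math import fsum


def ccc(experimental, predicted):
    """Concordance correlation coefficient between two equal-length lists."""
    n = len(experimental)
    mean_exp = fsum(experimental) / n
    mean_pred = fsum(predicted) / n
    cov = fsum((e - mean_exp) * (p - mean_pred)
               for e, p in zip(experimental, predicted))
    ss_exp = fsum((e - mean_exp) ** 2 for e in experimental)
    ss_pred = fsum((p - mean_pred) ** 2 for p in predicted)
    # The last term penalizes a systematic shift between the two means.
    return 2.0 * cov / (ss_exp + ss_pred + n * (mean_exp - mean_pred) ** 2)


# Predictions shifted by a constant: r^2 is 1, but CCC drops to 4/7.
print(ccc([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))
```

The shifted example illustrates why CCC is stricter than r^{2}: a perfectly correlated but systematically biased prediction is still penalized.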
Statistical significance of the difference between deviations of experimental and calculated data
In 2014, our research group questioned the use of regression through origin and proposed calculating the model errors for the training and test sets and comparing them as a reliable method for external validation of QSAR models [5].
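The comparison of errors can be sketched as follows. This is an illustration under stated assumptions rather than the exact procedure of [5]: the function names are hypothetical, the values are made up, and the independent t-test is assumed to use a pooled (equal) variance. Only the t statistic and the degrees of freedom are computed; the p-value would be read from the t distribution in a statistics package.

```python
from math import fsum, sqrt


def absolute_errors(experimental, calculated):
    """Absolute error (AE) of each datum."""
    return [abs(e - c) for e, c in zip(experimental, calculated)]


def pooled_t_statistic(a, b):
    """Student t statistic and degrees of freedom for two independent samples."""
    na, nb = len(a), len(b)
    mean_a, mean_b = fsum(a) / na, fsum(b) / nb
    var_a = fsum((v - mean_a) ** 2 for v in a) / (na - 1)
    var_b = fsum((v - mean_b) ** 2 for v in b) / (nb - 1)
    df = na + nb - 2
    pooled_var = ((na - 1) * var_a + (nb - 1) * var_b) / df
    t = (mean_a - mean_b) / sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return t, df


# Made-up absolute errors of a training set and a test set.
ae_train = [0.1, 0.2, 0.3]
ae_test = [0.4, 0.5, 0.6]
t, df = pooled_t_statistic(ae_train, ae_test)
print(round(t, 3), df)
```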
Criteria based on training set range and the deviation between experimental and calculated data
Similar to our method (method 4), Roy and coworkers [21] proposed new principles based on the training set range and the absolute average error (AAE), i.e., the mean absolute difference between the experimental and predicted values of the test set, and the corresponding standard deviation (SD) for the training and test sets, as follows:
Good prediction: AAE ≤ 0.1 × training set range and AAE + 3 × SD ≤ 0.2 × training set range
Bad prediction: AAE > 0.15 × training set range or AAE + 3 × SD > 0.25 × training set range
A good model should pass both of the criteria for good prediction. However, predictions that satisfy only one of the conditions can be considered moderately acceptable.
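The thresholds above translate directly into a small classifier. In this sketch the function name and the example numbers are hypothetical, and any model that is neither GOOD nor BAD is assumed to be MODERATE.

```python
def classify_prediction(aae, sd, training_range):
    """Classify test-set predictions from AAE, SD and the training set range."""
    upper = aae + 3.0 * sd
    if aae <= 0.1 * training_range and upper <= 0.2 * training_range:
        return "GOOD"
    if aae > 0.15 * training_range or upper > 0.25 * training_range:
        return "BAD"
    # Assumption: everything between the two sets of thresholds is moderate.
    return "MODERATE"


# Made-up values with a training set range of 4 log units.
print(classify_prediction(0.2, 0.1, 4.0))
print(classify_prediction(0.5, 0.1, 4.0))
print(classify_prediction(0.7, 0.1, 4.0))
```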
Results and discussion
Table 1 lists the numerical values of the statistical parameters needed to evaluate the above criteria for external validation of the 44 QSAR models.
A major issue in the statistical validation of QSAR models is that different equations are used even for simple parameters such as r^{2} and r_{0}^{2} [22, 50]. These different equations affect the comparison. In this work, r^{2} was calculated with SPSS software based on the correlation between experimental and calculated values. However, for the criteria studied here, the calculation of r_{0}^{2} is controversial. The following equations were applied to calculate r_{0}^{2} and r_{0}^{′2} in methods 1 and 2 and in Excel software [21]:
Because of statistical defects in Eqs. 3 and 4 for calculating r^{2} of RTO, an alternative formula, recommended by statistics textbooks [51, 52], was proposed [5, 22]:
In addition to the statistical defects in Eqs. (3) and (4) for the calculation of r_{0}^{2} and r_{0}^{′2}, QSAR researchers may apply Eq. (5), which was proposed as an appropriate equation for r_{0}^{2} and is used by standard statistical packages such as SPSS, but it does not give reasonable results for these criteria. Calculating \(r_{m}^{2}\) from \(r_{0}^{2}\) computed by Eq. (5) (or SPSS software) is not possible because r^{2} is commonly less than \(r_{0}^{2}\), and therefore \(r^{2} - r_{0}^{2} < 0\). This is the main defect of methods 1 and 2 for the external validation of QSAR models.
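The disagreement can be reproduced with a short sketch. The two forms used here are assumptions about the equations discussed above: the "centered" form is 1 − SS_res/SS_tot with the total sum of squares taken about the mean, and the "uncentered" form is the ratio of the sum of squared fitted values to the sum of squared observed values, which is the form usually given for regression through the origin in statistics textbooks. Function names and data are hypothetical.

```python
from math import fsum


def rto_fitted(x, y):
    """Fitted values of the least-squares line y = k * x (no constant term)."""
    k = fsum(a * b for a, b in zip(x, y)) / fsum(a * a for a in x)
    return [k * a for a in x]


def r0_squared_centered(x, y):
    """Assumed form 1: 1 - SS_res / sum((y - mean(y))^2)."""
    fitted = rto_fitted(x, y)
    mean_y = fsum(y) / len(y)
    ss_res = fsum((b - f) ** 2 for b, f in zip(y, fitted))
    return 1.0 - ss_res / fsum((b - mean_y) ** 2 for b in y)


def r0_squared_uncentered(x, y):
    """Assumed form 2: sum(fitted^2) / sum(y^2)."""
    fitted = rto_fitted(x, y)
    return fsum(f * f for f in fitted) / fsum(b * b for b in y)


# Made-up test set; r^2 between the two lists is about 0.989.
experimental = [5.0, 6.0, 7.0, 8.0, 9.0]
predicted = [5.1, 5.9, 7.2, 7.8, 9.1]
print(r0_squared_centered(predicted, experimental))
print(r0_squared_uncentered(predicted, experimental))
```

For these values the uncentered form exceeds r^{2}, so the difference r^{2} − r_{0}^{2} becomes negative, which is the situation described above.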
Seven of the studied models have r^{2} < 0.6 (Table 2) and therefore cannot be regarded as valid models. r^{2} is a simple parameter for evaluating the correlation between experimental and predicted values in QSAR studies and for estimating the correlation between concentration and response in analytical chemistry. It is a primary criterion, but a QSAR model or an analytical method with a high r^{2} value does not necessarily have acceptable validity [53, 54]. In addition, squared factors such as r^{2} make it harder to distinguish the direction of the errors, i.e., overpredicted or underpredicted values; these two kinds of errors have very different consequences in toxicity and regulatory evaluation.
The numerical values of the other criteria in method 1 show that all models have K or K' between 0.85 and 1.15. The third rule (\(\frac{r^{2}-r_{0}^{2}}{r^{2}} < 0.1\) or \(\frac{r^{2}-r_{0}^{\prime 2}}{r^{2}} < 0.1\)) is not satisfied by only 7 models, 3 of which have r^{2} < 0.6. Therefore, based on the principles suggested in method 1, 11 models are not valid.
Method 2 is based on RTO, with r_{0}^{2} calculated by Eq. (3). Twenty-six models have \(r_{m}^{2}\) > 0.5, and the results are similar to those of method 1 (both are based on RTO). The valid models based on method 1 with r^{2} > 0.75 have \(r_{m}^{2}\) > 0.5, except model 27 with r_{0}^{2} = 0.101 (close to the threshold, 0.1).
The third method, CCC, was proposed by Gramatica [4]. Twenty-nine models have CCC > 0.8, and all of them are valid based on method 1. The results of methods 2 and 3 are very similar. Only two models (20 and 27) have CCC > 0.8 while their \(r_{m}^{2}\) values are near the threshold, i.e., 0.4 < \(r_{m}^{2}\) < 0.5. Method 3 is comparable to the methods based on RTO. However, it does not suffer from the statistical defects or from the inconsistent values of r_{0}^{2} that arise from the different proposed equations (Eqs. (3) and (4) or Eq. (5)) or software (e.g., Excel or SPSS).
Method 4 calculates the model errors for the training and test sets and compares them as a possible reliable method for external validation of models with r^{2} > 0.6 for the test set. The aim of developing a QSAR model is prediction and elucidation of the mechanisms of drug action. The prediction capability for the training and test sets should therefore be comparable. If the training set is not considered, the statistical parameters for external validation of the test set may be acceptable, yet a significant difference (independent t-test) between the prediction power for the training and test sets may indicate a weakness of the model. Twenty-six models have r^{2} > 0.6 and no significant difference between the absolute errors (AE) of the training and test sets (p > 0.05). Twenty-three of them are also selected by CCC as valid models (CCC > 0.8 and p > 0.05). Model 16 has CCC = 0.55, and the AAEs of its training and test sets are 0.412 ± 0.352 and 0.645 ± 0.489 (p = 0.16), respectively. The high SD values, caused by outlier data, are the likely reason for the nonsignificant difference between the AEs, so this result cannot establish the validity of the developed model. On the other hand, models 5, 24 and 25 have CCC > 0.9 and p < 0.01. The relative frequencies of AEs for models 5, 24 and 25 were sorted into three subgroups (< 0.1, 0.1–0.2 and > 0.2) and are illustrated in Figure 1. In these models, AAE values are low; however, there is a 50–250% difference between the AAEs of the training and test sets. For example, in model 5, 48% of the training set and 10% of the test set have AE less than 0.1, while 15% of the training set and 60% of the test set have AE more than 0.2. Similar patterns are observed in models 24 and 25. In addition, residual plots for these models are illustrated in Figure 2.
These plots confirm that there is a significant difference between the prediction capability of the developed models for the training and test sets, which is not acceptable for a QSAR model intended for prediction.
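The grouping of absolute errors into three subgroups can be sketched as follows. The function name, the dictionary keys and the error values are hypothetical; only the cut-offs 0.1 and 0.2 come from the text.

```python
def ae_relative_frequencies(abs_errors, low=0.1, high=0.2):
    """Fraction of absolute errors below `low`, between the cut-offs, and above `high`."""
    n = len(abs_errors)
    below = sum(1 for e in abs_errors if e < low)
    above = sum(1 for e in abs_errors if e > high)
    between = n - below - above
    return {"below": below / n, "between": between / n, "above": above / n}


# Made-up absolute errors for a training set and a test set.
train = ae_relative_frequencies([0.02, 0.05, 0.08, 0.15, 0.30])
test = ae_relative_frequencies([0.05, 0.15, 0.25, 0.30])
print(train)
print(test)
```

Comparing the two dictionaries side by side shows whether the test set errors are shifted toward the upper subgroup, as observed for models 5, 24 and 25.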
The last method (method 5) was proposed by Roy's research group and is based on the training set range and the mean and standard deviation of the test set data [21]. Models are classified as GOOD, MODERATELY GOOD or BAD according to the proposed parameters. Most of the models were categorized as BAD (45%) or GOOD (39%), and a few were MODERATELY GOOD (Table 2). The first point to consider is r^{2} > 0.6 as a necessary criterion: all models with r^{2} < 0.6 were classified as BAD. Moreover, a good correlation is observed between CCC and the GOOD classification of method 5. However, model 11 is a GOOD model although CCC = 0.75 and there is a significant difference between the AEs of the training and test sets (the AAEs of the training and test sets are 0.05 and 0.13, respectively, and p = 0.01). In comparison with method 4, models 5, 24 and 25 (GOOD models) show a large difference between the AAEs of the training and test sets (Figure 1), which the principles of method 5 could not detect. A model with a statistically significant difference between the AEs of the training and test sets may not be a valid model.
Furthermore, model 3 is a BAD model although CCC = 0.84 and the p-value for the difference between the AEs of the training and test sets is 0.18. The AAEs of the training and test sets are 0.167 ± 0.171 and 0.266 ± 0.244 (AAE ± SD), respectively. The high SD values for the training and test sets indicate outlier data, which could be taken into account through statistical parameters such as the SD of the mean errors in the external validation of QSAR models.
Typographic errors, non-uniformity of the data set used for QSAR modeling or mistakes in the determination of the biological activity of the studied compounds are common reasons for outlier data, which can decrease the prediction capability of a model. Docking studies of outlier cases and comparison with other compounds can help researchers detect outlier data when developing a QSAR model [55].
These results confirm previous studies, which recommend more than a single criterion to assess the real external predictivity of QSAR models [56]. Moreover, other recommended guidelines for developing QSAR models, such as cross-validation, appropriate splitting of training and test sets, variable allocation and correlation coefficients adjusted by degrees of freedom, are important issues that researchers should consider [10, 57,58,59]. In addition, cross (internal) validation analyses, e.g., leave-many-out and leave-one-out, are recommended in QSAR studies, especially when the sample size is small [9, 10], and some reports showed their superiority over external validation [60]. Therefore, both internal and external validation analyses, considering various criteria, are necessary to check the validity of a QSAR model.
Conclusion
The aim of developing a QSAR model is acceptable prediction of the activity of a compound before synthesis and biological evaluation. Therefore, external validation is necessary. All of the methods developed for external validation of a QSAR model are useful, and a good correlation was observed between the studied methods for the selected models. However, some differences were detected between the established methods. Methods 1 and 2 are valuable, but there are some questionable points in the equation applied for \(r_{0}^{2}\) calculation. CCC is a valuable parameter, though in some cases it cannot detect outlier data. As in methods 1 and 2, the training data set is not included in CCC. Methods 4 and 5 are based on both training and test sets. They detected most invalid models, but method 5 classified some models as GOOD even though the difference between the AEs of the training and test sets is substantial (p < 0.05). On the other hand, a model with high SD values in both the training and test sets may pass the criterion of method 4 while being invalid because of outlier data in both sets. Finally, evaluating a model with any of the established methods is useful, but none of them necessarily establishes the validity or invalidity of a QSAR model. The results of this study show the importance of calculating the errors of the training and test sets and detecting outliers when checking the validity of a model.
Availability of data and materials
All data are available as supplementary material.
Abbreviations
QSAR: Quantitative structure–activity relationship
AE: Absolute error
RTO: Regression through origin
CCC: Concordance correlation coefficient
AAE: Absolute average error
SD: Standard deviation
References:
1. Norouzi S, Farahani M, Nejad Ebrahimi S. The Integration of pharmacophore-based 3D-QSAR modeling and virtual screening in identification of natural product inhibitors against SARS-CoV-2. Pharm Sci. 2021;27:S94–108.
2. Dearden JC. Whither QSAR? Pharm Sci. 2017;23(2):82–3.
3. Cherkasov A, Muratov EN, Fourches D, Varnek A, Baskin II, Cronin M, Dearden J, Gramatica P, Martin YC, Todeschini R, et al. QSAR modeling: Where have you been? Where are you going to? J Med Chem. 2014;57(12):4977–5010.
4. Chirico N, Gramatica P. Real external predictivity of QSAR models: How to evaluate it? Comparison of different validation criteria and proposal of using the concordance correlation coefficient. J Chem Inf Model. 2011;51(9):2320–35.
5. Shayanfar A, Shayanfar S. Is regression through origin useful in external validation of QSAR models? Eur J Pharm Sci. 2014;59(1):31–5.
6. Gramatica P, Cassani S, Roy PP, Kovarich S, Yap CW, Papa E. QSAR modeling is not “push a button and find a correlation”: a case study of toxicity of (Benzo)triazoles on Algae. Mol Informatics. 2012;31(11–12):817–35.
7. Veselinović JB, Veselinović AM, Toropova AP, Toropov AA. The Monte Carlo technique as a tool to predict LOAEL. Eur J Med Chem. 2016;116:71–5.
8. Zivkovic M, Zlatanovic M, Zlatanovic N, Golubović M, Veselinović AM. The application of the combination of Monte Carlo optimization method based QSAR modeling and molecular docking in drug design and development. Mini-Rev Med Chem. 2020;20(14):1389–402.
9. Hawkins DM, Basak SC, Mills D. Assessing model fit by cross-validation. J Chem Inf Comput Sci. 2003;43(2):579–86.
10. Gütlein M, Helma C, Karwath A, Kramer S. A large-scale empirical evaluation of cross-validation and external test set validation in (Q)SAR. Mol Informatics. 2013;32(5–6):516–28.
11. Filzmoser P, Liebmann B, Varmuza K. Repeated double cross validation. J Chemometr. 2009;23(4):160–71.
12. Esbensen KH, Geladi P. Principles of proper validation: use and abuse of re-sampling for validation. J Chemometr. 2010;24(3–4):168–87.
13. Gramatica P. External evaluation of QSAR models, in addition to cross-validation: verification of predictive capability on totally new chemicals. Mol Informatics. 2014;33(4):311–4.
14. Muratov EN, Bajorath J, Sheridan RP, Tetko IV, Filimonov D, Poroikov V, Oprea TI, Baskin II, Varnek A, Roitberg A, et al. QSAR without borders. Chem Soc Rev. 2020;49(11):3525–64.
15. Đorđević V, Pešić S, Živković J, Nikolić GM, Veselinović AM. Development of novel antipsychotic agents by inhibiting dopamine transporter: in silico approach. New J Chem. 2022;46(6):2687–96.
16. Perić V, Golubović M, Lazarević M, Marjanović V, Kostić T, Đorđević M, Milić D, Veselinović AM. Development of potential therapeutics for pain treatment by inducing Sigma 1 receptor antagonism: in silico approach. New J Chem. 2021;45(27):12286–95.
17. Živković JV, Trutić NV, Veselinović JB, Nikolić GM, Veselinović AM. Monte Carlo method based QSAR modeling of maleimide derivatives as glycogen synthase kinase-3β inhibitors. Comput Biol Med. 2015;64:276–82.
18. Hamzeh-Mivehroud M, Khoshravan-Azar Z, Dastmalchi S. QSAR and molecular docking studies on non-imidazole-based histamine H3 receptor antagonists. Pharm Sci. 2020;26(2):165–74.
19. Golbraikh A, Tropsha A. Beware of q2! J Mol Graph Model. 2002;20(4):269–76.
20. Roy PP, Roy K. On some aspects of variable selection for partial least squares regression models. QSAR Comb Sci. 2008;27(3):302–13.
21. Roy K, Das RN, Ambure P, Aher RB. Be aware of error measures. Further studies on validation of predictive QSAR models. Chemometr Intell Lab Syst. 2016;152:18–33.
22. Eisenhauer JG. Regression through the origin. Teach Stat. 2003;25(3):76–80.
23. Zhang X, Zhang H. 3D-QSAR studies on 1,2,4-triazolyl 5-azaspiro [2.4]heptanes as D3R antagonists. Chem Phys Lett. 2018;704:11–20.
24. Patil RB, Barbosa EG, Sangshetti JN, Sawant SD, Zambre VP. LQTA-R: A new 3D-QSAR methodology applied to a set of DGAT1 inhibitors. Comput Biol Chem. 2018;74:123–31.
25. Aouidate A, Ghaleb A, Ghamali M, Ousaa A, Choukrad M, Sbai A, Bouachrine M, Lakhlifi T. 3D QSAR studies, molecular docking and ADMET evaluation, using thiazolidine derivatives as template to obtain new inhibitors of PIM1 kinase. Comput Biol Chem. 2018;74:201–11.
26. Gao J, Sun J, Wang T, Sheng S, Huang T. Combined 3D-QSAR modeling and molecular docking study on spiro-derivatives as inhibitors of acetyl-CoA carboxylase. Med Chem Res. 2017;26(2):361–71.
27. Arthur DE, Uzairu A, Mamza P, Abechi SE, Shallangwa G. Activity and toxicity modelling of some NCI selected compounds against leukemia P388ADR cell line using genetic algorithm-multiple linear regressions. J King Saud Univ Sci. 2020;32(1):324–31.
28. González MP, Teran Moldes MDC, Fall Y, Dias LC, Helguera AM. A topological sub-structural approach to the mutagenic activity in dental monomers. 3. Heterogeneous set of compounds. Polymer. 2005;46(8):2783–90.
29. Xu F, Yang ZZ, Ke ZL, Xi LM, Yan QD, Yang WQ, Zhu LQ, Lin FL, Lv WK, Wu HG, et al. Synthesis, antitumor evaluation and 3D-QSAR studies of [1,2,4]triazolo[4,3-b][1,2,4,5]tetrazine derivatives. Bioorg Med Chem Lett. 2016;26(19):4580–6.
30. Ugale VG, Patel HM, Surana SJ. Molecular modeling studies of quinoline derivatives as VEGFR-2 tyrosine kinase inhibitors using pharmacophore based 3D QSAR and docking approach. Arab J Chem. 2017;10:S1980–2003.
31. Arthur DE, Uzairu A, Mamza P, Abechi SE, Shallangwa G. Insilco study on the toxicity of anti-cancer compounds tested against MOLT-4 and p388 cell lines using GA-MLR technique. Beni-Suef Univ J Basic Appl Sci. 2016;5(4):320–33.
32. Bhatia MS, Pakhare KD, Choudhari PB, Jadhav SD, Dhavale RP, Bhatia NM. Pharmacophore modeling and 3D QSAR studies of aryl amine derivatives as potential lumazine synthase inhibitors. Arab J Chem. 2017;10:S100–4.
33. Aouidate A, Ghaleb A, Ghamali M, Chtita S, Ousaa A, Choukrad M, Sbai A, Bouachrine M, Lakhlifi T. QSAR study and rustic ligand-based virtual screening in a search for aminooxadiazole derivatives as PIM1 inhibitors. Chem Cent J. 2018;12:32.
34. Sharma MC, Jain S, Sharma R. Trifluorophenyl-based inhibitors of dipeptidyl peptidase-IV as antidiabetic agents: 3D-QSAR COMFA, CoMSIA methodologies. Netw Model Anal Health Inform Bioinform. 2018;7:1.
35. Tong J, Lei S, Qin S, Wang Y. QSAR studies of TIBO derivatives as HIV-1 reverse transcriptase inhibitors using HQSAR, CoMFA and CoMSIA. J Mol Struct. 2018;1168:56–64.
36. Liu G, Wang W, Wan Y, Ju X, Gu S. Application of 3D-QSAR, pharmacophore, and molecular docking in the molecular design of diarylpyrimidine derivatives as HIV-1 nonnucleoside reverse transcriptase inhibitors. Int J Mol Sci. 2018;19(5):1436.
37. Behgozin SM, Fatemi MH. 3D-QSAR modeling of maximum steady-state fluxes of some substituted benzenes and quinolone derivatives through polydimethylsiloxane membrane. J Iran Chem Soc. 2018;15(6):1293–300.
38. Kaczor AA, Żuk J, Matosiuk D. Comparative molecular field analysis and molecular dynamics studies of the dopamine D2 receptor antagonists without a protonatable nitrogen atom. Med Chem Res. 2018;27(4):1149–66.
39. Wang ZZ, Ma CY, Yang J, Gao QB, Sun XD, Ding L, Liu HM. Investigating the binding mechanism of (4-Cyanophenyl)glycine derivatives as reversible LSD1 by 3D-QSAR, molecular docking and molecular dynamics simulations. J Mol Struct. 2019;1175:698–707.
40. Singh U, Gangwal RP, Dhoke GV, Prajapati R, Damre M, Sangamwar AT. 3D-QSAR and molecular docking analysis of (4-piperidinyl)piperazines as acetyl-CoA carboxylases inhibitors. Arab J Chem. 2017;10:S617–26.
41. Türkmenoğlu B, Güzel Y. Molecular docking and 4D-QSAR studies of metastatic cancer inhibitor thiazoles. Comput Biol Chem. 2018;76:327–37.
42. ChunZhi H, ShuWei X, Hu W, Jun X, Liangmin Y. Using 3D-QSAR and molecular docking insight into inhibitors binding with complex-associated kinases CDK8. J Mol Struct. 2018;1173:498–511.
43. Ajay Kumar TV, Athavan AAS, Loganathan C, Saravanan K, Kabilan S, Parthasathy V. Design, 3D QSAR modeling and docking of TGF-β type I inhibitors to target cancer. Comput Biol Chem. 2018;76:232–44.
44. Ounissi M, Kameli A, Tigrine C, Rachedi FZ. Computer-aided identification of natural lead compounds as cyclooxygenase-2 inhibitors using virtual screening and molecular dynamic simulation. Comput Biol Chem. 2018;77:1–16.
45. Ghasemi JB, Davoudian V. 3D-QSAR and docking studies of a series of β-carboline derivatives as antitumor agents of PLK1. J Chem. 2014;2014:10.
46. Zheng J, Kong H, Wilson JM, Guo J, Chang Y, Yang M, Xiao G, Sun P. Insight into the interactions between novel isoquinolin-1,3-dione derivatives and cyclin-dependent kinase 4 combining QSAR and molecular docking. PLoS ONE. 2014;9(4):e93704.
47. Li Y, Ning J, Wang Y, Wang C, Sun C, Huo X, Yu Z, Feng L, Zhang B, Tian X, et al. Drug interaction study of flavonoids toward CYP3A4 and their quantitative structure activity relationship (QSAR) analysis for predicting potential effects. Toxicol Lett. 2018;294:27–36.
48. Hao M, Ren H, Luo F, Zhang S, Qiu J, Ji M, Si H, Li G. A computational study on thiourea analogs as potent MK2 inhibitors. Int J Mol Sci. 2012;13(6):7057–79.
49. Ojha PK, Mitra I, Das RN, Roy K. Further exploring rm2 metrics for validation of QSPR models. Chemometr Intell Lab Syst. 2011;107(1):194–205.
50. Avdeef A. Do you know your r2? ADMET DMPK. 2021;9(1):69–74.
51. Chatterjee S, Hadi AS. Regression analysis by example. 4th ed. Hoboken: Wiley; 2006.
52. Hulsizer MR, Woolf LM. A guide to teaching statistics: innovations and best practices. Oxford: Wiley; 2009.
53. Kaneko H. Beware of r2 even for test datasets: using the latest measured y-values (r2 LM) in time series data analysis. J Chemometr. 2019;33(2):e3093.
54. Shayanfar A, Ershadi S. Developing new criteria for validity evaluation of analytical methods. J AOAC Int. 2019;102(6):1908–16.
55. Ghandadi M, Shayanfar A, Hamzeh-Mivehroud M, Jouyban A. Quantitative structure activity relationship and docking studies of imidazole-based derivatives as P-glycoprotein inhibitors. Med Chem Res. 2014;23(11):4700–12.
56. Gramatica P, Sangion A. A historical excursus on the statistical validation parameters for QSAR models: a clarification concerning metrics and terminology. J Chem Inf Model. 2016;56(6):1127–31.
57. Rácz A, Bajusz D, Héberger K. Consistency of QSAR models: Correct split of training and test sets, ranking of models and performance parameters. SAR QSAR Environ Res. 2015;26(7–9):683–700.
58. Tóth G, Király P, Kovács D. Effect of variable allocation on validation and optimality parameters and on cross-optimization perspectives. Chemometr Intell Lab Syst. 2020;204:104106.
59. Dearden JC, Cronin MTD, Kaiser KLE. How not to develop a quantitative structure-activity or structure-property relationship (QSAR/QSPR). SAR QSAR Environ Res. 2009;20(3–4):241–66.
60. Majumdar S, Basak SC. Beware of external validation! A comparative study of several validation techniques used in QSAR modelling. Curr Comput-Aided Drug Des. 2018;14(4):284–91.
Acknowledgements
Not applicable.
Funding
The authors would like to thank Tabriz University of Medical Sciences for the financial support (65369) of the project.
Author information
Contributions
SS and AS performed data collection, analysis and manuscript writing. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Additional file 1.
Forty-four data sets (training and test sets) composed of experimental biological activity and corresponding calculated activity.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Shayanfar, S., Shayanfar, A. Comparison of various methods for validity evaluation of QSAR models. BMC Chemistry 16, 63 (2022). https://doi.org/10.1186/s13065-022-00856-4
Keywords
 Biological activity
 External validation
 QSAR
 Statistical parameters