QSAR study on the removal efficiency of organic pollutants in supercritical water based on degradation temperature

This paper develops temperature-dependent quantitative structure-activity relationship (QSAR) models of the supercritical water oxidation (SCWO) process, based on the Arrhenius relation between oxidation reaction rate and temperature. The kinetic rate constants of 21 organic substances, including azo dyes, heterocyclic compounds and ionic compounds, were studied in the SCWO process. We propose the concept of T_R95, defined as the temperature at which the removal ratio reaches 95%, as a key indicator of a compound's complete oxidation. Quantum chemical parameters were calculated for each organic compound using Gaussian 09 and Materials Studio 7.0. The optimum model is T_R95 = 654.775 + 1761.910 f(+)_n − 177.211 qH, with squared regression coefficient R² = 0.620 and standard error SE = 35.1. The model predicted T_R95 well for nearly all the compounds. The effective QSAR models reveal three determinant factors that are directly related to the degradation rules. Specifically, the lowest f(+) value of the main-chain atoms, f(+)_n, indicates the degree of affinity for nucleophilic attack; qH shows the ease or difficulty of valence-bond breakage in the organic molecules; and BO_x refers to the stability of a bond. The degradation mechanism can be reasonably illustrated from each of these perspectives, providing deeper insight into universal and generalizable oxidation rules. In addition, the satisfactory results of internal and external validation support the stability, reliability and predictive ability of the optimum model.


Introduction
With the continued development of industry, a variety of organic pollutants are released into the environment through different routes, and these are potentially harmful to human health and the environment [1,2]. Because of the complexity of the pollutants and the difficulty of destroying them, conventional treatments can hardly remove these organic compounds. Advanced oxidation processes (AOPs) have proven particularly effective and fast for treating a wide variety of organic wastewaters [3-6]. Supercritical water oxidation (SCWO), one of the AOPs, is regarded as an effective method for degrading such substances because of its higher efficiency, faster reaction rate and lower selectivity [7,8].
Quantitative structure-activity relationship (QSAR) models are rapid and cost-effective alternatives for predicting data theoretically by relating molecular structure to physicochemical properties [9,10]. Several researchers have applied QSAR models to evaluate the eco-toxicity of chemicals without experimental testing [11-13]. To date, a number of studies have investigated the removal of organic pollutants in SCWO systems, focusing mainly on two areas. One is the industrial application of SCWO technology [14,15]. The other is the relationship between reaction conditions and degradation efficiency [16,17]. Compared with factors such as pressure and residence time, temperature has been deemed to play a controlling role, as reported by Crain et al. [18]. More importantly, each type of treated pollutant has its own appropriate temperature, which is a key indicator when designing and running an SCWO system. However, there are few studies of theoretical models that offer rapid predictions of the effective temperature and thereby overcome the limitations of repeated experiments, such as high operational cost and expensive materials [8,19,20]. Therefore, considering the rigorous requirements of the reaction system, a convenient and efficient QSAR study is both valuable and necessary. Such a model is significant for both industrial application and theoretical prediction.
Our emphasis is to identify a common rule applicable to SCWO systems. The impact of Fukui indices and effective temperature on the oxidation process was prioritized in the QSAR analysis. First, the kinetics of diverse compounds were explored experimentally. Then, temperature-dependent QSAR models were developed using multiple linear regression. Finally, validations were performed to verify that the optimal model makes robust predictions.

Reaction system
The experiments were conducted in a supercritical flow reactor (SFR) system that had been used in previous studies in our laboratory [21]. Its major parts were a high-pressure plunger pump, hydrogen peroxide tank, wastewater tank, gas release valve, check valve, thermometer, pressure gauge, heat exchanger, heater and reactor, temperature recording controller, condenser, back-pressure regulator and effluent tank. The construction of the SFR is shown in Fig. 1. It was designed for an operating temperature of 773.15 K and an operating pressure of 30 MPa.
To study the influence of temperature, the thermolysis and oxidation experiments for all compounds were performed under isoconcentration (1 g L⁻¹) and isobaric (24 MPa) conditions. The reaction system was supplied with sufficient residence time (100-150 s) and oxygen (500% excess). The total organic carbon (TOC) content of the samples was monitored using a TOC analyzer (TOC-V CPN, Shimadzu Corporation, Japan). Hydrogen peroxide (30 wt%) was used as the oxidant in the SCWO experiments, and all reagents were of analytical grade.

Arrhenius equation in SCWO system
Temperature is particularly vital under supercritical reaction conditions. Orthogonal experimental studies have confirmed the significance of temperature for the destruction of organic structures. The Arrhenius equation is a simple and remarkably accurate formula for the temperature dependence of the reaction rate constant k, which can be expressed as follows.
$$k = A\,e^{-E_a/(RT)}$$
where A is the pre-exponential factor, E_a is the activation energy, T is the absolute temperature and R is the gas constant. The units of A are identical to those of the rate constant k and vary with the order of the reaction. It can be seen that either increasing the temperature T or decreasing the activation energy E_a (for example through the use of catalysts) will increase the rate of reaction. When oxygen is in excess, the degradation process in the SCWO system follows the pseudo-first-order kinetic equation.
In short, the Arrhenius equation gives a reliable and applicable relation between ln k of the oxidation reactions and the absolute temperature T. Based on existing research on the relationship between ln k and quantum molecular parameters, the function can be assumed to take the form of Eq. (3) [22,23]. It is therefore reasonable to develop a temperature-dependent QSAR to predict oxidation efficiency from theoretical descriptors.
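As a minimal sketch of how the Arrhenius relation links a target removal ratio to a temperature, the example below assumes ideal pseudo-first-order decay, ln(C0/C) = k·t, at a fixed residence time t. The function names and the example values of A and E_a are hypothetical and are not taken from this study.

```python
import math

R_GAS = 8.314  # gas constant, J mol^-1 K^-1


def rate_constant(A, Ea, T):
    """Arrhenius rate constant k = A * exp(-Ea / (R * T))."""
    return A * math.exp(-Ea / (R_GAS * T))


def removal_ratio(A, Ea, T, t):
    """Removal ratio after residence time t, assuming pseudo-first-order decay."""
    return 1.0 - math.exp(-rate_constant(A, Ea, T) * t)


def temperature_for_removal(A, Ea, t, ratio=0.95):
    """Temperature at which the removal ratio reaches `ratio` for residence time t."""
    k_needed = -math.log(1.0 - ratio) / t  # rate constant required for the target
    if A <= k_needed:
        raise ValueError("target removal is unreachable: A must exceed the required k")
    # Invert k = A * exp(-Ea / (R * T)) for T.
    return Ea / (R_GAS * math.log(A / k_needed))


# Hypothetical kinetic parameters (A in s^-1, Ea in J mol^-1, t in s).
A_demo, Ea_demo, t_demo = 1.0e3, 6.0e4, 120.0
T95 = temperature_for_removal(A_demo, Ea_demo, t_demo)
print(f"T at 95% removal: {T95:.1f} K")
print(f"check removal ratio: {removal_ratio(A_demo, Ea_demo, T95, t_demo):.3f}")
```

Under these assumptions a compound with a larger E_a needs a higher temperature to reach the same removal ratio, which is the motivation for treating a characteristic temperature as the modeled response.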

Computation details
All calculations were carried out using density functional theory (DFT) methods in Gaussian 09 (B3LYP/6-311G level) and Materials Studio 7.0 (DMol3, GGA-BLYP/DNP(3.5) basis) [24]. Structure optimization and the total energy calculations of the optimized geometries were based on the B3LYP method; exchange and correlation terms were treated with the B3LYP functional and the 6-311G basis set. Natural population analysis (NPA) atomic charges were obtained by the same method. The localized double numerical basis sets with polarization functions (DNP) from the DMol3 software were adopted to expand the Kohn-Sham orbitals. The self-consistent field procedure was carried out with a convergence criterion of 10⁻⁶ a.u. on energy and electron density. Density mixing was set at 0.2 for charge and 0.5 for spin. The smearing of electronic occupations was set to 0.005 Ha. The molecular parameters of each organic compound are listed in Table 1. They include the energies of the frontier molecular orbitals (E_LUMO/E_HOMO), bond order (BO), Fukui indices [f(+), f(−) and f(0)] and so on, and are introduced in detail in the "Optimization" section.
To obtain the optimum number of variables for the correlation model, a stepwise regression procedure was used to build the QSAR models in SPSS 17.0 for Windows. The quality of each derived QSAR was evaluated by the squared regression coefficient (R²), the standard error (SE), the t test and the Fisher test. Internal validation was performed by leave-one-out cross-validation (q²), and external validation was also computed (Q²_EXT). In both validation methods, a value greater than 0.5 indicates a robust and predictive model.
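The fitting and internal-validation statistics can be illustrated with a short, dependency-free sketch. It assumes ordinary least squares solved through the normal equations and the common definition q² = 1 − PRESS/SS_tot; the study itself used SPSS, so this is not the authors' procedure. All descriptor and response values below are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def fit_mlr(X, y):
    """Least-squares coefficients [b0, b1, ...] for y = b0 + b1*x1 + ..."""
    rows = [[1.0] + list(x) for x in X]
    p = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear(XtX, Xty)


def predict(coef, x):
    """Predicted response for one descriptor vector x."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))


def r_squared(X, y):
    """Squared regression coefficient of the model fitted on all samples."""
    coef = fit_mlr(X, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - predict(coef, x)) ** 2 for x, yi in zip(X, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def q2_loo(X, y):
    """Leave-one-out q2: refit without sample i, then predict sample i."""
    mean_y = sum(y) / len(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    press = 0.0
    for i in range(len(y)):
        coef = fit_mlr(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(coef, X[i])) ** 2
    return 1.0 - press / ss_tot


# Hypothetical two-descriptor data set (not values from Table 1).
X_demo = [[0.010, 0.30], [0.020, 0.25], [0.030, 0.40], [0.015, 0.35],
          [0.050, 0.45], [0.040, 0.20], [0.025, 0.50], [0.035, 0.28]]
noise = [5.0, -8.0, 3.0, 10.0, -6.0, 2.0, -4.0, 7.0]
y_demo = [650.0 + 1700.0 * a - 180.0 * b + e for (a, b), e in zip(X_demo, noise)]

print("R2 =", round(r_squared(X_demo, y_demo), 3))
print("q2 =", round(q2_loo(X_demo, y_demo), 3))
```

For least-squares models q² cannot exceed R², so a small gap between the two is a simple sign that the fit does not depend on any single compound.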

Results and discussion
The degradation of 21 organic pollutants was investigated at 24 MPa, from subcritical to supercritical temperatures, with 500% excess oxygen. Samples were taken from 523.15 to 773.15 K. An important design consideration in the development of SCWO is the optimization of the operating temperature. As shown in Fig. 2, the TOC degradation efficiency of the compounds tends to increase with operating temperature. When the temperature reached 773.15 K, most organics could be completely oxidized to water and carbon dioxide. A compound is considered completely removed once its degradation efficiency reaches 95%. Consequently, we propose the concept of T_R95, defined as the temperature at which the removal ratio reaches 95%, as the key indicator of a compound's complete oxidation. The T_R95 values vary widely, from 540.65 K (methylene blue trihydrate) to 764.26 K (melamine), which indicates that the organic compounds in this study are diverse and complex. Thus, across such diverse molecules, it is important to set up a temperature-dependent QSAR that can predict SCWO thermodynamics and oxidation activity and yield universal rules.
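One simple way to read T_R95 off a measured removal curve is linear interpolation between the two sampling temperatures that bracket 95% removal. The text does not state how T_R95 was extracted from the data, so the method, the function name and the sample values below are all assumptions made for illustration.

```python
def interpolate_tr95(temps, removals, target=0.95):
    """Lowest temperature at which removal first reaches `target`.

    Uses linear interpolation between neighboring samples and returns
    None if the target is never reached in the sampled range.
    """
    pairs = sorted(zip(temps, removals))
    if pairs[0][1] >= target:
        return pairs[0][0]
    for (t0, r0), (t1, r1) in zip(pairs, pairs[1:]):
        if r0 < target <= r1:
            return t0 + (target - r0) * (t1 - t0) / (r1 - r0)
    return None


# Hypothetical TOC removal ratios sampled across the studied temperature range (K).
temps_demo = [523.15, 573.15, 623.15, 673.15, 723.15, 773.15]
removal_demo = [0.40, 0.62, 0.80, 0.93, 0.97, 0.99]

print(interpolate_tr95(temps_demo, removal_demo))
```

A finer sampling grid near the crossing point, or a fitted kinetic curve, would reduce the interpolation error.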

Optimization
The structure optimization of the organic compounds and the calculation of the total energy for the optimized geometries are based on the B3LYP method in Gaussian 09 and the DMol3 code in Materials Studio 7.0. All quantum descriptors are directly available from the output files of the two programs. As shown in Table 1, the following 14 molecular descriptors were obtained for the organics: dipole moment (μ), most positive partial charge on a hydrogen atom (qH), most negative or most positive partial charge on a carbon or nitrogen atom (q(CN)_n/q(CN)_x), energy of the lowest unoccupied molecular orbital (E_LUMO), energy of the highest occupied molecular orbital (E_HOMO), minimum or maximum bond order in the molecule (BO_n/BO_x), and maximum or minimum Fukui indices [f(+)_x/f(+)_n, f(−)_x/f(−)_n and f(0)_x/f(0)_n].

Main theoretical parameters
All organic pollutants and their 14 respective molecular parameters are listed in Table 1. These theoretical parameters help to identify which sites are prone to attack and which bonds are prone to rupture. Fukui indices, frontier molecular orbitals and bond orders are key concepts for describing the decomposition sequence of an organic structure during oxidation.
Fukui indices are defined as the affinity for radical attack. They are significant for analyzing site selectivity among the oxidation paths, since hydrogen substitution by oxidant radicals and addition of oxidant groups to double bonds are the most common events. In this study, f(+)_n, f(−)_n and f(0)_n stand for the minimum values for nucleophilic attack, electrophilic attack and ·OH radical attack, respectively, and f(+)_x, f(−)_x and f(0)_x stand for the corresponding maximum values. The Fukui indices varied widely among the compounds. Moreover, it is noticeable that cyanuric acid and 1-methylimidazole consistently have high values of all Fukui indices.
As stated earlier, NPA calculates atomic charges and orbital populations of molecular wave functions in general atomic orbital basis sets. It is an alternative to conventional Mulliken population analysis that improves numerical stability and describes the charge distribution better. qH is the most positive partial charge on a hydrogen atom in the molecule. q(CN)_n and q(CN)_x refer to the most negative and most positive partial charges, respectively, on a main-chain carbon or nitrogen atom in the molecule. In this study, qH, q(CN)_n and q(CN)_x have average values of 0.355e, −0.498e and 0.295e, respectively. Their maxima reach 0.497e, −0.191e and 0.945e, while their minima are 0.203e, −0.787e and −0.032e, respectively. Notably, the difference between the largest and smallest values of q(CN)_x is 0.977e, a wide range that reflects both the challenge and the value of this study.

Construction of QSAR models
Using the obtained molecular descriptors as variables, correlation models for predicting T_R95 were developed by the multiple linear regression (MLR) method. Three of the 14 descriptors, f(+)_n, qH and BO_x, each correlated well with T_R95. After excluding the least important parameters, the relationship for the degradation of the organic pollutants was established by MLR analysis. Three effective models and their associated statistics are shown in Table 2. The T_R95 values predicted (Pred.) by the three QSAR models and the experimental values are listed in Table 3.
It is widely reported that favorable models are generally judged by R² and SE [25,26]. According to the predictive performance shown in Fig. 3 [models (1), (2) and (3)], R² increases with the number of variables. To avoid over-parameterization, the model whose leave-one-out cross-validation q² is closest to its R² was chosen as the breakpoint criterion. Therefore, model (2), with two descriptors, was considered the best one; it also satisfies the criteria for regression (R² = 0.620 > 0.600) and internal validation (q² = 0.570 > 0.500). These statistics indicate that the model is robust and predictive. In addition, Fig. 3 shows that model (2) had the best fit between the predicted and experimental data. The tested T_R95 values follow an almost linear trend for all organic pollutants except methylene blue trihydrate, and the experimental values and those predicted by model (2) are observed to be in good agreement. In this view, it is worthwhile and reasonable to predict degradation rules with model (2).
Model (2), the optimum model, contains two variables, f(+)_n and qH. Each variable plays an important role in the supercritical water oxidation process and reveals the reaction rules. Firstly, f(+)_n is a measure of the affinity for nucleophilic attack. When f(+)_n is larger, the main-chain atom (carbon or nitrogen) is more easily attacked. Thus, compounds with high f(+)_n values, such as isatin and 3,4-dichloroaniline, have weak resistance to oxidants and a relatively low appropriate temperature. Secondly, qH shows the non-uniformity of electric charge on hydrogen, which indicates the ease or difficulty of valence-bond breakage in organic molecules. For example, Eriochrome blue black R has a high qH value (0.497e), leading to its low effective degradation temperature (T_R95 = 575.30 K).
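To make the use of the optimum model concrete, the sketch below evaluates the model (2) equation as reported in the abstract. The function name is hypothetical, and the f(+)_n value in the example is an illustrative placeholder rather than a value from Table 1; only the qH value of 0.497e is taken from the text.

```python
def predict_tr95(f_plus_n, q_h):
    """T_R95 in K from model (2): 654.775 + 1761.910*f(+)_n - 177.211*qH."""
    return 654.775 + 1761.910 * f_plus_n - 177.211 * q_h


# qH = 0.497e is the value reported for Eriochrome blue black R;
# f(+)_n = 0.010 is a hypothetical placeholder, not a tabulated value.
estimate = predict_tr95(f_plus_n=0.010, q_h=0.497)
print(f"predicted T_R95: {estimate:.1f} K")
```

As written, the f(+)_n term enters with a positive coefficient and the qH term with a negative one, so the predicted T_R95 falls as qH increases.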

Validation and performance
To check the stability of the optimum model, leave-one-out cross-validation, pairwise correlation coefficients, the t test and the Fisher test were employed using SPSS 17.0 for Windows. The leave-one-out cross-validation q² values of the three models are shown in Table 2. The q² of model (2) is the best of the three and is larger than 0.500. Pairwise correlation coefficients of model (2) are shown in Table 4. The correlation coefficients between the tested T_R95 values and the independent variables follow the order f(+)_n > qH > BO_x. The correlation coefficient between f(+)_n and qH is 0.346, which indicates weak collinearity, so model (2) is acceptable.
The standardized regression coefficients and t values of the independent variables of model (2) are listed in Table 5. All the absolute t values are larger than the critical value, suggesting that the variables are acceptable. Furthermore, the degree of correlation among the independent variables can be evaluated by calculating their variance inflation factors (VIF), VIF = 1/(1 − r²), in which r is the multiple correlation coefficient between one variable and the others. If VIF ranges from 1.000 to 5.000, the related equation is acceptable; if VIF is larger than 10.000, the regression equation is unstable and must be rechecked. As can be seen from Table 5, most VIF values are slightly over 1.000 and the maximum is 5.226, indicating that model (2) is statistically acceptable. An external validation of the suggested model was performed for three compounds that were not involved in the model-building process. The test set was randomly selected with an interval of seven and included Eriochrome blue black R, aniline and 1,10-phenanthroline monohydrate. The Q²_EXT value of 0.741 (> 0.500; Table 2) indicates that the suggested model has good predictive potential.
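The two diagnostics used here can be sketched in a few lines. The VIF function follows the formula given in the text. The external Q² is not defined explicitly in the paper, so the form below, which compares test-set errors against deviations from the training-set mean, is an assumed and commonly used definition; the numerical inputs to it are hypothetical.

```python
def vif(r):
    """Variance inflation factor, VIF = 1 / (1 - r^2)."""
    return 1.0 / (1.0 - r ** 2)


def q2_external(y_obs, y_pred, y_train_mean):
    """External Q2 = 1 - sum((obs - pred)^2) / sum((obs - train mean)^2)."""
    press = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    ss = sum((o - y_train_mean) ** 2 for o in y_obs)
    return 1.0 - press / ss


# Pairwise correlation between f(+)_n and qH reported for model (2).
print("VIF:", round(vif(0.346), 3))

# Hypothetical test-set values (K); not the data behind Table 2.
obs_demo = [575.0, 650.0, 720.0]
pred_demo = [600.0, 640.0, 700.0]
print("Q2_EXT:", round(q2_external(obs_demo, pred_demo, y_train_mean=660.0), 3))
```

For a two-descriptor model, r reduces to the pairwise correlation between the descriptors, so the reported coefficient of 0.346 corresponds to a VIF of about 1.14, well inside the acceptable range.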

Conclusions
An appropriate reaction temperature is an important factor in designing and operating a supercritical water oxidation (SCWO) system. In this paper, QSAR models for organic compounds were developed on the basis of the Arrhenius relation between oxidation reaction rate and temperature in the SCWO process. From the molecular parameters calculated by DFT methods in Gaussian 09 and Materials Studio 7.0, f(+)_n, qH and BO_x appeared in the established QSAR models, which focus on the impact of Fukui indices and effective temperature; this shows that these descriptors are significant for understanding the degradation mechanism. The optimum model shows satisfactory regression statistics (R² = 0.620, SE = 35.1). The results of the t test and Fisher test suggested that the model is stable, and both internal and external validations showed its robustness and predictive capacity. The determinant factors obtained correspond to features of the degradation process, namely the affinity for attack, the difficulty of electron loss and the non-uniformity of the valence bond. Together, they allow the degradation mechanism to be reasonably illustrated from each perspective, providing deeper insight into universal and generalizable oxidation rules.