VZJ
HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH TABLE OF CONTENTS
 QUICK SEARCH:   [advanced]


     


Published online 23 January 2008
Published in Vadose Zone J 7:124-130 (2008)
DOI: 10.2136/vzj2007.0021
© 2008 Soil Science Society of America
677 S. Segoe Rd., Madison, WI 53711 USA
This Article
Right arrow Abstract Freely available
Right arrow Figures Only
Right arrow Full Text (PDF) Free
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Similar articles in this journal
Right arrow Alert me to new issues of the journal
Right arrow Download to citation manager
Right arrow reprints & permissions
Citing Articles
Right arrow Citing Articles via Google Scholar
Google Scholar
Right arrow Articles by Fan, Z.
Right arrow Articles by Casey, F. X. M.
Right arrow Search for Related Content
PubMed
Right arrow Articles by Fan, Z.
Right arrow Articles by Casey, F. X. M.
GeoRef
Right arrow GeoRef Citation
Agricola
Right arrow Articles by Fan, Z.
Right arrow Articles by Casey, F. X. M.
Related Collections
Right arrow Solute Transport Models
Right arrow Nonequilibrium Transport
Right arrow Vadose Zone Processes and Chemical Transport

TECHNICAL NOTES

Estimating Solute Transport Parameters Using Stochastic Ranking Evolutionary Strategy

Zhaosheng Fan and Francis X. M. Casey*

Dep. of Soil Science, North Dakota State Univ., Fargo, ND 58105
* Corresponding author (francis.casey{at}ndsu.edu).

All rights reserved. No part of this periodical may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher.


Received 25 January 2007.



    ABSTRACT
 TOP
 ABSTRACT
 INTRODUCTION
 Materials and Methods
 Case Studies
 Conclusions
 REFERENCES
 
It is a significant challenge to reliably identify fate and transport parameters for solutes that simultaneously undergo multiple and complex processes. Evolutionary optimization techniques allow process parameters to be identified for complicated problems. One evolutionary optimization technique, the stochastic ranking evolutionary strategy (SRES), was used in this study to find the global or near-global optimal parameters for two solute breakthrough curve experiments. The uncertainties of the estimated parameters were assessed using a nonparametric bootstrap resample method. The two miscible-displacement cases considered were (i) a simulated experiment of a solute undergoing transformation and (ii) an actual experiment of trichloroethylene undergoing simultaneous nonequilibrium transport and transformation. In both cases, multiple breakthrough curves were produced from a parent solute and its metabolites, and the SRES was used with an appropriate convective–dispersive model to inversely identify the model parameters. The SRES inverse method provided an excellent model fit to the experimental data, and the model parameter estimates had minimal uncertainty even when 5% experimental error was introduced.

Abbreviations: Abbreviations: CI, confidence interval • GO, global optimization • LO, local optimization • PDE, partial differential equation • SRES, stochastic ranking evolutionary strategy • TCE, trichloroethylene


    INTRODUCTION
 TOP
 ABSTRACT
 INTRODUCTION
 Materials and Methods
 Case Studies
 Conclusions
 REFERENCES
 
Computer modeling is one of the most useful strategies available to researchers to discern and predict water transfer and solute transport. Before the 1980s, solute transport simulation developed slowly due to the oversimplification of modeling processes, the lack of data, and the lack of adequate computing resources (Zheng and Bennett, 2002). Most of these difficulties have been reduced with the development of computer technologies, and transport simulations can now include more complicated and detailed processes. However, challenges and uncertainties have arisen as a result of complicated theoretical and conceptual models. One of the biggest challenges in solute transport and water flow modeling is the reliable identification of parameters associated with individual processes that can occur in a mixture of multiple and simultaneous processes.

Many numerical and analytical programs have been developed to inversely model experimental data to estimate equilibrium and nonequilibrium solute transport parameters. Some widely used programs with the capability of inverse transport modeling include 3DADE (Leij and Bradford, 1994), CFITIM (van Genuchten, 1981), CXTFIT (Toride et al., 1995), HYRUS-1D/2D (Simunek et al., 1996 2005), SUFI-2 (Abbaspour et al., 2004), and ModGA_P (Zheng, 1997). These programs approach the model inversion by minimizing an objective function that quantifies the error between model predictions and observations. The parameters used to calculate the objective function are iteratively changed using a gradient method, which determines the "direction" of the parameter search using a slope function. These gradient methods are highly sensitive to the initial values of parameters, and the predicted results are usually only locally optimized (Abbaspour et al., 2004). Additionally, gradient methods can provide poor results in problems with objective functions that have long narrow valleys (e.g., Rosenbrock's function, also known as a banana function). For complicated, nonlinear models with many parameters, the traditional local optimization (LO) methods, including multistart LO, are not suitable to find the best set of parameters (Moles et al., 2003). The reason is that LO methods become easily entrapped in local minimum, and they do not use global information to seek a global minimum of an objective function with multiple local minima.

The two major approaches for global optimization (GO) methods are deterministic and stochastic. The deterministic GO approaches (e.g., branch and bound methods, outer approximation methods) are exact algorithms that provide a theoretical guarantee that a global optimum will be located (Liberti and Kucherenko, 2005). As the size or scale of the problem increases, however, the computational cost of the deterministic GO approach becomes quite expensive, which makes it difficult, if not impossible, to find an exact solution in a finite period of time. Stochastic GO approaches (e.g., clustering methods, simulated annealing) are based on probability theory and assume that only near-global optimality can be located in a finite period of time (Banga et al., 2003).

Several GO strategies have been used to estimate parameters for solute transport, water transfer, and vadose zone processes, which include shuffled complex evolution metropolis (Vrugt and Bouten, 2002; Heimovaara et al., 2004; Vrugt et al., 2004), sequential uncertainty fitting procedure (Abbaspour et al., 2004), annealing-simplex (Pan and Wu, 1998), simulated annealing (Cunha, 1999), and ant colony optimization (Abbaspour et al., 2001). Comparative studies in other research fields (e.g., Banga et al., 2003; Moles et al., 2003; Runarsson and Yao, 2000) have assessed several of these GO strategies for problems with as many as 36 parameter estimates. These studies indicate that several GO methods do not provide efficient parameter estimates and that the SRES GO method (Runarsson and Yao, 2000) is the most efficient and robust for estimating parameters of problems with large numbers of local solutions and large dimensionality. The SRES method uses a "parent and offspring" (µ, {lambda}) evolutionary strategy coupled with stochastic ranking as the constraint handling method to achieve the solution of a given objective function. The SRES method can automatically balance objective and penalty functions in a stochastic manner during the evolutionary search (Runarsson and Yao, 2000).

In this paper, we demonstrate the SRES GO method for its usefulness in confidently estimating large sets of parameters of complex reactive transport models. A challenge in using GO methods is the assessment of optimized parameter quality. This paper also presents a nonparametric bootstrap method to directly assess GO parameter estimate quality.


    Materials and Methods
 TOP
 ABSTRACT
 INTRODUCTION
 Materials and Methods
 Case Studies
 Conclusions
 REFERENCES
 
The common GO problem is defined as

Formula 1[1]
and subjected to

Formula 2[2]
where f(x) is the objective function, xi is the ith parameter, n is the number of parameter to be estimated, ximin and ximax are the lower and upper bounds of the parameter xi, and Rn is the n-dimensional search space bounded by the parametric constraints. A C library function, libSRES (Ji and Xu, 2006), was used in this paper to solve the SRES problem and is free for academic use. The following sections describe a detailed procedure for solving the SRES problem with libSRES.

Step 1: The Objective Function Is Defined
The GO problem is sought to minimize the objective function f(x), which is defined as

Formula 3[3]
where l is the number of data for each experiment, Ci is the known experimental data point, Formula 3 is the predicted data point calculated from the model with a given set of parameters, and wi is a weighing variable. In this paper, wi was set to 1, which means all experimental data were treated equally.

Step 2: Define the Lower and Upper Parameter Estimate Bounds
The lower and upper parameter estimate bounds should have a physical meaning based on literature or experiments, and they should be large enough to cover the entire region of possible solutions. In general, these bounds have no significant influence on the computational cost for a small search space (e.g., few parameters). For a large search space (e.g., more than 12 parameters), however, appropriate bounds will dramatically reduce the computational cost unless high-performance computational resources are available.

Step 3: Select Appropriate Evolutionary Strategies
The number of parents and population size can be set in this step; the ratio is usually 1:7 (parents:population size) (Ji and Xu, 2006). In libSRES, the user can also define other parameters for the SRES strategies, including the coverage speed of the SRES, the pressure that is used to balance the objective and penalty functions, the number of generations, and whether the parents will be selected into the next generations. Determining the population size of the SRES is key to obtaining an efficient global optimization, but it is challenging (Atallah, 1999). Increasing the population size will generally improve model performance by increasing the quality of the initial population and thus improving the possibility of finding the global optimum at an early stage. A large population size, however, also decreases the speed of the total GO process. In this paper, the number of parents was set to 30, and the population size was set to 200 as recommended by Runarsson and Yao (2000).

Step 4: Assess the Simulation
The libSRES starts searching the global optimum using the given objective function, constraints, and optimization strategies. In this study, the fate and transport models contained several partial differential equations (PDEs). To solve these PDEs, a finite difference solver, CVODE (Cohen and Hindmarsh, 1994), which is suitable for both nonstiff and stiff problems, was incorporated into libSRES.

Step 5: Calculate the Confidence Intervals of the Parameter Estimates
After the global optimum is found, the confidence interval (CI) for each parameter is calculated using a generic nonparametric bootstrap method (Efron and Tibshirani, 1993). This is done by first constructing an empirical probability distribution function of the experimental data by setting a probability of 1/l at each data point, where l is the number of experimental data. Second, a random sample size, l, is drawn from the empirical distribution function with replacement (note that some original data will be represent more than once, while some will be omitted). Third, parameters are estimated for each resampled experimental data using the SRES method described earlier (i.e., steps 1–4). At this stage, the bounds of each parameter can be redefined based on the global optimum to reduce the computation time. Fourth, the sampling and reestimation processes are repeated several times (here, 20 times). Fifth, a relative frequency distribution for each parameter is constructed. This frequency distribution can then be used to calculate the CI and other statistical information (e.g., standard error).


    Case Studies
 TOP
 ABSTRACT
 INTRODUCTION
 Materials and Methods
 Case Studies
 Conclusions
 REFERENCES
 
To evaluate the inverse model performance of the SRES method, two cases studies were considered. For the first case study, 13 parameters of a reactive transport model were used to generate solute transport breakthrough curve data. The SRES GO method was then used to inversely model this generated data to back-estimate these same thirteen model parameters, which were then compared to the actual values. Additionally, 5% error was then added to the generated breakthrough curves and then the SRES GO method was used again to estimate the thirteen parameters, which were also compared to the original input parameters. For the second case study, real data from the fate and transport of chlorinated hydrocarbons subject to sequential transformation reactions (Casey and Simunek, 2001), was inversely modeled using the SRES GO method. All the programs used in this paper were written in ANSI C using the libSRES library and run on a PC/Pentium IV (2.99 GHz, 1 GB of RAM) with a Windows XP operation system (Microsoft, Seattle, WA).

Case Study 1
In the first case study, we used the following soil physical properties: bulk density of 1.5 g cm–3, volumetric water content of 0.5 cm3 cm–3, soil column length of 5.0 cm, pore water velocity of 1.0 cm h–1, dispersivity of 0.5 cm, and the solvent pulse input time of 1.0 h. Reactive transport of a hypothetical organic compound, A, was considered in this soil. This compound transformed into two compounds, A1 and A2, in the aqueous phase. Compound A could also be transformed into A2 in the sorbed phase. All three compounds (A, A1, and A2) can be further mineralized to CO2 in the aqueous phase. Compounds A1 and A2 do not undergo further degradation or transformation. All these compounds were also subject to two-site chemical nonequilibrium sorption (e.g., Selim et al., 1977). The following PDEs describe the fate and transport processes of compounds A, A1, and A2 for steady-state flow in a homogeneous porous media:

Formula 5[5]
The subscripts 1, 2, and 3 represent the compounds A, A1, and A2, respectively; {rho}b is soil bulk density (g cm–3), {theta} is the volumetric water content (cm3 cm–3), {lambda} is the dispersivity (cm), v is the pore water velocity (cm h–1), t is time (h), x is depth (cm), C is the aqueous concentration (mg cm–3), S is the sorbed phase concentration (mg g–1), {omega}w is the first-order transformation rate constant in the liquid phase (h–1), {omega}m is first-order mineralization rate constant (h–1), and {omega}s is the first-order transformation rate constant in sorbed phase (h–1). The variable S is given by

Formula 6[6]
where {alpha} is the mass transfer coefficient (h–1), Kd (cm3 g–1) is a linear distribution coefficient between the sorbed and aqueous phases, Se is the sorbed phase concentration on type-1 instantaneous sorption sites, Sk is sorbed phase concentration on the type-2 kinetic sorption sites, and f (0 ≤ f ≤ 1) is the fraction of type-1 sorption sites. Equations [5] and [6] contain 13 unknown parameters: Kd,1, Kd,2, Kd,3, {omega}s,1, {omega}w,1, {omega}w,2, {omega}m,1, {omega}m,2, {omega}m,3, {alpha}1, {alpha}2, {alpha}3, and f.

The initial condition for this problem was given as

Formula 7[7]
where L is the length of the column.

The upper boundary condition of the soil column was given as

Formula 8[8]
and the lower boundary condition of the soil column was given as

Formula 9[9]
where t0 is the pulse duration and C0(t) is the dimensional concentration from time zero to t0.

The best result for the 13 estimated parameters was obtained with a computation time of 66 h using the SRES method (Method 1 in Table 1 ). Table 1 lists the true parameters used to generate the data, the bounds of each parameter, and the optimized parameters. The results of this evaluation are plotted in Fig. 1 . Relatively wide initial ranges were set for each parameter and the initial values (seeds) for each parameter were randomly generated. This was done in the absence of compelling a priori information for the optimal parameter values. The SRES method was able to identify all of the parameters used to generate the model solution, and the 95% CIs of these parameter estimates were narrow. Small discrepancies between the true values and the optimized values may be caused by the global convergence of the SRES method.


View this table:
[in this window]
[in a new window]

 
TABLE 1. The model parameter estimates for case study 1 using stochastic ranking evolutionary strategies.

 

Figure 1
View larger version (21K):
[in this window]
[in a new window]

 
FIG. 1. The pseudo-experimental data (A) without measurement noises, (B) with measurement noises, and the simulated results with the 13 parameters estimated using the stochastic ranking evolutionary strategies. A, A1, and A2 represent the parent compound and its two metabolites, respectively.

 
A large amount of computation time was needed to obtain these optimized parameters because of the wide initial bounds and the problem's dimensionality. To reduce the computation time, the procedure was refined, which involved three steps. In the first step, only the breakthrough curve of compound A was modeled by estimating the parameters Kd,1, {omega}w,1, {omega}w,2, {omega}m,1, {omega}s,1, {alpha}1, and f. The purpose of this step was to obtain a rough idea of the range of parameter values. At this point, it was not necessary to obtain the global optimum for the breakthrough curve A as long as the results approached a predefined objective (e.g., coefficient of determination, r2 = 0.98, or a specific sum-of-squares is reached). After an acceptable result was obtained, the bounds of the parameters to be estimated (i.e., Kd,1, {omega}w,1, {omega}w,2, {omega}m,1, {omega}s,1, {alpha}1, and f) were redefined based on the information obtained in the first step. In the second step, the A1 breakthrough curve was modeled using the reduced bounds obtained from the first step in addition to parameters Kd,2, {omega}m,2, and {alpha}2 with wide bounds. Again, a global optimum was not necessary in this step; only an acceptable optimization result was obtained. At the end of step 2, the bounds of the estimated parameters in steps 1 and 2 were redefined. In the final step, the global optimization was obtained for the A, A1, and A2 breakthrough curves using the redefined parameter bounds from steps 1 and 2 in addition to parameters Kd,3, {omega}m,3, and {alpha}3 with wide bounds. Table 1 summarizes the optimized results using this procedure (Method 2). The results indicate that the computational time decreased from 66 h to 21.9 h. The computation effort, however, was still very time consuming, partly due to the tight convergence criteria used (r2 = 1.0). From a practical point of view, however, an acceptably good result (r2 = 0.99) can be obtained in a few hours, which is shown in the following case.

The excellent parameter estimates from the first case study were obtained from input data with no error or measurement noises. In reality, measurement noise in experimental data can result in parameter estimate errors. To examine the effect of measurement noise on the parameter estimates, a 5% random error was added to each of the generated data points for case study 1 (Fig. 1B). The SRES GO parameter estimates and model fits (Table 1 and Fig. 1B, respectively) were successful even under the condition of relatively large measurement errors. The 13 true values used to generate the synthetic data fell within the 95% CI of the estimated parameters. This indicates that the uncertainties of the estimated parameters were reliable even when measurement noise was present.

Case Study 2
The second case study compares the SRES method to the Casey and Simunek (2001) method for estimating reactive transport model parameters. The Casey and Simunek (2001) paper should be referenced for experimental details. In general, a pulse of trichloroethylene (TCE) solution was passed through columns containing various metal filings. As the TCE passed through these columns, it underwent a transformation into ethylene, resulting in simultaneous TCE and ethylene breakthrough curves. For this second case study, the data used from the Casey and Simunek (2001) study was from the intermediate flow rate from the column filled with iron fillings coated with copper, where the bulk density was 3.09 g cm–3, the volumetric water content was 0.51 cm3 cm–3, the pore water velocity was 37.44 cm h–1, the column length was 12.4 cm, and the solvent pulse input time was 0.33 h. Casey and Simunek (2001) used a modified version of HYDRUS-1D (version 2.0) to inversely model the simultaneous breakthrough curves of TCE and ethylene. The following nonequilibrium, two-site sorption transport model with steady-state flow was used to model the breakthrough curves:

Formula 10[10]
where subscripts 1 and 2 represent TCE and ethylene, respectively, and S is given by

Formula 11[11]
The parameters {lambda}, v, x, {theta}, {rho}b, {omega}s, {alpha}, f, Kd, C, S, Se, and Sk are defined in Eq. [5] and [6]. Equations [10] and [11] contain eight unknown parameters: Kd,1, Kd,2, {alpha}1, {alpha}2, {omega}s,1, {omega}s,2, {lambda}, and f. The initial condition is defined in Eq. [7], the upper boundary condition is defined in Eq. [8], and the lower boundary condition of the column is defined in Eq. [9].

In the Casey and Simunek's (2001) paper, the initial parameter values were obtained from previous batch studies or from the literature. In the current study, very wide initial ranges for each parameter were used in the SRES strategy (Table 2 ). In addition, a second set of initial values (IV2 in Table 2) for HYDRUS-1D was used to show that the Marquardt–Levenberg LO method can easily become entrapped in a local optimum and result in a wrong solution. The optimum model fit (Fig. 2 ) and parameter estimates (Table 2) of the SRES GO method was obtained in a few hours using the standard Method 1 (Table 2) but could be further reduced to 1.3 h using the refined procedure (Method 2 described earlier). Also, the SRES results were quite similar to the results of HYDRUS-1D. Minor differences in the parameter estimates of the two methods (SRES and HYDRUS) may have resulted from differences in the two numerical procedures, where HYDRUS-1D uses finite elements and the SRES GO method uses finite differences. Numerical oscillation or inaccurate convergences are associated with finite element solutions (Fuselier et al., 2006), and errors, such as numerical dispersion, are associated with finite difference solutions (Bedient et al., 1999). These results also indicate that the two sets of initial values (IV1 and IV2 in Table 2) can both yield excellent model fits (e.g., high r2) and narrow 95% CI for the parameter estimates; however, the solutions may yield very different sets of parameters (Table 2), thus demonstrating the sensitivity of the LO methods to initial parameter values.


View this table:
[in this window]
[in a new window]

 
TABLE 2. The model parameter estimates for case study 2 using stochastic ranking evolutionary strategies (SRESs).

 

Figure 2
View larger version (17K):
[in this window]
[in a new window]

 
FIG. 2. Experimental breakthrough curve data from the column filled with iron fillings coated with copper for the intermediate flow rates presented by Casey and Simunek (2001). The solid lines represent the inverse model solutions using the stochastic ranking evolutionary strategies (SRESs), and the dash lines represent the results obtained using HYDRUS-1D. (TCE = trichloroethylene.)

 

    Conclusions
 TOP
 ABSTRACT
 INTRODUCTION
 Materials and Methods
 Case Studies
 Conclusions
 REFERENCES
 
It was demonstrated that reactive transport experimental data could be modeled successfully using the SRES method to confidently estimate large sets of parameters with relatively wide initial parameter bounds. Also, it was demonstrated that uncertainties of GO parameter estimates could be assessed using a nonparametric bootstrap resample method. The main drawback of the SRES method is its computational effort in terms of time; however, the computational effort could be significantly reduced using a refined procedure (Method 2) or by setting narrower but appropriate initial bounds. The computational time may also be reduced using parallel computing.


    ACKNOWLEDGMENTS
 
This research was based on work supported by the National Science Foundation under Grants No. 0244169 and No. 070492. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.


    REFERENCES
 TOP
 ABSTRACT
 INTRODUCTION
 Materials and Methods
 Case Studies
 Conclusions
 REFERENCES
 





This Article
Right arrow Abstract Freely available
Right arrow Figures Only
Right arrow Full Text (PDF) Free
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Similar articles in this journal
Right arrow Alert me to new issues of the journal
Right arrow Download to citation manager
Right arrow reprints & permissions
Citing Articles
Right arrow Citing Articles via Google Scholar
Google Scholar
Right arrow Articles by Fan, Z.
Right arrow Articles by Casey, F. X. M.
Right arrow Search for Related Content
PubMed
Right arrow Articles by Fan, Z.
Right arrow Articles by Casey, F. X. M.
GeoRef
Right arrow GeoRef Citation
Agricola
Right arrow Articles by Fan, Z.
Right arrow Articles by Casey, F. X. M.
Related Collections
Right arrow Solute Transport Models
Right arrow Nonequilibrium Transport
Right arrow Vadose Zone Processes and Chemical Transport


HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH TABLE OF CONTENTS
The SCI Journals Agronomy Journal Crop Science
Journal of Natural Resources
and Life Sciences Education
Soil Science Society of America Journal
Journal of Plant Registrations Journal of
Environmental Quality
The Plant Genome