VZJ sign up for etocs
HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH TABLE OF CONTENTS
 QUICK SEARCH:   [advanced]


     


This Article
Right arrow Abstract Freely available
Right arrow Figures Only
Right arrow Full Text (PDF) Free
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Similar articles in this journal
Right arrow Alert me to new issues of the journal
Right arrow Download to citation manager
Right arrow reprints & permissions
Citing Articles
Right arrow Citing Articles via HighWire
Right arrow Citing Articles via Google Scholar
Google Scholar
Right arrow Articles by Vrugt, J. A.
Right arrow Articles by Hopmans, J. W.
Right arrow Search for Related Content
PubMed
Right arrow Articles by Vrugt, J. A.
Right arrow Articles by Hopmans, J. W.
GeoRef
Right arrow GeoRef Citation
Agricola
Right arrow Articles by Vrugt, J. A.
Right arrow Articles by Hopmans, J. W.
Related Collections
Right arrow Soil Hydrology
Right arrow Inverse Procedures/Parameter Estimation
Right arrow Soil Physics
Vadose Zone Journal 2:98-113 (2003)
© 2003 Soil Science Society of America

Toward Improved Identifiability of Soil Hydraulic Parameters

On the Selection of a Suitable Parametric Model

Jasper A. Vrugt*,a, Willem Boutena, Hoshin V. Guptab and Jan W. Hopmansc

a Institute for Biodiversity and Ecosystem Dynamics, Section Physical Geography, University of Amsterdam, Nieuwe Achtergracht 166, Amsterdam, 1018 WV, The Netherlands
b Department of Hydrology and Water Resources, University of Arizona, Tucson, AZ, 85721, USA
c Hydrology Program, Department of Land, Air and Water Resources (LAWR), University of California, Davis, CA 95616, USA

* Corresponding author (j.vrugt{at}science.uva.nl)

Received 5 July 2002.


ABSTRACT

We present a thorough identifiability analysis of the soil hydraulic parameters in the parametric models of Brooks and Corey (BC; Brooks and Corey, 1964), Mualem–van Genuchten (VG; van Genuchten, 1980), and Kosugi (KC; Kosugi, 1996, 1999) using the recently developed Shuffled Complex Evolution Metropolis (SCEM-UA) algorithm (Vrugt et al., 2002b, and unpublished data). Because the SCEM-UA algorithm globally thoroughly exploits the parameter space and therefore explicitly accounts for parameter interdependence and nonlinearity of the employed parametric models, the algorithm is suited to generate a useful description of parameter uncertainty and its antithesis, parameter identifiability. A set of measured water retention characteristics of the UNSODA database (Leij et al., 1996) spanning a wide range of soil textures and three transient laboratory outflow experiments with decreasing flow rates were used to illustrate that a parameter identifiability analysis facilitates the selection of an adequate parametric model structure and provides useful information about the limitations of a model. Moreover, results suggest that one should be especially careful in establishing pedotransfer functions without knowledge of the underlying posterior uncertainty associated with the soil hydraulic parameters using direct estimation methods.

Abbreviations: BC, Brooks and Corey model • KS, Kosugi model • MCMC, Markov Chain Monte Carlo • MH, Metropolis Hastings • RMSE, root mean square error • SCEM-UA, Shuffled Complex Evolution Metropolis algorithm • SLS, simple least square • VG, Mualem–van Genuchten model

AN ADEQUATE HYDROLOGICAL description of water flow and contaminant transport in the vadose zone relies heavily on accurate estimates of the soil water retention and unsaturated soil hydraulic characteristics. Today many laboratory experiments, ranging from relatively simple static or steady-state flow experiments to more sophisticated transient experiments, can be used to determine these highly nonlinear soil hydraulic functions. Although the static or steady-state flow experiments have the advantage of being relatively simple to implement, they are typically time-consuming and require restrictive initial and boundary conditions to satisfy the assumptions of the corresponding analytical solutions. On the contrary, transient experiments allow much more flexibility in experimental design than the static or steady-state experiments because they do not require steady-state conditions during the experiment, thereby allowing a rapid characterization of the hydraulic properties of the considered soil domain. With transient experiments, the soil hydraulic properties are inferred indirectly (e.g., inverse modeling) by fitting a numerical solution of the Richards equation to observations of the measured variables during the experiment.

During the last two decades, a great deal of research has been devoted to exploring the applicability and suitability of the inverse approach for the identification of soil hydraulic properties. That research has focused primarily on five issues: (i) the type of transient experiment and kind of prescribed initial and boundary conditions suited to yield a reliable characterization of the soil hydraulic properties (Hopmans et al., 1992; van Dam et al., 1992, 1994; Ciollaro and Romano, 1995; Santini et al., 1995; Simunek and van Genuchten, 1996, 1997; Simunek et al., 1998b; Romano and Santini, 1999; Durner et al., 1999; Wildenschild et al., 2001), (ii) the determination of the appropriate quantity and most informative kind of observational data (Zachmann et al., 1981; Kool et al., 1985; Parker et al., 1985; Kool and Parker, 1988; Valiantzas and Kerkides, 1990; Toorman et al., 1992; Eching and Hopmans, 1993; Eching et al., 1994; Durner et al., 1999; Vrugt et al., 2002a), (iii) the selection of an appropriate model of the soil hydraulic properties (Zachmann et al., 1982; Russo, 1988; Zurmühl and Durner, 1998), (iv) the construction and weighting of multiple sources of information in an objective function (Van Dam et al., 1994; Clausnitzer and Hopmans, 1995; Hollenbeck and Jensen, 1998; Vrugt and Bouten, 2002), and (v) the adoption and development of techniques to infer the residual parameter uncertainty remaining after calibration (Kool and Parker, 1988; Hollenbeck and Jensen, 1998; Vrugt and Bouten, 2002). While there has been considerable attention to the experimental development of transient experiments, relatively little attention has been paid to the development of optimization routines that solve the inverse problem for the indirect estimation of soil hydraulic functions and that compute accurate probabilistic uncertainty values for the optimized soil hydraulic parameters. In any case, unique and identifiable parameters are required so pedotransfer functions (Leij et al., 2002) can be successfully applied to relate hydrologic model parameters to easily measurable soil or land surface characteristics.

In a previous paper (Vrugt and Bouten, 2002), we established the limitations of local gradient-based search methodologies widely employed in soil hydrology to solve the inverse problem for transient experiments. Starting from an arbitrary initial set of parameter values these local gradient-based algorithms are easily trapped in local minima along their way to the global minimum, thereby terminating their search prematurely. For instance, to quote Hopmans et al. (2002)(p. 8) "... it is recommended to test the uniqueness of a parameter set by solving the inverse problem repeatedly using different initial parameter estimates."

In collaboration with colleagues at the University of Arizona, we have recently developed an effective and efficient method, entitled the Shuffled Complex Evolution Metropolis algorithm (SCEM-UA). The method is suited to simultaneously infer the most likely parameter set and its underlying probabilistic posterior distribution within a single optimization run (see Vrugt et al., 2002b and unpublished data). The SCEM-UA recognizes the existence of multiple behavioral soil hydraulic parameter sets, each associated with a probabilistic likelihood (e.g., Keesman, 1990; van Straten and Keesman, 1991; Beven and Binley, 1992). Moreover, because the SCEM-UA algorithm globally thoroughly exploits the parameter space and therefore explicitly accounts for parameter interdependence and nonlinearity of the employed hydraulic functions, the algorithm is suited to generate a useful description of parameter uncertainty and its antithesis, parameter identifiability.

Using the current available SCEM-UA algorithm, we conducted a thorough identifiability analyses of the parameters in BC, VG, and KS, the three commonly used parametric models of soil hydraulic properties. An identifiability analysis of the soil hydraulic parameters for different soil types and experimental conditions constitutes important information for studies that aim to find pedotransfer functions, relating the directly inferred soil hydraulic parameter to basic soil data (e.g., Wösten et al., 2001; among many others). Poor identifiability of any of the soil hydraulic parameters will inhibit our ability to relate these parameters to easily measurable basic soil properties.

The remainder of this paper is organized as follows. In the following sections, we present a short introduction on the inverse identification of soil hydraulic parameters, present the SCEM-UA algorithm suited to tackle the problems encountered when inversely identifying hydraulic parameters from transient experiments, and briefly review the BC, VG, and KS parametric models to describe soil hydraulic properties. Subsequently, the SCEM-UA algorithm is applied to a set of measured retention characteristics of the UNSODA database (Leij et al., 1996) spanning a wide range of soils and to three laboratory transient outflow experiments with decreasing drainage rates to assess diagnostic measures of parameter identifiability for different soil types and experimental conditions on an identical soil sample. In this section we are especially concerned with the multivariate correlation structure induced in the joint distribution of the parameters. Finally, we present a summary and draw conclusions.

Inverse Identification of Soil Hydraulic Parameters

To facilitate the description of the employed optimization procedures, let us cast the hydrologic model, f, in a general statistical framework,

[1]
where y is t x 1 vector of model predictions, X is t x p matrix of input variables and ß is a vector of p unknown parameters. We assume that,

[2]
where p denotes the p-dimensional Euclidean space. If B is not the entire domain space p, the problem is said to be constrained. Given observed output values of y, denote the residual errors from the predictions by

[3]

The classical approach of model calibration is to find the best attainable values of the parameters ß such that R in Eq. [3] is in some sense made as close to zero as possible. By far the most popular measure for R is the simple least square (SLS) or maximum likelihood estimator, appropriate when the measurement errors are believed to be homoscedastic and uncorrelated,

[4]

Under these circumstances, Hollenbeck and Jensen (1998) stretched the importance of model adequacy before sound statements can be made about the final parameter estimates and their uncertainty,

[5]
where {sigma}T denotes the error deviation of the measurements and Q(·) is the {chi}2 cumulative distribution with (t - p) degrees of freedom. Having homoscedastic and uncorrelated error residuals, the adequacy test gives us a measure of how well the optimized model fits the observations, relative to their measurement precision. Using the definition of Eq. [5], models are adequate if padeq > 0.5, since then SLS < t{sigma}2T.

Many algorithms have been developed in the past to solve the nonlinear SLS optimization problem stated in Eq. [4]. These algorithms may be classified as local search methodologies, when seeking systematic improvement of the objective function using an iterative search starting from a single arbitrary initial point in the parameter space, or as global search methods in which multiple concurrent searches from different starting points are conducted within the parameter space.

One of the most simple local search optimization methods, which is commonly used in the field of soil hydrology is a Gauss–Newton type of derivative based search,

[6]
where ß(k+1) is the updated parameter set and {nabla}f(ß(k)) and H(ß(k)) denote the gradient and Hessian matrix, respectively, evaluated at ß = ß(k),

[7]

From an initial first guess of the parameters ß(0), a sequence of parameter sets is generated {ß(1),ß(2),...,ß(k+1)} that is intended to converge to the global minimum of R(ß) in the parameter space. If f(X|ß) depends linearly on each parameter ßj (j = 1,2,...,p), the minimization problem stated in Eq. [6] reduces to a linear regression problem for which analytical solutions exist.

The derivative-based search defined in Eq. [6] and [7] will evolve toward the global solution in the parameter space in situations where the objective function exhibits a topographical convex shape in the entire parameter domain. However, practical experience with hydrologic models suggests that the objective function seldom satisfies these restrictive conditions. To illustrate this problem, consider Fig. 1 (taken from Vrugt and Bouten, 2002), which presents posterior marginal distributions (histograms) for four hydraulic parameters in the global minimum of the parameter space, derived using measured outflow dynamics from a transient multistep outflow experiment. When the objective function exhibits a convex shape in the entire parameter domain, the histograms for each of the soil hydraulic parameters exhibit a clear Gaussian distribution with a single mode. However, the large number of different modes (local minima) for each of the parameters depicted in Fig. 1 is the most probable reason for the numerous reports in the literature of the inability to find a unique set of hydraulic parameter values using observed outflow dynamics from multistep outflow experiments (Kool et al., 1985; Parker et al., 1985; Toorman et al., 1992; van Dam et al., 1992; among others). Because the local gradient-based search algorithms are not designed to handle the peculiarities of the response surface illustrated in Fig. 1, they will terminate the search prematurely with their final solution essentially being dependent on the starting point in the parameter space. Another emerging problem reported by Hopmans et al. (2002) is that some of the hydraulic parameters are typically highly correlated using observed outflow dynamics, further reducing the chances of getting a single unique solution with local search methodologies.



View larger version (24K):
[in this window]
[in a new window]
 
Fig. 1. Univariate posterior probability distributions for the soil hydraulic parameters (A) {theta}r, (B) {alpha}, (C) Ks, and (D) lvg in the Mualem–van Genuchten model (1980) using observed outflow dynamics during the a transient one-step outflow experiment. The symbol p(ß) along the y-axis in each of the figures denotes the posterior density.

 
The existence of these nonuniqueness problems with local search methodologies has led soil hydrologists to argue that there is not sufficient information in the measurements to enable a unique characterization of the soil hydraulic properties. However, we argue that it is not the information in the measurements that is lacking to obtain a unique set of parameters, but the problem is that the classical local-search optimization procedures utilized in soil hydrology are typically not powerful enough to solve the problems illustrated in Fig. 1. Our arguments on uniqueness of parameters are not based on the convergence problems of the applied optimization methods, but on the shape of the posterior marginal distributions of the parameters. Hence, closer inspection of Fig. 1 demonstrates that for each of the parameters there is a single optimum with highest posterior probability in the global minimum. Indeed, although not shown here, the multivariate probability distribution of the parameters confirms that these regions with highest posterior probability for each parameter coincide. In other words, the uniqueness of parameter estimates is now inferred from the shape of the univariate posterior probability distributions of the parameters. In particular, ill-posedness of an inverse problem due to data-noninformativeness issues, as demonstrated by Zijlstra and Dane (1996), would result in large uncertainties associated with the final parameter estimates. Thus, an optimization algorithm that successfully identifies the multivariate joint probability distribution of the parameters in the vicinity of the global minimum has desirable properties, since it not only provides useful information about the interactions of the parameters in the full parameter space, but also provides useful information about the quality of the data. Hence, ultimately data quality still determines the reliability of the final parameter estimates.

Only recently have more powerful optimization methods for the identification of soil hydraulic parameters begun to appear in the literature. These include the use of annealing simplex methods (Pan and Wu, 1998), genetic algorithms (Takeshita, 1999; Vrugt et al., 2001), grid sampling strategies (Abbaspour et al., 1997), and ant-colony methods (Abbaspour et al., 2001). Notwithstanding their successful application, these global search algorithms are computationally very demanding, requiring a substantial number of model evaluations to converge to the global solution. Moreover, they do not provide any information about the probabilistic uncertainty associated with the final parameter estimates.

In collaboration with colleagues at the University of Arizona, we have recently developed a general-purpose code, entitled the Shuffled Complex Evolution Metropolis algorithm (SCEM-UA), which is robust and efficient and therefore admirably suited to tackle the topographical problems encountered with the response surface depicted in Fig. 1, while generating a description of the remaining parameter uncertainty (Vrugt et al., 2002b, and unpublished data). The algorithm is a Markov Chain Monte Carlo (MCMC) sampler generating a sequence of parameter sets {ß(1),ß(2),...,ß(k+1)} that convergences to the stationary posterior distribution, p(ß|y) for a large number of simulations. The SCEM-UA algorithm is related to the successful SCE-UA global optimization method (Duan et al., 1992, 1993), but uses the Metropolis Hastings (MH) strategy (Metropolis et al., 1953; Hastings, 1970) instead of the downhill simplex method for population evolution, and is therefore able to simultaneously infer both the most likely parameter set and its underlying posterior probability distribution within a single optimization run. By merging the strengths of the MH algorithm, controlled random search, competitive evolution, and complex shuffling, the SCEM-UA is designed to evolve to a stationary posterior target distribution of the parameters. The stochastic nature of the MH annealing scheme avoids the tendency of the SCE-UA algorithm to collapse into a single region of attraction (local minima), while the information exchange (shuffling) between parallel sequences allows the search to be biased in favor of better regions of the solution space.

Shuffled Complex Evolution Metropolis Algorithm

Taking a Bayesian perspective, the aim of model calibration is then to infer the posterior probability distribution, p(ß|y), which describes what is known about the model parameters ß given the observed data y and the prior information. We assume that the mathematical structure of the combined Richards hydraulic model is fixed and that we can define a uniform prior distribution on the feasible parameter space (e.g., between physically realistic upper and lower bounds on each of the model parameters). To solve this optimization problem, the SCEM-UA algorithm begins with an initial population of points (parameter sets) randomly distributed throughout the feasible parameter space using a Latin Hypercube sampling strategy. For each parameter set, the posterior density is computed using a Bayesian inference scheme (Box and Tiao, 1973; Thiemann et al., 2001),

[8]

[9]
where {gamma} [-1,1] is a fixed "shape parameter" that can be regarded as a measure of kurtosis, indicating the extent of the nonnormality of the error density distribution. The density is normally distributed when {gamma} = 0, double exponential when {gamma} = 1, and tends to a uniform distribution as {gamma} -> -1. In the case studies reported here, we assumed a Gaussian error model (i.e., {gamma} = 0). In this particular instance, the term between brackets defined in Eq. [8] simply reduces to the SLS measure previously defined in Eq. [4]. Moreover, the model adequacy test defined in Eq. [5] is then valid. The advantage of using Bayesian estimators rather than simple least-squares estimators for the problem of finding unsaturated soil hydraulic parameters has recently been discussed by Hollenbeck and Jensen (1998). The population of points is partitioned into q complexes, and in each complex j (j = 1,2,...q) a parallel sequence is launched from the point that exhibits the highest posterior density. A new candidate point in each sequence j is generated using the current draw for that particular sequence in combination with the parameter covariance structure induced in the corresponding complex j. The Metropolis-annealing (Metropolis et al., 1953) criterion is used to test whether the candidate point should be added to the current sequence or not. Subsequently, the new candidate point randomly replaces a member of the current complex j using a trapezoidal weight distribution. Finally, after a prespecified number of iterations, the complexes are mixed and new complexes are formed through a process of shuffling. This procedure enhances survivability by a sharing of information about the search space, gained independently by each parallel sequence. This series of operations results in a robust MCMC sampler that conducts a robust and efficient search of the parameter space. Convergence of the SCEM-UA algorithm to a stationary posterior target distribution is monitored by comparing the mean and variance within and between the different q generated parallel sequences with respect to each of the parameters (see Gelman and Rubin, 1992),

[10]
where SR is the scale reduction convergence diagnostic, N is the number of iterations within each sequence, B is the variance between the q sequence means, and W is the average of the q within-sequences variances for the parameter under consideration respectively. A score of 1 for indicates convergence. However, as a score of unity is difficult to achieve, Gelman and Rubin (1992) recommended using values <1.2 to declare convergence to a stationary distribution. For more information about the SCEM-UA algorithm, please refer to Vrugt et al. (2002b).

Diagnostic Measures of Parameter Identifiability in Nonlinear Models

In view of the complicated nonlinear nature of soil hydrologic models, it is evident that a classical, traditional first-order approximation of parameter uncertainty and its antithesis, parameter identifiability, of the various parameters is often not adequate (Vrugt and Bouten, 2002). Hence, besides exhibiting strong and nonlinear parameter interdependence, the surface of the posterior parameter distribution can deviate significantly from the multi-normal distribution depicted in Fig. 1. Since the SCEM-UA algorithm performs a thorough exploitation of the global parameter space and therefore explicitly accounts for parameter interdependence and nonlinearity of the employed hydraulic functions, the algorithm is suited to generate a useful description of parameter uncertainty and its antithesis, parameter identifiability. Diagnostic measures of parameter identifiability can be estimated by computing various statistical measures of the multivariate posterior distribution of the parameters (i.e., mean, standard deviation, and covariance structure). Moreover, the shape of the univariate posterior probability plots in the vicinity of the optimal solution reveals important information about the uniqueness of the final estimates.

Review of Water Retention and Unsaturated Hydraulic Conductivity Models

Several functions have been proposed to empirically describe the soil water retention curve and unsaturated soil hydraulic conductivity. The unsaturated soil hydraulic properties, {theta}(h) and K(h) are in general highly nonlinear functions of the pressure head h. In the following we briefly review three different parametric models for the hydraulic properties (Brooks and Corey, 1964; van Genuchten, 1980; Kosugi, 1996, 1999).

Brooks and Corey Model
Brooks and Corey (1964) suggested a water retention model (the BC model) in which the water content is expressed as a power function of the soil water pressure head,

[11a]
where {theta} (L3 L-3) is the volumetric water content, h is the soil water pressure head (L), hb is the bubbling pressure (L), and {lambda} is a unitless soil characteristic parameter.

When combining the Brooks and Corey model with the capillary model of Mualem (1976), the hydraulic conductivity functions are obtained,

[11b]
where Ks denotes the saturated hydraulic conductivity (L T-1) and lbc is a unitless pore-connectivity parameter.

Mualem–van Genuchten Model
One of the most commonly used models of capillary pressure head–saturation is the water retention function of van Genuchten (1980), who used the statistical pore-size distribution model of Mualem (1976) to obtain a predictive equation for the unsaturated hydraulic conductivity function in terms of water retention parameters.


[12a]

[12b]
where {alpha} is related to the inverse of the air-entry pressure (L), n is a unitless measure of the pore-size distribution, m is related to n by m = 1 - 1/n (van Genuchten, 1980), and lvg is a unitless pore-connectivity parameter.

Lognormal Pore-Size Distribution Model of Kosugi
Applying a lognormal distribution law to the probability density function of the soil pore radius, Kosugi (1996) derived the following soil water retention model:

[13a]
in which, h0.5 (L) is the capillary pressure head for which the saturation fraction equals 0.50 and {sigma} (-) is the standard deviation of the pore-size distribution. Recently, Kosugi (1999) derived a general model for unsaturated hydraulic conductivity for soils that obey lognormal pore-size distribution,

[13b]
where, lk is a parameter related to the soil pore tortuosity (-).

In conclusion, the different soil water retention functions contain four parameters: the saturated ({theta}s) and residual water content ({theta}r), a pore throat size parameter (hb, {alpha}, and h0.5), and a parameter that relates the width of the pore radius distribution ({lambda}, n, and {sigma}). The predictive equations for K(h) introduce two additional parameters, the saturated conductivity (Ks) and a pore-connectivity parameter (lbc, lvg, and lk).

CASE STUDIES

We applied the SCEM-UA algorithm to infer the identifiability of soil hydraulic parameters in the different parametric model structures for two case studies with increasing complexity. The first case study considers assessing diagnostic measures of parameter identifiability using a set of soil water retention data. In this study we are especially concerned with the relationship between the soil type under investigation and the uncertainty associated with the retention parameters in each of the parametric model structures. The second case study involves the identification of the posterior uncertainty of the soil hydraulic parameters using three transient laboratory outflow experiments with decreasing flow rate on an identical soil sample. In this study we explore whether the identifiability of the hydraulic parameters is related to the size, consecutive order, and number of pressure steps applied in the outflow experiment.

Soil Water Retention Curves of a Sandy, Loamy, and Clayey Soil
Soil water retention measurements were taken from the UNSODA soil hydraulic properties database (Leij et al., 1996). Experimental data were selected for a sandy soil (4520), sandy loam (4130), and clayey soil (2362) to obtain a range from coarse- to fine-textured soils. The number in parentheses refers to the UNSODA code. The error standard deviation of the water content measurements in Eq. [8] was set identical to 0.01 (m3 m-3), which is a conservative estimate of the error of the observed water contents using a pressure plate experiment.

Figure 2 presents the prediction uncertainty intervals of the soil water retention curves for the BC (Fig. 2A), VG (2B), and KS (2C) parametric models associated with the uncertainty in the parameter estimates (blue region) for the sandy and clayey soil. The red symbols correspond to the measured retention data, whereas the most likely parameter set is indicated with the dark line. The most likely parameter values for the different soils as well as their final root mean square error (RMSE) and model adequacy are listed in Table 1. In general, the water retention fit was generally very good for all soils and all parametric models. As the VG and KS do not account for the bubbling pressure but do have an inflection point, those models performed better than the BC model for many soils, particularly for data near saturation (Kosugi, 1996; van Genuchten and Nielsen, 1985). Hence, the discontinuity in the slope of the water retention curve in the latter model formulation is in most cases not clearly visible in the experimental data. The prediction uncertainty ranges bracket the observed retention data for the entire range of pressure heads, confirming the adequacy of the retention models (see Table 1). It is interesting to note that for more fine-textured soils, the most likely parameter set typically evolved well removed from the center of the uncertainty bounds, suggesting a skewed distribution for at least one of the retention parameters in the parametric models. This is demonstrated in Fig. 3, which presents a histogram of the SCEM-UA sampled {lambda}, n, and h0.5 parameters for the clayey soil.



View larger version (27K):
[in this window]
[in a new window]
 
Fig. 2. Prediction uncertainty intervals of the Brooks and Corey , Mualem–van Genuchten, and Kosugi retention curves associated with the uncertainty in the posterior parameter estimates for the sandy and clayey soil. Measured retention points are indicated with the red symbols, whereas the dark line indicates the most likely parameter set.

 

View this table:
[in this window]
[in a new window]
 
Table 1. Optimized soil hydraulic parameters of the sandy, sandy loam, and clayey soil for the selected parametric models. Also included are the RMSE values and model adequacy values at the final solution (in parentheses).

 


View larger version (18K):
[in this window]
[in a new window]
 
Fig. 3. Univariate posterior probability distributions of the parameters (A) {lambda}, (B) n, and (C) h0.5 in the Brooks and Corey , Mualem–van Genuchten, and Kosugi parametric models for the clayey soil. The symbol p(ß) along the y-axis in each of the figures denotes the posterior density.

 
Table 2 presents the most likely parameter set, posterior mean, standard deviation, and coefficient of variation derived from the SCEM-UA samples for the parameters in the retention models of BC, VG, and KS, along with the correlation structure induced between the joint distributions of the retention parameters for the sandy, sandy loam, and clayey soils. Given the same experimental range of pressure heads, Table 2 demonstrates that the uncertainty in the retention parameters will generally increase with increasing water holding capacity of the soil (i.e., compare standard deviations and CV values of parameters between soils) because of the lack of measurements in the drier water content range for the more fine-textured soils. This is depicted in Fig. 4, which illustrates how the values of the first-order sensitivity coefficients to the various water retention parameters of the different soils in the VG model behave across a broad range of pressure head values.


View this table:
[in this window]
[in a new window]
 
Table 2. Most likely, posterior mean (Mean), standard deviation (SD) and coefficient of variation (CV, %) for the BC, VG, and KS model parameters derived using the SCEM-UA algorithm. The structure induced in the joint distribution of the SCEM samples is indicated with the correlation coefficient between the parameters.

 


View larger version (22K):
[in this window]
[in a new window]
 
Fig. 4. Sensitivity of water content to the retention parameters (A) {theta}s, (B) {theta}r, (C) {alpha}, and (D) n in the parametric model of Mualem–van Genuchten for the sandy soil, sandy loam, and clayey soil.

 
Figure 4 demonstrates that experiments that yield a wide range of water contents or pressure heads are beneficial for parameter estimation studies, since the measurements then contain independent information for most of the parameters. This increases the identifiability of the parameters and enhances the likelihood of uniqueness of the final parameter estimates. In general, the pressure head at which maximum sensitivity to the parameters {alpha}, n, and {theta}r occurs shifts toward higher pressure head values with increasing water holding capacity of the soil. Especially for fine-textured soils, it is to be expected that estimated {theta}r values are not very accurate because of the low sensitivity of this parameter in the experimental water content range that is far from the residual water content (see Table 2). Moreover, Fig. 4A, 4B, 4C, and 4D show that the highest sensitivities for the clayey soil, occur in the same measurement range, thereby explaining the strong correlation structure between the retention parameters {theta}r and n found in Table 2 and the large uncertainties associated with the residual water content for fine-textured soils.

Table 2 demonstrates that the standard deviations and CV values of the different parameters not only exhibit large differences within one model structure, but also show clear differences among identical counterparts of model structures. More precisely, while CV values for the saturated water content are nearly identical for all three parametric models, clear differences in CV values are found among the pore throat parameters ({alpha}, hb, and h0.5), the residual water content, and the parameters relating the width of the pore-size distribution (n, {lambda}, and {sigma}) in each of the parametric models. Whereas these differences are generally small in the case of the sandy and sandy loam soil, they become quite large for more-fine textured soils. For instance, for the clayey soil, the CV values of the shape parameters h0.5 and {sigma} in the KS model are significantly larger than their corresponding counterparts in the retention models of Brooks and Corey and van Genuchten. To understand why, consider Fig. 5, which presents a scatter plot of 4000 {theta}rhb (Fig. 5A), hb{lambda} (5B), {theta}r{lambda} (5C), {theta}r{alpha} (5D), {alpha}n (5E), {theta}rn (5F), {theta}rh0.5 (5G), h0.5{sigma} (5H), and {theta}r{sigma} (5I) SCEM-UA samples for the clayey soil from the different parametric models. Figure 5 depicts that for the clayey soil, the correlation coefficients between the parameters are generally higher in the KS model (Fig. 5G–5I) as compared with the BC model (Fig. 5A–5C) and VG model (Fig. 5D–5F). More specifically, while in the BC and VG models, only the parameters {theta}r{lambda} (Fig. 5C) and {alpha}n (Fig. 5F) are strongly correlated for the clayey soil, in the case of the KS model there are three pairs of parameters that show considerable correlation, {theta}rh0.5 (Fig. 5G), h0.5{sigma} (Fig. 5H), and {theta}r{sigma} (Fig. 5I). The strong hyperbolic-shaped correlation structure associated with the parameters {theta}r, h0.5, and {sigma} of the KS model not only decreases the identifiability of the parameters, but also enhances nonuniqueness problems of the final optimized parameters. Although the results presented here are restricted to two different sets of retention observations (sandy and clayey soil), analyzing other sets of water retention data of the UNSODA database yielded a similar picture (i.e., for more fine-textured soils). The correlation between the retention parameters in the posterior distribution is generally stronger for the KS model than for the BC and VG models.



View larger version (44K):
[in this window]
[in a new window]
 
Fig. 5. Scatter plot of 4000 combinations of (A) {theta}rhb, (B) hb{lambda}, (C) {theta}r{lambda}, (D) {theta}r{alpha}, (E) {alpha}n, (F) {theta}rn, (G) {theta}rh0.5, (H) h0.5{sigma}, and (I) {theta}r{sigma} parameters sampled for the clayey soil using the SCEM-UA algorithm for the BC model (A–C), VG model (D–F), and KS model (G–I), respectively.

 
The main reason that the retention parameters are more correlated in the KS model than in the BC and VG retention models in the case of limited information in the dry water content range is that the pore throat size parameter h0.5 in the KS model is defined more closely to the residual water content than identical counterparts (hb and {alpha}) in the BC and VG retention models. Hence, the parameter h0.5 is defined as the pressure head value at which {theta} = ({theta}s + {theta}r)/2, while the bubbling pressure hb in the BC model and (1/{alpha}) in the VG model are directly related to the air-entry value of the soil, in the water content range close to saturation. Consequently, with limited information in the dry water content range, this will automatically lead to high correlation between the parameters {theta}r, h0.5, and {sigma} in the KS model. Although not explicitly presented in this paper, if one would introduce a similar h0.5 parameter in the BC and VG models, which replace the existing hb and {alpha} parameters, respectively, then identical correlation structures are found between the parameters in these models as found for the KS-model.

Notwithstanding, the statistical moments of the posterior distribution of the parameters presented in Table 2, Fig. 4, and Fig. 5 demonstrate that there is significant uncertainty associated with some of the retention parameters for the sandy loam and clayey soil. For example, Fig. 5 depicts that for the clayey soil the posterior distribution for the parameters {theta}r, h0.5, and {sigma} extend over the entire prior defined range for this parameter, clearly implying a lack of uniqueness. Although the individual parameter uncertainty will hardly directly affect the output of a Richards type of hydrologic model, since the retention parameters are always evaluated in a combined manner with the hydraulic model, their large uncertainty will inhibit our ability to find pedotransfer functions relating these parameters to easily measurable basic soil properties. Hence, for the more fine-textured soils (e.g., Fig. 5) it is typically difficult to select a single well-defined set of retention parameters suited for use in pedotransfer functions. Therefore, one should be especially careful in selecting a single representation of the hydraulic parameters for use in pedotransfer functions without knowledge of the underlying posterior uncertainty associated with these parameters (Schaap et al., 1998; Wösten et al., 2001). Uniqueness of the parameters is a prerequisite to our being able to successfully find pedotransfer functions.

The results in this section illustrate two additional important findings. First, given the same experimental data, the CV values and scatter plots of the sampled parameters illustrate how the choice of a model, and its characterization of the soil water retention curve, influences the identifiability of the parameters within the structure. Second, with recourse to the quality of the fit only (e.g., Table 1), the competing parametric models of VG and KS appear to fit the experimental data equally well. However, when examining other aspects, such as the multivariate correlation structure as inferred from the joint distributions of the SCEM-UA generated samples (Fig. 5), it becomes clear that there are drawbacks associated with the correlation structure of the parameters in the Kosugi model for fine-textured soils. It is clear that when possessing comparable predictive capabilities, the nonlinear model that most closely approaches linear behavior is favorable. Besides identifiability problems of the parameters, for linear models fewer iterations are necessary to achieve convergence in parameter estimation, and traditional computationally undemanding first-order statistical inferences will be more valid.

Transient Laboratory Outflow Experiments
Laboratory outflow experiments are now routinely used to simultaneously estimate the water retention and soil hydraulic conductivity characteristics from a single experiment (Hopmans et al., 2002). Because of the popularity of the outflow method, it is important to examine the influence of the boundary conditions on the final identifiability of the soil hydraulic parameters. From this perspective Wildenschild et al. (2001) recently performed one-step and multistep outflow experiments on identical, disturbed soil samples to evaluate the influence of drainage rate on water entrapment and thus on the estimated retention and unsaturated soil hydraulic conductivity curves. Because the focus of their study was a single best set of hydraulic parameters describing the outflow dynamics, no attention was given to the statistical uncertainty of the final parameter estimates. This dataset constitutes a unique opportunity to investigate whether the identifiability of the hydraulic parameters is related to the size, consecutive order, and number of pressure steps applied in the outflow experiment. We revisited the data for the loamy (Columbia) soil, consisting of a one-step (0–500 mbar), and two multistep experiments (0–250–500 mbar, 0–125–250–375–500 mbar). For more information about the experimental setup and soil, please refer to Wildenschild et al. (2001).

The unsaturated soil hydraulic parameters ({theta}r, hb, {lambda}, Ks, lbc), ({theta}r, {alpha}, n, Ks, lvg), or ({theta}r, h0.5, {sigma}, Ks, lk) for the BC, VG, and KS models were inversely estimated using the HYDRUS-1D model (Simunek et al., 1998b), which numerically solves Richards' equation in one dimension using a Galerkin-type linear finite element scheme. The saturated water content ({theta}s) was fixed at its measured value of 0.45 m3 m-3 for the Columbia soil. As we assumed a joint parametric description of the retention and unsaturated hydraulic conductivity curves, inversion of the Richards equation yielded parameter estimates that apply to both curves simultaneously.

Since there is no unambiguously correct way in which to select an appropriate value of the error variance {sigma}T in Eq. [8], two different scenarios were used to explore the identifiability of the soil hydraulic parameters using transient laboratory outflow experiments. In the first scenario (Scenario I), the error standard deviation of the outflow and pressure head readings were set identical to their measurement errors of 0.011 and 1 cm, respectively, reported by Hopmans et al. (2002). Using this kind of approach to transient unsaturated outflow observations, Hollenbeck and Jensen (1998) demonstrated that none of the model simulation results was within the measurement error bracket of the outflow observations, suggesting the presence of structural model inadequacies. To explicitly account for this structural error in the second scenario (Scenario II), the error variances of the different measurement types were set to the RMSE values of the best fits reported by Wildenschild et al. (2001). In this approach the error deviation of the measurements, {sigma}T in Eq. [8], is the sum of a measurement error term ({sigma}E) and a term accounting for structural model errors ({sigma}M).

Unfortunately, performing a global sampling of the feasible parameter space with the SCEM-UA algorithm will lead to physically unrealistic parameter combinations, thereby causing convergence problems of the solution of the Richards equation implemented in the HYDRUS-1D code. These convergence problems were so severe for the BC model (also caused by the discontinuity in the description of the soil water retention and hydraulic conductivity characteristics), that almost all of the outflow simulations were corrupted with large numerical errors, regardless of the settings of the numerical scheme. Consequently, the remainder of this discussion will focus on only the VG and KS parametric models.

Figure 6A and 6B illustrate the evolution of the Gelman–Rubin convergence statistic for the parameter {theta}s in the VG and KS parametric model structures during optimization iterations. Additionally, Fig. 6C and 6D present the evolution of the current best RMSE values found by the SCEM-UA algorithm for the observed outflow measurements and pressure head readings in the soil core. The error bars depict the standard deviation in the convergence diagnostic and RMSE values derived when running 10 independent trials with the SCEM-UA algorithm for the multistep (0–125–250–375–500 mbar) experiment for both parametric models. Due to random initializations in the apriori defined parameter space, the individual generated sequences tend to occupy different regions of the posterior surface at early stages during the evolution. This low mixing of the sequences is associated with a relatively high value of the Gelman–Rubin statistic, indicating no convergence. Thereafter, the convergence statistic for {theta}s narrows very quickly, suggesting convergence to a stationary posterior distribution after approximately 2000 HYDRUS-1D evaluations for both parametric model structures. Additionally, the results in Fig. 6C and 6D illustrate that the best RMSE values of the VG and KS model for the outflow and pressure head readings are substantially larger than the precision of the measurement devices, confirming the presence of structural model inadequacies.



View larger version (30K):
[in this window]
[in a new window]
 
Fig. 6. Evolution of the Gelman–Rubin convergence statistic for the parameter {theta}s in the (A) Mualem–van Genuchten and (B) Kosugi parametric model structures. Evolution of the current best RMSE value describing either observed outflow dynamics or matric head readings within the soil core for the (C) Mualem–van Genuchten and (D) Kosugi models.

 
Table 3 summarizes the posterior mean, standard deviation, and coefficient of variation of the various hydraulic parameters in the VG and KS parametric model structures for the different outflow experiments and scenarios. The results presented in this table illustrate several important findings:
  1. For both scenarios, the selected set of hydraulic parameters in both parametric model structures was well identifiable by calibration to measured outflow dynamics augmented with pressure head data in the soil core.
  2. The drainage rate imposed during the outflow experiment marginally influenced the final posterior uncertainty associated with the hydraulic parameters.
  3. The competing model structures appeared to yield fairly identical CV values of the parameters, suggesting that none of the parametric model structures is superior in parameter identifiability.


View this table:
[in this window]
[in a new window]
 
Table 3. Posterior mean, standard deviation, and coefficient of variation for the VG and KS model parameters using the SCEM-UA algorithm with two different scenarios for the error deviations of the measurements.

 
This finding for this loamy soil type confirms the results in Table 2 for the sandy and sandy loam soil types. Notice that the posterior uncertainty associated with the parameters has increased if the model error is included in the error analysis.

Inspection of the multivariate posterior distribution of the soil hydraulic parameters in both model structures for both scenarios revealed the existence of a single well-defined mode for each of the parameters. Examination of the correlation structure induced in the joint distributions of the sampled parameter sets demonstrated that the correlation between the parameters was typically low (R < 0.60) for all outflow experiments, with the exception of the {theta}rn (Fig. 7A), Kslvg (Fig. 7B), {theta}r{sigma} (Fig. 7C), and Kslk (Fig. 7D) parameters in the VG (Fig 7A and 7B) and KS (Fig. 7C and 7D) parametric models, respectively. A similar correlation structure between the retention parameters for more fine-textured soils, as illustrated in Table 2 and Fig. 5 is observed for the loamy soil in the outflow experiments. This suggests that the joint appearance of the parameters n and {sigma} in the retention and hydraulic conductivity curve does not influence the joint posterior distribution of these parameters.



View larger version (25K):
[in this window]
[in a new window]
 
Fig. 7. Scatter plot of 1500 combinations of (A) {theta}rn, (B) Kslvg, (C) {theta}r{sigma}, and (D) Kslk parameters sampled for the one-step outflow experiment using the SCEM-UA algorithm in the case of Scenario II for the Mualem–van Genuchten model (A–B) and Kosugi model (C–D), respectively.

 
Figure 8 depicts how the posterior uncertainty associated with the soil hydraulic parameters for the one-step outflow experiment translates into uncertainty associated with the retention and unsaturated soil hydraulic conductivity curves of the VG and KS parametric models. The purple and blue regions correspond to the uncertainty intervals for Scenarios I and II, respectively, whereas the directly estimated retention and hydraulic conductivity points are indicated with the red symbols. Including the model error term in the error deviations of the measurements (blue region) significantly increases the size of the Bayesian confidence intervals for the retention and hydraulic conductivity curves, thereby improving the fit to the directly estimated retention points. Note, however, that the fit to the directly estimated hydraulic conductivities is quite poor for both scenarios, partly because of the correlation structure between Ks and l. It suggests that further attempts to improve the combined Richards–soil hydraulic property models would be most productive if focused on the validity of model assumptions.



View larger version (27K):
[in this window]
[in a new window]
 
Fig. 8. Prediction uncertainty intervals for the retention and hydraulic conductivity curves associated with the uncertainty in the parameter estimates for Scenarios I (purple) and II (blue) for the multistep (0–125–250–375–500 mbar) outflow experiment using the (A,C) Mualem–van Genuchten and (B,D) Kosugi parametric models. The red symbols correspond to the directly estimated retention and hydraulic conductivity points, whereas the dark line denotes the most likely parameter set for Scenario II.

 
This is also demonstrated in Fig. 9, which shows prediction uncertainty intervals for the HYDRUS-1D (VG and KS) simulated outflows associated with the posterior uncertainty of the parameter estimates for Scenario I and II (purple and red regions) and the residual model uncertainty (blue region), for the multistep (0–125–250–375–500 mbar) outflow experiment. The black symbols correspond to the observed outflow data. Neglecting the presence of model error in the analyses of the posterior uncertainty of the parameter estimates leads to small prediction uncertainty ranges that only partly bracket the observed outflow. When explicitly accounting for the model structural error in the Bayesian inference scheme, the prediction uncertainty ranges of the HYDRUS-1D simulations increase, thereby including most of the measured outflow (and pressure head readings) for both models. However, for both scenarios, the prediction uncertainty associated with the posterior parameter estimates did not bracket the observations in the beginning of the outflow experiment, indicating a systematic model error. The relatively large size of the residual model uncertainty as compared with the residual parameter uncertainty for both scenarios demonstrates that either the model structure is in need of further improvements, or the assumed experimental errors are larger than assumed.



View larger version (26K):
[in this window]
[in a new window]
 
Fig. 9. (A–B) Outflow prediction uncertainty ranges of the HYDRUS-1D model associated with the uncertainty in the model (blue) and parameter estimates for Scenarios I (purple) and II (red) of the Mualem–van Genuchten and Kosugi parametric models for the multistep (0–125–250–375–500 mbar) outflow experiment (B–D). Uncertainties in the simulated outflow associated with the most probable parameter set of the Mualem–van Genuchten and Kosugi models. The blue region denotes model uncertainty, whereas the residual parameter uncertainty is indicated with the purple (Scenario I) and red regions (Scenario II). The black symbols correspond to the observed outflow data.

 
Comments about Model Adequacy Testing
It has been suggested by Hollenbeck and Jensen (1998) that a predictive model must be adequate before meaningful statements can be made about the final parameter estimates and their posterior uncertainty. We agree with their conclusion. According to the classical statistical theory for the fitting of empirical models to data, a model is adequate when the average residual error is smaller than the precision of the measurement devices. For the transient outflow experiments, these considerations lead us to conclude that the combined Richards–hydraulic model structure should be rejected because the residual of the best attainable fit is much larger than the error of the measurement devices (Fig. 6). Our experience with the calibration of hydrologic models in a broader sense suggests that the majority of these models will fail this adequacy test since the residuals after calibration are usually substantially larger than the errors of the measuring devices. However, this does not necessarily mean that the final parameter estimates in these models and their confidence intervals are meaningless; these models might possess excellent predictive capabilities. For instance, closer inspection of Fig. 9 suggests that the residual errors are typically large close to saturation, but decrease in the dry range of the outflow experiment. We argue therefore that one cannot draw strong conclusions about the adequacy of a model structure by using a single overall statistic that aggregates residual errors across a large range of hydrological behavior and measurement types. Moreover, it is our opinion that the testing of the adequacy of a model structure should include explicit recognition of the role of model error.

SUMMARY AND CONCLUSIONS

The quality of a parametric model structure is primarily determined from the ability of the model structure to fit the measurements. We have suggested that the choice of a suitable parametric model must include explicit recognition of the identifiability of the parameters within its model structure. Hence, "good" parameter identifiability is of utmost importance, if we want to be able to successfully determine and apply pedotransfer functions.

Measured water retention characteristics from the UNSODA database and three laboratory outflow experiments served to illustrate the quality and efficacy of three commonly used parametric model structures of soil hydraulic properties to infer soil physical parameters from laboratory experiments. The study considered estimation of parameter identifiability within the different model structures using sets of water retention observations ranging from a coarse- to a fine-textured soil. In general, uncertainty in the retention parameters increased with increasing water holding capacity of the soil. Consequently, for more fine-textured soils, the poor identifiability of some of the retention parameters will inhibit our ability to relate these model parameters to other easily measurable soil properties. Most significantly, using the SCEM-UA algorithm, CV values and scatter plots of the sampled parameters demonstrated how the choice of a model structure, characterizing the soil water retention curve, influenced the identifiability of the parameters.

The second case study examined the influence of the drainage rate and imposed boundary conditions during a laboratory transient outflow experiment on the identifiability of the soil hydraulic parameters. Results demonstrated that the total set of hydraulic parameters in each of the parametric models was well determined from calibration to outflow data and that there was no significant influence between the applied boundary conditions in the outflow experiment and the posterior uncertainty associated with the hydraulic parameters. Most significantly, the relatively large size of the model structural error as compared with the residual parameter uncertainty for the transient outflow experiments demonstrated that the model is deficient at near saturation and may need improvement. As a whole, the parameter identifiability analyses presented here provide useful information about the limitations of a model and will help in the selection of an adequate model structure.

Finally, attention is drawn to the free availability of software implementing the Shuffled Complex Evolution Metropolis, PIMLI, and Metropolis algorithm presented in this and our other work. The software is written in the MATLAB environment and can be obtained from the first author.

ACKNOWLEDGMENTS

The Earth Life Sciences and Research Council (ALW) supported the investigations of the first author with financial aid from the Netherlands Organization for Scientific Research (NWO).

REFERENCES