Published online 24 August 2006
Published in Vadose Zone J 5:951-962 (2006)
DOI: 10.2136/vzj2005.0130
© 2006 Soil Science Society of America
677 S. Segoe Rd., Madison, WI 53711 USA
SPECIAL SECTION: PARAMETER IDENTIFICATION AND UNCERTAINTY ASSESSMENT IN THE UNSATURATED ZONE
Evaluation of Model Complexity and Input Uncertainty of Field-Scale Water Flow and Salt Transport
G. Schoups and
J. W. Hopmans*
Hydrology Program, Department of Land, Air and Water Resources (LAWR), University of California, Davis, CA 95616, USA; G. Schoups, Currently at Department of Geological and Environmental Sciences, Stanford University, Stanford, CA 94305, USA
* Corresponding author (jwhopmans{at}ucdavis.edu)
Received 14 November 2005.
ABSTRACT
Prediction of large-scale vadose zone water flow and salt transport is affected by errors from two sources: uncertainty in model structure, that is, the complexity with which hydrologic processes are represented, and uncertainty in model input, such as parameter values, boundary conditions, and initial conditions. Selection of an appropriate level of model complexity must consider these two sources of uncertainty. We illustrate this selection process for the prediction of field-scale crop transpiration and salt drainage in irrigated agriculture given a limited number of point-scale measurements. Various levels of model input uncertainty in hydraulic properties and infiltration rates are considered, representative of a range of spatial heterogeneities. Model complexity is defined relative to a "true" model, in terms of the level of spatial and temporal averaging used in the approximate model. This set-up allows for the separation of the total model error into different terms representing the model input error and the structural model error. Results show that the relative contribution of structural model error to the total model error decreases as spatial heterogeneity or uncertainty in the model input increases. Model input uncertainty may be reduced by taking a larger number of point-scale measurements. Our analysis further illustrates that there exists an optimal trade-off between model complexity and model input uncertainty. It suggests that complex models may be replaced by simpler ones as long as the resulting structural model error is smaller than the prediction errors due to input uncertainty. The methodology provides a straightforward and objective approach to identifying an optimal level of model complexity as a function of the degree of field-scale heterogeneity and data availability.
Abbreviations: ADE, advection-dispersion equation; pdf, probability density function; RE, Richards equation
INTRODUCTION
TWO FUNDAMENTAL PROBLEMS complicate large-scale (field to regional) modeling of variably saturated subsurface flow and transport. First, no fundamental "laws" have been identified that describe flow and transport at the field scale. Although Darcy's Law, the Richards equation (RE), and the advection-dispersion equation (ADE) have been shown to be valid at the local (laboratory) scale, their application at the field scale is debatable, for example in the presence of preferential flow and transport (Šimůnek et al., 2003). Second, parameterization of these models is hampered by the tremendous spatial heterogeneity of the subsurface (e.g., soil unsaturated hydraulic conductivity) and the spatial and temporal variations of boundary conditions such as infiltration and evapotranspiration rates. Therefore, large-scale modeling results will be erroneous due to uncertainties in model structure and model input. In the following, we will refer to uncertainty in model structure as "structural uncertainty." This also includes errors associated with spatial and temporal discretization. Uncertainty associated with the model input, which includes parameters, boundary conditions, and initial conditions, will be referred to as "model input uncertainty." A third source of uncertainty is related to measurement errors of the system state. All these uncertainties lead to discrepancies between simulated and observed states, resulting in a total model error:

$$\varepsilon_t = \varepsilon_o + \varepsilon_i + \varepsilon_s \qquad [1]$$

where $\varepsilon_o$ represents the observation error when measuring the state of the system (e.g., soil moisture content), $\varepsilon_i$ is the error due to model input uncertainty, and $\varepsilon_s$ is the error due to structural uncertainty. Theoretically, we expect $\varepsilon_s$ to decrease when model complexity increases, such as by adding additional relevant hydrological processes or increasing the spatial and temporal discretization of the numerical simulation model. Similarly, we may also assume that $\varepsilon_i$ decreases as more data are collected. However, in reality, a trade-off exists between the effects of data availability (inversely related to $\varepsilon_i$ and $\varepsilon_o$) and model complexity (inversely related to $\varepsilon_s$) on predictive performance (Grayson and Blöschl, 2000). This is illustrated schematically in Fig. 1. For a given amount of available data, an optimum model complexity exists. If the model is too complex, parameter identifiability may be difficult, whereas on the other hand the available data are not fully exploited by models that are too simple. As indicated in Fig. 1, this relationship between data availability and model complexity can be formulated in terms of uncertainties; for example, large parameter uncertainty justifies increasing structural uncertainty (less complex model). More importantly, Fig. 1 suggests that changes in optimum model complexity as a function of data availability are related to the scale of the application. When simulating small-scale systems, such as in a laboratory soil column, data availability is usually large and parameter uncertainty is small, thereby justifying the use of a complex model. When moving to larger scales, data availability decreases and parameter uncertainty increases, with $\varepsilon_i$ and $\varepsilon_o$ dominating the total model error $\varepsilon_t$. In that case, a less complex model with larger structural uncertainty may be appropriate, as long as $\varepsilon_s$ remains small compared with $\varepsilon_i$ and $\varepsilon_o$. Large-scale model applications are generally prone to large measurement errors, because the observed state of the system usually must be estimated from a large number of point measurements (Hopmans et al., 2002), or must be inferred indirectly from remotely sensed data (e.g., Boegh et al., 2004).

Fig. 1. Hypothesized relation between data availability and model complexity, and similarly between structural and model input uncertainty.
Hence, the discussion points to the need for balancing the complexity of model components, recognizing that the model accuracy, as measured by the magnitude of $\varepsilon_t$, is determined by the weakest error component, that is, by either process representation ($\varepsilon_s$) or the availability of appropriate data to estimate the model input ($\varepsilon_i$) and the system state ($\varepsilon_o$). One approach to choosing an appropriate level of model complexity is to start off with the simplest model that can reproduce the data and to add additional processes only if it results in an improved fit to the data. The approach ensures convergence to an optimum level of model complexity, where the data are fully exploited and model complexity is minimized. The advantages of using the simplest possible model are: (i) it focuses on the most relevant hydrological processes; (ii) it simplifies interpretation of simulation results; (iii) it may result in faster computations, making it easier to assess parameter uncertainties; and (iv) it reduces the number of parameters that need to be estimated.
Since Fig. 1 suggests that model complexity and errors from model simplifications are a function of model input uncertainty, the objective of this study was to investigate the relative effects of model input and structural uncertainty on total model error. As was suggested earlier, such studies may lead to the identification of an optimum level of model complexity as a function of the level of model input uncertainty. Examples of this philosophy of simplifying process descriptions in view of significant heterogeneity or uncertainty in field and regional applications include the work by Bresler and Dagan (1983), van der Zee and Boesten (1991), Chen et al. (1994), de Wit and Pebesma (2001), Perrin et al. (2001), Schoups and Hopmans (2002), and Schoups et al. (2006).
First, the general framework for error analysis is presented and its application to the prediction of field-scale water flow and salt transport in irrigated agriculture is outlined. This is followed by a discussion of the results for two case studies, and a summary of our findings.
MATERIALS AND METHODS
Framework for Error Analysis
In this section the different types of errors are defined and the methods used for quantifying them are discussed. Let Y be the true value of a system state of interest (e.g., soil moisture content, crop yield), and $Y_o$ the corresponding observed value, then

$$Y_o = Y + \varepsilon_o \qquad [2]$$

where $\varepsilon_o$ is the observation error, which is the error of measuring the true state Y. At the local or point scale it is typically small and related to the accuracy of the measurement technique (e.g., measurement of moisture content in a soil core). At larger scales, however, direct observation becomes more difficult, and either many point-scale measurements are needed or an indirect technique such as remote sensing is used, leading to larger observation errors. The true state Y may also be compared with the output of a simulation model. Suppose that the true model of the system is known, then

$$Y = Y_s + \varepsilon_i \qquad [3]$$

where $Y_s$ is the state simulated with the true model, and $\varepsilon_i$ is the error due to model input uncertainty. It includes the effects of all errors in estimating the model input, generally the model parameter values, the initial conditions, and the boundary conditions. In the absence of input uncertainty, the true model reproduces reality exactly. In other words, it includes and describes all physical processes correctly. Typically the true model is not known, but only an approximate model is available, such that

$$Y_s = \hat{Y}_s + \varepsilon_s \qquad [4]$$

where $\hat{Y}_s$ is the state simulated with the approximate model, and $\varepsilon_s$ is the structural model error. For example, a relevant physical process not accounted for by the approximate model will introduce an error $\varepsilon_s$. On the basis of the three basic errors defined above (observation error $\varepsilon_o$, model input error $\varepsilon_i$, and structural model error $\varepsilon_s$), we can derive composite errors describing the discrepancy between the true state and the approximate model

$$Y - \hat{Y}_s = \varepsilon_i + \varepsilon_s \qquad [5]$$

as well as the discrepancy between the observed state and the approximate model

$$Y_o - \hat{Y}_s = \varepsilon_o + \varepsilon_i + \varepsilon_s \qquad [6]$$

where the last equation is basically the same as Eq. [1]. Since typically the true model is not known, the last two errors are usually lumped together in a general model error term. In our current analysis, we assume that the true model is known, so that the different error terms can be computed.
The general outline to compute the different error terms is shown in Fig. 2. Given the true model input vector x and the true model f, we can compute the true system state as Y = f(x). However, input x is typically not known exactly, and the uncertainty in the estimation or measurement of the model input x is typically quantified by treating x as a random vector characterized by a joint probability density function (pdf). Randomly drawing from this pdf yields an estimate $\hat{x}$, which will deviate from the true x. This uncertainty is propagated through the true model f, $Y_s = f(\hat{x})$, and through the approximate model $\hat{f}$, $\hat{Y}_s = \hat{f}(\hat{x})$. Based on the values for Y, $Y_s$, and $\hat{Y}_s$, the input and structural errors are calculated using Eq. [3] and [4]. In addition, the observation error is computed by drawing from a predefined pdf for $\varepsilon_o$. Repeated sampling from the pdfs of $\hat{x}$ and $\varepsilon_o$ generates a large number of values for $\varepsilon_o$, $\varepsilon_i$, and $\varepsilon_s$ from which sums of squared errors are calculated,

$$\mathrm{SSE}_k = \sum_{j} \varepsilon_{k,j}^2 \qquad [7]$$

for k equal to o, i, or s, with the sum taken over the repeated samples j. These SSE values normalized by the total SSE (SSEt = SSEo + SSEi + SSEs) provide insight into the relative contributions of the observation, model input, and structural model errors to the total model error. Note that this approach is like a traditional analysis of variance (ANOVA) used to evaluate the relative statistical significance of various factors. The main purpose of this paper is to illustrate how these relative contributions change as a function of model complexity and model input uncertainty. We will do this using a case study from irrigated agriculture, as discussed in the next section.
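The sampling procedure outlined above can be sketched in a few lines of Python. This is an illustration only, not the code used in the study: the functions `true_model` and `approx_model`, the Gaussian input and observation pdfs, and all numerical values are hypothetical stand-ins, and the sign convention (input error taken as Y minus Ys, structural error as Ys minus the approximate-model state) is an assumption.

```python
import math
import random

def true_model(x):
    # Hypothetical stand-in for the true model f (not from the paper).
    return 1.0 - math.exp(-x)

def approx_model(x):
    # Hypothetical approximate model: underpredicts the true response by 10%.
    return 0.9 * true_model(x)

def relative_sse(x_true, input_sd, obs_sd, n_samples=5000, seed=0):
    """Return the relative SSE contributions for k = 'o', 'i', and 's'."""
    rng = random.Random(seed)
    y_true = true_model(x_true)               # true state Y = f(x)
    sse = {"o": 0.0, "i": 0.0, "s": 0.0}
    for _ in range(n_samples):
        x_hat = rng.gauss(x_true, input_sd)   # uncertain estimate of x (assumed Gaussian)
        y_s = true_model(x_hat)               # true model, uncertain input
        y_hat = approx_model(x_hat)           # approximate model, uncertain input
        eps = {
            "o": rng.gauss(0.0, obs_sd),      # observation error drawn from its pdf
            "i": y_true - y_s,                # model input error (assumed sign)
            "s": y_s - y_hat,                 # structural model error (assumed sign)
        }
        for k in sse:
            sse[k] += eps[k] ** 2             # accumulate squared errors
    total = sum(sse.values())
    return {k: v / total for k, v in sse.items()}

if __name__ == "__main__":
    for sd in (0.05, 0.5):
        print(sd, relative_sse(x_true=1.0, input_sd=sd, obs_sd=0.01))
```

With these toy models the structural error is nearly constant, so its relative contribution shrinks as the input standard deviation grows, which mirrors the qualitative behavior examined in the case studies.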

Fig. 2. Schematic outline for calculating the different errors: observation error, model input error, and structural model error. f denotes the true or reference model, and $\hat{f}$ is the approximate model.
Field-Scale Water Flow and Salt Transport in Irrigated Agriculture
At the point scale, variably saturated water flow and salt transport are well understood and described by the Richards and advection-dispersion equations. However, prediction at the field scale is affected by uncertainty due to the large spatial heterogeneity in, for example, soil hydraulic properties and the difficulties with making direct measurements at the field scale. The spatial heterogeneity usually has to be inferred from a limited number of point-scale measurements, hence introducing uncertainty (Hopmans et al., 2002). Therefore, the problem discussed here is that of predicting field-scale water flow and salt transport given a number of point-scale measurements of the input vector x. In particular, we are concerned with predicting field-scale average (or total) crop yield and salt drainage for a heterogeneous field. The purpose is to demonstrate how uncertainties in the spatial variability affect the desired complexity of the model.
For illustration purposes the commonly used parallel-columns model is adopted to describe field-scale water flow and salt transport (e.g., Bresler and Dagan, 1983; Destouni and Cvetkovic, 1991; van der Zee and Boesten, 1991; Toride and Leij, 1996; Kavvas et al., 1996; Foussereau et al., 2000). It is assumed that the field can be conceptualized as a collection of one-dimensional vertical noninteracting streamtubes or columns, each with a set of depth-averaged parameters. Variably saturated water flow and salt transport through each column is described with a one-dimensional model and field-scale flow and transport is obtained by statistically averaging these point-scale solutions over the probability distribution of the model input. Its validity for shallow depths has been illustrated in two-dimensional numerical experiments (Russo, 1991) and in field experiments (Lee and Casey, 2005). Although in reality the problem is three-dimensional, this simplification suffices for our purposes. "True" and approximate models will therefore only differ in the way that they represent point-scale one-dimensional flow and transport within a single vertical column. With that in mind, the true value for the field-scale average crop yield or salt drainage is
$$Y = \int f(x)\,p(x)\,dx \qquad [8]$$

where f is the true point-scale model and p is the true pdf of the model input x, which describes the true spatial heterogeneity of x within the agricultural field. Spatial correlation is ignored in the present analysis, which is justified as long as the horizontal extent of the agricultural field is much larger than the horizontal spatial correlation of x. Equation [8] also assumes ergodicity; that is, the areal average equals the ensemble average. Accordingly, field-scale predictions with the true model $Y_s$ and with the approximate model $\hat{Y}_s$ are given by

$$Y_s = \int f(x)\,\hat{p}(x)\,dx \qquad [9]$$

and

$$\hat{Y}_s = \int \hat{f}(x)\,\hat{p}(x)\,dx \qquad [10]$$

where $\hat{f}$ is the approximate point-scale model and $\hat{p}$ is the experimental pdf of x estimated from n point-scale "measurements" of x, which are obtained by drawing n random samples from the true pdf p. Next, Eq. [8] through [10] are used to calculate the various error terms, namely the model input error $\varepsilon_i$ and the structural model error $\varepsilon_s$, as discussed earlier (Fig. 2). The difference with Fig. 2 is that (i) instead of generating a single random sample $\hat{x}$, n samples are generated and used to estimate $\hat{p}$, and (ii) the errors are not based on a single model run but instead on various point-scale model runs to compute field-scale behavior according to Eq. [8] through [10]. The "experiment" of taking n random point-scale measurements is repeated 10 000 times, and each time $\varepsilon_i$ and $\varepsilon_s$ are calculated. Finally, sums of squared errors, SSEi and SSEs, are computed and normalized by the total SSEi + SSEs to assess their relative contributions. For the sake of simplicity observation errors are ignored. The integrals in Eq. [8] through [10] are calculated using a fourth-order Runge-Kutta algorithm (Press et al., 1990).
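The repeated-measurement experiment may be sketched as follows. The sketch is hypothetical in several respects: the point-scale functions `f_true` and `f_approx` are toy stand-ins, the experimental pdf is taken to be the empirical distribution of the n samples (so that the field-scale integrals reduce to sample means rather than a Runge-Kutta quadrature), the reference value is approximated from one large sample, and fewer repetitions are used than the 10 000 of the study.

```python
import math
import random

def f_true(q):
    # Hypothetical true point-scale response to net infiltration q.
    return 1.0 - math.exp(-q)

def f_approx(q):
    # Hypothetical approximate point-scale model (10% underprediction).
    return 0.9 * f_true(q)

def field_scale_errors(mu_ln, sigma_ln, n, n_rep=2000, n_ref=100_000, seed=1):
    """Relative (SSE_i, SSE_s) for n point-scale measurements per experiment.

    mu_ln and sigma_ln are the mean and standard deviation of ln(q)
    for the true lognormal pdf p.
    """
    rng = random.Random(seed)
    # Reference field average Y, approximated by a large sample from p.
    y_true = sum(f_true(rng.lognormvariate(mu_ln, sigma_ln))
                 for _ in range(n_ref)) / n_ref
    sse_i = sse_s = 0.0
    for _ in range(n_rep):
        # One "experiment": n random point-scale measurements drawn from p.
        q = [rng.lognormvariate(mu_ln, sigma_ln) for _ in range(n)]
        y_s = sum(f_true(v) for v in q) / n      # true model, estimated pdf
        y_hat = sum(f_approx(v) for v in q) / n  # approximate model, estimated pdf
        sse_i += (y_true - y_s) ** 2             # model input error
        sse_s += (y_s - y_hat) ** 2              # structural model error
    total = sse_i + sse_s
    return sse_i / total, sse_s / total

if __name__ == "__main__":
    for n in (5, 20, 200):
        print(n, field_scale_errors(mu_ln=-0.3466, sigma_ln=0.8326, n=n))
```

In this toy setting, increasing n lowers the input error while leaving the structural error essentially unchanged, so the relative structural contribution rises with data availability.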
It is assumed that the main source of uncertainty is due to the spatial heterogeneity of net infiltration, caused by variations in soil hydraulic conductivity and nonuniform irrigation methods (Warrick and Gardner, 1983; Letey et al., 1984; Dagan and Bresler, 1988). We prescribe the pdf of net infiltration rates q0 directly, assumed to be lognormal with mean $\mu_{q0}$ and standard deviation $\sigma_{q0}$. In addition, spatially variable hydraulic properties are described by a lognormal distribution of scaling factors $\alpha$ that scale the pressure head h and the hydraulic conductivity K according to the Miller and Miller (1956) scaling relations:

$$h = \frac{h_{ref}}{\alpha} \qquad [11a]$$

$$K = \alpha^2 K_{ref} \qquad [11b]$$

where href and Kref represent reference water retention and hydraulic conductivity functions, respectively. Since variation in net infiltration is primarily caused by spatial heterogeneity in hydraulic properties (Hopmans, 1989), we further assume that net infiltration correlates perfectly with hydraulic conductivity, or
 | [12] |
Therefore, spatial heterogeneity is completely described by the lognormal pdf of net infiltration. Different levels of spatial heterogeneity are considered, with CVq0 ranging from 0 to 200%. Possible correlation between crop water stress parameters and soil hydraulic properties (Hupet et al., 2004) is not considered here.
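For readers who wish to reproduce the sampling of heterogeneous inputs, the following sketch converts a prescribed mean and CV into the parameters of the underlying normal distribution of ln(q0) and applies the Miller and Miller scaling in its usual form (h = href/α, K = α²Kref). The function names and the example values are hypothetical, and the link between q0 and the scaling factor is deliberately left out.

```python
import math
import random

def lognormal_params(mean, cv):
    """Return (mu, sigma) of ln(X) for a lognormal X with the given mean and CV."""
    sigma2 = math.log(1.0 + cv * cv)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)

def sample_infiltration(mean, cv, n, seed=0):
    """Draw n net infiltration values; cv is a fraction (1.0 means 100%)."""
    if cv == 0.0:
        return [mean] * n                  # homogeneous field
    rng = random.Random(seed)
    mu, sigma = lognormal_params(mean, cv)
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]

def miller_scale(h_ref, k_ref, alpha):
    """Scale reference pressure head and conductivity by the factor alpha."""
    return h_ref / alpha, alpha ** 2 * k_ref

if __name__ == "__main__":
    # Example values are illustrative only (seasonal mean 0.8 m, CV 100%).
    q0 = sample_infiltration(mean=0.8, cv=1.0, n=20)
    print(sum(q0) / len(q0))
    print(miller_scale(h_ref=-1.0, k_ref=2.0, alpha=2.0))
```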
Point-Scale Water Flow and Salt Transport in Irrigated Agriculture
As discussed in the previous section, the true and approximate models only vary in their description of point-scale water flow and salt transport. In this section we describe the different point-scale models used.
True or Reference Model
The true model is based on the solution of the one-dimensional Richards and advection-dispersion equations. The one-dimensional Richards equation is given by

$$\frac{\partial \theta}{\partial t} = \frac{\partial}{\partial z}\left[K\left(\frac{\partial h}{\partial z} - 1\right)\right] - r_w \qquad [13]$$

where h is the soil water pressure head (L), $\theta$ is volumetric water content (L³ L⁻³), K is hydraulic conductivity (L T⁻¹), t is time (T), z is depth below the land surface (L), and $r_w$ defines the root water uptake rate (1/T)

$$r_w = \beta(h,c)\,r(z,t)\,T_p \qquad [14]$$

where Tp is daily potential crop transpiration (L T⁻¹), r is the root distribution with depth and time (1/L), c is dissolved salt concentration (M L⁻³), and $\beta(h,c)$ accounts for water and salt stress effects on root water uptake. Here, a multiplicative parametric model for $\beta(h,c)$ is used (Table 1), which has been compared with salinity data by van Genuchten (1987). The root distribution r (Table 1) is described by the exponential model of Raats (1974), which was normalized here to ensure that its integration over the root-zone depth Lr equals one. The root distribution parameter $\lambda$ changes the distribution from uniform ($\lambda \to \infty$) to very shallow with all the roots near the surface ($\lambda \to 0$). Root development during the growing season is simulated using an S-shaped function (Table 1), which assumes that the rooting depth Lr increases from an initial (L0) to a maximum value (L). The root growth coefficient rg (1/T) describes the rate of root development. Finally, the dependence of K and h on $\theta$ in Eq. [13] is represented by the van Genuchten (1980) model, also shown in Table 1. The dissolved salt concentration c is obtained by solving the one-dimensional ADE:

$$\frac{\partial (\theta c)}{\partial t} = \frac{\partial}{\partial z}\left(\theta D \frac{\partial c}{\partial z}\right) - \frac{\partial (q c)}{\partial z} \qquad [15]$$

where D is the dispersion coefficient (L² T⁻¹), and q is the Darcy water flux (L T⁻¹).
Table 1. Parametric models for root water uptake and hydraulic properties. h50, ph, c50, and pc are water uptake stress parameters, Lr is rooting depth, $\lambda$ is a dimensionless root distribution parameter, L0 and L are initial and final (maximum) rooting depths, rg is a root growth coefficient (1/T), Tptot is seasonal transpiration (L), h is pressure head (L), s is saturation, $\theta$ is moisture content, $\theta_r$ is residual moisture content, $\theta_s$ is saturated moisture content, Ks is saturated conductivity (L T⁻¹), and n is a soil parameter.
The physical model described by Eq. [13] through [15] is solved for an entire crop growing season of length ts, with the spatial domain extending from the land surface (z = 0) to the bottom of the maximum root-zone depth (z = L). Initial (IC) and boundary conditions (upper, UBC; lower, LBC) for water flow and salt transport are specified as follows,

$$\begin{aligned}
\text{IC:}\quad & h(z,0) = h_i, \quad c(z,0) = c_i \\
\text{UBC:}\quad & q(0,t) = q_0(t), \quad \left[q c - \theta D \frac{\partial c}{\partial z}\right]_{z=0} = q_0 c_0 = S_0 \\
\text{LBC:}\quad & \left.\frac{\partial h}{\partial z}\right|_{z=L} = 0, \quad \left.\frac{\partial c}{\partial z}\right|_{z=L} = 0
\end{aligned} \qquad [16]$$

where hi and ci are uniform initial values for pressure head and salt concentration, q0 is the daily infiltration rate (L T⁻¹), c0 is salt concentration of the infiltrating water (M L⁻³), and S0 is incoming salt load (M L⁻² T⁻¹). For simplicity, it is assumed that evaporation is much smaller than crop transpiration, and can be ignored. The lower boundary condition in Eq. [16] assumes gravity drainage (zero soil water pressure head gradient) and a zero concentration gradient, as in the case of a deep water table. This means that the drainage rate qL equals K(hL), where hL is the simulated pressure head at depth L, and the drainage salt load SL equals qLcL, where cL is the simulated salt concentration at depth L. The model domain is discretized into 1-cm increments and the model is numerically solved using the HYDRUS code (Šimůnek et al., 1998), yielding values for h, $\theta$, and c as a function of depth z and time t. Variables of practical interest calculated from this are seasonal relative crop transpiration Tr, and seasonal salt drainage Stot (M L⁻²) below the root zone at depth L,

$$T_r = \frac{\int_0^{t_s} T_a\,dt}{\int_0^{t_s} T_p\,dt} \qquad [17a]$$

$$S_{tot} = \int_0^{t_s} S_L\,dt \qquad [17b]$$

where Ta is actual daily transpiration (L T⁻¹), given by $T_a = \int_0^{L} r_w\,dz$. In the absence of water and salt stress, transpiration occurs at the potential rate, and Ta equals Tp.
Approximate Models
The main impetus for deriving an alternative simpler model is that the level of spatial and temporal detail provided by the complex simulation models is not required in practice because one is often interested in integrated quantities, such as seasonal crop yield. Uncertainties in the model input, especially at larger scales, may justify the use of a simplified model, as long as the resulting structural model error is smaller than the model input error.
As a first case study the number of vertical nodes in the reference model was gradually decreased from 101 to 3, corresponding to the extremes of 1- and 50-cm increments, respectively. Alternatively, we may also derive a "lumped" model directly at the scale of the entire root zone, without consideration of the flow and transport dynamics within the root zone. The lumped model is based on a vertical depth-wise integration of the point-scale process-based equations describing water flow and salt transport, Eq. [13] and [15], from the land surface (z = 0) to the bottom of the root zone (z = L),

$$\frac{dW}{dt} = q_0 - q_L - T_a \qquad [18a]$$

$$\frac{d(WC)}{dt} = S_0 - S_L \qquad [18b]$$

where $W = \int_0^L \theta\,dz$ is root-zone moisture storage (L), and $C = \frac{1}{W}\int_0^L \theta c\,dz$ is average root-zone salinity (M L⁻³) weighted by the moisture content. All other quantities in Eq. [18] are as defined before. Since our interest is in calculating Tr and Stot, as defined in Eq. [17], we augment Eq. [18] with

$$\frac{dT_r}{dt} = \frac{T_a}{T_{ptot}} \qquad [18c]$$

$$\frac{dS_{tot}}{dt} = S_L \qquad [18d]$$

Various terms such as drainage (qL, SL) and transpiration (Ta) depend on the depth-dependent values of h (or s) and c. Hence, to solve Eq. [18] these need to be expressed in terms of the depth-averaged variables W and C. The most straightforward method is to replace the point-wise variables by their depth-averaged equivalents (i.e., W replaces saturation s, and C replaces c),

$$\bar{s} = \frac{W/L - \theta_r}{\theta_s - \theta_r}, \quad \bar{h} = f_s^{-1}(\bar{s}), \quad \bar{c} = C \qquad [19]$$

where fs is the water retention function (Table 1). Substituting these into the expressions for qL, Ta, and SL yields,

$$q_L = K(\bar{s}) \qquad [20a]$$

$$T_a = \beta(\bar{h}, C)\,T_p \qquad [20b]$$

$$S_L = q_L C \qquad [20c]$$

Equation [20a] assumes a unit hydraulic gradient, whereas Eq. [20b] assumes a uniform root distribution. In summary, the lumped model in Eq. [18] forms a system of four coupled ordinary differential equations, which was integrated from t = 0 to t = ts using explicit finite differences (Press et al., 1990). The model computes values for relative crop transpiration and salt drainage without considering any dynamics within the root zone. The lumped model was solved using daily and seasonally averaged boundary conditions and was subsequently compared with the reference numerical model to assess the resulting model errors.
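A minimal sketch of such a lumped root-zone balance, integrated with an explicit (forward Euler) scheme, is given below. It is not the authors' implementation. The balance terms (storage change equals infiltration minus drainage minus transpiration; salt storage change equals incoming minus drained salt load) follow the description in the text, whereas the van Genuchten-Mualem closure, the S-shaped multiplicative stress function, the irrigation schedule, and every parameter value are assumptions chosen for illustration; h50 is entered as a positive magnitude.

```python
import math

def saturation(W, L, theta_r, theta_s):
    # Depth-averaged effective saturation from root-zone storage W.
    s = (W / L - theta_r) / (theta_s - theta_r)
    return min(max(s, 1e-6), 1.0)

def head_abs(s, a, n):
    # Inverse van Genuchten (1980) retention curve: |h| for saturation s.
    m = 1.0 - 1.0 / n
    return (s ** (-1.0 / m) - 1.0) ** (1.0 / n) / a

def conductivity(s, ks, n):
    # van Genuchten-Mualem conductivity; equals drainage under a unit gradient.
    m = 1.0 - 1.0 / n
    return ks * math.sqrt(s) * (1.0 - (1.0 - s ** (1.0 / m)) ** m) ** 2

def stress(h_abs, c, h50, ph, c50, pc):
    # Assumed multiplicative S-shaped water and salt stress reduction.
    return 1.0 / (1.0 + (h_abs / h50) ** ph) / (1.0 + (c / c50) ** pc)

def run_lumped(q0_daily, tp_daily, c0=0.5, L=1.0, W0=0.30, C0=1.0,
               theta_r=0.05, theta_s=0.45, ks=0.25, a=2.0, n=1.5,
               h50=70.0, ph=3.0, c50=2.0, pc=3.0, steps_per_day=100):
    """Integrate the lumped water and salt balance over the season."""
    dt = 1.0 / steps_per_day
    W, M = W0, W0 * C0            # water storage and salt storage (M = W * C)
    t_cum = s_tot = salt_in = 0.0
    for q0, tp in zip(q0_daily, tp_daily):
        for _ in range(steps_per_day):
            C = M / W
            s = saturation(W, L, theta_r, theta_s)
            qL = conductivity(s, ks, n)                       # drainage rate
            Ta = stress(head_abs(s, a, n), C, h50, ph, c50, pc) * tp
            SL = qL * C                                       # drained salt load
            W += (q0 - qL - Ta) * dt                          # water balance
            M += (q0 * c0 - SL) * dt                          # salt balance
            t_cum += Ta * dt
            s_tot += SL * dt
            salt_in += q0 * c0 * dt
    return {"W": W, "C": M / W, "Tr": t_cum / sum(tp_daily),
            "Stot": s_tot, "salt_in": salt_in}

if __name__ == "__main__":
    # Hypothetical season: 150 d, six one-day irrigations totalling 0.8 m.
    q0 = [0.8 / 6 if d % 25 == 0 else 0.0 for d in range(150)]
    tp = [0.005] * 150
    print(run_lumped(q0, tp))
```

Replacing the daily series by their seasonal means (a constant q0 and Tp) gives the seasonally averaged variant of the same sketch.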
RESULTS AND DISCUSSION
Two separate case studies are presented to illustrate the methodology of identifying optimal model complexity as a function of model input uncertainty and data availability. The first case study examines the effect of vertical spatial discretization in the numerical model, whereas the second case compares the reference physically based model with the lumped model formulation. In both cases, the focus is on predicting field-scale seasonal crop transpiration (or crop yield) and salt drainage for a range of input uncertainties and a number of available point-scale measurements. A 150-d crop growing season is considered with specified daily potential transpiration and root-zone depth (Table 1) and six irrigation events (Table 2). Other input data are summarized in Table 2.
Table 2. Summary of parameter values, initial conditions, and boundary conditions used in the simulations. Alternative values are given in parentheses.
Case Study I: Effect of Vertical Discretization
Figures 3 to 5 present the results for predicting field-scale crop yield with the physically based numerical model using various levels of vertical spatial discretization, expressed by the number of nodes, and for various levels of model input uncertainty, quantified by the coefficient of variation (CV) of seasonal net infiltration and the number of point-scale measurements n. Figure 3 shows the relative sums of squared error (SSE) of the model input and structural model errors as a function of the CV of net infiltration for four different levels of vertical discretization. Recall that the reference model uses 101 nodes to discretize the 1-m root zone into 1-cm increments. With 41 nodes, corresponding to 2.5-cm increments, structural model errors (SSEs, triangles) are relatively unimportant compared with the model input errors (SSEi, squares), indicating that the 41-node model provides a very good approximation of the "true" 101-node model irrespective of the CV level, except for CV values smaller than 10%. As the number of nodes is decreased in Fig. 3 the structural model errors become more important, until they dominate the total model error in the three-node model, even for CV values of 200%. Nevertheless, in every case the relative importance of the model input error increases and that of the structural model error decreases when spatial heterogeneity increases. This suggests that the coarser-grid model may yield acceptable predictions in the presence of large uncertainty or spatial heterogeneity of the net infiltration rate. That conclusion is more clearly conveyed in Fig. 4, which shows the optimal number of nodes as a function of the CV of net infiltration. Optimality was here defined as the level of vertical spatial discretization for which structural model errors are equal to model input errors, or SSEi equals SSEs. The results for crop transpiration (diamonds in Fig. 4) confirm our hypothesis that as the uncertainty of the model input increases (here, the infiltration rate, and by correlation, the hydraulic properties), the required or optimal model complexity decreases. For example, fewer than 10 nodes are required for an accurate simulation of field-scale average crop transpiration when the CV of net infiltration is 100%. The large plateau in Fig. 4 at high CV values further suggests that there is a limit to model simplification. The same analysis was also done for field-scale salt drainage; the summary graph is shown in Fig. 4 (squares). It appears that the prediction of salt drainage is much less affected by the level of spatial discretization. Even for small heterogeneity (CV of 10%) a model with fewer than 10 nodes generates accurate results. Intuitively this makes sense since crop transpiration depends on the variation of water content and salt concentration throughout the root zone, whereas salt drainage is more determined by conditions near the bottom of the root zone.
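The optimality criterion used here (relative SSEs equal to 50%) can be located from a handful of model runs by interpolation. The sketch below assumes linear interpolation between tested node counts, which is one of several reasonable choices, and the numbers in the usage example are made up for illustration rather than read from the figures.

```python
def optimal_nodes(nodes, rel_sse_s, target=0.5):
    """Interpolate the node count at which relative SSE_s crosses the target.

    nodes     : tested node counts in ascending order
    rel_sse_s : matching relative structural SSE values (fractions)
    Returns None when the target is not bracketed by the tested models.
    """
    pairs = list(zip(nodes, rel_sse_s))
    for (n0, r0), (n1, r1) in zip(pairs, pairs[1:]):
        # A sign change of (r - target) means the crossing lies in this interval.
        if (r0 - target) * (r1 - target) <= 0 and r0 != r1:
            return n0 + (r0 - target) * (n1 - n0) / (r0 - r1)
    return None

if __name__ == "__main__":
    # Hypothetical relative SSE_s values for 3, 7, 21, and 41 nodes.
    print(optimal_nodes([3, 7, 21, 41], [0.95, 0.70, 0.30, 0.05]))
```

Repeating the calculation for each CV (and each n) yields the kind of trade-off curve and contour plot discussed in this section.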

Fig. 3. Relative sums of squared errors (SSE) for the prediction of crop transpiration as a function of the coefficient of variation (CV) of net infiltration for different levels of vertical spatial discretization (i.e., 41, 21, 7, and 3 nodes, respectively). Triangles = SSEs; squares = SSEi. The number of point-scale measurements is n = 20. Crop parameters: c50 = 2 dS m⁻¹, h50 = −70 m, and $\lambda$ = 1.

Fig. 4. Trade-off between optimal model complexity, expressed by the number of vertical nodes, and the level of model input uncertainty, expressed by the CV of net infiltration. Optimal model complexity corresponds to a relative SSEs = 50%. Diamonds = crop transpiration; squares = salt drainage. The number of point-scale measurements is n = 20. Crop parameters: c50 = 2 dS m⁻¹, h50 = −70 m, and $\lambda$ = 1.

Fig. 5. Contour plot of the number of nodes needed to achieve a relative SSEs of 50% as a function of the CV of net infiltration and log(n), which is the logarithm of the number of point-scale measurements. Crop parameters: c50 = 2 dS m⁻¹, h50 = −70 m, and $\lambda$ = 1.
So far we have looked at the effect of field-scale heterogeneity on the optimal level of model complexity assuming that 20 point-scale measurements of net infiltration are available (i.e., n = 20). The same analysis can be repeated by changing the number of point-scale measurements for a given level of field-scale heterogeneity. In other words, we can evaluate the effect of data availability on optimal model complexity and calculate how many additional measurements we need to take such that application of a more complex model is justified. Results for the prediction of crop transpiration are shown in Fig. 5, which displays a contour plot of the number of nodes needed to achieve a relative SSEs of 50% as a function of the CV of net infiltration and log(n), which is the logarithm of the number of point-scale measurements. The plot indicates that for a given level of field heterogeneity a coarser model should be replaced by a more finely discretized one as more data become available and n increases, especially when heterogeneity is large. In addition, for a given number of point-scale measurements the use of coarser models becomes justified as field heterogeneity increases. Note that coarse models with 10 nodes or fewer (i.e., 10-cm increments) are typically sufficiently accurate for a wide range of n and CV values. A similar contour plot may also be generated for the prediction of salt drainage. It should be mentioned that these results are specific to the case study presented here (i.e., no evaporation or runoff, deep water table, seasonal transpiration). When the focus of study is, for example, short-duration hydrological events, a vertical discretization using 10-cm increments may result in greater errors.
Case Study II: Evaluation of the Lumped Model
The purpose of the second case study is to illustrate how the methodology can be used to evaluate the validity of a simplified model over a range of spatial heterogeneities and levels of data availability. Predictions of field-scale relative crop transpiration and salt drainage with the lumped model using either daily or seasonally averaged boundary conditions are compared with simulations with the true or reference model. This is done for various levels of data availability (by varying n) and a range of field-scale heterogeneities for net infiltration. The effects of the water stress parameter h50, the salt stress parameter c50, and the root distribution parameter $\lambda$ were also investigated.
To evaluate the simulated dynamics of flow and transport, Fig. 6 compares moisture content and salinity throughout the growing season for the reference and lumped models. The reference model (Fig. 6a) predicts large variations in moisture content and salinity near the surface due to periodic irrigation events and subsequent periods of crop transpiration. These variations are damped at depth. Salinity gradually increases at the bottom of the root zone due to crop transpiration. The lumped models (Fig. 6b) do not consider any vertical variations and predict root-zone average moisture contents and salt concentrations that are in between those at the surface and bottom of the root zone in Fig. 6a. In addition, when using seasonally averaged boundary conditions, the predictions also do not account for temporal variations of irrigation and transpiration cycles.

Fig. 6. Dynamics of moisture content and salt concentration during the 150-d growing season: (a) reference numerical model at the soil surface ("top") and at the bottom of the root zone ("bottom"), (b) lumped model using daily boundary conditions ("Daily BC") and seasonally averaged boundary conditions ("Seasonal BC"). Crop parameters: c50 = 2 dS m⁻¹, h50 = 70 m, and root distribution parameter = 1. Seasonal net infiltration is 0.8 m.
Figure 7 presents contour plots of structural model errors for simulating field-scale crop transpiration with the lumped model using daily and seasonal boundary conditions. Both plots are similar and show that larger n values (more point-scale measurements) and smaller CV values of infiltration (less spatial variation) result in relatively larger structural model errors. However, there are two other interesting results in this figure. First, for a given level of model input uncertainty (fixed n and CV values), the lumped model with seasonally averaged boundary conditions always has smaller structural model errors than the lumped model with daily boundary conditions. This result suggests that vertical and temporal averaging errors compensate each other to yield a smaller overall structural model error. Second, as the CV of net infiltration increases above 100%, the relative structural model errors tend to increase, which contradicts our hypothesis that an increase of soil heterogeneity reduces the importance of structural model errors. Figure 8 clarifies the reason for this seemingly contradictory result. It shows that the lumped models underpredict crop transpiration for values of net seasonal infiltration smaller than 0.2 m. Since net infiltration at the field scale is described by a lognormal distribution, a larger CV of net infiltration results in a higher frequency of low infiltration values, thereby causing errors in the prediction of field-scale crop transpiration. Note, however, that the increase in error for large CV is much more gradual than the decrease in error for small CV values, as indicated by the closeness of the contour lines in Fig. 7. In summary, Fig. 7 shows that the lumped models provide very good predictions for CV values between 50 and 200% when fewer than 20 point-scale measurements are available.
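The link between a larger CV and more frequent low infiltration values can be checked with a short calculation. The sketch below is illustrative only: it assumes a field-mean seasonal net infiltration of 0.8 m (the value quoted for Fig. 6, used here as an assumed mean) and converts mean and CV to lognormal parameters with the standard moment relations. The function names are hypothetical and not part of the original analysis.

```python
import math

def lognormal_params(mean, cv):
    """Return (mu, sigma) of ln(X) for a lognormal X with the given mean and CV."""
    sigma2 = math.log(1.0 + cv ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)

def prob_below(threshold, mean, cv):
    """P(X < threshold) for a lognormal X with the given mean and CV."""
    mu, sigma = lognormal_params(mean, cv)
    z = (math.log(threshold) - mu) / sigma
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

MEAN_INFILTRATION = 0.8  # m, assumed field mean (illustrative)
THRESHOLD = 0.2          # m, below which the lumped models underpredict (Fig. 8)

for cv in (0.5, 1.0, 2.0):
    p = prob_below(THRESHOLD, MEAN_INFILTRATION, cv)
    print(f"CV = {cv:.0%}: P(infiltration < {THRESHOLD} m) = {p:.3f}")
```

Under these assumptions the fraction of the field receiving less than 0.2 m grows steadily as CV increases from 50 to 200%, which is consistent with the gradual increase in structural error at large CV.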

Fig. 7. Contour plots of relative SSEs as a function of n and CV of net infiltration for field-scale prediction of crop transpiration with the lumped model using daily boundary conditions ("Daily BC") and seasonally averaged boundary conditions ("Seasonal BC"). Crop parameters: c50 = 2 dS m⁻¹, h50 = 70 m, and root distribution parameter = 1.

Fig. 8. Crop transpiration as a function of seasonal net infiltration predicted with the finely discretized numerical model and the lumped model using either daily boundary conditions ("Daily BC") or seasonally averaged boundary conditions ("Seasonal BC"). Crop parameters: c50 = 2 dS m⁻¹, h50 = 70 m, and root distribution parameter = 1.
Next, we investigated the performance of the lumped flow and transport model for a wider range of conditions. Figures 9 to 11 show the effects of various crop parameters on the field-scale prediction of crop transpiration with the lumped model. In each case, the specific crop parameter is varied deterministically, such that the only source of model input uncertainty is spatial heterogeneity of net infiltration and hydraulic properties. In Fig. 9, the value of the crop salt stress parameter c50 is changed to 1 (top) and 4 dS m⁻¹ (bottom). The resulting contour plots are very similar to the ones presented in Fig. 7 for c50 = 2, except that larger salt sensitivity (c50 = 1) results in larger structural model errors. Again the lumped model with seasonal boundary conditions performs best, and has SSEs values less than 50% for CV values between 30 and 200% with 10 or fewer point-scale measurements. Seasonal transpiration of a salt-tolerant crop (c50 = 4) is accurately predicted by both lumped models for all levels of model input uncertainty. Greater water stress sensitivity, on the other hand (Fig. 10), as indicated by h50 = 22 m, results in better model predictions compared with the ones presented in Fig. 7 for h50 = 70 m. When the crop becomes more drought tolerant (h50 = 220 m) compared with Fig. 7, structural model errors are not much affected. Figure 11 shows the effect of changes in the root distribution, characterized by the root distribution parameter (Table 1). Small values of this parameter represent a root profile with most roots concentrated near the soil surface. As its value increases, the distribution of roots through the 1-m profile becomes increasingly uniform. As is demonstrated in Fig. 11, structural model errors dominate the total model prediction error when the root distribution is shallow (root distribution parameter = 1), even for very large CV values and few point-scale measurements. Thus, lumping or averaging the physical process of water and salt movement across the 1-m soil depth is inaccurate when most of the roots are concentrated near the soil surface. In those circumstances, the effective root-zone depth for the lumped model must be selected smaller than the root zone depth.

Fig. 9. Effect of crop salt stress sensitivity (parameter c50) on relative SSEs as a function of n and CV of net infiltration for field-scale prediction of crop transpiration with the lumped model using daily boundary conditions ("Daily BC") and seasonally averaged boundary conditions ("Seasonal BC"). Crop parameters: h50 = 70 m and root distribution parameter = 1.

Fig. 10. Effect of crop water stress sensitivity (parameter h50) on relative SSEs as a function of n and CV of net infiltration for field-scale prediction of crop transpiration with the lumped model using daily boundary conditions ("Daily BC") and seasonally averaged boundary conditions ("Seasonal BC"). Crop parameters: c50 = 2 dS m⁻¹ and root distribution parameter = 1.

Fig. 11. Effect of the crop root distribution parameter on relative SSEs as a function of n and CV of net infiltration for field-scale prediction of crop transpiration with the lumped model using daily boundary conditions ("Daily BC") and seasonally averaged boundary conditions ("Seasonal BC"). Crop parameters: c50 = 2 dS m⁻¹ and h50 = 70 m.
Results have focused on the prediction of field-scale crop transpiration. Similar to the first case study, the structural model errors for predicting salt drainage were always much smaller than those for crop transpiration. This is illustrated in Fig. 12, which shows results for salt drainage when roots are shallow. It is evident that even for this worst-case scenario, structural model errors remain small. In contrast to the crop transpiration analysis, resulting errors do not increase when CV values become very large.

Fig. 12. Contour plots of relative SSEs as a function of n and CV of net infiltration for field-scale prediction of salt drainage with the lumped model using daily boundary conditions ("Daily BC") and seasonally averaged boundary conditions ("Seasonal BC"). Crop parameters: c50 = 2 dS m⁻¹, h50 = 70 m, and root distribution parameter = 0.1.
DISCUSSION
The sensitivity analysis presented in this paper may be very useful for field- and regional-scale applications, when data are scarce and model input uncertainty is large. A reference model must first be identified, for example, by calibrating the RE or ADE model at the local scale to site-specific conditions. Subsequently, data uncertainty at the larger scale is quantified based on a number of point-scale measurements, a large-scale indirect measurement technique (e.g., geophysics, remote sensing), or data ranges from the existing literature. Our analysis can then be used to identify a more parsimonious or efficient model that provides accurate estimates of the variables of interest for the given data uncertainty, and it allows this selection to be made in an objective and quantitative manner. The simpler model may be used instead of the complex one as long as the effects of model input uncertainty are accounted for. Such an approach reduces unnecessary model complexity at large spatial scales. It should be noted that we assume that the true or reference model of field-scale water flow and transport can be represented by a series of parallel noninteracting vertical columns. Even under this assumption, the presented case studies remain useful for illustration.
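The workflow described above can be illustrated with a small Monte Carlo experiment. The sketch below is not the formulation of Eq. [8] through [10]. It assumes one plausible decomposition in which the structural error is the squared difference between the field-scale predictions of the simple and reference models under fully known input, and the input error is the mean squared error of the reference model when it is driven by only n point-scale measurements. The two response functions are hypothetical toy curves standing in for the reference and simplified models, and the mean infiltration of 0.8 m is an assumed value.

```python
import math
import random

def reference_model(infiltration):
    """Hypothetical stand-in for the reference model (relative transpiration)."""
    return 1.0 - math.exp(-3.0 * infiltration)

def simple_model(infiltration):
    """Hypothetical stand-in for a simplified model with a built-in bias."""
    return min(1.0, 1.2 * infiltration)

def sample_infiltration(rng, mean, cv, size):
    """Draw lognormal net infiltration values with the given mean and CV."""
    sigma2 = math.log(1.0 + cv ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    return [rng.lognormvariate(mu, math.sqrt(sigma2)) for _ in range(size)]

def field_mean(model, values):
    """Average a column-scale model over parallel, noninteracting columns."""
    return sum(model(v) for v in values) / len(values)

def relative_structural_error(n, cv, mean=0.8, n_field=20000, n_rep=500, seed=1):
    """Share of structural error in (structural + input) error, as assumed here."""
    rng = random.Random(seed)
    field = sample_infiltration(rng, mean, cv, n_field)  # fully "known" field
    truth = field_mean(reference_model, field)
    structural = (field_mean(simple_model, field) - truth) ** 2
    # Input error: reference model driven by only n point-scale measurements.
    input_sse = 0.0
    for _ in range(n_rep):
        measurements = rng.sample(field, n)
        input_sse += (field_mean(reference_model, measurements) - truth) ** 2
    input_sse /= n_rep
    return structural / (structural + input_sse)

for n in (5, 20, 100):
    share = relative_structural_error(n=n, cv=0.5)
    print(f"n = {n:3d}: structural share of total error = {share:.2f}")
```

In this toy setting the structural share grows as n increases, because additional measurements shrink the input error while the structural error stays fixed. The simpler model is therefore easier to justify when few measurements are available.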
Several important factors were not discussed, including spatial autocorrelation, spatial heterogeneity or uncertainty of model parameters other than hydraulic properties and infiltration, and the effect of observation error. Spatial autocorrelation should be accounted for in the parameter estimation of the experimental pdf of x when the correlation length of the model input is about equal to the field scale. In that case, the experimental pdf used in Eq. [9] and [10] can be estimated with smaller uncertainty, for example by block-kriging, to attain smaller relative model input errors than when no autocorrelation is considered. Spatial heterogeneity in additional model parameters can also be included by considering the joint probability density function of these parameters in Eq. [8] through [10]. Accounting for the uncertainty of additional model parameters will result in greater model input errors and relatively smaller structural model errors. Finally, although observation errors were initially included in the theoretical analysis, they were ignored in the two case studies to focus on the contrast between errors due to model input uncertainty and structural model uncertainty. Observation errors can be included by specifying a pdf for the observation error. The corresponding SSE, determined by the observation error variance, can be added to the total model error, and will reduce the relative importance of structural model errors.
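The effect of adding an observation error term can be shown with simple arithmetic. The sketch below assumes, as stated above, that the error terms add to give the total model error, and that the relative structural error is the structural term divided by that total. The function name and the numerical values are hypothetical and chosen only for illustration.

```python
def relative_structural_share(sse_structural, sse_input, sse_observation=0.0):
    """Structural SSE as a fraction of the total (additive) model error."""
    total = sse_structural + sse_input + sse_observation
    if total <= 0.0:
        raise ValueError("total SSE must be positive")
    return sse_structural / total

# Hypothetical error terms in arbitrary squared units.
without_obs = relative_structural_share(4.0, 4.0)
with_obs = relative_structural_share(4.0, 4.0, sse_observation=2.0)
print(f"without observation error: {without_obs:.2f}")
print(f"with observation error:    {with_obs:.2f}")
```

Any positive observation error variance enlarges the denominator, so the structural share can only decrease.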
CONCLUSIONS
We presented a methodology for calculating various model prediction errors of field-scale crop transpiration and salt drainage, given a limited number of point-scale measurements of net infiltration. A reference or true model was defined, and total model error was separated into errors due to model input uncertainty and errors due to model simplifications caused by averaging hydrologic processes with depth and time. Results of the first case study showed that the optimal level of vertical discretization critically depends on both the number of point-scale measurements, n, available and the degree (CV) of field-scale heterogeneity in net infiltration and hydraulic properties. It was concluded that the high resolution model with 1-cm increments is not required for accurate prediction of seasonal field-scale crop transpiration and salt drainage. Instead, a coarser discretization using 10-cm increments is sufficient for a wide range of n and CV values, when the interest is only in seasonal quantities (transpiration, salt drainage).
The second case study quantified structural model errors of a lumped model using either daily or seasonally averaged boundary conditions of irrigation and transpiration. Results showed that there is a trade-off between the appropriate level of model complexity and the degree of parameter uncertainty, suggesting that complex models may be replaced by simpler ones as long as the resulting structural model error is smaller than the prediction errors due to input uncertainty. The lumped model with seasonally averaged boundary conditions was found to perform as well as or better than the lumped model with daily boundary conditions, suggesting error cancellation due to the averaging in time and space. In general, the contribution of structural model error to total model error of the lumped model with seasonal boundary conditions was less than 50% as long as n < 20 and CV > 50%. Predictions were inaccurate for soils with shallow roots because the uniform root distribution assumption is violated. For most investigated scenarios, the lumped modeling approach caused a gradual increase in structural model errors for large CV because of underprediction of crop transpiration at low infiltration rates. Model errors were always much smaller for salt drainage than for crop transpiration.
The methodology provides a straightforward and objective approach toward identification of an optimal model complexity level across spatial scales. It may be especially useful in regional-scale applications when a reduction in model complexity is highly desirable, provided that the effects of model input uncertainty are included. We suggest that the effects of spatial autocorrelation, spatial heterogeneity, and additional parameter uncertainty, as well as measurement error, can be incorporated in the proposed analysis.
ACKNOWLEDGMENTS
This work was supported by the U.S. Department of Agriculture Fund for Rural America Project 97-362000-5263 and by the University of California Salinity Drainage Program.
 |
REFERENCES
|
|---|
- Boegh, E., M. Thorsen, M.B. Butts, S. Hansen, J.S. Christiansen, P. Abrahamsen, C.B. Hasager, N.O. Jensen, P. van der Keur, J.C. Refsgaard, K. Schelde, H. Soegaard, and A. Thomsen. 2004. Incorporating remote sensing data in physically based distributed agro-hydrological modeling. J. Hydrol. 287:279299.[CrossRef]
- Bresler, E., and G. Dagan. 1983. Unsaturated flow in spatially variable fields. 3. Solute transport and their application to two fields. Water Resour. Res. 19:429435.
- Chen, Z.Q., R.S. Govindaraju, and M.L. Kavvas. 1994. Spatial averaging of unsaturated flow equations under infiltration conditions over areally heterogeneous fields. 1. Development of models. Water Resour. Res. 30:523533.[CrossRef]
- Dagan, G., and E. Bresler. 1988. Variability of yield of an irrigated crop and its causes. 3. Numerical simulation and field results. Water Resour. Res. 24:395401.
- Destouni, G., and V. Cvetkovic. 1991. Field-scale mass arrival of sorptive solute into the groundwater. Water Resour. Res. 27:13151325.[CrossRef]
- de Wit, M.J.M., and E.J. Pebesma. 2001. Nutrient fluxes at the river basin scale. II: The balance between data availability and model complexity. Hydrol. Processes 15:761775.[CrossRef]
- Foussereau, X., W.D. Graham, G.A. Akpoji, G. Destouni, and P.S.C. Rao. 2000. Stochastic analysis of transport in unsaturated heterogeneous soils under transient flow regimes. Water Resour. Res. 36:911921.[CrossRef]
- Grayson, R., and G. Blöschl. 2000. Spatial patterns in catchment hydrology: Observations and modeling. Cambridge Univ. Press, New York.
- Hopmans, J.W. 1989. Stochastic description of field-measured infiltration data. Trans. ASAE 32:19871993.
- Hopmans, J.W., D.R. Nielsen, and K.L. Bristow. 2002. How useful are small-scale soil hydraulic property measurements for large-scale vadose zone modeling. In D. Smiles et al (ed.) Heat and mass transfer in the natural environment, the Philip Volume. Geophysical Monogr. 129. American Geophysical Union, Washington, DC.
- Hupet, F., J.C. van Dam, and M. Vanclooster. 2004. Impact of within-field variability in soil hydraulic properties on transpiration fluxes and crop yields: A numerical study. Available at www.vadosezonejournal.org. Vadose Zone J. 3:13671379.[Abstract/Free Full Text]
- Kavvas, M.L., Z.Q. Chen, R.S. Govindaraju, D.E. Rolston, T. Koos, A. Karakas, D. Or, S. Jones, and J. Biggar. 1996. Probability distribution of solute travel time for convective transport in field-scale soils under unsteady and nonuniform flows. Water Resour. Res. 32:875889.[CrossRef]
- Lee, J., and F.X.M. Casey. 2005. Development and evaluation of a simplified mechanistic-stochastic method for field-scale solute transport prediction. Soil Sci. 170:225234.[CrossRef]
- Letey, J., H.J. Vaux, and E. Feinerman. 1984. Optimum crop water application as affected by uniformity of water infiltration. Agron. J. 76:435441.[Abstract/Free Full Text]
- Miller, E.E., and R.D. Miller. 1956. Physical theory for capillary flow phenomena. J. Appl. Phys. 27:324332.[CrossRef][ISI]
- Perrin, C., C. Michel, and V. Andreassian. 2001. Does a large number of parameters enhance model performance? Comparative assessment of common catchment model structures on 429 catchments. J. Hydrol. 242:275301.[CrossRef]
- Press, W.H., B.P. Flannery, S.A. Teukolsky, and W.T. Vetterling. 1990. Numerical recipes. Cambridge Univ. Press, New York.
- Raats, P.A.C. 1974. Steady flows of water and salt in uniform soil profiles with plant roots. Soil Sci. Soc. Am. Proc. 38:717722.
- Russo, D. 1991. Stochastic analysis of simulated vadose zone solute transport in a vertical cross section of heterogeneous soil during nonsteady water flow. Water Resour. Res. 27:267283.[CrossRef]
- Schoups, G., and J.W. Hopmans. 2002. Analytical model for vadose zone solute transport with root water and solute uptake. Available at www.vadosezonejournal.org. Vadose Zone J. 1:158171.[Abstract/Free Full Text]
- Schoups, G., J.W. Hopmans, and K.K. Tanji. 2006. Evaluation of model complexity and space-time resolution on the prediction of long-term soil salinity dynamics. Hydrol. Processes (in press).
im
nek, J., M.
ejna, and M.Th. van Genuchten. 1998. The HYDRUS-1D software package for simulating the one-dimensional movement of water, heat, and multiple solutes in variably-saturated media. Version 2.0. IGWMC-TPS-70. International Groundwater Modeling Center, Golden, CO.
im
nek, J., N.J. Jarvis, M.Th. van Genuchten, and A. Gärdenäs. 2003. Review and comparison of models for describing non-equilibrium and preferential flow and transport in the vadose zone. J. Hydrol. 272:1435.[CrossRef]- Toride, N., and F.J. Leij. 1996. Convective dispersive stream tube model for field-scale solute transport: I. Moment analysis. Soil Sci. Soc. Am. J. 60:342352.[Abstract/Free Full Text]
- van der Zee, S.E.A.T.M., and J.J.T.I. Boesten. 1991. Effects of soil heterogeneity on pesticide leaching to groundwater. Water Resour. Res. 27:30513063.[CrossRef]
- van Genuchten, M.Th. 1980. A closed-form equation for predicting the hydraulic conductivity of unsaturated soils. Soil Sci. Soc. Am. J. 44:892898.[Abstract/Free Full Text]
- van Genuchten, M.Th. 1987. A numerical model for water and solute movement in and below the root zone. Research Rep. 121. USDA-ARS, U.S. Salinity Laboratory, Riverside, CA.
- Warrick, A.W., and W.R. Gardner. 1983. Crop yield as affected by spatial variations of soil and irrigation. Water Resour. Res. 19:181186.