|
|
||||||||
Rothamsted Research, Harpenden, Hertfordshire, AL5 2JQ, UK
* Corresponding author (matthew.pringle{at}bbsrc.ac.uk)
Received 31 January 2005.
| ABSTRACT |
|---|
|
|
|---|
Abbreviations: AIC, Akaike Information Criterion LMCR, linear model of coregionalization MAPD, median absolute pair difference ME, mean error MVE, minimum-volume ellipsoid RMSE, root-mean-square-error
| INTRODUCTION |
|---|
|
|
|---|
One aspect of soil behavior where models are growing in importance is the carbon (C) cycle. Soil is an important component of C cycling because about 4% of the world's C is concentrated within 1 m of the soil surface (Lal et al., 1995). The global release of CO2 from soil to the atmosphere is, according to Raich and Schlesinger (1992), approximately 68000 Tg yr1. These same authors note that small changes in soil CO2 emissions could induce disproportionately larger changes in atmospheric CO2. The environmental importance of this process has led scientists to develop models of its behavior (e.g., Cook et al., 1998; Fang and Moncrieff, 1999; Pumpanen et al., 2003).
Soil process models, however, are only useful insofar as they predict the behavior of the soil robustly. Any model is necessarily an approximation to reality, so it is important that we quantify its success. This issue has been widely recognized, and the general problem of how to validate models has received considerable attention (e.g., Konikow and Bredehoeft, 1992; Gaunt et al., 1997; Oreskes and Belitz, 2001). Model validation is not a matter of verifying a model absolutely; we know that any model will always be, in some sense, "wrong" (Konikow and Bredehoeft, 1992). Rather, validation entails an assessment of the model's performance in quantitative terms (Addiscott and Tuck, 2001). There are various ways of doing this, all based on some comparison of an observed variable, zo, and a corresponding model prediction, zp. We may quantify model performance by considering the correlation of zo and zp, and by extracting various statistics of the model error, zo zp (Willmott, 1982; Addiscott and Whitmore, 1987; Wallach and Goffinet, 1987).
Despite the substantial interest in model validation, there has been comparatively little attention given to the fact that, for many important processes, model predictions are attached to different locations in space, x1, x2, ...xn, and that our predictions and observations are spatial variables, zp(x) and zo(x). The model error zo(x) zp(x) is, therefore, also a spatial variable. Why might this matter?
First, a key feature of many environmental spatial variables is that they are the combined effects of components of different spatial scale. The term "scale" is not always applied consistently; in our interpretation, components of spatial variation of low spatial frequency (long range in geostatistical terms) are of "coarse scale" relative to fine-scale components of higher frequency (shorter range). Any set of observations, z(xi) (i = 1, 2, ..., n), will have components of variation at different frequencies over some range, the upper limit of which is set by the finest sample interval. These components may have different causal origins. For example, if z(x) is soil C, then we can think of various factors that may contribute to our observed values such as broad climatic gradients at a continental scale; catchment-scale effects, such as local gradients of temperature and soil wetness; effects at within-field scale, such as variations in soil clay content; and very short-range effects such as the distribution of peaty lenses. In any particular case study, we will have observed only some of these spatial scales, but it is likely that some scales may be associated with more variation than others, and we, therefore, say that the variation is scale-dependent. This is widely observed for soil variationsee, for example, the seminal review by Beckett and Webster (1971)and affects model validation because it introduces the possibility that a particular model may reproduce the components of variation better at some spatial scales than it does at others. In the context of this paper, this is what we mean by "scale-dependence" of the performance of a model. Since different processes may be associated with different spatial scales, and models reflectmore or less successfullysome subset of these processes, it is at least possible that model performance is also scale-dependent. There is anecdotal evidence for such behavior in the literature. Groffman and Tiedje (1989) found that a model of denitrification performed poorly when used to predict at a within-field scale, but was successful at predicting differences between landscape positions. Addiscott and Mirza (1998) suggest that this may be a general patternmodels can describe variations at spatial scales coarser than some threshold, but break down at finer scales. Milne et al. (2005) showed this by using wavelet transforms to analyze the correlation between observed and modeled nitrous oxide emissions from soil on a 1024-m transect. At the coarsest scales, 128 and 256 m, the correlations were strong (around 0.8), but short-range variability was poorly predicted by the model (correlation smaller than 0.4). Such scale-dependence is important because it will tell us if a model is suitable for our particular purpose. For example, a model may be of little use for precision agriculture at the within-field scale, but still be useful for catchment-scale management.
A second consequence of scale-dependence is that models may be more or less successful at reproducing the relative variability of a process at different scales. The extent to which a model reproduces the spatial variability of a variable may be as important as the success with which the variable is predicted at any point. For example, if we wish to predict some aggregate behavior of a system from point predictions at a fine scale (e.g., net fluxes of solutes through a volume of soil), then this may be achieved by a model that successfully reproduces the variability of key variables over a range of spatial scales. Milne et al. (2005) showed that the distribution of the variance of predicted nitrous oxide emissions between spatial scales differed between different models, and that some showed similar spatial variability to the observations while others did not.
A third issue in the validation of models of spatial processes is that the spatial pattern of the model error may itself be informative. Model error arises when parameters are in error, or where a critical process is either not modeled or poorly modeled. In many cases, a map of the model error may help to highlight possible sources of error. For example, if a model assumes a freely draining soil, then there may be systematic errors associated with waterlogged soil. This is illustrated by the work of Konikow (1986) who conducted an audit of predicted groundwater levels in an Arizona catchment. A model of groundwater levels, based on about 40 yr of available data, was used to forecast levels for the following 10 yr. Konikow showed that the model systematically overestimated by about 23 m. The spatial distribution of model error was mapped, and it was noted that, in some parts of the study area, model error was correlated with the depth to the water-table and/or the amount of localized land subsidence.
We now consider three hypothetical examples of how some failure of a model will have consequences for its performance, and how this failure will be most clearly understood when we compare an observed variable with corresponding model predictions, with reference to the components of variation at different spatial scales. First, the model does not include a critical process. Under these conditions, the variation of model error, from point to point, will reflect the variation of this unmodeled process, and will be scale-dependent, just as the unmodeled process shows scale-dependence. The critical process might be at a coarse-scale, while some fine-scale process is well modeled, in which case model performance will be better at fine scales. Similarly, the distribution of variation in the model predictions will not reflect reality. The model may underestimate the amount of coarse-scale variation in these circumstances. If a process missing from a model shows marked spatial variation, the model performance might be better at some locations (i.e., where the missing process has little effect) than at others. Second, the model combines, in a nonlinear fashion, an input variable with a suboptimal parameter. Here, the model error is a function of both the variability of the input and the inadequacy of the model itself. The model error will have a scale dependence that reflects that of the input variable, and, in consequence, the error may be more significant at some scales than at others. Third, the model may describe some elements of the process well, but not others. For example, a model may account well for the effect of soil wetness on nitrous oxide emissions, but not the effect of C content. This could be because this latter is based on a crude measure of C content, whereas, in reality, certain pools of C are active and others are not. This would lead to scale-dependent model errors if the large variations of active C and wetness are at different scales (as they often will be). Here, we see a model input and an unknown variable creating scale-dependent model error as a result of the properties of the model.
It is clear from this discussion that there is a need for methods for the spatial analysis of model error. The wavelet methods that Milne et al. (2005) used are powerful and have the particular advantage that they do not make any assumptions about the form of the variability other than finite variance (Lark and Webster, 1999). However, they require regular sampling (on grids or transects) if the full analysis by scale is to be achieved. In this paper we consider the possibility of spatial analysis of model error using geostatistical methods.
We consider multivariate geostatistical analysis of observations and model predictions based on the linear model of coregionalization (LMCR). Because the LMCR regards variation as the sum of nested components with different spatial structures, it allows us to assess the performance of the model at different scales, and to decompose both observations and model predictions into components of different scale for direct comparison. We also use geostatistical methods to map model error.
We re-iterate that by "scale-dependent model performance," we mean the ability of a model to reproduce the components of variation of a variable associated with different spatial scales. This must not be confused with other scale-related issues in soil modeling; most notably, many scientists are concerned with the extent to which a model is scale-invariant, that is, whether it is restricted to describing a process in a soil volume of particular size and shape (the spatial support), or whether it gives reasonable results with inputs either aggregated to coarser scales ("up-scaling," e.g., the map polygon) or disaggregated to fine scales ("down-scaling"). Bierkens et al. (2000) discuss this issue. Scale-dependence, in the sense of this paper, is relevant to questions of model performance under up-scaling or down-scaling (assuming that some suitable method, such as sectioningAddiscott and Wagenet, 1985; Lark, 2000ahas been used to handle a lack of scale invariance). Aggregated model predictions may be presented on different supports (e.g., pixel sizes). If model performance is not scale-dependent (in the sense of this paper), then, on any support, the predictions will be an equally valid representation of the process. If, however, fine-scale variation is less well represented by the model than coarse-scale variation, then this should be considered, alongside other factors, in the choice of a support at which to represent model output. If the support is too small then we are purporting to represent variation at a scale that the model describes poorly, and it would be better to use a coarser support so that the variations at finer scales are largely smoothed. In this paper we use a geostatistical approach to quantify the correlation of observations and predictions on different supports, given scale-dependent model performance.
In the following section we discuss the geostatistical methods that may be used to analyze observations and model predictions as spatial entities, and also to map model error and investigate optimal data aggregation. We then illustrate these methods in a case study, where we compare observed soil CO2 emissions with model predictions at sites across an arable landscape.
| METHODS FOR GEOSTATISTICAL ANALYSIS OF MODEL ERROR |
|---|
|
|
|---|
Linear Model of Coregionalization
The General Form of the Model
We wish to form a statistical model of the joint spatial variation of the observed and modeled values of some soil property, denoted by zo(x) and zp(x), respectively, where x is a vector of spatial coordinates. We assume that these variables can be treated as realizations of two random functions, Zo(x) (observations) and Zp(x) (predictions). If these variables are intrinsically stationary (Matheron, 1962), their joint covariation is described by the experimental cross-variogram:
![]() | [1] |
o,o(h) and
p,p(h), which describe the variation of the observed and predicted values, respectively.
The cross-variogram is conventionally calculated from data using the method-of-moments estimator (after Matheron, 1962):
![]() | [2] |
o,oh and
p,ph. These point estimates must be modeled with continuous functions of h. Not any continuous function is permitted, since those fitted to the auto- and cross-variograms must jointly constitute a positive definite cross-covariance structure, that is, all linear combinations of the variables must have a positive variance. This is usually ensured by fitting a LMCR (Journel and Huijbregts, 1978).
The LMCR is based on the assumption that the random functions Zo(x) and Zp(x) are linear combinations of a common set of independent random functions of zero mean and unit variance:
![]() | [3] |
![]() | [4] |
![]() | [5] |
ao,klap,kl. We may then express the variogram models in matrix form:
![]() | [6] |
The matrix on the right-hand side of Eq. [6] is called the coregionalization matrix for the particular scale of variation associated with variogram gl(h). All coregionalization matrices must be positive definite. This constraint on the LMCR means that it is not simple to fit.
Is the Linear Model of Coregionalization Suitable for our Purposes?
When we fit the LMCR to variograms of two or more variables, we make the assumption that these are realizations of linearly coregionalized random functions. We must consider whether this is plausible, and how deviations from these assumptions might be detected.
First, is it reasonable to assume that our observed and predicted values are realizations of random processes? There is a general question here about actual soil properties, such as our zo(x), and a more specific question about model predictions, such as our zp(x).
We are concerned here with geostatistical analysis of soil properties, based on systematic sampling, and so assumptions of randomness are not justified by a random sample design, but rather constitute a model of our observations (so-called model-based statistics). Webster (2000) discusses the implications of this. First, note that this assumption of randomness may be made despite our recognition that observed variables are actually caused by mechanisms. We invoke assumptions of randomness because the mechanisms are complex and perhaps nonlinear, and they are not all understood and not all conditions of the system are known. This is the same basis on which the number returned by the rolling of a die is treated as random, although the motion of the die is governed by Newton's laws.
The assumption of randomness is not testable; the question is whether it is plausible. This might be addressed by examining exploratory statistics of the data and their histograms and the results of a cross-validation method for selection between alternative LMCRs, which is discussed below.
Similar considerations apply to our assumption that model predictions, zp(x), can be regarded as realizations of a random function, Zp(x). The outcome of a mechanistic model is determined if all its inputs are known, ignoring rounding error and other computational artifacts. The assumption that zp(x) is a realization of a random variable requires that this be a plausible assumption about the model inputs. This is a necessary, but not a sufficient condition; it is possible, for example, that a model whose processes depend on threshold values of one or more key variables could return uniform outputs from a variable set of inputs. It is, therefore, necessary to conduct exploratory analysis of the model output and the fitted LMCR.
The second question is whether the assumption that the random variables Zo(x) and Zp(x) are linearly coregionalized is plausible? If the model accounts reasonably well for the key processes that determine z (particularly nonlinear processes), then this assumption should be reasonable. In this respect, a poor fit of the LMCR may itself be an indicator that the model is failing to describe some key processes adequately. Nonlinearity may be seen in cross-variogram estimates that are much larger than is compatible with the constraints on the LMCR. A plot of the variogram estimates may reveal apparent nested components of the cross-variogram that are not seen in any auto-variogram.
Estimation and Fitting
Our concern is to obtain coregionalization matrices for the observed and predicted values that reflect their joint spatial variation. In the context of spatial analysis of models, we are interested in the covariance model for its own sake as much as for further estimation. One factor that could influence the LMCR is the presence of any outlying observations in the data, perhaps, in our particular case study, because of hot-spots of microbial activity or organic carbon. These will bias estimates of the variograms obtained with Eq. [2]. For this reason we also computed auto- and cross-variograms using the robust estimators proposed by Lark (2003). Robust estimators reduce the influence of extreme values and Lark showed how robust estimates of cross-variograms could be calculated. There are two robust estimators: one based on the robust covariance estimator of Gnanadesikan and Kettenring (1972) and the other based on Rousseeuw's (1985) minimum-volume ellipsoid (MVE) estimator of the covariance matrix. We, therefore, obtained three sets of estimates of the experimental auto- and cross-variograms: one set with the method-of-moments estimator in Eq. [2] and two sets of robust estimates.
A LMCR may then be fitted to each set of variogram estimates. This was done in our case study with Lark and Papritz's (2004) method, which fits all parameters of the model by weighted least-squares, ensuring that the constraints of the model are met. The number of pairs of observations at each lag, divided by the square of the lag, was used as the weight. Zhang et al. (1995) proposed this weighting and we use it here because Pelletier et al. (2004) showed in simulation that it consistently out-performed other weighting schemes with respect to the efficiency and bias of parameter estimates. We considered various variogram models (see Webster and Oliver, 1990; Goovaerts, 1997): simple models with exponential or spherical variograms and a nugget effect and more complex nested models (i.e., the double spherical and double exponential). A nested model will always fit better in the least-squares sense since it has more degrees of freedom. However, the cost is that more parameters must be estimatedan additional coregionalization matrix and distance parameter of gl(h) for each nested term. The models were compared using the modified Akaike Information Criterion (AIC) (Webster and McBratney, 1989). The statistic, A, the variable portion of the AIC for fits of different models to the same data, will be smallest for the most parsimonious model. The AIC penalizes an increase in the number of model parameters unless this achieves a sufficient improvement in the fit of the model.
Model Diagnostics and Selection
It is necessary to chose between LMCRs based on different estimates of the experimental variograms. We should use robust estimates of the variogram only when there is evidence that the method-of-moments estimator (Eq. [2]) has been influenced by outliers. This is because the robust estimators are generally less efficient than the method-of-moments (Lark, 2003). Lark's variant of the cross-validation procedure can be used to choose the best model for particular data. Each LMCR is used to estimate each variable (zo and zp) at each sampled site from observations and predictions at other sites by standardized cokriging. The estimates are denoted
o(x) and
p(x), and each has an associated cokriging variance,
2o(x) and
2p(x), respectively. These cokriging variances are sensitive to the LMCR and will tend to be overestimated if the variograms have been affected by outlying values. We then compute the standardized squared kriging error of zo:
![]() | [7] |
o(x) will tend to be <1. However, the mean value of
o(x) over the data set,
o, is a poor diagnostic since outliers will influence the numerator as well as the denominator (through the LMCR). This is why Lark (2003) used the median of
o(x) over all the data, denoted
o, as a diagnostic. With a correct LMCR, the expected value of
o is 0.455. A confidence interval for this value may be obtained by bootstrapping (Lark, 2003).
In the case study we computed the
statistic of both observed values and model predictions for each alternative LMCR. Bootstrap estimates of the confidence intervals were also obtained. A LMCR was then selected according to the following protocol:
o and
p were not significantly different from 0.455 (at P < 0.05) then this model was selected. Otherwise, we considered models fitted to these estimates in ascending order of A, although more complex models were rejected if they contained structures for which the coregionalization was at the boundary of the admissible set under the LMCR, that is, the structural correlations (see below) were exactly 1.0 or 1.0.
o and
p were closest to 0.455 was chosen.
Statistics
Scale-Dependent Relations
The focus of our paper is how to characterize the scale-dependence of any relationship between model predictions of a variable and our observations of the same. There are two tools for doing this with the estimated variograms and LMCR: (i) the codispersion function and (ii) structural correlations.
The codispersion function was proposed by Matheron (1965), and used by Goovaerts and Webster (1994) in a study of the spatial coregionalization of metals in soil. A codispersion coefficient is simply the cross-variogram estimate for a particular lag, standardized by the corresponding values of the auto-variograms:
![]() | [8] |
o,p(h) against the lag (i.e., against the lag distance if we are ignoring spatial anisotropy, or against lag distance for separate directions).
The codispersion function is a lag-dependent correlation. Its advantage is that it does not depend on a fitted LMCR because it is computed directly from point estimates of the variogram. However, this means that
o,p(h) is not constrained to lie in the interval [1,1] if any of the auto-variograms are computed using some additional data from sites not used to estimate the cross-variogram. If
o,p(h) does not change with lag, then the two variables are said to be intrinsically correlated, that is, the correlation is invariant with scale.
The LMCR induces a set of L coregionalization matrices, as we have seen, one for each component of the model where l = 0, 1, ..., L (l = 0 corresponds to the nugget term). Each corresponds to an independent, additive component of the model for a particular spatial scale. The correlation between the components for Zo (x) and Zp (x) with variogram gl(h) can be computed from the elements of the corresponding coregionalization matrix as:
![]() | [9] |
The structural correlation indicates the strength of the relationship between particular nested spatial components of Zo(x) and Zp(x). Its main advantage over the codispersion function is that it is based on a decomposition of the variables. The codispersion function is useful for indicating that there is scale-dependence in the correlation of two variables, but it is less useful as a measure of how strongly correlated the variables are. This is because the codispersion function at any one lag is influenced by the correlation over all scales, so a strong correlation at some spatial scale may be masked if there are large and uncorrelated nugget components. If the structural correlation for some component is 1.0 (or 1.0), then this indicates one of two possibilities: either the variables are perfectly correlated or the LMCR is a poor fit at this scale (i.e., the cross-variogram model is at the boundary of the admissible set).
The main disadvantage of the structural correlation is that it is dependent on a sound selection of a model for the LMCR and in particular on a judicious choice of how many components to include. This is why the model-selection procedure we described before is important.
Factorial Cokriging Analysis
In factorial cokriging analysis we decompose each of our two variables, zo(x) and zp(x), into independent, additive components, each of which corresponds to a coregionalization matrix of the LMCR. Ordinary cokriging also allows us to extract components representing the local mean. The value of this is that it allows us to interpret further the results of our analysis of the structural correlations. For example, Goovaerts and Webster (1994) found that, although different metals in the soil of their study region had weak product-moment correlations, the structural correlations for long-range components were strong. By factorial cokriging they extracted the long-range components and examined them directly. Here, we use factorial (ordinary, standardized) cokriging, as described by Goovaerts (1997), to examine the model predictions and observations after filtering out those elements where agreement is poor. This may help our interpretation of the likely sources of model error.
Mapping Model Error
We are interested in studying the spatial variation of model error since, as we discussed in the Introduction, this may give us insight into its origin. At any location x where we know zo(x) and zp(x) we can estimate the model error, zo(x) zp(x). At unknown locations this variable may be estimated directly by univariate interpolation. Alternatively, it can be estimated as the difference between cokriged estimates
o(x) and
p(x). The advantage of the latter comes from coherence (Webster and Oliver, 2001). By coherence we mean that the ordinary cokriging estimates,
o(x) and
p(x), at unsampled locations are the best linear unbiased predictions, and their difference is also the best linear unbiased prediction of the difference at that site.
Possible Sources of Model Error
One possible source of model error is incorrect parameterization or incorrect formulation of processes (perhaps their omission) (Oreskes and Belitz, 2001). We might, therefore, expect to find model error coregionalized with some other spatial variable. For example, if crop density affects some variable significantly, such as the interception of rainfall in the canopy, but we do not account for this in a model, then the model error may be related to crop density. If, indeed, it is found that model error is coregionalized with another variable, then it suggests a new hypothesis that needs testing. In an extreme case, the model may require reformulation.
Optimal Aggregation of Model Predictions
Let us assume that, having validated a model, we now need to use it to generate predictions for decision-making. We can aggregate predictions from a point-support to a pixel-support in one of two ways: first, we could block krige predictions from a set of points at which the model is run (Heuvelink and Pebesma, 1999); second, we might run the model with aggregated inputs, correcting for bias in the case of nonlinearity, by sectioning (Addiscott and Wagenet, 1985; Lark, 2000a). If a model performs better at coarse scales than at fine scales, we might expect that aggregation of the model's predictions to some coarser support could improve the correspondence to observed values. We propose here an approach based on Krige's relation for the variance of the means of a set of blocks on support B within a region R. This is (for variable v):
![]() | [10] |
v(s,s) is the dispersion variance:
![]() | [11] |
v is the semivariance of v given the lag interval between x and x'. The double integration in Eq. [11] is computed by numerical approximation.
We can extend Eq. [11] to compute covariance terms (see Lark, 2000a), and so we may define an inter-block (B) covariance matrix within region R, that is, the matrix of variances and covariances of block means. This will be:
![]() | [12] |
![]() | [13] |
), the a priori covariance matrix:
![]() | [14] |
Consider a LMCR with a nugget and two or more nested, spatially dependent terms. If the structural correlations (Eq. [9]) increase with the distance parameters, the inter-block correlation will increase as the dimension of a square block is increased. This is because the (less-strongly correlated) short-range variation is smoothed and the longer-range component predominates in the variation between the pixels. To investigate this behavior we present two simulated examples. In both, we assume that observations and model predictions are coregionalized with two spherical variograms (ranges of 50 and 100 units, respectively). The three coregionalization matrices for the first simulation are:
![]() |
![]() |
|
| CASE STUDY |
|---|
|
|
|---|
![]() | [15] |
In our model we predict FCO2, the flux of carbon dioxide from the soil (g ha1 d1), by:
![]() | [16] |
is the proportion of metabolized C that is assimilated into microbial biomass or resistant forms of C in the soil (humus). The parameter
is constant and can take negative values because the measured flux from soil may become zero when there is still some (albeit small) metabolic production of CO2.
The C function is used to determine the amount of active C that is available for metabolism. We follow Falloon et al. (1998), who showed that the amount of active C could be expressed as a nonlinear function of the total organic C content of the soil:
![]() | [17] |
Following Bradbury et al. (1993), we model
as a function of the clay content of the soil:
![]() | [18] |
is the percent clay content of the soil. This ratio implies that there is a decrease in CO2 production as clay content increases since clays can adsorb soil C and protect it from microbial consumption. Note that our model does not account for the effect of temperature. This is because our data are based on laboratory measurements of CO2 emission from soil, so the temperature was maintained at a constant value.
The Study Site and the Data
These data were collected as an adjunct to a study on the spatial variation of nitrous oxide emissions from the soil (Lark et al., 2004). Because we wished to measure emissions from many sites (256), we collected soil in the field, refrigerated the samples, and measured emissions in laboratory conditions. Our data therefore reflect some, though not all, sources of variation in emission rates that would pertain to the field.
The 256 sites were sampled at 4-m intervals on a linear transect across farmed fields and some waste ground at Silsoe Research Institute, Silsoe, UK (52° 0' N, 0° 24' W). Figure 2 shows the location of Silsoe in relation to Great Britain. The soil for sites in the first 200 m or so of the transect is a Dystric Eutrudept, formed in Quaternary deposits of variable texture over the Lower Greensand, a Cretaceous sandstone. The soil on the remainder of the transect can be classed as Typic Eutrudept, formed from a Cretaceous clay (Gault Clay) overlaid by Quarternary material. This soil is variable in texture in some subregions but uniformly heavy (and of high pH) in other subregions. All the soil samples were collected within a period of 7 h. At each site, four cores of diameter 44 mm for depth 0 to 150 mm were collected from as close together as possible. The cores were taken with minimum delay to a 4°C cold room and were kept at this temperature until they were analyzed.
|
The organic C content of the soil was determined from a second, randomly selected core (since other variables had to be determined from the core used for gas measurements). This core was air-dried, passed through a 2-mm sieve, and combusted in a LECO analyzer (LECO Corp., St. Joseph, MI). A correction was made for the carbonate content. The C content of the soil was expressed on an area basis, given the bulk density. The soil volumetric water content (
v) was determined by cutting in half the core used to measure CO2 emissions and then drying one-half to a constant weight at 105°C.
The clay content for each site was not determined directly. However, an experienced soil scientist allocated every 10th site to a textural class in the Soil Survey of England and Wales system (Hodgson, 1976) by hand texturing. We allocated each site on the transect to a textural class by nearest-neighbor interpolation of this subset. We then assumed that the clay content of the soil at each site was equal to the mean value for the textural class, using some unpublished data on particle-size distributions of soil on the fields traversed by the transect. This estimate of the clay content of the soil at each site was then used to estimate
with Eq. [18].
We then divided our data into two subsets by simple random sampling. The first subset consisted of 100 sites, the calibration set. The remainder constituted a validation set. The calibration data were used to estimate
and k in Eq. [16]. Since the terms on both sides of this equation are subject to (unknown) observation error and we are seeking estimates of the parameters of a law-like function, this is not a regression problem (Webster, 1997). We, therefore, formed calibration estimates of the coefficients by the method of unassumed errors, described by Webster (1997). In short, the calibration data were ordered by the values of fc(c)(1
), then divided into three subsets: the smallest and largest each had 33 values (Subsets A and C); while the intermediate had 34 values (Subset B). The coefficients
and k define the straight line that joins the centroids of Subsets A and C. The estimates are 1562.27 g ha1 d1 and 199.79 g d1 (Mg of c)1, respectively.
These coefficients were used to generate model predictions at all of the 156 validation sites. We denote these predictions by Fco2p and the corresponding observations by Fco2o.
Nonspatial Analysis
Figure 3a
shows the histogram for Fco2o and summary statistics. The coefficient of skew indicates a strongly skewed distribution; it is normal practice to transform data when this statistic is outside the interval [1,1]. The octile skew (Brys et al., 2003), denoted "Skew8," is a robust estimate of the skew, and a value >0.2 suggests the data requires transformation before further analysis (Rawlins et al., 2005). The observed CO2 flux responded to transformation to natural logarithms (Fig. 3b). Figure 3c is the histogram for Fco2p. The means of Fco2o and Fco2p are similar, but the model does not reproduce all the observed variation. The variance of the predictions is about 10 times smaller than the observations. The distribution of the transformed predictions is shown in Fig. 3d. These values remain skewed after transformation, although the octile skew is small. The histogram of model error, zo(x) zp(x), denoted
, is shown in Fig. 3e. The coefficient of skew and octile skew suggest that this distribution is contaminated-normal (the coefficient of skew is large, but the octile skew is small), but we decided to work with the error of the transformed values (ln Fco2o ln Fco2p), denoted
ln (Fig. 3f), for consistency with the multivariate geostatistical analysis of ln Fco2o and ln Fco2p.
|
|
The method-of-moments estimator was rejected because, for all models tested,
o was less than the lower limit of the 95% confidence interval. On average,
was closest to the target of 0.455 for models fitted to robust variogram estimates obtained by the MVE estimator. This suggests that, even after transformation, there are some outliers in the data; this is supported by the histograms and summary statistics in Fig. 3.
The best LMCR was an exponential model fitted to MVE experimental variograms. Table 1 presents the details for this LMCR and its cokriging cross-validation, while Fig. 5a through 5c depict the experimental variograms and the LMCR.
|
|
Scale-Dependent Relations
The codispersion function and structural correlations were calculated according to Eq. [8] and [9], respectively, and are presented in Fig. 5d. The codispersion function shows that the model performance is scale dependent, with the correlation between observed and modeled concentrations increasing with spatial scale, before reaching a plateau at approximately 50 m. The structural correlation coefficients also confirm that there is scale-dependence in the model; the nugget variation of the two variables is uncorrelated, but the variation that is spatially dependentto distances of about 175 mis correlated, with
o,p1 = 0.54. The overall (MVE-based) correlation of ln Fco2o and ln Fco2p is weaker than this because it is affected by the nugget variation as well as spatially dependent variation. The spatial analysis shows that the model performance is better than the nonspatial statistics imply. The model fails to reproduce much of the short-range variation of CO2 flux, but captures better the spatially dependent variation. This suggests that the model might be useful for predicting differences in CO2 evolution between fields or subregions within fields, but not for explaining short-range variation. The weak correlation of the nugget term will be partly attributable to measurement error and also to very fine-scale variations between sample cores at a site. However, Lark et al. (2004) showed (using wavelets) that N2O flux and soil C in this same dataset were significantly (and locally, strongly) correlated. This suggests that the weak correlation between fine-scale components of observed and predicted CO2 emissions is, at least in part, due to the poor performance of the model at this scale.
Factorial Cokriging Analysis
The analysis above shows that the observed and predicted CO2 fluxes can be analyzed as components of different spatial scale that show different correlations. Factorial cokriging analysis allows us to estimate the components directly. Figure 6
is identical to Fig. 4 but, following factorial cokriging analysis, we have removed the nugget component. Comparing Fig. 6 to Fig. 4, we see that many outlying values have been removedthe RMSE of the filtered observations and predictions is approximately half that for the original dataand, in a spatial context, only broad patterns remain. Filtering reveals that the model is actually performing much better than the nonspatial analysis suggests. It is much clearer from the filtered data that the model successfully reproduces variations at positions 100 to 300 m and 600 to 700 m. The model overestimates CO2 flux between about 350 to 500 m and 900 to 1020 m. The noise in the original data meant that this was not easily discerned as a consistent pattern.
|
ln. For the latter, we estimated the variogram by the method-of-moments estimator and also by the (univariate) robust estimators described by Lark (2000b). Again, cross-validation was used to select the best combination of variogram estimator and model, which, in this case, was the median absolute pair difference (MAPD) (Dowd, 1984) and an exponential model. These estimates and the fitted model are shown in Fig. 7a
.
|
![]() | [19] |
Figure 6 suggests that the model overestimates soil CO2 emissions when the underlying parent material is heavy-textured drift over Gault Clay. To investigate this further, we used block cokriging to estimate the mean model error for the four intervals on the transect that correspond to different parent material. These were: (i) mixed drift over a Lower Greensand parent material between 0 and 200 m, (ii) clay drift over Gault Clay parent material between 200 and 600 m, and 885 to 1024 m, and (iii) mixed drift over Gault Clay parent material between 600 and 885 m. Results are shown in Fig. 7b. The confidence intervals show that model error is not significantly different from zero, except on clay drift over Gault Clay between 885 and 1024 m.
Possible Sources of Model Error
The significant failure of the model on heavy-textured soil (Fig. 6b) led us to investigate the coregionalizations of model error and the two variables used as model inputs: soil organic C content and soil clay content. It was hoped that these would provide us with information about inadequacies in the model parameterization and/or inadequacies in the input data.
Scatterplots of
ln and organic C, and also of
ln and clay content, are shown in Fig. 8a
and 8b, respectively. The log-normal distribution of organic C is apparent in Fig. 8a, but there is no (nonspatial) correlation with
ln. Figure 8b suggests that
ln becomes more uncertain as clay content increases.
|
ln and organic C, and to
ln and clay content. Results are presented in Table 2. The robust MVE estimator, fitted with exponential models, provided the best descriptions of the joint spatial variation in both cases. The structural correlation between the autocorrelated components of
ln and organic C was zero, and 0.50 between the nugget components. The structural correlation between the autocorrelated components of
ln and clay content was 0.03, and 0.05 between the nugget components. Because the value of
for clay content is only just inside the confidence interval, we suggest that the LMCR of this variable and model error is at the threshold of plausibility.
|
We now examine the coregionalization of
ln and an independent variable (i.e., soil volumetric water content [
v]). Much research has focused on the temporal relationship of soil CO2 flux and soil moisture (e.g., Franzluebbers et al., 1995; Lou et al., 2004), but their relation in the spatial domain has received only limited attention (de Jong, 1981; Rochette et al., 1991). Our objective here is to use the LMCR to investigate whether model error and soil water content are spatially coregionalized since this may indicate that the effect of soil water is not adequately represented in the model.
A scatterplot of
ln and
v is presented in Fig. 8c. The histogram of
v (not shown) suggests that a normal distribution is plausible. A spherical LMCR fitted to estimates of the variograms obtained by MVE was best. Details are presented in Table 3. The overall correlation between
ln and
v is very weak (rMVE = 0.08), but the structural correlation between the autocorrelated components is 0.19, and 0.02 between the nugget components. The correlation is weak, but it suggests that, on wetter soil, the model tends to overestimate the rate of CO2 emission. This could be because the diffusion of oxygen to microbially active micro-sites, and the diffusion of CO2 out of the core, is reduced in wetter soil. However, the weakness of the structural correlations suggests that there is little scope for improving this aspect of the model. It seems likely that, through the soil clay and organic C contents used as inputs, the model has indirectly accounted for most of the effect of soil moisture.
|
|
| SUMMARY AND CONCLUSIONS |
|---|
|
|
|---|