matrices takes less time when a smaller number of data values are used to make estimates, and these efficiencies can be significant when dealing with large data sets. Little accuracy is lost because the nearest neighbors are the most influential in the kriging weighting scheme.

[...] considerations needed for proper lagging. As an example, for data collected on a uniform grid and equal-sized bins, fixing an $n$ to just satisfy the minimum $N(h_k)$ for the smaller lags will yield insufficient data pairs to meet the minimum $N(h_k)$ for the larger lags. Fixing an $n$ to assure the minimum $N(h_k)$ for the larger lags will generally have [...] smaller lags. Therefore, the question of how much data is required to adequately compute a variogram should also address the relative locations of the data-collection sites.

[...] data is shown in Figure 4-3 for the sample variogram points plotted for lags up to about 32 km (the first four points) and for lags beyond about 56 km. The presence of a parabolic shape in the sample variogram points was not surprising, because examination of the data indicates a north-south gradient in the groundwater levels. The simplest polynomial trend, linear in $u$ and $v$, was fitted to all the data using ordinary least-squares estimation. Residuals obtained by subtracting this regional trend surface from the data were used to reestimate $\hat{\gamma}$ in Equation 4-2, and the sample variogram for the residuals is shown in Figure 4-4.

[...] Saratoga data contained more than 30 data pairs. Therefore, the bin width can be decreased to get more points defining the early part of $\hat{\gamma}$. These bin-width adjustments can be made to refine $\hat{\gamma}$ whether it is computed from the data or from the residuals. A plot of $\hat{\gamma}$ for the residuals for the Saratoga groundwater elevations with the bin width narrowed to about 6.5 km is shown in Figure 4-5.
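
The detrending and re-estimation steps described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the procedure used for the Saratoga data. It assumes that Equation 4-2, which is not restated in this section, is the classical estimator $\hat{\gamma}(h_k) = \frac{1}{2N(h_k)} \sum (z_i - z_j)^2$ taken over the $N(h_k)$ data pairs whose separation falls in lag bin $k$. The function names `detrend_linear` and `sample_variogram`, the synthetic coordinates, and the trend coefficients are all hypothetical.

```python
import numpy as np

def detrend_linear(u, v, z):
    """Fit z ~ b0 + b1*u + b2*v by ordinary least squares.
    Returns the coefficients (b0, b1, b2) and the residuals."""
    A = np.column_stack([np.ones(len(z)), u, v])
    coef, *_ = np.linalg.lstsq(A, z, rcond=None)
    return coef, z - A @ coef

def sample_variogram(u, v, z, bin_width, n_bins):
    """Assumed classical estimator on equal-width lag bins.
    Returns (mean lag, gamma_hat, pair count N(h_k)) per bin;
    empty bins yield NaN for the lag and gamma_hat."""
    i, j = np.triu_indices(len(z), k=1)          # each distinct pair once
    h = np.hypot(u[i] - u[j], v[i] - v[j])       # separation distances
    sq = (z[i] - z[j]) ** 2                      # squared differences
    k = np.floor(h / bin_width).astype(int)      # bin index of each pair
    keep = k < n_bins                            # drop pairs beyond the last bin
    counts = np.bincount(k[keep], minlength=n_bins)
    sum_sq = np.bincount(k[keep], weights=sq[keep], minlength=n_bins)
    sum_h = np.bincount(k[keep], weights=h[keep], minlength=n_bins)
    with np.errstate(invalid="ignore", divide="ignore"):
        gamma = sum_sq / (2.0 * counts)
        lag = sum_h / counts
    return lag, gamma, counts

# Synthetic example only (not the Saratoga data): a planar trend plus noise.
rng = np.random.default_rng(0)
u = rng.uniform(0.0, 100.0, 80)                  # km, hypothetical
v = rng.uniform(0.0, 100.0, 80)                  # km, hypothetical
z = 5.0 + 0.02 * u - 0.3 * v + rng.normal(0.0, 1.0, 80)

coef, resid = detrend_linear(u, v, z)
lag, gamma, counts = sample_variogram(u, v, resid, bin_width=6.5, n_bins=10)
for lk, gk, nk in zip(lag, gamma, counts):
    print(f"lag {lk:6.2f}  gamma_hat {gk:7.3f}  pairs {nk}")
```

Returning the pair counts alongside the estimates makes it straightforward to check each bin against a minimum, such as 30 pairs, before narrowing the bin width further.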

[...] $\hat{\gamma}$ specified by points computed from Equation 4-2. In general, the larger $N(h_k)$ is for any bin or lag interval $k$, the more reliable will be the points defining $\hat{\gamma}(h_k)$. Also, the larger $K$ is, the greater the number of sample variogram points shaping $\hat{\gamma}$. However, $N(h_k)$ and $K$ are competing elements of $\hat{\gamma}$. Journel and Huijbregts (1978) suggest that each lag interval $k$ should have $N(h_k)$ equal to at least 30 pairs. The American Society for Testing and Materials (Standard D5922-96) suggests 20 pairs for each lag interval. For small data sets, the number of intervals may have to be small to guarantee either number of recommended pairs in all intervals.

[...] uniform grid but occur in a pattern that reflects problem areas, accessibility, and general spatial coverage. In the Saratoga data set, nonuniform data spacing results in the number of data pairs in each bin, although still greater than 30, being highly variable among the bins. This variability yields different reliabilities for the points defining $\hat{\gamma}$. To establish a balance for $N(h_k)$ among the bins, variable bin sizes can be used so that each bin contains approximately the same (large) number of points. A bin with fewer points can be coalesced with an adjacent bin to form a wider bin with a greater number of points. Conversely, a bin with an excessive number of points can be subdivided into adjacent, narrower bins. The coalescing and subdividing procedure is largely trial and error, until the distribution of the pairs of points is satisfactory to the investigator.
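
The balancing of pair counts among bins can be illustrated with a short Python sketch. The text describes a trial-and-error procedure guided by the investigator's judgment; the two automated rules below are assumptions swapped in for illustration only. The first places bin edges at quantiles of the pair distances so that each bin holds about the same number of pairs, and the second coalesces any bin that falls below a minimum count with an adjacent bin. The function names and the default of 30 pairs (the Journel and Huijbregts figure) are hypothetical choices.

```python
import numpy as np

def pair_distances(u, v):
    """Separation distance for every distinct pair of data locations."""
    i, j = np.triu_indices(len(u), k=1)
    return np.hypot(u[i] - u[j], v[i] - v[j])

def equal_count_edges(h, n_bins, h_max=None):
    """Variable-width bin edges placed at quantiles of the pair distances,
    so each bin contains approximately the same number of pairs."""
    if h_max is not None:
        h = h[h <= h_max]                        # optionally ignore long lags
    return np.quantile(h, np.linspace(0.0, 1.0, n_bins + 1))

def coalesce_sparse_bins(h, edges, min_pairs=30):
    """Merge each bin holding fewer than min_pairs pairs with its right-hand
    neighbor (the last bin merges to the left) until all bins qualify
    or only one bin remains."""
    edges = list(edges)
    while len(edges) > 2:
        counts, _ = np.histogram(h, bins=edges)
        sparse = np.flatnonzero(counts < min_pairs)
        if sparse.size == 0:
            break
        k = int(sparse[0])
        # Remove the edge shared with the neighbor being absorbed.
        drop = k + 1 if k + 1 < len(edges) - 1 else k
        del edges[drop]
    return np.asarray(edges)

# Synthetic, irregularly spaced sites (hypothetical, not the Saratoga data).
rng = np.random.default_rng(1)
u = rng.uniform(0.0, 100.0, 60)
v = rng.uniform(0.0, 100.0, 60)
h = pair_distances(u, v)

uniform_edges = np.arange(0.0, 70.0, 6.5)        # equal-width bins
merged_edges = coalesce_sparse_bins(h, uniform_edges, min_pairs=30)
balanced_edges = equal_count_edges(h, n_bins=8, h_max=60.0)
print(np.histogram(h, bins=merged_edges)[0])
print(np.histogram(h, bins=balanced_edges)[0])
```

Subdividing an overfull bin is not automated here; in practice the resulting edges would still be reviewed and adjusted by the investigator, as the text indicates.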

[...] number of data values $n$ needed to satisfy the $N(h_k)$ requirements for all lag intervals of a sample variogram. Simple combinatorial analysis can establish a sample size needed to achieve a given total number of distinct pairs of items taken from the sample, but it does not address the spatial [...]

[...] are the most critical to define the appropriate $\hat{\gamma}$. Therefore, the trade-off between the number of bins and the number of data pairs within each bin can be varied for different regions of the sample
