This program is for data smoothing and
calibration after measuring y = f(x) at known values of x.
It creates a best-fit spline curve by minimising the weighted
sum of squared residuals, WSSQ, or the unweighted sum of squares,
SSQ.
You can use replicates to estimate s, the standard deviation
of y at distinct x, or you can use optional substitutions,
such as s = 1 for unweighted fitting.
For instance, you will often have measured V(y), a vector of
y-signals under the same conditions as the calibration data,
and will wish to use these to predict x given y, possibly with
approximate 95% confidence limits.
The program requires the following values.
n    : total number of calibration measurements
x(i) : calibration settings x (i = 1 to n)
y(i) : calibration measurements y (i = 1 to n)
s(i) : estimated std. dev. of y (or else s(i) = 1) (i = 1 to n)
m    : number of prediction/evaluation measurements
v(i) : x-evaluation or y-prediction values v (i = 1 to m)
The s-values are used for weights (w = 1/s^2) and 95% confidence limits.
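For illustration only, the following Python sketch (numpy assumed; the
data are hypothetical and this is not SIMFIT code) estimates s from
replicates at each distinct x and forms the weights w = 1/s^2:

    import numpy as np

    # Hypothetical calibration data with duplicate measurements at
    # each distinct x-setting.
    x = np.array([1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0])
    y = np.array([2.1, 1.9, 3.8, 4.2, 6.1, 5.9, 8.2, 7.8])

    # Estimate s at each distinct x as the sample standard deviation
    # of the replicate y-values, assigning it to every replicate.
    s = np.empty_like(y)
    for xv in np.unique(x):
        mask = (x == xv)
        s[mask] = np.std(y[mask], ddof=1)

    w = 1.0 / s**2   # weights for WSSQ; with s(i) = 1, WSSQ reduces to SSQ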
Interior spline knots are selected points lying inside the range set by the x-data, where the sections of the best-fit cubic spline curve join smoothly together. The number of knots used depends on the number of distinct x-values in the data. If you use too many spline knots, the calibration curve may be too flexible, allowing turning points.
The default spline type uses cross validation, with a knot at each distinct x-value and an algorithm that applies tension by calculating a smoothing parameter. The resulting curve attempts to balance goodness of fit against smoothness, as estimated from the second derivatives at the knots. With noisy data, or where the calibration data supplied cause numerical difficulties, an IFAIL message will be issued, and this can result in failure to create a good standard curve.
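SIMFIT's cross validation algorithm is internal to the program, but the
following sketch (Python with scipy; hypothetical data, and only a rough
analogue of the method just described) shows how a smoothing factor
trades goodness of fit against flexibility when knots are placed
automatically:

    import numpy as np
    from scipy.interpolate import UnivariateSpline

    # Hypothetical noisy calibration data; x must be strictly increasing.
    x = np.linspace(1.0, 10.0, 20)
    rng = np.random.default_rng(0)
    y = np.sqrt(x) + rng.normal(scale=0.05, size=x.size)

    # Note: scipy expects w = 1/s (it squares w times the residual
    # internally), unlike the w = 1/s^2 convention used in the text.
    w = np.full_like(x, 1.0 / 0.05)

    # The smoothing factor acts like the tension parameter: larger
    # values give a stiffer curve, smaller values a more flexible one.
    # With w = 1/s, values near len(x) are a standard starting point.
    spline = UnivariateSpline(x, y, w=w, k=3, s=len(x))
    print(float(spline(5.0)))   # smoothed estimate of y at x = 5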
If such failures occur you should change the configuration of
program CALCURVE and fit a simple weighted least squares spline
curve instead, using the sparse, medium, or dense options to control
the smoothing by varying the number of interior knots. With noisy
data the dense or solid options may generate wavy standard curves,
in which case the sparse or medium knot densities should be preferred.
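The following sketch illustrates this weighted least squares
alternative (scipy assumed; the knot counts shown are illustrative,
not the program's actual sparse, medium, or dense settings). The
residual falls as the interior knots become denser, at the cost of a
more flexible curve:

    import numpy as np
    from scipy.interpolate import LSQUnivariateSpline

    x = np.linspace(1.0, 10.0, 30)
    rng = np.random.default_rng(1)
    y = np.sqrt(x) + rng.normal(scale=0.05, size=x.size)

    # Sparse, medium and dense choices of interior knots: more knots
    # give a more flexible curve, fewer knots a stiffer one.
    for n_knots in (2, 5, 10):
        # Interior knots must lie strictly inside (x[0], x[-1]).
        knots = np.linspace(x[0], x[-1], n_knots + 2)[1:-1]
        spline = LSQUnivariateSpline(x, y, knots, k=3)
        print(n_knots, spline.get_residual())   # weighted SSQ decreases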
To calibrate with lines, quadratics, etc. use program POLNOM.
To calibrate with your own model, etc. use program QNFIT.
For bioassay, a GLM method is also provided by program SIMSTAT,
and there are many other SIMFIT options for estimating LD50, etc.
Consult the tutorial documents about spline smoothing and
use program SPLINE to understand these issues.
In this program log(x) means logarithm to base 10.
You can only expect a good calibration curve if the settings
for x are sufficiently dense and uniformly spaced over the
range of x to prevent the best-fit curve from oscillating.
Failure to present the program with appropriate x data can
lead to turning points and ambiguous prediction of x if the
calibration curve fluctuates too much about the data.
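One simple way to check a fitted curve for such turning points (a
sketch using scipy, not a SIMFIT facility) is to look for sign changes
in the first derivative of the spline:

    import numpy as np
    from scipy.interpolate import UnivariateSpline

    def has_turning_points(spline, x_lo, x_hi, n=1000):
        # A sign change in the first derivative means a turning point,
        # and hence ambiguous prediction of x from y.
        grid = np.linspace(x_lo, x_hi, n)
        d = spline.derivative()(grid)
        return bool(np.any(d[:-1] * d[1:] < 0))

    # Hypothetical wavy fit caused by sparse, noisy x-settings.
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    y = np.array([1.0, 2.5, 1.8, 3.9, 3.2, 5.0])
    spl = UnivariateSpline(x, y, k=3, s=0.0)      # interpolating spline
    print(has_turning_points(spl, x[0], x[-1]))   # True for these data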
A common case is data prepared by a process involving serial
dilution of a stock solution, so that log(x) is uniformly spaced
while x is geometrically spaced. It would then be better to
transform x into log(x) using program EDITMT before input to the
program. You could also try log(x) if your data approach a
horizontal asymptote.
There are two distinct techniques for using log(x) instead of x: you can input your data as log(x) directly, or you can transform to log(x) interactively. If you use this internal transformation, then any spline or graphical coordinates saved will be in log(x), but predictions from y-input and the calibration curve will be reported as x-values.
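As a sketch of this back-transformation (Python with scipy; the
serially diluted data are hypothetical): fit on the log10(x) scale,
but report the prediction on the original x scale:

    import numpy as np
    from scipy.interpolate import UnivariateSpline
    from scipy.optimize import brentq

    # Serial dilution: x geometrically spaced, log10(x) uniformly spaced.
    x = np.array([1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0])
    y = np.array([0.9, 2.1, 2.9, 4.1, 5.0, 5.9, 7.1])

    lx = np.log10(x)                 # internal coordinate is log(x)
    spline = UnivariateSpline(lx, y, k=3, s=len(x))

    # Predict x given y: solve spline(log10(x)) = y, then back-transform.
    y_obs = 3.5
    lx_pred = brentq(lambda t: float(spline(t)) - y_obs, lx[0], lx[-1])
    print(10.0**lx_pred)             # the prediction is reported as x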
If a correct deterministic equation is fitted to calibration data, the sum of squares and best-fit equation can yield exact confidence limits. This approach is not possible with splines since, owing to their high flexibility and local properties depending on knot placement, the sums of squares can be made arbitrarily small. Also, constant variance is often not appropriate if y covers a large range. This program uses the s values supplied, or else w from a weighting option (or SSQ if s = 1, or w = 1), to construct a local variance estimate, from which it constructs 95% confidence curves that are then used to predict confidence limits for x given y and y given x. These confidence limits are only approximate and should be interpreted with a fair degree of restraint and common sense. Remember that garbage in equals garbage out: if you supply ridiculous w values for weighting, you must not be surprised to find the program giving equally ridiculous confidence limits. As the values you input or set for s or %|y| increase, the confidence limits will expand.
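To make the idea concrete, here is a much simplified sketch (Python
with scipy; an illustration of the principle only, not the program's
actual algorithm) that wraps a constant CV% band around a fitted curve
and inverts it to obtain approximate limits for x given y:

    import numpy as np
    from scipy.interpolate import UnivariateSpline
    from scipy.optimize import brentq

    x = np.linspace(1.0, 10.0, 15)
    rng = np.random.default_rng(2)
    y = 2.0 * x + rng.normal(scale=0.2, size=x.size)
    spline = UnivariateSpline(x, y, k=3, s=len(x))

    cv = 5.0   # ERR: percentage coefficient of variation of y

    def band(t):
        # Approximate 95% half-width from a constant CV% of y.
        yv = float(spline(t))
        half = 1.96 * (cv / 100.0) * abs(yv)
        return yv - half, yv + half

    # Limits for x given y: invert the centre, upper and lower curves.
    y_obs = 10.0

    def invert(curve):
        return brentq(lambda t: curve(t) - y_obs, x[0], x[-1])

    x_mid = invert(lambda t: float(spline(t)))
    x_lo = invert(lambda t: band(t)[1])   # upper curve -> lower x-limit
    x_hi = invert(lambda t: band(t)[0])   # lower curve -> upper x-limit
    print(x_lo, x_mid, x_hi)              # larger cv widens these limits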
Expert mode is used when you have a set of data for a standard
curve that you want to use repeatedly for predicting x given y.
To achieve this, additional values are appended to the standard
curve data file, so that the same standard curve is always generated
from the same data file. To understand this you should compare the
following two test files provided for program CALCURVE.
calcurve.tf1 : demonstrates the EXPERT mode flags added to the end of the data
calcurve.tf2 : the same data without weighting or EXPERT mode flags
Expert mode is only activated locally, when additional data are
appended as in calcurve.tf1, and the standard defaults are restored
when the next standard curve without additional parameters is input.
When you have used the program for a while you will realise that there are just eight options involved in the standard weighted least squares method, and you will have a good idea which settings you personally require. If you choose expert mode you can paste a trailer onto the end of your files containing an integer j, then I1, ..., I8 and a number ERR, as follows.
...,  ...,  ...
x(n), y(n), s(n)                      (last line of data)
j                                     (no. of text lines)
I1, I2, I3, I4, I5, I6, I7, I8, ERR   (control settings)
...                                   (j - 1 text lines)

I1 etc. will then over-ride the default settings. The actual values for I1, I2, ..., I8 are those entered from the menus, and ERR is the percentage coefficient of variation of y, as summarised shortly.
Remember that, if you decide to use expert mode to run this program, you are living dangerously unless you are quite sure what values to substitute for I1, I2, ..., I8 and ERR. One advantage of expert mode is that, when you have found a set of control parameters that work well with your data, you can install them as defaults by pasting them onto the end of the calibration files. The great advantage, however, is when you have one special data set that you want to use repeatedly to predict x given y, while being quite certain that the same standard curve is always generated.
To use this mode you will have to prepare calibration input files with title and data, followed by the values you choose to substitute for the integers I1, I2, ..., I8 and ERR. Print or browse the test file calcurve.tf1 for an example, which has the following additional line after the data:
2, 2, 2, 1, 2, 3, 2, 2, 5.0

The meaning of these values will now be explained.
I1 : data input mode   1:=File/File  2:=File/Keyb  3:=Keyb/Keyb
I2 : internal coord.   1:=x  2:=log(x)
I3 : spline density    1:=Sparse  2:=Medium  3:=Dense  4:=Solid
I4 : weights           1:=1/s^2  2:=1/%|y|^2  3:=1
I5 : graph coord.      1:=x,y  2:=add 95%cl
I6 : 95% con. lim.     1:=None  2:=Slack  3:=Medium  4:=Tight
I7 : Reserved for future use
I8 : Reserved for future use
ERR: If I4 = 1, ERR is not used, as the s values supplied will be used.
     If I4 = 2 or I4 = 3, set ERR = percentage coefficient of variation,
     i.e. CV%, where CV% = 100(sample standard deviation of y)/|y|.
     ERR is then used to estimate approximate 95% confidence limits for
     plotting or predicting x from y or y from x.
In expert mode I1, I2, ..., I8 and ERR over-ride the defaults.
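To see how such a trailer line could be decoded, here is a small
hypothetical sketch in Python (the field layout follows the scheme
above, but the parser itself is illustrative and not part of SIMFIT):

    def parse_control_line(line):
        # Split a control line such as "2, 2, 2, 1, 2, 3, 2, 2, 5.0"
        # into the eight integers I1..I8 and the number ERR.
        fields = [f.strip() for f in line.split(",")]
        return [int(f) for f in fields[:8]], float(fields[8])

    settings, err = parse_control_line("2, 2, 2, 1, 2, 3, 2, 2, 5.0")
    print(settings, err)   # [2, 2, 2, 1, 2, 3, 2, 2] 5.0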