Program MMFIT fits similar models, but with dissociation constants instead of association constants, and program SFFIT fits cooperative binding isotherms. Program QNFIT allows users to constrain individual parameters and to use the best-fit model as a standard curve for calibration.
If there is no response at zero ligand concentration, then the appropriate binding model for the fractional saturation is
y = K*x/[1 + K*x]
and the response, f(x), would be proportional to the fraction of sites occupied, i.e.
f(x) = A*y, where A is the estimated constant of proportionality for observations as a function of y.
To fix A at an arbitrary constant value, say 1 or 100, use program QNFIT.
If there is an unavoidable baseline response, C, at zero ligand concentration then we would have
f(x) = A*y + C
where the constant C could be estimated independently and subtracted from the
measured response, or estimated during the curve fitting.
Where possible, it is recommended that you estimate C independently and subtract it from the data so that y(0) = 0.
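As a minimal sketch of the one-site model with a baseline, the Python function below evaluates f(x) = A*K*x/[1 + K*x] + C and shows that subtracting an independently estimated C restores a zero response at x = 0. The function name and all numerical values are illustrative assumptions and are not part of HLFIT.

```python
def one_site_response(x, A, K, C=0.0):
    """Response f(x) = A*K*x/(1 + K*x) + C for one class of binding sites."""
    return A * K * x / (1.0 + K * x) + C


# Illustrative values only (assumed, not taken from any data set).
A, K, C = 100.0, 2.0, 5.0
x_values = [0.0, 0.25, 0.5, 1.0, 4.0]
measured = [one_site_response(x, A, K, C) for x in x_values]

# Subtract the independently estimated baseline so the response is zero at x = 0.
corrected = [f - C for f in measured]
```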
If an inhibitor is added at fixed ligand concentration, or cold ligand is added to replace hot, then the modified equation would be
f(x) = B*K/[1 + K*x] + C.
One complication is the widespread use of both association constants and dissociation constants (i.e. the reciprocals) and the meaning of half saturation points when baseline constants C are fitted. Program HLFIT estimates the values for x giving y = 0.5, i.e., the true half saturation points. Note that, if you prefer dissociation constants instead of association constants, use program MMFIT.
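The following Python sketch illustrates, for the one-site case only, why a fitted baseline complicates the meaning of a half saturation point. The true half saturation point (y = 0.5) is always x = 1/K, which is the dissociation constant, whereas the x at which the observed response reaches half of its asymptote A + C moves when C is nonzero. The function names are hypothetical and the second function assumes A > C >= 0.

```python
def true_half_saturation(K):
    """x at which y = K*x/(1 + K*x) equals 0.5, i.e. 1/K (the dissociation constant)."""
    return 1.0 / K


def half_of_top_response(A, K, C):
    """x at which f(x) = A*K*x/(1 + K*x) + C equals half its asymptote, (A + C)/2.

    Solving A*y + C = (A + C)/2 gives y = (A - C)/(2*A), then x = y/(K*(1 - y)).
    Assumes A > C >= 0.
    """
    y = (A - C) / (2.0 * A)
    return y / (K * (1.0 - y))
```

With C = 0 the two definitions coincide. With A = 1, K = 2 and C = 0.2 the second definition gives x = 1/3 rather than the true value 1/K = 0.5.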
If the protein has multiple but independent binding sites, or else there is a mixture of proteins with differing binding constants, the curve fitting is much more difficult, and comparable simple explicit expressions are not available for the apparent half-saturation points. However, at solution points, the apparent Ka values can easily be estimated numerically from the best-fit parameters. The extension of this model to two or more sites is considered next but, in reality, it is seldom possible to justify more than two sites by goodness of fit, graphical, or statistical techniques.
Note that n multiple independent sites on 1 protein, a mixture of n kinetically distinct independent receptors, or 1 protein with n negatively cooperatively linked sites cannot be differentiated by fitting this multi-saturation-curve model, as all give non-sigmoidal saturation curves and uniformly concave down double reciprocal plots. Program SFFIT must be used to fit sigmoid curves, or similar deviations from hyperbolic binding isotherms.
A further point is that the pairwise parameters have no unambiguous order, as the model is a sum of independent sub-models. So, in order to facilitate the comparison of parameter estimates from different data sets, the parameters are sorted on output into increasing order of amplitude parameters. In other words, in the sequence
A(1), Ka(1), A(2), Ka(2), ..., A(n), Ka(n)
the parameters have been sorted so that
A(1) =< A(2) =< ... =< A(n)
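The ordering can be sketched in Python as a sort of linked (A, Ka) pairs on the amplitude. This is only an illustration of the output convention described above, with a hypothetical function name and made-up values.

```python
def sort_by_amplitude(pairs):
    """Return (A, Ka) pairs sorted so that A(1) <= A(2) <= ... <= A(n).

    Each amplitude stays linked to its own association constant.
    """
    return sorted(pairs, key=lambda pair: pair[0])


# Illustrative estimates only: (A, Ka) for two classes of site.
pairs = [(0.7, 10.0), (0.3, 0.5)]
ordered = sort_by_amplitude(pairs)
```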
Program HLFIT will probably be able to automatically locate starting estimates for n =< 3 but, in the unlikely event of requiring n > 3, you may have to supply starting estimates, or use program QNFIT, with starting estimates and limits appended to the data file using the begin{limits} ... end{limits} technique, or provided as a parameter limits file. Using program QNFIT in this way, the ordering of parameters can be arranged to be unique.
In Normal mode this program fits the binding site models
f(x) = A(1)Ka(1)x/[1 + Ka(1)x] + A(2)Ka(2)x/[1 + Ka(2)x] + ... + A(n)Ka(n)x/[1 + Ka(n)x] + C
where it is assumed that A(i) > 0, Ka(i) > 0, and order n is the number of distinct types of independent ligand binding sites, while x is ligand concentration. You can fit one model by simply fixing n, or fit a sequence by allowing n to vary. Asymptotes and half saturation points are calculated numerically. The coefficients A(i) are proportional to numbers of sites with association constant Ka(i), as determined by the units you have used to express y(i) and x(i) values.
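A minimal Python sketch of the Normal mode model follows, together with one possible way of locating a half saturation point numerically. Bisection is an assumption chosen for simplicity here, since the numerical method used by HLFIT is not stated, and the function names are hypothetical. The half saturation point is taken as the x at which the sum, without C, reaches half of its asymptote A(1) + A(2) + ... + A(n).

```python
def multi_site_response(x, amplitudes, constants, C=0.0):
    """f(x) = sum of A(i)*Ka(i)*x/(1 + Ka(i)*x) over all sites, plus C."""
    return sum(A * Ka * x / (1.0 + Ka * x)
               for A, Ka in zip(amplitudes, constants)) + C


def half_saturation_point(amplitudes, constants, tol=1.0e-12):
    """x at which the sum (without C) equals half of sum(A(i)), found by bisection."""
    target = 0.5 * sum(amplitudes)
    lo, hi = 0.0, 1.0 / min(constants)
    # At x = 1/min(Ka) every component is at least half saturated, so hi
    # normally brackets the root already; the loop is only a safeguard.
    while multi_site_response(hi, amplitudes, constants) < target:
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if multi_site_response(mid, amplitudes, constants) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For a single site this returns 1/Ka. For two sites with equal amplitudes and Ka values of 1 and 4 it returns 0.5, which can be confirmed algebraically.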
Normally you would only require order n = 1 but, if your data are good, then you can see if order n = 2 gives a statistically significant improvement. Only if the data are very extensive and of high quality is there any point in trying n >= 3, and even then the parameter estimates will be of little value. Data for analysis must be in a formatted file that you can prepare, edit, and weight using programs MAKFIL and EDITFL.
Instead of measuring saturation as an increasing function of ligand concentration, you may have measured saturation as a decreasing function of an inhibitor competing with the ligand at fixed concentration. For example, in Isotope mode this program fits the binding site models
f(x) = B(1)K(1)/[1 + K(1)x] + B(2)K(2)/[1 + K(2)x] + ... + B(n)K(n)/[1 + K(n)x] + C
where B(i) > 0, K(i) > 0, and order n is the number of distinct types of independent ligand binding sites, while x is [Cold], i.e. the concentration of unlabelled ligand, with [Hot] held fixed. Note the identities:
K(i) = Ka(i)/(1 + Ka(i)*[Hot]), and
B(i) = [Hot]*A(i).
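These identities can be checked numerically. The Python sketch below assumes that hot and cold ligand compete for the same sites with the same association constant Ka, so that the bound label is A*Ka*[Hot]/(1 + Ka*[Hot] + Ka*[Cold]). That competition expression and the function names are assumptions introduced for illustration, and the two functions should agree for any nonnegative [Cold].

```python
def displacement_direct(cold, hot, A, Ka):
    """Bound label when hot and cold ligand compete for one site type (same Ka)."""
    return A * Ka * hot / (1.0 + Ka * hot + Ka * cold)


def displacement_isotope_form(cold, hot, A, Ka):
    """The same quantity written as B*K/(1 + K*x) with x = [Cold]."""
    K = Ka / (1.0 + Ka * hot)   # K(i) = Ka(i)/(1 + Ka(i)*[Hot])
    B = hot * A                 # B(i) = [Hot]*A(i)
    return B * K / (1.0 + K * cold)
```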
Maximum saturations and half saturation points are calculated numerically. The coefficients B(i) are proportional to numbers of sites with association constant Ka(i), as determined by the units you have used to express y(i) and x(i) values.
Normally you would only require order n = 1 but, if your data are good, then you can see if order n = 2 gives a statistically significant improvement. Only if the data are very extensive and of high quality is there any point in trying n >= 3, and even then the parameter estimates will be of little value. Isotope mode is used when saturation by [Hot], i.e. labelled ligand, is inhibited by displacement with [Cold], i.e. unlabelled ligand.
The program first checks to make sure that the data provided are consistent, i.e. x >= 0 and in nondecreasing order, y >= 0, and s >= the lowest limit allowed for weighting. You will be warned if the data do not satisfy these criteria, or if the units of measurement are too large or too small. You should always choose units such that x and y are of order unity. The program then checks the data to identify replicates, and so avoid unnecessary repeated function evaluations, before scaling to order unity internally.
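The consistency checks can be sketched in Python as follows. This is not the HLFIT code: the function names are hypothetical and the lower limit s_min for weighting is an assumed placeholder value, since the actual limit is not stated here.

```python
def check_data(x, y, s, s_min=1.0e-10):
    """Return a list of warnings; an empty list means the data look consistent.

    s_min is a hypothetical lower limit for the standard errors.
    """
    problems = []
    if any(xi < 0 for xi in x):
        problems.append("negative x value")
    if any(b < a for a, b in zip(x, x[1:])):
        problems.append("x not in nondecreasing order")
    if any(yi < 0 for yi in y):
        problems.append("negative y value")
    if any(si < s_min for si in s):
        problems.append("s below the limit allowed for weighting")
    return problems


def distinct_x(x):
    """Distinct values of a sorted x list, so replicates share one model evaluation."""
    values = []
    for xi in x:
        if not values or xi != values[-1]:
            values.append(xi)
    return values
```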
The program now does a number of calculations with the data in order to estimate the parameters by algebraic procedures. Then a global random search is carried out to see if better estimates can be located, and finally a local directed random search is done to seek further improvement. However, you can input your own starting estimates if required. From the final starting estimates a diagonal parameter scaling matrix is constructed so that internal parameters are of order unity.
Before accepting a higher order model, the goodness of fit and model discrimination statistics should be consulted. In particular, graphical deconvolution to show the contributions of individual component functions to the overall sum should be inspected, to make sure all components are making an appreciable contribution. Only if a higher order model is both suggested by the statistical tests and supported by a convincing graphical deconvolution should a higher order model be preferred over a lower order one.
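As an illustration of what graphical deconvolution displays, the Python sketch below evaluates the separate component functions of a fitted model at a given x, together with the fraction each contributes to the sum. The function names and values are hypothetical, and plotting is omitted. A component whose fraction stays negligible over the whole range of the data would argue against the higher order model.

```python
def component_contributions(x, amplitudes, constants):
    """Value of each sub-model A(i)*Ka(i)*x/(1 + Ka(i)*x) at concentration x."""
    return [A * Ka * x / (1.0 + Ka * x) for A, Ka in zip(amplitudes, constants)]


def relative_contributions(x, amplitudes, constants):
    """Fraction of the total (baseline excluded) contributed by each component."""
    parts = component_contributions(x, amplitudes, constants)
    total = sum(parts)
    if total <= 0.0:
        return [0.0] * len(parts)
    return [p / total for p in parts]
```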
With more than one high/low affinity site, it is arbitrary which is Ka(1) and which is Ka(2), etc., since this depends on the starting estimates. However, the A(i), Ka(i) pairs are linked.
It is sometimes required to fit high/low affinity site models with
constraints. For instance:
A(i) >= 0, and A(1) + A(2) + ... + A(n) = k
where we would have k = 1 if the data are normalised to proportions,
or k = 100 if the data are in the form of percentages. Program QNFIT can be
used for this purpose, as well as for calibration, and the cases where A(i)
and/or Ka(i) can be fixed at known values, etc.
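One way to picture such a constraint for n = 2 is to eliminate a parameter, treating A(1) as free on [0, k] and setting A(2) = k - A(1). The Python sketch below is only an illustration of that idea. It is not a description of how QNFIT imposes constraints, and the function name is hypothetical.

```python
def constrained_two_site(x, A1, Ka1, Ka2, k=1.0):
    """Two-site model with A(1) + A(2) = k enforced by setting A(2) = k - A(1)."""
    if not 0.0 <= A1 <= k:
        raise ValueError("A1 must lie between 0 and k")
    A2 = k - A1
    return A1 * Ka1 * x / (1.0 + Ka1 * x) + A2 * Ka2 * x / (1.0 + Ka2 * x)
```

With this parameterisation the asymptote at large x is k, whatever the values of A(1), Ka(1) and Ka(2).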
The data file must have columns of x, y, and optionally s, where x is the ligand concentration, y is the saturation, and s is the standard error of y (s can be omitted or set = 1 for no weighting). Replicates must be provided, not means of replicates, and x, y and s must be nonnegative. Also x must be in nondecreasing order. Try the test file hlfit.tf4 in normal mode for an example of data for two independent distinct binding sites with the same substrate.
HLFIT terminates when either the relative change in the objective function (wssq/ndof) or the infinity norm of the projected gradient vector is less than its tolerance value (set by factr and pgtol respectively). Finite differences and the default tolerances should be used unless you have a special need for analytic gradients and high precision.
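The two stopping tests can be sketched in Python as below. The names factr and pgtol match those used by the L-BFGS-B bound-constrained optimizer, and the sketch assumes that algorithm's conventions: the relative change is compared with factr times machine precision, and the projected gradient is P(x - g) - x, where P clips to the parameter bounds. Whether HLFIT applies exactly these formulas, and the default values shown, are assumptions.

```python
import sys


def projected_gradient_norm(x, g, lower, upper):
    """Infinity norm of P(x - g) - x, where P clips each component to its bounds."""
    return max(abs(min(max(xi - gi, lo), hi) - xi)
               for xi, gi, lo, hi in zip(x, g, lower, upper))


def converged(f_old, f_new, x, g, lower, upper, factr=1.0e7, pgtol=1.0e-5):
    """True if either stopping test is met (default tolerances are assumed values)."""
    eps = sys.float_info.epsilon
    rel_change = (f_old - f_new) / max(abs(f_old), abs(f_new), 1.0)
    small_change = rel_change <= factr * eps
    small_gradient = projected_gradient_norm(x, g, lower, upper) <= pgtol
    return small_change or small_gradient
```

Note that a parameter sitting on a bound, with the gradient pushing it further outside, contributes zero to the projected gradient, so it does not prevent termination.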
If all these options are switched off, the program will simply calculate the fit and then display a table of parameter estimates. However, the default options also output goodness of fit criteria and plots of the data with best-fit curves. It is possible to use an explicitly calculated gradient vector rather than a finite difference estimation for the iterations, but this can slow the fitting down and is only ever required when investigating the convergence with higher order models. There is also an option to store the estimated parameters and covariance matrices for retrospective investigation concerning model discrimination. Note that, when fitting models in sequence of increasing order, statistical tests are output for model discrimination. However, the most convincing argument for accepting a higher order model is to plot the contribution of the independent sub-models to the fit, which is loosely described as graphical deconvolution. Note that SIMFIT provides a facility to extract tables from the results log files to import into LaTeX documents or word processing programs.
Sometimes there is a background signal, and the best way to remove this is to estimate it independently and then subtract it from the y-data before fitting. Program HLFIT also allows such a background to be estimated as an additional correction parameter, but this should only be requested when absolutely necessary, such as when the background changes unavoidably between experiments, as it makes fitting much harder.