Given a polynomial g(x) of degree n defined as follows
g(x) = a(0) + a(1)x + a(2)x^2 +...+ a(n)x^n
and a polynomial h(x) of degree m
h(x) = b(0) + b(1)x + b(2)x^2 +...+ b(m)x^m
we can define an n to m rational function f(x)
f(x) = g(x)/h(x).
This would appear to have n + m + 2 parameters to vary or estimate, but only n + m + 1 of them are independent, because the numerator g(x) and the denominator h(x) can both be divided by any one non-zero parameter with no change in the behaviour of f(x).
One convenient option is to divide by b(0), which is what this program does. The denominator then starts at 1 so that, with non-negative x and non-negative parameters a(i) and b(i), it is always at least 1 and f(x) can never become undefined.
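The effect of dividing through by b(0) can be checked numerically. The following is a minimal sketch in Python, not part of SIMFIT; the function names and the coefficient values are hypothetical and chosen only for illustration.

```python
def rational(x, a, b):
    """Evaluate g(x)/h(x) from coefficient lists a (numerator) and b (denominator)."""
    g = sum(a_i * x**i for i, a_i in enumerate(a))
    h = sum(b_i * x**i for i, b_i in enumerate(b))
    return g / h


def normalise(a, b):
    """Divide every coefficient by b(0) so that the denominator starts at 1."""
    b0 = b[0]
    if b0 == 0:
        raise ValueError("b(0) must be non-zero to normalise by it")
    return [a_i / b0 for a_i in a], [b_i / b0 for b_i in b]


# Hypothetical 2:2 coefficients: a(0), a(1), a(2) and b(0), b(1), b(2).
a = [0.0, 3.0, 1.0]
b = [2.0, 4.0, 0.5]
A, B = normalise(a, b)

# f(x) is unchanged by the normalisation, so one parameter is redundant.
for x in (0.0, 0.5, 2.0, 10.0):
    assert abs(rational(x, a, b) - rational(x, A, B)) < 1e-12

print(A, B)
```

After normalisation the 2:2 function has five free parameters rather than six, which is the n + m + 1 count for n = m = 2.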
Curve fitting rational functions is extremely problematical and will only be successful if the behaviour of such functions is understood and the orders n and m are not too high, say less than or equal to about 4.
If good starting estimates are used and a large sample of data with high signal to noise ratio is available, then the advanced program QNFIT can be used, as it has many additional options.
However, this SIMFIT program RFFIT is restricted to positive n to n rational functions, such as are encountered in the analysis of steady state enzyme kinetics. It assumes that the numerator and denominator have both been divided by b(0), giving
f(x) = g(x)/h(x),
where g(x) and h(x) are polynomials, i.e.
g(x) = A(0) + A(1)x + A(2)x^2 +...+ A(n)x^n
h(x) = 1.0 + B(1)x + B(2)x^2 +...+ B(n)x^n
and where A(i) >= 0 and B(i) >= 0.
Either one or a sequence of models can be fitted, by choosing lowest and highest order n.
If x is an activator or inhibitor, you may need all terms but, for enzyme kinetics, choose the option to set A(0) = 0, while for substrate inhibition try setting A(0) = 0 and A(n) = 0. Sometimes just A(n) = 0 is required.
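The effect of these constraints on the shape of the curve can be seen by evaluating the model at x = 0 and at very large x. The sketch below is illustrative Python, not RFFIT code; `rffit_model` and all parameter values are hypothetical. It uses the facts that f(0) = A(0) and that f(x) tends to A(n)/B(n) as x tends to infinity when B(n) > 0.

```python
def rffit_model(x, A, B):
    """Evaluate the positive n:n rational function.

    A holds A(0)..A(n); B holds B(1)..B(n), the leading 1.0 of h(x) is implicit.
    """
    if len(A) != len(B) + 1:
        raise ValueError("need n + 1 numerator and n denominator coefficients")
    if x < 0 or any(c < 0 for c in A) or any(c < 0 for c in B):
        raise ValueError("x, A(i) and B(i) must all be non-negative")
    g = 0.0
    for c in reversed(A):                  # Horner evaluation of g(x)
        g = g * x + c
    h = 0.0
    for c in reversed([1.0] + list(B)):    # Horner evaluation of h(x)
        h = h * x + c
    return g / h


B = [1.0, 1.5]             # hypothetical B(1), B(2)
all_terms = [0.5, 2.0, 3.0]    # activator/inhibitor: every A(i) free
kinetics = [0.0, 2.0, 3.0]     # enzyme kinetics: A(0) = 0
inhibition = [0.0, 2.0, 0.0]   # substrate inhibition: A(0) = 0 and A(n) = 0

big = 1.0e6
print(rffit_model(0.0, all_terms, B), rffit_model(big, all_terms, B))
print(rffit_model(0.0, kinetics, B), rffit_model(big, kinetics, B))
print(rffit_model(0.0, inhibition, B), rffit_model(big, inhibition, B))
```

With all terms present the curve starts at A(0) and levels off at A(n)/B(n); setting A(0) = 0 forces it through the origin, and setting A(n) = 0 as well makes it fall back towards zero at high x.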
Normally you would only require order n = 1 or n = 2 but, if your data are good, then you can see if order n = 3 or n = 4 gives a statistically significant improvement. Only if the data are very extensive and of high quality is there any point in using n >= 3, since otherwise the parameter estimates will be ill-determined. You can prepare, edit and weight a data input file using programs MAKFIL and EDITFL.
First, data ranges are estimated and used for re-scaling into internal coordinates. Then analysis of low x and high x points is used to estimate the slope and intercept at the origin, and the value and rate of approach to the final horizontal asymptote. These estimates will only be useful if your data approach near to the origin and the asymptote. A wide-ranging random lognormal search, followed by a local search of the allowed parameter space, then attempts to find feasible starting estimates. Next, constrained minimisation of the over-determined linear system in the L1 norm, starting from the best random estimates, is done to improve these estimates. Since these procedures are not likely to succeed with noisy data or higher order equations, you can input starting estimates directly. The aim is for the internal parameters and objective function to be scaled to order unity at the solution point.
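To show why data near the origin and near the asymptote matter for starting estimates, here is a minimal sketch for the simplest 1:1 case. It is an illustration only and not the algorithm RFFIT uses; the function names, the choice of three points at each end, and the test values are all assumptions. For f(x) = (A(0) + A(1)x)/(1 + B(1)x), the intercept is A(0), the slope at the origin is A(1) - A(0)B(1), and the asymptote is A(1)/B(1), so three end-point measurements determine the three parameters.

```python
def rational_1_1(x, A0, A1, B1):
    """The 1:1 model (A(0) + A(1)x) / (1 + B(1)x)."""
    return (A0 + A1 * x) / (1.0 + B1 * x)


def line_fit(xs, ys):
    """Ordinary least squares straight line; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def start_estimates_1_1(xs, ys, k=3):
    """Rough A(0), A(1), B(1) from the k lowest and k highest x points."""
    intercept, slope = line_fit(xs[:k], ys[:k])   # behaviour at the origin
    asymptote = sum(ys[-k:]) / k                  # level reached at high x
    if asymptote == intercept:
        raise ValueError("data show no change between origin and asymptote")
    B1 = slope / (asymptote - intercept)          # from slope = B1*(asymptote - A0)
    A1 = asymptote * B1                           # from asymptote = A1/B1
    # Clamp to the non-negative region required for positive rational functions.
    return max(intercept, 0.0), max(A1, 0.0), max(B1, 0.0)


# Exact data from hypothetical parameters, sampled close to both extremes.
true = (0.2, 3.0, 1.5)
xs = [0.0, 0.001, 0.002, 0.5, 1.0, 5.0, 1.0e4, 2.0e4, 4.0e4]
ys = [rational_1_1(x, *true) for x in xs]
A0, A1, B1 = start_estimates_1_1(xs, ys)
print(A0, A1, B1)
```

If the lowest x values were far from zero, or the highest far from saturation, the slope and asymptote would be biased and so would the derived estimates, which is the limitation described above.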
The program checks to make sure that x and y are nonnegative, and that x is in nondecreasing order, so groups of replicates can be identified and starting estimates calculated.
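The checks described here can be sketched as follows. This is illustrative Python rather than the program's own code; `check_and_group` is a hypothetical name, and it assumes that replicates are recorded with exactly equal x values.

```python
from itertools import groupby


def check_and_group(x, y):
    """Validate the data, then return [(x value, [replicate y values]), ...]."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if any(v < 0 for v in x) or any(v < 0 for v in y):
        raise ValueError("x and y must be non-negative")
    if any(later < earlier for earlier, later in zip(x, x[1:])):
        raise ValueError("x must be in non-decreasing order")
    # Because x is sorted, equal x values are adjacent and groupby finds them.
    return [(xv, [pair[1] for pair in pairs])
            for xv, pairs in groupby(zip(x, y), key=lambda pair: pair[0])]


# Hypothetical data with replicates at x = 0.5 and x = 2.0.
x = [0.0, 0.5, 0.5, 1.0, 2.0, 2.0, 2.0]
y = [0.0, 0.9, 1.1, 1.6, 2.2, 2.4, 2.3]
groups = check_and_group(x, y)
print(groups)
```

Requiring sorted x is what makes the single pass with `groupby` sufficient; unsorted data would split one set of replicates into several groups.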
The files rffit.tf1, rffit.tf2, rffit.tf3, rffit.tf4, rffit.tf5 and rffit.tf6 are examples of arbitrary exact data which can be fitted to see typical curves. Program ADDERR can then be used to add error to these files, or else other test files like mmfit.tf4 can be analysed.
RFFIT terminates when either the relative change in the objective function (wssq/ndof) or the infinity norm of the projected gradient vector falls below its tolerance value (set by factr and pgtol respectively). Finite differences with the default tolerance should be used, unless you have a special need for analytic derivatives and high precision.
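The names factr and pgtol are the ones used by the L-BFGS-B bound-constrained optimiser. Assuming RFFIT follows the same conventions (an assumption, as this text does not say so), the two stopping tests could be sketched like this in Python; the function names and the numbers in the example are hypothetical.

```python
import sys

EPS = sys.float_info.epsilon  # machine precision, about 2.2e-16


def relative_change_small(f_old, f_new, factr):
    """L-BFGS-B style test: (f_old - f_new)/max(|f_old|, |f_new|, 1) <= factr*EPS."""
    return (f_old - f_new) <= factr * EPS * max(abs(f_old), abs(f_new), 1.0)


def projected_gradient_norm(x, grad, lower, upper):
    """Infinity norm of x - P(x - grad), where P clips onto the bounds."""
    norm = 0.0
    for xi, gi, lo, hi in zip(x, grad, lower, upper):
        stepped = min(max(xi - gi, lo), hi)   # steepest descent step, clipped
        norm = max(norm, abs(xi - stepped))
    return norm


# Hypothetical state: the first parameter sits on its lower bound of zero
# with the objective increasing into the feasible region.
inf = float("inf")
x = [0.0, 1.0]
grad = [5.0, 1.0e-9]
pg = projected_gradient_norm(x, grad, [0.0, 0.0], [inf, inf])
print(pg)
print(relative_change_small(1.0, 1.0 - 1.0e-12, factr=1.0e7))
```

The projection is why a parameter held at a bound, such as A(i) = 0, does not prevent termination: its large raw gradient component points out of the feasible region and is clipped to zero.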
The greatest problem when fitting positive rational functions of order 2:2 or greater is convergence to local minima. Fit a 2:2 to mmfit.tf4 with short then extensive random search and you may get two fits with very different parameter estimates. The problem is most acute when starting estimates are widely differing in size, as sometimes happens as a result of using a random search or L1 overdetermined fit. The problem is then dominated by a sub-set of model parameters, often the extreme ones, leading to false convergence.
The short, medium or extensive random searches used by SIMFIT employ different strategies, and are likely to locate different starting estimates, which may lead to alternative parameters at a variety of solution points. In extreme cases, you may have to experiment to see what happens if you fit from starting values that you set interactively. In all cases of 2:2 or greater, you should run the program several times anyway. With very stubborn problems you should run program QNFIT, then plot WSSQ contours round alternative solution points to see what is happening. Often the default parameters without a random search or L1 overdetermined linear fitting will prove to be the best choice.
For the substrate inhibition type of curve with the final numerator parameter A(n) fixed at zero, the minimum order required is constrained to be at least 2.
If all these options are switched off, the program will simply calculate the fit then display a table of parameter estimates. However, the default options also output goodness of fit criteria and plots of the data with best-fit curves. It is possible to use an explicitly calculated gradient vector rather than a finite difference estimation for the iterations, but this can slow the fitting down and is only ever required when investigating the convergence with higher order models. There is also an option to store the estimated parameters and covariance matrices for retrospective investigation concerning model discrimination. Note that, when fitting models in sequence of increasing order, statistical tests are output for model discrimination. Note also that SIMFIT provides a facility to extract tables from the results log files to import into LaTeX documents or word processing programs.
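This text does not say which statistical tests RFFIT outputs for model discrimination. One common choice for nested weighted least squares models is the extra sum of squares F test, sketched below in Python as an illustration only; the function name and the WSSQ values are hypothetical and are not program output.

```python
def extra_ssq_f(wssq_low, p_low, wssq_high, p_high, n_points):
    """F statistic for comparing a lower order model with a higher order one.

    F = [(WSSQ_low - WSSQ_high) / (p_high - p_low)] / [WSSQ_high / (N - p_high)]
    where p is the number of fitted parameters and N the number of data points.
    """
    if p_high <= p_low:
        raise ValueError("the higher order model must have more parameters")
    ndof_high = n_points - p_high
    if ndof_high <= 0:
        raise ValueError("not enough data points for the higher order model")
    numerator = (wssq_low - wssq_high) / (p_high - p_low)
    return numerator / (wssq_high / ndof_high)


# Hypothetical comparison of a 1:1 fit (2 parameters with A(0) = 0)
# against a 2:2 fit (4 parameters with A(0) = 0) on 44 data points.
F = extra_ssq_f(wssq_low=120.0, p_low=2, wssq_high=40.0, p_high=4, n_points=44)
print(F)  # compare with the F distribution on (2, 40) degrees of freedom
```

A large F relative to the chosen critical value suggests that the extra parameters are justified; with noisy data the reduction in WSSQ is usually too small for this to happen at higher orders.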
Sometimes an initial value parameter A(0) can be estimated if the data cannot be normalised to y = 0 at x = 0.
For most model fitting procedures the parameter A(n) can be varied but, to force y to tend to 0 as x tends to infinity, as with substrate inhibition, it can be constrained to A(n) = 0.
An instructive way to run program RFFIT would be to use the default starting estimates to fit rational functions of order one to four for the test file rffit.tf6. This contains exact data for a 4:4 function with three turning points. You will discover that adding even a small amount of random error will result in a data set from which good parameters cannot be estimated. Details of how to do this to see what RFFIT can do with accurate data and the limitations imposed by noisy data follow.
Import the test file rffit.tf6, which is the default test file when RFFIT is opened, choose the options A(0) = 0 and A(n) = 0, and fit degrees n = 1 to n = 4. Plot the graphs of best-fit curves during this sequence and you will observe that the models with n = 1 and n = 2 cannot fit the data at all, while n = 3 looks acceptable and n = 4 reveals that turning points are required to fit the maxima and minimum. You will see that test file rffit.tf6 is fitted exactly when n = 4 but, if you use program ADDERR to add random error, you will find that this fine structure cannot be observed and a fit of n = 4 cannot be justified on statistical grounds with noisy data.