GCFIT: help and advice


Consult the reference manual for further details and worked examples.
W.G.Bardsley, University of Manchester, U.K.

Note: sv_gcfit only fits growth curve models and not survival models

Program GCFIT can be used to fit models in the following situations.
  1. Simple monotonically increasing nonlinear growth curves (or
    simple monotonically decreasing nonlinear decay curves).
  2. Simple monotonically decreasing nonlinear survival curves
    with either proportions in (0,1) or arbitrary values.
  3. One sample of possibly right-censored survival times.
  4. Two samples of possibly right-censored survival times.
  5. GLM for LD50, LD90, etc.
  6. GLM survival analysis.

Note that program QNFIT can be used to constrain individual parameters (e.g. A = 1 if data are proportions, or A = 100 if data are percentages), to fit more complex nonmonotonic models, to fit single differential equations, or to use best-fit models as standard curves for calibration.

Program DEQSOL should be used to simulate and fit systems of nonlinear differential equations.

Samples and times supplied to program GCFIT must be nonnegative.

When the program starts you will be asked to choose the mode required for your current run, as now summarised.

Summary (Mode 1): Fitting growth or decay curves

In this mode, you supply data for size/weight/number/length, etc., as a function of time. Then the program fits selected growth curve models, giving statistics for judging goodness of fit and helping you decide which is the best model. Use this option for monotonically increasing growth or dose response curves where you wish to estimate the median effective dose (EC50), time to half maximal response (t-half), maximum growth rate, or final asymptotic size. See gcfit.tf2 for the data format required.

If the data are decaying rather than increasing as a function of time, program GCFIT simply reverses the order and fits a growth curve model. In this case it should be noted that the best-fit parameter estimates will refer to the reversed data, except that all output involving time, such as plots or half-times, will be corrected to refer to the original time scale.

Note that this procedure of transforming decreasing decay data into increasing growth data, in order to fit monotonically increasing growth curve models, is not as robust as fitting data sets where the observations are increasing to begin with. In order to understand this procedure you should compare the results from fitting the test file gcfit.tf2, which has growth data, with gcfit.tf7, where the same data have been transformed into decay data, paying particular attention to the graphs plotted, the best-fit parameters, and the estimated half-times.

Summary (Mode 2): Fitting survival curves

In this mode, the data are supplied in the form of proportion surviving as a function of time and survival models are then fitted. This mode is chosen when a homogeneous population is observed, and the fraction surviving is estimated from samples taken as a function of time. Censoring has to be taken into account when calculating the proportion surviving. Use this option with curves declining monotonically from 1 to zero, where you want to determine the median survival time or median effective inhibition concentration (IC50). See weibull.tf1 for the data format required.

It sometimes happens that there is uncertainty as to whether S(0) = 1 is the best choice, because normalising data by an arbitrary constant will introduce bias. It is also possible to use survival models to fit arbitrary decay data. So, in order to analyse data that do not have S(0) = 1, an option can be selected to vary an amplitude factor in order to estimate S(0). This will be identified as parameter S(0) in the output of best-fit parameters.

Summary (Mode 3): Analysing survival times

Data are times to failure, and information about censoring is required. Use this option when you have survival times for one or more groups, but with no covariates, and you wish to perform survival analysis. See survive.tf1 for the data format required.

Summary (Mode 4): Generalized Linear Models for survival analysis

Use this option when you have estimated proportions surviving in groups of known size and wish to determine the median lethal dose (LD50) or other percentiles by probit analysis. See ld50.tf1 for the data format required. Also, survival analysis with covariates can be done with data formatted as in cox.tf1.


Mode 1: Fitting growth and decay curve models

This program fits up to ten growth curve models by weighted least squares, then gives statistics for choosing a best-fit model. It is assumed that the data are normally distributed at each time point with expectations equal to one of the deterministic models evaluated at that time point.
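
As a rough illustration of the weighted least squares criterion, the following Python sketch evaluates the weighted sum of squared residuals for the logistic model (model 3). The data values, the standard deviations, and the function names are hypothetical and are not taken from GCFIT, which also performs the actual minimisation that is not shown here.

```python
import math

def logistic(t, A, B, k):
    """Model 3: S(t) = A/[1 + B*exp(-kt)]."""
    return A / (1.0 + B * math.exp(-k * t))

def wssq(times, sizes, sds, model, params):
    """Weighted sum of squares: sum of [(y - S(t))/s]^2 over all points."""
    return sum(((y - model(t, *params)) / s) ** 2
               for t, y, s in zip(times, sizes, sds))

# Hypothetical data generated exactly from A = 10, B = 9, k = 1
times = [0.0, 1.0, 2.0, 3.0, 4.0]
sizes = [logistic(t, 10.0, 9.0, 1.0) for t in times]
sds = [0.1] * len(times)  # assumed standard deviation at each time point

exact = wssq(times, sizes, sds, logistic, (10.0, 9.0, 1.0))
wrong = wssq(times, sizes, sds, logistic, (10.0, 9.0, 0.5))
print(exact, wrong)
```

The criterion is zero at the parameters that generated the data and larger elsewhere, so a fitting routine would search for the parameters that make it as small as possible.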

Starting estimates are calculated by the program, but direct input can be used with difficult models. The data must be positive and monotonically increasing or decreasing, as the program cannot fit curves with turning points. Also, except for model 1, the data must approach a final horizontal asymptote.

Note that model 1 is only useful for log-phase growth and model 2 cannot fit sigmoid curves. One of models 3, 4 or 5 will often be sufficient or (only if a baseline correction is required) one of the corresponding models 6, 7 or 8. Models 9 and 10 are for more advanced users and can be difficult to fit. For an illustration of how to use growth curve fitting look at Bardsley W.G. et al., J. Anat. (1995) 187, 181-190. For model 9 see Richards F. J., Journal of Experimental Botany (1959) 10, 290-300, while for model 10 see Preece M. A. and Baines M. J., Annals of Human Biology (1978) 1, 1-24.

  1. The exponential model
    dS/dt = kS
    S(t) = A*exp(kt), where A = S(0)
  2. The monomolecular model
    dS/dt = k(A - S)
    S(t) = A[1.0 - B*exp(-kt)], where B = 1 - S(0)/A
  3. The logistic model
    dS/dt = kS(A - S)/A
    S(t) = A/[1.0 + B*exp(-kt)], where B = A/S(0) - 1
  4. The Gompertz model
    dS/dt = kS[log(A) - log(S)]
    S(t) = A*exp[-B*exp(-kt)], where B = log(A/S(0))
  5. The Von Bertalanffy 2/3 model
    dS/dt = eta*S^(2/3) - kappa*S, if A^(1/3) = eta/kappa,
    B = eta/kappa - S(0)^(1/3) and k = kappa/3
    S(t) = [A^(1/3) - B*exp(-kt)]^3
  6. The logistic model with constant term
    f(t) = S(t) - C
    df/dt = dS/dt = kf(t)(A - f(t))/A
    S(t) = A/[1.0 + B*exp(-kt)] + C
  7. The Gompertz model with a constant term
    f(t) = S(t) - C
    df/dt = dS/dt = kf(t)[log(A) - log(f(t))]
    S(t) = A*exp[-B*exp(-kt)] + C
  8. The Von Bertalanffy 2/3 model with a constant term
    f(t) = S(t) - C
    df/dt = dS/dt = eta*f(t)^(2/3) - kappa*f(t)
    S(t) = [A^(1/3) - B*exp(-kt)]^3 + C
  9. The Von Bertalanffy variable m (i.e. Richards) model
    dS/dt = eta*S^m - kappa*S, if A^(1-m) = eta/kappa
    B = eta/kappa - S(0)^(1-m) and k = kappa*(1-m)
    S(t) = [A^(1-m) - B*exp(-kt)]^[1/(1-m)]
    If m < 1 then eta, kappa, A and B are > 0
    If m > 1 then A > 0 but eta, kappa and B are < 0
  10. The first model of Preece and Baines (5 parameters)
    f(t) = exp[k0(t - theta)] + exp[k1(t - theta)]
    S(t) = h1 - 2(h1 - htheta)/f(t)
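
To make the role of the parameters concrete, here is a minimal Python sketch of models 2 to 5, written directly from the formulas listed above. The function names and the numerical values are hypothetical. The half-time expression for the logistic model is derived here from S(t) = A/2 and is only an illustration, not necessarily how GCFIT reports t-half.

```python
import math

def monomolecular(t, A, k, S0):
    B = 1.0 - S0 / A                      # B = 1 - S(0)/A
    return A * (1.0 - B * math.exp(-k * t))

def logistic(t, A, k, S0):
    B = A / S0 - 1.0                      # B = A/S(0) - 1
    return A / (1.0 + B * math.exp(-k * t))

def gompertz(t, A, k, S0):
    B = math.log(A / S0)                  # B = log(A/S(0))
    return A * math.exp(-B * math.exp(-k * t))

def von_bertalanffy(t, A, k, S0):
    B = A ** (1.0 / 3.0) - S0 ** (1.0 / 3.0)
    return (A ** (1.0 / 3.0) - B * math.exp(-k * t)) ** 3

def logistic_half_time(A, k, S0):
    """Solve A/[1 + B*exp(-kt)] = A/2, which gives t = log(B)/k."""
    return math.log(A / S0 - 1.0) / k

# Hypothetical parameter values: final size 100, rate 0.5, initial size 5
A, k, S0 = 100.0, 0.5, 5.0
for model in (monomolecular, logistic, gompertz, von_bertalanffy):
    print(model.__name__, model(0.0, A, k, S0), model(60.0, A, k, S0))
print("logistic half-time:", logistic_half_time(A, k, S0))
```

Each curve starts at S(0) and approaches the final asymptote A, which is why the parameters of these models are easy to interpret.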

Selecting the best-fit growth model

If unrestricted exponential growth is appropriate then only model 1 should be fitted, as the rest have horizontal asymptotes. Usually only models 2 and 3 need to be fitted, because one of these will generally fit and the parameters are easy to interpret. If model 3 is too symmetrical to fit well, then models 4 or 5 can be tried as these are rather more flexible.

Note that the added constants in models 6, 7 and 8 to estimate a non zero baseline should be avoided, unless it is necessary to estimate a background in order to transform the data. If the background is known, or can be estimated independently, it is much better to subtract it from the data and avoid estimating it at the same time as the other growth curve parameters.

Models 9 and 10 are very specialised and will only be required by advanced users who know when these models are appropriate. The program will attempt to locate starting estimates for models 9 and 10, but users will usually want to input their own. If your data have special features that do not seem to be fitted by models 1 to 8, then you should consider fitting your own models using program QNFIT or differential equations using program DEQSOL.


Mode 2: Survival curve models

Survival models can be used for decreasing data where the maximal response may be known so that the data have been normalized to proportions, i.e. fractions of maximum response, with S(0) = 1, if that is possible. Otherwise S(0) can be estimated. Typical test files are weibull.tf1 and gompertz.tf1. It is assumed that only the proportions have been estimated, not the numbers of successes in groups of known size, so that the data are normally distributed at each time point with expectations equal to one of the deterministic models evaluated at that time point. Use model 1 for exponentially decreasing data, or one of models 2, 3, or 4 for sigmoidally decreasing data.

Note that the survivor function is S(t) = 1 - F(t), where F(t) is the cumulative distribution function with pdf f(t) (i.e. f(t) = -dS/dt), and the hazard function is given by h(t) = f(t)/S(t). Plots are provided for S(t), f(t), h(t) and log[h(t)].

Survival data normally consist of proportions, when S(0) = 1, or percentages, when S(0) = 100. Usually the baseline S(0) is estimated from the data to accommodate this but, in special cases, users may wish to normalise their data so that S(0) = 1 and then decide not to treat S(0) as a variable parameter.

  1. The exponential model
    S(t) = exp(-At)
    f(t) = A*S, h(t) = A
  2. The Weibull model
    S(t) = exp[-(At)^B]
    f(t) = AB{(At)^[B-1]}*S, h(t) = AB(At)^[B-1]
  3. The Gompertz model
    S(t) = exp[-(B/A){exp(At) - 1}]
    f(t) = B*exp(At)*S, h(t) = B*exp(At)
  4. The Log-logistic model
    S(t) = 1/[1 + (At)^B]
    f(t) = AB(At)^[B - 1]/[1 + (At)^B]^2
    h(t) = AB(At)^[B - 1]/[1 + (At)^B]

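The relationships between S(t), f(t) and h(t) for the four survival models listed above can be checked numerically. The following Python sketch is only an illustration with hypothetical function names and parameter values. It computes f(t) as h(t)*S(t) and compares it with a finite-difference estimate of -dS/dt. The Weibull median is obtained here by solving S(t) = 1/2.

```python
import math

def exponential(t, A):
    S = math.exp(-A * t)
    h = A
    return S, h * S, h                    # returns S(t), f(t), h(t)

def weibull(t, A, B):
    S = math.exp(-(A * t) ** B)
    h = A * B * (A * t) ** (B - 1.0)
    return S, h * S, h

def gompertz(t, A, B):
    S = math.exp(-(B / A) * (math.exp(A * t) - 1.0))
    h = B * math.exp(A * t)
    return S, h * S, h

def log_logistic(t, A, B):
    S = 1.0 / (1.0 + (A * t) ** B)
    h = A * B * (A * t) ** (B - 1.0) / (1.0 + (A * t) ** B)
    return S, h * S, h

def weibull_median(A, B):
    """Solve exp[-(At)^B] = 1/2, which gives t = (log 2)^(1/B)/A."""
    return math.log(2.0) ** (1.0 / B) / A

def minus_dS_dt(model, t, *params, step=1e-6):
    """Central-difference estimate of -dS/dt, to compare with f(t)."""
    upper = model(t + step, *params)[0]
    lower = model(t - step, *params)[0]
    return -(upper - lower) / (2.0 * step)

# Hypothetical parameter values for each model
cases = ((exponential, (0.7,)), (weibull, (0.5, 2.0)),
         (gompertz, (0.3, 0.2)), (log_logistic, (0.5, 2.0)))
for model, params in cases:
    S, f, h = model(1.3, *params)
    print(model.__name__, S, f, minus_dS_dt(model, 1.3, *params), h)
```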
Mode 3: Analysing censored survival times

Censored survival times are written to file in the form of t, c, f where t is the time for failure or right censoring, c is the survival code (0 for failure, 1 if right censored) and f is the frequency of observation. Times t must be in increasing order and replicates can be used. To see how to format your own survival times data examine the test files provided, e.g. survive.tf1.

Analysing one sample

A Kaplan-Meier estimate of S(t) is constructed, together with its standard deviation, then a Weibull model is fitted using the parameterisation S(t) = exp[-{exp(beta)}t^B]. Parameters and standard errors are also given for the other usual parameterisations, and the survivor function and hazards are plotted. Typical test files are any of the survive.tf? files.
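
The product-limit calculation behind a Kaplan-Meier estimate can be sketched as follows in Python, using records of the form (t, c, f) with c = 0 for failure and c = 1 for right censoring. The function name and the data are hypothetical, the standard deviation and the Weibull fit are omitted, and the sketch assumes the usual convention that observations censored at time t are still at risk at time t.

```python
def kaplan_meier(records):
    """Return [(t, S(t))] at each distinct failure time.

    records: list of (t, c, f), c = 0 for failure, 1 for right censoring,
    f = frequency of the observation.
    """
    at_risk = sum(f for _, _, f in records)
    estimate = 1.0
    steps = []
    for t in sorted({t for t, _, _ in records}):
        failures = sum(f for ti, c, f in records if ti == t and c == 0)
        removed = sum(f for ti, _, f in records if ti == t)
        if failures > 0:
            # Multiply by the conditional probability of surviving time t
            estimate *= 1.0 - failures / at_risk
            steps.append((t, estimate))
        at_risk -= removed  # failures and censored cases leave the risk set
    return steps

# Hypothetical sample: six subjects, two of them right censored
records = [(1, 0, 1), (2, 0, 2), (3, 1, 1), (4, 0, 1), (5, 1, 1)]
steps = kaplan_meier(records)
for t, s in steps:
    print(t, round(s, 4))
```

With these data the estimate drops only at the failure times 1, 2 and 4, while the censored observations at times 3 and 5 just reduce the number at risk.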

Comparing two samples

If comparison of two sets of survival data is required, it is necessary to input two files of censored survival times. The two-sample Mantel-Haenszel (log-rank) test is used to test for significant differences. Other techniques, such as the Cox regression model, can also be used. Typical pairs of test files are survive.tf1 with survive.tf2, or survive.tf3 with survive.tf4.
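
As a rough guide to what a two-sample log-rank test calculates, the following Python sketch uses the standard textbook form of the statistic with records (t, c, f) as described above. The function name and the data are hypothetical, and the exact statistic and tie handling used by GCFIT may differ in detail.

```python
import math

def log_rank(sample1, sample2):
    """Return observed and expected failures in sample 1, the variance,
    the chi-square statistic (1 degree of freedom) and its p value."""
    pooled = sample1 + sample2
    fail_times = sorted({t for t, c, _ in pooled if c == 0})
    observed1 = expected1 = variance = 0.0
    for t in fail_times:
        n1 = sum(f for ti, _, f in sample1 if ti >= t)  # at risk, sample 1
        n2 = sum(f for ti, _, f in sample2 if ti >= t)  # at risk, sample 2
        d1 = sum(f for ti, c, f in sample1 if ti == t and c == 0)
        d2 = sum(f for ti, c, f in sample2 if ti == t and c == 0)
        n, d = n1 + n2, d1 + d2
        observed1 += d1
        expected1 += d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)
    chi_sq = (observed1 - expected1) ** 2 / variance if variance > 0 else 0.0
    p_value = math.erfc(math.sqrt(chi_sq / 2.0))  # upper tail, 1 d.f.
    return observed1, expected1, variance, chi_sq, p_value

# Hypothetical samples: early failures in group 1, late failures in group 2
group1 = [(1, 0, 1), (2, 0, 1)]
group2 = [(3, 0, 1), (4, 0, 1)]
result = log_rank(group1, group2)
same = log_rank(group1, group1)
print(result)
print(same)
```

When the two samples are identical, the observed and expected numbers of failures agree and the statistic is zero.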


Mode 4: ED50 and LD50 by Generalized Linear Models

Percentiles for numbers surviving in treatment groups can be estimated by logistic, probit or log-log regression. Typical test files are ld50.tf1 and ld50.tf2, which illustrate the alternative data formats. It is assumed that the data are binomially distributed with expectations equal to one of the deterministic models. For survival analysis with covariates, generalized linear models and Cox regression can be used, and the required file format will be evident from browsing cox.tf1 and the other GLM test files.
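
Once a logistic regression has been fitted, percentiles such as LD50 or LD90 follow by inverting the fitted curve. The Python sketch below assumes a linear predictor of the form intercept + slope*x with a logistic link, where x is the dose (or log dose). The coefficient values and function names are hypothetical, and the GLM fitting itself is not shown.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def fitted_proportion(x, intercept, slope):
    """Logistic link: proportion responding at dose x."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))

def dose_percentile(p, intercept, slope):
    """Dose at which the fitted proportion equals p.

    Solves logit(p) = intercept + slope*x for x.
    """
    return (logit(p) - intercept) / slope

# Hypothetical fitted coefficients
intercept, slope = -4.0, 2.0
ld50 = dose_percentile(0.5, intercept, slope)  # equals -intercept/slope
ld90 = dose_percentile(0.9, intercept, slope)
print(ld50, ld90)
```

Note that LD50 reduces to -intercept/slope because logit(0.5) = 0, whereas other percentiles also depend on the chosen link function.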