NORMAL: help and advice

Consult the reference manual for further details and worked examples.
W.G.Bardsley, University of Manchester, U.K.
The normal distribution

If X is a normal variable with parameters mu and sigma, i.e. X is distributed N(mu, sigma^2), then the pdf(x) is given by

     pdf(x) = exp{-0.5[(x - mu)/sigma]^2}/[sigma*sqrt(2*pi)]
where mu is the mean of the distribution, sigma^2 is the variance and sigma is the standard deviation.
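
As an illustration only, the pdf formula above can be written directly in Python. The function name normal_pdf and its default arguments are assumptions made for this sketch and are not part of the program.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """pdf(x) = exp{-0.5[(x - mu)/sigma]^2}/[sigma*sqrt(2*pi)] (illustrative sketch)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma  # standardised value
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Density at the mean of a unit normal distribution
print(normal_pdf(0.0))
```

The density at x = mu equals 1/[sigma*sqrt(2*pi)], so doubling sigma halves the peak height.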

The cumulative distribution function cdf(x) is the integral of pdf(u) from u = minus infinity to u = x.
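
The integral has no closed form, but it can be expressed through the error function as cdf(x) = 0.5*{1 + erf[(x - mu)/(sigma*sqrt(2))]}. The following is a minimal sketch using the Python standard library; the names normal_cdf and upper_tail are assumptions made for illustration and say nothing about how the program itself evaluates the integral.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Lower tail probability P(X <= x) for X distributed N(mu, sigma^2)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def upper_tail(x, mu=0.0, sigma=1.0):
    """Upper tail probability P(X >= x) = 1 - cdf(x)."""
    return 1.0 - normal_cdf(x, mu, sigma)

print(normal_cdf(0.0), upper_tail(0.0))
```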

Upper and lower tail probabilities are defined for a significance level alpha (0 <= alpha <= 1) and some x-value (say x = x-critical) by

     lower tail probability = P(X <= x-critical) = 1 - alpha,
     upper tail probability = P(X >= x-critical) = alpha,
where P(E) is the probability of event E in the sample space. Sometimes the percentage points 100(1 - alpha)% or 100*alpha% are preferred.

Option 1
You input the parameters mu and sigma that are used by all the subsequent options; the variance is set equal to sigma squared.
Select option 1 again whenever you want to change mu and sigma.

Option 2
You input x-values and the program calculates probability density functions, i.e. pdf(x) values.

Option 3
You input x-values and the program calculates cumulative distribution functions, i.e. cdf(x) values.

Option 4
You input significance levels alpha and obtain x-critical values such that P(X >= x-critical) = alpha, i.e. inverses of the normal distribution.
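
For readers who want to check such values independently, the inverse can be sketched with the NormalDist class from the Python standard library (Python 3.8 or later). The function name x_critical is an assumption made for this example; the inverse is evaluated at 1 - alpha because alpha is an upper tail probability.

```python
from statistics import NormalDist

def x_critical(alpha, mu=0.0, sigma=1.0):
    """Return x such that P(X >= x) = alpha for X distributed N(mu, sigma^2)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # inv_cdf works with lower tail probabilities, hence 1 - alpha
    return NormalDist(mu, sigma).inv_cdf(1.0 - alpha)

# Upper tail 5% point, and the 2.5% point used for two-tailed 95% limits
print(x_critical(0.05), x_critical(0.025))
```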

Option 5
You input measured values V1, V2, ..., VM and the program calculates transforms W1, W2, ..., WM, where Wi = cdf(Vi) under the null hypothesis that V is N(mu, sigma^2). Kolmogorov-Smirnov and chi-square tests are performed on the transforms to test the hypothesis that W is uniformly distributed on the interval (0,1), which is equivalent to the null hypothesis.
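
The transform and the Kolmogorov-Smirnov statistic can be sketched as follows. This is an illustration under stated assumptions, not the program's own code: the function name ks_statistic_uniform is hypothetical, and only the statistic D (the largest distance between the empirical distribution of the Wi and the uniform cdf) is computed, not its significance level.

```python
from statistics import NormalDist

def ks_statistic_uniform(v, mu, sigma):
    """Return the sorted transforms Wi = cdf(Vi) and the two-sided KS statistic D."""
    dist = NormalDist(mu, sigma)
    w = sorted(dist.cdf(vi) for vi in v)
    m = len(w)
    # Largest gaps above and below the uniform cdf F(w) = w
    d_plus = max((i + 1) / m - wi for i, wi in enumerate(w))
    d_minus = max(wi - i / m for i, wi in enumerate(w))
    return w, max(d_plus, d_minus)

# Made-up values, tested against N(0, 1)
w, d = ks_statistic_uniform([-1.2, -0.3, 0.1, 0.8, 1.5], 0.0, 1.0)
print(w, d)
```

A small D means the transforms are spread evenly over (0,1), which is what the null hypothesis predicts.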

If you have some numbers V1, V2, ..., VM and you want to see if it is reasonable to regard these as coming from a normal distribution, e.g. before doing a t test, then use option 5 and check the normal scores plot for nonlinearities.

There are two different ways to use option 5.

  1. You have estimates for mu and sigma that are independent of the sample estimates. Input these independent values using option 1 and do not use sample estimates.
  2. You decide to use sample estimates for mu and sigma. Then the current option 1 values will be temporarily overridden. The Shapiro-Wilk test automatically does this.
Do not use method 2 and then switch to method 1 with the sample estimates entered through option 1: this is dishonest, because it presents estimated parameters as independent values and so gives the wrong chi-square degrees of freedom. Option 5 produces estimates of the sample mean and variance with 95% confidence limits that are only correct if X is N(mu, sigma^2). Adjust bin sizes in the chi-square test until expected values are at least 5 per bin if possible.
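
The effect of estimated parameters on the chi-square test can be sketched as follows, assuming equal-width bins on (0,1) and the usual rule that the degrees of freedom are the number of bins, minus 1, minus the number of parameters estimated from the sample. The function name chi_square_uniform and its arguments are hypothetical, and the binning actually used by the program may differ.

```python
def chi_square_uniform(w, bins, n_estimated=0):
    """Chi-square statistic for transforms w in (0,1) using equal-width bins.

    n_estimated is the number of parameters estimated from the sample
    (2 if both mu and sigma are sample estimates, 0 if both are independent).
    """
    m = len(w)
    df = bins - 1 - n_estimated
    if df < 1:
        raise ValueError("too few bins for the number of estimated parameters")
    expected = m / bins  # same expected count in every bin
    observed = [0] * bins
    for wi in w:
        observed[min(int(wi * bins), bins - 1)] += 1
    stat = sum((o - expected) ** 2 / expected for o in observed)
    return stat, df, expected

# Ten evenly spread transforms in two bins: expected count 5 per bin
print(chi_square_uniform([(i + 0.5) / 10 for i in range(10)], bins=2))
```

With the same data and bins, passing n_estimated=2 lowers the degrees of freedom by 2, which is why sample estimates must not be entered as if they were independent values.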

Option 6
You input a value for sigma, then explore power and sample size.
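
This help text does not say which test option 6 assumes, so the following is only a sketch of one common case: a two-sided, one-sample z test with known sigma, where delta is the difference in means to be detected. The function names z_test_power and z_test_sample_size, and the default alpha and power values, are assumptions made for illustration.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # unit normal distribution

def z_test_power(n, sigma, delta, alpha=0.05):
    """Power of a two-sided one-sample z test to detect a shift delta."""
    z_crit = _STD.inv_cdf(1.0 - alpha / 2.0)
    shift = delta * math.sqrt(n) / sigma
    return _STD.cdf(-z_crit + shift) + _STD.cdf(-z_crit - shift)

def z_test_sample_size(sigma, delta, alpha=0.05, power=0.8):
    """Approximate sample size needed to reach the requested power."""
    z_alpha = _STD.inv_cdf(1.0 - alpha / 2.0)
    z_beta = _STD.inv_cdf(power)
    return math.ceil(((z_alpha + z_beta) * sigma / delta) ** 2)

n = z_test_sample_size(sigma=1.0, delta=0.5)
print(n, z_test_power(n, sigma=1.0, delta=0.5))
```

The sample size grows with sigma squared, which is why a value for sigma must be supplied before power and sample size can be explored.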

Advice

Always check you are using the correct mu and sigma values. Use options 1, 2, 3 and 4 just as you would look up tables except that you do not need to transform x-values to unit normal values, i.e.

  Z = (X - mu)/sigma.
For an upper tail 5% point use option 4 with 100*alpha% = 5, but for two-tailed values for 95% confidence limits set 100*alpha% = 2.5. If unsure about option 5, use sample estimates for mu and sigma, but consider using tables for more accurate significance levels.

You can input x-values for option 5 from the console, but this is not recommended since it is error prone and all input is lost after the program has been run. A data file, by contrast, is a permanent store for the data that can be run repeatedly through the same, or different, programs and can easily be edited to add, delete or change values. Files (like the test file, normal.tf1) must have a one-line title, a header (n 1 for an n by 1 matrix), then a data column. Use program MAKMAT to prepare data files and program EDITMT for editing such files.
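
The file layout just described can be read with a few lines of Python. This is a sketch based only on the description given here (title line, header with n and 1, then n data values); the function name read_column is hypothetical, the sample data are invented, and anything that follows the data column in a real file is simply ignored.

```python
def read_column(lines):
    """Parse a title, an 'n 1' header and n data values from an iterable of lines."""
    rows = [line.strip() for line in lines]
    title = rows[0]
    body = [r for r in rows[1:] if r]  # skip blank lines after the title
    n_rows, n_cols = (int(tok) for tok in body[0].split()[:2])
    if n_cols != 1:
        raise ValueError("expected an n by 1 matrix")
    data = [float(r.split()[0]) for r in body[1:1 + n_rows]]
    if len(data) != n_rows:
        raise ValueError("fewer data values than the header promises")
    return title, data

# Invented example in the described layout
text = "Test data\n3 1\n1.5\n2.5\n3.5\n"
print(read_column(text.splitlines()))
```

To read from disk, pass an open file object instead, for example read_column(open(path)) with path set to the data file.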