If Y is a normal random variable with mean mu and standard deviation sigma, then the transformed random variable
  Z = (Y - mu)/sigma
is unit normal, i.e. Z is Normal(0,1). The sum of squares of N independent Z values of this type,
  X = Z1^2 + Z2^2 + ... + ZN^2,
is chi-square distributed with N degrees of freedom, i.e. X is distributed chi^2(N).
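The construction above can be checked by simulation. The following is only a minimal sketch, not part of this program: the function name, the seed, the sample size and the values mu = 3 and sigma = 2 are all arbitrary assumptions. It standardizes normal draws, sums N squares, and compares the sample mean and variance with the known chi^2(N) values N and 2N.

```python
import random

def sum_of_squared_z(n, rng, mu, sigma):
    """Draw n independent Normal(mu, sigma) values, standardize each, sum the squares."""
    total = 0.0
    for _ in range(n):
        y = rng.gauss(mu, sigma)
        z = (y - mu) / sigma      # Z is Normal(0,1)
        total += z * z
    return total

rng = random.Random(12345)        # fixed seed (arbitrary) so the run is repeatable
N = 10                            # degrees of freedom
draws = [sum_of_squared_z(N, rng, mu=3.0, sigma=2.0) for _ in range(20000)]

mean = sum(draws) / len(draws)
variance = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
print(f"sample mean     = {mean:.3f} (theory: {N})")
print(f"sample variance = {variance:.3f} (theory: {2 * N})")
```

The sample values should lie close to N and 2N but will not match them exactly, since they are estimates from a finite sample.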
Upper and lower tail probabilities are defined for an alpha (0 <= alpha <= 1) and some x-value (say x = x-critical) by
  lower tail probability = P(X <= x-critical) = 1 - alpha
  upper tail probability = P(X >= x-critical) = alpha
where P(E) = probability of event E in the sample space. Sometimes percentage points 100(1 - alpha)% or 100*alpha% are preferred.
You input N for all of the subsequent calculations except
for chi-square and Fisher exact tests (Options 6 and 7).
Option 1 is selected to change N.
You input x-values and the program calculates probability
density functions, i.e. pdf(x) values.
You input x-values and the program calculates cumulative
distribution functions, i.e. cdf(x) values.
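For readers who want to reproduce pdf(x) and cdf(x) values by hand, here is a minimal sketch using only the Python standard library. It is not the algorithm used by this program: the names chi2_pdf and chi2_cdf are hypothetical, and the cdf uses a plain power series for the regularized incomplete gamma function, which is adequate for moderate x but is not tuned for extreme arguments.

```python
import math

def chi2_pdf(x, n):
    """Density of chi^2(n) at x > 0.

    x <= 0 is treated as outside the support here (the boundary value
    at x = 0 depends on n and is not handled in this sketch).
    """
    if x <= 0.0:
        return 0.0
    a = 0.5 * n
    return math.exp((a - 1.0) * math.log(x) - 0.5 * x
                    - a * math.log(2.0) - math.lgamma(a))

def chi2_cdf(x, n):
    """P(X <= x) for X distributed chi^2(n), by series expansion."""
    if x <= 0.0:
        return 0.0
    a, t = 0.5 * n, 0.5 * x
    term = total = 1.0
    for k in range(1, 10000):
        term *= t / (a + k)
        total += term
        if term < 1e-16 * total:
            break
    value = total * math.exp(a * math.log(t) - t - math.lgamma(a + 1.0))
    return min(1.0, value)        # guard against rounding just above 1

N = 10
for x in (5.0, 10.0, 18.307):
    lower = chi2_cdf(x, N)
    print(f"x = {x:7.3f}  pdf = {chi2_pdf(x, N):.6f}  "
          f"lower tail = {lower:.6f}  upper tail = {1.0 - lower:.6f}")
```

The upper tail is obtained as 1 - cdf(x), which follows from the definitions of the tail probabilities but loses accuracy when the upper tail is extremely small.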
You input significance levels alpha and obtain x-critical
values such that P(X >= x-critical) = alpha, i.e. inverses
of the chi-square distribution.
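The inverse calculation can be sketched as a root search on the upper tail probability. This is an illustration under stated assumptions rather than the method used by this program: the names chi2_cdf and chi2_critical are hypothetical, the cdf is a simple series suitable for moderate x, and bisection is chosen only because it is easy to verify.

```python
import math

def chi2_cdf(x, n):
    """P(X <= x) for X distributed chi^2(n), by series expansion (moderate x only)."""
    if x <= 0.0:
        return 0.0
    a, t = 0.5 * n, 0.5 * x
    term = total = 1.0
    for k in range(1, 10000):
        term *= t / (a + k)
        total += term
        if term < 1e-16 * total:
            break
    return min(1.0, total * math.exp(a * math.log(t) - t - math.lgamma(a + 1.0)))

def chi2_critical(alpha, n, tol=1e-12):
    """Return x-critical such that P(X >= x-critical) = alpha, for 0 < alpha < 1."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    lo, hi = 0.0, max(float(n), 1.0)
    while 1.0 - chi2_cdf(hi, n) > alpha:   # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if 1.0 - chi2_cdf(mid, n) > alpha:
            lo = mid                       # upper tail still too large: move right
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}  x-critical (N = 10) = {chi2_critical(alpha, 10):.4f}")
```

For N = 2 the result can be checked against the closed form x-critical = -2*log(alpha), since the upper tail of chi^2(2) is exp(-x/2).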
You input measured values V1, V2, ..., VM and the program
calculates transforms W1, W2, ... , WM where Wi = cdf(Vi)
under the null hypothesis that V is distributed chi^2(N).
Kolmogorov-Smirnov and chi-square tests are performed on
the transforms to test the hypothesis that W is uniformly
distributed on the interval (0,1), which is equivalent to
the null hypothesis.
Use MAKMAT/EDITMT to prepare/edit this V data vector.
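The transform and the Kolmogorov-Smirnov part of this procedure can be sketched as follows. This is a minimal illustration, not this program's implementation: the V values are simulated here rather than read from a file, the function names are hypothetical, and the comparison uses the large-sample approximation 1.36/sqrt(M) for the 5% critical value of the Kolmogorov-Smirnov statistic instead of an exact p-value.

```python
import math
import random

def chi2_cdf(x, n):
    """P(X <= x) for X distributed chi^2(n), by series expansion (moderate x only)."""
    if x <= 0.0:
        return 0.0
    a, t = 0.5 * n, 0.5 * x
    term = total = 1.0
    for k in range(1, 10000):
        term *= t / (a + k)
        total += term
        if term < 1e-16 * total:
            break
    return min(1.0, total * math.exp(a * math.log(t) - t - math.lgamma(a + 1.0)))

def ks_uniform_statistic(w):
    """Largest distance D between the empirical cdf of w and the Uniform(0,1) cdf."""
    w = sorted(w)
    m = len(w)
    return max(max((i + 1) / m - wi, wi - i / m) for i, wi in enumerate(w))

rng = random.Random(2024)          # fixed seed (arbitrary) for repeatability
N, M = 10, 500                     # degrees of freedom and sample size (assumed)
v = [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(N)) for _ in range(M)]  # simulated V
w = [chi2_cdf(vi, N) for vi in v]  # Wi = cdf(Vi), uniform on (0,1) under the null
d = ks_uniform_statistic(w)
print(f"D = {d:.4f}, approximate 5% critical value = {1.36 / math.sqrt(M):.4f}")
```

A value of D above the approximate critical value would be evidence against the hypothesis that V is distributed chi^2(N).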
Suppose you have observed M frequencies O1, O2, ..., OM and,
for the same M partitions (bins), you have calculated the
expected frequencies E1, E2, ..., EM. Then S defined by
  S = (O1 - E1)^2/E1 + (O2 - E2)^2/E2 + ... + (OM - EM)^2/EM
is a measure of goodness of fit between the O and E values. In fact, if k parameters are estimated from the O values and used to calculate the E values, then S is approximately chi-square distributed with M - k - 1 degrees of freedom.
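The statistic S and its degrees of freedom are simple to compute directly. The sketch below is an illustration only: the function name is hypothetical and the observed and expected frequencies are made-up numbers chosen for easy arithmetic, not data from any test file.

```python
def goodness_of_fit(observed, expected, k=0):
    """Return (S, df) with S = sum((Oi - Ei)^2 / Ei) and df = M - k - 1.

    k is the number of parameters estimated from the observed values.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length M")
    s = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - k - 1
    return s, df

# Made-up frequencies for M = 5 bins, with no parameters estimated (k = 0).
observed = [18, 22, 20, 25, 15]
expected = [20.0, 20.0, 20.0, 20.0, 20.0]

S, df = goodness_of_fit(observed, expected, k=0)
print(f"S = {S:.3f} on {df} degrees of freedom")
```

Here S = (4 + 4 + 0 + 25 + 25)/20 = 2.9 with 4 degrees of freedom. S would then be compared with the upper tail of chi^2(4).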
A contingency table is an M x N matrix of frequencies F(i,j)
which can be tested for association using a chi-square test
with (M - 1)(N - 1) degrees of freedom.
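The chi-square statistic for a contingency table can be sketched with the usual Pearson form, in which the expected frequency for cell (i,j) is (row sum i)(column sum j)/(grand total). This is an assumption about the standard test, not a description of this program's exact procedure. The function name is hypothetical, the table is made up, and the sketch assumes every row and column sum is positive.

```python
def contingency_chi_square(f):
    """Return (statistic, df) for an M x N table of frequencies f[i][j].

    Assumes all row and column sums are positive, so that every
    expected frequency is non-zero.
    """
    row_sums = [sum(row) for row in f]
    col_sums = [sum(col) for col in zip(*f)]
    total = sum(row_sums)
    statistic = 0.0
    for i, row in enumerate(f):
        for j, fij in enumerate(row):
            e = row_sums[i] * col_sums[j] / total   # expected under no association
            statistic += (fij - e) ** 2 / e
    df = (len(row_sums) - 1) * (len(col_sums) - 1)
    return statistic, df

# Made-up 2 x 2 table: every expected frequency is 15.
table = [[10, 20],
         [20, 10]]
stat, df = contingency_chi_square(table)
print(f"chi-square = {stat:.4f} on {df} degree(s) of freedom")
```

For this table each cell contributes 25/15, so the statistic is 20/3 with (2 - 1)(2 - 1) = 1 degree of freedom.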
The input data frequency matrix should be prepared in a file
with a title, then a header with the no. of rows and columns
followed by the matrix of frequencies. Programs MAKMAT/EDITMT
should be used to prepare/edit such files.
This program first produces a reduced matrix (by discarding
rows or columns with zero sums). Then it attempts to shrink
further until all expected frequencies are >= 1 if possible.
There are then just two cases.
chisqd.tf1
Use option 1 to set the number of degrees of freedom equal to
10, then read these pseudo-random numbers into option 5 and
see if they are consistent with a chi-square distribution.
chisqd.tf2 and chisqd.tf3
These are columns of observed and expected values that can be
used to see how option 6 tests whether a set of observed
values is consistent with a corresponding expected set.
chisqd.tf4
This is an example of a data set that can be used with option
7 to perform a contingency table analysis by the chi-square
and Fisher exact procedures.
chisqd.tf5
Another contingency table for option 7 but now there are too
many elements for a Fisher exact test.