If there are x successes in N (independent) Bernoulli trials,
each with probability of success equal to p, the random
variable X is said to
be distributed as a binomial random variable with parameters
N and p, i.e. X is distributed b(N,p).
Clearly N >= 1, 0 <= p <= 1 and 0 <= X <= N.
If the probability mass function is pmf(x), then the cumulative
distribution function cdf(x) is the sum of pmf(u) for
u = 0 to u = x.
Upper and lower tail probabilities are defined for an alpha
(0 <= alpha <= 1) and some X-value (say X = x-critical) by
lower tail probability = P(X <= x-critical) = 1 - alpha
upper tail probability = P(X > x-critical) = alpha,
where P(E) = probability of event E in the sample space.
Note that, since X is an integer, the cdf is a step function.
Sometimes percentage points 100(1 - alpha)% or 100alpha% are
preferred.
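The definitions above can be sketched in Python using only the
standard library (the function and variable names here are
illustrative, not part of the program):

```python
from math import comb

def binom_pmf(x, N, p):
    """Binomial probability mass function pmf(x) for X ~ b(N, p)."""
    return comb(N, x) * p**x * (1 - p)**(N - x)

def binom_cdf(x, N, p):
    """Cumulative distribution function: sum of pmf(u) for u = 0..x."""
    return sum(binom_pmf(u, N, p) for u in range(x + 1))

# Tail probabilities for a critical value x_crit:
N, p, x_crit = 10, 0.5, 7
lower = binom_cdf(x_crit, N, p)   # P(X <= x_crit) = 1 - alpha
upper = 1.0 - lower               # P(X >  x_crit) = alpha
```

Because X is discrete, binom_cdf jumps at each integer, which is
the step-function behaviour mentioned above.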
This is selected to initialise or change N, p, or lambda.
You input N and p for all subsequent calculations except
options 5, 6, 7 and 8 where N and x are input directly.
You input x-values and the program calculates probability
mass functions, i.e. pmf(x) values.
You input x-values and the program calculates cumulative
distribution functions, i.e. cdf(x) values.
You input significance levels alpha and obtain A,B-values
where P(X > A) >= alpha, P(X > B) <= alpha (if possible),
so that A and B define inverses of the binomial distribution.
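Because the tail probability P(X > x) is nonincreasing in x, the
A,B-values can be found by a direct search; a sketch in pure
Python (illustrative names, not the program's algorithm):

```python
from math import comb

def binom_cdf(x, N, p):
    # cdf(x) = sum of pmf(u) for u = 0..x
    return sum(comb(N, u) * p**u * (1 - p)**(N - u) for u in range(x + 1))

def critical_values(N, p, alpha):
    """Return A, B with P(X > A) >= alpha and P(X > B) <= alpha."""
    tails = [1.0 - binom_cdf(x, N, p) for x in range(N + 1)]  # P(X > x)
    A = max((x for x in range(N + 1) if tails[x] >= alpha), default=None)
    B = min(x for x in range(N + 1) if tails[x] <= alpha)
    return A, B
```

For example, with N = 10, p = 0.5 and alpha = 0.05 this search
gives A = 7 and B = 8.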
The binomial coefficient N-choose-x, or NCX(x), is defined as
NCX(x) = N!/[x!(N - x)!] for 0 <= x <= N,
where N! = 1*2*3*...*N and NCX(0) = NCX(N) = 1, so that
pmf(x) = NCX(x)*[p^x]*[(1 - p)^(N - x)].
This program calculates NCX(x) and the sum of NCX(k) for
0 <= k <= x and any x <= N.
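In Python, math.comb gives the same coefficients directly; the
values in the comments below are easy to verify by hand:

```python
from math import comb

N, x = 10, 4
ncx = comb(N, x)                              # N!/[x!(N - x)!] = 210
cum = sum(comb(N, k) for k in range(x + 1))   # NCX(0) + ... + NCX(4) = 386
assert comb(N, 0) == comb(N, N) == 1          # NCX(0) = NCX(N) = 1
```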
This is for when you have x events (successes) in N trials.
You supply pairs of N and x values and the program will then
estimate p with an exact non-central 95% confidence range.
Alternatively you can select 90% or 99% ranges.
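One standard way to compute exact limits of this kind is the
Clopper-Pearson construction via the beta distribution; this is
an illustrative sketch, and its limits may differ slightly from
the non-central intervals the program reports:

```python
from scipy.stats import beta

def clopper_pearson(x, N, conf=0.95):
    """Exact (Clopper-Pearson) confidence limits for p
    given x successes in N trials."""
    a = 1.0 - conf
    lo = beta.ppf(a / 2, x, N - x + 1) if x > 0 else 0.0
    hi = beta.ppf(1 - a / 2, x + 1, N - x) if x < N else 1.0
    return lo, hi
```

For example, clopper_pearson(5, 10) gives limits of roughly
0.19 and 0.81, and passing conf=0.90 or conf=0.99 gives the
narrower or wider ranges mentioned above.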
You input measured values x1, x2, ..., xM and the program
calculates a sample estimate for p (i.e. p-hat) and performs a
chi-square test according to the null hypothesis
H0: X is a binomial random variable with parameters N and p
(or else N and p-hat).
You can decide how many partitions or bins are used in the
chi-square test by controlling the minimum number for observed
and expected bin sizes. Use MAKMAT/EDITMT to prepare/edit the
X input vector.
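The goodness-of-fit idea can be sketched as follows, pooling
adjacent bins so every expected count reaches a minimum size
(the threshold of 5 is a common rule of thumb here, not the
program's setting, and the data are simulated):

```python
import random
from math import comb
from scipy.stats import chisquare

def binom_pmf(x, N, p):
    return comb(N, x) * p**x * (1 - p)**(N - x)

random.seed(1)
N, p, M = 10, 0.5, 50
# Simulate M binomial observations x1, ..., xM
sample = [sum(random.random() < p for _ in range(N)) for _ in range(M)]

observed = [sample.count(x) for x in range(N + 1)]
expected = [M * binom_pmf(x, N, p) for x in range(N + 1)]

# Pool adjacent bins until each expected count is at least 5
obs_b, exp_b, o_acc, e_acc = [], [], 0.0, 0.0
for o, e in zip(observed, expected):
    o_acc, e_acc = o_acc + o, e_acc + e
    if e_acc >= 5.0:
        obs_b.append(o_acc); exp_b.append(e_acc)
        o_acc = e_acc = 0.0
obs_b[-1] += o_acc; exp_b[-1] += e_acc  # sweep any remainder into the last bin

stat, pval = chisquare(obs_b, f_exp=exp_b)  # test H0: X ~ b(N, p)
```

A large p-value means the sample is consistent with b(N, p).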
This arises when there are many estimates of p(i) = x(i)/N(i)
and it is necessary to test if all the p(i) are the same.
Special procedures are possible for sets of 2 by 2 contingency
tables.
Analysis may be inaccurate if x(i) = 0 or x(i) = N(i), so you
may wish to pool to remove such singular cases before using option 8.
When you input such a set of x(i), N(i) values, the overall p
and confidence limits are calculated and likelihood ratio and
chi-square tests are performed to test H0: p(i) are identical.
Relative risks, odds, log odds ratios, etc. can be plotted.
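The chi-square homogeneity test has a simple closed form based
on the pooled estimate p-hat; a sketch (scipy supplies the
chi-square tail probability):

```python
from scipy.stats import chi2

def homogeneity_chisq(xs, ns):
    """Chi-square test of H0: all p(i) = x(i)/N(i) are identical."""
    p_hat = sum(xs) / sum(ns)   # pooled (overall) estimate of p
    stat = sum((x - n * p_hat) ** 2 / (n * p_hat * (1 - p_hat))
               for x, n in zip(xs, ns))
    df = len(xs) - 1
    return stat, chi2.sf(stat, df)  # statistic and p-value
```

Identical observed proportions give a statistic of zero
(p-value 1), while very different proportions give a small
p-value, rejecting H0.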
You may wish to explore whether p varies systematically, as a
function of some control variable t (e.g. space, time, etc.).
To do this, you add a third column of t values to your sample
and the program will plot p(t) with assorted additional lines
to test for significant trends.
If the t variable is arbitrary (e.g. for spacing) the program
will generate successive integer values from a starting value.
You can also create a curve
fit file, then explore parametric models for p(t) by nonlinear
regression or generalised interactive modelling.
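As a crude sketch of such a trend check (an unweighted linear
fit to the observed proportions, with made-up data, and not the
program's method), the slope p-value from scipy.stats.linregress
can flag an obvious trend:

```python
from scipy.stats import linregress

# Hypothetical x(i), N(i), t(i) triples
t = [1, 2, 3, 4, 5]
x = [2, 4, 5, 7, 9]
n = [10, 10, 10, 10, 10]

p_obs = [xi / ni for xi, ni in zip(x, n)]  # p(i) = x(i)/N(i)
fit = linregress(t, p_obs)                 # fit.pvalue tests slope = 0
```

Here the proportions rise steadily with t, so the fitted slope
is positive and its p-value small.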
This arises when there are many estimates of x, y and z where
x = the number of X-type outcomes (e.g. male hatchlings)
y = the number of Y-type outcomes (e.g. female hatchlings)
z = the number of Z-type outcomes (e.g. infertile eggs), and
N = x + y + z is the total no. of observations (e.g. eggs).
There are of course three probability estimates, namely
px-hat = x/N, py-hat = y/N, and pz-hat = z/N, but only two are
independent since
px-hat + py-hat + pz-hat = 1.
You input data in the form of rows of x, y, N (!not x, y, z!)
and the program does a chi-square test and plots selected x, y
parameter-pair confidence regions at set % significance levels
by contouring the X-transpose-A-X chi-square function (not the
approximate ellipse). Disjoint regions indicate significantly
different parameter-pair estimates. Note that some overlap is
still consistent with statistically significant differences.
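A sketch of the kind of quadratic form being contoured, built
from the estimated covariance of (px-hat, py-hat); this is an
approximation for illustration, and the program's exact contours
may differ:

```python
import numpy as np
from scipy.stats import chi2

def trinomial_quadform(px, py, x, y, N):
    """Chi-square quadratic form d' V^(-1) d for trial values (px, py)."""
    px_hat, py_hat = x / N, y / N
    # Estimated covariance matrix of (px_hat, py_hat)
    V = np.array([[px_hat * (1 - px_hat), -px_hat * py_hat],
                  [-px_hat * py_hat,       py_hat * (1 - py_hat)]]) / N
    d = np.array([px - px_hat, py - py_hat])
    return float(d @ np.linalg.solve(V, d))

# (px, py) lies inside the 95% confidence region when the form is
# below the chi-square quantile with 2 degrees of freedom (about 5.99)
level = chi2.ppf(0.95, 2)
```

The form is zero at the estimate itself and grows as (px, py)
moves away, so the contour at `level` encloses the 95% region.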
This allows you to explore the effect of sample size on the
precision with which binomial parameters can be estimated, or
the sample size needed to differentiate between proportions.
If for some reason you do not want the default 95% limits, you
can select 90% or 99% limits.
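The effect of sample size can be seen by computing how a 95%
interval narrows as N grows with p-hat fixed at 0.5; this sketch
uses the Clopper-Pearson form as one common choice of exact
limits:

```python
from scipy.stats import beta

def cp_limits(x, N, conf=0.95):
    """Clopper-Pearson limits for p given x successes in N trials."""
    a = 1.0 - conf
    lo = beta.ppf(a / 2, x, N - x + 1) if x > 0 else 0.0
    hi = beta.ppf(1 - a / 2, x + 1, N - x) if x < N else 1.0
    return lo, hi

widths = []
for N in (10, 40, 160):
    lo, hi = cp_limits(N // 2, N)  # p-hat = 0.5 in every case
    widths.append(hi - lo)         # interval narrows as N grows
```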
When N is large and p small, the binomial distribution can be
approximated by a Poisson distribution with lambda = Np and
pmf(x) = [(lambda)^x]exp(-lambda)/x!,
which is used for the analysis of counting data, e.g. the
number of cells in apoptosis in a microscope field. You can
input x = no. observed, then estimate lambda and confidence
limits using a chi-square variable, and there are other
Poisson options.
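The quality of the approximation is easy to check numerically
with the standard library (the values of N and p below are
illustrative):

```python
from math import comb, exp, factorial

def binom_pmf(x, N, p):
    return comb(N, x) * p**x * (1 - p)**(N - x)

def poisson_pmf(x, lam):
    return lam**x * exp(-lam) / factorial(x)

# Large N, small p: b(N, p) is close to Poisson with lambda = N*p
N, p = 1000, 0.003
lam = N * p
diff = max(abs(binom_pmf(x, N, p) - poisson_pmf(x, lam)) for x in range(20))
```

Here the largest pointwise difference between the two pmfs is
well under one percent.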
binomial.tf1
50 random numbers from a binomial distribution with N = 10
and p = 0.5. Use option 1 to set N and p then read in this
file to test if the numbers are consistent with b(N,p).
binomial.tf2
A set of x, N values to use for analysis of proportions.
binomial.tf3
A set of x, N, t values to use for analysis of proportions
with an indexing parameter (e.g. variation of proportions
with time or some other treatment).
meta.tf1
A set of 2 by 2 contingency tables for meta analysis.
trinom.tf1
Data for x, y, z illustrating the effect of sample size on the
confidence limits for determining trinomial parameters.
trinom.tf2
Data for x, y, z illustrating the technique for identifying
statistically significant changes in trinomial proportions
by observing disjoint regions amongst the set of confidence
contours.