MAKSIM: help and advice
Consult the reference manual for further details and worked examples.
W.G.Bardsley, University of Manchester, U.K.
Note: sv_simfit has a simplified version of this program with fewer options
About program MAKSIM
This program accepts .txt, .csv, .cvs, .xml, .htm, .html, .mht, and .mhtml files created by spreadsheet programs,
such as Microsoft Excel or OpenOffice Calc after selecting rectangular tables of data,
then it outputs a file in Simfit data file format.
It checks the tables supplied for consistency with the Simfit file formatting conventions
and allows selective suppression of rows or columns. The original spreadsheet file is
not altered.
Formats accepted by program MAKSIM
The data must be prepared as a rectangular table with no missing
values or empty cells, but optionally, row 1 can contain column labels
and column 1 can contain row labels. If row and column labels are present,
then cell(1,1) must have a dummy label (e.g., My_Data) which will
be discarded when a Simfit data file is created. Labels with more than 20
characters will be truncated.
The following formats will be accepted
for editing if necessary, then saving as Simfit data files.
- Tables copied to the clipboard from any spreadsheet program, e.g., Microsoft Excel,
or OpenOffice Calc.
- Tables prepared in any text editor, e.g. Notepad, using spaces, commas, semicolons,
or tabs as delimiters.
- Tables written out from any spreadsheet program as either .txt, .csv, .cvs, .xml, or .html files, i.e.,
- Space delimited ASCII,
- Comma delimited ASCII,
- Semicolon delimited ASCII,
- Tab delimited ASCII,
- XML files,
- HTML files, or
- Unicode, after first reading into a text editor, e.g., Notepad, then
saving as ASCII text.
- Files already in Simfit format.
Space delimited ASCII files must have no spaces within labels and
formats like 1,234.56 can be employed as long as they are quoted
as in "1,234.56".
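The quoting and spacing rules above can be illustrated with a short Python sketch. It is not MAKSIM's own parser; it uses Python's standard csv module, and the helper names split_cells and to_number are hypothetical names chosen for this example.

```python
import csv
import io

def split_cells(line, delimiter=","):
    """Split one delimited line into cells, honouring double quotes."""
    return next(csv.reader(io.StringIO(line), delimiter=delimiter))

def to_number(cell):
    """Convert a cell to float after dropping thousands commas."""
    return float(cell.replace(",", ""))

# Unquoted: the comma acts as a delimiter, so two cells are read.
print(split_cells("1,234.56"))
# Quoted: the comma is kept inside a single cell.
print(split_cells('"1,234.56"'))
print(to_number(split_cells('"1,234.56"')[0]))
# Space delimited: a space inside a label breaks it into separate cells.
print(split_cells("TV set 42", delimiter=" "))
print(split_cells("TV_set 42", delimiter=" "))
```

The same reasoning explains why labels in space delimited files need ties such as underscores: the reader has no other way to tell a separator from part of a label.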
Using Simfit with Excel
If you use the Microsoft program Excel, then you will probably
find that using the macro called simfit6.xls is the easiest way to transfer
data between Excel and Simfit.
The macro simfit6.xls is distributed with
the Simfit package and is described in the document
ms_office.pdf.
Summary of program MAKSIM
This program helps you to transfer data into Simfit, in the form of
relatively small tables, using data files or the clipboard
from a database or spreadsheet program. There are just two possibilities.
- You select a purely numerical rectangular table, because you are not going
to need the row and column labels. For instance, a curve fitting file.
- You select a rectangular table where
the first row has column labels and the first column has
row labels. For instance, for a biplot analysis.
The input table must have exactly the same number
of columns in every row, and have no header or trailing section.
Columns, i.e. cells, can be separated by either semicolons, commas,
spaces, or tabs. There are two possible problems if you want to
create a table using a text editor, but these are unlikely to
cause trouble as most spreadsheet programs will automatically
avoid the issues to be mentioned now.
- Spaces within labels
This program will temporarily introduce underscores
where spaces occur within labels to avoid ambiguities, but these will be
restored to spaces when output files are created.
For example:
[TV set] on input will be processed as [TV_set] but be written out as [TV set].
[TV_set] on input will be processed as [TV_set] but be written out as [TV set].
Note that if spaces are used as separators in the file, then the labels should
have no internal spaces, otherwise the labels will be broken up
into separate cells.
- Commas in numeric cells
If you want to use commas for
indicating thousands in numeric data, then the numbers in the file should be surrounded
by quotes. This will all be done automatically by the spreadsheet program, but the
convention must be observed if you want to create a file in a text editor.
For example:
1,234.56 will mean 1 then 234.56, i.e. two distinct numbers, but
"1,234.56" will mean 1234.56
If the clipboard is used, select the option to create a Simfit data file,
which is offered after pressing the [Paste] button on the Simfit file
selection control.
Here are some typical examples of when this program would be used.
- You just want to transfer the whole data set into Simfit file format.
- You have data for creatine kinase measured longitudinally
in a group of patients and want to extract values on days 1,
2, 3 and 4 for all males between forty and sixty for ANOVA.
- You have scores for three different antibodies and wish to
do nonparametric correlation analysis on all patients who
lived for more than one year after the start of treatment.
- A stopped-flow machine records values ten times a second over
five minutes and you want to extract readings twice a second
over two minutes for curve fitting.
- You have times and absorbances measured in triplicate and
want a plot with means and 95% confidence limits.
This program can extract such data from spreadsheet files and create
files that are in the correct format for statistics, fitting or
plotting by Simfit. The principle is simple; every element has
a row and column Boolean, and both must be true for selection.
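The row and column Boolean principle can be sketched in Python using the stopped-flow example from the list above. This is only an illustration of the idea, not the program's implementation; the function name select, the table layout, and the signal values are all made up for the example.

```python
def select(table, row_keep, col_keep):
    """Return the cells whose row flag and column flag are both True."""
    return [
        [cell for cell, keep_col in zip(row, col_keep) if keep_col]
        for row, keep_row in zip(table, row_keep)
        if keep_row
    ]

# Hypothetical record: 10 readings per second for 5 minutes (3000 rows).
# Columns: reading index, time in seconds, signal (invented values).
table = [[i, i / 10, 0.001 * i] for i in range(3000)]

# Rows: keep every fifth reading (two per second) in the first two minutes.
row_keep = [i % 5 == 0 and i < 1200 for i in range(len(table))]
# Columns: keep time and signal, suppress the index column.
col_keep = [False, True, True]

subset = select(table, row_keep, col_keep)
print(len(subset), "rows of", len(subset[0]), "columns selected")
```

A cell survives only when both of its flags are true, so suppressing a row or a column never requires touching the individual cells.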
Details
- Select data columns from your spreadsheet program then copy
to the clipboard, or write to an ASCII comma, semicolon, tab delimited,
or XML file
for use by this program. Note that any headers and trailers are
automatically suppressed if you input Simfit type files,
but labels are preserved.
- Such input data must only consist of a table in which every
row (cases) has the same number of columns (variables). They
must be in ASCII text file format and not Unicode format.
- Columns can be separated by either commas, semicolons, spaces, or tabs.
- Items within any cell must contain no commas, tabs, or spaces.
- Missing variables in the table must be indicated by some
character (e.g. X or *) and not by spaces.
- This program will not perform any editing until the above
conditions are satisfied, and it will turn, e.g., ;;; into ;X;X;.
- When a table is accepted for editing, each item A(i,j) has
two associated Boolean variables which must both be true
if the item is to be included in the final edited matrix.
- Items can be selected/excluded by kind, size or index.
- Only numerical matrices with no missing values can be analysed by Simfit.
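The two checks described in this list, marking empty cells and confirming that every row has the same number of columns, can be sketched as follows. This is a minimal Python illustration for semicolon delimited lines, not the program's own code; mark_missing and column_counts are hypothetical names, and X is simply the marker used in the text.

```python
def mark_missing(line, delimiter=";", marker="X"):
    """Replace empty cells in one delimited line by a visible marker."""
    cells = [cell.strip() or marker for cell in line.split(delimiter)]
    return delimiter.join(cells)

def column_counts(lines, delimiter=";"):
    """Return the set of column counts found; one value means rectangular."""
    return {len(line.split(delimiter)) for line in lines}

lines = ["1;2;3;4", "1;;;4", "5;6;7"]
marked = [mark_missing(line) for line in lines]
print(marked)                  # empty cells now carry the marker
print(column_counts(marked))   # more than one count: table not yet rectangular
```

In this sketch the third line has only three cells, so the set of column counts has two members and the table would not yet be accepted for editing.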
The format required for input files or clipboard data
From any text editor, word processor or database program you
can construct a table with optional row and column labels.
For instance the title could be something like: Patients in
group 3, and the labels might be: Number, Name, Age, Weight,
Height, Blood_pressure, etc. When you have such a table in a
database, you can always write an output file with the data
in the form of a table in the ASCII text file format using a
semicolon, comma, tab, or space to separate the columns (variables), and
having each case on a separate row. This program takes in an
input file in such a format and then creates a Simfit output
file after editing. The original file is unaltered.
The editing takes place in three phases. First selected rows
are suppressed until all remaining rows have the same number
of columns. Then rows and columns are selected by number
to make a sub-set, and finally rows and columns are selected
by attributes, e.g. all rows with M in column 2, or else all
rows with a value between 1 and 3 in column 7. Since display
is restricted, you can use short entries (e.g. F not Female).
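Selection by attribute, the third phase described above, can be sketched in Python as follows. The table, the column positions, and the function names rows_with_label and rows_in_range are assumptions made for this example; column numbers are 1-based to match the text.

```python
def rows_with_label(table, column, label):
    """Flag rows whose entry in the 1-based column equals the label."""
    return [row[column - 1] == label for row in table]

def rows_in_range(table, column, low, high):
    """Flag rows whose numeric entry in the 1-based column is in [low, high]."""
    flags = []
    for row in table:
        try:
            flags.append(low <= float(row[column - 1]) <= high)
        except ValueError:
            flags.append(False)  # non-numeric cells, e.g. X, never match
    return flags

# Hypothetical table: case number, sex, age, score.
table = [
    ["1", "M", "45", "2.5"],
    ["2", "F", "52", "1.0"],
    ["3", "M", "61", "X"],
    ["4", "M", "58", "3.5"],
]

males = rows_with_label(table, 2, "M")
scored = rows_in_range(table, 4, 1, 3)
both = [a and b for a, b in zip(males, scored)]
print([row for row, keep in zip(table, both) if keep])
```

Short entries such as M and F make this kind of test easy to state and also fit the restricted display mentioned above.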
The format of the output file
During the first editing process, you toggle between viewing
and editing until there are no remaining clashes due to rows
with different numbers of columns. Rows are then re-numbered.
During the next phases of editing, you toggle between seeing
the main selected (total) matrix and the edited sub-matrix.
Row/column number requests always refer to your total matrix.
At this stage you can write the selected matrix to an output
file then resume editing, or else read in another file.
The output file will have a title, then number of rows and
columns followed by data then any trailing text in the Simfit format.
If the matrix you create has no missing values, it will then
be ready for statistics, plotting, curve-fitting, etc. But if
there are missing values (indicated by X for instance) or any
character variables in the file it cannot be used by Simfit.
Only tables or sub-tables that consist entirely of numbers, not names,
can be read into Simfit for plotting, statistics, etc.
Examine the test files MAKSIM.TF? to see what is required.
Advice
- Use your text editor or database program to make an ASCII
data set in which, as far as possible, all rows (cases)
have the same numbers of columns (variables).
- Columns can be separated by commas, spaces, tabs, or semicolons, and so every
comma, space, tab, or semicolon is important. Write 1.5, not 1,5, for one and a half!
- Use ties, e.g. very_high, fairly-low, over^the^top, etc.
to avoid introducing ambiguous spaces in names or labels.
- This program will not let you proceed to editing until each
row has got the same number of columns.
- Once you have gone on to editing, the program keeps a copy
of the accepted table, with renumbered rows, which is then
referred to as the total matrix.
- As you proceed with editing you can check what is going on
by viewing the total and current edited matrix.
- At any stage, you can write the edited matrix to an output
file, then continue editing, etc.
- All items in output files should be numerical (since stats
requires numbers, not names) so deal with missing values.
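Before writing an output file it can help to list any cells that are not numbers. The following Python sketch shows one way to do this outside the program; non_numeric_cells is a hypothetical name, and Python's float() is used as the test, which is an assumption and is more permissive than a strict check (it accepts text such as nan or inf).

```python
def non_numeric_cells(table):
    """List (row, column, text) for every cell that float() cannot read."""
    problems = []
    for i, row in enumerate(table, start=1):
        for j, cell in enumerate(row, start=1):
            try:
                float(cell)
            except ValueError:
                problems.append((i, j, cell))
    return problems

# 1,5 (decimal comma) and X (missing value) are both reported.
table = [["1.5", "2"], ["1,5", "X"]]
for row, column, text in non_numeric_cells(table):
    print(f"row {row}, column {column}: {text!r} is not a number")
```

An empty result means every cell was read as a number under this test; any reported cell has to be corrected or its row or column suppressed.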