MAKSIM: help and advice


Consult the reference manual for further details and worked examples.
W.G.Bardsley, University of Manchester, U.K.

Note: sv_simfit has a simplified version of this program with fewer options

About program MAKSIM

This program accepts .txt, .csv, .cvs, .xml, .htm, .html, .mht, and .mhtml files created by spreadsheet programs, such as Microsoft Excel or OpenOffice Calc after selecting rectangular tables of data, then it outputs a file in Simfit data file format. It checks the tables supplied for consistency with the Simfit file formatting conventions and allows selective suppression of rows or columns. The original spreadsheet file is not altered.

Formats accepted by program MAKSIM

The data must be prepared as a rectangular table with no missing values or empty cells, but optionally, row 1 can contain column labels and column 1 can contain row labels. If row and column labels are present, then cell(1,1) must have a dummy label (e.g., My_Data) which will be discarded when a Simfit data file is created. Labels with more than 20 characters will be truncated.
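The label conventions above can be illustrated with a short Python sketch. This is not part of Simfit: the function name split_labels and the sample table are hypothetical, and the only rules taken from the text are that cell(1,1) is discarded and that labels are cut to 20 characters.

```python
MAX_LABEL = 20  # the text states that longer labels are truncated

def split_labels(table):
    """Split a labelled table into (column_labels, row_labels, body).

    `table` is a list of rows of strings. Row 1 holds the column labels
    and column 1 holds the row labels, so table[0][0] is the dummy
    cell(1,1) label, which is discarded.
    """
    header, *rows = table
    col_labels = [label[:MAX_LABEL] for label in header[1:]]
    row_labels = [row[0][:MAX_LABEL] for row in rows]
    body = [row[1:] for row in rows]
    return col_labels, row_labels, body

# Hypothetical example table
table = [
    ["My_Data", "Age", "Systolic_blood_pressure_mmHg"],
    ["Patient_1", "34", "120"],
    ["Patient_2", "51", "135"],
]
col_labels, row_labels, body = split_labels(table)
```

Here the dummy label My_Data disappears and the long column label is shortened to its first 20 characters.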

The following formats will be accepted for editing if necessary, then saving as Simfit data files.

  1. Tables copied to the clipboard from any spreadsheet program, e.g., Microsoft Excel, or OpenOffice Calc.
  2. Tables prepared in any text editor, e.g. Notepad, using spaces, commas, semicolons, or tabs as delimiters.
  3. Tables written out from any spreadsheet program as either .txt, .csv, .cvs, .xml, or .html files, i.e.,
    • Space delimited ASCII,
    • Comma delimited ASCII,
    • Semicolon delimited ASCII,
    • Tab delimited ASCII,
    • XML files,
    • HTML files, or
    • Unicode, after first reading into a text editor, e.g., Notepad, then saving as ASCII text.
  4. Files already in Simfit format.

Space delimited ASCII files must have no spaces within labels and formats like 1,234.56 can be employed as long as they are quoted as in "1,234.56".
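To show why the quoting matters, here is a minimal Python sketch of reading one space delimited line with a quoted number. It uses the standard csv module and assumes single spaces between cells; the function names are hypothetical and this is not how MAKSIM itself is implemented.

```python
import csv

def to_number(cell):
    """Return a float when the cell is numeric, otherwise the text.

    Commas are treated as thousands separators, which is only safe
    because the cell was quoted in the input.
    """
    try:
        return float(cell.replace(",", ""))
    except ValueError:
        return cell

def parse_space_delimited(line):
    """Split one space delimited line, honouring double-quoted cells."""
    cells = next(csv.reader([line], delimiter=" "))
    return [to_number(cell) for cell in cells]

parsed = parse_space_delimited('Sample_A "1,234.56" 7.5')
```

Without the quotes, the comma in 1,234.56 could be read as a column separator and the row would appear to have an extra column.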

Using Simfit with Excel

If you use the Microsoft program Excel, then you will probably find that using the macro called simfit6.xls is the easiest way to transfer data between Excel and Simfit. The macro simfit6.xls is distributed with the Simfit package and is described in the document ms_office.pdf.

Summary of program MAKSIM

This program helps you to transfer data into Simfit, in the form of relatively small tables, using data files or the clipboard from a database or spreadsheet program. There are just two possibilities.

  1. You select a purely numerical rectangular table, because you are not going to need the row and column labels. For instance, a curve fitting file.
  2. You select a rectangular table where the first row has column labels and the first column has row labels. For instance, for a biplot analysis.

The input table must have exactly the same number of columns in every row, and have no header or trailing section. Columns, i.e. cells, can be separated by either semicolons, commas, spaces, or tabs. There are two possible problems if you want to create a table using a text editor, but these are unlikely to cause trouble as most spreadsheet programs will automatically avoid the issues to be mentioned now.

If the clipboard is used, press the [Paste] button on the Simfit file selection control, then select the option to create a Simfit data file. Here are some typical examples of when this program would be used.

This program can extract such data from spreadsheet files and create files that are in the correct format for statistics, fitting or plotting by Simfit. The principle is simple; every element has a row and column Boolean, and both must be true for selection.
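The selection principle can be sketched in a few lines of Python. This is only an illustration under the stated rule (an element is kept when both its row Boolean and its column Boolean are true); the function name and data are hypothetical.

```python
def select(matrix, row_keep, col_keep):
    """Keep element (i, j) only if row_keep[i] and col_keep[j] are both True."""
    return [
        [value for value, keep_col in zip(row, col_keep) if keep_col]
        for row, keep_row in zip(matrix, row_keep)
        if keep_row
    ]

matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
row_keep = [True, False, True]   # suppress row 2
col_keep = [True, True, False]   # suppress column 3
edited = select(matrix, row_keep, col_keep)
```

Suppressing a row or a column therefore amounts to setting a single Boolean to false, and the original table is left unchanged.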

Details

  1. Select data columns from your spreadsheet program then copy to the clipboard, or write to an ASCII comma, semicolon, tab delimited, or XML file for use by this program. Note that any headers and trailers are automatically suppressed if you input Simfit type files, but labels are preserved.
  2. Such input data must consist only of a table in which every row (case) has the same number of columns (variables). They must be in ASCII text file format and not Unicode format.
  3. Columns can be separated by either commas, semicolons, spaces, or tabs.
  4. Items within any cell must contain no commas, tabs, or spaces.
  5. Missing variables in the table must be indicated by some character (e.g. X or *) and not by spaces.
  6. This program will not perform any editing until the above conditions are satisfied and it will turn e.g., ;;; into ;X;X;.
  7. When a table is accepted for editing, each item A(i,j) has two associated Boolean variables which must both be true if the item is to be included in the final edited matrix.
  8. Items can be selected/excluded by kind, size or index.
  9. Only numerical matrices with no missing values can be analysed by Simfit.
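Items 5 and 6 above, together with the requirement that every row has the same number of columns, can be sketched as follows. The Python functions are hypothetical illustrations, not MAKSIM code; the marker X and the ;;; example are taken from the text.

```python
def fill_empty(line, delimiter=";", marker="X"):
    """Insert a marker between adjacent delimiters, e.g. ;;; becomes ;X;X;."""
    pair = delimiter * 2
    filled = delimiter + marker + delimiter
    # Repeat because replacements are non-overlapping on each pass.
    while pair in line:
        line = line.replace(pair, filled)
    return line

def is_rectangular(rows):
    """True when every row has the same number of cells."""
    return len({len(row) for row in rows}) <= 1

lines = ["1;2;3", "4;;6", "7;8;9"]
rows = [fill_empty(line).split(";") for line in lines]
```

After filling, the second row reads 4;X;6, so the table is rectangular but still contains a missing value that must be dealt with before Simfit can analyse it.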

The format required for input files or clipboard data

From any text editor, word processor or database program you can construct a table with optional row and column labels. For instance the title could be something like: Patients in group 3, and the labels might be: Number, Name, Age, Weight, Height, Blood_pressure, etc. When you have such a table in a database, you can always write an output file with the data in the form of a table in the ASCII text file format using a semicolon, comma, tab, or space to separate the columns (variables), and having each case on a separate row. This program takes in an input file in such a format and then creates a Simfit output file after editing. The original file is unaltered. The editing takes place in three phases. First, selected rows are suppressed until all remaining rows have the same number of columns. Then rows and columns are selected by number to make a sub-set, and finally rows and columns are selected by attributes, e.g. all rows with M in column 2, or else all rows with a value between 1 and 3 in column 7. Since display is restricted, you can use short entries (e.g. F not Female).
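Selection by attribute, the third phase described above, might look like this in Python. The two helper functions and the sample rows are hypothetical; column numbers start at 1, as in the text.

```python
def rows_with_value(rows, column, wanted):
    """Rows whose cell in the given column (counted from 1) equals `wanted`."""
    return [row for row in rows if row[column - 1] == wanted]

def rows_in_range(rows, column, low, high):
    """Rows whose cell in the given column is a number between low and high."""
    selected = []
    for row in rows:
        try:
            value = float(row[column - 1])
        except ValueError:
            continue  # non-numeric cells (e.g. a missing-value marker) never match
        if low <= value <= high:
            selected.append(row)
    return selected

# Hypothetical cases: number, sex, score
rows = [
    ["1", "M", "2.5"],
    ["2", "F", "1.0"],
    ["3", "M", "X"],
]
males = rows_with_value(rows, 2, "M")
in_range = rows_in_range(rows, 3, 1, 3)
```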

The format of the output file

During the first editing process, you toggle between viewing and editing until there are no remaining clashes due to rows with different numbers of columns. Rows are then re-numbered. During the next phases of editing, you toggle between seeing the main selected (total) matrix and the edited sub-matrix. Row/column number requests always refer to your total matrix. At this stage you can write the selected matrix to an output file then resume editing, or else read in another file.

The output file will have a title, then number of rows and columns followed by data then any trailing text in the Simfit format. If the matrix you create has no missing values, it will then be ready for statistics, plotting, curve-fitting, etc. But if there are missing values (indicated by X for instance) or any character variables in the file it cannot be used by Simfit. You can only read into Simfit for plotting, statistics, etc. tables or sub-tables that consist entirely of numbers, not names. Examine the test files MAKSIM.TF? to see what is required.
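The layout just described (title, number of rows and columns, data, trailing text) can be sketched as below. This Python function is a hypothetical illustration only: the exact Simfit format, for instance how the trailing section is introduced, should be taken from the reference manual and the MAKSIM.TF? test files, not from this sketch.

```python
def format_simfit_like(title, matrix, trailer=()):
    """Build text with a title, a 'rows columns' line, the data, then trailer lines.

    Raises ValueError if any cell is not numeric, since only numerical
    matrices with no missing values can be analysed.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if matrix else 0
    lines = [title, f"{n_rows} {n_cols}"]
    for row in matrix:
        if len(row) != n_cols:
            raise ValueError("all rows must have the same number of columns")
        # float() raises ValueError for markers such as X or for names
        lines.append(" ".join(str(float(cell)) for cell in row))
    lines.extend(trailer)
    return "\n".join(lines)

text = format_simfit_like("Example", [["1", "2.5"], ["3", "4"]], ["end"])
```

A table that still contains a missing-value marker or a name is rejected here, which mirrors the statement that such files cannot be used by Simfit.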

Advice

  1. Use your text editor or database program to make an ASCII data set in which, as far as possible, all rows (cases) have the same number of columns (variables).
  2. Columns can be separated by commas, spaces, tabs, or semicolons, and so every comma, space, tab, or semicolon is important. Write 1.5 not 1,5 for one and a half!
  3. Use ties, e.g. very_high, fairly-low, over^the^top, etc. to avoid introducing ambiguous spaces in names or labels.
  4. This program will not let you proceed to editing until each row has got the same number of columns.
  5. Once you have gone on to editing, the program keeps a copy of the accepted table, with renumbered rows, which is then referred to as the total matrix.
  6. As you proceed with editing you can check what is going on by viewing the total and current edited matrix.
  7. At any stage, you can write the edited matrix to an output file, then continue editing, etc.
  8. All items in output files should be numerical (since stats requires numbers, not names) so deal with missing values.