 { Print in Elite type (12-pitch), 60 lines-per-page, and margins allowing 90 columns }

          INSTRUCTIONS FOR INSTALLATION AND USE OF THE HYBALL FACTOR-ANALYSIS PACKAGE

                                Date of release:  July, 2003

 Purpose:  To factor-analyze multivariate score arrays with special versatilities not
     currently available elsewhere.

 Hardware requirements:  A PC microcomputer with 486 or higher-performance CPU,
     running under DOS or in a DOS Window.

 Input requirements:  Raw data must be in an ASCII file wherein, after an arbitrary block
     of documentation lines, each line or consecutive block of lines contains numerical
     scores for one case (subject) on the same sequence of variables.  (Letter-coded
     scores are read as missing data.)  Unless scores are separated by blanks or commas,
     each case record must conform to some Fortran I-code or F-code READ format except
     that demarcation of missing data by non-numeric characters or blanks is acceptable.
     Each case record may commence with an ID number up to nine digits, or ten if not
     larger than 2147483647.
         Provision is also made for importing data covariances or factor patterns
     already computed.
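For readers unfamiliar with fixed-field record layouts, the following Python sketch
(not part of Hyball; the format (I5,6F4.1), field widths, and sample record are all
invented for illustration) shows how such a READ format carves a case record into an
ID plus scores, with non-numeric fields read as missing:

```python
def parse_record(line, id_width=5, field_width=4, nvars=6, decimals=1):
    """Split one fixed-width case record: an ID then nvars score fields.

    Mimics a hypothetical Fortran format (I5,6F4.1); non-numeric or
    blank fields are returned as None (missing data)."""
    case_id = int(line[:id_width])
    scores = []
    for k in range(nvars):
        start = id_width + k * field_width
        field = line[start:start + field_width]
        try:
            # F-code: a field lacking an explicit decimal point gets the
            # implied scaling (here one decimal place), as Fortran would.
            if "." in field:
                scores.append(float(field))
            else:
                scores.append(int(field) / 10**decimals)
        except ValueError:
            scores.append(None)      # letter-coded or blank => missing
    return case_id, scores

cid, s = parse_record("  101 123 4.5  xx  67 8.0 -12")
# cid is 101; the third score field ("  xx") comes back as missing.
```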

 DOS Size limits:  Now virtually unlimited under LF90 Fortran so long as the number
     of data variables does not exceed 999.  All your available extended RAM can be
     used.  (In principle, if that doesn't suffice the code resorts to page swapping
     with your hard drive, but execution time under that is intolerably slow.)

 Knowledge requirements:  Some acquaintance with factor analysis and maybe Fortran code
     for reading numerical input from ASCII datafiles; also preferably some awareness of

     Rozeboom, W. W. (1991).  HYBALL: A method for subspace-constrained oblique factor
         rotation. Multivariate Behavioral Research, 26, 163-177. (a)
     Rozeboom, W. W. (1991).  Theory & practice of analytic hyperplane detection.
         Multivariate Behavioral Research, 26, 179-197. (b)
     Rozeboom, W. W. (1992).  The glory of suboptimal factor rotation. Multivariate
         Behavioral Research, 27, 585-599.
     Rozeboom, W. W. (    ).  HYBLOCK: A routine for exploratory factoring of block-
         structured data. MS included in the Hyball package in both WordPerfect
         (extension .WP5) and Adobe Acrobat (extension .PDF) code.
     Rozeboom, W. W. (    ).  How well do criterion-optimizing routines for factor
         rotation in fact optimize?  (b); hardcopy available on request.

 Installation: See p. 3.
                                           CONTENTS

         The Hyball distribution package contains compressed Fortran-90 source code
 (extension .FOR) for the programs listed below and, more importantly, executable
 code (extension .EXE) for all these programs compiled by Lahey LF90 for DOS
 execution.  Each program is self-contained except that, if you recompile, subroutine
 package EIGS must be appended to some.  (Recompilation will not be feasible for most
 users.)  The programs are listed here roughly in the sequence of their usage.

 HYDATA:  This receives an ASCII file of raw scores, checks it for readability (seldom
     a problem unless scores are run together, in which case you also need to enter its
     READ format), and at your option outputs either or both of (a) a Hydata-standard
     transcription of your rawdata file which can be manipulated in useful ways by
     the Hydata-supplement programs described on p. 17ff., and (b) a file of these
     variables' correlations structured for factor extraction by program MODA.  Also
     returned is a SEE-file containing salient information about each variable's
     distribution, notably mean, standard deviation, high, low, skew, kurtosis, and,
     at your option, reports on missing-data and possible nonlinear item relations.
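The distribution summaries named above can be illustrated with the conventional
moment-based formulas.  This Python sketch is illustrative only; it is an assumption
(not stated in this manual) that HYDATA uses these particular population (1/n)
conventions:

```python
import math

def summary(xs):
    """Conventional moment-based summary statistics: mean, SD, high,
    low, skew, and excess kurtosis, using population (1/n) moments."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    sd = math.sqrt(m2)
    return {"mean": mean, "sd": sd, "high": max(xs), "low": min(xs),
            "skew": m3 / sd ** 3, "kurtosis": m4 / m2 ** 2 - 3.0}

stats = summary([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
# mean 5.0, sd 2.0, positive skew, mildly platykurtic.
```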
 
                                  -2-

 FIXDATA/MERGE/RESCORE/SELECT.  See p. 18ff. for the Hydata-supplement programs.

 MODA:  This factors a received array of data covariances.  It commences by allowing
     partition of the data variables between a possibly null X-set treated as manifest
     source factors and a Y-set taken to be manifest outputs.  MODA then computes the
     Y-set's regression upon the X-set, and factors the Y-set's residuals for each of
     the common-factor dimensionalities NF chosen by the user in light of the residual
     Y-covariance eigenvalues.  At user option, factor extraction with communalities is
     either by iterated principal factoring or by Minres (unweighted least-squares).
     Data-space principal components (normalized) are also available.
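The partialling step MODA performs here can be sketched as the usual Schur complement
of the partitioned covariance matrix.  The small matrix below is invented for
illustration, and this is a sketch of the algebra, not MODA's code:

```python
import numpy as np

# Invented 3-variable covariance (correlation) matrix: variable 1 is
# the X-set, variables 2-3 the Y-set.
C = np.array([[1.0, 0.6, 0.4],
              [0.6, 1.0, 0.5],
              [0.4, 0.5, 1.0]])

def residual_y_cov(C, nx):
    """Y-set covariances after the Y-set's regression on the X-set is
    partialled out: Cyy - Cyx Cxx^-1 Cxy (a Schur complement)."""
    Cxx, Cxy = C[:nx, :nx], C[:nx, nx:]
    Cyy = C[nx:, nx:]
    return Cyy - Cxy.T @ np.linalg.solve(Cxx, Cxy)

R = residual_y_cov(C, nx=1)
eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]  # inspect these to choose NF
```

Factor extraction then operates on R; the eigenvalues are the quantities the user
inspects when choosing the common-factor dimensionality NF.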

 HYBLOCK:  If wanted, this program intervenes between MODA and HYBALL to impose a
     hierarchical block structure on the initial factor pattern. It is designed for use
     with data that include multiple indicators of source factors in blocks on which a
     causal-path structure is postulated.  It can also fit factor axes/subspaces to
     preselected marker positions.

 HYBALL:  This rotates a received factor pattern of NV variables on NF factors to oblique
     simple structure under adjustable control parameters and subspace constraints, if
     any, either specified at run time or imposed previously by HYBLOCK.  It stores
     arbitrarily many retrievable trial solutions, and allows both hyperplane-count and
     congruence comparisons of these during the run.  (HYBALL can also rotate quadratic
     factoring solutions and, in Monte Carlo studies thereof, compare outcomes to their
     source structures.)

 HYLOG/TWOLOGS:  All intermediate rotations and their production parameters leading from
     an initial factor pattern to HYBALL's final rotation thereof are saved in a logfile
     HYBUF* from which HYLOG can retrieve the derivational history of any HYBALL solution
     therein and report comparisons among these solutions in considerable detail.
     TWOLOGS compares rotations between different HYBUF archives, for whatever
     variables they have in common, on best-matching factors whose numbers need not
     agree.
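The congruence comparisons mentioned here presumably rest on a coefficient such as
Tucker's; the following sketch shows that coefficient only, as an assumption about
(not a transcription of) what TWOLOGS computes:

```python
import math

def congruence(x, y):
    """Tucker's congruence coefficient between two factor-loading
    columns: sum(x*y) / sqrt(sum(x^2) * sum(y^2))."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den

# A perfectly matched column yields 1.0; a reflected copy yields -1.0.
f1 = [0.8, 0.7, 0.1]
phi = congruence(f1, [-v for v in f1])
```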

 HYFAC:  This receives a HYBALL factor solution and computes the factors' regressions on
     all or any subset of the factored variables (items).  It then allows the user to
     study construction of rounded-regression item scales for the factors as follows:
     For each factor in turn, all items are discarded (given zero compositing weight)
     whose regression coefficients for predicting this factor are smaller than a
     user-readjustable minimum, and the loss of predictive accuracy incurred by this
     deletion is reported to the user.  The regression weights for the nonexcluded items
     are then converted to raw-score weights, rescaled to give the largest weight a
     stipulated value MaxWt, and rounded to the nearest integer, again followed by a
     report of accuracy loss and an opportunity to reset MaxWt. The effect of small MaxWt,
     say 2 to 5, is to generate item scales that are largely unweighted sums of properly
     reflected items except that a few of the most important items may have integer
     weights up to MaxWt. These rounded raw-score weightings for each factor under each
     tested MaxWt, along with their accuracies, are reported in an ASCII results file for
     later hard-copy study. The program also reports the extent to which items shared by
     scales for the different factors will create a problem of correlated uniquenesses
     if these item scales are included in factor analyses with other variables.
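The thresholding-rescaling-rounding cycle just described can be sketched as follows.
The function name, example weights, and parameter values are invented for
illustration; this is not HYFAC's code:

```python
def rounded_weights(weights, min_abs, max_wt):
    """HYFAC-style item-weighting sketch: zero out coefficients below
    min_abs, rescale so the largest magnitude equals max_wt, then
    round to the nearest integer.  min_abs and max_wt stand in for the
    user-adjustable minimum and MaxWt described in the text."""
    kept = [w if abs(w) >= min_abs else 0.0 for w in weights]
    top = max(abs(w) for w in kept)
    return [round(w * max_wt / top) for w in kept]

# Small MaxWt yields a nearly unweighted (but properly reflected) scale.
w = rounded_weights([0.60, -0.45, 0.20, 0.08], min_abs=0.10, max_wt=3)
```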

 BOOTSUMM/HYBOOT.  See p. 23ff. for the Bootstraps-supplement programs.

 ORDER/FINDBLK.  These facilitate use of HYBLOCK.  The first converts ordered pairs of
     indices for digraph nodes into a detailed report on the path structure entailed by
     these index pairs. The second solves item subsets for the block structure that
     reproduces these subsets as the unions of item blocks with their path-antecedents.
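The notion of a block's path-antecedents can be illustrated by a small reachability
computation on (parent, child) index pairs.  This sketch shows the bookkeeping such
programs automate; it is not the ORDER/FINDBLK code:

```python
def antecedents(edges, node):
    """All path-antecedents of a node in a digraph given as (parent,
    child) index pairs: every node from which the target is reachable."""
    parents = {}
    for a, b in edges:
        parents.setdefault(b, set()).add(a)
    seen, stack = set(), [node]
    while stack:
        for p in parents.get(stack.pop(), set()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

# Path structure 1 -> 2 -> 4 and 3 -> 4: node 4's antecedents are {1, 2, 3}.
anc = antecedents([(1, 2), (2, 4), (3, 4)], 4)
```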

 ENTRY:  This program allows you to input a pre-computed pattern and/or covariance matrix
     by entering its elements from console keyboard.  The completed file is structured
     for analysis by MODA or HYBALL as appropriate.  [ Warning: Archaic code in ENTRY. ]
 
                                  -3-

 PLOT:  This displays scatterplots of any two variables for which HYDATA has computed
     summary statistics.  The plots can be of residual scores from which any selection
     of other datafile variables has been partialled. [ Not currently operational ]

 HYPICK:  This picks from any HYBUF* logfile loaded as input a reduced logfile containing
     only a selected subset of the rotations archived in HYBUF*.  The original is unaltered.

 SETPRNT:  Run this to create a printer-definition file PRNTR in which you enter the codes
     for pitch, point size, margins, and lines-per-inch listed in your printer manual.
     PRNTR is read by HYBALL and certain other programs in this package to insert printer
     controls in its results report.  But its use is optional.

 PRNTR.EP/PRNTR.HP: These contain printer codes for the Epson FX-80 and HP LaserJet II,
     respectively. [These are obsolete, but remain for possible demonstration value.]

 EIGS:  This is a set of double-precision subroutines that solve for eigenstructure.  If
     you recompile MODA, HYBLOCK, HYFAC, FIXDATA, or RESCORE, this must be included when
     you LINK.

 COVS.xxx:  These are MODA-readable ASCII data correlations on which you can test MODA's
     performance. (Their header lines briefly describe their origins.)  You can compute
     PATS.WAI, the pattern whose HYBALL rotation is reported in Rozeboom, 1991a, by
     MODA-factoring COVS.WAI for three latent factors with variables {2,3,5} omitted and
     variables {1,4} in the X-set.  Copy to your choice of working subdirectory under
     filename xxx.COV.

 PAT.xxx:  These are HYBALL-readable ASCII factor patterns on which you can test HYBALL's
     performance.  (Their origins are briefly described in their headings.)  Copy to
     INHYBL or to a name with extension beginning with H in your choice of subdirectory
     for running HYBALL.

 DATA.RAW/DATA.BAD:  These are small ASCII files of raw data, one a corrupted reduction
     of the other, on which to practice HYDATA.  Start by reading their header texts for
     advice on what to do with them.

                                     INSTALLATION

         The Hyball package is ZIP-compressed into three installation files, PACKx.EXE
 for x = 1,2,3.  To install their contents, enter the hard-drive subdirectory (under
 any brief name you prefer) that you have created to contain the package, insert the
 disk containing PACKx.EXE into your drive A or drive B, and type DOS command A:PACKx
 or B:PACKx as
 appropriate.  The individual files compressed in this pack will be copied into your
 selected subdirectory, after which you repeat the process for the other installation
 files PACKx.EXE.  (The order in which these are unpacked is immaterial.)  All the
 source code is in PACK1 while PACK2/3 contain only .EXE files.  Finally, don't forget
 to include your Hyball subdirectory in your AUTOEXEC search path.

         When uncompressed, this package totals about 9 megabytes of disk space.  But
 once you acquaint yourself with its contents and their operations, much can be deleted
 so long as you save the distribution package.  In particular, you will seldom need
 anything from PACK1 except the documentation (*.TXT).
 
                                       -4-

                                   SYSTEM-SPECIFIC ADJUSTMENTS

 RECOMPILATION. Hyball source code is now supported only for compilation by Lahey LF90.
     You are welcome to modify and recompile the source code if you have access to
     LF90, but you may need to consult me on some technicalities in order for that to
     succeed.  In particular, MODA, HYBLOCK, and some auxiliary programs in the package
     need to be linked with subroutine package EIGS.

 PRINTOUT.  Programs HYDATA, MODA, HYBALL, HYLOG, TWOLOGS, and HYFAC write printable
     results to an ASCII SEE-file whose name either starts with "$" or includes "SEE"
     in its basename or extension.  (Other programs in this package also write variously
     named ASCII reports that in some cases you may want to print.) All but HYDATA can
     deliver either 132-column or 80-column print by a switch setting read from printer-
     definition file PRNTR.  The default width is 132 columns; but if PRNTR doesn't
     contain printer code for that, it instructs the program to write 80-column lines
     instead.  (This substitution requires PRNTR to be in the subdirectory wherein this
     program is being run.) If PRNTR is available, these programs also insert into their
     SEE reports the printer codes appropriate for your printer.  For the most part,
     printer details aren't critical so long as you use a font with fixed pitch.  But
     to get decent-looking pattern plots from HYBALL, or bivariate frequency tables
     from PLOT, you must adjust your vertical line spacing, conditional on whether your
     printer can deliver 132-column lines (17 pitch type), with some care.  If the plots
     you first get from your printer code for 12 LPI with 132-column lines or 7 LPI with
     80-column lines don't look quite right, experiment with other settings for this
     feature.  NOTE: Before printing any SEE-file, examine it within a text editor.
     It may well contain material you prefer to delete or edit before printing.


                                OPERATIONS OVERVIEW

         This program package has a modular organization with multiple entry points and
 storage of intermediate results.  Commencing with a received ASCII raw-score file <data>
 in which each line or block of lines lists one subject's scores on the same variables,
 the full sequence of operations leading to a simple-structured factor pattern for these
 data variables is as follows:

 1)  Run HYDATA to learn the salient statistical properties of the <data> scores, to
         transcribe these data into the standard form on which the HYDATA-supplement
         programs operate (optional), and above all to write standardized covariances
         (correlations) among some or all of the <data> variables to a COV-file from
         which MODA can extract factors.

 2)  Run MODA to solve these covariances for an orthonormal common-factor pattern
         possibly followed by HYBLOCK repositioning of these initial factors to span
         subspaces that subsequent rotations are constrained to preserve.

 3)  Run HYBALL to rotate the MODA or HYBLOCK factors to oblique simple structure
         under whatever subspace constraints, if any, are wanted on this rotation.

 4)  Finally, you can run HYFAC on your preferred HYBALL solution to construct
         item scales for estimating some or all of these factors.

 You will probably want to do some stages of this sequence more than once to experiment
 with the different options they make available.  Essentially all your results from
 repeated runs are saved under a file-naming system, described immediately below, that
 allows you to keep track of what these contain with a minimum of memory strain.  The
 naming system has unhappily become rather baroque (suggestions for improvement are
 welcome); but it does largely manage to avoid overwriting results from previous runs
 until you elect to delete those.
 
                                  -5-

                         STRUCTURED FILENAME SUMMARY:

     Lower-case dummy expressions are in angle brackets if also variable in length.

     <data>.D<k>     The kth Hydata-standard datafile written with basename <data>
                       by program HYDATA, FIXDATA, MERGE, SELECT, or RESCORE. (ASCII)
     <data>.LOG      The accumulated record of HYDATA's production of <data>.D1
                       followed by briefer description of the origin and content of
                       each datafile in the D-series having basename <data>. (ASCII)
     <data>i.COV     The ith covariance matrix and related information computed from
                       <data>.<ext> by HYDATA for input to MODA. (ASCII)
     <data>i.SEE     Report of basic statistics found when <data>.<ext> is processed
                       under the controls set for <data>i.COV. (ASCII)
     <data>ij.M<n>   The pattern on n factors, prepared as input for HYBALL, by the
                       jth MODA analysis of covariance file <data>i.COV. (binary)
     <data>ij.K<n>   Same as <data>ij.M<n> except that the factors are data-space
                       normalized principal components. (binary)
     <data>ij.SEE    Report on MODA factor solutions in files <data>ij.M*. (ASCII)
     <dat>ij<a>.B<n> HYBLOCK shifting of <data>ij.M<n> in which selected factor
                       subspaces are marked for invariance under subsequent HYBALL
                       rotation.  <a> is a sequential letter index. (binary)
     $<dat>ij<a>.B<n> HYBLOCK's SEE-report on shifted pattern <dat>ij<a>.B<n>. (ASCII)
     $<dat>ij*.H<n>  The latest detailed report of one or more HYBALL rotations of input
                       from MODA or HYBLOCK.  If the input file is <data>ij.M<n>, * is
                       null; if it is <dat>ij<a>.B<n>, * is <a>. (ASCII)
     <data>ij.#<n>   Logfile of HYBALL's rotations of an input pattern <data>ij.*<n>.
        also HYBUF   Same as <data>ij.#<n> when pattern is first rotated. (binary)
     <data>ij.$<n>   HYLOG appraisals of the patterns stored in <data>ij.#<n>.  (ASCII)
     $2<da>.<a><p>   TWOLOGS comparisons between two HYBUF archives whose respective
                       COV-file origins have names with common beginning <da>.  <a> is
                       a sequential letter index while <p> is either null or, when this
                       SEE-file juxtaposes matched pattern columns, is P. (ASCII)
     FAC<da>ij.H<n>  An abbreviated version of $<dat>ij.H<n>, intended primarily as
                       input to HYFAC but also accepted as input by HYBALL. (ASCII)
     HYF<da>ij.C<n>  The oblique factor correlations reported in $<dat>ij.H<n>,
                       formatted for 2nd-order factoring by MODA. (ASCII)
     SEE<da>ij.F<n>  HYFAC advice on item scales for the rotated factors reported
                       in $<dat>ij.H<n>. (ASCII)
     <data>ij.Wh     The hth item-weight matrix readied for input to RESCORE by the
                       last run of HYFAC; derived from a HYBALL rotation of a MODA
                       factor solution with basename <data>ij. (ASCII)

         In addition to the content files just described, these programs also write an
     assortment of auxiliary files, by name LASTFORM, <data>.NAM, INMODA, BLKREC, and
     BLOKREC.*.  All these will be overwritten when no longer needed and can easily
     be replaced.  And the provisions for bootstrap estimation of sampling noise in
     HYBALL-rotated factor solutions exploit a binary file BOOTDATA (written by HYBALL,
     read by HYBOOT), ASCII files SEEBOOTS and SEEBOOT of results from BOOTSUMM and
     HYBOOT respectively, and binary outputs of HYDATA/MODA/HYBALL from bootstraps
     sampling data identified by inclusion of a bracket character, that is, one of
     ( ) [ ] { }, in their names.

         Every computation of covariances by HYDATA assigns the output COV-file a 6-digit
     random code number that should differ from the code No. assigned by any other HYDATA
     run. This code No. persists throughout subsequent processing of these covariances,
     but reaches HYBALL with a random 2-digit extension appended by MODA to distinguish
     results from the same input covariances under different choices of factoring options.
     If you become confused about what later results were derived from what preceding one,
     these codings may help clarify this for you.
  
                                  -6-

                              OPERATING INSTRUCTIONS

         Each executable program in this package is activated by typing just its basename
 (no extension or arguments) after the DOS prompt.  It can be in any subdirectory on your
 AUTOEXEC search path.  But the files from which its input is chosen from the on-screen
 listing at start of run must be in the subdirectory that is currently active.

   [ Note:  Printing hard copy as described in the next paragraph may be unnecessary ]
   [        or obsolete for you.  So for now, ignore all but its last sentence.      ]

         To initiate use of the Hyball package, run SETPRNT in the subdirectory you have
 chosen for your Hyball operating files to create a printer-definition file PRNTR (or one
 of these saved under a distinct name for each printer you use) that remains in this base
 subdirectory but is to be copied under name PRNTR to the active subdirectory for your
 run.  SETPRNT requires you to enter when prompted the ASCII codings listed in your
 printer manual for various pitches and point sizes of type, margin widths, and lines
 per inch.  (All Hyball outputs intended for inspection are written to ASCII files, not
 direct to printer; so you can forego this printer setup if you prefer to enter printer
 codes for hard copy of these files in your own way.)  You are now ready to begin
 actual data analysis by the following steps, which you should first practice on
 raw-score files DATA.RAW and DATA.BAD in the distribution package after reading
 their header information.

 1.  UNDERSTANDING YOUR FILE STRUCTURE.  You will find it most convenient to dedicate a
     workspace subdirectory to one or a few related rawdata files and the Hyball results
     you get from them.  Copy PRNTR to this (necessary only if you need SEE-file lines
     limited to 80 characters) and begin analysis of any rawdata file by copying it to
     this subdirectory under a name <data>.<ext> whose basename <data> should be no
     longer than 6 characters and preferably not end in a numeral. Blank or RAW is
     suggested for <ext>, but any choice is acceptable so long as you avoid conflict
     with the dedicated extensions described above.  When you run HYDATA to commence
     analysis of these data, you will be invited to transcribe <data>.<ext> into an
     ASCII datafile <base>.D1, format-standardized for HYDATA-supplement processing
     (p. 18ff. below), for your choice of <base> up to 6 characters.  If you accept
     this option, certain modest risks of later confusion will be avoided if you either
     choose <base> different from <data> or forego further work with <data>.<ext> in
     this subdirectory.  Information about the production of <base>.D1 is recorded in
     ASCII file <base>.LOG.  <base>.D1 may well become first in a series of modified
     datafiles <base>.D<i> in which <i> is a sequential numeric index from 1 to 99.

         Running HYDATA on <data>.<ext> or <data>.Di will produce ASCII files <data>i.SEE
     and <data>i.COV for the smallest digit i > 0 that does not overwrite a file already
     in this subdirectory.  (If none of digits 1-9 are free, i = 0 and may overwrite a
     previous file with this name.)  <Data>i.COV contains covariances set for MODA anal-
     ysis, while <data>i.SEE records pertinent archival information about <data>.<ext>.

         Next, running MODA on <data>i.COV (or its unformatted equivalent that HYDATA
     writes to INMODA) produces ASCII file <data>ij.SEE and one or more binary files
     <data>ij.M<n> (or <data>ij.K<n> if the factors are data-space principal components)
     for the smallest digit j>0 that won't overwrite any prior MODA-output file.  Each
     <data>ij.M<n> contains a MODA pattern on <n> factors set for HYBALL rotation, while
     <data>ij.SEE records information about the factor solutions on this MODA run that
     you may wish to save.  If you elect at this point to impose a block structure on
     this space of common factors by running HYBLOCK on MODA solution <data>ij.M<n>, the
     resultant factor positioning is saved in binary file <dat>ij<a>.B<n> accompanied
     by ASCII report $<da>ij<a>.B<n>, where <a> is a sequential letter index, and <dat>
     and <da> are as many leading letters of <data> as the namelength limit permits.
 
                                  -7-

         Next, when HYBALL is run on <data>ij.M<n>, or on its HYBLOCK shifting
     <dat>ij<a>.B<n>, it produces ASCII files $<dat>ij*.H<n> (* either blank or <a>)
     and FAC<da>ij.H<n>, each of which contains the rotated factor solution in its own
     way.  In these HYBALL-output names, <dat> or <da> is the longest leading part of
     <data> for which there is room; i, j, <n> are the same as in the input file; and
     <a>, if present, is the letter index terminating the basename of a HYBLOCK-output
     pattern which HYBALL
     has rotated.  (CAUTION:  Don't confuse HYBALL's SEE-report $<dat>ij*.H<n> on rotation
     of a HYBLOCK-shifted pattern with the preceding HYBLOCK report having the same
     basename but "B" instead of "H" in its extension.)  $<dat>ij*.H<n> is your main
     interpretive payoff from all this work, whereas FAC<da>ij.H<n> is intended primarily
     as input to HYFAC for construction of item scales for the factors.  Additional files
     written by HYBALL are a binary logfile named HYBUF or <data>ij.#<n> of its successive
     rotations of the input pattern, an ASCII auxiliary report LUMP* on results from
     Spin search, and perhaps a binary file HYF<da>ij.C<n> of correlations among this
     rotation's factors which can be read by MODA for 2nd-order factoring.

         Finally, running HYFAC on any HYBALL-output file FAC<da>ij.H<n> will allow you
     to explore estimation of this rotation's factors by variously selected and adjusted
     item scales.  The record of your findings is collected in ASCII file SEE<da>ij.F<n>
     intended for print after editing to your taste.  And ASCII item-weight matrices that
     Hydata-supplement program RESCORE can import to compute scores on these item scales
     are written under names <data>ij.Wh for the sequence h = 1,2,... of weighting
     alternatives examined on this run.  The basename <data>ij of these weight matrices
     is the basename of the MODA or HYBLOCK pattern from which HYBALL derived pattern
     FAC<da>ij.H<n>.  Each <data>ij.Wh is listed with additional detail in HYFAC report
     SEE<da>ij.F<n>.  It is expected that you will not want to keep more than one of
     these weight matrices after you have studied their SEE report.  Any that you do wish
     to retain should be renamed with an extension starting with W (see p. 16).

 2.  COMPUTATION OF D-FILES AND COVARIANCES FROM YOUR RAW DATA.  Type " HYDATA " at the
     DOS prompt in a directory containing ASCII datafile <data>.<ext> to commence analysis
     or transcription of these data.  You will be shown a listing of all local files whose
     extensions do not preclude their being datafiles and asked to pick one of these.
     (If your choice is not in fact an ASCII datafile, you will see unintelligible symbols
     on screen where meaningful text should appear.  If so, hit Ctrl-C to abort; otherwise,
     if you continue your machine will eventually hang, requiring a reboot.)  Once your
     datafile is loaded,  HYDATA needs your reply to assorted requests for information,
     confirmations, and preferences.  Considerable on-screen assistance accompanies these
     queries, but some advance preparation is also prudent.  Above all, if data input
     is to succeed you must ensure that <data>.<ext> is an ASCII text file in which,
     following arbitrarily many documentation lines that may but need not include names
     for the variables, each subject's scores are in a line or consecutive block of
     lines with position in the sequence identifying the variable on which this is a
     score.  (Line length is virtually unlimited, so the absurdly long lines written
     by some Windows applications are acceptable.)  Subject IDs, if present, must be
     integers at most 10 digits long (10-digit IDs can begin only with 1 or 2) that
     occur at start of the subject's record. Only base-10 integers or real numbers with
     decimals are accepted as usable data; non-numeric score entries are treated as
     missing.  Presuming that your rawdata file is acceptable (which it will be under
     nearly all conditions of origin), HYDATA's processing of it will then prompt you
     for the following:

     a. Number NV of variables to be read from <data>.<ext>. These needn't be all the
         variables in the file, nor even the first NV in each record if you enter an
         explicit READ format that skips over some.
 
                                  -8-

     b. Format for reading each subject's score record.  You can simply approve default
         option " * " if scores are separated by spaces or commas (; and : are also
         acceptable delimiters) and, in case you have chosen NV to be less than the
         total number of variables in the file, if none of the first NV is to be
         omitted and the NVth is in the record's last line.  Otherwise, you must enter
         Fortran I-code or F-code READ format; but either will do so long as the
         read-fields are correctly specified.  (That is, I-code will also read F-data
         here).  If you are unfamiliar with Fortran, a knowledgeable colleague can
         explain its READ formats to you in a few minutes.
     c. Whether the records begin with ID numbers.
     d. What number, if any, goes proxy for missing scores.  This must be the same for
         all missing data, and cannot also be a real score for some of the variables.
         Non-numeric flags for missing data need not be declared.
     e. Names for each of your data variables.  These can be at most eight characters
         long, must begin with a letter (or, if you have special reason, one of #, $,
         %, &, <, =, >, ?) and can end with one or more numerals so long as no non-
         numeric character occurs after a digit.  You can supply names in several ways:
         (1) They can be entered and revised from keyboard at Name-the-variables time,
         requiring an amount of effort ranging from trivial to tedious depending on the
         number of names and how intricately you choose to differentiate them.  (2) If
         the input datafile lists names for its variables separated by spaces or commas
         in one or more consecutive lines without other text, HYDATA will read the first
         NV of these once you identify the file line where they start.  (You needn't
         remember precisely where that is; a screen display will enable you to locate it
         easily.)  Names longer than 8 characters will be cropped; ones with unwelcome
         composition will be loaded as received but may later cause trouble unless
         revised.  (3) To allow creation or editing of names in a text editor, which for
         complicated namelists you may find more convenient than entry/revision during
         the run, HYDATA looks for an optional ASCII file named <data>.NAM from which
         to load the first NV properly separated words as your variables' default names.
         You can use a text editor to create a NAM-file from scratch, copy and revise
         a namelist from your source datafiles, or revise the NAM-file included in the
         auxiliary output from your last HYDATA run.
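Free-format (" * ") reading as described in item (b) above can be mimicked as
follows.  The delimiter set matches the one stated there, but the parser itself and
the sample record are invented illustrations, not HYDATA's code:

```python
import re

def free_format_read(record, nv):
    """List-directed ('*') reading sketch: scores separated by spaces,
    commas, semicolons, or colons; the first nv tokens are taken as the
    case's scores, with non-numeric tokens treated as missing."""
    tokens = [t for t in re.split(r"[,\s;:]+", record) if t]
    scores = []
    for t in tokens[:nv]:
        try:
            scores.append(float(t))
        except ValueError:
            scores.append(None)   # non-numeric score entry => missing
    return scores

vals = free_format_read("3.1, 2; 7 : M 4.0", 5)
# The letter-coded entry "M" comes back as missing.
```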

     Given this information, HYDATA first tests <data>.<ext>'s readability in the form
     you have stipulated.  If this test is successful, as should usually occur, it will
     with your permission write a D-file transcription of these source data and/or put the
     standardized covariances (correlations) of the input variables into both a transient
     binary file INMODA and an archival ASCII file <data>i.COV. (MODA will accept either
     of these as input.)  And salient facts about this datafile are reported in an ASCII
     file <data>i.SEE that you may want to print.  HYDATA's SEE-file is written only for
     132-column print and does not receive PRNTR code unless you manually copy to its
     head the character string following symbol "%" at the start of file PRNTR.

         When the input to HYDATA is Hydata-standard (a D-file), this procedure is
     much simplified in that steps (a)-(e) are essentially omitted: Apart from initial
     invitation to create a revised D-file, the program jumps immediately to the options
     initiating covariance computation.  Regardless of whether its input datafile is
     Hydata-standard, HYDATA's COV-file production exits with option for one of two
     closing computations: One is to generate up to 126 perturbations of this COV-file
     by bootstrap simulation of sampling noise, each ready for factoring in a different
     COV-file flagged in its basename by "(", ")", or certain other brackets.  The other
     is checking for appreciable nonlinearities in the data relations.  This is done by
     ascertaining, for each ordered pair <Yi,Yj> of data variables, the extent to which
     the variance accounted for by Yj's quadratic regression on Yi is greater than that
     of the corresponding linear regression.
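     The nonlinearity check just described reduces to comparing two R-squared values for
     each ordered pair of variables.  A minimal self-contained sketch (the function names
     are mine; HYDATA's own numerical details may differ):

```python
def _solve(a, b):
    # Gaussian elimination with partial pivoting for a small linear system.
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def r_squared(y, yhat):
    ybar = sum(y) / len(y)
    ss_tot = sum((v - ybar) ** 2 for v in y)
    ss_res = sum((v - w) ** 2 for v, w in zip(y, yhat))
    return 1.0 - ss_res / ss_tot

def nonlinearity_gain(yi, yj):
    """R^2 of Yj's quadratic regression on Yi minus R^2 of the linear one."""
    def fit(degree):
        X = [[x ** p for p in range(degree + 1)] for x in yi]
        k = degree + 1
        ata = [[sum(r[a] * r[b] for r in X) for b in range(k)] for a in range(k)]
        atb = [sum(r[a] * v for r, v in zip(X, yj)) for a in range(k)]
        w = _solve(ata, atb)
        return [sum(wc * r[c] for c, wc in enumerate(w)) for r in X]
    return r_squared(yj, fit(2)) - r_squared(yj, fit(1))
```

     A gain near zero says the linear regression already captures whatever relation
     exists; an appreciable gain flags the pair for closer inspection.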
 
                                  -9-

         Note:  When HYDATA's initial run on a received datafile <data>.<raw> computes
     both its Hydata-standard transcription <base>.D1 and covariances <data>.COV, the
     covariances later computed by HYDATA from input <base>.D1 may differ from <data>.COV
     due to score modifications by the transcription, trivially by rounding of scores
     originally recorded in fulsome detail but perhaps not so trivially by your choice
     in trimming outliers.

 3.  FACTOR EXTRACTION.  Run MODA in the directory containing your data covariances and,
     when prompted, enter the following controls.
     a.  Name of the covariance file to be factored, selected from MODA's on-screen
          listing of the names of all covariance files to which it has current access.
          To factor the
          covariances from your last HYDATA run, simply accept default input INMODA.
     b.  Indices of any variables in the input array that are to be omitted from this
          run's analysis.
     c.  Indices of NX "X-set" variables (usually NX = 0), such as dummy-coded experimental
          treatment conditions, that are taken to be manifest sources of the dependent
          variables. The NY variables that are neither in the X-set nor listed for omission
          are the Y-set (the dependent variables).
     d.  Whether to reflect all variables in a subset whose reflection will decrease the
          number of negative correlations by an amount shown.  If accepted, reflections
          are reported in the run's SEE-file and also flagged in the item-identifications
          passed to HYBALL.
     e.  Number of leading eigenvalues wanted for inspection.  (Be generous in your
          call; all have already been computed at this point, so the only reason to
          keep this small is avoidance of uninformative display clutter.)  When this
          choice is entered, the program partials the X-set out of the Y-set and
          displays this many leading eigenvalues of the Y-set's residual covariances.
          In light of this eigenvalue sequence, you then select:
     f.  One or more choices of the number NF of factors to be extracted from the Y-set's
          residual covariances.
     g.  Factoring method (iterated Principal Factoring vs. Minres vs. normalized dataspace
          Principal Components), iteration limit IMAX, and convergence tolerance TOL (maximum
          communality shift or fit improvement not calling another iteration). Overriding the
          default settings of IMAX and TOL will seldom have much point. (For explanation of
          TOL, see Postscript 7, p. 31.)  You will later have option to loop back to step
          (e) in this sequence with dataspace eigenvalues replaced by eigenvalues of the
          Y-variables' estimated common-parts covariances.  (Update note: Maximum-Likelihood
          and Generalized-least-squares factor extraction are now also method options.)

     MODA then finds the Y-residuals' factor pattern on NF orthonormal common factors, or,
     alternatively, the leading NF normalized principal components, for each stipulated NF,
     appends to it any pattern columns comprising the Y-set's regression on the X-set, and
     sends this joint pattern together with any X-set covariances to unformatted HYBALL-input
     file <data>ij.M<NX+NF>.  In this filename, "<data>ij" is as described in (1) while the
     number following M in its extension is the number NF of latent factors extracted plus
     the number NX of manifest-input variables.  (If NX > 0, HYBALL will later extend this
     pattern's dependent variables also to include the X-set, yielding an (NY+NX)-by-(NX+NF)
     pattern wherein variable NY+J has nonzero loading just on factor J.)  A report on
     salient details of these factor solutions--most importantly their accuracy of data
     reproduction and common-parts eigenvalues, and the Y-variables' uniquenesses--is
     returned in ASCII file <data>ij.SEE. (Also included is a listing of Y-set indices in
     order of decreasing uniqueness to facilitate identifying the items most dispensable
     if you wish to discard some.)
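     As background for step (g), iterated principal factoring can be sketched as an
     eigen-loop that stops on either IMAX or TOL.  The code below is a generic textbook
     illustration, not MODA's: the starting communalities here are the largest absolute
     row correlation, and a plain Jacobi eigensolver stands in for whatever MODA uses.

```python
import math

def jacobi_eig(S, sweeps=50, eps=1e-12):
    """Eigenvalues/eigenvectors of a small symmetric matrix by cyclic Jacobi
    rotations.  Returns (values, V) with eigenvectors as the columns of V."""
    n = len(S)
    a = [row[:] for row in S]
    V = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < eps:
            break
        for p in range(n):
            for q in range(p + 1, n):
                if abs(a[p][q]) < eps:
                    continue
                phi = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(phi), math.sin(phi)
                for k in range(n):            # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
                for k in range(n):            # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):            # accumulate eigenvectors
                    vkp, vkq = V[k][p], V[k][q]
                    V[k][p] = c * vkp - s * vkq
                    V[k][q] = s * vkp + c * vkq
    return [a[i][i] for i in range(n)], V

def principal_factors(R, nf, imax=50, tol=1e-4):
    """Iterated principal factoring: a sketch of the IMAX/TOL loop described
    above.  MODA's starting communalities and internals may differ."""
    n = len(R)
    h2 = [max(abs(R[i][j]) for j in range(n) if j != i) for i in range(n)]
    for _ in range(imax):
        Rh = [row[:] for row in R]
        for i in range(n):
            Rh[i][i] = h2[i]                  # reduced correlation matrix
        vals, vecs = jacobi_eig(Rh)
        order = sorted(range(n), key=lambda k: -vals[k])[:nf]
        L = [[vecs[i][k] * math.sqrt(max(vals[k], 0.0)) for k in order]
             for i in range(n)]
        new_h2 = [sum(l * l for l in row) for row in L]
        shift = max(abs(a - b) for a, b in zip(new_h2, h2))
        h2 = new_h2
        if shift < tol:   # TOL: largest communality shift too small to continue
            break
    return L, h2
```

     The loop's exit condition is the TOL idea of step (g): stop when no communality
     shifts by more than the tolerance, or when IMAX iterations have been spent.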
 
                                  -10-

 3a. IMPOSITION OF SUBSPACE CONSTRAINTS (OPTIONAL).  If your data variables include
     indicators for blocks of factors on which you have posited a causal-path structure, or
     if you want to force certain factors to lie within subspaces spanned by the common parts
     of selected groups of marker variables, running HYBLOCK at this point will reposition
     the initial factors in MODA-output file <data>ij.M<n> to span the hierarchy of subspaces
     that your model wants HYBALL rotation to preserve.  The modified HYBALL-input file
     is named <dat>ij.B<n>, where <n> is a sequential letter index and <dat> is <data>
     after omission if needs be of its last character.  For procedural details, see
     Rozeboom, forthcoming (c).

 4.  ROTATION OF FACTOR AXES.  Run HYBALL to rotate your extracted factor pattern to
     oblique simple-structure under optional fixation of selected factor axes/subspaces.
     HYBALL's first response to a new input pattern is Varimax or Equamax pre-rotation
     of all factor blocks received as orthogonal.  (The input pattern remains retrievable
     by Main Menu Option 6, as first in store, to initiate other rotations should you
     wish.)  Thereafter, the program is highly interactive with too many control options
     to be detailed here. (See Rozeboom, 1991b, for a full account thereof.)  But brief
     documentations adequate to guide your choices on these are available on-screen when
     you select Main-Menu Option 1 for parameter adjustments, and you should have little
     reason to override the default settings of most until your experience with HYBALL
     is considerable. Even so, here are some outset clarifications you may find helpful.
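     For orientation, the kind of orthogonal pre-rotation mentioned above can be sketched
     as pairwise raw varimax.  This is a generic textbook version (no Kaiser normalization),
     not HYBALL's own code; Equamax belongs to the same orthomax family with a different
     weighting constant.

```python
import math

def varimax(A, tol=1e-8, max_iter=100):
    """Raw (unnormalized) varimax by pairwise planar rotations: a minimal
    sketch of an orthogonal pre-rotation, not HYBALL's implementation."""
    A = [row[:] for row in A]
    p, k = len(A), len(A[0])
    for _ in range(max_iter):
        total = 0.0
        for i in range(k):
            for j in range(i + 1, k):
                x = [row[i] for row in A]
                y = [row[j] for row in A]
                u = [a * a - b * b for a, b in zip(x, y)]
                v = [2.0 * a * b for a, b in zip(x, y)]
                # Kaiser's closed-form angle for the (i, j) plane:
                num = 2.0 * (sum(a * b for a, b in zip(u, v)) - sum(u) * sum(v) / p)
                den = (sum(a * a - b * b for a, b in zip(u, v))
                       - (sum(u) ** 2 - sum(v) ** 2) / p)
                phi = 0.25 * math.atan2(num, den)
                if abs(phi) < tol:
                    continue
                total += abs(phi)
                c, s = math.cos(phi), math.sin(phi)
                for row in A:
                    row[i], row[j] = c * row[i] + s * row[j], -s * row[i] + c * row[j]
        if total < tol:
            break
    return A
```

     Applied to an orthogonally scrambled simple structure, this recovers the simple
     structure up to column sign and order.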

     a. Input.  HYBALL does not extract factors; it only rotates the axes of a prior
         solution received as input, mainly though not exclusively M-files from MODA,
         HYBLOCK's B-file modifications of M-files, or HYBUF* logfiles of HYBALL's
         multiple rotations of previous input patterns.  HYBALL begins its run by listing
         all the HYBALL-input files available for loading from the current directory.

     b.  Hyperplane bandwidth BH: Zero +/- BH is the range wherein a factor loading
         (pattern coefficient) is considered to be in that factor's hyperplane; BH between
         .15 and .25 is recommended.  Your choice of BH will appreciably affect the hyper-
         plane counts reported to screen during the run, but will probably not matter much
         for your rotated pattern coefficients so long as it remains in the recommended
         range and parameter CV (see next) is positive.  But under negative CV (no longer
         recommended), varying the setting of BH may indeed have noticeable consequences.
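          The hyperplane count that BH governs is just a tally of near-zero loadings;
          a one-line sketch (the function name is hypothetical):

```python
def hyperplane_count(pattern, bh=0.20):
    """Count, per factor, the loadings within the hyperplane band 0 +/- BH."""
    nf = len(pattern[0])
    return [sum(1 for row in pattern if abs(row[j]) <= bh) for j in range(nf)]
```

          Changing BH merely re-tallies the same pattern, which is why past results can
          be compared under several BH values without calling additional rotations.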

     c.  Within-hyperplane misfit curvature CV:  Under default setting CV = +1 (larger than
         which scarcely matters), the criterion for best hyperplane tries to minimize the
         factor loadings even of data points already within hyperplane band +/- BH.  But as
         CV decreases to -1 (this parameter's lower limit), the rotation criterion becomes
         increasingly indifferent to the size of small loadings so long as they do not
          exceed BH.  So CV = -1 makes the solution for strongest hyperplanes more sensitive
          to hyperplane bandwidth BH than it is under CV = +1, and at times is
         surprisingly good at cleaning out weakly salient loadings.  However, CV = -1 has
         proved much inferior to CV = 1 at recovering source patterns from complex
         artificial data.  Unless you want to fool around, leave CV at default.

     d.  Rotation constraints: If your MODA-factoring included an X-set, you will want the
         factors defined by these input variables to remain fixed under rotation.  Or you
         might like to ignore certain factors altogether.  Or you may conjecture a causal-
         path structure on blocks of the factors received by HYBALL that you want the
         rotation to heed.  These rotation constraints are set by your assigning a block
         code (FIX) to each initial factor and, in more complex cases, a partial ordering
         of the blocks as well.  You may not find the on-screen explanation of this FIX
         coding entirely clear; however, you should seldom need to override its default
         settings.  For HYBALL receives X-set factors from MODA, or factors on which a
         block structure has been imposed by HYBLOCK, with the correct FIX codes already
 
                                  -11-

         in place.  Unless you want to mark some factors for omission or to free up the
         received FIX constraints, you can simply step through the list of FIX assignments
         for inspection without change or can ignore them altogether. Although these FIX
         codes can be altered anytime during the run, reason for that will seldom arise.
         But this same Main-Menu branch also gives access to holding the pattern on
         selected factors essentially constant during any sequence of rotations--which
         you may find useful when you are happy with the distribution of loadings on some
         factors and prefer not to jeopardize this when continuing search for pattern
         improvements on the others. (The benefits I have obtained from this option
         have so far been disappointing; but maybe you will have better luck with it.)

     e.  Other rotation-control parameters:  HYBALL solves for simple structure under its
         FIX-coded rotation constraints by iteratively minimizing, for each factor in each
         pattern plane wherein it can move, a hyperplane-misfit measure that increases
         with the size of loadings its positioning imparts to the other factor in this
         plane.  The parameters that select this measure's specific curvature and controls
         are available with documentation under Main Menu Option 1.  However, tuning away
          from default values accomplishes little in most applications.

              Most important of these control options additional to BH and FIX is the
         solution's MODE, that is, its algorithm for minimizing the hyperplane-misfit
         measure. The primary mode alternatives are Step-down Regression (STEP) vs.
         Brute-force Scanning (SCAN).  STEP, an initial rough solution followed by
         iterated "polish stroke" refinements, is more susceptible than the other to
         global convergence failure, and for complex patterns is more likely to end in
         a local misfit minimum.  But it is also much faster than SCAN (though at modern
         speeds that scarcely matters), and the merely-local minima it sometimes finds
         may well be worth attention.  Complex factor configurations often sustain two
         or more distinct axis placements whose hyperplanes are of sufficiently high
         quality to merit interpretive consideration.  Such solution alternatives, which
         are best identified by Spin search (see Point k, below), tend to be found by
         STEP at a grain of resolution somewhat coarser than SCAN results.  Simulation
         studies indicate that the latter are prevailingly superior to the former,
         but less so than might be expected and not under all circumstances.
              HYBALL's rotation-mode options now also include a second facet, Serial vs.
         Parallel iteration, that cross-cuts SCAN vs. STEP.  Serial-SCAN is appreciably
         faster than Parallel-SCAN (under STEP, the timesaving doesn't really matter),
         and should be just as satisfactory for most applications. But under Spin search
          (see below), Parallel is preferable in principle even though its potential gain
         may not often materialize.  (See Rozeboom, forthcoming (b), for the nature and
         performance appraisals of this distinction.)  And Oblimin rotation, recommended
         for patterns with especially low factor complexity, is now also provided.

     f.  Item omissions: One of your Main-Menu options is on-screen inspection of planar
         pattern plots, in light of which you can fine tune this rotation's continuation
         by stipulating particular items (variables) to be ignored in particular planes.
         Use this option when certain points appear to be pulling a hyperplane away from
         a position marked most strongly by other points.  An item whose pattern point
         is listed for disregard in one pattern plane is not ignored in any other unless
         explicitly listed there as well; a plane's OMIT items can be entered while the
         plane plot is on screen; and once entered, a plane's OMIT list remains in force
         throughout subsequent rotations until you alter it.  (Item omissions are often
         less helpful, or have stranger effects, than one might expect.  Even so, they
         can be interesting to play with when the number of variables is modest.)
 
                                  -12-

     g.  Accumulated records: At each pause in a HYBALL run, you have the option of
         revising your control parameters before continuing rotation from the pattern
         currently attained.  But such changes do not always prove beneficial; so HYBALL
         retains a record (in HYBUF) of the pattern attained at each previous pause,
         together with its control settings when recorded, and allows any one of these
         to be reinstated after you inspect a simultaneous hyperplane-count listing
         for all. Hyperplane counts are always shown for the current hyperplane-width
         setting, so you can compare past results under a variety of BH values by
         repeatedly changing this parameter without calling additional rotations.
         (Hyperplane count proves to be only a rough guide to hyperplane quality,
         however.  See Postscript 6a, p. 30 below.)

     h.  Congruence reports.  Main-Menu Option 8 is comparison of the current factor
         pattern (which can be any recalled from HYBUF store) to other stored patterns.
         The report tells for each column of the current factor pattern how similar it
         is, measured by angle of congruence divergence, to its closest match in the
         other pattern.  (NOTE: The congruence of two conforming vectors of real
         numbers is their uncentered correlation, and Hyball defines their "divergence"
         as the arc-cosine of their unsigned congruence.)  When you have trouble
         deciding which of several near-final solutions to choose, congruence reports
         can show whether there is really much difference among these alternatives and,
         if there is, which factors (or more precisely their pattern coefficients) are
         most unstable across these solutions. For more detailed comparisons among
         your stored solutions, however, suspend HYBALL operation long enough to study
         HYLOG's report on your HYBUF logfile.

     i.  Permutation/reflection of factors:  Should you wish to do so, you can freely
         permute and reflect the axes of your current factor pattern. (Permutations
         extend to the factor-conditional controls in FIX and OMIT, but do not affect
         results stored previously except as stipulated--see below.)  When HYBALL
         receives a pattern of NY dependent variables on factors that include manifest
         inputs (a MODA X-set), it augments the Y-set by markers for the latter in
         original order--i.e., the Jth X-set factor becomes also the NY+Jth dependent
         variable--so that after permutation you can still identify which factors are
         what manifest inputs.  Permutations/reflections whose details are stipulated
         at keyboard are stored in HYBUF as additional solutions.  But without adding
         to the collection, you can permute some or all solutions already stored into
         maximal-congruence alignment with one selected as an ordering template just
         by stipulating the indices of records to be so permuted.  You can also command
         permutation of the active factors into order of decreasing "importance"
         defined for each factor as the Root-Mean-Square of its item loadings.
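          The "importance" ordering at the end of (i) is a one-line sort on the stated
          definition, the Root-Mean-Square of each factor's item loadings; a sketch
          with factors held as columns:

```python
import math

def by_importance(columns):
    """Permute factor columns into decreasing 'importance' = RMS of loadings."""
    def rms(col):
        return math.sqrt(sum(l * l for l in col) / len(col))
    return sorted(columns, key=rms, reverse=True)
```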

     j.  Recovery from interaction errors.  Most choices you make during a HYBALL run are
         reversible.  But if you make a non-recoverable error, such as calling printer
         plots for all factor planes when you only wanted one or two, simply abort the run
         by hitting DOS break CTRL-C, restart HYBALL and, when prompted, opt for resumption
         of your previous run.  You can quickly return to the point where you went astray.

     k.  Spin search. For rotation problems supporting a plurality of good-hyperplane
         solutions, which one HYBALL rotation recovers may be significantly dependent on
         the rotation's starting position.  The Main Menu's Spin option allows you two
         forms of flexible control over this. (a) Some or all of your freely moveable
         factors can be aligned with selected variables.  And (b) all factors not FIX-
         constrained or aligned with variables are positioned randomly by SPIN before
          rotation under your chosen rotation parameters.  SPIN repeats random starts
          followed by rotation until either it reaches the repetition limit you have set
 
                                  -13-

          or you exercise an option to discontinue earlier.  These Spin solutions, saved
         in a scratchfile, are then ranked on quality by your current choice of hyper-
         plane misfit parameters, and the best of them that are sufficiently distinct
         from one another (your choice of how many and how distinct) are added to HYBUF
         store.  See Postscript 4, p. 28f., for more detailed comment on this option.

     l.  Rotation pauses.  When you command HYBALL to rotate the current pattern, the
         rotation iterates until either the largest axis shift is less than a small
         adjustable tolerance, or the adjustable limit on number of iteration cycles
         is exceeded.  Unhappily, it has proved difficult to terminate HYBALL's rotation
         iteration prior to these programmed pauses by entering Ctrl-C.  But computation
          has become so fast on modern computers that the time required even for Spin
         search with large patterns is no longer onerous.

     Finishing.  When you elect to stop, HYBALL writes one or more patterns of your
     choice, together with considerable additional information if wanted, to ASCII file
     $<dat>ij.H<n> which you can edit--deleting unwanted material, adding comments,
     inserting page breaks, etc.--before sending it to your printer.  (Use a text-editor
     for this; your word-processor will mess up the file's column spacing if you forget
     to disable right-justification.)  In addition, HYBALL provides two other output
     files (besides MODA-input file HYF<da>ij.C<n> of factor correlations) having
     considerable prospective value for you.  One is the unformatted accumulation of
     rotations that you were able to recall during the run.  This logfile is named
     HYBUF<info> for your choice of <info> which initially is null but is best expanded
     to identify the originating pattern.  (When HYBALL prepares to exit with <info>
     still null, it offers to make this an informative fragment of the sourcename; but
     you are free to decline that or later change it using DOS command RENAME or COPY.)
     Unless you delete or overwrite HYBUF<info>, you can later load it into HYBALL for
     resumption of your present run, or into HYLOG/TWOLOGS for detailed comparative
     study of the factor patterns it archives.  The other auxiliary HYBALL-output file,
     FAC<da>ij.H<n>, is an ASCII record of your selected rotation in HYBALL-input
     format.  Although this too can be used to resume HYBALL rotation (albeit with no
     transfer of control settings other than FIX codes) and, unlike binary files, can
     be reliably transferred to other machines, its main use is for HYFAC construction
     of factor scales.  (See Instructions 5, below.)

 4a. APPRAISING ARCHIVED FACTOR PATTERNS (OPTIONAL).  Solution evaluations provided
     during a HYBALL run seldom suffice to identify one rotation as clearly superior
     to all the others.  (HYBALL's hyperplane counts show mainly which solutions are
     inferior; its congruence reports tell only the symmetric degree of difference
     between two solutions, not their asymmetry in quality; and its on-screen pattern
     displays are too dispersed in space and time for perceptive comparisons between
     similar results.)  Before choosing one or two solutions to be your payoff results
     (that is, your focus of interpretation, journal reporting, and construction of
     factor scales for subsequent research), it is strongly advisable that you examine
     the ensemble of rotations collected in one or more HYBUF archives of factor
     solutions with the same rawdata origins through the eyes of HYLOG and TWOLOGS.
     These report assorted holistic features of the archived solutions, some of which
     are nonrelational measures yielding preference rankings in these particular
     respects, while others expand upon the congruence comparisons afforded by HYBALL.
     And these assessments are recorded in SEE-files that you can study in hard copy
     or a text editor by whatever manipulations and pacing you find most edifying.
 
                                  -14-

         HYLOG examines the solutions in a single HYBUF archive, starting with a
     histogram of the variables' communalities (not features wherein these solutions
     differ, but a holistic view thereof that you may find interesting) and the
     derivational history of this sequence of rotations.  Next it proffers a menu of
     nonrelational pattern appraisals from which you can select repeatedly.  Picking
     one of these with some choice of its parameters produces screen display and
     SEE-file recording of a table that shows for each rotation in this logfile its
     vector of details in this respect.  Five such measures are currently installed
     in HYLOG: Hyperplane counts as in HYBALL with BH (hyperplane bandwidth) as
     parameter; two measures of factor complexity (one with parameter BH, the other
     parameter-free); a measure of gappiness in the distribution of small factor
     loadings that might conceivably implicate a natural hyperplane boundary; and
     hyperplane misfit measured by your choices of a Loss function from the ones
     available for HYBALL rotation to minimize.  (Since HYLOG reports misfit for each
     factor in each rotation scaled in proportion to the smallest within-pattern mean
     misfit in this collection, this measure is not entirely nonrelational.)  Next,
     for each pair of this logfile's solutions in a chosen subset of these (all if
     you wish), HYLOG tables the pair's vector of matched-factor divergences together
     with its mean, the same as shown by HYBALL's Main-Menu Option 8.  And finally,
     derived from these matched-factor divergences are (a) a Factor Centrality table
     of how frequently each of a solution's factors recurs in this solution set, and
     (b) cluster analysis of the similarities among these solutions.

         TWOLOGS has been designed primarily to compare HYBALL rotations between two
     different logfiles HYBUF<a> and HYBUF<b>, though <a> = <b> is allowed and proves
     useful.  It is presumed that HYBUF<a> and HYBUF<b> have the same rawdata origin
     or at least that any item name common to these logfiles points to the same data
      variable.  (HYBUF* logfiles do not contain item names, but each identifies the
      COV-file from which those can be recovered.)  It is NOT presumed that the number
      of factors NF(a) in the HYBUF<a> solutions equals the number NF(b) in HYBUF<b>,
      nor even that the
     two logfiles contain patterns for exactly the same subset of the rawdata variables.
     Typing " TWOLOGS " at the DOS prompt in a directory containing HYBUF<a> and HYBUF<b>
     enables you to select these from the list of logfiles accessible there.  (If the
     COV-files from which these originate are elsewhere, you will also be asked to
     identify the directory where those can be found.)  TWOLOGS treats these two
     logfiles as an ordered pair--think of this as <leftlog,rightlog>--which TWOLOGS
     permutes from the provisional order entered if needed to insure that the number of
     leftlog factors is never less than the rightlog number.  Presume that when loaded,
     HYBUF<a> is your leftlog while HYBUF<b> is on the right.  (This will be so whenever
     NF(a) > NF(b); but if NF(a) = NF(b), the one you enter first becomes leftlog.)
     TWOLOGS first displays, for your choice of Loss function, each pattern collection's
     hyperplane misfits relative to the smallest within-pattern mean in both logfiles.
     (At present, misfit is computed over all variables in each pattern even when some
     items in one collection do not occur in the other.)  This information, which you
     should have already studied in HYLOG output, is exhibited here to remind you which
     patterns seem best by your chosen misfit measure and to rank them within each
     logfile in case you want similarity comparisons only between the higher-quality
     solutions.  Next, you list for each logfile which subset of patterns in one you
     want compared to which subset in the other.  (Several different ways to enter
     your lists are provided, one of which should prove effortless for what you want.)
     You also enter how many patterns leading each list are to be primed for possible
     JP-comparison to primed patterns in the other.  ("JP" abbreviates "Juxtaposed
     Patterns", which is clarified below.)  Then for each pattern Ri in the rightlog
     list, a subtable is written to SEE-file $2<da>. showing for each leftlog pattern
     Lj the divergence of each factor in Ri from the best-matching factor in Lj.  And
     if Ri and Lj have both been primed, you are invited to record their JP-comparison
     in companion SEE-file $2<da>.P.  Thereafter, you are asked whether also to record
 
                                  -15-

     the congruence comparisons just tabled by a sequence of histograms showing for
     each pattern pair <Ri,Lj> the distribution of their matched-factor divergences.
     And finally, after screen display of the matched-factors divergence distribution
     over all between-list pattern matchings combined, you are invited to record some
     more holistic between-log similarity comparisons relative to a divergence GAP,
     chosen in light of the divergence distribution still on screen, as threshold for
     regarding two factors as a close match.

     Note 1. JP-comparison of Lj to Ri displays these two patterns in a table comprising
        NF(b) columnar cells, each of which contains on its right one column of Ri while
        in close proximity on its left is the best-matching column of Lj.  Loadings
       are shown to two decimals with point omitted, with the exception that paired
       loadings for the same variable are blanked if they are both smaller in size than
       a stipulated cutoff.  Such a display makes it easy to perceive precisely where
       and how severely these two patterns differ.  In addition, at the foot of these
       matched-factor columns are summaries telling for each cell of this JP table (1)
       the congruence divergence (degrees of angle) between the matched Lj/Ri loadings
       on this factor, (2) the RMS difference between these matched loadings, (3) the
       RMS size of loadings in this column for Lj, and (4) the same for Ri.  Summaries
       (1)-(4) are given separately (a) for all loadings on this factor including ones
       smaller than the display cutoff, and (b) just for the loadings that pass the
       display cut.
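      Summaries (1)-(4) of Note 1 can be computed directly from one matched column pair.
      In this sketch the function name is mine, and I read "blanked if they are both
      smaller than the cutoff" as: a pair is displayed when at least one of its two
      loadings reaches the cutoff.

```python
import math

def jp_summaries(lj_col, ri_col, cutoff=0.25):
    """Summaries (1)-(4) for one matched column pair of a JP table, computed
    (a) over all loadings and (b) over the pairs that pass the display cut."""
    def stats(pairs):
        if not pairs:
            return None
        l = [p[0] for p in pairs]
        r = [p[1] for p in pairs]
        num = sum(a * b for a, b in zip(l, r))
        den = math.sqrt(sum(a * a for a in l) * sum(b * b for b in r))
        div = math.degrees(math.acos(min(1.0, abs(num / den))))   # (1)
        rms = lambda v: math.sqrt(sum(x * x for x in v) / len(v))
        return (div, rms([a - b for a, b in zip(l, r)]), rms(l), rms(r))  # (1)-(4)
    all_pairs = list(zip(lj_col, ri_col))
    shown = [p for p in all_pairs if max(abs(p[0]), abs(p[1])) >= cutoff]
    return stats(all_pairs), stats(shown)
```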

     Note 2. The availability of JP-comparisons among finalists in the contention to be
       your choice of payoff pattern is the main reason why you will want to run TWOLOGS
       on single HYBUF logfiles as well as on pairs thereof.

     Note 3. To enable the names of HYBALL logfiles to encode more identification than
        allowed by headname HYBUF<info>, <data>ij.#<n> also names a logfile of rotated
       factor solutions derived from rawdata with headname <data>.


 5.  ESTIMATING FACTOR SCORES (OPTIONAL).  Run HYFAC to learn how accurately your HYBALL
     factors can be estimated by item scales, and especially if you want RESCORE to
     estimate factor scores for the subjects whose raw scores are the source of your
     HYBALL results.  Once you have selected a HYBALL-rotated factor pattern from the
     list of available FAC* files displayed on screen when HYFAC is called, you will
     first be shown the received pattern in order to assure you that you have the right
     file, followed by a short delay while the item covariances are reconstructed from
     the factor pattern/covariances.  (The program computes idealized regressions using
     these reconstructed covariances rather than their exact data values. There is some
     evidence that this slightly lessens the sampling noise in regression weights.)
     You will be asked to choose a set of the items--some or all--whose regression
     estimates of the factors you want to study, and you can also elect to ignore some
     of the factors. The SMCs (squared multiple correlations) for these regressions
     are displayed on screen and, at your option, are recorded together with the
     corresponding standardized regression coefficients (Beta-weights) in ASCII file
     SEE<da>ij.F<n>.
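      The regression quantities HYFAC reports follow from the reconstructed covariances
      the text describes: with pattern P and factor correlations Phi, item correlations
      are rebuilt as R = P*Phi*P' with unit diagonal, the Beta-weights are R^{-1}*P*Phi,
      and each SMC is the corresponding diagonal entry of Phi*P'*R^{-1}*P*Phi.  A small
      pure-Python sketch (names mine; HYFAC's internals may differ):

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def solve_multi(A, B):
    """Solve A X = B by Gauss-Jordan elimination (small systems only)."""
    n, m = len(A), len(B[0])
    M = [A[i][:] + B[i][:] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(n):
            if r != c and M[r][c]:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [[M[r][n + j] / M[r][r] for j in range(m)] for r in range(n)]

def factor_score_regression(P, Phi):
    """Beta-weights and SMCs for regression estimates of the factors, using
    item correlations reconstructed from the pattern and factor correlations."""
    n = len(P)
    S = matmul(P, Phi)                        # structure: item-factor correlations
    R = matmul(S, [list(r) for r in zip(*P)]) # P Phi P'
    for i in range(n):
        R[i][i] = 1.0                         # uniquenesses fill the diagonal
    Beta = solve_multi(R, S)                  # R^{-1} P Phi, one column per factor
    smc = [sum(Beta[i][j] * S[i][j] for i in range(n)) for j in range(len(Phi))]
    return Beta, smc
```

      For a single factor with loadings l, the resulting SMC equals t/(1+t) with
      t = sum of l^2/(1-l^2), a convenient check on the arithmetic.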

         Once HYFAC has computed factor regressions on the items or item subset, you
     are invited to convert these into practical item scales for the factors.

            "Practical" here means that on the scale for each factor, you would
            like (a) to discard items that are essentially irrelevant, (b) to
            define the scale in terms of the item raw-scores, and (c) either to
            avoid differential item weights altogether except for sign or to use
            only low-integer weights, say from 1 to 2, 3, or 4 in magnitude.
 
                                  -16-

     HYFAC first lets you observe the effect, for each factor in turn, of discarding the
     items having the smallest standardized regression weights for that factor.  For each
     cut you care to inspect, you are shown the loss in SMC of replacing regression
     weights lower than this by zero.  After you have found the highest cut on each
     factor whose SMC loss you consider tolerable, HYFAC converts the non-zeroed
     coefficients to raw-score regression weights, multiplies them by a scaling constant
     that makes the largest raw-score weight equal to your choice of integer MaxWt, and
     rounds the rescaled item weights to the nearest integer.  The result is a raw-score
     scale whose item weights are +/- integers 1 to MaxWt.  You are allowed repeated
     choices of MaxWt followed by information on the SMC loss so incurred, and you should
     find that you can push MaxWt quite low, say 10 or less, before this loss becomes
     appreciable.  For MaxWt less than 4 or 5, however, strict application of this
     procedure may round to zero the weights of more of the minor items than the scale
     can afford to lose.  So HYFAC also allows you to enter, along with MaxWt, an equalizer
     digit Eq that shrinks the proportionate differences between the large and small
     regression weights for each factor.  With Eq set high, say 4 or more, you can make
     MaxWt as low as 1 (no weighting at all except for reflections) and still retain a
     goodly number of the items whose regression weights are considerably less than the
     strongest. You may well conclude that the scoring convenience of this is not worth
     its loss in accuracy; but enabling you to search out an optimal balance between
     accuracy and convenience is precisely HYFAC's point.  For each revision of MaxWt or
     Eq, the complete listing of raw-score item weights so derived for each factor is
     filed in SEE<da>ij.F<n> along with the SMC information for editing and hard-copy
     study. Each set h = 1,2,... of item weights recorded in HYFAC's SEE-file is also
     stored in an ASCII file <data>ij.Wh, where <data>ij is the basename of the MODA
     pattern rotated into the factors for which HYFAC offers these item scales.  These
     W-files can be imported by RESCORE to compute scores on item composites defined
     by these item weights, and are named on screen when a RESCORE run invokes this
     option.  Be warned, however, that HYFAC always deletes all old files with
     extension .Wi (i any digit) from the active subdirectory before writing new ones;
     so take care to rename the ones you want to save with a non-numeric extension.
     Any whose extension continues to start with W will be included in RESCORE's
     listing of available weight files.
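         The rescale-and-round step for a single factor's weights amounts to something
     like the Python sketch below.  (This is a minimal illustration, not HYFAC's code:
     the Eq equalizer and the SMC-loss reporting are not modeled, and the raw-score
     weights are invented.)

```python
def integer_scale_weights(raw_weights, max_wt):
    """Rescale a factor's raw-score regression weights so the largest
    magnitude equals max_wt, then round each to the nearest integer;
    weights that round to zero drop their items from the scale."""
    biggest = max(abs(w) for w in raw_weights)
    scale = max_wt / biggest
    return [round(w * scale) for w in raw_weights]

# invented raw-score weights for a five-item scale, with MaxWt = 4
weights = integer_scale_weights([0.42, -0.35, 0.20, 0.11, 0.04], 4)
```

     With MaxWt this low, the weakest item rounds to zero and is in effect discarded
     from the scale.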

         Finally, be alerted that HYFAC's SEE-report on each selection of item weights
     includes the covariances among the so-weighted item scales that result from the
     unique components of shared items. If these uniqueness correlations are appreciable,
     they will likely degrade factoring of data arrays which include these item scales
     among their variables.  When these uniqueness correlations are known, however, it is
     straightforward in principle to subtract them from the item-scale covariances before
     those are factored. If interest warrants, future updates of this package may include
     a program for making such adjustments on covariance matrices submitted to MODA.


 6.  ALTERNATIVE DATA ENTRY.  There may be times when you would like to factor a
     covariance matrix or rotate a factor pattern whose originating raw scores are not
     available to you.  (E.g., the data may consist of published correlations, or be
     classical test cases such as Thurstone's Box patterns.)  Program ENTRY enables you
     to construct input files for MODA and HYBALL by entering elements from keyboard.
     Although ENTRY -- written when my programming expertise was still primitive -- is
     clumsier than a full-screen text editor or spreadsheet (your best alternatives), it
     lets large matrices be entered piecemeal in smaller blocks; and its provision for
     adding/deleting rows/columns at margin, and transposing blocks of consecutive rows
     or columns to the array's end, permits insertion of rows/columns inadvertently
     omitted, or deletion of ones duplicated, without starting over.  ENTRY also allows
     you to avoid typing decimals; and since its working file (INWORK) is in ASCII, you
     can make spot corrections with a text editor if the file is not overly large. And
     for entering a correlation matrix to factor or a pattern to rotate, its output is
     readied for MODA or HYBALL, respectively.
 
                                  -17-

         Moreover, if you want to factor covariances or rotate patterns already in
     ASCII computer files, only a few lines of code are needed to rewrite those in
     format that MODA or HYBALL can read.  ASCII files MODCODE and HYCODE in the
     distribution package will explain how to do that.


                                   Wm. W. Rozeboom
                                   Department of Psychology
                                   University of Alberta
                                   Edmonton, Alberta T6G 2E9
                                   e-mail:  rozeboom@psych.ualberta.ca























 ======================================================================================

                             HYDATA-SUPPLEMENT PROGRAMS

     In addition to computing normed covariances (correlations) from your raw
 data, HYDATA also offers to transcribe your datafile into a standard format
 presupposed by Hydata-supplement programs that can manipulate your data
 in ways you may well find useful.  Hydata-standard datafiles (D-files) are
 identified by an extension starting with D, and are accepted as input by:
 FIXDATA, which estimates missing scores; MERGE, which combines two or more
 Hydata-standard datafiles; SELECT, which copies scores from some or all records
 in a datafile on some or all of its variables into a separate file; and RESCORE,
 which allows new variables to be derived by various nonlinear rescalings or linear
 combinations of selected old ones.
 
                                  -18-

 HYDATA-STANDARD FORM:  Although datafiles having this form are in ASCII, their size can
     easily exceed the limits of practical text-editing or viewing.  The first few lines
     of any Hydata-standard datafile convey information about its origin and content.
     (Any text-editing of these lines risks making the file unreadable by the Hydata-
     supplement programs.)  Next come one or more lines listing names assigned to the
     variables, and, after that, arbitrarily many data records, one for each subject.
     Each data record begins with a width-K field containing the subject's ID number
     followed by ":", where K-1 is the number of digits in the file's largest ID, and
     concludes with NV fields of width 3 (or 4) containing this subject's scores on NV
     variables, rescaled if necessary by some integer power of 10 to have integer values
     in the interval [-98, 999] (or [-998, 9999]).  Missing data are coded -99 (or -999);
     and outlier scores that deviate from the variable's mean by more than DEV standard
     deviations (your choice of DEV) are also coded as missing.  Each record is split into
     50-score (or 45-score) rows, each line after the first starting with a width-K blank
     that aligns scores down rows.  Variables in Hydata-standard datafiles generally use
     all the fieldwidth allotted to them, so scores on adjacent variables are often not
     separated by spaces.  Hydata-standard datafiles always order their score records by
     increasing ID number (IDs are assigned consecutively if the source file does not
     provide them); and, when written by HYDATA or MERGE, their variables are by default
     (which you can override) ordered alphanumerically by name.
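     A record in this layout can be unpacked with a few lines of code.  The sketch
     below (Python, with an invented two-line record; HYDATA itself reads these files
     with Fortran formats) parses one record whose ID field is two characters wide and
     whose score fields are three wide:

```python
def parse_record(lines, id_width, field_width=3):
    """Parse one Hydata-standard data record: a width-K ID field ending
    in ':', then fixed-width score fields packed without separating
    spaces.  Code -99 marks a missing score."""
    subject_id = int(lines[0][:id_width].rstrip(':'))
    scores = []
    for line in lines:
        body = line[id_width:].rstrip('\n')
        for i in range(0, len(body), field_width):
            field = body[i:i + field_width].strip()
            if field:
                scores.append(int(field))
    return subject_id, scores

# invented record: ID 7 in a width-2 field, five width-3 score fields,
# split across two rows; the second row starts with a width-2 blank
sid, scores = parse_record(['7:123-99  7', '  456 88'], 2)
```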

     UPDATE:  Hydata-standard format has been upgraded to ensure that on each data
     variable, the most extreme scores not treated as outliers retain 3-digit accuracy.
     This results in D-files having 3-place fieldwidths when no scores are negative, but
     4-place fields when that is needed to accommodate minus signs.  To achieve this fit,
     scores as received are scaled up (generally when the raw data include decimals) or
     down (when the raw data have more than 3 leading digits) by positive or negative
     powers of 10.  These scaling powers (one for each variable) are listed in the
     D-file's header so that the data's original scaling can be recovered to 3-place
     accuracy should that be wanted.
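     One way such a scaling power could be chosen is sketched below in Python; the
     actual rule HYDATA applies may differ in detail.

```python
def scale_power(values):
    """Integer power of 10 that scales a variable's most extreme score
    into three significant integer digits (magnitude in [100, 999])."""
    biggest = max(abs(v) for v in values)
    if biggest == 0:
        return 0
    power = 0
    while biggest >= 1000:     # too many leading digits: scale down
        biggest /= 10.0
        power -= 1
    while biggest < 100:       # decimals or small scores: scale up
        biggest *= 10.0
        power += 1
    return power

# invented raw scores with decimals are scaled up by 10**2 here
p = scale_power([1.23, 4.56, 9.87])
```

     The power is the base-10 exponent by which scores are multiplied before rounding;
     dividing by 10**power recovers the original scaling to 3-place accuracy.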

 HYDATA-STANDARD FILENAMES:  All Hydata-standard datafiles receive a name of form
     <base>.Di for some integer i between 1 and 99.  When file <base>.D1 is created by
     running HYDATA on some source datafile, <base> (up to six characters) is chosen by
     the user.  Thereafter, by default, each datafile derived by one of the Hydata-
     supplement programs from a preceding <base>.Di with this particular basename for
     some i is assigned name <base>.Dj for the smallest j not already in the extension
     of an extant Hydata-standard datafile with this same basename.  Salient details on
     the origin of each file with that basename are accumulated in ASCII file <base>.LOG.
     Alternatives to the naming default are:  (1) The output file can be given a new
     basename.  And (2), each of these programs except HYDATA and FIXDATA allows its
     input file (or for MERGE the first input file) to be replaced by its output file
     under the same name as the one replaced.
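     The default naming rule can be mimicked as follows (a Python sketch over a supplied
     list of existing filenames; the real programs scan the working directory instead):

```python
def next_dfile_name(base, existing_names):
    """Return <base>.Dj for the smallest j in 1..99 not already taken,
    mirroring the default naming of Hydata-standard datafiles."""
    taken = {n.upper() for n in existing_names}
    for j in range(1, 100):
        candidate = '%s.D%d' % (base, j)
        if candidate.upper() not in taken:
            return candidate
    raise RuntimeError('all 99 extension indices are in use')

# with TRIAL.D1 and TRIAL.D2 present, the next file is TRIAL.D3
name = next_dfile_name('TRIAL', ['TRIAL.D1', 'TRIAL.D2', 'TRIAL.LOG'])
```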


                                   PROGRAM DETAILS

 HYDATA:  Use of this to transcribe source datafiles into Hydata-standard form has been
     explained above (p. 7f.).  But you will also want to run HYDATA on datafiles that are
     already Hydata-standard.  Primarily, this occurs when you need D-file covariances.
     (Take care to appreciate that covariances computed by HYDATA on the same run that
     produces a D-file describe the source datafile, not the new D-file.)  But you might also
     do this to revise the variables' names or to impose tighter exclusion of outliers.

 FIXDATA:  When called, this scans the local directory for D-files and if any are found
     invites you to pick one for estimating its missing scores.  Basically, FIXDATA
     does this by regressing the variables on which a given record's scores are missing
     upon the variables for which this record provides good data.  But the computation
     is a variant thereof that is robust against dimensions of data-space containing
 
                                  -19-

     negligible data variance.  This is accomplished by iterated imputation: the current
     correlations, computed with the latest estimates of the missing scores, are solved
     for principal axes, and the weakest axes are ignored when revising the imputations.
     How many axes to discard is a running user option guided by current on-screen
     information about the items' eigenstructure.  This procedure encourages you to
     decrease the number of ignored axes as the iteration proceeds; however, the best
     strategy for pacing this still awaits adjudication.  (Early returns suggest that
     results may be surprisingly insensitive to how its running options are chosen.)
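         A generic version of this iteration, shorn of FIXDATA's refinements (its
     running options, score cropping, and standard-error reports), might look like the
     numpy sketch below; the five-subject demonstration dataset is invented.

```python
import numpy as np

def impute_by_axes(data, missing, n_axes, n_iter=30):
    """Iteratively impute missing scores by regenerating them from the
    strongest principal axes of the current correlation matrix, with
    the weakest axes ignored."""
    x = np.array(data, float)
    miss = np.asarray(missing, bool)
    # start each missing cell from the column mean of the good data
    for j in range(x.shape[1]):
        x[miss[:, j], j] = x[~miss[:, j], j].mean()
    for _ in range(n_iter):
        mean, sd = x.mean(0), x.std(0)
        z = (x - mean) / sd
        evals, evecs = np.linalg.eigh(np.corrcoef(z, rowvar=False))
        v = evecs[:, -n_axes:]            # keep the strongest axes
        z_hat = z @ v @ v.T               # project onto retained axes
        x[miss] = (z_hat * sd + mean)[miss]   # revise only missing cells
    return x

# invented demonstration: x2 duplicates x1, with x2's fourth score
# missing (mask 1); the placeholder 0 is overwritten by the iteration
demo = impute_by_axes([[1, 1], [2, 2], [3, 3], [4, 0], [5, 5]],
                      [[0, 0], [0, 0], [0, 0], [0, 1], [0, 0]], n_axes=1)
```

     With one missing score on the second of two collinear variables, retaining a
     single principal axis pulls the imputation toward the value the other variable
     implies.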

         FIXDATA crops its score estimates, if necessary, to lie within the variables'
     observed ranges.  But information about their uncropped deviancies (Z-scalings) is
     summarized in the dataset's LOG-file and detailed in a SEE-fix report <base>.FXi
     that lists each imputed score, together with their estimated standard errors and
     more holistic information about convergences and sizes of the imputations, on
     each pass of the iterated solution.  SEE-fix extension index i avoids premature
     overwriting of performance information for previous runs; and you can decide which
     sequence of procedure choices seems best for the imputation task at hand by
     examining FIXDATA's performances over a series of dry runs that collects SEE-fix
     reports under varied procedure options without actually generating an imputed
     datafile until you are ready for that.

         For datafiles appreciably afflicted with missing data, program RESCORE can
     create missing-data "shadow" variables which FIXDATA can include in the
     information from which it estimates missing scores.  See Note 3, p. 22 below.

 MERGE:  This allows two or more Hydata-standard datafiles F1,F2,... to be combined into
     a Hydata-standard file <base>.Dj, where <base> defaults to the basename of F1 and
     may or may not also be the basename for any of the other Fi.  (The derivational
     history of any merged file Fi will be included in <base>.LOG only if its basename
     is also <base>.)  The IDs of records and names of variables in <base>.Dj are the
     respective unions of IDs and variable names in the Fi.  For each Fi in order, the
     score in Fi at a given ID/variable address overwrites any score copied into
     <base>.Dj at this same address from any file preceding Fi in the input-file
     sequence.  Addresses at which none of input files F1,F2,.. contain a datum are
     marked as missing.  The combined IDs in <base>.Dj are ordered sequentially, and
     at user option the merged variables are ordered either alphanumerically or, apart
     from overwriting, in the order received.  Variable names cannot be changed in
     MERGE; and the input files F1,F2,... are unaltered unless the user opts for the
     merged file to replace F1 under that same name.
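         MERGE's overwrite rule is easy to state in code.  In the Python sketch below
     (with invented miniature files), each input file is represented as a dictionary of
     scores keyed by (ID, variable name), and later files take precedence at shared
     addresses:

```python
def merge_dfiles(*files):
    """Merge score dictionaries keyed by (ID, variable name): later
    files overwrite earlier ones at shared addresses, and the merge
    carries the union of IDs and variable names."""
    merged = {}
    for f in files:
        merged.update(f)       # later files take precedence
    ids = sorted({i for i, _ in merged})
    names = sorted({v for _, v in merged})
    return merged, ids, names

# invented files: F2's score at (2, 'AGE') overwrites F1's
f1 = {(1, 'AGE'): 34, (2, 'AGE'): 51}
f2 = {(2, 'AGE'): 50, (2, 'WT'): 180}
scores, ids, names = merge_dfiles(f1, f2)
```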

         When D-files with different fieldwidths are merged, their merge has the larger
     fieldwidth.  You can use this to increase the fieldwidth of a D-file F3 from 3 to
     4 by merging a small 4-width D-file G4 with F3 and then running SELECT on that
     merge to delete the unwanted excess.  One easy way to get such a G4 is to Z-scale
     one of the variables in F3 using RESCORE (which allows the wanted merge without
     invoking MERGE).

 SELECT:  This allows scores on any selection of records from Hydata-standard datafile
     <base>.Di, crossed with any selection of its variables, to be copied into a new
     <base>.Dj with the same basename.  When <base>.Di needs correction but is too large
     for normal editing, a section of it can be copied by SELECT to <base>.Dj for the
     lowest available j, edited as appropriate, and then written over the defective
     original by running MERGE on <base>.Di,<base>.Dj in that input order.  If the aim
     of editing <base>.Di is simply to delete a portion of this that will never again
     be wanted, the user should opt for SELECT's output to overwrite the input file.
         WARNING: Some older versions of DOS include a command SELECT.COM that takes
     search-path precedence over SELECT.EXE.  If you do have this, it is in all
     likelihood of little use to you and can just as well be deleted or renamed.
 
                                  -20-

 RESCORE:  This allows selected variables in Hydata-standard file <base>.Di to be
     combined or nonlinearly rescaled, yielding a datafile <base>.Dj that contains
     scores on the stipulated derived variables for each ID in the input file.
     When run, RESCORE first receives the input-file name and then accepts entry of
     a set of Jobs.  Each Job is specified in four stages:

              First, choose the type of operation to be performed from four alternatives:
         (1) A file of item weights written by HYFAC for estimating HYBALL factors can
         be imported to compute scores on these factor scales.  (2) Arbitrarily many
         selected variables can be combined by one of several concatenation operations
         provided.  (3) Selected variables can be combined or transformed by any formula
         you choose from a list of formulas you have constructed from a provided ensemble
         of primitive functions/operators.  (4) Missing-data shadows can be created for
         any selection of the variables.  (See Note 3, below, for explanation.)
               Second, specify the Job's details.  (Not needed for Operation 1, and only
          barely for Operation 4.)  For Operation 2, you pick one of concatenators Sum,
         Maximum, Minimum, Length, Deviance, or Outlie.  For Operation 3, you choose
         from your list of created formulas after accepting or declining the option to
         extend or revise this list.
              Third, identify the variables on which this operation is to be performed
         (not needed for Operation 1) and choose its treatment of missing input terms.
         There are three missing-data options: (a) the defined variable can be scored
         as missing for any subject lacking any of its constituent terms; (b) each
         missing input constituent can be replaced by that input variable's mean; or
         (c) missing constituents can be estimated by item-appropriate scalings of the
         subject's average deviancy on the remaining input constituents.  (Operation 3
         does not allow missing-data Option c.)
              Finally, specify a name for each variable defined by this Job.  Default
         names are created for them from names of the variables from which these derive,
         but you will not likely want to use these defaults except as temporary place-
         holders.  Under Operation 1, each item composite's default name identifies the
         constituent item having the largest raw weight on this scale. (Operation 4
         bypasses name creation; its default assignments are not adjustable during
         the run.)

         Once Job entry is complete, values of all variables defined by these Jobs are
     computed for each record in <base>.Di and listed by ID in file <base>.Dj either for
     the lowest free j or, at user option, for j = i.  By default, <base>.Dj receives
     only the derived variables; but these can alternatively be appended to the input
     variables either in a new file or as replacement for the original <base>.Di.  The
     new variables are ordered in the sequence of their creation without regard for name.

         Note 1. The yield of concatenations Sum, Maximum, and Minimum under Operation 2
     is plain from their names.  Concatenation "Length" is by default the Euclidean norm
     (root-mean-square distance from origin) of each subject's vector of scores on the
     selected input variables; but length defined by an alternatively powered distance
     metric can also be chosen.  Concatenation "Deviance" is Euclidean Length after scores
     on the selected variables are normed as deviancies, that is, sigma-distances from the
     variables' respective means, while "Outlie" is Deviance on these items' principal axes.
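         For definiteness, Length and Deviance can be written out as below (a Python
     sketch; Sum, Maximum, and Minimum are obvious, and Outlie's principal-axis
     version is omitted):

```python
from math import sqrt

def length(scores, power=2):
    """'Length': root-mean-power distance of a subject's score vector
    from the origin (the Euclidean norm when power is 2)."""
    return (sum(abs(s) ** power for s in scores) / len(scores)) ** (1.0 / power)

def deviance(scores, means, sds):
    """'Deviance': Length after each score is normed as a sigma-
    distance from its variable's mean."""
    z = [(s - m) / sd for s, m, sd in zip(scores, means, sds)]
    return length(z)
```

     Thus length([3, 4]) is the root-mean-square sqrt((9+16)/2), and a subject whose
     scores sit one sigma from every mean has Deviance 1.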

         Note 2a.  RESCORE reads/writes an unformatted archival file named FORMULAS that
     contains Operation-3 formulas created previously.  It loads this file at startup if
     present, and before exiting writes to it the most recently active list of created
     formulas while backing up its previous version under the name FORMULAS.OLD if any
     changes have been made.  You can build a library of created-formulas lists for
     different purposes by saving these under suitably different extensions, and copying
     them back to FORMULAS when wanted.  Whenever you initiate Operation 3, the following
     panel of information on formula creation is available for display:
 
                                  -21-

   ------------------------------------------------------------------------------
        Function creation here follows the standard Fortran format for       
    line-function definitions.  Each formula is written as the righthand     
    side (you can omit lefthand part " Y = ") of some equation having form   
                                                                             
                           Y = f(Di,Dj,...)                                  
                                                                             
    wherein f( , ,...) is a schema consisting of names for primitive oper-   
    ators/functions, parentheses, and occasionally commas to separate the    
    arguments of binary functions; while its D-terms are dummies that must   
    always be written as X, M, S, H, L, or Z followed by a positive integer. 
    X-dummies will become raw-score variables from your input file; dummies  
    Mi, Si, Hi, Li are respectively the Mean, SD, High, and Low of raw-score 
    variable Xi; and Z-dummies are deviancy scores on the corresponding raw  
    variables.  That is, Zi = (Xi-Mi)/Si.  In applications, each dummy term  
    Dj's nominal index j is replaced by the jth real index in a list thereof 
    selecting the particular input variables for this derivation.  Dummy     
    terms can have nominal indices in arbitrary order with any number of     
    repetitions; but an application must list as many real input indices as  
    the largest nominal index in the formula. (The computation ignores any   
    listed input not picked by a dummy index.)                               
                                                                             
        When composing formulas, standard rules hold for parentheses and     
    precedence among arithmetic operators. The latter can never be written   
    consecutively.  For example, -X1+X2*X3 is the same as (-X1)+(X2*X3)      
    while X1/-X2 is illegal but can be acceptably entered as X1/(-X2).       
                                                                             
        The admissible primitives and their required notation are ordinary   
    arithmetic operators +, -, *, /, and ** (or ^ if you prefer), which are  
    placed between (or for unary minus in front of) their arguments, and the 
    functions named MAX( , ), MIN( , ), MOD( , ), KUT( , ), ABS( ), SQRT( ), 
     EXP( ), LOG( ), LN( ), SIN( ), COS( ), TAN( ), ASIN( ), ACOS( ), ATAN( ),
    NINT( ), and INT( ), which must always be followed in parentheses by     
    one or two argument expressions as indicated.  Function names can be     
    entered in lower case, but their spelling must be exactly as shown apart 
    from blanks (which are ignored) and omission of trailing letters not     
    needed to distinguish that name from the others.  All these but one are  
    standard Fortran functions, using generic names (no distinction between  
    Integer and Real), and writing LN and LOG respectively for natural       
    logarithms and logs to base 10. All the trigonometric functions measure  
    angles in degrees.  The nonstandard function is KUT(x,y), which takes    
     value 0 if x < y and 1 if x >= y.  (As explained in this program's
    documentation, KUT can be used to create multistage step-functions.)     
                                                                             
        Each Operation-3 job applies a particular formula chosen from the    
    created-formulas list to a selection of the datafile's input variables.  
    Let n be the largest nominal index explicit in the chosen formula's      
    dummies.  Then execution of this Job defines a new variable by applying  
    this formula to the first n selected variables in the order listed, a    
    second new variable by applying the formula to the next n selection-list 
    variables, and so on until fewer than n selected variables remain.  You  
    will probably find this provision for deriving several new variables in  
    one Operation-3 job to be useful mainly for rescaling large groups of    
    old variables by the same single-argument transformation.                
                                                                             
         Note.  Each derived variable is finally rescaled by some integral   
      power of 10 to optimize discrimination in integer range [-999, 999].   
      The rescaling multipliers are reported in this dataset's logfile.      
   
 
                                  -22-

     We note for the record that the operations entered by expressions X+Y, X-Y,
     X*Y, X/Y, and X**Y (equivalently X^Y) are respectively X-plus-Y, X-minus-Y,
     X-times-Y, X-divided-by-Y, and X-to-power-Y.  These and all the named functions
     but KUT are described in any Fortran manual, except that the present LN and LOG
     will be respectively called LOG and LOG10 there, while the present NINT and INT
     are ANINT and AINT in technical Fortran.  (That is, NINT(X) here is the whole
     number closest to X, while INT(X) is the whole number closest to X not exceeding
     it in magnitude.)  Also note that angles are measured here in degrees, not
     radians.  To appreciate what can be done with basic function KUT, consider

                (a)  Y = a1 + a2*KUT(X,c1) + a3*KUT(X,c2) + a4*KUT(X,c3),
                (b)  Y = X*(KUT(X,c1) - KUT(X,c2)),

     wherein X is an input variable while a1,a2,a3,a4 and c1,c2,c3 are constants such
     that c1 < c2 < c3.  Under (a), Y is a step-function of X that is level at value a1
     for X < c1, jumps up (or down if a2 is negative) to level b2 = a2+a1 for X in
     interval c1 <= X < c2, next jumps to level b3 = a3+b2 in X-interval c2 <= X < c3,
     and finishes at level b4 = a4+b3 for X >= c3.  And under (b), Y equals X for X from
     c1 up to but not including c2, and is zero otherwise.  Extravagant use of KUT is
     curtailed by RESCORE's present single-line limit on formula length.  However:
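         Examples (a) and (b) behave as follows (a Python sketch; the constants a1..a4
     and cutpoints c1 < c2 < c3 are invented for illustration):

```python
def kut(x, y):
    """KUT(x, y): 0 if x < y, 1 if x >= y."""
    return 0 if x < y else 1

def step(x, a1=1, a2=2, a3=3, a4=-1, c1=10, c2=20, c3=30):
    """Example (a): a multistage step-function built from KUT."""
    return a1 + a2 * kut(x, c1) + a3 * kut(x, c2) + a4 * kut(x, c3)

def window(x, c1=10, c2=20):
    """Example (b): X itself for c1 <= X < c2, zero elsewhere."""
    return x * (kut(x, c1) - kut(x, c2))
```

     With these constants, step levels run 1, 3, 6, 5 across the three cutpoints, the
     last jump being downward because a4 is negative.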

         Note 2b.  Although RESCORE's present limit on formula length (which can be relaxed
     if desire arises) is just one line, this can be up to 139 characters long even though
     on screen it wraps after 80.  And all basic functions can be entered as just one or
     two leading characters; specifically, only the caps in   ABs, ACos, ASin, ATan, Cos,
     Exp, Int, Kut, LN, LOg, MAx, MIn, MOd, Nint, SIn, SQrt, Tan   are needed.

         Note 2c.  RESCORE does not always do the final rescaling described at end of the
     information panel above.  If a created function maps integers just into integers, new
     variables defined by this function are rescaled only if needed to keep scores within
     the [-98, 999] range.  Even when its derived scores all lie within a minor fragment
     of this range when rounded, they are NOT rescaled to increase score differences,
     as is done for variables derived by functions whose fractional values computed from
     integer input would suffer appreciable information loss if rounded to whole
     numbers.  If you are using KUT(X,c) to define transformations of an integer-valued
     X that you do not want to be rescaled, take care to use an integer for c.

         Note 2d.  You can edit your list of created formulas only when initiating or
     reviewing an Operation-3 Job.  If you elect to edit formulas before picking one for
     a Job, you can add new formulas to the list and also revise, delete, or move to end
     of list any of the old ones.  But once any formula has been assigned to a Job, you
     can continue to augment the list but may not alter formulas already in place.

         Note 2e.  Although exponentiation is denoted by ** in standard Fortran, ^ is an
     increasingly familiar alternative to which keyboard entries of ** are here converted.

         Note 3.  The missing-data "shadow" of a variable X is the binary variable whose
     value is 0 for subjects whose scores on X are missing and 1 otherwise.  Since
     failure of data harvest may well in part reflect the subject's response to an
     assessment situation, shadow variables are in principle empirically meaningful and
     may merit study as additional raw dimensions of the information obtained on these
     subjects.  (Cf. Cohen, J., & Cohen, P. (1983), Multiple Regression/Correlation
     Analysis for the Behavioral Sciences, 2nd ed.)  In particular, they can be appended
     to a D-file to assist FIXDATA estimation of its missing scores.  RESCORE constructs
     missing-data shadows only for variables on which the percent P of missing scores is
     not less than a user-chosen minimum ENUF.  Since the shadow deviancy z of a subject
     lacking score on any such variable is z = -SQRT[(100-P)/P], it seems prudent not
     to set ENUF much below 10%.  For the same reason, shadowing of variables having
     good-score percent less than ENUF is also disallowed.
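         In code, a shadow variable and the deviancy formula just cited look like this
     (a Python sketch; -99 is the D-file missing code):

```python
from math import sqrt

def shadow(scores, missing_code=-99):
    """Missing-data shadow: 0 where the score is missing, 1 otherwise."""
    return [0 if s == missing_code else 1 for s in scores]

def missing_z(p):
    """Standardized (deviancy) value of the shadow for a subject with a
    missing score, given percent P of missing scores: -sqrt((100-P)/P)."""
    return -sqrt((100.0 - p) / p)
```

     At P = 10, the shadow deviancy of a subject with a missing score is already -3
     sigma, which is why ENUF should not be set much lower than 10%.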
 
                                  -23-

 =====================================================================================
 [ NOTE: These programs have not yet been upgraded to include Comp2 weighting. ]

                          BOOTSTRAP-SUPPLEMENT PROGRAMS

       Hyball's extended family of factoring routines has now been enriched by
two routines, HYBOOT and BOOTSUMM, for appraising the sampling noise in Hyball
solutions.  Both ascertain the central tendency and variation in results obtained
by factoring many bootstrap samples from the same rawdata file.  But they conduct
this inquiry in rather different ways.  BOOTSUMM (named by contraction of
"bootstrap summary") provides the finer control of factor rotation.  However, it
is labor-intensive and permits only a limited number of bootstrap repetitions.
In contrast, at cost of slightly decreased rotation flexibility which should
seldom matter, HYBOOT summarizes results from unlimitedly many bootstrap
repetitions while requiring scarcely any user effort or storage space.  HYBOOT
may take considerable computer time if you want many repetitions, but you can
interrupt its run whenever you wish and resume this later when your computer
has nothing better to do.

PRELIMINARIES.

       You are presumably familiar with the three (or, advisedly, four) main stages
of Hyball factor analysis, starting with an ASCII datafile--call it DATA.RAW--
that contains scores for each of some number NS of subjects on the same array
of variables.  (1) Stage 1, which is optional but strongly recommended, is
transcription of DATA.RAW by program HYDATA into a Hydata-standard ASCII datafile
containing the same scores reformatted and possibly rescaled to fit the READ
presumptions of other Hyball programs that operate on raw data.  Call this
transcribed datafile DATA.D1.  (All Hydata-standard datafiles have names whose
extensions are "D" followed by a numerical index.)  (2) Next, HYDATA is
applied to DATA.D1 (or to DATA.RAW) to compute the standardized covariances
(correlations) among some or all of these variables over all NS subjects.  These
correlations are recorded in a file, say DATA1.COV, whose basename ends with a
numerical index and is always followed by extension "COV".  (3) Next, DATA1.COV
is factored by program MODA for an extraction pattern of the data variables on
however many factors you choose.  (4) Finally, this extraction pattern is loaded
into HYBALL for rotation of factor axes to a positioning you judge best.

       Hyball's bootstrap appraisals also follow this basic computation sequence
with, however, certain modifications of which production of data covariances is
most fundamental:  Unlike normal computation of covariances from DATA.RAW or
DATA.D1, which uses each of its NS subjects exactly once without differential
weighting, Hyball's bootstrap covariances are computed from this datafile for
the same count NS of subjects drawn from it RANDOMLY WITH REPLACEMENT.  (If
interest ever warrants, the size of these bootstrap samples can easily be made
a user's-choice parameter.)  Differences among the covariance arrays produced
by repetitions of this procedure estimate the sampling noise in the data
covariances in fact obtained by this study, and allow you to study the
consequences of that uncertainty for analyses performed upon them.  More
specifically, if you compute bootstrap covariances repeatedly from these data,
and factor each bootstrap covariance matrix in the same manner you choose for
normal covariances from this datafile, you thereby obtain the approximate
sampling distribution underlying your preferred normal factor solution from these
data.  In particular, you can compute the mean, standard deviation, and other
moment information about the distribution over bootstrap repetitions of each
rotated pattern coefficient and factor correlation while also comparing
these to your favored solution from the normal data covariances.
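
As an illustration only (Python, not part of the package), the resampling scheme
just described--NS subjects drawn with replacement, one correlation matrix per
repetition, then element-wise moments over the repetitions--might be sketched as
follows; the function and variable names here are mine, not Hyball's:

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_corr(data, n_reps, rng):
    """Correlation matrices from repeated samples of all NS subjects
    drawn RANDOMLY WITH REPLACEMENT from the (NS, NV) score array."""
    ns = data.shape[0]
    reps = []
    for _ in range(n_reps):
        idx = rng.integers(0, ns, size=ns)   # NS row indices, with replacement
        reps.append(np.corrcoef(data[idx], rowvar=False))
    return reps

# Toy data: 100 subjects scored on 4 variables.
data = rng.normal(size=(100, 4))
boots = bootstrap_corr(data, n_reps=50, rng=rng)

# Element-wise mean and standard deviation over repetitions estimate the
# sampling noise in each correlation.
mean_r = np.mean(boots, axis=0)
sd_r = np.std(boots, axis=0)
```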

                                  -24-

       This bootstrap procedure is entirely straightforward in principle, but
its technical implementation requires concern for (a) the many procedure options
that intervene between computation of data covariances and the terminally rotated
factors taken from them, and (b) proper alignment of the factor solutions to be
compared.  Concern (a) acknowledges the obvious point that variation in choices
such as number of factors extracted, the method of extraction (MODA's options
are principal common-factors, Minres common-factors, and dataspace principal
components), and HYBALL's many options for producing and selecting from a diverse
array of rotated solutions, can make considerable difference for final result.
And concern (b) recognizes that two factor solutions cannot be meaningfully
compared until their pattern columns have been permuted and reflected to optimize
similarity.  BOOTSUMM and HYBOOT differ mainly in how they deal with these
practicalities:


BOOTSUMM.

       Whenever HYDATA computes and records a normal COV-file, say DATA1.COV,
it now inquires whether the user would also like some bootstrap covariance
productions from the input datafile using the same selection of variables
(usually all) and the same treatment of missing data used for DATA1.COV.
If this option is accepted, HYDATA then asks for the number (up to 156) of
bootstrap COV-files wanted and writes these under the same name as DATA1.COV
except for insertion of "(x", ")x", "[x", "]x", "{x", or "}x" (where "x" is a
sequential alphabetic index) before the digit ending its basename.  Thus
up to 156 bootstrap COV-files produced to accompany DATA1.COV will be
successively named DATA(A1.COV, DATA(B1.COV, ..., DATA(Z1.COV, DATA)A1.COV,
..., DATA}Z1.COV. (Unlike HYDATA's regular COV-files, which are written in
ASCII, these bootstrap COV-files are binary.)  Also, the real covariances
in DATA1.COV are written to a master bootsource DATA(-1.COV.  The user can
then factor these by MODA/HYBALL under more or less the same procedure
options used to factor DATA1.COV to collect HYBALL-output files written
in the same ASCII format as HYBALL's FAC-file outputs under names starting
with bootstrap flag "(", ")", "[", "]", "{",  or "}".
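
The naming scheme can be generated mechanically; here is an illustrative Python
sketch (not part of the package) reproducing the 156 names that accompany
DATA1.COV:

```python
# Six flag characters, each paired with the letters A-Z, give 6 * 26 = 156
# bootstrap COV-file names for a normal COV-file such as DATA1.COV.
FLAGS = "()[]{}"                       # in the order listed above
LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def bootstrap_names(basename="DATA", index="1"):
    return [basename + flag + letter + index + ".COV"
            for flag in FLAGS for letter in LETTERS]

names = bootstrap_names()
# names runs DATA(A1.COV, ..., DATA(Z1.COV, DATA)A1.COV, ..., DATA}Z1.COV
```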

       When BOOTSUMM is run in a subdirectory containing such a collection of
bootstrap factor solutions including a master derived from the collection's
master bootsource, it first lists to screen the names of all local files with a
leading bootstrap identifier and, if more than one master solution is present,
asks the user to pick the one having the same origin as the bootstrap results to
be summarized.  BOOTSUMM then (1) permutes/reflects the axes in each boot-solution
that matches the chosen master in origin into best alignment with the master
solution, (2) computes the mean and standard deviation of each (aligned) pattern
loading and factor correlation over the matching bootstrap solutions, and (3)
writes this information to an ASCII file named SEEBOOT.  (Higher-moment summary
statistics may be added to BOOTSUMM's output at a later date if interest
warrants.)
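
Step (1), the permutation/reflection of axes into best alignment, can be sketched
in Python as follows (an illustration only; BOOTSUMM's actual matching algorithm
may differ in detail):

```python
import itertools
import numpy as np

def align(boot, master):
    """Permute and sign-reflect the columns of pattern `boot` (NV x NF)
    into best alignment with `master`, maximizing total unsigned column
    congruence by exhaustive search (fine for small NF)."""
    nf = master.shape[1]
    def cong(a, b):                    # cosine (congruence) of two columns
        return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    best_perm, best_score = None, -np.inf
    for perm in itertools.permutations(range(nf)):
        score = sum(abs(cong(boot[:, p], master[:, k]))
                    for k, p in enumerate(perm))
        if score > best_score:
            best_perm, best_score = perm, score
    aligned = boot[:, list(best_perm)].copy()
    for k in range(nf):                # reflect columns matched negatively
        if cong(aligned[:, k], master[:, k]) < 0:
            aligned[:, k] *= -1.0
    return aligned

# A column-swapped, sign-flipped copy of a pattern aligns back onto it.
master = np.array([[.8, .1], [.7, .0], [.1, .9], [.0, .8]])
boot = -master[:, [1, 0]]
aligned = align(boot, master)
```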

       Advantages of BOOTSUMM:  Given commitment to a particular selection of
variables to be factored and choice of NF, the user is free to develop each
bootstrap factor solution by whatever interactive parameter adjustments and
intuitive quality judgments (notably, after HYLOG study of the solutions in
HYBUF store) would be exercised were these bootstrap covariances the real thing.
And since full records of each bootstrap solution's production remain on disk
until the user chooses to delete them, a solution that deviates interestingly
from the norm can be analyzed in detail for how this deviancy came about.  Also,
HYBALL bootstrap solutions can be passed to HYFAC for appraisal of the sampling
noise in your preferred derivation of item weights for estimating these factors.
Finally, BOOTSUMM can appraise the sampling noise in factor solutions on which
HYBLOCK has imposed rotation constraints, which is not feasible for HYBOOT.

                                  -25-

       Disadvantages of BOOTSUMM.  Producing a decent collection of bootstrap
factor solutions will generally be a great deal of work.  You will seldom be
willing to persist at this for more than a small number of repetitions, though
ten or twenty may be enough to yield all the bootstrap sampling information you
really need.  Also, the files generated by many bootstrap factor solutions
collected for BOOTSUMM summary will occupy considerable disk space.  To be sure,
space should be no real problem for a modern PC:  Even if you save everything
from 50 bootstrap factorings of 150 variables, the total accumulation won't run
over ten megabytes.  But you still have to think about space management when
doing a BOOTSUMM study.

HYBOOT.

       Whenever in the course of normal Hyball factoring you find a rotated
pattern that interests you enough to prompt concern for its sampling uncertainty,
you can generally initiate HYBOOT assessment of this by calling one of HYBALL's
Main Menu options.  This is because the list almost always includes opportunity
to write the currently active factor pattern to the BOOTDATA startup file needed
to guide HYBOOT study of the pattern it specifies as the bootstrap "target".
This Main Menu option (No. 12) is unavailable only if the rawdata file from
which the currently active pattern originates is not Hydata-standard, if HYBALL
has been unable to retrieve the variables' names, or if the currently active
pattern is under rotation constraints beyond an X-set stipulated during factor
extraction. In addition to its selected target pattern, the binary BOOTDATA file
so written includes the name of the D-file which HYBOOT is to sample, which of
its variables are in the target pattern, the treatment of missing scores and
factor-extraction method preceding the target pattern, and all rotation controls
stored with the target pattern in HYBUF archive.  (These include all in force
at end of this pattern's production, although that does not fully identify the
target pattern's derivational history.)  You don't have to remember any of this
information when writing your choice of target pattern to BOOTDATA:  After
loading whichever one you want from store (Main Menu Option 6), simply enter
"12" at the Main Menu query and you're done.

       Once the startup BOOTDATA has been set, all that remains to launch the
bootstrap study it controls is to enter "HYBOOT" at the DOS prompt in a
directory containing both this BOOTDATA and the D-file it instructs HYBOOT to
sample.  Once started, HYBOOT requests your preference on a couple of control
options and thereafter runs through the full bootstrap production from data
sampling to factor rotation with no user involvement except occasional decisions
to stop or continue.  Raw summary results are accumulated in a storage bin
appended to BOOTDATA whose modest size does not increase with the number of
repetitions.  This bin is updated after each bootstrap repetition, preserves the
accumulated results if the program is interrupted before final summaries are
printed from it, and permits break/resumption of the run to continue indefinitely
so long as BOOTDATA is not erased.  The program can be stopped anytime without
harm to BOOTDATA by hitting Ctrl-C; but it also pauses periodically to advise
how many repetitions are in hand together with average repetition time, and
prompts the user either to print results or to declare how many more repetitions
are wanted before the next programmed pause.  Results are written to ASCII file
SEEBOOT, which can be inspected at any pause without precluding resumption of
the run.

       When HYBOOT is running, the screen scrolls messages on the current
repetition's state of progress.  This information is mainly for reassurance
that matters are progressing as they should and will pass by too rapidly for
more than fleeting impressions.  But if wanted, these progress reports can
be captured in flight by agile use of the PAUSE key.

                                  -26-

       With one small group of exceptions, HYBOOT's bootstrap repetitions are
fully controlled by the production parameters, recorded in BOOTDATA, by which
the target pattern was generated.  The exceptions center on HYBOOT's restriction
to rotation by Spin search, regardless of whether the target pattern was also
found by Spin: First of all, you need to stipulate HYBOOT's thoroughness of
Spin search by choice of two parameters, MAXTRY and NUFF.  (Details on these
parameters are given both in HYBALL's documentation and on screen during HYBOOT
start-up.)  Each programmed pause during a HYBOOT run allows revision of
MAXTRY/NUFF in light of the reported mean repetition time.  And secondly, you
get to choose whether the pattern selected from each Spin series for bootstrap
accumulation is (a) the one having the highest rating under the current
parameterization of HYBALL's pattern-quality measure, or (b) the pattern that
recurs most frequently under Spin search with these rotation-control settings.
If (b) is elected, a parameter controlling how finely pattern differences are
discriminated (GAP) must also be chosen.
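
Option (b) amounts to clustering the Spin solutions at grain GAP and taking the
largest cluster's representative.  A rough Python sketch (illustrative only;
`distance` stands in for whatever pattern dissimilarity, in degrees, is in force):

```python
def most_recurrent(patterns, distance, gap):
    """Pick the pattern that recurs most often, where two patterns count
    as the same solution when their distance is below GAP.  Greedy
    clustering against each cluster's first member."""
    clusters = []                      # each entry: [representative, count]
    for p in patterns:
        for c in clusters:
            if distance(p, c[0]) < gap:
                c[1] += 1
                break
        else:
            clusters.append([p, 1])
    return max(clusters, key=lambda c: c[1])[0]

# Toy demonstration with scalar "patterns" and absolute difference:
pick = most_recurrent([0.0, 0.1, 5.0, 0.2, 5.1],
                      distance=lambda a, b: abs(a - b), gap=1.0)
```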

       The target pattern's only direct role in HYBOOT's production of its
bootstrap repetitions is to serve as the template for pattern alignment.  That
is, the pattern columns of each bootstrap solution are permuted and reflected
into closest match with the target pattern before it is added to the running
accumulation of results.  But the target pattern is also salient interpretively
in that one of the summary tables in results file SEEBOOT is the difference
between elements of the mean bootstrap solution (coefficients, communalities,
and factor correlations) and the corresponding target elements.

       Advantages of HYBOOT.  Scarcely any pre-planning or running effort is
required for its effective utilization.  You need only remember to start Hyball's
normal factoring with Hydata-standard datafiles and, when HYBOOT is running, to
instruct it occasionally to print results or to continue with more repetitions.
Moreover, the only limit on the size of HYBOOT's sampling collection is the
computer time you can spare for this job.  And when you save the full array of
intermediate and production files leading to the target pattern, notably, its
COV-file, MODA extraction pattern, and the HYBUF rotation archive in which the
target pattern is contained--which you are strongly advised to store on a floppy
disk dedicated to this solution if it is one you intend for interpretation--only
modestly more space will be required to store with these the bootstrap accumu-
lation in BOOTDATA as well.  You should, however, modify the archival name of
this BOOTDATA and the SEEBOOT you take from it to distinguish them from other
BOOTDATA and SEEBOOT files that you want to save.

       Disadvantages of HYBOOT.  The bootstrap rotations collected by HYBOOT are
produced exactly like the target pattern only if that was a Spin solution picked
either as best by criterion or most commonly recurrent at the same GAP grain
picked for the HYBOOT run.  But even when the target pattern's production is not
by Spin or has been chosen from the Spin collection by consideration, say, of
substantive interpretability, it is hard to think of circumstances under which
this production difference would significantly degrade the bootstrap summary as
an estimate of sampling noise in the target pattern.  More importantly, HYBOOT
appraisal of large block-structured patterns is not feasible within DOS limits.

    Note 1.  At present, SEEBOOT reports only nonrelational parameters of
             each solution element's bootstrap distribution, namely, mean,
             standard deviation, skew, and kurtosis.  But expansion to include
             mixed-moment information will be undertaken if interest warrants.

    Note 2.  In case you wish to delete results already accumulated in your
             current BOOTDATA's collection bin and start anew without calling
             HYBALL to set BOOTDATA again, simply call utility program FIXBOOT
             whose executable code is bundled with that of BOOTSUPP and HYBOOT.
             (Your operating system must, of course, know where to find this.)

                                  -27-

 =====================================================================================

                                 POSTSCRIPTS

     Although it is straightforward to run the Hydata-supplement programs described
 above by following their on-screen instructions, you will find that their dexterous
 use as a system gets better with practice.  In this section I shall record, in no
 particular order, fragments of my experience with this and the main-package programs
 that you may find helpful.


         1. When you first transcribe source data into Hydata-standard form, it is
 advisable to waive use of HYDATA's outlier-suppression option and, if these data are
 originally in two or more files, to MERGE them (so long as they total no more than
 500 variables) into a single file <data>.D1 to be regarded as your data base.  Then
 save a compression of <data>.D1 and <data>.LOG to floppy backup.  (PKZIP can shrink
 datafile size by over 80%.)   And either transfer this base file, its log, and
 (optionally) a copy of PRNTR to an empty subdirectory or, if these are already in the
 workspace you intend for this analysis, clean out everything else therein.

     It is now time to look for bad data.  For in addition to scores already identified
 as missing, some unflagged entries in <data>.D1 may be suspiciously deviant.  Inspect-
 ing the variables' Highs, Lows and their deviancies recorded in <data>.LOG will show
 if this is a problem:  Highs above a variable's scoring ceiling or Lows below its
 floor obviously signal errors in scoring or record-keeping; but even if not out of
 measurement range, scores that are several SDs from their distribution's mean are
 likely corrupt in some fashion.  If you do observe extreme outliers here that you
 prefer to suppress, you can flag as missing all scores in <data>.D1 more deviant
 than your choice of cutoff by running HYDATA on <data>.D1, declaring this cutoff
 value for parameter DEV, and writing the output to this same filename.  Or you can
 apply different deviancy cutoffs to different variables by (a) running SELECT on
 <data>.D1 to pick out a subset of variables whose outliers are to be flagged as
 missing at the same DEV level, (b) running HYDATA on this selected D-file with this
 choice of DEV, (c) using MERGE to overwrite <data>.D1 with the output from (b), and
 (d) repeating this process with different DEV for as many other selections of variables
 as you wish.  Finally, FIXDATA can replace these outliers now flagged as missing with
 estimates of what they really should have been.
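
 The per-variable flagging that HYDATA performs under DEV can be pictured in
 Python roughly as follows (an illustrative sketch, with NaN standing in for
 Hydata's missing-data marking):

```python
import numpy as np

def flag_outliers(scores, dev):
    """Flag as missing (NaN) every score more than `dev` standard
    deviations from its variable's mean, over an (NS, NV) array."""
    scores = scores.astype(float).copy()
    mean = np.nanmean(scores, axis=0)
    sd = np.nanstd(scores, axis=0)
    scores[np.abs(scores - mean) > dev * sd] = np.nan
    return scores

# One wildly deviant score in the second variable gets flagged:
data = np.array([[1.0, 2.0], [1.2, 2.1], [0.9, 2.0], [1.1, 50.0]])
cleaned = flag_outliers(data, dev=1.5)
```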

     Your data will probably seldom if ever warrant such intense
 cleansing; but thinking through how you can do this, or better, trying it out just
 for practice on some small dataset which doesn't really need it, will enrich your
 understanding of the HYDATA-supplement programs.


        [2. Deleted: This comprised notes on FIXDATA no longer cogent.]

         [3. Deleted: This comprised notes on problem-size limits now obsolete. ]


         4. HYBALL's SPIN option is a recent addition that has proved to be extremely
 valuable for data variables whose factor composition is complex.  It was motivated by my
 discovery that the factor pattern to which HYBALL rotation converges is often apprec-
 iably affected by the iteration's starting position.  This news is both good and bad:
 The bad news is that from an orthodox start (the initial-extraction solution or its
 Varimax pre-rotation), HYBALL may not find the best possible hyperplane count or,
 better, minimal hyperplane misfit; the good news is that rotations from some starting
 positions may converge to solutions that well merit interpretive consideration even
 
                                  -28-

 though their hyperplane quality is analytically suboptimal.  SPIN gives every locally
 optimal solution some chance of recovery by rotating from random starts.  A SPIN run
 comprises repeated Tries, each of which first randomly positions all factors not under
 FIX constraints and rotates to locally optimal hyperplanes under your choice of
 rotation controls.  (Until you familiarize yourself with SPIN behavior, you are
 advised to rotate by STEP rather than by SCAN even though you should eventually come
 to prefer SCAN results.)  You can watch the SPIN tries scroll by on screen, but
 completion of the Try series does not require your intervention.  [ Deletion of
 remarks on the time of Spin search that at modern speeds is no longer tedious. ]

     The solutions produced on a SPIN run are initially saved in a scratch file while
 an internal array records their quality evaluated by the hyperplane-misfit measure that
 the current choice of control parameters has picked for rotation to minimize.  At the
 run's completion, the smaller (i.e., better) misfit ratings are listed in increasing
 order as proportions of the best misfit achieved on the current measure by any pattern
 already in HYBUF store.  You may then choose to add the best new solutions (your choice
 of how many) to HYBUF store subject to the proviso that no new pattern is saved if it
 is within similarity distance GAP of any solution, or if you prefer just of any new
 solution, already in store.

         The similarity distance between two patterns is defined in one of two ways
         as follows:  First their axes are paired for maximal congruence of loadings.
         Next, the angular divergence of each matched pair is computed to be the
         unsigned arc-cosine of their loading congruence.  Finally, at user option,
         the overall distance between the two patterns is taken to be either (a) their
         mean (AV) or (b) their maximum (HI) matched-column divergence in degrees.
         Option HI is recommended, since distinctive SPIN solutions often diverge
         appreciably from ones found already only on a small number of factors,
         whence AV is generally much less than HI especially when the total number
         of factors approaches or exceeds 10.

 For the user's choice of GAP, one pattern is judged to be acceptably dissimilar to
 another just in case the two patterns' similarity distance (either by AV or by HI as
 selected) is not less than GAP degrees.  For SPIN solutions to be interestingly
 distinctive under option HI, GAP should be at least 20 degrees and perhaps as large
 as 30.  (Even so, I prefer HI with GAP of 5.)  As the distinctively best new patterns
 are added to HYBUF, new patterns too GAP-wise similar to the former for retention
 are tabulated in a Lump-count file showing how often each newly saved pattern was
 matched within distance GAP by other patterns not saved.  This Lump Count is displayed
 on screen at end of each SPIN run, and is recorded in a no-frills ASCII file named
 LUMP that you can view after you exit HYBALL.  Resumption of a HYBALL run does not
 recall the preceding Lump Count, and will overwrite the last LUMP file unless that
 has been saved under a different name.
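
 For concreteness, the AV/HI similarity distance described above might be computed
 as in this Python sketch (illustrative only; HYBALL's own column-pairing may
 differ in detail):

```python
import math
import numpy as np

def pattern_distance(a, b, how="HI"):
    """Similarity distance in degrees between two (NV, NF) patterns:
    columns are paired greedily for maximal unsigned congruence, each
    pair's divergence is the unsigned arc-cosine of its congruence, and
    the overall distance is the mean (AV) or maximum (HI) divergence."""
    nf = a.shape[1]
    unmatched = list(range(nf))
    divergences = []
    for j in range(nf):
        congs = [abs(float(a[:, j] @ b[:, k])
                     / (np.linalg.norm(a[:, j]) * np.linalg.norm(b[:, k])))
                 for k in unmatched]
        best = int(np.argmax(congs))
        divergences.append(math.degrees(math.acos(min(congs[best], 1.0))))
        unmatched.pop(best)
    return max(divergences) if how == "HI" else float(np.mean(divergences))

# Identical patterns, or mere column permutations, are at distance zero.
p = np.eye(3)
d_hi = pattern_distance(p, p[:, [2, 0, 1]], how="HI")
```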

     If you would like to practice SPIN on a classic factor problem that is small enough
 (26 variables, 3 factors) that even a large-limit SPIN search takes only a short time
 to execute, yet also sustains an amazing diversity of solutions with locally optimal,
 meaningfully interpretable hyperplanes, I recommend Thurstone's empirical Box problem.
 Its initial factor pattern found by Centroid factoring is included in this package under
 name PAT.BOX.  Simply copy this to a name with extension starting with H in a temporary
 dedicated subdirectory, and type HYBALL at the DOS prompt in this subdirectory.

         5.  Sometimes, HYBALL's rotation iterations under STEP and, rather less often,
 under SCAN get stuck in a cyclic dynamic that does not converge--a nuisance that
 is especially annoying under large-limit SPIN.  For patterns wherein this is prevalent,
 the problem can usually be alleviated by judicious choice at the Convergence-parameters
 panel under Main Menu Option 1. Specifically, consider reducing search window B0 in
 
                                  -29-

 such cases from its default value of 60 to 45 or 50, and/or shift-damping fraction DF
 from its default value .6 to .5 or perhaps even less.  Note also that some combinations
 of these changes may be effective when others are not.  To see a demonstration of this
 peculiarity, copy Holtzinger pattern PAT.HOL in the distribution package to a name
 whose extension starts with H, and run HYBALL's Spin search on this.  A high proportion
 of its rotations under default parameters are nonconvergent, but some, albeit by no
 means all, combinations of appreciable reductions in B0 and DF converge as wanted.
 (Incidently, the Holtzinger pattern is remarkable in that virtually all convergent
 rotations of it yield nearly the same solution regardless of starting position.)


         6.  The documentation provided above for program HYLOG does insufficient
 justice to its importance as an adjunct to HYBALL.  HYLOG was originally conceived
 as a way to make possible a complete retracing of steps from start to finish in a
 HYBALL run, enabling users who have arrived at interesting results by repeated
 rotations under varied parameters to have on record how they got there.  However,
 extension of HYBALL runs to include Spin solutions precludes their full reproduci-
 bility, since attempting to repeat the randomizations would be pointless even if
 possible at all.  HYLOG's main value now lies in the detailed comparisons it provides
 among the solutions stored in HYBUF.  Some of this information, notably hyperplane
 counts and congruence reports, is available during the HYBALL run but cannot be studied
 conveniently there.  And several recent additions to HYLOG's output, in particular
 factor-complexity and factor-centrality reports, are not in HYBALL at all. HYLOG's
 Factor-Complexity tables tell, for the user's choice of hyperplane width BH, for
 each stored pattern P, and for each K = 1,...,NF, what proportion of the variables
 have loadings in pattern P larger than BH on just K of the NF factors.  And its
 Factor-centrality tables tell, for the user's choice of congruence divergence GAP
 and each factor J in each pattern P, how many of the other stored patterns contain
 a factor whose loadings differ from the loadings on J in P by at most GAP degrees of
 congruence divergence.  Factor Centrality can easily be misleading if the user has
 repeatedly continued the last rotation with minor parameter variations.  But if HYBUF
 contains mainly unretouched SPIN solutions, this can strikingly reveal which factors
 (more precisely, which pattern columns) are found most consistently under random search
 and which solutions contain them.  If the solutions with the highest average factor
 centrality also have the smallest factor complexities and (going outside of the HYLOG
 report) the highest Lump counts, there can be little doubt about which solutions to
 prefer for interpretation after final refinement even if--as may well occur--these
 do not include the highest hyperplane counts.
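
 HYLOG's Factor-Complexity tabulation is simple to state in code; a Python sketch
 (illustrative only, with K = 0 included for completeness):

```python
import numpy as np

def factor_complexity(pattern, bh):
    """For an (NV, NF) pattern and hyperplane width BH, the proportion of
    variables whose loadings exceed BH in absolute value on exactly K of
    the NF factors, for K = 0, ..., NF."""
    nv, nf = pattern.shape
    k_per_var = (np.abs(pattern) > bh).sum(axis=1)
    return [float((k_per_var == k).sum()) / nv for k in range(nf + 1)]

# Four variables loading above .10 on 1, 1, 2, and 0 factors respectively:
pat = np.array([[.8, .05], [.05, .7], [.6, .5], [.02, .03]])
props = factor_complexity(pat, bh=0.10)
```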

        6a. HYLOG's nonrelational pattern appraisals have recently been much expanded in
 response to my discovery from simulation studies with noisy source patterns that maximal
 hyperplane count is not reliably diagnostic of optimal axis placement.  In addition to
 (a) hyperplane counts and (b) factor complexity at selected values of the hyperplane-
 bandwidth parameter, solutions can now also be separately evaluated on (c) a parameter-
 free appraisal of factor complexity, (d) a measure of gappiness in the pattern's scatter
 of minor loadings, and (e), using the hyperplane-misfit measure picked by a choice of
 rotation parameters, the pattern's misfit rating expressed in proportion to the lowest
 misfit of any pattern in store.  Measures (e) have proved impressively superior to the
 others for identifying the best approximation to the source pattern in noisy simulations,
 but how well this will generalize to real data remains unclear.  In all likelihood,
 measure (d) will turn out to be largely useless; but we won't know that until we give
 it a go.

        6b. HYLOG is now supplemented by program TWOLOGS for comparing solutions between
 two different HYBUF files.  These must be rotations of patterns originating from the
 same data covariances as judged by the Code No. passed through MODA to HYBALL, but can
 be factorings of different subsets of the data variables for possibly different numbers
 
                                  -30-

 of factors.  If the variables are not fully the same in both log files, congruences are
 computed just over factor loadings on the variables in common.

     To use this program, type TWOLOGS at the DOS prompt in a subdirectory containing
 the log files to be compared.  The program commences by listing all this subdirectory's
 files whose names have the form HYBUF*.*, and requests selection of two.  The program
 checks whether the ones selected contain patterns with the same covariance origin; if
 they do, the HYBALL-input patterns they respectively rotate are identified, and
 comparison is allowed to proceed.  This program's output comprises congruence reports
 (also congruence centrality and congruence similarity counts) between solutions in the
 different logfiles just like those provided by HYLOG among solutions within a single
 logfile.  When a pattern A from one logfile is compared to a pattern B from the other,
 the <A,B> order picked by TWOLOGS ensures that B has no more columns than does A.
 A best-matching column of A is found for each column of B (if A has more factors than
 B, not all of its columns will be matched); and for each B-column, its congruence
 divergence from the matching A-column is reported together with the latter's index in
 A if wanted.
     The names of TWOLOGS SEE-files differ in structure from HYLOG SEE-names (the latter
 being simply SEELOG with the same extension, if any, as the analyzed HYBUF log).  The
 HYBALL-input files from which the two logfiles compared by TWOLOGS respectively derive
 will have been written by MODA or HYBLOCK under names of form <dat>*.* with a beginning
 segment <dat> (up to six characters) that is the same for both.  TWOLOGS' comparison
 report is written to file $2<dat>.x for the alphabetically lowest letter index "x" that
 will not overwrite any other SEE-file in this subdirectory.


         7.  Although program MODA ("Multiple-Output Dependency Analysis") for initial
 factor extraction contains very little innovation beyond its provision for treating
 selected variables as fixed-input factors, a couple of its features deserve mention.
 Most important is its reporting eigenvalues not merely of the data variables' observed
 correlation matrix but also of their common-parts covariance matrix entailed by each
 factor solution's communality estimates.  The variables' common-parts eigenvalues
 exhibit their variance structure much more cleanly than do their data-space eigen-
 values, and should be used to confirm or revise scree-based decisions about the number
 of factors to retain.
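
 The common-parts eigenvalues are simply those of the correlation matrix with the
 communality estimates replacing the unit diagonal; in Python (an illustration
 only, not MODA's code):

```python
import numpy as np

def common_parts_eigenvalues(R, communalities):
    """Eigenvalues (descending) of the reduced correlation matrix: the
    observed correlations with communality estimates on the diagonal."""
    Rc = np.array(R, dtype=float)
    np.fill_diagonal(Rc, communalities)
    return np.linalg.eigvalsh(Rc)[::-1]

# Two variables correlating .6 with communalities .6 give eigenvalues 1.2, 0.
R = [[1.0, 0.6], [0.6, 1.0]]
vals = common_parts_eigenvalues(R, [0.6, 0.6])
```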

      MODA also has a novel criterion for termination of principal-factoring iterations
 which, however, has no theoretical merit but merely compromises between two purer
 measures of solution convergence.  MODA's early programming took for its convergence
 criterion the largest shift in any one communality becoming less than a small TOL
 setting.  However, I found that communality estimates can often be many cycles away
 from convergence even when their subsequent changes produce virtually no further
 improvement in the standard error SE of the solution's data-covariance reproduction.
 On the other hand, SE's insensitivity to changes in individual solution elements when
 the solution is near asymptote makes SE alone a dubious measure on which to judge
 convergence unless its shift tolerance is set very low.  So MODA's current version
 considers both the shift S1-SE in standard error of reproduction (where S1 is SE of
 the preceding cycle) and the maximal change SHIFT in communality by stopping either
 when the number of iteration cycles exceeds an adjustable limit IMAX, or when

                 (SHIFT.LE.TOL*10 .AND. (S1-SE.LE.TOL/10 .OR. SHIFT.LE.TOL)) ,

 where adjustable tolerance TOL has default value .001. I am reasonably happy with this
 criterion's behavior--it usually converges in a half-dozen or so iterations--but I
 would be even happier if I or anyone else could develop a decent theory of stopping
 criteria for iterated factor extraction.  If such a theory already exists, I would
 very much appreciate its being brought to my attention.
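
 For clarity, the stopping rule reads as follows in Python (the IMAX default
 shown here is an assumption of mine for illustration, not MODA's actual value):

```python
def moda_should_stop(shift, s1, se, cycle, tol=0.001, imax=50):
    """MODA's termination test: stop once the cycle count exceeds IMAX, or
    when SHIFT (largest communality change) and S1-SE (gain in reproduction
    standard error over the preceding cycle) jointly satisfy the criterion."""
    if cycle > imax:
        return True
    return shift <= tol * 10 and (s1 - se <= tol / 10 or shift <= tol)

# Converged communalities alone suffice ...
done_a = moda_should_stop(shift=0.0005, s1=1.0, se=1.0, cycle=3)
# ... as does a modest SHIFT coupled with a negligible SE improvement.
done_b = moda_should_stop(shift=0.005, s1=0.50005, se=0.5, cycle=3)
```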


