Best-subset analysis with size-correction: SPSS 12-13 script
Pavel Klimov©
Description. This VBA script generates all combinations, or a subset of
combinations, of the independent variables and performs either canonical
variates analysis (discriminant function analysis) or logistic regression on each
combination. The output is saved in the working directory and can be analyzed
further in any spreadsheet application. You can modify this script for your own needs.
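The enumeration described above can be illustrated with a short sketch. It is written in Python rather than in the script's VBA and is not the script's own code; the function name enumerate_subsets and its parameters are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def enumerate_subsets(variables, min_size=1, max_size=None):
    """Yield every subset of `variables` with min_size <= size <= max_size."""
    if max_size is None:
        max_size = len(variables)
    for size in range(min_size, max_size + 1):
        for subset in combinations(variables, size):
            yield subset


# Three variables named in the SPSS default style (var00001, var00002, ...).
variables = ["var%05d" % i for i in range(1, 4)]
subsets = list(enumerate_subsets(variables))

# Three variables give 2**3 - 1 = 7 non-empty subsets.
for subset in subsets:
    print(" ".join(subset))
```

Each printed line corresponds to one analysis that a best-subset run would perform.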
Installation.
1. Open SPSS ver. 12 or above.
2. Go to menu File, select Script.
3. Paste the content of this file.
4. Save the script.
You can test the script using this data file.
Use of the script
1) Name your independent variables "var00001 ... var00010 ..." (the SPSS default).
2) Name your dependent variable "depend" (without the quotation marks).
3) Define the following variables:
Nvar - number of independent variables
LNTr - apply a logarithmic (base e) transformation? (True/False)
ExternValid - perform external validation? (True/False)
SubSetMin - if you want to obtain subsets of a particular size, enter the
lower size limit; otherwise enter 1
SubSetMax - if you want to obtain subsets of a particular size, enter the
upper size limit; otherwise enter the number of your independent variables
(should be equal to Nvar)
4) If you have a large number of independent variables (>12), SPSS may run
out of memory and terminate your analysis. To avoid this, a large analysis can
be divided into several runs, each performing a smaller number of iterations
(on my computer, about 8000). Define the following variables:
Prt=True - activates this option; Prt=False turns it off
StartRange=1 - the iteration number (from 1 to n) at which the run starts*
EndRange=8000 - the iteration number after which the run stops and writes
the results to disk*
* These settings will perform 8000 analyses. To conduct another 8000
analyses, set the variables again: StartRange=8001 and EndRange=16000
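To plan the StartRange/EndRange settings it helps to know how many analyses a run involves. The Python sketch below is not part of the script. It assumes that one iteration corresponds to one variable subset, and the function names count_analyses and batch_ranges are hypothetical. Under that assumption, 12 variables give 4095 subsets and 13 give 8191, which is consistent with the >12 threshold and the batch size of about 8000 mentioned above.

```python
from math import comb


def count_analyses(nvar, subset_min=1, subset_max=None):
    """Number of subsets of nvar variables with sizes subset_min..subset_max."""
    if subset_max is None:
        subset_max = nvar
    return sum(comb(nvar, k) for k in range(subset_min, subset_max + 1))


def batch_ranges(total, batch_size=8000):
    """Inclusive (StartRange, EndRange) pairs that together cover 1..total."""
    return [(start, min(start + batch_size - 1, total))
            for start in range(1, total + 1, batch_size)]


total = count_analyses(14)    # all non-empty subsets of 14 variables
ranges = batch_ranges(total)  # one (StartRange, EndRange) pair per run

for start, end in ranges:
    print("StartRange=%d EndRange=%d" % (start, end))
```

The last pair is clipped to the total number of subsets; whether the script itself accepts an EndRange larger than the total is not stated in these instructions.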
CVA
Define the following variables:
LR=False (tells the script to run CVA instead of logistic regression)
If ExternValid=True, you have to define another variable: SelectSet
By default, SelectSet=vbCrLf & "/SELECT=val(0)", where "0" is the code for
your analysis subset and "val" is the name of the variable that defines the
internal and external datasets (it must be created in your data matrix)
Logistic regression
Define the following variables:
LR=True (tells the script to run logistic regression instead of CVA)
If ExternValid=True, you have to define another variable: SelectSet
By default, SelectSet=vbCrLf & "/SELECT = val EQ 0", where "0" is the code
for your analysis subset and "val" is the name of the variable that defines the
internal and external datasets (it must be created in your data matrix)
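The two SelectSet defaults differ only in how the SELECT subcommand is written for each procedure. The following Python sketch puts them side by side. It is only an illustration, not the script's code, and the function name select_clause and its parameters are hypothetical. The string "\r\n" stands for what the VBA constant vbCrLf expands to.

```python
def select_clause(use_lr, variable="val", code=0):
    """Return the SELECT text that the instructions above assign to SelectSet."""
    if use_lr:
        # Form given above for logistic regression (LR=True)
        return "\r\n/SELECT = %s EQ %d" % (variable, code)
    # Form given above for CVA (LR=False)
    return "\r\n/SELECT=%s(%d)" % (variable, code)


cva_clause = select_clause(use_lr=False)
lr_clause = select_clause(use_lr=True)
print(repr(cva_clause))
print(repr(lr_clause))
```

If your selection variable or analysis code differs from the defaults, only the variable name and the code inside the clause change.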
Output processing
The following VBA script will process your output, leaving only the
variable names and the hit ratio value.
Installation:
1. Open MS Word
2. In the menu, select Tools-Macro-Macros, enter a name for the macro,
and press "Create".
3. Paste the content of this file.
Use:
1. Open the SPSS output file as text.
2. Run the script: in the menu, select Tools-Macro-Macros and select the
name of the macro you created in step 2 (Installation).
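For readers without MS Word, the same reduction can be sketched in Python. This is not the Word macro and may not match how it works. It assumes that the text output contains variable names in the var00001 style and, for CVA, a line of the form "85.0% of original grouped cases correctly classified". The function name extract_results and the sample text are invented for illustration.

```python
import re

VAR_RE = re.compile(r"\bvar\d{5}\b")
HIT_RE = re.compile(r"(\d+(?:\.\d+)?)% of .*cases correctly classified")


def extract_results(text):
    """Return (variable_names, hit_ratio) pairs, one per hit-ratio line found."""
    results = []
    current_vars = []
    for line in text.splitlines():
        # Collect variable names seen since the previous hit-ratio line.
        for name in VAR_RE.findall(line):
            if name not in current_vars:
                current_vars.append(name)
        match = HIT_RE.search(line)
        if match:
            results.append((current_vars, float(match.group(1))))
            current_vars = []
    return results


# Invented sample text, only to exercise the function.
sample = """\
DISCRIMINANT /GROUPS=depend(0 1) /VARIABLES=var00001 var00003
a. 85.0% of original grouped cases correctly classified.
DISCRIMINANT /GROUPS=depend(0 1) /VARIABLES=var00002
a. 72.5% of original grouped cases correctly classified.
"""
results = extract_results(sample)
for names, hit_ratio in results:
    print(" ".join(names), hit_ratio)
```

Logistic regression output reports the hit ratio differently, so the pattern in HIT_RE would need to be adapted for it.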
An outdated page that generates command syntax for datasets
with a small number of independent variables can be found here.