Input data

The procedure reads several input tables. This page explains how they must be named and structured.
All of these datasets, except the first one listed below (&in), may be updated during the process.
If you do not want to recreate them afterwards, we recommend backing up the files before starting.
The table names share a common prefix (a macro variable), which the user passes through the input parameter in.
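
The backup recommended above could be sketched as follows. This is only an illustration: it assumes the tables described below live in the WORK library and that all three exist, and the libref bkp and its path are hypothetical names, not part of the procedure.

```sas
/* Sketch only: bkp and its path are hypothetical; adjust to your environment. */
%let in = a;                        /* same value passed to the in parameter */
libname bkp '/path/to/backup';      /* hypothetical backup location          */

/* Copy the tables that the process may update (assumed to be in WORK). */
proc datasets library=work nolist;
    copy out=bkp;
    select &in._var &in._cond &in._esccon;
run;
quit;
```

If one of the selected tables does not exist, remove it from the SELECT statement.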

&in table
The &in table must contain the input data.
Its column names and formats must match what is specified in the other tables.
By default, this table is named a; to use a different name, change the corresponding input parameter.

&in._var table
The &in._var table describes the columns of the input table and defines their uses and formats.

The structure comes directly from the output dataset of SAS PROC CONTENTS. The columns, as they appear in the example further down, are name, type, format, formatl, formatd and utilizzo.

The last column (utilizzo) is the only one that does not come directly from PROC CONTENTS. Its value varies as described in the page on the types of variables accepted.
If the input parameter takes the default value a, this table will be named a_var.

An example of this table (for a model that estimates weight from height and age) might look as follows:

name           type  format  formatl  formatd  utilizzo
First name     2             0        0        i
Surname        2             0        0        i
height         1             5        0        x
date_of_birth  1     DATE    9        0        x
weight         1             1        0        r
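
A table with this structure could be produced along the following lines. This is a minimal sketch, assuming the input table is named a (the default) and uses the columns from the example; the way the utilizzo codes are assigned here is only an illustration, and you may prefer to fill them in by hand.

```sas
/* Sketch only: derive a_var from the PROC CONTENTS output of table a. */
proc contents data=a
              out=a_var(keep=name type format formatl formatd)
              noprint;
run;

/* Add the usage column; the codes mirror the example table above. */
data a_var;
    set a_var;
    length utilizzo $1;
    select (lowcase(name));
        when ('height', 'date_of_birth') utilizzo = 'x';
        when ('weight')                  utilizzo = 'r';
        otherwise                        utilizzo = 'i';
    end;
run;
```

PROC CONTENTS writes type as 1 for numeric and 2 for character columns, which matches the values in the example.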

&in._cond table
The &in._cond table defines the user-defined classes into which the process must group the values of a variable (this applies only to K, O and X variables).

The table has three character columns: variabile (the variable name), condizione (the condition that identifies the class) and classe (the class label).

If the input parameter takes the default value a, this table will be named a_cond.

An example of this table is:

variabile      condizione                      classe
height         height <= 100 and height ^= .   Under 100
height         height >= 200                   Over 200
...            ...                             ...
date_of_birth  date_of_birth <= '01JAN1900'd   Date of birth missing
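
The rows shown in the example could be entered with a DATA step like the one below. It is a sketch under the assumption that the prefix is a (the default); the column lengths are arbitrary choices, not requirements stated by the procedure.

```sas
/* Sketch only: the lengths below are assumptions, not documented requirements. */
data a_cond;
    length variabile $32 condizione $200 classe $50;

    variabile  = 'height';
    condizione = 'height <= 100 and height ^= .';
    classe     = 'Under 100';
    output;

    variabile  = 'height';
    condizione = 'height >= 200';
    classe     = 'Over 200';
    output;

    /* Double quotes outside so the date literal keeps its single quotes. */
    variabile  = 'date_of_birth';
    condizione = "date_of_birth <= '01JAN1900'd";
    classe     = 'Date of birth missing';
    output;
run;
```

Assignment statements are used instead of DATALINES so that the conditions, which contain spaces and quotes, are stored exactly as written.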

&in._esccon table
The &in._esccon table lets the user force the program to treat two variables as correlated; the procedure then applies conditional exclusion to the variables involved.

For example, to make sure that the variables height and date_of_birth cannot be used in the model at the same time, insert a row like this:

var1 var2
height date_of_birth

Therefore, if the variable height is added to our regression, then the procedure will exclude the variable date_of_birth from the list of potential variables in the next steps (and vice versa).
If the input parameter takes the default value a, this table will be named a_esccon.
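
The one-row table from this example could be created as follows. This is a sketch assuming the default prefix a; the column length is an arbitrary choice.

```sas
/* Sketch only: one row per pair of variables that must not coexist in the model. */
data a_esccon;
    length var1 var2 $32;
    var1 = 'height';
    var2 = 'date_of_birth';
    output;
run;
```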

As in our example, the table has two character columns: var1 and var2.

&passi table
The &passi table (named by the corresponding input parameter) gives the preferred sequence of variables that the program uses to build the model.
This table is not mandatory: if the macro parameter passi is set to NIENTE (the default value), the procedure does not use any table of "preferred variables".

The columns of the table are passo (the step number) and modello (the variables in the model at that step).

This structure comes directly from the output table &in._passi, so you can simply start from a pre-existing output to generate a new model.

In our example, the table could look like this:

passo  modello
1      cl_height
2      cl_height cl_date_of_birth

In this case, the program tries to insert the variable cl_height (derived from the variable height) in the first regression step, and then the variable cl_date_of_birth (derived from date_of_birth). Only then does the process consider the other input variables.
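
The table from this example could be built by hand as sketched below. The table name my_steps is hypothetical, and the column types and lengths are assumptions; the name would then be passed to the procedure through the passi parameter.

```sas
/* Sketch only: my_steps is a hypothetical name to pass as the passi parameter. */
data my_steps;
    length passo 8 modello $200;
    passo = 1; modello = 'cl_height';                  output;
    passo = 2; modello = 'cl_height cl_date_of_birth'; output;
run;
```

Each row lists all the variables in the model at that step, so step 2 repeats cl_height before adding cl_date_of_birth.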

For a practical example of how the tables described above are used, see the relevant page; in particular, if you have any doubts about the &passi table, see its specific section.


Creation date: 17 Sep 2010
Translation date: 30 Dec 2012
Last change: 17 May 2013

Translation reviewed by Giulia Di Lallo