&in table
&in is the table of input data.
This table is not modified by the process.
&in._var table
The &in._var table is generated directly from the input table of the same name.
It can differ from the input only in the "utilizzo" column,
because variables with only one class are excluded, or in the presence
of new rows, which result from the columns generated for "K"
variables.
&in._cond table
The &in._cond table, like the previous one, is derived from the input table
of the same name: it differs only in the presence
of new rows when there are "K" type variables.
&in._esccon table
The &in._esccon table is a copy of the input table of the same name;
here too, new rows may appear when there are "K" variables.
&passi table
The &passi table (optional) is exactly the same table given as
input.
&in._pt table
The &in._pt table is a copy of the input data,
to which the program adds some columns used during classification.
In particular, the process generates new columns for K, O and X
variables according to the following criteria:
&in._dcorr table
The &in._dcorr table is a copy of the previous dataset with some additional columns
derived from the final regression model.
It contains the value estimated by the model (column predetti) with its 95%
confidence interval (columns inf and sup), the standardized Pearson
residual (column residui) and the estimated value of the pre-link function (column
xbet).
&in._mcorr table
&in._mcorr table contains estimates of model parameters.
The columns are:
- Parameter, indicating the variable of the model,
- Level1, which contains the class of the variable (for qualitative variables),
- DF, which indicates the degrees of freedom for the parameter,
- Estimate, which contains the estimate of the parameter,
- StdErr, which contains the standard error of the estimate,
- LowerWaldCL, indicating the lower limit of the Wald confidence interval
for the estimate,
- UpperWaldCL, indicating the upper limit of the Wald confidence interval
for the estimate,
- ChiSq, which contains the value of the Chi-square statistic
used to determine the significance of the parameter,
- ProbChiSq, which contains the p-value of the test, i.e. the complement of the
cumulative distribution function of the Chi-square statistic described above.
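As a rough illustration of how these columns relate to each other, the sketch below recomputes the Wald limits, the Chi-square statistic and its tail probability from an estimate and its standard error. It is written in Python rather than SAS and rests on assumptions not stated on this page: a 95% confidence level, a parameter with DF = 1, and ProbChiSq read as the upper-tail probability of the Chi-square distribution. The function name wald_summary and the sample numbers are hypothetical.

```python
import math

def wald_summary(estimate, std_err, z=1.959964):
    """Recompute Wald-type quantities for one parameter with DF = 1.

    z is the normal quantile for a 95% two-sided interval (an assumption;
    the confidence level used by the procedure is not stated here).
    """
    chi_sq = (estimate / std_err) ** 2
    return {
        "Estimate": estimate,
        "StdErr": std_err,
        "LowerWaldCL": estimate - z * std_err,
        "UpperWaldCL": estimate + z * std_err,
        "ChiSq": chi_sq,
        # Upper-tail probability P(X > chi_sq) for a Chi-square with 1 d.f.
        "ProbChiSq": math.erfc(math.sqrt(chi_sq / 2.0)),
    }

# Hypothetical parameter: estimate 0.5 with standard error 0.25
row = wald_summary(estimate=0.5, std_err=0.25)
```

A small ProbChiSq suggests that the parameter differs significantly from zero; in this hypothetical case the statistic is 4 and the probability is a little below 0.05.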
&in._smcorr table
The &in._smcorr table contains some statistical indicators used to measure the goodness of fit of the model
(Log-Likelihood,
AIC, ...).
The columns that compose this table are:
- Criterion, which contains the calculated indicator,
- DF, which contains the degrees of freedom for the indicator,
- Value, which contains the value of the statistic,
- ValueDF, which contains the value of the indicator divided by degrees of freedom.
&in._corr4 table
The &in._corr4 table reports the correlation and pseudo-correlation values calculated
between pairs of variables, flagging the values that exceed the threshold set by
the user (used to determine whether two variables are correlated).
Note that two variables analyzed using the
value derived from the Simpson concentration index will appear in
this table only if they are correlated.
The columns of this table are:
- v1, which contains the first variable to analyze,
- v2, which contains the second variable,
- corr, which is the correlation value calculated,
- tipo_corr, which indicates the type of correlation used (to decode
this field, see the types of variables),
- ut_v1, which indicates the
type of the first variable,
- ut_v2, which indicates the
type of the second variable,
- corr2, a dummy variable set to 1 if the two variables
are correlated (that is, if the calculated correlation is greater than the threshold set
by the user) and to 0 if they are not.
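The corr2 rule can be sketched as follows, in Python rather than SAS. The sketch follows the description literally (flag set when corr is greater than the threshold); whether the procedure compares the absolute value of the correlation is not stated here, so that is left out. The function flag_correlated, the variable names and the threshold value are hypothetical.

```python
def flag_correlated(rows, threshold):
    """Return copies of the rows with corr2 = 1 when corr > threshold, else 0."""
    flagged = []
    for row in rows:
        new_row = dict(row)  # leave the input rows untouched
        new_row["corr2"] = 1 if row["corr"] > threshold else 0
        flagged.append(new_row)
    return flagged

# Hypothetical pairs of variables with their calculated correlation
pairs = [
    {"v1": "age", "v2": "income", "corr": 0.82},
    {"v1": "age", "v2": "region", "corr": 0.15},
]
result = flag_correlated(pairs, threshold=0.7)
```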
&in._kcl table
The &in._kcl table contains the numeric values used in the grouping process of
K variables.
These variables are originally character fields; the procedure converts them into numbers
(classified by concentration) in order to process the new variable as an X
or O type.
The columns are:
- var_orig, indicating the variable,
- cl_orig, which indicates the original value of the new variable,
- cl_nuova, indicating the numerical value associated with the old character value.
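A table with these three columns can be used as a lookup to recode character values into their numeric classes. The following Python sketch shows the idea under assumed data: the variable name region, its values and the helper names build_lookup and recode are all hypothetical, and the real conversion is performed by the procedure itself.

```python
# Hypothetical content of a table shaped like &in._kcl
kcl = [
    {"var_orig": "region", "cl_orig": "NORTH", "cl_nuova": 1},
    {"var_orig": "region", "cl_orig": "SOUTH", "cl_nuova": 2},
]

def build_lookup(kcl_rows):
    """Map (variable name, original character value) to the numeric class."""
    return {(r["var_orig"], r["cl_orig"]): r["cl_nuova"] for r in kcl_rows}

def recode(record, lookup):
    """Replace character values that have a numeric class; keep the rest as is."""
    return {name: lookup.get((name, value), value) for name, value in record.items()}

lookup = build_lookup(kcl)
recoded = recode({"region": "SOUTH", "age": 41}, lookup)
```

Values without an entry in the lookup, such as age in this example, are passed through unchanged.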
&in._kvar table
&in._kvar table contains the details of categorized variables that were introduced in the model.
The columns of this table are:
- var, containing the name of the column,
- giro, indicating the step in which the program performed the compression,
- kvar, listing the non-SAS category of the original value,
- kvar_b, which indicates the starting value of the class in numeric format,
- kvar_c, which contains the post-grouping class in numeric format,
- kvar_d, containing the post-grouping class in natural language (non-SAS).
&in._mod table
&in._mod table contains the variables that were introduced in the model.
The columns of this table are:
- nome, containing the name of the variable,
- utilizzo, indicating how the column was used.
&in._po table
The &in._po table contains a list of the variables that could be used in the model
(probably not introduced because of low significance).
The columns of this table are:
- utilizzo, which describes the type of the variable,
- nome, which contains the variable name,
- po, a dummy variable set to 1 if the column is a potential
new variable of the model and to 0 if the column cannot be introduced in the model.
&in._passi table
The &in._passi table contains the steps (stepwise-backwise) performed by the engine
to estimate the final model.
The columns of the table are:
- passo, which identifies the step,
- modello, which contains the alphabetical list of variables
used in each step.
N.B.: the variables in the modello column may differ from
the variables given as input to the process, as explained previously.
&in._zgri table
The &in._zgri table summarizes the model, which makes it easier to apply it to new data.
You can find an example on the page with sample code.
The columns of this table are:
- nome, which contains the variable name,
- level1, which contains the category of the variable
(filled in only for qualitative variables),
- kvar_d, which contains a non-SAS indication of the category,
- estimate, which is the estimated value for this variable,
- condizione, which defines the SAS condition to identify the class
in the dataset,
- df, which indicates the degrees of freedom of this variable,
- utilizzo, which describes the type of variable.
N.B.: in the case of user-defined classes (set by the user in the appropriate
table), the conditions might not be mutually exclusive.
For this reason, the order of the rows in this table is significant: the user's classes should be at
the end of each group.
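The following Python sketch illustrates why the row order matters when conditions overlap. It is only an illustration under assumptions: the real table stores SAS conditions as text, modelled here as Python predicates, and this page does not say which matching row takes precedence, so the choice is left as a parameter. The variable age, its classes, the estimates and the helper names matching_rows and pick_estimate are all hypothetical.

```python
# Hypothetical rows in table order; "condizione" is modelled as a Python
# predicate because the real table stores SAS conditions as text.
zgri = [
    {"nome": "age", "level1": "1", "estimate": 0.10,
     "condizione": lambda r: r["age"] < 30},
    {"nome": "age", "level1": "2", "estimate": 0.40,
     "condizione": lambda r: r["age"] >= 30},
    # User-defined class, listed last in its group; it overlaps with level 2.
    {"nome": "age", "level1": "U", "estimate": 0.90,
     "condizione": lambda r: r["age"] >= 60},
]

def matching_rows(rows, record, nome):
    """Return the rows of one variable whose condition holds, in table order."""
    return [row for row in rows
            if row["nome"] == nome and row["condizione"](record)]

def pick_estimate(rows, record, nome, prefer_first=True):
    """Pick one estimate when several conditions match.

    Which row should win is an assumption of this sketch, hence the switch.
    """
    hits = matching_rows(rows, record, nome)
    if not hits:
        return None
    return hits[0]["estimate"] if prefer_first else hits[-1]["estimate"]

record = {"age": 65}
hits = matching_rows(zgri, record, "age")
```

For this record two rows match, so the selected estimate depends on the precedence rule; the sample code page referenced above is the place to check which rule the procedure expects.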
Creation date: 17 Sep 2010
Translation date: 30 Dec 2012
Last change: 18 May 2013
Translation reviewed by
Giulia Di Lallo