5.9.3 Prediction and residual values
All regression variants in microdata.no have associated commands that generate, among other things, residual and prediction values. These values can be used to analyze the spread of the data and to test regression models. Prediction values can also be used as input for further analyses.
The commands have the same name as the corresponding regression command plus the suffix -predict.
Syntax:
poisson-predict <variable> <variable list> [if <condition>] [, <options>]
negative-binomial-predict <variable> <variable list> [if <condition>] [, <options>]
The variables are entered in the same way as for the associated regression model, which is run with the poisson and negative-binomial commands, respectively.
The following values can be retrieved: prediction values and residual values.
You choose which values to generate through options. Each run produces a set of variables containing the requested values. By default, the first value type listed above (prediction values) is generated, but it is still recommended to specify this through options, since you can then also set the names of the generated variables inside the parentheses, as shown in the syntax example below. If you run several predict commands, you must give the generated variables new names in each run.
The generated variables can be used as input for further analyses or displayed graphically. The relevant graphical commands are hexbin and histogram. By running histogram on the residual variable, you can check how the residual values are distributed. The hexbin command can also be used to create anonymized scatterplots in which two sets of values are combined.
Examples of retrieval and display of actual prediction and residual values:
Note that the residuals are not normally distributed, unlike in an OLS regression. This is expected for count regressions.
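A minimal sketch of such a run could look as follows. It assumes a dataset that already contains a count variable and two explanatory variables. The variable names children, age and income are hypothetical, as are the generated names pred, res, pred_nb and res_nb. The option names predicted() and residuals() are an assumption and should be checked against help poisson-predict before use.

```
// Hypothetical variables: children (count outcome), age, income
poisson children age income

// Assumed option names; the generated variables are named inside the parentheses
poisson-predict children age income, predicted(pred) residuals(res)

// A second predict command needs new names for its generated variables
negative-binomial-predict children age income, predicted(pred_nb) residuals(res_nb)

// Distribution of the Poisson residuals
histogram res

// Anonymized scatterplot combining predicted values and residuals
hexbin pred res
```

Naming the variables explicitly in each run makes it clear which model a given set of prediction or residual values comes from.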
For more details, use the commands help poisson-predict and help negative-binomial-predict.