Using instrumental variables in regression analysis
ivregress can be used to specify instrumental variables. This is relevant if you suspect multicollinearity (correlation between at least two of the independent variables); more generally, instrumental variables are used when an independent variable is suspected of being correlated with the error term. The instrumented variable and its instruments are defined inside the parentheses expression, written as (instrumented variable = instruments). In the example below, wealth is the instrumented variable and age is its instrument. You can list as many instruments as you want. For example, if you also think that place of residence (= Oslo) affects wealth, you can use the parentheses expression (wealth = age oslo). In principle, ivregress treats all independent variables as instruments, except the instrumented variable.
require no.ssb.fdb:1 as fdb1
create-dataset ivtest

import fdb1/INNTEKT_WLONN 2013-05-05 as salary
import fdb1/BEFOLKNING_FOEDSELS_AAR_MND as birth
import fdb1/BEFOLKNING_KJOENN as sex

generate age = 2013 - int(birth / 100)
drop if age < 0

generate male = 0
replace male = 1 if sex == '1'

import fdb1/INNTEKT_BRUTTOFORM 2013-05-05 as wealth

// First a regular linear regression
regress salary age male wealth

// Suspected correlation between age and wealth: instrument wealth with age
ivregress salary male (wealth = age)

// In addition to comparing the two outputs, we need to check for multicollinearity and normally distributed residuals
correlate wealth age
regress-predict salary age male wealth, residuals(res1)
ivregress-predict salary male (wealth = age), residuals(res2)
histogram res1
histogram res2
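
To make the idea behind an instrumented regression more concrete, the following is a minimal Python sketch of two-stage least squares, the general technique that instrumental-variable estimators of this kind are typically based on. It is not the microdata.no implementation of ivregress. The function names (ols, two_stage_least_squares) and the toy numbers are made up for illustration: y plays the role of salary, x the instrumented variable (wealth) and z the instrument (age). The sketch handles a single instrument only and leaves out extra independent variables such as male, as well as standard errors.

```python
def ols(x, y):
    """Return (intercept, slope) of a simple least-squares fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def two_stage_least_squares(y, x, z):
    """Estimate y = b0 + b1 * x using z as the instrument for x."""
    # Stage 1: regress the instrumented variable x on the instrument z
    # and keep the fitted values.
    a0, a1 = ols(z, x)
    x_hat = [a0 + a1 * v for v in z]
    # Stage 2: regress y on the fitted values instead of on x itself.
    return ols(x_hat, y)


# Hypothetical toy data. u is an unobserved disturbance that is, by
# construction, uncorrelated with z but feeds into both x and y.
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
u = [1.0, -2.0, 1.0, 1.0, -2.0, 1.0]
x = [2.0 + 0.5 * zi + ui for zi, ui in zip(z, u)]
y = [1.0 + 3.0 * xi + 2.0 * ui for xi, ui in zip(x, u)]

ols_intercept, ols_slope = ols(x, y)
iv_intercept, iv_slope = two_stage_least_squares(y, x, z)

print("OLS slope:", round(ols_slope, 3))
print("IV slope: ", round(iv_slope, 3))
```

Because u is built to be uncorrelated with z, the instrumented estimate recovers the slope of 3 used to generate y, while the ordinary regression of y on x overstates it. Real data give no such guarantee: the result depends on the instrument actually being related to the instrumented variable and unrelated to the error term.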