Using instrumental variables in regression analysis

The command ivregress fits a regression with instrumental variables. This is typically relevant when an independent variable is suspected of being endogenous, that is, correlated with the error term. The example below is motivated by a suspected correlation between two of the independent variables (age and wealth). The instrumented variable and its instruments are specified inside the parentheses, in the form (instrumented variable = instruments). In the example below, wealth is the instrumented variable and age is the instrument. You can use as many instruments as you want. For example, if you also think that place of residence (= Oslo) affects wealth, you can use the parentheses expression (wealth = age oslo). Note that ivregress also treats all the other independent variables as instruments; only the instrumented variable is excluded.

require no.ssb.fdb:1 as fdb1
create-dataset ivtest
import fdb1/INNTEKT_WLONN 2013-05-05 as salary
import fdb1/BEFOLKNING_FOEDSELS_AAR_MND as birth
import fdb1/BEFOLKNING_KJOENN as sex
generate age = 2013 - int(birth /100)
drop if age < 0
generate male = 0
replace male = 1 if sex == '1'
import fdb1/INNTEKT_BRUTTOFORM 2013-05-05 as wealth

//First a regular linear regression
regress salary age male wealth

//Suspect correlation between age and wealth. Use wealth as the instrumented variable, with age as the instrument
ivregress salary male (wealth = age)

//In addition to comparing the two outputs, we need to check for multicollinearity and whether the residuals are normally distributed

correlate wealth age
regress-predict salary age male wealth, residuals(res1)
ivregress-predict salary male (wealth = age), residuals(res2)

histogram res1
histogram res2
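
To illustrate what a specification such as (wealth = age) does, here is a minimal Python sketch of two-stage least squares on synthetic data. It is not the microdata.no implementation, and it uses no register data: the sample size, the coefficients and the helper names (solve, ols, two_stage_least_squares) are all assumptions made for the illustration. The synthetic wealth variable is built to be correlated with the error term, so ordinary regression overstates its effect, while the instrumented estimate should land near the assumed true value of 2.0.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def two_stage_least_squares(y, endog, exog_rows, instr_rows):
    """Return coefficients ordered as the exogenous columns, then the instrumented variable."""
    # Stage 1: regress the instrumented variable on the exogenous
    # regressors plus the instruments, and keep the fitted values.
    z = [e + i for e, i in zip(exog_rows, instr_rows)]
    gamma = ols(z, endog)
    fitted = [sum(g * v for g, v in zip(gamma, row)) for row in z]
    # Stage 2: regress the outcome on the exogenous regressors and
    # the fitted values from stage 1.
    x2 = [e + [f] for e, f in zip(exog_rows, fitted)]
    return ols(x2, y)


# Synthetic data. Every number below is an assumption for illustration only.
rng = random.Random(0)
n = 5000
age = [rng.uniform(20, 70) for _ in range(n)]
male = [1.0 if rng.random() < 0.5 else 0.0 for _ in range(n)]
v = [rng.gauss(0, 1) for _ in range(n)]      # unobserved shock to wealth
noise = [rng.gauss(0, 1) for _ in range(n)]  # independent noise in salary
wealth = [0.05 * a + 0.3 * m + vi for a, m, vi in zip(age, male, v)]
# The salary error contains v, so wealth is correlated with the error term.
salary = [1.0 + 0.5 * m + 2.0 * w + 0.9 * vi + e
          for m, w, vi, e in zip(male, wealth, v, noise)]

exog = [[1.0, m] for m in male]  # constant and male
instr = [[a] for a in age]       # age used as the instrument for wealth

# Corresponds in spirit to: regress salary male wealth
ols_coef = ols([[1.0, m, w] for m, w in zip(male, wealth)], salary)
# Corresponds in spirit to: ivregress salary male (wealth = age)
iv_coef = two_stage_least_squares(salary, wealth, exog, instr)

print("OLS  coefficient on wealth:", round(ols_coef[2], 3))
print("2SLS coefficient on wealth:", round(iv_coef[2], 3))
```

The sketch returns coefficients only. Standard errors taken directly from the second-stage regression would not be correct for an instrumental-variable estimate, so they are left out here.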