How to use parental information in analysis

The database contains variables holding the father's and mother's personal ID numbers, which can be used to merge parental information into an individual-level dataset.

The merge command is used to merge datasets. By default, it matches on the identification key variable of the source dataset. This can be overridden with the on option, which names the variable in the target dataset to match against instead.
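To make the effect of the on option concrete, the following Python sketch mimics the lookup that such a merge performs. It is only a conceptual illustration, not microdata.no syntax: the function name merge_on, the ID values, and the income figures are all hypothetical, and it assumes that rows without a match in the source dataset are left with a missing value.

```python
# Conceptual illustration only; this is not microdata.no syntax.
# All names and values below are hypothetical.

def merge_on(target, source, key, new_column):
    """Attach source values to target rows by looking up row[key] in source.

    target:     list of dicts, one per person in the main dataset
    source:     dict mapping a person's own ID to the value to merge in
    key:        column in target holding the ID to look up (e.g. the father's ID)
    new_column: name of the column that receives the merged value
    Rows whose key has no match in source get None (assumed missing value).
    """
    for row in target:
        row[new_column] = source.get(row[key])
    return target


# Main dataset: one row per person, each with a link to the father.
persondata = [
    {"id": 1, "idnr_father": 10},
    {"id": 2, "idnr_father": 10},  # sibling of person 1
    {"id": 3, "idnr_father": 99},  # father not present in the source data
]

# Source dataset keyed by its own ID: fathers and their income.
income_father = {10: 450000, 11: 380000}

merge_on(persondata, income_father, key="idnr_father", new_column="income_father")

for row in persondata:
    print(row)
```

In this sketch, persons 1 and 2 share a father and therefore receive the same value, while person 3 has no match and is left missing.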

In the example below, separate datasets are created for fathers and mothers. Their variables are then merged into the main individual-level dataset via the key variables idnr_father and idnr_mother, respectively.

//Connect to the database
require no.ssb.fdb:1 as fdb1

//Create a main dataset with links to fathers and mothers
create-dataset persondata
import fdb1/INNTEKT_WYRKINNT 2010-01-01 as income
import fdb1/BEFOLKNING_KJOENN as sex
import fdb1/NUDB_BU 2010-01-01 as edu
import fdb1/BEFOLKNING_FAR_FNR as idnr_father
import fdb1/BEFOLKNING_MOR_FNR as idnr_mother

//Import data on fathers and merge into main dataset
create-dataset fatherdata
import fdb1/INNTEKT_WYRKINNT 2010-01-01 as income_father
import fdb1/NUDB_BU 2010-01-01 as edu_father
merge income_father edu_father into persondata on idnr_father

//Import data on mothers and merge into main dataset
create-dataset motherdata
import fdb1/INNTEKT_WYRKINNT 2010-01-01 as income_mother
import fdb1/NUDB_BU 2010-01-01 as edu_mother
merge income_mother edu_mother into persondata on idnr_mother

//Prepare the variables, inspect them, and perform a basic linear regression analysis to test for covariation with parental income
use persondata
//Dummy variable: 1 if sex is coded '1', otherwise 0
generate male = 0
replace male = 1 if sex == '1'

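//Convert the education code to numeric and create a dummy for codes from 700000 up to, but not including, 999999
destring edu, force
generate highedu = 0
replace highedu = 1 if edu >= 700000 & edu < 999999
//Keep highedu missing where edu is missing
replace highedu = edu if sysmiss(edu)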

//Same recoding for the father's education
destring edu_father, force
generate highedu_father = 0
replace highedu_father = 1 if edu_father >= 700000 & edu_father < 999999
replace highedu_father = edu_father if sysmiss(edu_father)

//Same recoding for the mother's education
destring edu_mother, force
generate highedu_mother = 0
replace highedu_mother = 1 if edu_mother >= 700000 & edu_mother < 999999
replace highedu_mother = edu_mother if sysmiss(edu_mother)

summarize income income_father income_mother
histogram income_father, percent
histogram income_mother, percent
correlate income_father income_mother
tabulate highedu_father highedu_mother

regress income male income_father income_mother highedu highedu_father highedu_mother