Restructure datasets from long to wide format

The script below demonstrates how to use reshape-from-panel in practice.

First, a regular data set of the wide type is created, consisting of a 1% random sample of everyone who was registered resident in Norway as of 1/1 2019. The fully converted long data set will include the variable date@panel which contains the value at the sublevel of each single unit. In this case the values 19 and 20. See separate analysis example for explanation of what happens when you use reshape-to-panel.

Then reshape-from-panel is used to convert the data set to wide format again. This will then give us the same data set that we started with. Note that no variables or prefix are specified when using this command, and that all variables are converted to wide format with a suffix based on the associated sublevel. This also applies to gender, although this is fixed information. As in the example below, you can remove the redundant gender variables and rename them to “gender”.

The last part of the example creates a classic panel dataset using the command import-panel, and then uses reshape-from-panel to convert the data to wide format. Note that here the suffixes are different. When using tabulate-panel or summarize-panel it will look like the sublevel has values of the type “YYYY-MM-DD”, but this only applies as a display format. The actual values for date@panel in this case use reference dates as value format (number of days measured from 1/1 1970). This can be solved by renaming the variable names with the command rename. Note that the converted dataset will be identical to the result of the conversion of the dataset created using reshape-to-panel.

 require no.ssb.fdb:23 as db

//Create a wide dataset consisting of 1% of all residents per 1/1 2019
create-dataset wide
import db/BEFOLKNING_STATUSKODE 2019-01-01 as regstat19
keep if regstat19 == '1'
sample 0.01 333
import db/BEFOLKNING_STATUSKODE 2020-01-01 as regstat20
import db/SIVSTANDFDT_SIVSTAND 2019-01-01 as civstat19
import db/SIVSTANDFDT_SIVSTAND 2020-01-01 as civstat20
import db/BEFOLKNING_KJOENN as gender
import db/INNTEKT_WLONN 2019-01-01 as wage19
import db/INNTEKT_WLONN 2020-01-01 as wage20

tabulate regstat19, missing
tabulate regstat20, missing
tabulate civstat19, missing
tabulate civstat20, missing
tabulate gender, missing

summarize wage19 wage20 

//Restructure to panel/long format
reshape-to-panel regstat civstat wage

tabulate-panel regstat, missing
tabulate-panel civstat, missing
tabulate-panel gender, missing
summarize-panel wage

//Restructure back to wide format
reshape-from-panel
drop gender20
rename gender19 gender

tabulate regstat19, missing
tabulate regstat20, missing
tabulate civstat19, missing
tabulate civstat20, missing
tabulate gender, missing
summarize wage19 wage20


//Create a new panel dataset for the same population through import-panel
clone-units wide paneltest
use paneltest
import-panel db/BEFOLKNING_STATUSKODE db/SIVSTANDFDT_SIVSTAND db/INNTEKT_WLONN db/BEFOLKNING_KJOENN 2019-01-01 2020-01-01
rename BEFOLKNING_STATUSKODE regstat
rename SIVSTANDFDT_SIVSTAND civstat
rename INNTEKT_WLONN wage
rename BEFOLKNING_KJOENN gender

tabulate-panel regstat, missing
tabulate-panel civstat, missing
tabulate-panel gender, missing
summarize-panel wage

//Restructure to wide format
reshape-from-panel
drop gender18262
rename gender17897 gender

rename regstat17897 regstat19
rename regstat18262 regstat20

rename civstat17897 civstat19
rename civstat18262 civstat20

rename wage17897 wage19
rename wage18262 wage20

summarize wage19 wage20