Perform a random sub-selection from a total population

The example below demonstrates how to draw a random sub-selection from a large total population. The sample command can be used for this purpose.

The first input parameter defines the sample size. If it is a decimal number between 0.0 and 1.0, the corresponding share of the population is extracted (for example, 0.1 extracts 10%). If it is a positive integer > 1000, a random sample consisting of this particular number of individuals is extracted.

The last input parameter is a positive integer of your choice, used as a seed number. Using the same seed ensures that the same individuals are selected each time the sample command is re-run. Choosing a new seed number extracts a new random selection of individuals.
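
To make the role of the two parameters concrete, here is a minimal Python sketch of seeded sampling. It is only an illustration of the general idea, not how the sample command is implemented: the function draw_sample, the list of person identifiers, and the rounding of the fractional share are all assumptions made for this example.

```python
import random


def draw_sample(population, size, seed):
    """Hypothetical stand-in for a seeded sample command (illustration only)."""
    rng = random.Random(seed)  # a private generator, fully determined by the seed
    if isinstance(size, float) and 0.0 < size < 1.0:
        # Fractional size: interpreted here as a share of the population (assumption)
        k = round(len(population) * size)
    else:
        # Integer size: interpreted as an exact number of individuals
        k = int(size)
    return rng.sample(population, k)


# Hypothetical population of person identifiers
population = list(range(100_000))

ten_percent = draw_sample(population, 0.1, 999)     # share of the population
fixed_a = draw_sample(population, 5000, 888)        # fixed number of individuals
fixed_a_again = draw_sample(population, 5000, 888)  # same seed, same individuals
fixed_b = draw_sample(population, 5000, 950)        # new seed, new selection
```

In this sketch, fixed_a and fixed_a_again contain the same individuals because they share a seed, while fixed_b is drawn with a different seed and therefore gives a different selection.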

//Creates a dataset consisting of all Norwegian residents as of 1/1 2015, and extracts a 10% random sub-sample
create-dataset totalpop
import BEFOLKNING_REGSTAT 2015-01-01 as registerstatus15
keep if registerstatus15 == '1'
sample 0.1 999

//Creates a dataset consisting of all Norwegian residents as of 1/1 2015, and extracts a random sub-sample consisting of 5000 individuals
create-dataset totalpop2
import BEFOLKNING_REGSTAT 2015-01-01 as registerstatus15
keep if registerstatus15 == '1'
sample 5000 888

//Creates a dataset consisting of all Norwegian residents as of 1/1 2015, and extracts a new random sub-sample consisting of 5000 individuals (different from the previous sample)
create-dataset totalpop3
import BEFOLKNING_REGSTAT 2015-01-01 as registerstatus15
keep if registerstatus15 == '1'
sample 5000 950