RDD analysis

The script below demonstrates how to use the rdd command for regression discontinuity design (RDD) analyses.

An RDD model is best suited to cases where an outcome variable jumps visibly when another, continuous variable crosses a threshold value. Real data that fits this pattern cleanly is hard to find, so the example below constructs it: the wage variable is multiplied by 1.5 whenever the previous year's wage (wage_1) is 600,000 or more. A second manipulation adapts the data to a fuzzy model by making the jump at the threshold less deterministic.
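To make the idea concrete outside microdata.no, the following is a minimal Python sketch of the same manipulation on simulated data. It is not the estimator behind the rdd command, whose internals are not described here. The helper names (fit_line, sharp_rdd_jump), the bandwidth of 100,000 and the simulated wages are assumptions chosen for illustration. The sketch fits a separate straight line on each side of the cutoff and takes the difference between the two fitted values at the cutoff as the estimated jump.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def sharp_rdd_jump(running, outcome, cutoff, bandwidth):
    """Difference between the right and left linear fits, evaluated at the cutoff.

    Only observations within `bandwidth` of the cutoff are used, and the
    running variable is centered so each intercept is the fit at the cutoff.
    """
    left = [(r - cutoff, y) for r, y in zip(running, outcome)
            if cutoff - bandwidth <= r < cutoff]
    right = [(r - cutoff, y) for r, y in zip(running, outcome)
             if cutoff <= r <= cutoff + bandwidth]
    a_left, _ = fit_line(*zip(*left))
    a_right, _ = fit_line(*zip(*right))
    return a_right - a_left

# Simulated data (hypothetical, not register data): wage tracks wage_1 plus noise.
random.seed(1)
wage_1 = [random.uniform(200_000, 1_000_000) for _ in range(5000)]
wage = [w + random.gauss(0, 20_000) for w in wage_1]

# Same manipulation as in the script: multiply wage by 1.5 from the cutoff upward.
wage_jump = [y * 1.5 if w >= 600_000 else y for w, y in zip(wage_1, wage)]

before = sharp_rdd_jump(wage_1, wage, 600_000, 100_000)
after = sharp_rdd_jump(wage_1, wage_jump, 600_000, 100_000)
print(f"jump before manipulation: {before:,.0f}")
print(f"jump after manipulation:  {after:,.0f}")
```

Before the manipulation the estimated jump should be close to zero. Afterwards it should be close to 300,000, since multiplying by 1.5 adds half of 600,000 at the cutoff.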

The hexbin command plots wage against wage_1 so that the jump can be inspected visually.

require no.ssb.fdb:30 as db

create-dataset rddtest
// Importing the outcome variable wage and narrowing down the population
import db/INNTEKT_LONN 2022-12-31 as wage
keep if wage > 200000 & wage < 1000000
histogram wage

// Importing the cutoff variable wage_1 (wage the year before)
import db/INNTEKT_LONN 2021-12-31 as wage_1
hexbin wage wage_1

// Running an RDD analysis of wage on the previous year's wage with the cutoff set to 600000 (gives a non-significant effect)
rdd wage wage_1, cutoff(600000)

// Manipulating the wage variable so that it jumps at the cutoff (wage_1 of 600000 or more)
replace wage = wage*1.5 if wage_1 >= 600000
hexbin wage wage_1

// Running the same RDD analysis again on data that now fits an RDD model (gives a significant effect)
rdd wage wage_1, cutoff(600000)

// Centering the cutoff variable so that the cutoff value 600000 becomes 0
replace wage_1 = wage_1 - 600000
hexbin wage wage_1

// rdd uses 0 as the default cutoff, so the cutoff option can be omitted now that the cutoff variable is centered. The two commands below give the same result
rdd wage wage_1
rdd wage wage_1, cutoff(0)

// Running a fuzzy RDD model on the centered data. Treatment is still fully determined by the cutoff here, so this gives the same result as the regular RDD model. Fuzzy models require a treatment variable, here set to 1 at and above the cutoff and 0 below it.
generate treatment = 0
replace treatment = 1 if wage_1 >= 0
rdd wage wage_1, fuzzy(treatment)

// Making the data fuzzy: treatment2 no longer switches deterministically at the cutoff (some units just below are treated, some just above are not). The estimates change, but the fuzzy model still gives a significant effect
generate treatment2 = 0
replace treatment2 = 1 if wage_1 >= 0
replace treatment2 = 1 if wage_1 > -10000 & wage_1 < -5000
replace treatment2 = 0 if wage_1 > 5000 & wage_1 < 10000
rdd wage wage_1, fuzzy(treatment2)

// Example of using covariates (extra explanatory variables). Note that the RDD output does not show estimates for the covariates
import db/BEFOLKNING_KJOENN as gender
import db/BEFOLKNING_KOMMNR_FAKTISK 2021-01-01 as municipality
generate county = substr(municipality,1,2)

generate oslo = municipality == '0301'
generate bergen = municipality == '4601'
generate trondheim = municipality == '5001'
generate stavanger = municipality == '1103'
generate tromsø = municipality == '5401'

rdd wage wage_1 i.gender oslo bergen trondheim stavanger tromsø

// Example of clustered estimation (regular vs. clustered on county)
destring county
rdd wage wage_1
rdd wage wage_1, cluster(county)

// Example of using options
rdd wage wage_1 i.gender
rdd wage wage_1 i.gender, cluster(county) level(90)
rdd wage wage_1 i.gender, cluster(county) level(90) polynomial(2)
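
The fuzzy model used in the script can be illustrated with a minimal Python sketch on simulated data. It is an assumption-laden illustration, not the estimator behind the rdd command: the helper names, the bandwidth, the treatment probabilities (0.2 below the cutoff, 0.8 above) and the true effect of 100,000 are all hypothetical. The sketch uses the common ratio form of the fuzzy estimate: the jump in the outcome at the cutoff divided by the jump in the share treated at the cutoff.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def jump_at_zero(running, values, bandwidth):
    """Right-minus-left difference of local linear fits at running = 0."""
    left = [(r, v) for r, v in zip(running, values) if -bandwidth <= r < 0]
    right = [(r, v) for r, v in zip(running, values) if 0 <= r <= bandwidth]
    a_left, _ = fit_line(*zip(*left))
    a_right, _ = fit_line(*zip(*right))
    return a_right - a_left

def fuzzy_rdd_effect(running, outcome, treatment, bandwidth):
    """Outcome jump divided by the jump in the share treated."""
    return (jump_at_zero(running, outcome, bandwidth)
            / jump_at_zero(running, treatment, bandwidth))

# Simulated, centered running variable (cutoff at 0); hypothetical data.
random.seed(2)
n = 20_000
running = [random.uniform(-400_000, 400_000) for _ in range(n)]

# Crossing the cutoff raises the probability of treatment from 0.2 to 0.8.
treatment = [1 if random.random() < (0.8 if r >= 0 else 0.2) else 0
             for r in running]

# Treated units get 100,000 added to an otherwise linear outcome.
outcome = [500_000 + 0.8 * r + 100_000 * d + random.gauss(0, 20_000)
           for r, d in zip(running, treatment)]

treatment_jump = jump_at_zero(running, treatment, 100_000)
effect = fuzzy_rdd_effect(running, outcome, treatment, 100_000)
print(f"jump in share treated: {treatment_jump:.2f}")
print(f"fuzzy RDD estimate:    {effect:,.0f}")
```

When treatment switches fully at the cutoff, the jump in the share treated is 1 and the ratio reduces to the plain outcome jump, which is consistent with the script's note that the fuzzy and regular models agree on deterministic data.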