Skip to main content

Find the number of fossil, hybrid and electric cars per family

The example below shows how to count the number of fossil, hybrid, and electric cars per family. It has some common features with the example [Find the number of passenger cars per family](i18n\en\docusaurus-plugin-content-docs\current\eksempel\Tema - Kjøretøyanalyser\Finne antall personbiler per familie.md). But in this case, a finer division of types of passenger cars is made based on the three main categories of fuel.

First, the number of passenger cars per individual of the types fossil, hybrid, and electric are counted in separate datasets using the collapse(count) command, and then the information is linked into a person dataset which among other things contains a variable with family identifiers. Then the data is aggregated to the family level using the collapse command, now using the family as the aggregation measure.

To find numbers for families that do not own a passenger car, it is important to use a person dataset (total population) as a basis, i.e., to link the number of passenger cars into a person dataset (and not the other way around), and then aggregate to the family level. Individuals for whom there is no link from any of the respective passenger car datasets will by definition be individuals who do not own a passenger car.

 require no.ssb.fdb:32 as db

// Creating person dataset
create-dataset population
import db/BEFOLKNING_KOMMNR_FORMELL 2024-01-01 as municipality
import db/NUDB_BU 2023-08-01 as education
import db/INNTEKT_LONN 2022-12-31 as salary
import db/BEFOLKNING_REGSTAT_FAMNR 2023-01-01 as famnr

generate education_level = substr(education,1,1)
destring education_level


// Creating fossil car dataset
create-dataset fossil_cars
import db/KJORETOY_KJT_GRUP 2023-12-31 as vehicle_group
import db/KJORETOY_DRIVSTOFF_OMK 2023-12-31 as fuel_type
keep if vehicle_group == '101' & inlist(fuel_type,'01','02')
drop vehicle_group
import db/KJORETOY_KJORETOYID_FNR 2023-12-31 as idnr

// Aggregating to individual level and finding the number of fossil cars per individual
collapse (count) fuel_type -> num_fossil_cars, by(idnr)

// Linking information about the number of fossil cars to the person dataset
merge num_fossil_cars into population


// Creating hybrid car dataset
create-dataset hybrid_cars
import db/KJORETOY_KJT_GRUP 2023-12-31 as vehicle_group
import db/KJORETOY_DRIVSTOFF_OMK 2023-12-31 as fuel_type
keep if vehicle_group == '101' & inlist(fuel_type,'07','08')
drop vehicle_group
import db/KJORETOY_KJORETOYID_FNR 2023-12-31 as idnr

// Aggregating to individual level and finding the number of hybrid cars per individual
collapse (count) fuel_type -> num_hybrid_cars, by(idnr)

// Linking information about the number of hybrid cars to the person dataset
merge num_hybrid_cars into population


// Creating electric car dataset
create-dataset electric_cars
import db/KJORETOY_KJT_GRUP 2023-12-31 as vehicle_group
import db/KJORETOY_DRIVSTOFF_OMK 2023-12-31 as fuel_type
keep if vehicle_group == '101' & fuel_type == '05'
drop vehicle_group
import db/KJORETOY_KJORETOYID_FNR 2023-12-31 as idnr

// Aggregating to individual level and finding the number of electric cars per individual
collapse (count) fuel_type -> num_electric_cars, by(idnr)

// Linking information about the number of electric cars to the person dataset
merge num_electric_cars into population


// Aggregating person dataset up to family level and creating statistics over the number of fossil, hybrid, and electric cars per family
use population
collapse(count) municipality -> family_size (sum) num_fossil_cars num_hybrid_cars num_electric_cars salary -> sum_salary_family (max) education_level -> max_education_level, by(famnr)

generate num_cars = num_fossil_cars + num_hybrid_cars + num_electric_cars

// Cloning variables for number of cars, and combining number over 2 into a combined category for one variant
clone-variables num_fossil_cars -> num_fossil_cars_raw
recode num_fossil_cars (missing = 0 'None')(3/max = 3 '>')
recode num_fossil_cars_raw (missing = 0 'None')

clone-variables num_hybrid_cars -> num_hybrid_cars_raw
recode num_hybrid_cars (missing = 0 'None')(3/max = 3 '>')
recode num_hybrid_cars_raw (missing = 0 'None')

clone-variables num_electric_cars -> num_electric_cars_raw
recode num_electric_cars (missing = 0 'None')(3/max = 3 '>')
recode num_electric_cars_raw (missing = 0 'None')

clone-variables num_cars -> num_cars_raw
recode num_cars (missing = 0 'None')(3/max = 3 '>')
recode num_cars_raw (missing = 0 'None')

textblock
Number of fossil cars per family with at least one fossil car
endblock
piechart num_fossil_cars if num_fossil_cars > 0
tabulate num_fossil_cars if num_fossil_cars > 0, freq cellpct
tabulate num_fossil_cars if num_fossil_cars > 0, summarize(family_size)
tabulate num_fossil_cars if num_fossil_cars > 0, summarize(sum_salary_family)

textblock
Number of hybrid cars per family with at least one hybrid car
endblock
piechart num_hybrid_cars if num_hybrid_cars > 0
tabulate num_hybrid_cars if num_hybrid_cars > 0, freq cellpct
tabulate num_hybrid_cars if num_hybrid_cars > 0, summarize(family_size)
tabulate num_hybrid_cars if num_hybrid_cars > 0, summarize(sum_salary_family)

textblock
Number of electric cars per family with at least one electric car
endblock
piechart num_electric_cars if num_electric_cars > 0
tabulate num_electric_cars if num_electric_cars > 0, freq cellpct
tabulate num_electric_cars if num_electric_cars > 0, summarize(family_size)
tabulate num_electric_cars if num_electric_cars > 0, summarize(sum_salary_family)

textblock
Total number of passenger cars per family with at least one passenger car
endblock
piechart num_cars if num_cars > 0
tabulate num_cars if num_cars > 0, freq cellpct
tabulate num_cars if num_cars > 0, summarize(family_size)
tabulate num_cars if num_cars > 0, summarize(sum_salary_family)

textblock
Combinations of the number of fossil, hybrid, and electric cars per family
endblock
tabulate num_fossil_cars num_electric_cars
tabulate num_fossil_cars num_hybrid_cars
tabulate num_electric_cars num_hybrid_cars
tabulate num_electric_cars num_hybrid_cars


//Creating summary variables for combinations of car types
generate car_combination = 1 if num_electric_cars_raw == 0 & num_hybrid_cars_raw == 0 & num_fossil_cars_raw > 0 //only fossil
replace car_combination = 2 if num_electric_cars_raw == 0 & num_hybrid_cars_raw > 0 & num_fossil_cars_raw == 0 //only hybrid 
replace car_combination = 3 if num_electric_cars_raw > 0 & num_hybrid_cars_raw == 0 & num_fossil_cars_raw == 0 //only electric 
replace car_combination = 4 if num_electric_cars_raw == 0 & num_hybrid_cars_raw > 0 & num_fossil_cars_raw > 0 //fossil and hybrid
replace car_combination = 5 if num_electric_cars_raw > 0 & num_hybrid_cars_raw == 0 & num_fossil_cars_raw > 0 //fossil and electric
replace car_combination = 6 if num_electric_cars_raw > 0 & num_hybrid_cars_raw > 0 & num_fossil_cars_raw == 0 //electric and hybrid
replace car_combination = 7 if num_electric_cars_raw > 0 & num_hybrid_cars_raw > 0 & num_fossil_cars_raw > 0 //all types 
replace car_combination = 8 if num_electric_cars_raw == 0 & num_hybrid_cars_raw == 0 & num_fossil_cars_raw == 0 //none 

define-labels carkomblbl 1 'Only fossil car(s)' 2 'Only hybrid car(s)' 3 'Only electric car(s)' 4 'Fossil and hybrid car(s)' 5 'Fossil and electric car(s)' 6 'Electric and hybrid car(s)' 7 'All types' 8 'No passenger car'
assign-labels car_combination carkomblbl

recode max_education_level (0/3 = 1 'Not completed secondary')(4/5 = 2 'Secondary education')(6 = 3 'Lower academy education')(7 = 4 'Higher academy education (master level)')(8 = 5 'Research education')(9 = 6 'Not stated')

textblock
Combinations of fossil, hybrid, and electric cars
endblock
piechart car_combination
tabulate car_combination, freq cellpct missing
tabulate car_combination, summarize(num_cars_raw)
tabulate car_combination num_cars
tabulate car_combination num_cars, rowpct
tabulate car_combination max_education_level, rowpct missing

//Creating geographical divisions

//Retrieving the most recently updated place of residence at the individual level and linking it to the family dataset via the family number (which is equal to the identifier of the oldest person in the family) - the residence information will then point to the place of residence of the oldest person in the family. This will, in practice, also be the family's place of residence.
create-dataset residence
import db/BEFOLKNING_KOMMNR_FORMELL 2024-01-01 as municipality
merge municipality into population on famnr

use population
generate county = substr(municipality,1,2)

define-labels county_label '03' Oslo '11' Rogaland '15' 'Møre og Romsdal' '18' Nordland '31' Østfold '32' Akershus '33' Buskerud '34' Innlandet '39' Vestfold '40' Telemark '42' Agder '46' Vestland '50' Trøndelag '55' Troms '56' Finnmark
assign-labels county county_label

//Cloning municipality variable and extracting cities from one variable (the rest are placed in a combined category)
clone-variables municipality -> city
replace city = '9999' if !inlist(city,'0301','1103','3201','3301','4204','4601','5001','5501')
recode city ('9999' = '9999' 'Other municipalities')

textblock
Combinations of fossil, hybrid, and electric cars distributed by counties and cities
endblock
tabulate county car_combination, rowpct
tabulate city car_combination, rowpct