Description

Feature | Name | Description | Form [1] / Stat [2] / Text [3]
locale | CV Language | Language used to write CV. | Derivation
country | Country | Country of residence. | Derivation
age_group1 | Broad Age Group | Broad age groups with three categories. | Derivation
age_group2 | Narrow Age Group | Narrow age groups with five categories. | Derivation
gender | Gender | Female, male or missing value. | Derivation
nationality | Nationality | Nationality. | Derivation
mother_tongue | Mother Tongue | Native language. | Derivation
job_applied_for | Job Applied For | ISCO level 3 classification for job declared in the type of application. | Derivation
latest_job_isco1 | Latest Job ISCO 1 | ISCO level 1 classification for latest job declared. | Derivation
latest_job_isco2 | Latest Job ISCO 2 | ISCO level 2 classification for latest job declared. | Derivation
latest_job_isco3 | Latest Job ISCO 3 | ISCO level 3 classification for latest job declared. | Derivation
is_employed | Employment Status | Estimation of employment status based on recruitment and termination year. | Derivation
num_jobs | Number of Jobs | Number of jobs declared. | Derivation
total_work_years | Total Work Years | Sum of all work years. | Derivation
eqf_highest | EQF Highest | Highest qualification reported with respect to EQF level (including ongoing). | Derivation
is_student | Student Status | Estimation of student status based on enrollment and graduation year. | Derivation
respondents | Respondents | Number of respondents with a particular combination of values for the above variables. | Derivation
[1] Variable derives directly from a Europass CV.
[2] Variable is a statistical transformation of one or more Europass CV variables.
[3] Variable is a result of information retrieval using text mining.
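
The description of `respondents` implies that the data are aggregated: records that share the same combination of values are collapsed into one row with a count. The following is a minimal sketch of that aggregation, not the original processing code. The helper `aggregate_respondents`, the three-column subset and the sample records are all hypothetical.

```python
from collections import Counter

# Illustrative subset of the feature columns (an assumption); the real
# table would use all sixteen variables listed above.
FEATURES = ("locale", "country", "gender")


def aggregate_respondents(records, features=FEATURES):
    """Collapse identical feature combinations into rows with a count.

    Missing values (None) are kept as their own category, so two records
    merge only if they agree on every feature, missing or not.
    """
    counts = Counter(tuple(rec.get(f) for f in features) for rec in records)
    rows = []
    for combo, n in counts.items():
        row = dict(zip(features, combo))
        row["respondents"] = n
        rows.append(row)
    return rows


# Made-up records for illustration only.
records = [
    {"locale": "en", "country": "IT", "gender": "F"},
    {"locale": "en", "country": "IT", "gender": "F"},
    {"locale": "en", "country": "IT", "gender": None},
    {"locale": "pt", "country": "PT", "gender": "M"},
]
rows = aggregate_respondents(records)
```

Here the two identical records merge into one row with `respondents` equal to 2, while the record with a missing gender stays separate.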

Summary Statistics

locale

Feature Result
Variable type factor (nominal)
Number of missing values* 0 ( 0.00% )
Number of unique values 29
* SkillsPassport.Locale missing.

country

Feature Result
Variable type factor (nominal)
Number of missing values* 12825 ( 4.35% )
Number of unique values 185
* SkillsPassport…Country.Code missing.

age_group1

Feature Result
Variable type ordered (ordinal)
Number of missing values* 112853 ( 38.31% )
Number of unique values 4
* SkillsPassport…Birthdate.Year missing, or over 65 years old.

age_group2

Feature Result
Variable type ordered (ordinal)
Number of missing values* 112458 ( 38.18% )
Number of unique values 6
* SkillsPassport…Birthdate.Year missing.

gender

Feature Result
Variable type factor (nominal)
Number of missing values* 132972 ( 45.15% )
Number of unique values 3
* SkillsPassport…Gender.Code missing.
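
The two figures reported for each factor variable can be reproduced with a few lines of Python. This is a sketch under one assumption inferred from the tables rather than stated in them: the missing value appears to count as one of the unique values, since gender (female, male, missing) reports 3. The function name `summarise_factor` and the sample data are hypothetical.

```python
def summarise_factor(values):
    """Return missing count, missing share (%) and number of unique values.

    Assumption: a missing value (None) counts as one distinct value,
    which matches gender reporting 3 unique values.
    """
    total = len(values)
    missing = sum(v is None for v in values)
    pct = round(100 * missing / total, 2) if total else 0.0
    return {"missing": missing, "missing_pct": pct, "unique": len(set(values))}


# Made-up sample, not taken from the dataset.
sample = ["F", "M", None, "F", None, "M", "F", "M"]
summary = summarise_factor(sample)
print(f"Number of missing values {summary['missing']} ( {summary['missing_pct']:.2f}% )")
print(f"Number of unique values {summary['unique']}")
```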

nationality

Feature Result
Variable type factor (nominal)
Number of missing values* 145338 ( 49.34% )
Number of unique values 165
* SkillsPassport…Nationality.Code missing.

mother_tongue

Feature Result
Variable type factor (nominal)
Number of missing values* 59805 ( 20.30% )
Number of unique values 63
* Skills.Linguistic.MotherTongue.Description missing.

job_applied_for

Feature Result
Variable type factor (nominal)
Number of missing values* 191940 ( 65.17% )
Number of unique values 126
* Missing features required for estimation, or not identified by the classification algorithm.

latest_job_isco1

Feature Result
Variable type factor (nominal)
Number of missing values* 48180 ( 16.36% )
Number of unique values 11
* Missing features required for estimation, or not identified by the classification algorithm.

latest_job_isco2

Feature Result
Variable type factor (nominal)
Number of missing values* 48180 ( 16.36% )
Number of unique values 43
* Missing features required for estimation, or not identified by the classification algorithm.

latest_job_isco3

Feature Result
Variable type factor (nominal)
Number of missing values* 48180 ( 16.36% )
Number of unique values 126
* Missing features required for estimation, or not identified by the classification algorithm.

is_employed

Feature Result
Variable type factor (nominal)
Number of missing values* 16987 ( 5.77% )
Number of unique values 3
* No work experience entries filled, or both WorkExperience.Period.From.Year and WorkExperience.Period.To.Year missing.

num_jobs

Feature Result
Variable type ordered (ordinal)
Number of missing values* 16987 ( 5.77% )
Number of unique values 11
* No work experience entries filled.

total_work_years

Feature Result
Variable type ordered (ordinal)
Number of missing values* 25487 ( 8.65% )
Number of unique values 6
* No work experience entries filled, or both WorkExperience.Period.From.Year and WorkExperience.Period.To.Year missing.

eqf_highest

Feature Result
Variable type ordered (ordinal)
Number of missing values* 90313 ( 30.66% )
Number of unique values 6
* Missing features required for estimation, or not identified by the classification algorithm.

is_student

Feature Result
Variable type factor (nominal)
Number of missing values* 15670 ( 5.32% )
Number of unique values 3
* No education entries filled, or both Education.Period.From.Year and Education.Period.To.Year missing.

respondents

Feature Result
Variable type integer
Number of missing values* 0 ( 0.00% )
Number of unique values 111
Min. 1
1st Qu. 1
Median 1
Mean 1.2
3rd Qu. 1
Max. 569
* No missing values permitted.
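
The six-number layout used for `respondents` (Min., 1st Qu., Median, Mean, 3rd Qu., Max.) can be sketched with Python's standard library. The tool that produced the original figures is not stated; the sketch assumes linear interpolation between order statistics, which is what `method="inclusive"` provides. The function name `summarise_counts` and the input counts are hypothetical.

```python
import statistics


def summarise_counts(values):
    """Min, quartiles, mean and max in the layout used above."""
    # Assumption: quartiles are interpolated linearly between order
    # statistics, which is what method="inclusive" does.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "Min.": min(values),
        "1st Qu.": q1,
        "Median": median,
        "Mean": round(statistics.mean(values), 1),
        "3rd Qu.": q3,
        "Max.": max(values),
    }


# Made-up counts, skewed like the real column: mostly 1 with a long tail.
counts = [1, 1, 1, 1, 1, 1, 1, 1, 2, 9]
stats = summarise_counts(counts)
```

With a distribution this skewed, all three quartiles sit at 1 while the mean is pulled above it by a few large counts, the same pattern as in the table above.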