Description

Feature Name Description Form¹ Stat² Text³
education_field Education Field ISCED FoET classification of a free-text qualification response. Derivation
eqf_granted EQF Granted EQF classification of a free-text qualification response. Derivation
institution_country Institution Country Country of the institution where the qualification was granted. Derivation
study_years Study Years Years spent studying for a specific qualification. Derivation
enrollment_year Enrollment Year Year of enrollment. Derivation
graduation_year Graduation Year Year of graduation. Derivation
locale CV Language Language used to write CV. Derivation
country Country Country of residence. Derivation
age_group1 Broad Age Group Broad age groups with three categories. Derivation
age_group2 Narrow Age Group Narrow age groups with five categories. Derivation
gender Gender Female, male or missing value. Derivation
nationality Nationality Nationality. Derivation
mother_tongue Mother Tongue Native language. Derivation
responses Responses Number of responses with a particular combination of values for the above variables. Derivation
¹ Variable is taken directly from a Europass CV;
² Variable is a statistical transformation of one or more Europass CV variables;
³ Variable is the result of information retrieval using text mining.

Summary Statistics

education_field

Feature Result
Variable type factor (nominal)
Number of missing values* 161774 ( 13.70% )
Number of unique values 117
* Features required for estimation are missing, or the classification algorithm could not identify a value.
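
The "Number of missing values" and "Number of unique values" rows in the tables of this section could be computed along the following lines. This is a minimal sketch, not the method used to produce the figures: the function name `summarise` is hypothetical, missing values are represented as `None`, and counting the missing marker as one of the unique values is an assumption.

```python
from typing import Iterable, Optional


def summarise(values: Iterable[Optional[str]]) -> dict:
    """Count missing and unique values for one feature (hypothetical helper)."""
    values = list(values)
    n_missing = sum(v is None for v in values)
    # Share of missing values, as a percentage rounded to two decimals.
    pct = 100.0 * n_missing / len(values) if values else 0.0
    return {
        "missing": n_missing,
        "missing_pct": round(pct, 2),
        # Assumption: the missing marker counts as one unique value.
        "unique": len(set(values)),
    }


summary = summarise(["0413", None, "0611", "0413"])
```

For the four toy values above, one is missing (25.00%) and three distinct values are counted, including the missing marker.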

eqf_granted

Feature Result
Variable type ordered (ordinal)
Number of missing values* 506190 ( 42.86% )
Number of unique values 6
* Features required for estimation are missing, or the classification algorithm could not identify a value.

institution_country

Feature Result
Variable type factor (nominal)
Number of missing values* 265799 ( 22.51% )
Number of unique values 186
* Education.Organisation…Country.Code missing.

study_years

Feature Result
Variable type numeric
Number of missing values* 109689 ( 9.29% )
Number of unique values 16
Min. 1
1st Qu. 1
Median 2
Mean 3.11
3rd Qu. 4
Max. 15
NA’s 109689
* Education.Period.From.Year missing, or value less than 0 years, or value more than 15 years.
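
The missing-value rule in the note above can be sketched as follows. The source does not give the formula for study_years, so the subtraction of the start year from the end year is an assumption, as is treating a missing end year as missing; the function name `study_years` and its parameters are hypothetical.

```python
from typing import Optional

MAX_STUDY_YEARS = 15  # upper bound stated in the note


def study_years(from_year: Optional[int], to_year: Optional[int]) -> Optional[int]:
    """Return the study duration in years, or None when it counts as missing."""
    # Assumption: a missing end year also yields a missing duration;
    # the source only names Education.Period.From.Year explicitly.
    if from_year is None or to_year is None:
        return None
    duration = to_year - from_year  # assumed formula, not confirmed by the source
    # Durations below 0 or above 15 years are set to missing, as in the note.
    if duration < 0 or duration > MAX_STUDY_YEARS:
        return None
    return duration
```

Under these assumptions, `study_years(2010, 2014)` gives a duration of 4, while a reversed or implausibly long period is treated as missing.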

enrollment_year

Feature Result
Variable type ordered (ordinal)
Number of missing values* 88051 ( 7.46% )
Number of unique values 76
* Education.Period.From.Year missing.

graduation_year

Feature Result
Variable type ordered (ordinal)
Number of missing values* 274806 ( 23.27% )
Number of unique values 72
* Education.Period.To.Year missing.

locale

Feature Result
Variable type factor (nominal)
Number of missing values* 0 ( 0.00% )
Number of unique values 29
* SkillsPassport.Locale missing.

country

Feature Result
Variable type factor (nominal)
Number of missing values* 68085 ( 5.77% )
Number of unique values 182
* SkillsPassport…Country.Code missing.

age_group1

Feature Result
Variable type ordered (ordinal)
Number of missing values* 399683 ( 33.85% )
Number of unique values 4
* SkillsPassport…Birthdate.Year missing, or over 65 years old.

age_group2

Feature Result
Variable type ordered (ordinal)
Number of missing values* 398438 ( 33.74% )
Number of unique values 6
* SkillsPassport…Birthdate.Year missing.

gender

Feature Result
Variable type factor (nominal)
Number of missing values* 514292 ( 43.55% )
Number of unique values 3
* SkillsPassport…Gender.Code missing.

nationality

Feature Result
Variable type factor (nominal)
Number of missing values* 557949 ( 47.25% )
Number of unique values 164
* SkillsPassport…Nationality.Code missing.

mother_tongue

Feature Result
Variable type factor (nominal)
Number of missing values* 248946 ( 21.08% )
Number of unique values 63
* Skills.Linguistic.MotherTongue.Description missing.

responses

Feature Result
Variable type integer
Number of missing values* 0 ( 0.00% )
Number of unique values 199
Min. 1
1st Qu. 1
Median 1
Mean 1.5
3rd Qu. 1
Max. 926
* No missing values permitted.
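
Since responses is the number of responses sharing one combination of values for the other variables, it can be thought of as a group count. The sketch below illustrates the idea with only three of the features; the names `KEYS` and `count_responses` are hypothetical, the records are toy data, and the real aggregation presumably groups by all variables listed in the Description table.

```python
from collections import Counter

# Hypothetical subset of the feature columns used as the grouping key.
KEYS = ("country", "gender", "locale")


def count_responses(records):
    """Collapse individual records into one row per value combination."""
    # Missing values (None) take part in the grouping like any other value.
    counts = Counter(tuple(r.get(k) for k in KEYS) for r in records)
    return [dict(zip(KEYS, combo), responses=n) for combo, n in counts.items()]


records = [
    {"country": "GR", "gender": "F", "locale": "el"},
    {"country": "GR", "gender": "F", "locale": "el"},
    {"country": "IT", "gender": None, "locale": "it"},
]
rows = count_responses(records)
```

Every combination that occurs gets a count of at least 1, which fits the note that no missing values are permitted for responses.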