Description

Feature | Name | Description | Form [1] / Stat [2] / Text [3]
term | Term | Unigram or bigram of keywords in a free-text response. | Derivation
skill_type | Skill Category | Broad category of a classified skill response. | Derivation
locale | CV Language | Language used to write CV. | Derivation
country | Country | Country of residence. | Derivation
birth_year | Birth Year | Year of birth. | Derivation
gender | Gender | Female, male or missing value. | Derivation
headline_job | Headline Job | Job-related type of application. | Derivation
headline_isco | Headline ISCO 3 | ISCO level 3 classification for job declared in the type of application. | Derivation
responses | Responses | Number of responses with a particular combination of values for the above variables. | Derivation
[1] Variable derives directly from a Europass CV;
[2] Variable is a statistical transformation of one or more Europass CV variables;
[3] Variable is a result of information retrieval using text mining.
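
The responses variable is described as the number of responses sharing a particular combination of values for the other variables. A minimal sketch of that kind of aggregation follows. It assumes hypothetical row-level records, and every field value in the example is invented for illustration, not taken from the dataset. The function name count_responses is also an assumption.

```python
from collections import Counter

# Grouping variables, in the order they appear in the description table.
KEYS = ("term", "skill_type", "locale", "country", "birth_year",
        "gender", "headline_job", "headline_isco")


def count_responses(records):
    """Count records that share the same combination of grouping values.

    Missing values (None) form their own group instead of being dropped.
    """
    counts = Counter(tuple(rec.get(key) for key in KEYS) for rec in records)
    return [dict(zip(KEYS, combo), responses=n) for combo, n in counts.items()]


# Invented illustrative records; real categories and codes may differ.
base = {"term": "team work", "skill_type": "communication", "locale": "en",
        "country": "IT", "birth_year": "1990", "gender": "F",
        "headline_job": None, "headline_isco": None}
records = [dict(base), dict(base), dict(base, term="excel")]

rows = count_responses(records)
for row in rows:
    print(row["term"], row["responses"])
```

Keeping missing values as a group of their own is a design choice of this sketch; the source does not say how the original aggregation treated them.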

Summary Statistics

term

Feature Result
Variable type factor (nominal)
Number of missing values* 0 ( 0.00% )
Number of unique values 149900
* No missing values permitted.
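
The term variable is described as a unigram or bigram of keywords in a free-text response. The sketch below shows one simple way such terms could be produced. The tokenisation rule and the function name unigrams_and_bigrams are assumptions. The source does not describe the actual keyword extraction, such as stop-word removal or stemming.

```python
import re


def unigrams_and_bigrams(text):
    """Return single words followed by adjacent word pairs, lowercased."""
    # Assumed tokenisation: runs of letters only (digits and punctuation split words).
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    bigrams = [f"{first} {second}" for first, second in zip(tokens, tokens[1:])]
    return tokens + bigrams


terms = unigrams_and_bigrams("Good team player")
print(terms)
```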

skill_type

Feature Result
Variable type character (nominal)
Number of missing values* 0 ( 0.00% )
Number of unique values 5
* No missing values permitted.

locale

Feature Result
Variable type character (nominal)
Number of missing values* 0 ( 0.00% )
Number of unique values 29
* SkillsPassport.Locale missing.

country

Feature Result
Variable type character (nominal)
Number of missing values* 610878 ( 5.61% )
Number of unique values 183
* SkillsPassport…Country.Code missing.

birth_year

Feature Result
Variable type character (nominal)
Number of missing values* 2919239 ( 26.82% )
Number of unique values 83
* SkillsPassport…Birthdate.Year missing.

gender

Feature Result
Variable type character (nominal)
Number of missing values* 3817049 ( 35.07% )
Number of unique values 3
* SkillsPassport…Gender.Code missing.

headline_job

Feature Result
Variable type character (nominal)
Number of missing values* 4973369 ( 45.70% )
Number of unique values 4
* SkillsPassport…Headline.Type.Code missing.

headline_isco

Feature Result
Variable type character (nominal)
Number of missing values* 5707200 ( 52.44% )
Number of unique values 126
* Missing features required for estimation, or unidentified by the classification algorithm.

responses

Feature Result
Variable type integer
Number of missing values* 0 ( 0.00% )
Number of unique values 1014
Min. 1
1st Qu. 1
Median 1
Mean 1.72
3rd Qu. 1
Max. 14055
* No missing values permitted.
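
The per-variable tables above report a missing-value count with its percentage, a count of unique values and, for the integer variable, a six-number summary. A minimal sketch of how such figures can be computed is given here. It uses invented toy values, and the function names are assumptions. It counts only non-missing values as unique, since the source does not state whether missing counts as a distinct value. The "inclusive" quantile method is also an assumption, chosen because it interpolates linearly between order statistics.

```python
import statistics


def summarise_nominal(values):
    """Missing count, missing percentage and number of distinct non-missing values."""
    present = [v for v in values if v is not None]
    missing = len(values) - len(present)
    return {
        "missing": missing,
        "missing_pct": round(100 * missing / len(values), 2),
        "unique": len(set(present)),
    }


def summarise_integer(values):
    """Min, quartiles, mean and max of a numeric column without missing values."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "mean": round(statistics.mean(values), 2),
        "q3": q3,
        "max": max(values),
    }


# Invented toy columns, not values from the dataset.
nominal = summarise_nominal(["F", "M", None, "F"])
numeric = summarise_integer([1, 1, 1, 1, 5])
print(nominal)
print(numeric)
```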