Europass CV Skills Analysis

Abstract

We report on the different skill groups, such as soft, digital and hard skills, defined in the new ESCO skills hierarchy (v1.0.5, May 2020), as observed in Europass CVs.

Analysis results

Distribution of skill groups

  • Users of the Europass CV editor application enter skills in four fields, one per skill type: 1) Communication, 2) Organisational, 3) Job Related, and 4) Computer. For every user in the database, each skill type field holds a free-text entry that has been matched to the ESCO model using the text-mining algorithm developed for this purpose. One or more ESCO skills may be matched to a single free-text entry, depending on its length.
  • Skills in the ESCO model are organized in four broad categories: A - attitudes and values, K - knowledge, L - language skills and knowledge, and S - skills. Except for category L, categories include three levels of 1, 2 and 3-digit skill groups each, similar to the ISCO hierarchy. The ESCO skills themselves are leaf nodes of the 3-digit skill groups. Including the broad categories and the leaf nodes, there are five levels in total.
  • The graph below displays the top non-linguistic skills observed overall and individually for each skill type. The number shown represents the total number of CV skill entries matched to each ESCO skill group. Reporting is done with respect to 3-digit skill groups.

[Figure: top 3-digit ESCO skill groups by number of matched CV skill entries. Panels: Overall, Communication, Organisational, Job Related, Computer.]

  • The most commonly reported skills across the board are soft skills such as working in teams, management skills, supervising a team or group, and demonstrating good manners. Those tend to be included as Communication, Organisational and Job Related skills in varying frequencies.
  • The most frequently included hard skills are computer use and software and applications development and analysis, which are most often included in the field of Computer skills.
  • Other concrete professional skill groups commonly mentioned include assure quality of processes and products, and law, both most commonly included in the Job Related field.

Age group

  • Approximately 59% of Europass users have included a birth year and are aged between 15 and 64 years old. Three broad age groups of 15-24, 25-49 and 50-64 have been defined and skills reported by users are aggregated by age group.
  • The graph below reports the age group breakdown of the 30 most common 3-digit skill groups, overall and individually for each skill type. The number reported is the percentage share of each age group within each skill group.

[Figure: age group breakdown (%) of the 30 most common 3-digit skill groups. Panels: Overall, Communication, Organisational, Job Related, Computer.]

  • The skill groups most commonly referenced by older users are hard skills relevant to a specific sector, such as database and network administration, law, and management and administration. On the other hand, younger users tend to reference more generic skill groups that may be considered soft skills, such as caring for children, demonstrate good manners, and communication, collaboration and creativity.
  • With regards to Organisational skills and skills related to Communication, older users tend to include concrete skills and knowledge, such as on matters related to law, building and developing teams and management and administration. Conversely, younger users more commonly include skills acquired through experiences in sports, language acquisition, and soft skills like demonstrating good manners and caring for children.
  • Likewise, Job Related skills included by older users are more likely to be specific to a job sector, like law, medicine, management and administration and database and network design and administration. Those included by younger users are comparatively more generic, like working in teams, demonstrating good manners and communication, collaboration and creativity.
  • Younger users tend to focus on Computer skills related to using digital tools for processing sound and images and using word processing, publishing and presenting software. Older users report database and network design and administration, setting up computer systems and software and applications development and analysis skill groups.
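The age-group aggregation described at the start of this section can be sketched as follows. This is a minimal illustration, not the project's code: the function names and the toy entries are hypothetical, and only the three age bands are taken from the text.

```python
from collections import Counter, defaultdict

# Age bands as defined in the text; ages outside 15-64 are excluded.
AGE_BANDS = [(15, 24, "15-24"), (25, 49, "25-49"), (50, 64, "50-64")]

def age_group(age):
    """Return the band label for an age, or None if it falls outside 15-64."""
    for low, high, label in AGE_BANDS:
        if low <= age <= high:
            return label
    return None

def age_breakdown(entries):
    """entries: iterable of (skill_group, age) pairs.

    Returns {skill_group: {band: percentage share of that band}}.
    """
    counts = defaultdict(Counter)
    for skill_group, age in entries:
        band = age_group(age)
        if band is not None:
            counts[skill_group][band] += 1
    return {
        group: {band: 100.0 * n / sum(c.values()) for band, n in c.items()}
        for group, c in counts.items()
    }

# Toy entries (hypothetical, for illustration only).
entries = [
    ("law", 55), ("law", 30), ("law", 52), ("law", 51),
    ("caring for children", 19), ("caring for children", 70),
]
breakdown = age_breakdown(entries)
print(breakdown)
```

In this toy input the entry with age 70 is dropped, so the percentages for each skill group are computed only over users aged 15 to 64.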

Gender

  • Approximately 53% of Europass users have disclosed their gender on their CV. The skills of users reporting male and female genders have been aggregated by gender.
  • The graph below reports the gender breakdown of each 3-digit skill group, overall and individually for each skill type. The number reported is the percentage share of each gender within each skill group.

[Figure: gender breakdown (%) of 3-digit skill groups. Panels: Overall, Communication, Organisational, Job Related, Computer.]

  • Skill groups follow different patterns of inclusion between the two genders. Male users tend to report more technical skills, both across the board and for each individual skill type, while female users tend to report more skills that require human interaction.
  • Male users more commonly report skills related to repairing and installing mechanical equipment, assembling electrical and electronic products, and designing electrical or electronic systems or equipment.
  • Female users more frequently mention skills related to child care and youth services, caring for children and care of the elderly and of disabled adults.

Country

  • Approximately 96% of users have reported their country of residence at the time of CV creation. Skills have been aggregated by the users’ country of residence.
  • The graph below reports the overall distribution of 3-digit skill groups for the top 10 countries observed in the data. The number reported represents the total number of CV skill entries matched to each ESCO skill.

[Figure: distribution of 3-digit skill groups for the top 10 countries. Panels: IT, PT, RO, ES, HU, HR, DE, EL, LV, MT.]

  • The most common skill groups are largely shared across countries. In particular, working in teams, management skills, computer use and supervising a team or group appear relatively consistently among the top skill groups of all countries.
  • Note that some of the differences observed might result from biases in the algorithm, which differ across languages. As the level of aggregation increases (i.e. a higher level in the skills hierarchy is used), the effect of these biases decreases.

Skills yield by locale

  • As documented in the methodology section, the process developed for matching the free-text included by users to ESCO skills is language agnostic. The quality of the classifier’s results mainly depends on the quality of the ESCO taxonomy’s corpus for each language.
  • As the dataset is unbalanced with regard to the language of the CVs, the diversity of ESCO skill matches differs across languages.
  • The graph below shows, for each language, the relationship between the number of unique ESCO skills matched (out of a total of roughly 13.5K) and that language's total number of CVs in the dataset. Each circle corresponds to one locale.

[Figure: unique ESCO skills matched versus total number of CVs per locale, with fitted line. Panels: ESCO skills, 3-digit group, 2-digit group.]

  • The relationship between the number of unique ESCO skills matched and the total number of CVs for a particular locale is approximately logarithmic.
  • As data is more aggregated towards 3 and 2-digit codes, coverage of all skill group levels is reached earlier.
  • Points above the fitted line have more skills matched than expected, while those below have fewer.
  • Fewer matches than expected, as in the case of Hungarian and Greek, may indicate linguistic peculiarities, disparities in the quality of the ESCO models for different languages, or imbalances in the representation of Europass users of these locales.
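The approximately logarithmic relationship and the reading of points above or below the fitted line can be illustrated with a small least-squares sketch. The model form y = a + b * ln(x) is an assumption based on the description above, and the locale codes and counts are hypothetical placeholders, not values from the dataset.

```python
import math

def fit_log(xs, ys):
    """Ordinary least-squares fit of y = a + b * ln(x); returns (a, b)."""
    lx = [math.log(x) for x in xs]
    n = len(xs)
    mean_lx = sum(lx) / n
    mean_y = sum(ys) / n
    sxx = sum((v - mean_lx) ** 2 for v in lx)
    sxy = sum((v - mean_lx) * (y - mean_y) for v, y in zip(lx, ys))
    b = sxy / sxx
    a = mean_y - b * mean_lx
    return a, b

# Hypothetical (locale, total CVs, unique ESCO skills matched) triples.
locales = [("aa", 500, 900), ("bb", 5000, 2100), ("cc", 50000, 3200), ("dd", 20000, 1800)]

a, b = fit_log([cvs for _, cvs, _ in locales], [skills for _, _, skills in locales])

# Residual > 0: more skills matched than the fit predicts; < 0: fewer.
residuals = {loc: skills - (a + b * math.log(cvs)) for loc, cvs, skills in locales}
below = [loc for loc, r in residuals.items() if r < 0]
print(below)
```

Locales listed in `below` are the ones that would sit under the fitted line, which is the situation described for Hungarian and Greek.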

Occupations and skills

Skills by occupation

  • Work experience entries have been matched to the ESCO model using the text mining algorithm developed in the previous phase of the analysis. The chronologically last reported work experience of each user has been determined as their latest job.
  • The graph below reports the overall distribution of 3-digit skill groups for the top 10 ISCO 3 occupations based on users’ latest jobs. The number reported represents the total number of CV skill entries matched to each ESCO skill.

[Figure: distribution of 3-digit skill groups for the top 10 ISCO 3 occupations. Panels: Waiters and bartenders; Administrative and specialised secretaries; Sales, marketing and public relations professionals; Software and applications developers and analysts; Shop salespersons; Engineering professionals (excluding electrotechnology); Other teaching professionals; Physical and engineering science technicians; Client information workers; Administration professionals.]

  • The top reported skill groups are shared across the top occupations, namely working in teams, management skills, and supervising a team or group.
  • Skills specific to occupations, such as law for Legal, social and cultural professionals and software and applications development and analysis for Information and communications technology professionals appear among the top skill groups, but soft skills are the most common across the majority of occupations.

Importance to occupations

  • As shown in the previous section, certain skill groups are similarly represented across the majority of the ISCO occupations. To identify which skills are truly relevant to each occupation, we need to measure how over- or underexpressed each skill group is for each occupation.
  • We make use of the revealed comparative advantage (RCA) index, which in the context of skill-occupation pairs is defined as: \[ RCA(o, s) = \frac{cv(o, s) / \sum_{s' \in Skill\ groups} cv(o, s')}{\sum_{o' \in ISCO\ 3} cv(o', s) / \sum_{s' \in Skill\ groups, o' \in ISCO\ 3} cv(o', s')} \]
  • Here \(cv(o, s)\) is the number of CVs that include both skill group \(s\) and occupation \(o\).
  • \(RCA > 1\) implies a comparative advantage, that is to say, an increased likelihood of inclusion of a particular skill in CVs of people who profess a particular occupation.
  • This is equivalent to lift in association rules mining.
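The RCA definition above can be computed directly from a table of CV counts per occupation and skill group. The sketch below is a minimal illustration; the function name and the toy counts are hypothetical.

```python
from collections import defaultdict

def rca(cv):
    """cv: {(occupation, skill_group): number of CVs}.

    Returns {(occupation, skill_group): RCA} for every pair present in cv.
    Pairs that never co-occur are absent from the result (their RCA would be 0).
    """
    occ_total = defaultdict(float)    # sum over s' of cv(o, s')
    skill_total = defaultdict(float)  # sum over o' of cv(o', s)
    grand_total = 0.0                 # sum over all pairs
    for (o, s), n in cv.items():
        occ_total[o] += n
        skill_total[s] += n
        grand_total += n
    return {
        (o, s): (n / occ_total[o]) / (skill_total[s] / grand_total)
        for (o, s), n in cv.items()
    }

# Toy counts (hypothetical): "teamwork" is equally common in both occupations.
cv = {
    ("waiter", "serving"): 40,
    ("waiter", "teamwork"): 60,
    ("developer", "software"): 40,
    ("developer", "teamwork"): 60,
}
scores = rca(cv)
for pair, value in sorted(scores.items()):
    print(pair, round(value, 2))
```

In the toy data the ubiquitous skill gets an RCA of 1 for both occupations, while the occupation-specific skills get an RCA of 2, mirroring the lift interpretation.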

[Figure: 3-digit skill groups ranked by RCA for the top 10 ISCO 3 occupations. Panels: Waiters and bartenders; Administrative and specialised secretaries; Sales, marketing and public relations professionals; Shop salespersons; Engineering professionals (excluding electrotechnology); Software and applications developers and analysts; Client information workers; Other teaching professionals; Retail and wholesale trade managers; Administration professionals.]

  • The RCA index reveals many of the skills expected for each job: serving food and drinks has an RCA of over 8 for Waiters and bartenders, and database and network design and administration has an RCA of over 6 for Software and applications developers and analysts.
  • Skills that were shown to be ubiquitous, such as working in teams, have received a low RCA score across the board and are therefore not characteristic of any occupation in particular.

Skillscape

  • Using the RCA index, occupations can be distinguished based on their “effective use” of skills, which can be expressed as \[ e(o, s) = \begin{cases} 1, & \text{if } RCA(o, s) > 1 \\ 0, & \text{otherwise} \end{cases} \]
  • Through \(e(o, s)\) we proceed to define skill complementarity, as a measure of the frequency of a pair of skills being effectively used by the same occupation: \[ \theta(s,s') = \frac{\sum_{o \in ISCO3}{e(o,s) \cdot e(o,s')}}{\max\left(\sum_{o \in ISCO3}{e(o,s)}, \sum_{o \in ISCO3}{e(o,s')}\right)} \]
  • Each node in the graph below represents a 3-digit ESCO skill and they are linked according to skill complementarity, using a threshold of \(\theta(s,s') > 0.55\).
  • The colour of each node represents the ISCO 1 occupation for which the corresponding skill has the biggest comparative advantage (i.e. \(\arg\max_{o} RCA(o,s)\)).
  • The Fruchterman-Reingold layout algorithm based on skill complementarity is utilized to create the following Skillscape.
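The construction of the Skillscape edges can be sketched as follows, taking a skill to be effectively used by an occupation when its RCA exceeds 1. The function names and the toy RCA values are hypothetical; only the complementarity formula and the 0.55 threshold come from the text.

```python
from itertools import combinations

def effective_use(rca_values):
    """{(occupation, skill): RCA} -> {skill: set of occupations with RCA > 1}."""
    used = {}
    for (occupation, skill), value in rca_values.items():
        used.setdefault(skill, set())
        if value > 1:
            used[skill].add(occupation)
    return used

def complementarity(used, s1, s2):
    """theta(s1, s2): shared effective users over the larger of the two user counts."""
    denominator = max(len(used[s1]), len(used[s2]))
    if denominator == 0:
        return 0.0  # neither skill is effectively used by any occupation
    return len(used[s1] & used[s2]) / denominator

def skillscape_edges(rca_values, threshold=0.55):
    """Return the skill pairs whose complementarity exceeds the threshold."""
    used = effective_use(rca_values)
    return [
        (s1, s2)
        for s1, s2 in combinations(sorted(used), 2)
        if complementarity(used, s1, s2) > threshold
    ]

# Toy RCA values (hypothetical, for illustration only).
rca_values = {
    ("waiter", "serving food"): 8.0,
    ("waiter", "handling payments"): 2.0,
    ("cashier", "handling payments"): 3.0,
    ("cashier", "serving food"): 1.5,
    ("developer", "programming"): 6.0,
    ("developer", "handling payments"): 0.2,
    ("waiter", "working in teams"): 0.9,
}
edges = skillscape_edges(rca_values)
print(edges)
```

The resulting edge list could then be passed to a graph library for layout; for example, networkx's `spring_layout` implements a Fruchterman-Reingold force-directed layout. Whether the report used that library is not stated in the text.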

[Figure: Skillscape network of 3-digit ESCO skills linked by skill complementarity. Panels: Overall, Communication, Organisational, Job Related, Computer.]

  • Two main clusters are observed: one composed of skills primarily included in CVs of people whose latest job belongs to the ISCO 1 occupations Professionals and Managers, and another of skills from the ISCO 1 occupations Elementary occupations, Craft and related workers, Clerical support workers, and Service and sales workers.
  • Skills from the first cluster are relatively ubiquitous.
  • Note also the small cluster of three service-type skills at the bottom of the graph.

Skills of NEETs

Characteristics of NEETs

  • Europass CVs may include information on the demographic characteristics, work experiences, qualifications, and skills of users. Using the temporal fields of work experiences and qualifications, it is possible to determine the current status of users with regards to education, employment and training.
  • Users are classified as employed or in education if they have included an ongoing work experience or qualification, respectively. If the status in one of the two areas is not disclosed, users are classified based only on the one disclosed. If neither employment nor education status is disclosed, users are marked as “Unknown”. The “NEET” category is composed of users who disclosed both their education and employment status and have neither an ongoing work experience nor an ongoing qualification; NEETs aged between 15 and 29 are marked separately from the rest.
  • The graphs below compare the demographic characteristics of users based on their status at the time of CV creation.

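[Figure: demographic characteristics of users by education, employment and training status. Panels: Overall, Gender, Age, Country.]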

  • Approximately 33% of users are classified as NEET, of which 71% are between 15 and 29. 56% are in employment, education, or both, while there is not enough information to determine the status of the remaining 11% of users. About half of users in education are also employed.
  • Taking into account the roughly 5% overrepresentation of males in the dataset, male and female users are roughly equally likely to be classified as NEET. Active employment is slightly more likely for male users, while active education is slightly more likely for female users.
  • The age breakdown of NEETs is slightly skewed towards younger ages compared to users in education, employment, or training.
  • Over 40% of Europass users in Portugal and Ireland are classified as NEET. The lowest shares are seen in Estonia and Finland, where they are under 25%.
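The status classification described at the start of this section can be sketched as a small decision function. This is one possible reading rather than the project's implementation: the category labels are hypothetical, and the handling of users who disclose only one status and have nothing ongoing in it is not specified in the text, so the sketch marks them “Unknown” as an explicit assumption.

```python
def classify_status(employed, in_education, age=None):
    """Classify a user from two flags.

    employed / in_education: True if an ongoing work experience / qualification
    was reported, False if entries exist but none is ongoing, None if the
    area was not disclosed at all.
    """
    if employed is None and in_education is None:
        return "Unknown"
    if employed and in_education:
        return "Employed and in education"
    if employed:
        return "Employed"
    if in_education:
        return "In education"
    # From here on, nothing ongoing was reported in any disclosed area.
    if employed is None or in_education is None:
        # Assumption: NEET requires both areas to be disclosed.
        return "Unknown"
    if age is not None and 15 <= age <= 29:
        return "NEET (15-29)"
    return "NEET (other)"

# Toy users (hypothetical).
print(classify_status(False, False, age=22))
print(classify_status(True, None))
print(classify_status(None, None))
```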

Distribution of skills

  • A total of 123.5K users of Europass participating in the survey with age between 15 and 29 have included information about their education and employment status. Of those, approximately 46.7K are not in education, employment, or training (NEET), while the rest are in education, employment, or training (EET).
  • The top 10 3-digit skill groups of users classified as NEET overall and on each skill type are displayed in the graph below. The number shown is the percentage of NEET CVs that included each skill. The equivalent number for users in education, employment, or training is also included for comparison.

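[Figure: top 10 3-digit skill groups of NEET users (% of CVs), with the share for users in education, employment, or training for comparison. Panels: Overall, Communication, Organisational, Job Related, Computer.]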

  • The most common skills seen in CVs created by NEETs are roughly the same as those seen in users in education or employment. However, the average frequency of inclusion of skills is lower for NEETs.
  • The frequency is especially lower for skills that suggest experience in the work force, such as supervising a team or group, and specialization in one particular area, such as software and applications development and analysis.
  • On the other hand, more generic skill groups such as demonstrate good manners and providing information to public and clients are included slightly more frequently.

Most deviated skills

  • It is possible to measure the difference in the relative frequency of inclusion of skills between users that have been classified as NEET and those that are in education, employment, or training. To do this, we calculate the ratio of the two categories (NEET or not) for each skill.
  • The graph below reports the breakdown by status in education, employment or training of the 20 3-digit skill groups with at least 200 observations that display the biggest relative difference, overall and individually for each skill type. The number reported is the percentage share of each category within each skill group.

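[Figure: breakdown by NEET status (%) of the 20 most deviated 3-digit skill groups with at least 200 observations. Panels: Overall, Communication, Organisational, Job Related, Computer.]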

  • Overall, users currently in education, employment, or training tend to report skills requiring more professional or academic specialization. For example, advising on business or operational matters, providing financial advice and dental studies are much more often reported by these users.
  • On the other hand, users classified as NEET more frequently report generic skills or skills related to manual labour. For example, serving food and drinks, storing goods and materials, and demonstrating willingness to learn are relatively more common among these users.
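The ranking of the most deviated skill groups can be sketched as a ratio of relative frequencies. The exact statistic used in the report is not spelled out, so the sketch assumes the ratio of the NEET share to the EET share, ranked by the absolute value of its logarithm; the function name and the toy counts are hypothetical.

```python
import math

def most_deviated(neet_counts, eet_counts, neet_total, eet_total, min_obs=200, top=20):
    """Rank skill groups by how far the NEET/EET relative-frequency ratio is from 1.

    neet_counts / eet_counts: {skill_group: number of CVs including it}.
    Returns a list of (skill_group, ratio), most deviated first.
    """
    rows = []
    for skill in set(neet_counts) & set(eet_counts):
        n, e = neet_counts[skill], eet_counts[skill]
        # Skip rare groups; zero counts are skipped too (assumption) so the
        # ratio and its logarithm stay defined.
        if n + e < min_obs or n == 0 or e == 0:
            continue
        ratio = (n / neet_total) / (e / eet_total)
        rows.append((skill, ratio))
    rows.sort(key=lambda row: (-abs(math.log(row[1])), row[0]))
    return rows[:top]

# Toy counts (hypothetical): 1000 NEET users and 2000 EET users.
neet_counts = {"A": 100, "B": 150, "C": 10, "D": 50}
eet_counts = {"A": 100, "B": 300, "C": 20, "D": 400}
ranking = most_deviated(neet_counts, eet_counts, neet_total=1000, eet_total=2000)
print(ranking)
```

A ratio above 1 marks a skill group that is relatively more common among NEETs, a ratio below 1 one that is relatively more common among users in education, employment, or training.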

Methodology

ESCO skills pillar

  • ESCO is the multilingual classification of European Skills, Competences, Qualifications and Occupations.
  • The ESCO skills pillar distinguishes between i) skill/competence and ii) knowledge concepts.
  • As of version 1.0.8, it contains over 13,485 concepts with preferred labels and a number of associated meta-data across 27 languages.
  • Starting with version 1.0.5, these concepts are structured in a hierarchy which contains four sub-classifications, specifically: a) Knowledge, b) Skills, c) Attitudes and values and d) Language skills and knowledge.
  • For Knowledge, Skills, and Attitudes and values, four levels of hierarchy are defined in total, to which the 13K concepts are attached as leaf nodes.
  • The hierarchy is in a continuous process of improvement and not considered finalized.
  • Note that concepts in the skills pillar are also linked through associations defined in previous versions of the ESCO model, which however do not compose a strict hierarchy, but a graph.

Free-text matching

  • Matching of users’ free-text entries to the ESCO model is performed following a similar approach as in the previous phase of the project, but with modifications enabled by the newly-introduced skills hierarchy.
  • The hierarchy is exploited in two main ways: a) to enrich the corpus of each ESCO skill concept, resulting in better matches at the lowest level, and b) to aggregate these matches to derive matches for each individual level in the hierarchy.
  • The vocabulary of the multilingual ESCO skills classification is initially brought into tidy format.
  • Specifically, the vocabulary of each of the 13K concepts is derived from their preferred and alternate labels, as well as the preferred and alternate labels of their parents in the ESCO hierarchy.
  • The vocabulary of the ESCO skills as well as users’ free-text entries is then cleansed and used to generate the necessary numerical statistics.
  • An optimized language-agnostic matching method is used to match free text against the ESCO vocabulary.
  • A text mining algorithm retrieves information about the association between free text and ESCO skills using the precalculated numerical statistics.
  • Each free-text entry can be equivalent to one or more skills in ESCO. The number of maximum matches per free-text entry is proportional to the size of the entry, with longer ones matching to more ESCO skills.
  • Finally, a voting algorithm leads to a suggested skill in a higher level of the hierarchy by aggregating the matches on the lower level.
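The steps above can be illustrated with a deliberately simplified sketch. It is not the project's algorithm: the scoring (a sum of inverse document frequencies over shared tokens), the `TOKENS_PER_MATCH` constant, the group identifiers and the three-concept taxonomy are all assumptions chosen only to show vocabulary enrichment with parent labels, a length-proportional match limit, and majority voting to a higher level.

```python
import math
import re
from collections import Counter

# Hypothetical mini-taxonomy: concept -> (parent group id, labels of the concept).
CONCEPTS = {
    "serve beverages": ("G1", ["serve beverages", "serve drinks"]),
    "serve food": ("G1", ["serve food in table service"]),
    "use spreadsheets": ("G2", ["use spreadsheets software", "excel"]),
}
GROUP_LABELS = {"G1": ["serving food and drinks"], "G2": ["using digital tools"]}
TOKENS_PER_MATCH = 3  # assumed: one match allowed per 3 tokens of free text

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

# Enrich each concept's vocabulary with the labels of its parent group.
VOCAB = {
    concept: set(tokenize(" ".join(labels + GROUP_LABELS[group])))
    for concept, (group, labels) in CONCEPTS.items()
}

# Precalculated statistics: inverse document frequency over concept vocabularies.
DF = Counter(token for vocab in VOCAB.values() for token in vocab)
IDF = {token: math.log(len(VOCAB) / df) + 1.0 for token, df in DF.items()}

def match(entry):
    """Return the best-scoring concepts; longer entries may match more concepts."""
    tokens = tokenize(entry)
    limit = max(1, len(tokens) // TOKENS_PER_MATCH)
    scores = {
        concept: sum(IDF[t] for t in sorted(set(tokens) & vocab))
        for concept, vocab in VOCAB.items()
    }
    ranked = sorted((c for c in scores if scores[c] > 0), key=lambda c: (-scores[c], c))
    return ranked[:limit]

def vote(matches):
    """Aggregate leaf-level matches into the most frequent parent group."""
    if not matches:
        return None
    groups = Counter(CONCEPTS[concept][0] for concept in matches)
    return groups.most_common(1)[0][0]

entry = "serving drinks and food to customers, excel"
matches = match(entry)
print(matches, vote(matches))
```

Because tokenization is purely character-based and the statistics are derived from the vocabulary itself, a sketch of this kind applies the same procedure to every language, and its quality then depends on the labels available for each one.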

June - September 2019