Cross checking with official statistics

Abstract

We produce associations and cross-references of the aggregated Europass CV data with other data sources and official statistics (notably, Eurostat’s European Union Labour Force Survey and Cedefop’s online job vacancy data).

Data sources

Europass database

  • Includes data for over 10M users who used the application between Q1 2017 and Q2 2020.
  • For details on information retrieval and the data processing pipeline see Europass Survey report: Europass CV Insights Report: Methodology
  • For details on weighting with the aim to calibrate the dataset based on Eurostat’s demographic data: Occupations Analysis, Trends and Correlations: Weighting
  • Note that jobs in ISCO group Armed forces occupations are sparsely included in official statistics provided by Eurostat. For this reason, they have also been excluded from the Europass data for the context of this report.

European Union Labour Force Survey

The European Union Labour Force Survey (EU LFS) is a large-scale household sample survey regarding the labour participation of people aged 15 and over. It is conducted across all Member States of the European Union, as well as four candidate countries and three EFTA countries. The national statistical institutes are responsible for data collection in each country, and data processing is carried out by Eurostat. The centralized guidelines for the surveys conducted by national institutes and the adherence to common standardized classifications allow for the harmonization of data at the European level. Data are made available in the form of indicators specific to the topic covered.

Specifically, the following indicators have been used:

Cedefop’s Skills in online job advertisements

Cedefop’s Skills in online job advertisements project provides insights into job vacancies based on online sources of job postings. It covers 28 European countries and is the result of an analysis of more than 100 million online job ads collected between July 2018 and December 2020. Similar to the Europass Data Science project, this project is meant to serve as a complement to other sources of official statistics. The online job vacancy data used in this report were acquired via the respective Skills Panorama dashboard.

Employment (LFS)

Employment by occupation per age group and gender (EA19, 2019)

  • The distribution of occupations observed in the Europass database for users reporting employment at the time of the creation of their CV is compared with the distribution of occupations of the general population reported by the Labour Force Survey. Note that Europass users are marked as employed if they have not included an end date on their most recent work experience.
  • Specifically, the indicator reporting on employment by sex, age, professional status and occupation [lfsa_egais] is utilized. Statistics for people indicated as employed persons in their activity and employment status are considered. This indicator reports on quarterly data, so the mean of the four quarters has been used to derive an estimate for the year.
  • The following charts report on the percentage each occupation (ISCO 1) represents in its respective sample. 2019 has been selected as the year of focus as it is the latest complete year available in the Europass database (which has data up to Q2 2020). The statistics below concern an average derived for the Euro area (EA-19) as defined by Eurostat.
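The two steps described above (averaging the four quarters, then expressing each occupation as a percentage of its sample) can be sketched as follows. This is a minimal illustration rather than the report's actual pipeline: the counts are hypothetical and are not taken from [lfsa_egais], and the function name is our own.

```python
from statistics import mean

# Hypothetical quarterly employment counts (thousands) per ISCO 1 group.
# These figures are illustrative only and do not come from lfsa_egais.
quarterly_counts = {
    "Managers": [100.0, 102.0, 98.0, 100.0],
    "Professionals": [300.0, 310.0, 290.0, 300.0],
    "Elementary occupations": [100.0, 100.0, 100.0, 100.0],
}


def yearly_shares(quarterly):
    """Average the four quarters, then express each group as a % of the total."""
    yearly = {group: mean(values) for group, values in quarterly.items()}
    total = sum(yearly.values())
    return {group: 100.0 * value / total for group, value in yearly.items()}


shares = yearly_shares(quarterly_counts)
for group, share in shares.items():
    print(f"{group}: {share:.1f}%")
```

The same function can be applied to the Europass counts, so that both sources are compared on percentage shares rather than on absolute counts.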

[Charts, one panel each: Y25-49 (Total), Y25-49 (Male), Y25-49 (Female), Y15-24 (Total), Y15-24 (Male), Y15-24 (Female)]

  • Managers and Professionals are significantly overrepresented in the Europass data. The former is reported almost twice as commonly among Europass users in the age group Y25-49 compared to what is seen in the general population, while the latter is seen over 50% more commonly.
  • On the other hand, Clerical support workers, Craft and related trades workers and Skilled agricultural, forestry and fishery workers are significantly underrepresented, with Europass users reporting these occupations roughly half as commonly as they appear in the Labour Force Survey.
  • With a few exceptions, these imbalances are even more pronounced among users in the age group Y15-24, where Elementary occupations also tend to be reported three times less commonly than in the general population.
  • Most patterns of differentiation between males and females observed in the Labour Force Survey indicator are generally reflected in the Europass data. For example, Plant and machine operators and assemblers are more likely to be male in both sources, while Service and sales workers are more likely to be female.
  • A likely explanation for the deviation lies in the fact that Europass is an online CV creation tool, and is therefore not evenly used by people across different occupations. People working as managers are more likely to use the application than workers in craft and related trades, for example.

Goodness of fit test

  • Pearson’s chi-squared test has been employed to determine whether there is a statistically significant difference between the distribution of occupations as it is measured from the employed Europass users versus the general population.
  • The following table displays the results of the test for each population category with respect to gender and age group. The χ²-statistic has been calculated for both the weighted and the unweighted data, with a lower statistic signifying a closer relationship between the two distributions. Degrees of freedom (d.f), and the sample size are also shown.
| Gender | Age group | d.f. | χ²-statistic (Raw) | Sample size (Raw) | χ²-statistic (Weighted) | Sample size (Weighted) |
|---|---|---|---|---|---|---|
| Total | Y25-49 | 8 | 5841.721 | 33,897 | 4396.787 | 24,123 |
| Male | Y25-49 | 8 | 1855.593 | 13,007 | 1727.257 | 9,498 |
| Female | Y25-49 | 8 | 3318.991 | 15,215 | 2058.086 | 10,151 |
| Total | Y15-24 | 8 | 29138.182 | 18,215 | 19514.383 | 10,483 |
| Male | Y15-24 | 8 | 6297.851 | 6,174 | 4144.083 | 3,200 |
| Female | Y15-24 | 8 | 22362.010 | 9,464 | 14332.630 | 5,497 |
  • For 8 degrees of freedom and the χ² statistic calculated, the null hypothesis is rejected for every population category. This means that the distribution of occupations observed in the Europass database does not derive from the distribution of occupations in the general population.
  • The χ² statistic is significantly lower for the age group 25-49, which means that the observed and expected distributions are closer for older users than for younger ones, as also noted qualitatively.
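As a rough illustration of the goodness-of-fit test described above, the sketch below computes Pearson's χ² statistic from observed counts and expected shares, and its upper-tail probability. It is not the report's own code: the function names and the example counts are hypothetical, and the tail probability uses the closed form that holds only for an even number of degrees of freedom (which covers the 8 degrees of freedom used here).

```python
import math


def chi_squared_statistic(observed_counts, expected_shares):
    """Pearson's statistic: sum over categories of (O - E)^2 / E, with E = n * p."""
    n = sum(observed_counts)
    return sum(
        (observed - n * share) ** 2 / (n * share)
        for observed, share in zip(observed_counts, expected_shares)
    )


def chi_squared_sf_even_df(x, df):
    """Upper-tail probability P(X >= x); closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2.0
    terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms


# Hypothetical counts for nine occupation groups (9 categories -> 8 d.f.).
observed = [120, 300, 180, 60, 140, 10, 70, 50, 70]
# Hypothetical population shares for the same groups (they sum to 1).
expected = [0.05, 0.20, 0.17, 0.10, 0.18, 0.03, 0.11, 0.07, 0.09]

statistic = chi_squared_statistic(observed, expected)
p_value = chi_squared_sf_even_df(statistic, df=len(observed) - 1)
print(statistic, p_value)

# The statistic reported for Total, Y25-49 (Raw) is far beyond the 5% critical
# value for 8 d.f. (about 15.507), so its tail probability is effectively zero.
print(chi_squared_sf_even_df(5841.721, 8))
```

Because the statistic grows with the sample size, statistics from groups of very different sizes are best compared with that in mind.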

Employment by occupation per country (2019)

  • The above results are presented for each distinct country. Male and female population aged between 15 and 49 are included in the comparison. Countries selected are those for which the Europass sample exceeds 800 users among the EA-19 countries defined by Eurostat.
  • The following charts report on the percentage each occupation (ISCO 1) represents in its respective country’s sample in 2019.

[Charts, one panel each: EA19, IT, PT, ES, EL, FR, SI, DE, AT, MT]

  • The tendency for Managers and Professionals to be overrepresented in the Europass database is observed for users from most countries. Additionally, Technicians and associate professionals are strongly overrepresented among users from certain countries like Portugal, Spain and Greece, where this occupation appears twice as often as its respective statistic reported by the Labour Force Survey. A notable exception to that is Germany, where they are underrepresented.
  • Likewise, Clerical support workers, Craft and related trades workers and Skilled agricultural, forestry and fishery workers are underrepresented across the board. The last group accounts for a very small percentage even in countries with a robust primary sector such as Greece, where 8.5% of the population works in this occupation but only 0.4% of Europass users have reported it. Elementary occupations are also underreported across most countries, especially Spain and France.
  • Patterns specific to certain countries noted by the Labour Force Survey are often also reflected in the Europass database. For example, countries in southern Europe such as Italy, Portugal and Greece have more users working as Service and sales workers than the EA-19 average according to both LFS and the measurements on the Europass database. Note that this does not apply in every case however, as the Europass userbase’s characteristics and behaviour may differ from country to country.
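Over- and underrepresentation as discussed above can be expressed as a simple ratio of shares. The sketch below uses the two Greek figures quoted in the text (0.4% of Europass users against 8.5% in the Labour Force Survey); the function name is our own and the ratio is an illustrative summary, not a measure defined by the report.

```python
def representation_ratio(europass_share, lfs_share):
    """Ratio of the Europass share to the LFS share for one occupation.

    Values above 1 indicate overrepresentation in Europass,
    values below 1 indicate underrepresentation.
    """
    if lfs_share <= 0:
        raise ValueError("the LFS share must be positive")
    return europass_share / lfs_share


# Skilled agricultural, forestry and fishery workers in Greece (shares in %).
ratio = representation_ratio(0.4, 8.5)
print(f"ratio = {ratio:.3f}, i.e. about {1 / ratio:.0f} times less common")
```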

  • Trends observed in the Europass database for the age group Y25-49 generally align well with the trends inferred from statistics in the Labour Force Survey. Yearly changes for most ISCO 1 occupations are relatively small, but upward and downward trends can be detected. The odds ratio change in the two sources agrees to within 1% for Managers, Professionals, Plant and machine operators and assemblers, and Service and sales workers. For most other occupations, the trend’s magnitude is captured with a larger deviation, but its direction still aligns in most cases.
  • An exception to this is Skilled agricultural, forestry and fishery workers, whose trends deviate strongly between Europass and official statistics. This is among the occupations that are relatively rare in the Europass database, so it is to be expected that its behaviour cannot be measured very accurately.
  • The younger population displays different trends between Europass and LFS. This is likely because the career paths of younger individuals are inherently less stable, which is also noted in the comparison of the distributions of occupations. Moreover, the sample size is relatively small for individuals aged between 15 and 20, especially in 2017. It is also possible that early users of the Europass application were biased towards specific occupations, an imbalance that may even out in later years as the sample size increases.

Gender ratio by occupation (EA19)

  • Information about the gender breakdown of each occupation is available in both the Europass database and the LFS indicator [lfsa_egais]. We define the gender ratio of an occupation as the ratio of males to females for that occupation.
  • We apply regression analysis to measurements for the Eurozone (EA-19) in order to calculate how the proportion of males in each occupation has changed in the period between 2017 and 2020 according to each source. As with the trends in employment, we derive an estimate for the trend and present it as the percentage change of the gender proportion (males to total) for one unit of time (1 year): \((e^{\text{estimate}} - 1) \times 100\%\).
  • The time series reports the gender ratio by occupation (ISCO 1), while the results of the regression analysis are presented in the form of a barplot, comparing the two sources.
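To make the conversion from a regression estimate to a yearly percentage change concrete, the sketch below fits a straight line to the logarithm of the male proportion and applies \((e^{\text{estimate}} - 1) \times 100\%\) to the slope. This is an assumption-laden simplification: the report does not spell out its model here, so an ordinary least-squares fit on the log scale is swapped in, and the yearly proportions are hypothetical.

```python
import math


def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def yearly_percentage_change(years, proportions):
    """Fit log(proportion) ~ year and convert the slope to a % change per year."""
    estimate = ols_slope(years, [math.log(p) for p in proportions])
    return (math.exp(estimate) - 1.0) * 100.0


# Hypothetical male proportions for one occupation, growing 2% per year.
years = [2017, 2018, 2019, 2020]
male_proportion = [0.5, 0.51, 0.5202, 0.530604]

print(f"{yearly_percentage_change(years, male_proportion):.2f}% per year")
```

Working on the log scale means the estimate describes a multiplicative change, which is why it is exponentiated before being read as a percentage.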

[Charts, one panel each: Y25-49 (Time series), Y15-24 (Time series)]

  • Barring some exceptions, the gender ratio inferred from the Europass database generally reflects the gender ratio reported by the Labour Force Survey. There are over three times more males working as Skilled agricultural, forestry and fishery workers than there are females, while females working as Service and sales workers are almost twice as many as males. Plant and machine operators and assemblers as well as Craft and related trades workers are overwhelmingly male according to both sources, although the imbalance is not as large in the Europass data as it is in the Labour Force Survey. Two major exceptions to this are Managers and Clerical support workers.
  • Managers appear to be evenly distributed between males and females on Europass, even though the Labour Force Survey reports twice as many males as females. This may be partly explained by the fact that Europass data, even after weighting, are biased towards younger people (due to the weight threshold), and it can be observed that the gender ratio is trending down year-on-year.
  • Trends inferred from both Labour Force Survey data and the Europass database display very small year-on-year differences in the gender proportion of each occupation. Those trends do not align very well between the two sources, and in most cases the Europass trends are biased in favour of growth in the male proportion of each occupation. A possible explanation for this is that there may have been more female early adopters of the Europass application, a difference that has evened out in later years.

Unemployment (LFS)

Unemployment by most recent occupation per country (2019)

  • Europass users who did not specify active employment at the time of creation of their CV have been regarded as unemployed. The distribution of the most recent occupation of these users is compared with the respective statistics of the general population as reported by the Labour Force Survey. Note that Europass users are marked as unemployed if all of their work experiences include an end date.
  • Specifically, the indicator reporting on previous occupations of the unemployed, by sex [lfsq_ugpis] is used. This indicator reports on quarterly data, so the mean of the four quarters has been used to derive an estimate for the year. No distinction is made with regard to age group in the LFS indicator, so the measurements are compared with the entire age range of the Europass database.
  • The following charts report on the percentage each occupation (ISCO 1) represents in its respective sample. Once again, 2019 has been selected as the year of focus as it is the latest complete year available in the Europass database. The statistics below concern countries for which the Europass sample exceeds 8000 users among the EA-19 countries defined by Eurostat.

[Charts, one panel each: EA19, IT, PT, ES, EL, DE, FR, SI, AT]

  • The distributions of the most recent occupations of the unemployed display major differences between the two sources. Managers, Professionals and Technicians and associate professionals are even more overrepresented in this case than they were in the case of employed individuals, with managers appearing 5 times more often.
  • The most underrepresented group in the Europass database is Elementary occupations, which appears 5 times less often than in the general population. Service and sales workers and Clerical support workers are also heavily underrepresented, by over 75%.
  • These patterns are observed consistently, to varying degrees, for most countries. A few exceptions to this are Portugal and Belgium, where the percentage of Managers is close between the two sources. It should be noted that the Labour Force Survey does not report statistics for all occupations in every country. In those cases, the distribution displayed represents the occupations measured.
  • The bias in the reported occupations once again has to do with Europass’s nature as an online CV creator which is not evenly used by people of different occupations. It may also have to do with the amount of time workers from different occupations stay unemployed. Managers and Professionals may move from job to job relatively quickly compared to people in Elementary occupations, meaning that the LFS is less likely to measure them as unemployed. Finally, the Europass data are biased towards the younger population, which may also play a role in the difference between the two distributions.

Correlation between the distributions of the employed and the unemployed in Europass CVs

  • The distributions of occupations of employed and unemployed Europass users are similar. We calculate the Pearson correlation coefficient (r) as a measure of the linear relationship between the two distributions. r ranges from -1 to 1, with 1 signifying perfect positive linear correlation.
| Gender | Source | r | t | d.f. | p-value |
|---|---|---|---|---|---|
| Total | Europass (Weighted) | 0.9957607 | 28.64208 | 7 | 0e+00 |
| Male | Europass (Weighted) | 0.9945066 | 25.13727 | 7 | 0e+00 |
| Female | Europass (Weighted) | 0.9978480 | 40.26372 | 7 | 0e+00 |
| Total | Europass (Raw) | 0.9948595 | 25.99261 | 7 | 0e+00 |
| Male | Europass (Raw) | 0.9910250 | 19.61448 | 7 | 2e-07 |
| Female | Europass (Raw) | 0.9965291 | 31.67219 | 7 | 0e+00 |
  • r is close to 1 for every gender and source. This suggests a strong correlation between the distributions of employed and unemployed users of Europass. It can be said that the distribution of occupations in Europass does not measure explicitly employed or unemployed population, but rather the distribution of occupations of people in the job search market.
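For reference, the relationship between r, t and the degrees of freedom in the table above can be sketched as follows. The test statistic is \(t = r\sqrt{n-2}/\sqrt{1-r^2}\) with \(n - 2\) degrees of freedom, so 7 degrees of freedom correspond to \(n = 9\) paired values. The function names are our own, and this is an illustration rather than the code used to produce the table.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two equal-length samples with at least 3 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Reproduce the t value for Total, Europass (Weighted) from its tabulated r.
t_total_weighted = t_statistic(0.9957607, 9)
print(round(t_total_weighted, 2))  # close to the tabulated 28.64
```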

Education level of the long-term unemployed by age group per country (2019)

  • Eurostat defines long-term unemployment as unemployment for 12 months or more. This statistic has been measured for Europass users whose most recent employment ended more than 1 year before the creation of their CV and those who have reported no previous employment altogether.
  • The Labour Force Survey compares this statistic with educational level (ISCED-11) in its respective indicator long-term unemployment (12 months and more) by sex, age, educational attainment level and NUTS 2 regions [lfst_r_lfu2ltu].
  • The following charts compare this indicator with measurements on the Europass database for the two available age groups in 2019 per country. Note that in this case the Europass measurements are based on users aged from 25 to 49 years.
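The classification rules stated in this report (employed if the most recent work experience has no end date, unemployed otherwise, and long-term unemployed if that experience ended more than a year before the CV was created or if no experience is reported) can be sketched as below. The data layout, the function name, the use of the start date to pick the most recent experience, and the 365-day approximation of one year are all assumptions made for illustration.

```python
from datetime import date
from typing import List, Optional, Tuple

# One work experience as (start date, end date); an end date of None means
# the user left it blank, i.e. the job is treated as ongoing.
Experience = Tuple[date, Optional[date]]


def classify_status(experiences: List[Experience], cv_created: date) -> str:
    """Label a CV as employed, unemployed or long-term unemployed."""
    if not experiences:
        # No previous employment reported counts as long-term unemployment.
        return "long-term unemployed"
    # Assumption: the most recent experience is the one that started last.
    _, last_end = max(experiences, key=lambda item: item[0])
    if last_end is None:
        return "employed"
    # Assumption: "more than 1 year" is approximated as more than 365 days.
    if (cv_created - last_end).days > 365:
        return "long-term unemployed"
    return "unemployed"


cv_date = date(2019, 6, 1)
print(classify_status([(date(2018, 1, 1), None)], cv_date))
print(classify_status([(date(2017, 1, 1), date(2019, 3, 1))], cv_date))
print(classify_status([(date(2015, 1, 1), date(2017, 3, 1))], cv_date))
```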

[Charts, one panel each: EA19, IT, PT, ES, EL, DE, FR, SI, AT, BE]

  • Compared to the general population as reported in the Labour Force Survey, unemployed Europass users report a higher level of education on average. Specifically, unemployed users in the age group Y25-64 are over 4 times more likely to have received education at a level equivalent to ISCED 5-8.
  • A likely explanation for this is the fact that Europass is generally used by people who are applying for job positions that require specialization and may necessitate at least some tertiary education. This is indeed the case when observing the distribution of occupations, where Managers and Professionals especially are overrepresented compared to Service and sales workers and Elementary occupations.
  • This imbalance becomes even more prominent for younger users in the Y15-24 group, which is to be expected, since people who have just finished university education, and are thus unemployed, are likely to use an online application like Europass when applying for their first job.

Job Vacancies (Cedefop)

Supply and demand (2019)

  • Job vacancy data from the OJA project have been acquired from Skills Panorama for the purpose of measuring how supply of jobs compares with demand as inferred from the Europass database. Specifically, the job_applied_for field from Europass CVs has been used to measure demand for users between ages 15 and 49.
  • The following charts compare the distribution of supply and demand of occupations (ISCO 1) in 2019 for countries in the Eurozone area (EA-19). Additionally, the general distribution of occupations of the employed population based on the Labour Force Survey appears for reference.
  • Note that in OJA’s case the EU average refers to EU27, while for Europass and LFS it refers to the EA19.

[Charts, one panel each: EU, IT, PT, ES, EL, FR, SI, DE, AT, MT]

  • Qualitatively, it can be observed that the distribution of occupations in Europass CVs resembles its equivalent in job vacancies, especially when compared to the general distribution of occupations on the market. The next section quantifies this similarity.
  • The biggest gap is noted for Clerical support workers and Plant and machine operators and assemblers, where the supply of jobs online is bigger than the demand from Europass users. This may be an indicator of an actual gap in the labour force, or a difference in the behaviour of the population working in those jobs when it comes to online job applications.
  • The opposite can be noted for Service and sales workers, with many Europass users interested in these jobs, but not equally as many online vacancies. The discrepancy in this case may be a result of these particular open job positions not being announced online as often as jobs from other ISCO groups.

Correlation between the supply and demand

  • We use the Pearson correlation coefficient (r) to measure the linear relationship between the distribution of occupations on Europass (both weighted and unweighted) and in job vacancies in each country.
| Country | Source | r | t | d.f. | p-value |
|---|---|---|---|---|---|
| EU | Europass (Weighted) | 0.9661474 | 9.908055 | 7 | 0.0000227 |
| Italy | Europass (Weighted) | 0.9176151 | 6.108085 | 7 | 0.0004872 |
| Portugal | Europass (Weighted) | 0.9298582 | 6.686741 | 7 | 0.0002808 |
| Spain | Europass (Weighted) | 0.9716155 | 10.866556 | 7 | 0.0000123 |
| Greece | Europass (Weighted) | 0.9678767 | 10.184984 | 7 | 0.0000190 |
| France | Europass (Weighted) | 0.8228066 | 3.830476 | 7 | 0.0064538 |
| Slovenia | Europass (Weighted) | 0.4395991 | 1.294898 | 7 | 0.2364269 |
| Germany | Europass (Weighted) | 0.9400677 | 7.294056 | 7 | 0.0001636 |
| Austria | Europass (Weighted) | 0.9394135 | 7.250745 | 7 | 0.0001698 |
| Malta | Europass (Weighted) | 0.6123692 | 2.049373 | 7 | 0.0796040 |
| EU | Europass (Raw) | 0.9387850 | 7.209772 | 7 | 0.0001760 |
| Italy | Europass (Raw) | 0.8700171 | 4.668861 | 7 | 0.0022906 |
| Portugal | Europass (Raw) | 0.9341104 | 6.923060 | 7 | 0.0002266 |
| Spain | Europass (Raw) | 0.9740810 | 11.393377 | 7 | 0.0000090 |
| Greece | Europass (Raw) | 0.9433405 | 7.521541 | 7 | 0.0001348 |
| France | Europass (Raw) | 0.8665648 | 4.594025 | 7 | 0.0025018 |
| Slovenia | Europass (Raw) | 0.4758030 | 1.431248 | 7 | 0.1954515 |
| Germany | Europass (Raw) | 0.9508351 | 8.122998 | 7 | 0.0000827 |
| Austria | Europass (Raw) | 0.9334679 | 6.885957 | 7 | 0.0002343 |
| Malta | Europass (Raw) | 0.5462385 | 1.725358 | 7 | 0.1281134 |
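  • For the EU average r is close to 1, suggesting a high correlation between supply and demand. This is the case for most individual countries as well, with the highest correlation being noted for Greece and Spain. Slovenia and Malta are the exceptions: their coefficients are markedly lower (below 0.5 for Slovenia and between 0.5 and 0.65 for Malta), and they are the only cases where the correlation is not statistically significant at the 5% level.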

Job Tenure (LFS)

Job tenure per age group and gender (EA19, 2019)

  • Job tenure most commonly refers to the length of time workers have been at their current job or with their current employer. The Labour Force Survey provides measurements of tenure as part of the indicator reporting on Employment by sex, age and job tenure [lfsa_egad].
  • Due to the varying definitions of job tenure, measuring it on CVs contained in the Europass database proves to be challenging. For the purpose of this comparison, it has been measured for Europass users employed at the time of their CV creation instead of historical data, as the statistic typically pertains to continuing spells of employment rather than completed ones. In this context, job tenure is with respect to job position instead of employer.
  • The subsequent charts compare the distribution of a specific set of job tenure lengths reported in the LFS indicator with our measurements from the Europass database for countries in the Eurozone area (EA-19) as defined by Eurostat in 2019.
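A minimal sketch of how job tenure could be derived from a CV under the assumptions above: tenure is the time between the start of the current job position and the creation of the CV, grouped into bands. The band boundaries and labels below are hypothetical placeholders, since the exact tenure lengths used in the comparison are not listed in the text, and the function names are our own.

```python
from datetime import date

# Hypothetical tenure bands as (upper bound in months, label); anything at or
# above the last bound falls into the final open-ended band.
TENURE_BANDS = [
    (12, "under 1 year"),
    (24, "1 to 2 years"),
    (60, "2 to 5 years"),
]
OPEN_BAND = "5 years or more"


def tenure_months(start: date, cv_created: date) -> int:
    """Whole calendar months between the job start and the CV creation date."""
    return (cv_created.year - start.year) * 12 + (cv_created.month - start.month)


def tenure_band(start: date, cv_created: date) -> str:
    """Assign the ongoing job's tenure to one of the bands defined above."""
    months = tenure_months(start, cv_created)
    if months < 0:
        raise ValueError("the job cannot start after the CV was created")
    for upper_bound, label in TENURE_BANDS:
        if months < upper_bound:
            return label
    return OPEN_BAND


print(tenure_band(date(2018, 1, 1), date(2019, 6, 1)))  # 17 months
```

Only users classified as employed would be passed through such a function, since the measurement concerns continuing spells of employment.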

[Charts, one panel each: Y15-24 (Total), Y15-24 (Male), Y15-24 (Female), Y25-49 (Total), Y25-49 (Male), Y25-49 (Female)]

  • The distribution of job tenure observed in the Europass database differs strongly from the one reported by the Labour Force Survey indicator. This holds especially true for users in the age group Y25-49, who appear to be changing jobs far more frequently than the Labour Force Survey indicates: users with job tenure greater than 5 years are less than half as common as in the general population.
  • The distribution is closer for younger users, who however report job tenure between 1 and 2 years roughly 20% more commonly than the general population.
  • Job tenure duration inferred from CVs in the Europass database is more evenly distributed between the four periods defined. This disparity can be attributed to the fact that Europass is an online CV creator, which is inherently targeted to people looking for a new job. It is thus less likely for an individual who stayed with a particular employer or job title for a long time to use the application than someone who more commonly changes between jobs. Moreover, despite weighting, there still is a bias for younger ages.
  • Differences between the two genders are not major according to either source.

Job tenure per occupation (EA19, 2019)

  • Using the same assumptions as above, the distribution of job tenure of employed Europass users at the time of their CV creation has been calculated for each occupation (ISCO 1).
  • The LFS indicator Job tenure by sex, age, professional status and occupation [lfsa_qoe_4a2] was used for comparison. In this case, the population with professional status “EMP” and aged 25 years or older was considered.
  • The subsequent charts compare the distribution of job tenure for each occupation between LFS and Europass for countries in the Eurozone (EA19) in 2019.

[Charts, one panel each: Managers; Professionals; Technicians and associate professionals; Clerical support workers; Service and sales workers; Skilled agricultural, forestry and fishery workers; Plant and machine operators and assemblers; Elementary occupations]

  • The majority of Europass users across all occupations had been in their current job for between 1 and 4 years when they created their CV.

References

  • Eurostat. (2020). EU - Labour Force Survey microdata 1983-2019, release 2020, version 1 (Version 1) [Data set]. Eurostat. https://doi.org/10.2907/LFS1983-2020V.1
  • European Centre for the Development of Vocational Training. (2019). Online job vacancies and skills analysis: a Cedefop pan-European approach. Publications Office. https://doi.org/10.2801/097022
  • European Centre for the Development of Vocational Training, European Commission, European Training Foundation, International Labour Organization (ILO), Organisation for Economic Co-operation and Development, & UNESCO. (2021). Perspectives on policy and practice: tapping into the potential of big data for skills policy. Publications Office. https://doi.org/10.2801/25160
  • European Centre for the Development of Vocational Training. (2020). Europass CV Insights Report, June - September 2019.
  • European Centre for the Development of Vocational Training. (2019). Skills-OVATE: Skills Online Vacancy Analysis Tool for Europe
  • Hastie, T. J. and Pregibon, D. (1992) Generalized linear models. Chapter 6 of Statistical Models in S eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.
  • Hope, A. C. A. (1968). A simplified Monte Carlo significance test procedure. Journal of the Royal Statistical Society Series B, 30, 582–598. doi: 10.1111/j.2517-6161.1968.tb00759.x. https://www.jstor.org/stable/2984263.
  • Kendall, M. G. (1938). A new measure of rank correlation, Biometrika, 30, 81–93. doi: 10.1093/biomet/30.1-2.81.
