Labour market dataset will enable greater insight into employment inequalities

The dataset represents about one per cent of the workforce. The details of approximately 180,000 de-identified individuals in the Pay As You Earn (PAYE) and Self-Assessment data have been linked to their information in the Annual Survey of Hours and Earnings (ASHE). Linking this data can help to address information gaps about changes in the labour market during the 12-month intervals between ASHE surveys. Bringing together the annual survey with PAYE and Self-Assessment data also gives a fuller picture of a person’s whole interaction with the labour market.

This dataset was developed as part of the ADR England-funded Wage and Employment Dynamics project. The ASHE survey is conducted annually by the Office for National Statistics (ONS). The PAYE and Self-Assessment data was provided by His Majesty’s Revenue & Customs (HMRC). By securely sharing this data for research, HMRC is enhancing our potential to understand earnings and the labour market, and to reduce wage inequalities in Britain.

Damian Whittard, Associate Professor at the University of the West of England, said: “ASHE is an important dataset, but it restricts researchers to studying those that are just in employment in April of each year. For the first time, the HMRC linked datasets will enable researchers to understand the complete pattern of individuals’ work activity throughout the year, whether employed, self-employed or both. This dataset will be transformational in enabling researchers to understand labour market transitions across time and changes in circumstance.”

About the data 

The ASHE dataset includes de-identified personal information such as age and gender, along with employment and employer information. The PAYE data provides details of people’s employment at a point in time and what they have been paid. The Self-Assessment data provides employees’ income from self-employment and other sources.

The linked dataset combines detailed personal, employment, and job characteristics with PAYE and self-employment income data. It covers the one per cent of the working population randomly selected for ASHE. The linkage brings together ASHE data for the years 1997 to 2022 with PAYE data for tax years 2015 to 2019 and Self-Assessment data for tax years 2011 to 2018.

The data has all been de-identified and has been linked under the provisions of the Digital Economy Act 2017, which provides a legal gateway for researchers to access government data in a secure way. 

The linked dataset includes:  

  • Personal characteristics: Age, gender, and residential location 
  • Employment information: Periods of employment and self-employment, number of jobs  
  • Job characteristics: Earnings from employment, income from self-employment, number of jobs, working hours, paid hours, occupation, and pensions  
  • Employer characteristics: Employer identifier, size, industry, legal status, and workplace location(s). 

How this dataset can make a difference 

This dataset can provide insight into changes, patterns and inequalities in the labour market. It can also give a deeper insight into how different individuals work and earn money, for example the types of people who work multiple jobs or supplement full-time employment with self-employed work.

Researchers can examine a range of questions, for example: 

  • How large is the gender wage gap or earnings gap when considering all earned income and total compensation? 
  • Does having a student loan increase the likelihood of working in the gig economy? 
  • What does inequality look like across the wage distribution when we include income from self-employment alongside earnings?  
  • Is the volatility of changes in employment and/or income related to location/region? 
  • Which types of employees also have income from self-employment and what share of their earnings is accounted for by self-employment?  

How to access the data 

Accredited researchers can apply to access the ASHE linked to PAYE and Self-Assessment dataset via the ONS Research Accreditation Service.

You can read more about the dataset, explore the data dictionary, and apply for access on the ADR UK Data Catalogue.

 
