How the Longitudinal Education Outcomes data is becoming more innovative

The Scandinavian countries are famous among researchers for their fantastic ‘register’ data. These systems use individual identifiers to link administrative records covering many aspects of people’s lives: education, employment, health, and much more. Individuals can even be linked to the firms in which they work and to their family members, including across generations, and a de-identified version of the data is made available for research. This rich array of data allows researchers to uncover powerful stories about how circumstances and experiences affect people’s lives, and how different outcomes are linked.

We don’t have register data in the UK, but we’re certainly making great strides in that direction. Over the last decade or so, serious efforts have been made to maximise the value of data collected for the purposes of administering public policy. These efforts have included making standalone, de-identified datasets securely available for research purposes, and linking together multiple datasets to create new resources. These linked datasets enable us to answer a far richer range of policy-relevant research questions.

Enabling insight into people’s journeys through education and into the workforce

The Longitudinal Education Outcomes (LEO) dataset is one such linked resource. Created eight years ago by the Department for Education, it brings together de-identified individual records from schools, further and higher education institutions, and the tax and benefit system. This enables individuals to be tracked through education and into the labour market, providing insight into the characteristics and experiences which shape their journeys. It has already transformed understanding of how much individuals benefit from further and higher education, and how these experiences vary by characteristics and across institutions and subjects.

Now, the potential of LEO to enable transformative insights has been extended still further, with the addition of several new linked datasets:

  1. University application data: Individuals in LEO can now be linked to their university application records from the Universities and Colleges Admissions Service (UCAS).
  2. Business data: We can obtain some insight about the firms in which people in LEO are employed – including who works in the same firm – from a new link to the Inter-Departmental Business Register (IDBR). This is a list of businesses with Pay-As-You-Earn (PAYE) systems or those who are registered for Value-Added Tax (VAT), which contains some information about those businesses, including the industries in which they operate.
  3. Job support data from the pandemic: Finally, we can understand the characteristics of people who participated in the Coronavirus Job Retention Scheme or Self-Employment Income Support Scheme during the Covid-19 pandemic.
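To make the idea of linkage concrete, here is a minimal sketch of how de-identified records might be joined on a shared pseudonymous identifier and then grouped by firm. Everything in it is an assumption for illustration: the field names (`person_id`, `firm_id`, `tax_year`, `highest_qualification`, `industry`) and the toy records are invented and do not reflect the actual structure of LEO, UCAS or IDBR data.

```python
from collections import defaultdict

# Hypothetical toy records: these field names and values are invented
# for illustration and are not taken from LEO, UCAS or the IDBR.
education = {
    "p1": {"highest_qualification": "degree"},
    "p2": {"highest_qualification": "a_level"},
    "p3": {"highest_qualification": "degree"},
}
employment = [
    {"person_id": "p1", "firm_id": "f1", "tax_year": 2020},
    {"person_id": "p2", "firm_id": "f1", "tax_year": 2020},
    {"person_id": "p3", "firm_id": "f2", "tax_year": 2020},
]
firms = {
    "f1": {"industry": "manufacturing"},
    "f2": {"industry": "retail"},
}


def link_records(education, employment, firms):
    """Attach education and firm attributes to each employment spell."""
    linked = []
    for spell in employment:
        person = education.get(spell["person_id"])
        firm = firms.get(spell["firm_id"])
        if person is None or firm is None:
            continue  # drop spells that cannot be matched on both sides
        linked.append({**spell, **person, **firm})
    return linked


def coworkers(linked):
    """Group person identifiers by firm and tax year."""
    by_firm = defaultdict(set)
    for row in linked:
        by_firm[(row["firm_id"], row["tax_year"])].add(row["person_id"])
    return dict(by_firm)


linked = link_records(education, employment, firms)
groups = coworkers(linked)
```

In this toy example, `groups[("f1", 2020)]` contains `p1` and `p2`, which is the sense in which a firm link reveals who works in the same firm.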

The addition of these new datasets opens up a whole raft of fascinating policy-relevant research questions. For example, our team is currently using the new link to IDBR data to understand how individuals with different levels and types of qualifications are distributed across firms. We are then considering what happens to the wages of people working in a firm (hopefully a good proxy for their productivity) when someone with a particular level and type of education joins. In other words, does having more educated co-workers make you more productive? Understanding this is important from a policy perspective, because it may indicate that the benefits of education extend beyond the individuals themselves to others in society. This in turn could help to inform government decisions about investing more in education.
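One simple way to describe a worker's exposure to educated colleagues is a "leave-one-out" share: the proportion of the other people in the same firm who hold a given qualification. The sketch below is only an illustration of that descriptive measure, not our team's actual method, and the function name `coworker_graduate_share` and the fields `person_id`, `firm_id` and `is_graduate` are hypothetical.

```python
def coworker_graduate_share(workers):
    """Return, for each person, the share of their co-workers who are graduates.

    `workers` is a list of dicts with hypothetical keys
    person_id, firm_id and is_graduate (bool). The person themselves
    is excluded from their own share; people with no co-workers get None.
    """
    # Count headcount and graduates per firm.
    totals = {}
    for w in workers:
        n, g = totals.get(w["firm_id"], (0, 0))
        totals[w["firm_id"]] = (n + 1, g + int(w["is_graduate"]))

    shares = {}
    for w in workers:
        n, g = totals[w["firm_id"]]
        if n == 1:
            shares[w["person_id"]] = None  # no co-workers to compare with
        else:
            # Remove the person's own contribution before dividing.
            shares[w["person_id"]] = (g - int(w["is_graduate"])) / (n - 1)
    return shares


# Made-up example data, not real records.
example_workers = [
    {"person_id": "p1", "firm_id": "f1", "is_graduate": True},
    {"person_id": "p2", "firm_id": "f1", "is_graduate": False},
    {"person_id": "p3", "firm_id": "f1", "is_graduate": True},
    {"person_id": "p4", "firm_id": "f2", "is_graduate": False},
]
example_shares = coworker_graduate_share(example_workers)
```

A measure like this could then be set alongside wages, although establishing whether educated co-workers actually cause higher productivity requires far more care than a descriptive share.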

New resources to support researchers

As well as using the data ourselves, our team is working in partnership with the Department for Education to develop resources that help other researchers to access and use the LEO data effectively. For example, we’ve developed and delivered a series of training courses to increase understanding of the LEO data, and we are developing ‘low fidelity’ synthetic data. This is mock data created to reflect the format of the original data, such as its layout and the type of information it contains, but which does not contain any information relating to real individuals. It can help researchers better understand the real data and progress their projects more rapidly.
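As a rough illustration of what ‘low fidelity’ means in practice, the sketch below generates mock rows from a description of the data's format alone. The schema, column names and value ranges are all invented for this example and are not the LEO specification; the approach shown is a generic one, not necessarily how the synthetic LEO data is being produced.

```python
import random

# Hypothetical schema: column name -> (kind, specification).
# None of these columns or ranges are taken from the real LEO data.
SCHEMA = {
    "person_id": ("id", None),
    "birth_year": ("int", (1980, 2005)),
    "highest_qualification": ("category", ["gcse", "a_level", "degree"]),
    "annual_earnings": ("float", (0.0, 80000.0)),
}


def synthesise(schema, n_rows, seed=0):
    """Generate mock rows that match the schema's layout and types only.

    Each column is drawn independently at random, so no relationship
    between columns (and no real individual) is reproduced.
    """
    rng = random.Random(seed)  # seeded so the mock data is reproducible
    rows = []
    for i in range(n_rows):
        row = {}
        for name, (kind, spec) in schema.items():
            if kind == "id":
                row[name] = f"SYN{i:06d}"
            elif kind == "int":
                row[name] = rng.randint(*spec)
            elif kind == "float":
                row[name] = round(rng.uniform(*spec), 2)
            elif kind == "category":
                row[name] = rng.choice(spec)
            else:
                raise ValueError(f"unknown column kind: {kind}")
        rows.append(row)
    return rows


mock = synthesise(SCHEMA, 5)
```

Because every column is sampled independently, data like this is useful for writing and testing code against the right layout, but any statistical results obtained from it are meaningless.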

The LEO data is a fantastic resource for the research community in the UK. With further data developments in the pipeline, it won’t be long before the UK can rival the Scandinavian countries in the rich policy-relevant insights its data can provide.

Find out more about the LEO data and how to access it.

Find out more about Claire’s ADR England-funded project.
