Administrative Data Research UK: Unlocking Linked Data for Research on Children and Young People

Children and young people are among the most vulnerable groups in society, and the Covid-19 pandemic has brought prolonged interruptions in access to formal education settings and to health and wellbeing support services. The impact has been felt particularly acutely by the most disadvantaged.

At the same time, the pandemic has shown how the timely and secure use of data can inform policy and improve people’s lives. One currently under-used form of data in research is government-held administrative data: data collected when individuals interact with public services such as schools and colleges, the NHS, the courts, and the benefits system.

ADR UK (Administrative Data Research UK) is a partnership revolutionising the way researchers access and use government-held data about people in the UK to enable better policy decisions that improve people’s lives. As well as opening up safe and secure access to deidentified data through a network of partnerships with trusted research environments, ADR UK supports and funds projects that link data from different sources together.

Linkages to education data have been at the forefront of ADR UK activities to date. These include linking the National Pupil Database (NPD) with data on children’s health, future employment, and interactions with the courts. Linking data in this way can give us a much more complete understanding of people’s lives and of how early experiences affect later life.

Figure 1: Simplified diagram of linked datasets available (or soon to be available) for access in the ONS Secure Research Service, with datasets coloured by ADR UK research themes. The full list of available datasets, as well as information about how to access the data, can be found in the ONS Secure Research Service metadata catalogue.

The Grading and Admissions Data for England (GRADE) dataset links GCSE and A level grade data (Ofqual) and university admissions data (UCAS) with other information from the NPD for all pupils in England, covering 2017 to 2020. This data provides insights that can inform evidence-based assessment policy, drawing on lessons learned from the approach taken in 2020.

Growing Up in England (GUIE) has very recently become available to researchers. It links 2011 Census data to a Department for Education (DfE) attainment dataset called the Feasibility All Education Dataset for England (AEDE), covering the academic years 2001/02 to 2014/15. In all, around 7 million records have been matched, providing an unparalleled sample size that includes deidentified information on many of the most vulnerable children. These include, for example, children caring for others, children with a disability or ill-health, and children from workless families. The data can be used to ask how a child’s circumstances and characteristics, as well as those of their household, influence educational attainment.

Part of the ambitious flagship Ministry of Justice Data First programme, the MoJ – DfE data share links Police National Computer records from 2000 to the end of 2017 to records from the National Pupil Database. This data allows unprecedented insight into the interactions between childhood characteristics, education outcomes and (re)offending.

The Longitudinal Education Outcomes dataset (LEO) is a large and complex dataset linking information on education (schools, further education and higher education) with employment, benefits, and earnings data. This data can be used to understand how education affects labour market outcomes for different individuals, providing an evidence base for assessing education policy and provision.

Other exciting data linkages looking at health and education are expected to be available for researchers to use later this year. These include:

  • childhood disease-specific linkages, such as the Diabetes-NPD linkage for England and Wales, which matches records on individuals with type 1 diabetes to their education records in the NPD to understand how disease-specific measures of health affect, for example, school attendance and educational outcomes and attainment
  • ECHILD, which links hospital records (such as A&E attendance) to the NPD for all children and young people up to age 25 who were born in England between 1st September 1995 and 30th August 2020 (approximately 14.7 million individuals), allowing researchers to explore how education influences health, and how health affects education

Together and individually, these linked datasets have the potential to greatly enrich our understanding of the interactions between different aspects of children’s lives and how these shape their future outcomes and opportunities. In turn, this can inform policies and interventions that break negative cycles and improve lives and opportunities, particularly for the most vulnerable children and young people in society.

Find out more about our work, and get in touch with us at hub@adruk.org.

This blog was originally posted on the UCL website.
