Glossary
-
Accredited researcher with an approved project
Researchers who have completed Safe Researcher Training and applied successfully to access data for a project in a trusted research environment (TRE). A Research Accreditation Panel (RAP) oversees the accreditation of researchers, projects, processors, and trusted research environments in accordance with the Digital Economy Act (DEA) 2017.
-
Administrative data
Information created when people interact with public services, such as schools, the NHS, or the courts, and collated by government. It is originally created for operational purposes, but it can be the source of powerful insights. ADR UK funds work to link and de-identify administrative data from different public services, and make the resulting dataset securely available for public good research.
-
Anonymised data
Anonymisation is the process of altering personal identifiers (e.g. name, address), for example by removing or substituting them, so that the data can no longer be used to identify anyone. This process allows data to be shared and used safely while preserving the privacy of the people included in the data. Anonymised data is data that does not permit the identification of a person, household or business.
-
Data controller
A natural or legal person, agency, public authority or other body which, alone or jointly, determines the purpose and means of processing. Controllers make decisions about processing activities. They exercise overall control of the personal data being processed and are ultimately in charge of and responsible for the processing.
-
Data dictionary
Information that describes key characteristics of a dataset, particularly the variables. The information can be used by researchers to write an application to use the dataset for a research project.
-
Data linkage
The act of bringing two or more datasets from different sources together, creating associations between the data. Data linkage can provide new statistical insights that are not possible with information from a single source.
-
Data owner
A person, organisation or department with responsibility for the management and control of data, in line with the General Data Protection Regulation (GDPR).
-
Data processing
The collection and manipulation of items of data to produce meaningful information. Includes collecting, cleaning, manipulating, storing, and sharing.
-
De-identified data
De-identified data does not contain any information that could identify a person, such as a name, address or postcode. Identifiers are removed from the records before de-identified data is securely transferred to the trusted research environment.
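
As a minimal sketch of the idea of removing identifiers from records, the Python below drops a set of identifying fields from a record. The field names and the example record are hypothetical, and a real de-identification process involves more than deleting a few columns.

```python
# Hypothetical identifier fields; a real dataset will have its own list.
IDENTIFIER_FIELDS = {"name", "address", "postcode"}


def de_identify(record: dict) -> dict:
    """Return a copy of the record with the identifier fields removed."""
    return {
        field: value
        for field, value in record.items()
        if field not in IDENTIFIER_FIELDS
    }


# Made-up example record.
record = {
    "name": "A. Example",
    "address": "1 Example Street",
    "postcode": "AB1 2CD",
    "age_band": "30-39",
    "region": "Wales",
}

cleaned = de_identify(record)  # only age_band and region remain
```

The original record is left unchanged; the function builds a new dictionary, so the identifiable version and the de-identified version can be stored and handled separately.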
-
Descriptive statistics
Numbers that describe the basic features of a dataset, helping to interpret the data without complex analysis.
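
For illustration, a few common descriptive statistics can be computed with Python's standard `statistics` module. The values below are made up.

```python
import statistics

# Made-up ages for six people.
ages = [23, 35, 35, 41, 52, 60]

summary = {
    "count": len(ages),
    "mean": statistics.mean(ages),      # arithmetic average
    "median": statistics.median(ages),  # middle value of the sorted data
    "min": min(ages),
    "max": max(ages),
}
```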
-
Flagship dataset
Linked or linkable administrative datasets (either administrative data linked to other administrative data, or administrative data linked to other data) that are of significant research value and that we anticipate will have wide appeal to researchers. The flagship datasets are findable in the ADR UK Data Catalogue and available for all accredited researchers to apply to access via one of ADR UK’s trusted research environments. They have approved legal gateways for research and are publicly documented.
-
Key variable
A variable in common between two datasets, which may therefore be used for linking records between them.
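
A minimal sketch of linking on a key variable, assuming two small made-up datasets that share a hypothetical key variable called `person_id`. It uses exact matching and assumes each key value appears at most once in the second dataset; it is an illustration rather than a description of how any particular linkage is carried out.

```python
# Two made-up datasets that share the hypothetical key variable "person_id".
education = [
    {"person_id": 1, "qualification": "degree"},
    {"person_id": 2, "qualification": "A level"},
    {"person_id": 3, "qualification": "GCSE"},
]
health = [
    {"person_id": 2, "gp_visits": 4},
    {"person_id": 3, "gp_visits": 1},
    {"person_id": 4, "gp_visits": 7},
]


def link_on_key(left, right, key):
    """Link records that have the same value of `key` in both datasets."""
    # Index the right-hand dataset by the key variable for fast lookup.
    right_by_key = {row[key]: row for row in right}
    linked = []
    for row in left:
        match = right_by_key.get(row[key])
        if match is not None:
            # Combine the fields from both sources into one linked record.
            linked.append({**row, **match})
    return linked


linked = link_on_key(education, health, "person_id")
```

Only people 2 and 3 appear in both datasets, so the linked result contains two records, each carrying both a qualification and a count of GP visits.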
-
Legal gateway
The legislation that allows accredited and approved researchers to access data for research and statistical purposes. One of the most commonly used legal gateways is the Digital Economy Act 2017, Section 64 - ‘Disclosure of information for research purposes’: de-identified data held by a public authority in connection with the authority’s functions may be disclosed to another person for the purpose of research, subject to meeting certain criteria.
-
Longitudinal data
A type of survey data where the same sample of people is given the same survey at different points in time. It shows change at the individual level over time.
-
Metadata
Information that describes a dataset or a data item. This includes information that provides context to the data, such as how it was collected, its coverage and its publication date, as well as data quality dimensions and data characteristics such as summary statistics.
-
Mixed methods research
Research consisting of two or more methodologies, which may include quantitative and qualitative analysis.
-
Poisson regression
A form of regression analysis to model count data, e.g. the number of events in a given period.
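
In its usual form, the model can be written as below. The notation is introduced here for illustration: Y_i is the count for unit i, \mu_i is its expected value, and x_{i1}, ..., x_{ip} are the explanatory variables.

```latex
Y_i \sim \mathrm{Poisson}(\mu_i), \qquad
\log \mu_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip}
```

Because the model is linear on the log scale, \exp(\beta_j) can be read as the multiplicative change in the expected count for a one-unit increase in x_{ij}, holding the other variables fixed.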
-
Public good
Refers to activity which is motivated by its benefit to society, rather than private profit. This work often aims to provide evidence for public policies, services or decisions to ultimately improve lives. The UK Statistics Authority has outlined the criteria for serving public good (section 33.1).
-
Qualitative research
Research which uses non-numerical data - such as opinions, behaviours, or experiences - to understand reasons, motivations, or meanings.
-
Quantitative research
Research using numerical data. It can involve collecting new data or using pre-existing data, then analysing it with statistical methods to understand patterns, relationships, or trends.
-
Regression analysis
A type of analysis that enables the relationship between two variables to be assessed while controlling for other variables in the analysis.
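
As one common example, a linear regression with an outcome y, a variable of interest x and a single control variable z can be written as below. The symbols are notation introduced here for illustration.

```latex
y_i = \beta_0 + \beta_1 x_i + \beta_2 z_i + \varepsilon_i
```

Here \beta_1 describes the relationship between x and y while holding z fixed, and \varepsilon_i is the part of y_i that the model does not explain.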
-
Research-ready data
A dataset that is deemed ‘research-ready’ is expected to meet a minimum standard of documentation and be suitably de-identified to protect individuals’ privacy, so it can be used for secure research. Read a report from ADR UK on research-ready data.
-
Survey data
Data that is limited to those who take part in the survey; the sample size therefore tends to be a lot smaller than that of administrative data.
-
Survival analysis
A method to study how long it takes for something to occur, looking at variables that might affect the timing of these events.
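
One widely used technique in survival analysis is the Kaplan-Meier estimator, which estimates the probability of not yet having experienced the event at each time point. The sketch below is a minimal pure-Python version using made-up data. The function name and inputs are hypothetical, and it allows for censored subjects, that is, people whose follow-up ended before the event was seen.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] for each time with an event.

    durations: time until the event or until censoring, one per subject
    observed:  True if the event occurred, False if the subject was censored
    """
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Subjects still being followed just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Subjects who experienced the event exactly at time t.
        events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Made-up follow-up times (e.g. in months) for five subjects.
durations = [2, 3, 3, 5, 8]
observed = [True, True, False, True, False]

curve = kaplan_meier(durations, observed)
```

With these made-up values the estimated survival probability falls to roughly 0.8 at time 2, 0.6 at time 3 and 0.3 at time 5.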
-
Synthetic data
Artificially manufactured data that does not relate to real statistical units, but has the look and structure of real data. It will have been generated from one or more population models, designed to be non-disclosive, and is used for teaching, for testing code, or for developing methodology.
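
A minimal sketch of the idea, using Python's standard `random` module to generate made-up records. The field names and distributions are arbitrary assumptions chosen for illustration; real synthetic datasets are generated from models designed to resemble the structure of the original data.

```python
import random


def make_synthetic_records(n, seed=0):
    """Generate n made-up records that do not correspond to real people."""
    rng = random.Random(seed)  # fixed seed so the output is reproducible
    regions = ["North", "South", "East", "West"]  # hypothetical categories
    return [
        {
            "record_id": i,
            "age": rng.randint(18, 90),
            "region": rng.choice(regions),
            # Log-normal draw as a rough stand-in for a skewed income variable.
            "income": round(rng.lognormvariate(10, 0.5), 2),
        }
        for i in range(n)
    ]


records = make_synthetic_records(5)
```

Records like these can be used to test analysis code before it is run on real data inside a secure environment.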
-
Trusted research environment (TRE)
Highly secure computing environments containing de-identified data. Researchers and their projects are required to be accredited to access and use this data.
-
User guide
A guide written for potential users of a dataset, such as researchers and analysts. Together with the data dictionary, it provides enough information about the dataset for them to write a robust and feasible research application to use the data.