A cornerstone of statistical data analysis is reproducibility: given the same data and methods, one analyst should be able to reproduce another’s findings. This is especially important when analyses inform policies that directly affect people’s lives.
As an analyst, reusing code can reduce duplication of effort and save time. Sharing code from analysis and data management processes promotes a collaborative approach and helps researchers return to their work more efficiently. Knowing how to write and recognise well-written, well-documented code is therefore an essential skill.
The pages in this section are designed to help you get started with writing good code in a trusted research environment (TRE). They highlight best practice and provide examples and instructional materials across different programming languages.
Principles of statistical code
When writing statistical code, you should assume that others may need to read, understand, and reuse it. Sharing code supports efficiency, transparency, and collaboration across research teams.
Writing high-quality code can feel challenging. ADR UK supports the sharing and reuse of code by applying the FAIR principles, which promote the Findability, Accessibility, Interoperability and Reuse of data.
Recognised practice in coding
This short webinar offers practical tips for writing effective code.
Beginners may also find the guide “Principles – Quality assurance of code for analysis and research” useful. This resource covers:
- Code review practices
- Testing methodologies (unit, system, and integration testing)
- Documentation standards (code comments, docstrings, README files)
It is designed to help developers improve the quality and reliability of their code using established quality assurance practices.
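As a rough illustration of how documentation and unit testing fit together, the Python sketch below pairs a docstring with two small tests. The function `weighted_mean` and the test class are hypothetical examples written for this page, not material taken from the guide.

```python
import unittest


def weighted_mean(values, weights):
    """Return the weighted mean of ``values``.

    Parameters
    ----------
    values : sequence of numbers
        The observations to average.
    weights : sequence of numbers
        One weight per observation; the weights must not sum to zero.

    Raises
    ------
    ValueError
        If the two sequences differ in length or the weights sum to zero.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


class TestWeightedMean(unittest.TestCase):
    """Unit tests: each one checks a single, small piece of behaviour."""

    def test_equal_weights_match_simple_mean(self):
        self.assertAlmostEqual(weighted_mean([1, 2, 3], [1, 1, 1]), 2.0)

    def test_mismatched_lengths_raise(self):
        with self.assertRaises(ValueError):
            weighted_mean([1, 2], [1])
```

Saved in a file (for example, a hypothetical `test_weighted_mean.py`), the tests can be run with `python -m unittest test_weighted_mean`. The docstring tells a reviewer what the function expects before they read the body, and the tests record the behaviour the author intended.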
Choosing your software for coding
This presentation from the SAIL Databank outlines good coding practices, including:
- Selecting an appropriate programming language
- Formatting and style considerations
- Efficient data manipulation
- Validation and documentation
- Version control tools
- Concept libraries
Reproducibility
Reproducibility is central to robust analysis. Analysts should be able to achieve the same results when using the same data and methods.
The following resources provide guidance and support for developing reproducible research practices:
- Reproducible Analytical Pipelines (RAP) strategy (UK Civil Service Analysis Function): guidance on implementing reproducible pipelines and open-source analysis, including data management, code versioning, documentation, packaging, code style, and input validation
- UK Reproducibility Network: a peer-led network promoting open research principles, offering training and introductory videos through its primer series
- The Turing Way: a comprehensive handbook covering reproducible data science, including FAIR principles, research data management, repository structure, collaboration, ethics, and version control
- Framework for Open and Reproducible Research Training: teaching resources for reproducible research
- National Centre for Research Methods: resources on reproducibility in the social sciences
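Two of the practices these resources describe, input validation and getting the same results from the same data and methods, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function names, the age range check, and the default seed are all hypothetical choices made for illustration, and the interval is a simple percentile bootstrap rather than a recommended method.

```python
import random
import statistics


def validate_ages(ages):
    """Check inputs before analysis so that bad data fails early and loudly."""
    if not ages:
        raise ValueError("no records supplied")
    for age in ages:
        # The 0-120 range is an assumed plausibility rule, not a standard.
        if not isinstance(age, (int, float)) or not 0 <= age <= 120:
            raise ValueError(f"implausible age: {age!r}")
    return list(ages)


def bootstrap_mean_interval(values, n_resamples=1000, seed=2024):
    """Return an approximate 95% percentile bootstrap interval for the mean.

    A fixed seed and a local random generator mean that the same data
    and the same arguments give the same interval on every run.
    """
    rng = random.Random(seed)  # local generator, no hidden global state
    means = []
    for _ in range(n_resamples):
        resample = rng.choices(values, k=len(values))
        means.append(statistics.fmean(resample))
    means.sort()
    lower = means[int(0.025 * n_resamples)]
    upper = means[int(0.975 * n_resamples) - 1]
    return lower, upper


ages = validate_ages([34, 51, 27, 68, 45, 39])
first_run = bootstrap_mean_interval(ages)
second_run = bootstrap_mean_interval(ages)
print(first_run == second_run)  # identical, because the seed is fixed
```

Recording the seed alongside the code version and the data extract is one way to let another analyst rerun the analysis and compare their output with yours.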
Coding with AI
This introductory presentation explores how AI tools can support data analysis, even for those with little or no programming experience.