Statistical simulations

Author

Vicki Hodgson

Published

June 9, 2025

Overview

This course is designed for those who have some statistical knowledge and programming skills but are brand new to data simulation.

The course focuses on simulation for experimental design and statistics; the materials do not cover complex mechanistic simulations, such as those used to model economic or epidemiological systems.

Learning Objectives
  • Explore statistical and experimental design concepts, introduced in other courses, via simulation
    • Understand the impact of factors such as sample size, confounds and collinearity on statistical analyses
    • Gain a more intuitive sense of core statistical concepts such as significance and power
  • Simulate life sciences-inspired datasets
    • Simulate a variety of predictor variables including continuous and categorical main effects and interactions
    • Simulate different types of response variable, including continuous, binary/proportional and count response variables
  • Use simulation to perform statistical analyses
    • Perform power analysis via simulation
    • Know how to use simulation for hypothesis testing, e.g., via resampling and/or bootstrapping techniques
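
To give a flavour of the last set of objectives, here is a minimal sketch of a power analysis via simulation in R. It is an illustration only, not taken from the course materials: the helper name `simulate_power`, the normally distributed data, the effect size of 0.5 standard deviations and the two sample sizes are all assumptions chosen for the example.

```r
# Minimal sketch: estimate the power of a two-sample t-test by simulation.
# All names and parameter values here are hypothetical, chosen for illustration.
set.seed(42)  # make the simulated results reproducible

simulate_power <- function(n_per_group, effect_size, n_sims = 1000, alpha = 0.05) {
  # Simulate n_sims experiments and record the p-value from each one
  p_values <- replicate(n_sims, {
    control   <- rnorm(n_per_group, mean = 0, sd = 1)
    treatment <- rnorm(n_per_group, mean = effect_size, sd = 1)
    t.test(treatment, control)$p.value
  })
  # Estimated power: the proportion of simulated experiments
  # in which the (true) effect was detected at the chosen alpha
  mean(p_values < alpha)
}

# Compare estimated power for a small and a larger sample size
power_small <- simulate_power(n_per_group = 10, effect_size = 0.5)
power_large <- simulate_power(n_per_group = 100, effect_size = 0.5)
print(c(n10 = power_small, n100 = power_large))
```

Under these assumptions, the estimated power is noticeably higher with 100 observations per group than with 10, which is the kind of relationship between sample size and power that the course explores.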

Target Audience

This course is intended for researchers (postgraduate and postdoctoral) in the life sciences, including clinical settings.

Prerequisites

This course is designed to follow on from our existing programme of statistical courses.

Attendees must be familiar with the statistical concepts introduced in the Core statistics course. They should also be comfortable using the R programming language.

It is recommended that attendees are also familiar with more complex statistical models, such as generalised linear models and mixed effects models, to get the most out of this course.

Exercises

Exercises in these materials are labelled according to their level of difficulty:

  • Level 1: simpler exercises designed to get you familiar with the concepts and syntax covered in the course.
  • Level 2: exercises that combine different concepts and apply them to a given task.
  • Level 3: exercises that require going beyond the concepts and syntax introduced to solve new problems.

Citation & Authors

Please cite these materials if:

  • You adapted or used any of them in your own teaching.
  • These materials were useful for your research work. For example, you can cite us in the methods section of your paper: “We carried out our analyses based on the recommendations in YourReferenceHere”.

You can cite these materials as:

Hodgson, V. (2025). Statistical simulations. https://cambiotraining.github.io/quarto-course-template/

Or in BibTeX format:

@misc{YourReferenceHere,
  author = {Hodgson, Vicki},
  month = {5},
  title = {Statistical simulations},
  url = {https://cambiotraining.github.io/quarto-course-template/},
  year = {2025}
}

About the authors:

  • Vicki Hodgson
    Affiliation: Cambridge Centre for Research Informatics Training
    Roles: writing - original draft; conceptualisation

Acknowledgements

Some of the code examples provided in these course materials were written with the aid of a large language model (ChatGPT-4o). All code was reviewed and edited by the (human) authors, and AI was not used to generate any of the written content.