Working with Bacterial Genomes

Author

Andries van Tonder; Hugo Tavares; Bajuna Salehe

Published

October 21, 2025

Overview

This comprehensive course equips you with essential skills and knowledge in bacterial genomics analysis, primarily using Illumina-sequenced samples. You’ll gain an understanding of how to select the most appropriate analysis workflow, tailored to the genome diversity of a given bacterial species. Through hands-on training, you’ll apply both de novo assembly and reference-based mapping approaches to obtain bacterial genomes for your isolates. You will apply standardised workflows for genome assembly and annotation, including quality assessment criteria to ensure the reliability of your results. Along with typing bacteria using methods such as MLST, you’ll learn how to construct phylogenetic trees using whole genome and core genome alignments, enabling you to explore the evolutionary relationships among bacterial isolates. You’ll extend this to estimate a time-scaled phylogeny from a starting phylogenetic tree. Lastly, you’ll apply methods to detect antimicrobial resistance genes. We will use Mycobacterium tuberculosis, Staphylococcus aureus and Streptococcus pneumoniae as examples, so that you are well-equipped to conduct bacterial genomics analyses on a range of species.

Learning Objectives

By the end of this course, you will be able to:

  • Choose the most suitable analysis workflow based on the genome diversity of a given bacterial species.
  • Differentiate between “de novo assembly” and “reference-based mapping” approaches for reconstructing bacterial genomes.
  • Apply standardised workflows to assemble and annotate genomes using both approaches.
  • Evaluate the quality of assembled genomes and determine their suitability for downstream analysis.
  • Detect and remove recombinant regions.
  • Construct phylogenetic trees using both whole genome and core genome alignments.
  • Estimate a time-scaled phylogeny using an initial maximum likelihood phylogenetic tree and sample dates.
  • Conduct genomic epidemiology and strain typing.
  • Detect the presence of antimicrobial resistance genes in your isolates.

Target Audience

The course is aimed at biologists interested in microbiology, prokaryotic genomics and antimicrobial resistance.

Prerequisites

Essential

  • Basic understanding of high-throughput sequencing technologies.
    • Watch this iBiology video for an excellent overview.
  • A working knowledge of the UNIX command line (course registration page).
    • If you are not able to attend this prerequisite course, please work through our Unix command line materials ahead of the course (up to section 7).
  • A working knowledge of R (course registration page).
    • If you are not able to attend this prerequisite course, please work through our R materials ahead of the course.

Desirable

  • A basic knowledge of phylogenetic inference methods (course registration page).
  • A working knowledge of running analyses on High Performance Computing (HPC) clusters (course registration page).

Citation & Authors

Please cite these materials if:

  • You adapted or used any of them in your own teaching.
  • These materials were useful for your research work. For example, you can cite us in the methods section of your paper: “We carried out our analyses based on the recommendations in YourReferenceHere”.

You can cite these materials as:

van Tonder, A., Tavares, H., Salehe, B. (2024). Working with Bacterial Genomes. https://cambiotraining.github.io/bacterial-genomics/

Or in BibTeX format:

@misc{YourReferenceHere,
  author = {van Tonder, Andries and Tavares, Hugo and Salehe, Bajuna},
  month = {7},
  title = {Working with Bacterial Genomes},
  url = {https://cambiotraining.github.io/bacterial-genomics/},
  year = {2024}
}

About the authors:

  • Andries van Tonder
    Affiliation: Department of Veterinary Medicine, University of Cambridge
    Roles: conceptualisation; primary author; data curation; coding; software
  • Hugo Tavares
    Affiliation: Cambridge Centre for Research Informatics Training
    Roles: editor; software; data curation
  • Bajuna Salehe
    Affiliation: Cambridge Centre for Research Informatics Training
    Roles: software
