33 Pathogenwatch 2
Learning Objectives
- Analyse S. pneumoniae genomes on Pathogenwatch and extract relevant information to use as metadata to annotate phylogenetic trees.
Exercise: Analysing Pneumococcal genomes with Pathogenwatch
Whilst you can upload FASTQ files to Pathogenwatch, it’s quicker to work with already assembled genomes. We’ve provided pre-processed results for the S. pneumoniae data, generated with assembleBAC, in the preprocessed/ directory, which you can upload to Pathogenwatch in this exercise.
- Upload the assembled S. pneumoniae genomes to Pathogenwatch.
- Once Pathogenwatch has finished processing the genomes, save the results to a collection called Chaguza Serotype 1.
- Download the Typing and AMR profile tables to the S_pneumoniae directory.
- Rename the tables to chaguza-serotype-1-typing.csv and chaguza-serotype-1-amr-profile.csv respectively.
- Merge the two tables with sample_info.csv by running the merge_pneumo_data.py script in the scripts directory. Make sure you are in the base software environment.
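Before running the merge, it can help to confirm that the three input tables are where the script will look for them. The following is a minimal sketch and not part of the course scripts: the helper name missing_inputs is hypothetical, and it assumes the check is run from inside the S_pneumoniae directory, using the file names given in the steps above.

```python
from pathlib import Path

# File names taken from the exercise steps above.
EXPECTED_TABLES = [
    "sample_info.csv",
    "chaguza-serotype-1-typing.csv",
    "chaguza-serotype-1-amr-profile.csv",
]


def missing_inputs(directory=".", names=EXPECTED_TABLES):
    """Return the expected file names that are not present in `directory`.

    Hypothetical helper for illustration; it only checks that each file
    exists, not that its contents are valid.
    """
    base = Path(directory)
    return [name for name in names if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_inputs()
    if missing:
        print("Missing input tables:", ", ".join(missing))
    else:
        print("All input tables found.")
```

An empty list means all three tables are in place; any names returned point to a download that still needs renaming or moving.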
Hint
Refer back to the earlier Pathogenwatch section if you need a reminder on how to perform these tasks.
Answer
- We opened Pathogenwatch in our web browser and logged in. We then clicked on UPLOAD, then the Upload FASTA(s) button in the “Single Genome FASTAs” section, and then the + button before navigating to the preprocessed/assemblebac/assemblies directory. We then selected all the assembly files and clicked Open in the dialogue window.
- We waited for Pathogenwatch to finish processing the genomes, clicked the VIEW GENOMES button, then saved the results to a collection called Chaguza Serotype 1 by clicking Selected Genomes -> Create Collection and adding Chaguza Serotype 1 to the Title box. Finally, we clicked the Create Now button to create our collection.
- We clicked on the download icon in the top right-hand corner and selected Typing table.
- We clicked on the download icon in the top right-hand corner and selected AMR profile.
- We renamed the files to chaguza-serotype-1-typing.csv and chaguza-serotype-1-amr-profile.csv respectively and moved them to the S_pneumoniae directory.
- We went back to the base software environment with mamba activate base.
- We ran the merge_pneumo_data.py script to create a TSV file called pneumo_metadata.tsv in our analysis directory:
python scripts/merge_pneumo_data.py -s sample_info.csv -t chaguza-serotype-1-typing.csv -a chaguza-serotype-1-amr-profile.csv
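To make the merge step less of a black box, here is a minimal sketch of the kind of join such a script might perform, written with Python's standard csv module. It is not the course's merge_pneumo_data.py, whose internals are not shown here. The function names read_table and merge_tables are hypothetical, and the key column names ("sample" in the sample sheet, "NAME" in the Pathogenwatch tables) are assumptions, so check the headers of your own files before adapting it.

```python
import csv


def read_table(path, key):
    """Read a CSV file into a dict mapping each row's `key` value to the row."""
    with open(path, newline="") as handle:
        return {row[key]: row for row in csv.DictReader(handle)}


def merge_tables(sample_csv, typing_csv, amr_csv, out_tsv,
                 sample_key="sample", pw_key="NAME"):
    """Left-join typing and AMR rows onto the sample sheet and write a TSV.

    The default column names `sample` and `NAME` are assumptions for
    illustration, not confirmed from the real files.
    """
    typing = read_table(typing_csv, pw_key)
    amr = read_table(amr_csv, pw_key)

    merged = []
    with open(sample_csv, newline="") as handle:
        for row in csv.DictReader(handle):
            sample_id = row[sample_key]
            # Add Pathogenwatch columns without overwriting columns that are
            # already present; the tables' own ID column is skipped.
            for table in (typing, amr):
                for column, value in table.get(sample_id, {}).items():
                    if column != pw_key:
                        row.setdefault(column, value)
            merged.append(row)

    # Build the header from all rows, keeping first-seen column order.
    fieldnames = []
    for row in merged:
        for column in row:
            if column not in fieldnames:
                fieldnames.append(column)

    with open(out_tsv, "w", newline="") as handle:
        # restval="" leaves a blank cell where a sample has no matching row.
        writer = csv.DictWriter(handle, fieldnames=fieldnames,
                                delimiter="\t", restval="")
        writer.writeheader()
        writer.writerows(merged)
    return merged
```

Every sample in the sample sheet is kept even if Pathogenwatch returned no row for it, so gaps show up as blank cells rather than as silently dropped samples. For the exercise itself, the command above remains the one to run.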