38 Working with ONT data
- Understand the advantages of Oxford Nanopore Technologies (ONT) sequencing compared to Illumina.
- Recognise the unique characteristics of ONT data and the alternative pipelines available for its analysis.
38.1 Introduction to Oxford Nanopore Technologies (ONT)
Oxford Nanopore Technologies (ONT) offers a compelling alternative to Illumina’s dominant short-read sequencing platforms, particularly for applications where long reads provide a critical advantage. ONT’s technology is based on nanopore sensing: it measures changes in electrical current as DNA or RNA strands pass through protein nanopores. This allows sequencing in real time and produces long reads that can exceed 100 kb, in contrast to the short reads (50–300 bp) produced by Illumina’s sequencing-by-synthesis approach. Key advantages of ONT include its portability (devices range from the pocket-sized MinION to the high-throughput PromethION), lower upfront costs, and the ability to detect base modifications (e.g., methylation) directly, without additional processing. However, ONT has historically had higher per-base error rates (~5–15%) than Illumina (~0.1%, i.e. ~99.9% accuracy), though improvements in chemistry and basecalling algorithms have narrowed this gap. Additionally, ONT’s real-time data streaming allows for adaptive sequencing, where reads of interest can be selectively targeted during a run.
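Error rates and accuracy percentages are often reported as Phred quality scores (Q scores) in QC reports, so it can help to convert between the two. The snippet below is a minimal sketch using the standard Phred definition, Q = -10 × log10(error probability). The function names are illustrative and not part of any pipeline used in this course, and the rates are the approximate figures quoted above rather than measurements.

```python
import math


def error_rate_to_phred(error_rate: float) -> float:
    """Convert a per-base error probability into a Phred quality score."""
    if not 0 < error_rate <= 1:
        raise ValueError("error_rate must be greater than 0 and at most 1")
    return -10 * math.log10(error_rate)


def phred_to_accuracy(q_score: float) -> float:
    """Convert a Phred quality score back into per-base accuracy (0 to 1)."""
    return 1 - 10 ** (-q_score / 10)


# Approximate figures quoted in the text (illustrative, not measured values).
examples = [
    ("ONT, historical (upper bound)", 0.15),
    ("ONT, historical (lower bound)", 0.05),
    ("Illumina", 0.001),
]

for label, rate in examples:
    q = error_rate_to_phred(rate)
    print(f"{label}: {rate:.1%} error = Q{q:.1f}")
```

Under this definition, the historical ONT error rates of 5–15% correspond to roughly Q13 down to Q8, while a 0.1% error rate corresponds to Q30.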
Long-read sequencing excels in applications where resolving complex genomic regions is essential, such as de novo genome assembly, structural variant detection, and resolving repetitive or highly homologous sequences (e.g., telomeres, centromeres, and transposable elements). It is also invaluable for full-length transcriptome sequencing (isoform detection) and metagenomic analyses, where short reads often fail to distinguish closely related species or genes. While Illumina remains the gold standard for high-accuracy, high-throughput applications (e.g., SNP calling, targeted sequencing, and large-scale population studies), ONT’s long reads provide a unique advantage in clinical diagnostics (e.g., identifying large pathogenic deletions or fusion genes) and field-based sequencing (e.g., outbreak surveillance in remote areas). The choice between ONT and Illumina ultimately depends on the trade-offs between read length, accuracy, cost, and the specific biological question at hand.
38.1.1 Alternative pipelines for ONT data
Several pipelines are designed specifically for ONT data and account for its long reads and higher error rates. The following can be used in place of the pipelines we have used in this course:
- bacQC-ONT: This pipeline is designed for quality control and taxonomic classification of ONT reads. Many of the tools are the same as those used in the bacQC pipeline, but it uses the ONT-specific QC tools nanoplot and pycoQC.
- assembleBAC-ONT: This pipeline is designed for de novo assembly and annotation of bacterial genomes from ONT reads.
- wf-bacterial-genomes: This pipeline is for mapping and small variant calling of haploid samples. It can be used instead of bactmap for ONT data.
38.2 Summary
- Oxford Nanopore Technologies (ONT) offers long-read sequencing capabilities, which are particularly useful for applications requiring resolution of complex genomic regions, such as de novo genome assembly and structural variant detection.
- ONT’s sequencing technology is based on nanopore sensing, which enables real-time sequencing and very long reads, in contrast to Illumina’s short-read sequencing.
- ONT has advantages such as portability, lower upfront costs, and the ability to directly detect base modifications, but it has historically had higher error rates compared to Illumina.
- There are alternative pipelines for ONT data, such as bacQC-ONT, assembleBAC-ONT, and wf-bacterial-genomes, which are designed to handle the unique characteristics of ONT data.