6 Nextflow on HPC
- Configure a Nextflow pipeline to run on an HPC cluster using a custom configuration file that matches the available resources and job scheduling policies.
- Execute a Nextflow pipeline on an HPC and monitor job submissions using SLURM.
- Use terminal multiplexers like `screen` or `tmux` to manage long-running Nextflow processes on an HPC.
- Apply HPC best practices, including resource allocation and ethical job submission strategies, to optimise workflow performance.
6.1 Nextflow HPC configuration
To run our workflows on an HPC, it is advisable to specify a configuration file tailored to your system. This file should include:
- The job scheduler used to manage jobs (e.g. SLURM, LSF, PBS, HTCondor, etc.).
- Job submission details, such as your billing account (if applicable), the partition or queue to submit jobs to, and other relevant information.
- Resource specifications like maximum RAM and CPUs available on the partitions/queues.
- Settings that control job submission, including the number of concurrent jobs, submission rate, and how often Nextflow checks job statuses.
As briefly mentioned in the previous chapter, a config file is a set of attributes used by Nextflow when running a pipeline. By default, Nextflow will look for configuration files in predefined locations (see advanced configuration), but you can also specify a config file using the `-c` option.
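For example, assuming your configuration is saved in a file called `myhpc.config` (a placeholder name) and you are running an nf-core pipeline, the command might look something like this:

```bash
# launch a pipeline with a custom configuration file
# (the pipeline, profile and file name are illustrative placeholders)
nextflow run nf-core/demo -profile singularity -c myhpc.config --outdir results
```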
Below is an example of a configuration file for an HPC that uses SLURM as the job scheduler. This example is based on the Cambridge University HPC, specifically the “cascade lake” nodes (docs).
```groovy
// See more info on process scope here: https://www.nextflow.io/docs/latest/config.html#scope-process
process {
    // Our job scheduling system or executor
    // many executors supported (cloud and HPC): https://www.nextflow.io/docs/latest/executor.html
    executor = 'slurm'

    // the queue or partition we want to use
    queue = 'cclake'
}

// Limit nextflow submission rates to a reasonable level to be kind to other users
// See all options here: https://www.nextflow.io/docs/latest/config.html#scope-executor
executor {
    account = 'YOUR-BILLING-ACCOUNT'

    // request memory with --mem-per-cpu instead of --mem (around 3410MB per CPU on cclake)
    perCpuMemAllocation = true

    // limit how many jobs are queued at once and how often the scheduler is polled
    queueSize = 200
    pollInterval = '3 min'
    queueStatInterval = '5 min'
    submitRateLimit = '50sec'
    exitReadTimeout = '5 min'
}

// For nf-core pipelines, specify MAX parameters to avoid going over the limits
// these values should match the resources in the chosen queue/partition
params {
    max_memory = 192.GB
    max_cpus = 56
    max_time = 12.h
}

// Options when using the singularity profile
singularity {
    enabled = true

    // useful if you are unsure about filesystem binding
    autoMounts = true

    // Allow extra time to pull a container in case the servers are slow
    pullTimeout = '1 h'

    // Specify a cache dir to re-use images that have already been downloaded
    cacheDir = 'PATH/TO/nextflow-singularity-cache'
}
```
Here is an explanation of this configuration:
- The `process` scope defines:
  - The `executor`, which in this example is SLURM (the job scheduler). By default, this option would be "local", meaning commands run on the current computer.
  - The `queue`, which corresponds to the SLURM `--partition` option, determining the type of node your jobs will run on.
- The `executor` scope further configures the job scheduler:
  - `account` is the billing account (if relevant), equivalent to the `-A` option in SLURM.
  - `perCpuMemAllocation` submits jobs using `--mem-per-cpu`, relevant for the Cambridge HPC. This is optional and may vary by institution.
  - `queueSize` limits the number of simultaneous jobs in the queue. HPC admins may impose limits, so adjust this accordingly. Even with high limits, it's advisable to restrict the number of simultaneous jobs to reduce the load on the job scheduler.
  - `pollInterval`, `queueStatInterval`, `submitRateLimit` and `exitReadTimeout` manage how often Nextflow checks job statuses and interacts with the scheduler. These settings help ensure that you use the scheduler efficiently and ethically: rapid job submissions and frequent queue checks can overload the scheduler and might trigger warnings from HPC admins.
- The `params` scope is for pipeline-specific options. Here, we set generic options for all nf-core pipelines: `max_memory`, `max_cpus` and `max_time`, which depend on your specific HPC setup and account.
- The `singularity` scope configures Singularity for running pipelines in an isolated software environment:
  - `autoMounts = true` automatically mounts the filesystem, which is helpful if you're unfamiliar with filesystem bindings. On most HPC systems, admins handle this, so you may not need to worry about it.
  - `cacheDir` ensures that previously downloaded images aren't downloaded again. This is beneficial if you run the same pipeline multiple times, or different pipelines that use the same software images. We recommend setting up a cache directory in a location accessible from the compute nodes (see the sketch after this list).
Proper executor configuration is crucial for running your jobs efficiently on the HPC, so make sure you spend some time configuring it correctly.
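One way to sanity-check these settings, assuming at least one task has already been submitted, is to look at the job script Nextflow generates inside each task's work directory; with the SLURM executor it begins with the `#SBATCH` options derived from your configuration. The work directory hash below is a made-up example; use one printed by Nextflow for your own run:

```bash
# inspect the scheduler options Nextflow generated for a task
# (replace the hash with a work directory from your own run)
head -n 20 work/a1/b2c3d4e5f6/.command.run
```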
6.2 Running Nextflow on an HPC
When working on an HPC cluster, you typically interact with two types of nodes:
- The head or login node, used for low-resource tasks such as navigating the filesystem, moving files, editing scripts, and submitting jobs to the scheduler.
- The compute nodes, where computationally intensive tasks are executed, typically managed by the job scheduler.
You might wonder if it’s acceptable to run your Nextflow command directly on the HPC head/login node. Generally, this is perfectly fine because Nextflow itself doesn’t consume a lot of resources. The main Nextflow process handles interactions with the job scheduler (e.g. SLURM), checks job statuses in the queue, submits new jobs as needed, and logs progress information. Essentially, it automates the process of submitting and tracking jobs, so it isn’t computationally demanding.
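While a workflow is running, you can monitor the jobs that Nextflow submits on your behalf with the usual SLURM commands, for example:

```bash
# jobs you currently have in the queue (pending or running)
squeue -u $USER

# accounting information for your jobs submitted today, including finished ones
sacct -u $USER
```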
However, it’s important to ensure that your Nextflow process continues to run even if you log out of the HPC (which you’ll likely want to do, as workflows can take hours or even days to complete!). There are two primary ways to achieve this: running Nextflow as a background process or using a persistent terminal with a terminal multiplexer.
6.2.1 Nextflow as a background process
The `nextflow` command has the `-bg` option, which allows you to run the process in the background. If you want to check on the progress of your Nextflow run, you can review the `.nextflow.log` file, which logs the workflow's progress in a text format.
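For example (the pipeline and config file names are placeholders), a background run could be launched and then followed through the log file like this:

```bash
# launch the workflow in the background
nextflow -bg run nf-core/demo -profile singularity -c myhpc.config --outdir results

# follow its progress by watching the log file
tail -f .nextflow.log
```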
6.2.2 Persistent terminal
If you prefer interactive output on the terminal, we recommend using a terminal multiplexer. A terminal multiplexer lets you open “virtual terminals” that continue running in the background, allowing processes to persist even if you close your window or disconnect from the HPC.
Two popular and widely available terminal multiplexers are `screen` and `tmux`. Both work similarly, and we'll briefly demonstrate their usage below.
The first step is to start a session called “demo”:
- For `screen`: `screen -S demo` (note the uppercase `-S`)
- For `tmux`: `tmux new -s demo`
This opens a session, which essentially looks like your regular terminal. However, you can detach from this session, leaving it running in the background so you can come back to it later.
As an illustrative example, let's run the following command, which counts to 600, printing one number per second:

`for i in {1..600}; do echo $i; sleep 1; done`
This command will run for 10 minutes. Imagine this was your Nextflow process, printing pipeline progress on the screen.
If you want to log out of the HPC and leave this task running, you can detach the session, returning to the main terminal:
- For `screen`: press Ctrl + A then D
- For `tmux`: press Ctrl + B then D
Finally, you can log out from the HPC (e.g. using the `exit` command). Before doing so, it's a good idea to note which node you're on, for example with the `hostname` command.
Suppose your login node was called `login-09`. You can log back into this specific node as follows:

`ssh username@login-09.train.bio`
Once back in your terminal, you can list any running sessions:
- For `screen`: `screen -ls`
- For `tmux`: `tmux ls`

You should see your `demo` session listed. To reattach to your session:

- For `screen`: `screen -r demo`
- For `tmux`: `tmux attach -t demo`
You’ll find your command still running in the background!
6.3 Exercises
6.4 Summary
- Nextflow pipelines can be configured to run on an HPC using a custom `config` file. This file should include:
  - Which job scheduler is in use (e.g. `slurm`, `lsf`, etc.).
  - The queue/partition name that you want to run the jobs in.
  - CPU and memory resource limits for that queue.
  - Job submission settings to keep the load on the scheduler low.
- To execute the workflow using the custom configuration file, use the `-c your.config` option with the `nextflow` command.
- The `nextflow` process can be run on the login node; however, it is recommended to use a terminal multiplexer (`screen` or `tmux`) to have a persistent terminal that can be retrieved after logout.