6 LSF Scheduler
- Submit a simple job using LSF and analyse its output.
- Edit a job submission script to request non-default resources.
- Use LSF environment variables to customise scripts.
- Use the commands bjobs and bacct to obtain information about the jobs.
- Troubleshoot errors occurring during job execution.
6.1 Job Scheduler Overview
As we briefly discussed in “Introduction to HPC”, HPC servers usually run job scheduling software that manages all the jobs that users submit to be run on the compute nodes. This allows efficient usage of the compute resources (CPUs and RAM), and the user does not have to worry about affecting other people’s jobs.
The job scheduler uses an algorithm to prioritise the jobs, weighing aspects such as:
- how much time did you request to run your job?
- how many resources (CPUs and RAM) do you need?
- how many other jobs have you got running at the moment?
Based on these, the algorithm will rank each of the jobs in the queue to decide on a “fair” way to prioritise them. Note that this priority dynamically changes all the time, as jobs are submitted or cancelled by the users, and depending on how long they have been in the queue. For example, a job requesting many resources may start with a low priority, but the longer it waits in the queue, the more its priority increases.
6.2 Submitting a Job with LSF
To submit a job to LSF, you need to include your code in a shell script. Let’s start with a minimal example in lsf/simple_job.sh, which contains the following code:
#!/bin/bash
sleep 60 # hold for 60 seconds
echo "This job is running on:"
hostname
We can run this script from the login node using the bash interpreter (make sure you are in the correct directory first: cd ~/hpc_workshop/):
bash lsf/simple_job.sh
This job is running on:
gen3-head1
To submit the job to the scheduler we instead use the bsub command in a very similar way:
bsub lsf/simple_job.sh
However, this throws back an error:
Returning output by mail is not supported on this cluster.
Please use the -o option to write output to disk.
Request aborted by esub. Job not submitted.
Do you have any idea what this error means?
Our job, like all LSF jobs, produces output (the echo printout and the hostname) as well as job statistics about what was done on the HPC. Because the Sanger LSF is set up to disallow sending this output to your email for security reasons, a job cannot be submitted without specifying a “standard output” file. We can fix this error using the -o argument:
bsub -o simple_job.out lsf/simple_job.sh
Instead, the output is sent to a file, which we called simple_job.out. This file will be located in the directory from which you launched the job.
However, when running again, we get another error that says:
Sorry no available user group specified for
this job. Please resubmit your job with
-G groupname or set the $LSB_DEFAULT_USERGROUP environment variable.
Request aborted by esub. Job not submitted.
The other absolutely necessary argument for submitting a job to the Farm is -G groupname. This specifies which group at the Sanger is billed for the compute resources. We’ll be using a temporary testing group for this course called farm-course, but once you settle into a lab you’ll use that lab’s group name.
Let’s try adding it to the bsub argument list:
bsub -o simple_job.out -G farm-course lsf/simple_job.sh
If it was submitted correctly, we should see this message:
Job <xxxxxx> is submitted to default queue <normal>.
Once the job is finished, we can investigate the output by looking inside the file, for example cat simple_job.out.
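The confirmation message contains the job ID, which later commands such as bjobs need. As a minimal sketch, the ID could be pulled out of that message with sed. The helper name extract_job_id is hypothetical, and the pattern assumes the message format shown above:

```bash
#!/bin/bash
# Hypothetical helper: extract the numeric job ID from bsub's confirmation
# message, assumed to look like "Job <123> is submitted to default queue <normal>."
extract_job_id() {
    sed -n 's/^Job <\([0-9][0-9]*\)> is submitted.*/\1/p'
}

# Example with a made-up job ID, so the sketch runs without LSF
echo "Job <123456> is submitted to default queue <normal>." | extract_job_id

# On the cluster, the same helper could be used like this (not run here):
# JOBID=$(bsub -o simple_job.out -G farm-course lsf/simple_job.sh | extract_job_id)
```

Keeping the ID in a variable avoids having to copy it by hand from the terminal.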
The first line of the shell script, #!/bin/bash, is called a shebang and indicates which program should interpret the script. In this case, bash is the interpreter (there are other shell interpreters, but that’s beyond what we need to worry about here).
Remember to always have this as the first line of your script. If you don’t, bsub will throw an error.
6.3 Configuring Job Options
The -o argument is just one of over 70 different options for submitting a job with bsub. You can imagine the bsub command would get rather long and difficult to keep track of! To make submitting a job simpler and more reproducible, you can include each of your bsub arguments as a line starting with #BSUB at the beginning of your script, right after the shebang.
Here is how we could modify our script:
#!/bin/bash
#BSUB -o logs/simple_job.out
#BSUB -G farm-course
sleep 8 # hold for 8 seconds
echo "This job is running on:"
hostname
If we now re-run the script using bsub lsf/simple_job.sh, the output goes to a file named simple_job.out within the logs folder.
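A minimal sketch of this workflow, written so that it can be tried without LSF: it writes the script to a file and lists the #BSUB lines that bsub would read. The scratch directory is hypothetical. Note that on many LSF installations the #BSUB lines are only parsed when the script is passed on standard input (bsub < script.sh); whether bsub script.sh also parses them depends on how the cluster is configured, so treat the commented submission line as an assumption.

```bash
#!/bin/bash
# Write the job script to a scratch directory (hypothetical location)
DEMO_DIR=$(mktemp -d)
mkdir -p "$DEMO_DIR/lsf" "$DEMO_DIR/logs"   # create logs/ before submitting

cat > "$DEMO_DIR/lsf/simple_job.sh" <<'EOF'
#!/bin/bash
#BSUB -o logs/simple_job.out
#BSUB -G farm-course
sleep 8 # hold for 8 seconds
echo "This job is running on:"
hostname
EOF

# Show the options embedded in the script
grep '^#BSUB' "$DEMO_DIR/lsf/simple_job.sh"

# Submission on the cluster (not run here):
# cd "$DEMO_DIR" && bsub < lsf/simple_job.sh
```

Because the options live in the script itself, the same file documents exactly how the job was requested.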
There are many other options we can specify when using LSF, and we will encounter several more of them as we progress through the materials. Here are some of the most common ones (anything in <> is user input):
Option | Description |
---|---|
-cwd <path> | Working directory used for the job. This is the directory that LSF will use as a reference when running the job. |
-o <path/filename> | File where the output that would normally be printed on the console is saved. This is defined relative to the working directory set above. |
-e <path/filename> | File where the error log is saved. This is defined relative to the working directory set above. If you don’t specify an error file, the error log is written to the -o output file. |
-G <name> | Group name. This is required on the farm as it logs compute resources used for billing to your group. Ask your labmates for the name. |
-q <name> | Queue (partition) name. See details in the following section. |
-n <ncores> | Number of CPUs to be requested. |
-R "select[mem><megabytes_required>] rusage[mem=<megabytes_required>]" | First of the two options needed to request custom RAM for the job. |
-M<megabytes_required> | Second of the two options needed to request custom RAM for the job. |
-W<time in the form of [hour:]minute> | The time you need for your job to run. This is not always easy to estimate in advance, so if you’re unsure you may want to request a good chunk of time. However, the more time you request for your job, the lower its priority in the queue. |
-J <name> | A name for the job. |
If you don’t specify any options when submitting your jobs, you will get the defaults configured by the HPC admins. For example, on farm5, the defaults you will get are:
- 10 minutes of running time (equivalent to -W10)
- normal partition (equivalent to -q normal)
- 1 CPU (equivalent to -n 1)
- 100MB RAM (equivalent to -M100 -R "select[mem>100] rusage[mem=100]")
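Because the memory request has to be repeated in -M and in both parts of -R, it is easy for the three numbers to drift apart. Below is a minimal sketch of one way to keep them in sync when writing a bsub command. The function name lsf_mem_args is hypothetical, and megabytes are assumed as the unit, as in the table above:

```bash
#!/bin/bash
# Hypothetical helper: print matching -M and -R options for a memory
# request given in megabytes.
lsf_mem_args() {
    local mem_mb="$1"
    printf -- '-M%s -R "select[mem>%s] rusage[mem=%s]"\n' "$mem_mb" "$mem_mb" "$mem_mb"
}

# Example: options for a 2000MB request
lsf_mem_args 2000
```

The printed text can be pasted into a bsub command line, or split over two #BSUB lines in a script.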
6.3.1 Partitions/Queues
Often, HPC servers have different types of compute node setups (e.g. partitions for fast jobs, or long jobs, or high-memory jobs, etc.). LSF calls these “queues” and you can use the -q option to choose which queue your job runs on. Information on which queues are available on your HPC is usually provided by the admins.
It’s worth keeping in mind that these partitions have separate queues, so you should always try to choose the partition that is most suited to your job.
You can check the queues available using the command bqueues -l.
For example, on farm5 we have two types of partitions with the following characteristics:
- General use partitions:
  - normal partition (default) with a maximum of 12 hours
  - long partition with a maximum of 48 hours
  - basement partition with a maximum of 30 days; only 300 basement jobs are allowed per user simultaneously.
- Special case partitions:
  - hugemem / hugemem-restricted / teramem queues for large memory machines (512GB/1TB).
  - yesterday partition for very urgent jobs that need to be done “yesterday”; only 7 jobs allowed per user simultaneously.
  - small partition for many, very small jobs (batches 10 jobs together to prevent scheduler overload).
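The time limits of the general use partitions can be turned into a simple rule of thumb. Here is a minimal sketch, assuming the farm5 limits listed above; the function name choose_queue is hypothetical, and other clusters will have different queues and limits:

```bash
#!/bin/bash
# Hypothetical helper: suggest a general-use queue for a runtime in hours,
# using the farm5 limits listed above (12 hours, 48 hours, 30 days).
choose_queue() {
    local hours="$1"
    if [ "$hours" -le 12 ]; then
        echo "normal"
    elif [ "$hours" -le 48 ]; then
        echo "long"
    elif [ "$hours" -le $((30 * 24)) ]; then
        echo "basement"
    else
        echo "error: no general-use queue allows $hours hours" >&2
        return 1
    fi
}

choose_queue 6     # prints: normal
choose_queue 36    # prints: long
choose_queue 100   # prints: basement
```

Picking the shortest queue that fits the job is usually the sensible choice, since each queue has its own waiting line.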
6.4 Getting Job Information
After submitting a job, we may want to know:
- What is going on with my job? Is it running, has it finished?
- If it finished, did it finish successfully, or did it fail?
- How many resources (e.g. RAM) did it use?
- What if I want to cancel a job because I realised there was a mistake in my script?
You can check the status of all your jobs in the queue by using:
bjobs -w
Or get detailed information on one job in particular with:
bjobs -l <JOBID>
This gives you information about the job’s status while it’s running: PEND means it’s pending (waiting in the queue) and RUN means it’s running.
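These status values can also be used in a script, for example to wait until a job has finished. The sketch below is hedged: the function names are hypothetical, the suspended states (PSUSP, USUSP, SSUSP) are added from general LSF usage rather than from this course, and the polling function assumes an LSF version whose bjobs supports the -noheader and -o options.

```bash
#!/bin/bash
# Return success if an LSF status string means the job has not finished yet.
# PEND and RUN are described above; the *SUSP values are LSF's suspended states.
job_is_active() {
    case "$1" in
        PEND|RUN|PSUSP|USUSP|SSUSP) return 0 ;;
        *) return 1 ;;
    esac
}

# Poll bjobs until the job leaves the active states.
# Assumption: this LSF version supports "bjobs -noheader -o stat".
wait_for_job() {
    local jobid="$1" status
    while true; do
        status=$(bjobs -noheader -o stat "$jobid" 2>/dev/null | tr -d '[:space:]')
        job_is_active "$status" || break
        sleep 30   # be gentle with the scheduler
    done
    echo "Job $jobid is no longer active (status: ${status:-unknown})"
}

# Usage on the cluster (not run here): wait_for_job 123456
```

Polling every 30 seconds is an arbitrary choice; a longer interval puts less load on the scheduler.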
Once the job is complete, you can still use bjobs -l <JOBID> to get job statistics. However, you may find bhist (below) easier to use, as it presents time and memory usage in a more readable way.
bhist -l <JOBID>
This shows you the status of the job, whether it completed or not, how long it took to run, and how much memory it used. Therefore, this command is very useful to determine suitable resources (e.g. RAM, time) next time you run a similar job.
Alternatively, you can use the bacct command once a job has completed, which displays this and other information in a more condensed way (and for multiple jobs if you want).
For example:
bacct <JOBID>
will give you information about one specific job.
You can add other options to the bacct command to get more or less information:
- -l for extra information about the job
- -b for brief information about the job
You can also select groups of jobs based on certain characteristics, like:
- -q <partition> to select all jobs you’ve run in a certain partition
- -d to select all jobs that have completed successfully
- -e to select all jobs that had an end status of EXIT (failed)
- -x to select jobs that raised an exception while running
As a rule, running bacct without the -l option gives aggregate statistics for the jobs included, while adding the -l option gives a long list of separate, per-job statistics.
All the options available with bacct can be listed using bacct -h. If you forgot what the job ID is, check the stdout file (created with the -o argument of bsub).
The bacct command may not be available on every HPC, as it depends on how it was configured by the admins.
On our farm, bacct reads its information from /usr/local/lsf/work/<cluster_name>/logdir/lsb.acct.*.
Finally, if you want to suspend a job, you can use:
bstop <JOBID>
Suspended jobs can be restarted using:
bresume <JOBID>
To irreversibly end a job, use:
bkill <JOBID>
All three commands (bstop, bresume, and bkill) can be applied to all of your own jobs by replacing the job ID with 0.
It’s impossible to edit other users’ jobs, so don’t worry about accidentally deleting everyone’s Farm jobs!
WATCH OUT
When the -o option includes a directory name and that directory does not exist, the job will still run, but produce no output file.
For example, let’s say that we would like to keep our job output files in a folder called “logs”. For the example above, we might set these #BSUB options:
#BSUB -cwd /nfs/users/nfs_USERINITIAL/USERID/hpc_workshop/
#BSUB -o logs/simple_job.log
But, unless we create the logs/ directory before running the job, LSF will not produce a file of standard output.
Another thing to note is that you should not use the ~ home directory shortcut with the -cwd option. For example:
#BSUB -cwd ~/hpc_workshop/
will not reliably work. Instead you should use the full path as shown above.
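Both pitfalls can be avoided with a small preparation step before submitting. This is a minimal sketch, not a required procedure: the function name prepare_job_dir is hypothetical, and a scratch directory is used so that the example runs anywhere.

```bash
#!/bin/bash
# Hypothetical helper: make sure the log directory exists inside the job's
# working directory, then print that directory as an absolute path.
prepare_job_dir() {
    local workdir="$1"
    mkdir -p "$workdir/logs" || return 1
    (cd "$workdir" && pwd)    # absolute path, suitable for -cwd
}

# Example using a scratch directory so the sketch runs anywhere;
# on the cluster this could be "$HOME/hpc_workshop" instead.
WORKDIR=$(prepare_job_dir "$(mktemp -d)/hpc_workshop")
echo "Working directory: $WORKDIR"

# Submission on the cluster (not run here). The shell expands $WORKDIR to a
# full path before bsub sees it, instead of relying on ~ in a #BSUB line:
# bsub -cwd "$WORKDIR" -o logs/simple_job.log -G farm-course lsf/simple_job.sh
```

Running mkdir -p is harmless when the directory already exists, so the step can be repeated before every submission.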
6.4.1 Exercise
6.5 LSF Environment Variables
One useful feature of LSF jobs is the automatic creation of environment variables. Generally speaking, a variable is a name that stores a value. Variables can be created by us, created automatically by programs, or available by default in our shell.
An example of a common shell environment variable is $HOME, which stores the path to the user’s /home directory. We can print the value of a variable with echo, for example echo $HOME.
The syntax to create a variable ourselves is:
VARIABLE="value"
Notice that there must be no spaces around the = sign between the variable name and its value.
If you want to create a variable with the result of evaluating a command, then the syntax is:
VARIABLE=$(command)
Try these examples:
# Make a variable with a path starting from the user's /home
DATADIR="$HOME/hpc_workshop/data/"
# list files in that directory
ls "$DATADIR"
# create a variable with the output of that command
DATAFILES=$(ls "$DATADIR")
When you submit a job with LSF, it creates several variables, all starting with the prefix $LSB_. One useful variable is $LSB_MAX_NUM_PROCESSORS, which stores how many CPUs we requested for our job. This means that we can use the variable to automatically set the number of CPUs for software that supports multi-processing. We will see an example in Exercise 2.
Here is a table summarising some of the most useful environment variables that LSF creates:
Variable | Description |
---|---|
$LSB_MAX_NUM_PROCESSORS | Number of CPUs requested with -n |
$LSB_JOBID | The job ID |
$LSB_JOBNAME | The name of the job defined with -J |
$LSB_EXECCWD | The working directory defined with -cwd |
$LSB_JOBINDEX | The number of the sub-job when running job arrays (covered in the Job Arrays section) |
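As a minimal sketch of how these variables might be used inside a job script, the example below reads the CPU count and job ID and falls back to default values when the script is run outside LSF. The fallback values and the program in the final comment are hypothetical.

```bash
#!/bin/bash
# Use the CPU count that LSF provides, falling back to 1 when the script
# is run outside LSF (for example while testing on the login node).
NCPUS="${LSB_MAX_NUM_PROCESSORS:-1}"
JOBID="${LSB_JOBID:-no_job_id}"

echo "Job $JOBID will use $NCPUS CPU(s)"

# The variable can then be passed to a tool's thread option, e.g.
# (hypothetical program and flag, not run here):
# my_tool --threads "$NCPUS" input.txt
```

Reading the CPU count from the variable means the #BSUB -n line is the only place where the number needs to be changed.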
6.5.1 Exercise
6.6 Interactive Login
Sometimes it may be useful to directly get a terminal on one of the compute nodes. This may be useful, for example, if you want to test some scripts or run some code that you think might be too demanding for the login node (e.g. to compress some files).
It is possible to get interactive access to a terminal on one of the compute nodes using the -Is argument in bsub. It can be combined with the same options as a normal bsub submission, so you can request resources in the same way you would when submitting scripts.
For example, to get access to 8 CPUs and 1GB of RAM (1000MB) for 10 minutes on one of the compute nodes we would run:
bsub -G farm-course -Is -n8 -R "select[mem>1000] rusage[mem=1000]" -M1000 -q normal -W10 bash
You may get a message saying that LSF is waiting to allocate your request (you go in the queue, just like any other job!). Eventually, when you get in, you will notice that your terminal will indicate you are on a different node (different from the login node). You can check by running hostname.
After you’re in, you can run any commands you wish, without worrying about affecting other users’ work. Once you are finished, you can use the command exit to terminate the session, and you will go back to the login node.
Note that, if the time you requested (with the -W option) runs out, your session will be immediately killed.
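When switching between the login node and interactive sessions, it is easy to lose track of where a terminal is running. Here is a minimal sketch of a check, assuming that LSF sets $LSB_JOBID inside interactive sessions as it does for batch jobs; the function name where_am_i is hypothetical.

```bash
#!/bin/bash
# Report whether the current shell is inside an LSF job.
# Assumption: LSF sets LSB_JOBID for interactive jobs as well as batch jobs.
where_am_i() {
    if [ -n "${LSB_JOBID:-}" ]; then
        echo "Inside LSF job $LSB_JOBID on $(hostname)"
    else
        echo "Not inside an LSF job (probably a login node): $(hostname)"
    fi
}

where_am_i
```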
6.7 Summary
- Include the commands you want to run on the HPC in a shell script.
  - Always remember to include #!/bin/bash as the first line of your script.
- Submit jobs to the scheduler using bsub submission_script.sh.
- Customise the jobs by including #BSUB options at the top of your script (see table in the materials above for a summary of options).
  - As a good practice, always define an output file with #BSUB -o. All the information about the job will be saved in that file, including any errors.
- Check the status of a submitted job by using bjobs. You can get detailed information about a job (such as the time it took to run or how many resources it used) using bjobs -l JOBID or bacct -l JOBID or bhist JOBID.
- To cancel a running job use bkill JOBID.