Submitting Cluster Jobs

Submitting your first job

There are two different kinds of jobs you can run: interactive and batch. Interactive jobs give you a command line on a compute node, which is great for compiling software or for running programs whose progress you need to watch. Batch jobs are ones you can submit, walk away from, and collect the results of later. Since you interact with them in different ways, the way you submit these two kinds of jobs also differs.


Submitting batch jobs

Submitting a batch job is easy:

[mfk@ghpcc06 ~]$ bsub -q long hostname

This will submit a job to the long queue and run the command hostname. When the job output is sent to you via e-mail, it will contain the name of the host your job ran on.

As an example, suppose we have a simple script, wait.sh, which sleeps for 60 seconds:

#!/bin/bash

echo "Sleep 60"
sleep 60

Note: commands submitted to LSF are run just as if they were entered at a command prompt. Shell scripts and other applications either have to be in $PATH or have their path explicitly listed, and they must have the execute bit set (chmod u+x wait.sh) in order to run. If you aren't sure, start an interactive job (see below) and try to run your command.
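
A quick check, assuming wait.sh is in your current directory:

chmod u+x wait.sh    # set the execute bit
./wait.sh            # run it once by hand before submitting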

We can submit this job via:

$ bsub < ./wait.sh

or

$ bsub < $HOME/wait.sh

Note: as we have not specified memory or run time, they default to 1GB and 60 minutes respectively, as shown below:

Memory Usage not specified, setting to 1GB: -R rusage[mem=1024]
Job runtime not specified, setting to 60 minutes: -W 60
Job <5721> is submitted to default queue <long>.

Please note that when running a script directly (bsub ./wait.sh), its path must be specified with ./ unless the script is already in $PATH.

Batch Job Options

   #BSUB -n X                   # Number of job slots (cores) to request
   #BSUB -J Bowtie_job          # Job name
   #BSUB -o myjob.out           # Append to output log file
   #BSUB -e myjob.err           # Append to error log file
   #BSUB -oo myjob.log          # Overwrite output log file
   #BSUB -eo myjob.err          # Overwrite error log file
   #BSUB -q short               # Which queue to use {short, long, parallel, gpu, interactive}
   #BSUB -W 0:15                # Wall-clock time your job needs (HH:MM)
   #BSUB -L /bin/sh             # Shell to use
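
Put together, a minimal script header using these options might look like the following sketch (the job name, log file names, and limits are placeholders):

#!/bin/bash
#BSUB -J Bowtie_job          # job name
#BSUB -q short               # queue
#BSUB -n 2                   # two job slots
#BSUB -W 0:15                # 15 minutes of wall time
#BSUB -oo myjob.log          # overwrite the output log on each run
#BSUB -eo myjob.err          # overwrite the error log on each run

hostname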

LSF environment variables

   LSB_ERRORFILE: Name of the error file
   LSB_JOBID:     Batch job ID assigned by LSF.
   LSB_JOBINDEX:  Index of the job that belongs to a job array.
   LSB_HOSTS:     The list of hosts that are used to run the batch job.
   LSB_QUEUE:     The name of the queue the job is dispatched from.
   LSB_JOBNAME:   Name of the job.
   LS_SUBCWD:     The directory where the job was submitted.
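
As a quick sketch, a short script like the following can be submitted to see what these variables contain (the queue and time limit are arbitrary):

#!/bin/bash
#BSUB -q short
#BSUB -W 0:05

echo "Job ID:         $LSB_JOBID"
echo "Job name:       $LSB_JOBNAME"
echo "Queue:          $LSB_QUEUE"
echo "Hosts:          $LSB_HOSTS"
echo "Submitted from: $LS_SUBCWD"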

Submitting interactive jobs

Starting an interactive job is slightly different, as are the results:

[mfk@ghpcc06 ~]$ bsub -q interactive -Is bash
Job <5311> is submitted to queue <interactive>.
<<Waiting for dispatch ...>>
<<Starting on c05b10.umassrc.org>>
[mfk@c05b10 ~]$

As you can see, the queue has changed from long to interactive. The additional command line option of -Is specifies you want an interactive session, and the command that is being run is bash, which you can change if you prefer tcsh or other available shells.

You will not get an e-mail output when this job completes, and you'll also notice that the shell prompt has changed to show the shell is now running on c05b10.

There is an upper limit of 8 wall hours for interactive jobs.
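
The -Is flag combines with the same resource options as batch jobs; for example, the following sketch (within the 8-hour limit) asks for a 4-hour interactive session with 2GB of memory:

bsub -q interactive -Is -W 4:00 -R "rusage[mem=2048]" bash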

Submitting a more complex job

If you have a script you run routinely and do not wish to re-enter the bsub command line options, you can include them in the script and bsub will use those options.

[mfk@ghpcc06 ~]$ cat submit.sh
#!/bin/bash

#BSUB -n 4
#BSUB -R rusage[mem=2048] # ask for 2GB per job slot, or 8GB total
#BSUB -W 0:10
#BSUB -q long # which queue we want to run in
#BSUB -R "span[hosts=1]" # All job slots on the same node (needed for threaded applications)

hostname
[mfk@ghpcc06 ~]$ bsub < submit.sh
Job <10610> is submitted to queue <long>.
[mfk@ghpcc06 ~]$

Submitting jobs with variables as input

The following command line submits the script "bowtie.sh" once for each input file, using a bash for loop.

Let's assume that we are in a directory which contains the following files:

sample1.fastq
sample2.fastq

We loop over these files with a bash for loop and submit two jobs to the short queue, running bowtie.sh with sample1.fastq as the input to the first job and sample2.fastq as the input to the second:

for F in *.fastq ; do
     bsub -n 2 -q short ./bowtie.sh "$F"
done

Note that $F is passed as the argument to the bowtie.sh script.
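
For completeness, a minimal sketch of what bowtie.sh itself might look like; the body is hypothetical, and the actual bowtie invocation depends on your index and options:

#!/bin/bash
# Hypothetical bowtie.sh: the FASTQ file arrives as the first argument
IN="$1"
echo "Aligning $IN"
# bowtie <your_index> "$IN" > "${IN%.fastq}.sam"   # placeholder alignment command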


Submitting an array job

Say you have 1000 files that you need to process and they're numbered sequentially file.1 through file.1000. A job array allows you to have a single command process all 1000 files from a single submission line. This is a bit more complex than the above jobs and requires more knowledge of shell programming and environment variables.

From the command line, you can submit a job like this:

[mfk@ghpcc06 ~]$ bsub -W 1:0 -R "rusage[mem=1024]" -J "myarray[1-1000]" "process file.\$LSB_JOBINDEX"

Like other jobs, we include -W and -R to specify wall time and memory resources (1 hour and 1024MB per job). The -J says we're starting a job array called 'myarray' whose elements are numbered sequentially from 1 through 1000 inclusive. You can add additional ranges or numbers by separating them with a comma, so -J "myarray[1-1000,1500]" would also work. Our command is enclosed in double quotes, and the index of the current element is stored in the $LSB_JOBINDEX environment variable. Because of the way the bash shell handles environment variables, we have to escape it with a \ so that it is expanded on the compute node rather than at submission time. By the time element 1000 of the array runs on a node, the command looks like:

process file.1000

And runs without error.

Since this can get confusing very quickly, we advise you to use a shell script instead and submit that:

#!/bin/bash

#BSUB -W 1:0
#BSUB -R rusage[mem=1024]
#BSUB -J "myarray[1-1000]"
#BSUB -o logs/out.%J.%I
#BSUB -e logs/err.%J.%I
process file.$LSB_JOBINDEX

Saving the script as, for example, array.sh, you can then submit it to the cluster as:

[mfk@ghpcc06 ~]$ bsub < array.sh
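
One caveat: LSF generally does not create the logs directory named by -o and -e above, so create it once before submitting or the log files cannot be written:

[mfk@ghpcc06 ~]$ mkdir -p logs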

Resource Reservations

Each job is different. Some require lots of memory, some require lots of CPU, some require local disk space, and some require combinations of all of those, or other resources like access to a GPU. Using a resource request ensures your job gets dispatched to the node that best matches what you ask for and prevents nodes from being oversubscribed, which would affect everyone's jobs.

Requesting additional job slots

Non-threaded jobs

Applications like MPI communicate between job slots via a network-based mechanism, so the cores can live on separate systems and it usually doesn't matter where they are. By using -n you can request a number of cores for your job. LSF will try to put all of the cores on the same system, but there is no guarantee of this (see Threaded jobs below if you need that).

In order to request 4 cores, a run time of 50 minutes, and 1GB of memory per job slot:

bsub -q parallel -n 4 -W 0:50 -R "rusage[mem=1024]" ./myparalleljob.sh
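
As a sketch of what myparalleljob.sh might contain, assuming an OpenMPI build that is integrated with LSF so that mpirun picks up the allocated hosts (the module version matches the one used elsewhere on this page; the binary name is a placeholder):

#!/bin/bash
# Hypothetical myparalleljob.sh: launch an MPI program across the job slots
module load openmpi/2.0.1
mpirun $HOME/bin/my_mpi_program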

Threaded jobs

Threaded applications work by using multiple cores on the same system (this kind of application usually says it uses POSIX threads or pthreads). By using a resource reservation asking for the span of hosts to be one (span[hosts=1]), you can ensure that all of the job slots you get are on the same system.

bsub -q short -n 4 -W 0:50 -R "span[hosts=1]" -R "rusage[mem=1024]" ./mythreadedjob.sh
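
As a sketch of what mythreadedjob.sh might contain, assuming an OpenMP-style program that sizes its thread pool from OMP_NUM_THREADS (the binary name is a placeholder; LSB_DJOB_NUMPROC is set by LSF to the number of allocated job slots):

#!/bin/bash
# Hypothetical mythreadedjob.sh: use as many threads as allocated job slots
export OMP_NUM_THREADS=${LSB_DJOB_NUMPROC:-1}
$HOME/bin/my_threaded_program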

Example

Submitting a job which requires four cores on the same node, 40GB of RAM, and a run time of 50 minutes:

bsub -q short -n 4 -W 0:50 -R "span[hosts=1]" -R rusage[mem=40000] < bowtie.sh

Submitting GPU Based Jobs

For a user to correctly submit a GPU job, they'll need to specify the following (in addition to memory, wall time, etc.):

-q gpu -R rusage[ngpus_excl_p=X]

Where X is the number of GPU devices needed.
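
For example, a command-line submission requesting one GPU device, 1GB of memory, and 30 minutes of wall time (the script name here is a placeholder):

bsub -q gpu -W 0:30 -R "rusage[mem=1024,ngpus_excl_p=1]" ./mygpujob.sh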

Example script:

#!/bin/sh

# Job name: 1GPUtest
#BSUB -J 1GPUtest

# GPU queue
#BSUB -q gpu

# 1GB RAM, 1 GPU device
#BSUB -R "rusage[mem=1024,ngpus_excl_p=1] span[hosts=1]"

# Wall time of 60 minutes
#BSUB -W 60

#BSUB -o "/home/CHANGE_TO_YOUR_USERNAME/%J.out"
#BSUB -e "/home/CHANGE_TO_YOUR_USERNAME/%J.err"

# If you need openmpi load the module
module load openmpi/2.0.1

# Let's assume you are using CUDA as well
module load cuda/8.0.44

# our executable 
$HOME/bin/MonteCarloGPU

Job requests that span multiple nodes will allocate ngpus_excl_p=X GPUs per node. For example:

#!/bin/bash
#BSUB -q gpu
#BSUB -n 8
#BSUB -R "rusage[mem=1024]"
#BSUB -W 60

# Allocate one GPU per node
#BSUB -R "rusage[ngpus_excl_p=1]"

# Set one core (job slot) per node
#BSUB -R "span[ptile=1]"

module load cuda/8.0.64
$HOME/bin/someprogram

This job will run on 8 nodes with a total of 8 cores, 8192MB of memory, and 8 GPUs. Each node will have access to 1 core, 1 GPU, and 1024MB of memory.

If ptile=1 is changed to ptile=2, the job will run on 4 nodes with 8 cores, 4 GPUs, and 8192MB of memory. Each node will then have access to 2 cores, 1 GPU, and 2048MB of memory.
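
Only the span directive changes for that layout; a sketch of the modified line:

# Two cores per node: 8 cores across 4 nodes, one GPU per node
#BSUB -R "span[ptile=2]"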