Configure an LSF or OpenLava Queue

To configure an LSF or OpenLava queue, use the procedure LA::AddSite.

The following is a complete example of a configuration. Although the example uses LSF, it also applies to OpenLava.

The scenario is the following:
  • We have an LSF master running on the machine uni00. We are going to use that same machine to run bjobs, and we are going to log in to it as user "cadmgr".
  • The LSF master is running in the time zone called PST8PDT (US-LosAngeles).
  • We are going to call the cluster "lsfUni".
  • We are going to set up a directory in which to run bjobs. The location of that directory, as seen from uni00, is /<remote_host_directory>/lsf_la_dir.

set lsfQueue            "lsfUni"
set laDirOnRemoteHost   /some/dir/on/remote/host/lsf_la_dir
set remoteHost          "uni00"
set remoteUser          "cadmgr"
set lsfWeight           120
set lsfDur              0
set lsfTZ               "PST8PDT"

LA::AddSite lsf1@uni00 $lsfQueue {} \
    -lsf  \
    -lsfdur        $lsfDur \
    -remotehost    $remoteHost \
    -user          $remoteUser \
    -lsfdir        $laDirOnRemoteHost \
    -defaultweight $lsfWeight \
    -timezone      $lsfTZ

Options

-lsf
Specifies that the site is an LSF or OpenLava cluster, rather than an Accelerator instance.
-lsfdur
Controls the interpretation of "duration=N" inside an rusage[] statement. If the value of -lsfdur is 1, the duration is honored; otherwise it is ignored. The default is 0, that is, the duration statement is ignored.
-remotehost and -user
Together specify the user and host used to run bjobs. The system needs passwordless ssh access to that host as that user.
-lsfdir
Specifies a directory that must exist on the remote host.
-defaultweight
Specifies the weight of this cluster relative to the weights of other LSF or OpenLava clusters or Accelerator instances. For more information on weight, see the documentation for the LA::AddSite command.
-timezone
Specifies the time zone to be used when running bjobs. This is normally the time zone of the Allocator server.
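
Because -remotehost, -user and -lsfdir all depend on the remote account being reachable, it can help to check ssh access before adding the site. The following is a minimal sketch, not part of the product: the helper name la_ssh_check_cmd is hypothetical, and it assumes an OpenSSH client, whose BatchMode=yes option makes ssh fail instead of prompting for a password. The helper only prints the command so that it can be reviewed before it is run.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the product): print the ssh command
# that checks passwordless access and the existence of the -lsfdir directory.
# Arguments: $1 = remote user, $2 = remote host, $3 = directory on remote host.
la_ssh_check_cmd() {
    la_user=$1
    la_host=$2
    la_dir=$3
    # BatchMode=yes is an OpenSSH option that disables password prompts,
    # so the command fails quickly if key-based login is not set up.
    printf 'ssh -o BatchMode=yes %s@%s test -d %s\n' "$la_user" "$la_host" "$la_dir"
}

# Values taken from the example scenario above.
la_ssh_check_cmd cadmgr uni00 /some/dir/on/remote/host/lsf_la_dir
```

Running the printed command by hand should return exit status 0 when the login works without a password and the directory exists.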

Steps Used to Sample LSF

Based on the above configuration, vovlad creates a periodic job that is used to sample the LSF or OpenLava cluster and to import the data into Accelerator. This job looks like the following:
vovlalsf $lsfQueue $remoteHost $laDirOnRemoteHost $remoteUser $lsfDur $lsfTZ vovlabjobs.sh
This periodic job runs a few scripts to get the bjobs details from LSF, so that the allocations can be set on LSF through elim:
vovlagetbjobs -host $remoteHost -out .../$lsfQueue.bjobs.tz -queue $lsfQueue \
    -u $remoteUser -dir $laDirOnRemoteHost -script vovlabjobs.sh
This is followed by another job that translates the output of bjobs into information suitable for Accelerator.
env TZ=$lsfTZ vovlatranslatebjobs -queue $lsfQueue -dur $lsfDur -f \
    .../bjobs.last.gz -host $remoteHost -no-tz-correct
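
The translation step above runs with TZ set to the cluster's time zone. As a small illustration of what that setting changes, the sketch below formats the same instant under two TZ values. It assumes GNU date (for the -d @SECONDS form) and is independent of the vovla* tools.

```shell
#!/bin/sh
# Format one fixed instant (epoch second 0) under two TZ settings.
# Assumes GNU date; the instant itself does not change, only its rendering.
TZ=UTC     date -d @0 '+%Y-%m-%d %H:%M %Z'   # prints 1970-01-01 00:00 UTC
TZ=PST8PDT date -d @0 '+%Y-%m-%d %H:%M %Z'   # prints 1969-12-31 16:00 PST
```

The same timestamp reads eight hours earlier in PST8PDT, which is why the value given to -timezone matters when times in the bjobs output are interpreted.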

Preparing the Remote Directory where the Sampling is Done

On the remote machine that has access to bjobs, you need to set up a directory with a few scripts that perform the periodic sampling of LSF. The templates for those scripts can be found in $VOVDIR/etc/la/.

Copy the file $VOVDIR/etc/la/vovlabjobs.sh to the remote directory and make it executable. Expect to edit the file to make it work in your environment.

The output of bjobs -l needs to be preprocessed by the script $VOVDIR/etc/la/vovlapreprocessbjobs, which should also be copied to the remote directory and made executable.
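
The two copy steps can be sketched as a small shell function. This is only an illustration: the function name la_stage_scripts is hypothetical, and it assumes that both the template directory and the target directory are visible from the machine where it runs (for example, when $VOVDIR is mounted on the remote host). If they are not, copy the files with scp or a similar tool instead.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the product): copy the two sampling
# scripts into the target directory and make them executable.
# Arguments: $1 = template directory (for example $VOVDIR/etc/la),
#            $2 = target directory (the directory given to -lsfdir).
la_stage_scripts() {
    la_src=$1
    la_dest=$2
    mkdir -p "$la_dest" || return 1
    for la_f in vovlabjobs.sh vovlapreprocessbjobs; do
        cp "$la_src/$la_f" "$la_dest/" || return 1
        chmod +x "$la_dest/$la_f" || return 1
    done
}

# Example call, left commented out because the paths are site-specific:
# la_stage_scripts "$VOVDIR/etc/la" /some/dir/on/remote/host/lsf_la_dir
```

After staging, vovlabjobs.sh will usually still need to be edited for the local environment.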