Accelerator Plus Modulation

Jobs launched in Accelerator Plus are essentially bundled into groups that are run by vovtaskers on hosts allocated by the base scheduler. Because each bundle runs many times longer than an individual job, the base scheduler cannot rely on job retirement to free up slots promptly.

This section describes a means of freeing up slots more quickly by preempting the vovtaskers that have been assigned to Accelerator Plus, based on FairShare statistics. A preempted vovtasker stops accepting new jobs (tasker status "DONE") but still finishes any job that is already running.

The preemption rule drives the system and is the main place to influence the system's behavior. A sample rule is found in $THISGIT/vovpreempt/config.tcl; append it to any existing preemption rules in XXXX.swd/vovpreemptd/config.tcl.

While the rule can be tuned, some key elements must be retained. Preemptable jobs should have the predicate JOBNAME~${WXQueueName}, and the method should send SIGUSR2, but only to the vovtasker process: 0:*:EXT,SIGUSR2,vovtasker.

The preemptable job sort predicate is "FS_EXCESS_RUNNING DESC, PRIORITY, AGE DESC", which selects vovtaskers in order of greatest excess FairShare, then lowest priority, then oldest age.
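The effect of this ordering can be illustrated with a small standalone Tcl sketch. It makes no VOV calls and is not how vovpreemptd implements the sort. The tasker names, the record layout, and all numbers are hypothetical, and it assumes that a smaller PRIORITY value means a lower priority.

```tcl
# Illustration only: plain Tcl, no VOV API. All names and values are made up.
# Each record is {name fsExcessRunning priority ageSeconds}.
set candidates {
    {taskerA  5 4  600}
    {taskerB 12 8  300}
    {taskerC 12 4  900}
    {taskerD 12 4 1800}
    {taskerE -3 4 2400}
}

# Keep only the records with positive excess, mirroring the
# FS_EXCESS_RUNNING>0 condition used for preemptable jobs.
set preemptable {}
foreach rec $candidates {
    if {[lindex $rec 1] > 0} {
        lappend preemptable $rec
    }
}

# lsort is a stable merge sort, so a multi-key order is obtained by
# sorting on the least significant key first.
set ordered [lsort -integer -decreasing -index 3 $preemptable] ;# AGE DESC
set ordered [lsort -integer -increasing -index 2 $ordered]     ;# PRIORITY
set ordered [lsort -integer -decreasing -index 1 $ordered]     ;# FS_EXCESS_RUNNING DESC

# Collect the names in the order they would be considered for preemption.
set names {}
foreach rec $ordered {
    lappend names [lindex $rec 0]
}
puts $names
```

Run with tclsh, this prints "taskerD taskerC taskerB taskerA": the three records with the largest excess come first, the two with lower priority precede taskerB, and the older taskerD precedes taskerC. taskerE is excluded because its excess is negative.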

Here is an example of a preemption rule for job modulation in Accelerator Plus:
# Taken from $VOVDIR/etc/config/vovpreemptd/config_wx_modulation.tcl
set WXQueueName wx
VovPreemptRule \
    -pool     "WXJobModulation" \
    -rulename "fastFairshare_$WXQueueName" \
    -ruletype "FAST_FAIRSHARE" \
    -method   "0:*:EXT,SIGUSR2,vovtasker" \
    -killage   0 \
    -numjobs  10 \
    -maxattempts 1 \
    -waitingfor "HW" \
    -preempting  "JOBNAME~${WXQueueName} FS_EXCESS_RUNNING<0" \
    -preemptable "JOBNAME~${WXQueueName} FS_EXCESS_RUNNING>0 FSRANK9>>@FSRANK9@" \
    -resumeres "" \
    -enable     1


This is a dynamic system with many moving parts, which makes monitoring challenging. Some suggestions follow.

Turn on the debug option in the preemption rule; preemption activity is then logged in a property attached to the preemption rule object. To also get this information in the main vovserver log, set the global preemption debug flag, and reset it when finished:
% vovsh -x 'vtk_server_config   set_debug_flag PreemptRules'
% vovsh -x 'vtk_server_config reset_debug_flag PreemptRules'