Reconciliation Daemon Configuration

Summary information for vovreconciled:
Working directory vnc.swd/vovreconciled
Config file vnc.swd/vovreconciled/config.tcl

The daemon vovreconciled periodically checks all running jobs and looks for resources that are either "Requested/Not Used" or "Not Requested/Used". When the daemon is reasonably sure about the resource mismatch, it will reconcile the grabbed resources list for the running jobs by calling vtk_resourcemap_change_grab.


vovreconciled: Usage Message
  
  DESCRIPTION:
      vovreconciled is a daemon that detects "requested/not_used"
      resources for running jobs and removes them from the
      "grabbed resources" list after a certain amount of time,
      called "RevocationDelay"
  
      The RevocationDelay is set to the smallest value
      found in the following places:
  
      1. The property AGGRESSIVE_SCHEDULING_DELAY
                             (old) attached to the job class object, if defined
      2. The property REVOKE_DELAY
                             (new) attached to the job class object, if defined
      3. The property REVOKE_DELAY
                             attached to the resourceMap, if defined
      4. The value of RESD(revokeDelay), if defined.
  
  
      NO revocation is performed if any of the following are true
        1. If RevocationDelay  < 1
        2. If RevocationDelay  > 10000000
           (or 115d17h)
        3. If the resource is not derived from an external license.
        4. If the resource type is not "License"
        5. If the number of revocations for a license on a job  >
           $RESD(maxRevokes)=50
        6. If the CHANGEGRAB property exceeds RESD(maxPropLength)
        7. The job is younger than the RESD(revokeDelay)
  
      The config.tcl file must exist but it can be empty.
      The config file allows the user to set some additional options
  
      RESD(maxRevokes)    N  N is the maximum number of times a license on a
                             job can be revoked.  default is 50
                             To see the number of times a specific license has
                             been revoked for a given job, view the
                             REVCNT_<license> property that will exist on the
                             job, where <license> is the name of the specific
                             license of interest.
      RESD(maxPropLength) N  N is the number of characters the CHANGEGRAB
                             property can be.  default 130000
      RESD(emailSkips)    N  1 enables/0 disables emailing the job owner and
                             optionally admins that a license could have been
                             revoked but was not, because the maximum number
                             of revoke was reached or the CHANGEGRAB property
                             is too long.  default 1
      RESD(adminEmails)   S  A comma-separated string of userId's that are sent
                             emails on skips. default ""
      RESD(revokeDelay)   T  number of seconds a job must be running before it
                             can be considered to have a license revoked.
                             default 10000000 seconds or
                             115d17h
      RESD(loopTime)      T  How often to run the check on all jobs.
                             default 30 seconds
  
  OPTIONS:
      -v                    -- Increase verbosity.
      -h                    -- Show this help.
      -loop <TIMESPEC>      -- Default 30s
      -inert                -- Run in inert mode where nothing changes
                               for the job.
  
  EXAMPLES:
      % vovreconciled
      % vovreconciled -h
      % vovreconciled -loop 2m
      % vovreconciled -v
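
The options above could be set in config.tcl roughly as follows. This is a minimal sketch, not a recommended configuration: it assumes that config.tcl is sourced at global scope so that plain `set RESD(...)` assignments take effect, that RESD(loopTime) accepts a plain number of seconds, and all values and user names shown are illustrative.

```tcl
# vnc.swd/vovreconciled/config.tcl
# Illustrative values only; the user names below are hypothetical.
set RESD(maxRevokes)    20          ;# stop revoking a license on a job after 20 revocations
set RESD(maxPropLength) 130000      ;# same as the documented default
set RESD(emailSkips)    1           ;# email the job owner when a revocation is skipped
set RESD(adminEmails)   "alice,bob" ;# comma-separated user ids also notified on skips
set RESD(revokeDelay)   600         ;# seconds a job must run before revocation is considered
set RESD(loopTime)      60          ;# seconds between checks (assumed to be plain seconds)
```

Since the usage message documents both -inert and -v, starting the daemon with those options is presumably a low-risk way to observe what a new configuration would do before it is allowed to change any job.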
  

vovreconciled Operations

This daemon, if activated, runs continuously and checks all running jobs at a fixed interval, 30 seconds by default (see RESD(loopTime) and the -loop option). It considers only running jobs whose age, counted from the start of the job or from its most recent resumption, is greater than RESD(revokeDelay). If such a job has held an RNU resource (Requested but Not Used) for longer than the reconciliation time (Treconcile), the job is flagged for reconciliation. If the condition persists for 3 consecutive cycles, the resource is removed from the list of grabbed resources for the job.
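
The "3 consecutive cycles" rule can be pictured as a per-job, per-resource streak counter. The following is a hedged sketch in plain Tcl, not the daemon's actual code: the procedure name NoteCycle, the array name, and the job id and resource name in the usage lines are all hypothetical, and `isRnu` stands for "the resource has been RNU for longer than Treconcile in this cycle".

```tcl
# Hypothetical sketch; NoteCycle is not part of vovreconciled.
# streak(jobId,res) counts consecutive cycles in which the resource looked RNU.
# Returns 1 when the resource should be removed from the grabbed list, else 0.
proc NoteCycle { streakVar jobId res isRnu {needed 3} } {
    upvar 1 $streakVar streak
    set key "$jobId,$res"
    if { !$isRnu } {
        # The condition was interrupted: forget earlier observations.
        unset -nocomplain streak($key)
        return 0
    }
    # incr creates the element with value 0 if it does not exist (Tcl 8.5+).
    incr streak($key)
    if { $streak($key) >= $needed } {
        unset -nocomplain streak($key)
        return 1
    }
    return 0
}

# Usage: six cycles of observations for one job and one resource.
array set seen {}
foreach observed {1 1 0 1 1 1} {
    puts [NoteCycle seen 12345 License:spice $observed]
}
# Prints 0 0 0 0 0 1 (one value per line): only the last three
# uninterrupted RNU observations trigger the removal.
```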

The reconciliation time Treconcile is computed as the least of:
  • The value of the property REVOKE_DELAY attached to the resource map (a TIMESPEC)
  • The value of the property REVOKE_DELAY attached to the jobclass (a TIMESPEC)
  • The value of RESD(revokeDelay) in config.tcl
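
As an illustration of "the least of", the sketch below picks the smallest defined delay and then applies the two range checks from the usage message (no revocation if the delay is below 1 or above 10000000). It is an assumption-laden sketch rather than the shipped FindLeastDelay: the names LeastDelaySketch and DelayAllowsRevocation are hypothetical, undefined sources are represented by empty strings, and every value is assumed to have already been converted from a TIMESPEC to seconds.

```tcl
# Hypothetical helper, not the shipped FindLeastDelay.
# All arguments are assumed to be numbers of seconds; "" means "not defined".
proc LeastDelaySketch { args } {
    set least ""
    foreach d $args {
        if { $d eq "" } { continue }
        if { $least eq "" || $d < $least } { set least $d }
    }
    return $least
}

# Mirrors conditions 1 and 2 of the usage message:
# no revocation if the delay is < 1 or > 10000000.
proc DelayAllowsRevocation { delay } {
    return [expr { $delay >= 1 && $delay <= 10000000 }]
}

# Usage: jobclass sets 600 s, resource map is undefined, global is the default.
set delay [LeastDelaySketch 600 "" 10000000]
puts "delay=$delay allowed=[DelayAllowsRevocation $delay]"
# Prints: delay=600 allowed=1
```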

Later on, if a job is found to use a resource that was previously reconciled away, that resource is restored to the job.

Override Delays

For each running job, vovreconciled takes action only after a certain amount of time has elapsed since the start of the job. This amount of time is called the REVOKE DELAY and is defined, by default, as the least of:
  • The value of the property AGGRESSIVE_SCHEDULING_DELAY in the jobclass (the older name of the property)
  • The value of the property REVOKE_DELAY in the jobclass
  • The value of the property REVOKE_DELAY in the resource map
  • The global variable RESD(revokeDelay)
Some customers may want to change this behavior. One way is to override the procedure VovGetRevokeDelay in the file config.tcl. The default implementation of this procedure and an example override are shown below:
####
#### DEFAULT IMPLEMENTATION
####
proc VovGetRevokeDelay { jobClass res displayMessage } {
    global RESD
    set revokeDelayOld    [VovJobClassGetProperty $jobClass AGGRESSIVE_SCHEDULING_DELAY 10000000]
    set revokeDelayNew    [VovJobClassGetProperty $jobClass REVOKE_DELAY                10000000]
    set revokeDelayResMap [VovResMapGetProperty   $res      REVOKE_DELAY                10000000]

    set revokeDelay [FindLeastDelay $revokeDelayOld $revokeDelayNew $revokeDelayResMap $RESD(revokeDelay)]

    if { $displayMessage > 0 } {
        set msg "    FindLeastDelay\n"
        append msg "\tAggressiveClass:        $revokeDelayOld\n"
        append msg "\tREVOKE_DELAY in class:  $revokeDelayNew (jobclass=$jobClass)\n"
        append msg "\tREVOKE_DELAY in ResMap: $revokeDelayResMap (resource=$res)\n"
        append msg "\tGlobal:                 $RESD(revokeDelay)\n"
        append msg "\tResult revokeDelay:     $revokeDelay"
        VovMessage $msg 5
    }

    return $revokeDelay
}
####
#### EXAMPLE OVERRIDE (to be implemented in vovreconciled/config.tcl)
####
proc VovGetRevokeDelay { jobClass res displayMessage } {
    global RESD
    set revokeDelayClass    [VovJobClassGetProperty $jobClass REVOKE_DELAY 10000000]
    # 10000000 is the default passed above; any other value means the jobclass defines REVOKE_DELAY.
    if { $revokeDelayClass != 10000000 } { return $revokeDelayClass }

    set revokeDelayResMap [VovResMapGetProperty   $res      REVOKE_DELAY 10000000]
    set revokeDelay [FindLeastDelay $revokeDelayResMap $RESD(revokeDelay)]

    return $revokeDelay
}