vtk_slave
Usage
vtk_slave subcommand ...
Description
This is the command used to retrieve information about slaves and the jobs that
are being executed by the slaves.
The supported subcommands are:
refresh returns nothing and is used to update the cached information on the slaves;
count returns the number of slaves;
get returns the list of slaveIds;
info returns information about a slave. It requires a slaveId,
as in vtk_slave info $slaveId [options]. The options for this subcommand are:
-capacity
-host
-curload
-effload
-maxload
-name
-power
-resource
-status
-timeleft
-job, which takes one of the following subcommands:
count to get the number of running jobs
get to get the list of jobIds of the running jobs
info <jobId> [-command|-duration|-percentdone|-start|-time|-user]
Note
This command is difficult to use.
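The subcommands above can be combined to walk over all slaves. The following is a minimal sketch, not a definitive recipe. It assumes that it runs inside a VOV Tcl interpreter where vtk_slave is defined, and that vtk_slave info called with a single option returns just that value. The helper name listSlaves is hypothetical.

```tcl
# Hypothetical helper: return a list of {slaveId name status} triples.
# Assumption: "vtk_slave info $id -name" returns only the requested value.
proc listSlaves {} {
    vtk_slave refresh              ;# update the cached slave information
    set result {}
    foreach slaveId [vtk_slave get] {
        set name   [vtk_slave info $slaveId -name]
        set status [vtk_slave info $slaveId -status]
        lappend result [list $slaveId $name $status]
    }
    return $result
}

# Call the helper only when the vtk API is available in this interpreter.
if {[llength [info commands vtk_slave]] > 0} {
    foreach entry [listSlaves] {
        lassign $entry slaveId name status
        puts [format "%-10s %-20s %s" $slaveId $name $status]
    }
}
```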
vtk_slave_attach_to_agent
Usage:
vtk_slave_attach_to_agent slaveId agentId
Description:
This procedure is used to attach a BPS slave to a BPS agent.
Example:
vtk_slave_attach_to_agent 00358286 00358287
ok
Returns:
"ok" or an error message
vtk_slave_config
Usage:
vtk_slave_config slaveIdOrName variable value
Description:
This procedure sets a configuration variable on the slave identified by slaveIdOrName. The variable can be one of:
* allowcoredump BOOLEAN, mostly used for debugging serious crashes (which should be rare)
* cpus INTEGER, sets the number of CPUs (cores)
* capacity INTEGER, sets the number of slots (but not more than the max capacity of the slave)
* changenameonreconnect BOOLEAN, if true then the slave will add a _r to its name in the case of reconnection to its vovserver; the default is false
* cleanupchildprocesses BOOLEAN, if true, kill all child processes when a job exits; requires the slave to support cgroups (Linux only); the default is false
* close STRING, closes the slave from accepting jobs, displaying the accompanying STRING until opened
* coeff REAL in (0.01,100.0), the coefficient for the slave (the raw power is divided by it to compute the effective power)
* debugjobcontrol BOOLEAN, used for debugging the job control activities (output in the slave logs)
* debugnuma BOOLEAN, used for debugging the NUMA affinity of jobs (output in the slave log)
* debugrlm BOOLEAN, used for debugging the interaction with RLM (output in slave log)
* efftotram INTEGER, expresses the effective total RAM of a slave, in MB
* expirelogs BOOLEAN, simulate the expiration of the slave logs, as it would normally happen at midnight; new logs are immediately created
* rotatelog BOOLEAN, recreate a log file and directory only when missing.
* graceperiod TIMESPEC, for RLM interaction
* liverecorder COMMAND in (on off save), for activating LiveRecorder by UndoDB
* liverecorder.logsize INTEGER, to specify the max value of the log size in bytes
* maxload REAL, the maximum load acceptable for the slave (if the effective load passes this threshold, the slave stops accepting jobs)
* maxwaitnostart TIMESPEC, the maximum wait we allow for jobs that have trouble starting (meaning the fork()/exec() takes an enormous time); the default is 60 seconds
* maxwaittoreconnect TIMESPEC, the maximum wait after losing connection to server before attempting reconnection again
* message STRING, change the message for the slave
* message,sys STRING, change the system message for the slave (same as 'message')
* message,usr STRING, change the user message for the slave
* mindisk INTEGER, specify the minimum amount of disk space in /tmp, in MB or as a percentage (0%-99%, for example, 10%), for the slave to be active
* minramfree INTEGER, specify the minimum amount in MB of free RAM for the slave to be active
* name STRING, change the name of the slave; illegal or duplicate names are silently discarded
* numabindtosocket BOOLEAN, if true (default) the NUMA code allows the process to roam all cores in the same physical socket, else it binds the process to very specific cores
* open STRING, opens a closed slave so that it will accept jobs, displaying the accompanying STRING until overwritten by another message
* printstatus IGNORED, causes the slave to print its own status information to the log file
* ramsentry BOOLEAN, used to activate the RAM sentry functionality, i.e. the functionality that suspends small jobs to allow large jobs to finish
* rawpower INTEGER, change the raw power of the slave
* refresh IGNORED, discard all caches in slave, including environments and user credentials
* resources STRING, set the resources for this slave
* retrychdir INTEGER, sets the number of times chdir() is to be tried (assumes the file system is so stressed that even chdir() has trouble)
* retrychdirbackoff REAL, for retry chdir() the interval between tries increases according to this parameter
* retrychdirsleep INTEGER, sleep time between the first and second attempt at chdir()
* rlmlicense STRING, set the value of RLM_LICENSE (not very much used anymore)
* setenv VAR=VALUE, define or modify an environment variable in the top-level vovslave
* shutdowncancel IGNORED, used to rescue slaves that have been stopped (with STOP) but have not yet exited because they are still running jobs
* unsetenv VAR, unset the named environment variable
* update INTEGER, set the update interval for the slave, typically 60 seconds
* verbose INTEGER in (0,4), set the verbosity level of the slave
* waitafterjc INTEGER, seconds to wait after doing a job control action, except for EXT (before doing another one, the slave never stops)
* waitafterjcext INTEGER, seconds to wait after doing an EXT job control
The value can take different formats depending on the variable.
Examples:
vtk_slave_config 00123456 maxload 3.3
vtk_slave_config 00123456 capacity 1
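Because vtk_slave_config sets one variable per call, several settings can be applied in a loop. The sketch below assumes a VOV Tcl interpreter where vtk_slave_config is defined. The helper name configureSlave is hypothetical, and the slave id 00123456 is the sample id from the examples above.

```tcl
# Hypothetical helper: apply several settings to one slave.
# settings is a flat list of variable/value pairs.
proc configureSlave {slaveIdOrName settings} {
    foreach {variable value} $settings {
        vtk_slave_config $slaveIdOrName $variable $value
    }
}

# Run only when the vtk API is available in this interpreter.
if {[llength [info commands vtk_slave_config]] > 0} {
    configureSlave 00123456 {maxload 3.3 capacity 1 verbose 2}
    # Close the slave with a message, then reopen it later.
    vtk_slave_config 00123456 close "Closed for maintenance"
    vtk_slave_config 00123456 open  "Back in service"
}
```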
vtk_slave_create
Usage:
vtk_slave_create array
Description:
The array contains the properties of the slave to create, including:
* host
* name
* group
* slavetype (one of "normal", "BPS_slave", "BPS_agent")
* capacity
* numjobs
* power
* timeleft
* resources
* status
* curload
* effload
* maxload
* message,usr
* message,sys
The following are for BPS slave only:
* status
- 0: OK
-1: Warning
-2: Unknown
* time left
* number of running jobs
* power
Examples:
set slave(host) alpaca
set slave(capacity) 2
set slave(resources) "finfarm RAM/100"
vtk_slave_create slave
00358286
Returns:
The slave Id of the slave just created, or an error message.
vtk_slave_delete
Usage:
vtk_slave_delete slaveId
Description:
This procedure is used to delete a slave object from the server. Other names for this procedure are vtk_slave_del and vtk_slave_forget.
Example:
vtk_slave_delete 00358286
ok
Returns:
"ok" or an error message
vtk_slave_find
Usage:
vtk_slave_find name
Description:
This procedure looks up a slave by name.
Examples:
vtk_slave_find alpaca
00358285
vtk_slave_find tiger
0
Returns:
The slave Id of the slave with the specified name, or 0 if not found.
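Since a missing slave is reported as 0 rather than as an error, callers may want to check the result explicitly. The following sketch assumes a VOV Tcl interpreter where vtk_slave_find is defined. The helper name requireSlave is hypothetical.

```tcl
# Hypothetical helper: resolve a slave name to an id, raising a Tcl error
# when vtk_slave_find reports 0 (not found).
proc requireSlave {name} {
    set slaveId [vtk_slave_find $name]
    # Compare as strings: slave ids carry leading zeros, so a numeric
    # comparison could interpret them in unexpected ways.
    if {$slaveId eq "0"} {
        error "slave '$name' not found"
    }
    return $slaveId
}

# Run only when the vtk API is available in this interpreter.
if {[llength [info commands vtk_slave_find]] > 0} {
    if {[catch {requireSlave alpaca} slaveId]} {
        puts "lookup failed: $slaveId"
    } else {
        puts "alpaca has slave id $slaveId"
    }
}
```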
vtk_slave_get
Usage:
vtk_slave_get slaveId array
Description:
This procedure fills the array with the slave properties.
Examples:
vtk_slave_get 00358294 slave
ok
parray slave
slave(capacity) = 1
slave(cpus) = 1
slave(curload) = 0.16
slave(effload) = 0.16
slave(host) = alpaca
slave(lastupdate) = 1159547044
slave(maxload) = 1.50
slave(message) =
slave(name) = alpaca
slave(numjobs) = 0
slave(power) = 143678
slave(resources) = alpaca:0.0 RAM/753 linux i686 unix
slave(resources,spec) = alpaca:0.0 @RAM@ @VOVARCH@
slave(slavetype) = normal
slave(status) = READY
slave(timeleft) = Unlim.
slave(persistent) = 0
Returns:
"ok" or an error message
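The array filled by vtk_slave_get can be used for simple computations. The sketch below estimates the unused slots of a slave from the capacity and numjobs fields shown above. It assumes a VOV Tcl interpreter where vtk_slave_get is defined, and that each running job occupies exactly one slot, which may not hold in every configuration. The helper name freeSlots is hypothetical.

```tcl
# Hypothetical helper: number of unused slots on a slave.
# Assumption: every running job occupies exactly one slot.
proc freeSlots {slaveId} {
    # vtk_slave_get fills the array named "slave" in this scope.
    set rc [vtk_slave_get $slaveId slave]
    if {$rc ne "ok"} {
        error "vtk_slave_get failed: $rc"
    }
    return [expr {$slave(capacity) - $slave(numjobs)}]
}

# Run only when the vtk API is available in this interpreter.
if {[llength [info commands vtk_slave_get]] > 0} {
    puts "free slots: [freeSlots 00358294]"
}
```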
vtk_slave_modify
Usage:
vtk_slave_modify slaveId array
Description:
This procedure modifies the slave identified by slaveId with the properties set in the array. The usage of the array is the same as in vtk_slave_create.
Examples:
set slave(host) alpaca
set slave(capacity) 2
set slave(resources) "finfarm RAM/100"
vtk_slave_modify 00382732 slave
ok
Returns:
"ok" or an error message
vtk_slave_reserve
Usage:
vtk_slave_reserve slaveId [OPTIONS]
Description:
Reserve a slave. Without any option, this procedure clears the reservation of this slave.
Options:
-user username
Description: Reserve for specified user or users
-group groupname
Description: Reserve for specified fairshare group or groups
-osgroup groupname
Description: Reserve for specified Unix (or OS) group or groups
-jobclass jobclass
Description: Reserve for specified jobclass or jobclasses
-jobproj jobproj
Description: Reserve for specified jobproj
-bucketid bucketid_list
Description: Reserve for specified bucket id or ids
-id id_list
Description: Reserve for specified id or ids
-duration timespec
Description: Reserve this slave for specified duration. forever is an accepted value.
-start timespec
Description: Set reservation start time
-end timespec
Description: Set reservation end time
Examples:
vtk_slave_reserve 230
ok
vtk_slave_reserve 230 -user john -duration 3h
ok
vtk_slave_reserve 230 -user john,mary -group alpha -start 1049827481 -duration 2w
Returns:
"ok" or an error message
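A reservation is cleared by calling vtk_slave_reserve with no options, so a temporary reservation can be wrapped around a piece of work. The following is a sketch only. It assumes a VOV Tcl interpreter where vtk_slave_reserve is defined and Tcl 8.5 or later. The helper name withReservation is hypothetical.

```tcl
# Hypothetical helper: reserve a slave for one user for a given duration,
# run a script, then clear the reservation even if the script fails.
proc withReservation {slaveId user duration script} {
    vtk_slave_reserve $slaveId -user $user -duration $duration
    # Run the script in the caller's scope and capture its outcome.
    set code [catch {uplevel 1 $script} result options]
    vtk_slave_reserve $slaveId        ;# no options: clears the reservation
    # Pass the script's result, or its error, back to the caller.
    return -options $options $result
}

# Run only when the vtk API is available in this interpreter.
if {[llength [info commands vtk_slave_reserve]] > 0} {
    withReservation 230 john 3h {
        puts "slave 230 is reserved for john while this script runs"
    }
}
```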
vtk_slave_set_timeleft
Usage
vtk_slave_set_timeleft time_left
Description
This procedure is used only from within a vovslave.
The argument specifies the amount of time left for the slave,
so that the server can decide to send jobs whose expected duration
does not exceed the time left.
If the time_left is "UNLIMITED", there is no limit on the time.
If the time_left is 0, the slave is effectively suspended.
The time_left argument is a time specification or the string "UNLIMITED".
Returns
Nothing.