Manage Processes

This section describes how to use the command line to find all processes that are not currently managed by Accelerator. VOV can use a vovtasker to collect information about all processes on all hosts in a farm.

Processes that are descendants of vovtasker, orphans of vovtasker, and external processes can also be found. Foster jobs can be created for discovered orphans; a tasker on the same host then accounts for these jobs and tracks them for the rest of their lifetime.

vovprocessmgr: Usage Message
      % vovprocessmgr [OPTIONS]
  Report on and manage processes on hosts where an Altair Engineering vovtasker
  is running, for example, in Accelerator.
      -h             -- Show brief help.
      -v             -- Increase verbosity.
      -w             -- Wide output (tab-separated, no truncation in names).
      -refresh [-orphans [-host HOST[,HOST]...] [-nohost HOST[,HOST]...]]
                        -- Refresh the process info. Refreshes all process info
                        unless the -orphans option is also passed, which
                        refreshes the process info for orphaned processes only.
                        The -host/-nohost options apply when refreshing orphaned
                        processes only, otherwise, all hosts are included.
                        Orphaned processes are determined by the presence of the
                        VOV_JOBID variable in the environment of the process.
                        Note that this is an asynchronous operation that is sent
                        to remote slaves, requesting them to gather and send
                        process information to the server. The timeliness of the
                        response depends on the loading of both the slaves and 
                        of the server. For this reason, some amount of time
                        should be allowed between a refresh request and
                        reporting on processes of any type. For reports
                        involving only a few slaves, this could be measured in
                        seconds. For requests involving hundreds or thousands of
                        hosts, it may take several minutes for every slave to
                        report in. Refreshing process info is an expensive
                        operation that can result in a significant amount of
                        communication and loading on the vovserver process and
                        therefore should be used only when necessary.
      -external      -- Filter to processes that are not a descendant of slave.
      -descendants   -- Filter to current descendant processes of slave.
      -orphans       -- Filter to former descendant processes of slave. For
                        accurate results, refresh the process info using the
                        -orphans option prior to running an orphan report. An
                        orphan report will also include fostered jobs as well.
                        Note that if orphan processes are common, it is
                        recommended to enable automatic child process cleanup
                        via the slave.childProcessCleanup configuration
                        parameter in the policy.tcl file.
      -fostered      -- Filter to orphans currently being fostered.
      -all           -- Show all processes.
      -user "USER[,USER]..."              -- Filter to specified users.
      -host "HOST[,HOST]..."              -- Filter to specified hosts.
      -exe  "executableName[,exname]..."  -- Filter to specified executable
      -noheader      -- Suppress header.
      -noresv        -- Exclude reserved slaves.
      -noexternal    -- Exclude processes that are not a descendant of slave.
      -nodescendants -- Exclude current descendant processes of slave.
      -noorphans     -- Exclude former descendant processes of slave.
      -nofostered    -- Exclude orphans currently being fostered.
      -nouser "USER[,USER]..."              -- Exclude specified users.
      -nohost "HOST[,HOST]..."              -- Exclude specified hosts.
      -noexe  "executableName[,exname]..."  -- Exclude specified executable
      -age   "TIMESPEC"  -- default age 10m.
      -maxrecursion "N"  -- Limit in recursive check of parents (default 100).
      -foster            -- Create a foster job for each top-most orphaned
                            process. Note that if orphan processes are common, 
                            it is recommended to enable automatic child process
                            cleanup via the slave.childProcessCleanup
                            configuration parameter in the policy.tcl file.
      -clear             -- Forget all information about processes from the 
                            server (frees up memory). Must be the only option.
     % vovprocessmgr -refresh
     % vovprocessmgr -refresh -orphans
     % vovprocessmgr -orphans
     % vovprocessmgr -all -noexternal
     % vovprocessmgr -external -user john,mary,bob
     % vovprocessmgr -orphans -age 3h
     % vovprocessmgr -clear
The process information is accumulated in the vovserver and is released after approximately 5 minutes or when it is refreshed, whichever occurs first. To refresh information about all processes, use the following commands:
% vovproject enable vnc
% vovprocessmgr -refresh
vovprocessmgr sends a message to all the taskers, asking them to update the information about all processes and deliver the collected data to the vovserver. Sending all the data may take a few seconds; on farms with hundreds or thousands of hosts it may take several minutes.
Note: This command only works for the owner of Accelerator.
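Because the refresh is asynchronous, a report requested immediately afterwards may be based on incomplete data. The following is a minimal sketch of a wrapper that refreshes orphan information, waits, and then reports. The function name refresh_and_report_orphans and the REFRESH_WAIT variable are hypothetical and not part of vovprocessmgr; the 30-second default is only a guess and should be increased for large farms. The sketch assumes the project has already been enabled with vovproject.

```shell
#!/bin/sh
# Sketch only: refresh_and_report_orphans and REFRESH_WAIT are hypothetical
# names introduced for illustration; they are not vovprocessmgr features.

refresh_and_report_orphans() {
    # Seconds to wait between the refresh request and the report.
    wait_seconds="${REFRESH_WAIT:-30}"

    # Ask the taskers to resend process info for orphaned processes only.
    vovprocessmgr -refresh -orphans || return 1

    # The refresh is asynchronous: give the taskers time to report in.
    sleep "$wait_seconds"

    # Report orphans; extra arguments (for example -age 3h) are passed through.
    vovprocessmgr -orphans "$@"
}

# Example (not executed here):
#   REFRESH_WAIT=120 refresh_and_report_orphans -age 3h
```

Restricting the refresh to orphans keeps the load on the vovserver lower than a full refresh, which matters because refreshing is described as an expensive operation.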
The processes that are not children of a vovtaskerroot process can now be listed with:
% vovprocessmgr -orphans
For example:
> vovprocessmgr -orphans
vovprocessmgr 07/08/2016 11:46:18: message: Analyzing 438 processes on 1 hosts that are older than 10m00s

  Mininum process age:	10m00s
  Exclude user: 	apache avahi canna daemon dbus gdm
			haldaemon haldeamon htt mysql named nobody
			ntp oracle postfix postgres root rpc
			rpcuser smmsp xfs
  Filter to orphans

Host          Pid      User        Executable      Age      State    RAM      CPU      Relation
titanus       23382    john        vovsh                    S        10       10s      orphan
titanus       23383    john        vovsh                    S        9        0s       orphan
titanus       23421    john        postgres        14d00h   S        212      5s       orphan
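A report like the one above can be post-processed with standard tools. The sketch below reads a report on standard input and prints the host, PID, and user of each orphan row. It assumes the layout of the sample: host, PID, and user are the first three columns and the relation is the last column, with the literal value "orphan". The function name list_orphans is hypothetical. Reading the last column rather than a fixed position keeps the filter working when a middle column is empty.

```shell
#!/bin/sh
# Sketch only: list_orphans is a hypothetical helper, and the column layout
# is assumed from the sample report (host, pid, user first; relation last).

list_orphans() {
    # Keep rows whose last field is "orphan" and whose second field is a PID.
    # Banner lines and the header do not match and are dropped.
    awk '$NF == "orphan" && $2 ~ /^[0-9]+$/ { print $1, $2, $3 }'
}

# Example (not executed here):
#   vovprocessmgr -orphans | list_orphans
```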
To create foster jobs for discovered orphans:
% vovprocessmgr -orphans -foster
To track fostered jobs:
% vovprocessmgr -fostered
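The two steps can be combined so that foster jobs are requested only when the report actually contains orphans. This is a sketch under assumptions: the function name foster_if_any is hypothetical, and orphan rows are recognized by the value "orphan" in the last column, as in the sample report. The -nofostered option is used so that orphans already being fostered are not counted.

```shell
#!/bin/sh
# Sketch only: foster_if_any is a hypothetical helper. It assumes orphan rows
# end with the word "orphan", as in the sample report.

foster_if_any() {
    # Report orphans that are not already fostered; extra filters pass through.
    report=$(vovprocessmgr -orphans -nofostered "$@") || return 1

    # Count the orphan rows in the report.
    count=$(printf '%s\n' "$report" |
        awk '$NF == "orphan" { n++ } END { print n + 0 }')

    if [ "$count" -eq 0 ]; then
        echo "No orphans found; nothing to foster."
        return 0
    fi

    echo "Found $count orphan process(es); requesting foster jobs."
    vovprocessmgr -orphans -foster "$@"
}

# Example (not executed here):
#   foster_if_any -age 1d
```

Since -foster creates a job for each top-most orphaned process, the number of foster jobs may be smaller than the number of orphan rows counted here.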
To find only the processes that are older than a specified time, for example 1 day, use the option -age as shown below:
% vovprocessmgr -orphans -age 1d
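Ages appear in this section in forms such as 10m, 3h, 1d, 10m00s, and 14d00h. When post-processing a report, it can help to convert such values to seconds so they can be compared numerically. The sketch below assumes a time specification is one or more groups of digits followed by s, m, h, or d; any other unit the tool may accept is not handled. The function name timespec_to_seconds is hypothetical.

```shell
#!/bin/sh
# Sketch only: timespec_to_seconds is a hypothetical helper. It handles only
# the units seen in this section (s, m, h, d), alone or concatenated.

timespec_to_seconds() {
    printf '%s\n' "$1" | awk '
    {
        spec = $0
        total = 0
        # Consume one "<digits><unit>" group at a time from the left.
        while (match(spec, /^[0-9]+[smhd]/)) {
            n = substr(spec, 1, RLENGTH - 1) + 0
            u = substr(spec, RLENGTH, 1)
            if (u == "s")      total += n
            else if (u == "m") total += n * 60
            else if (u == "h") total += n * 3600
            else               total += n * 86400
            spec = substr(spec, RLENGTH + 1)
        }
        # Empty input or leftover characters mean the value was not understood.
        if ($0 == "" || spec != "") exit 1
        print total
    }'
}

# Example (not executed here):
#   timespec_to_seconds 14d00h
```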

It may be necessary to remove unwanted processes from the farm. For security, vovprocessmgr only reports the list of suspected orphans; it does not kill them. Only an administrator with root privileges has the authority to access the machines and kill the listed processes.
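To hand the list to an administrator, the report can be condensed to one line per host. The sketch below groups orphan PIDs by host and deliberately kills nothing. It assumes the sample report layout (host first, PID second, relation "orphan" last); the function name orphans_by_host is hypothetical.

```shell
#!/bin/sh
# Sketch only: orphans_by_host is a hypothetical helper that prints a review
# list. It does not kill any process.

orphans_by_host() {
    awk '
    $NF == "orphan" && $2 ~ /^[0-9]+$/ {
        if ($1 in pids) {
            pids[$1] = pids[$1] " " $2
        } else {
            order[++n] = $1      # remember first-seen order of hosts
            pids[$1] = $2
        }
    }
    END {
        for (i = 1; i <= n; i++)
            print order[i] ": " pids[order[i]]
    }'
}

# Example (not executed here):
#   vovprocessmgr -orphans -age 1d | orphans_by_host
```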

Stopped Taskers and Foster Jobs

A tasker accounts for jobs that are still running under a stopped tasker on the same host. When a tasker starts and a matching tasker is in the stopped condition (waiting for its jobs to finish), the new tasker adopts the jobs of the stopped tasker by using foster jobs. Because the adopted jobs are accounted for, the host is not overloaded.