Interacting With Daemons
In this module, we’ll look at how the HTCondor Python bindings can be used to interact with running daemons.
As usual, we start by importing the relevant modules:
[1]:
import htcondor
Configuration
The HTCondor configuration is exposed to Python in two ways:
The local process’s configuration is available in the module-level param object.
A remote daemon’s configuration may be queried using a RemoteParam object.
The param object emulates a Python dictionary:
[2]:
print(htcondor.param["SCHEDD_LOG"]) # prints the schedd's current log file
print(htcondor.param.get("TOOL_LOG")) # prints None, since TOOL_LOG isn't set by default
/home/jovyan/.condor/local/log/SchedLog
None
[3]:
htcondor.param["TOOL_LOG"] = "/tmp/log" # sets TOOL_LOG to /tmp/log
print(htcondor.param["TOOL_LOG"]) # prints /tmp/log, as set above
/tmp/log
Note that assignments to param persist only in memory; if we use reload_config to re-read the configuration files from disk, our change to TOOL_LOG disappears:
[4]:
print(htcondor.param.get("TOOL_LOG"))
htcondor.reload_config()
print(htcondor.param.get("TOOL_LOG"))
/tmp/log
None
In HTCondor, a configuration setting may carry a prefix naming a daemon (for example, SCHEDD.TEST_FOO), which indicates that the setting is specific to that daemon. By default, the Python bindings use the prefix TOOL. To use the configuration of a different daemon, call the set_subsystem function:
[5]:
htcondor.param["TEST_FOO"] = "foo" # sets the default value of TEST_FOO to foo
htcondor.param["SCHEDD.TEST_FOO"] = "bar" # the schedd has a special setting for TEST_FOO
[6]:
print(htcondor.param['TEST_FOO']) # default access; should be 'foo'
foo
[7]:
htcondor.set_subsystem('SCHEDD') # changes the running process to identify as a schedd.
print(htcondor.param['TEST_FOO']) # since we now identify as a schedd, should use the special setting of 'bar'
bar
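The precedence shown above can be modeled in a few lines of plain Python. This is a simplified sketch, not HTCondor's actual implementation: the function lookup_param and the dictionary config are hypothetical names, and the only assumption is that a SUBSYSTEM.NAME entry takes precedence over a bare NAME entry.

```python
def lookup_param(config, name, subsystem="TOOL"):
    """Toy model of subsystem-specific lookup (hypothetical helper).

    Prefer the 'SUBSYSTEM.NAME' entry when it exists; otherwise fall back
    to the bare 'NAME' entry. Raises KeyError if neither is present.
    """
    prefixed = f"{subsystem}.{name}"
    if prefixed in config:
        return config[prefixed]
    return config[name]


# A plain dict standing in for the settings made in the cells above.
config = {"TEST_FOO": "foo", "SCHEDD.TEST_FOO": "bar"}

print(lookup_param(config, "TEST_FOO"))            # default subsystem (TOOL)
print(lookup_param(config, "TEST_FOO", "SCHEDD"))  # schedd-specific value
```

The real bindings perform this resolution internally once set_subsystem has been called; the sketch only illustrates why the same key yields different values.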
Between param, reload_config, and set_subsystem, we can explore the configuration of the local host.
Remote Configuration
What happens if we want to test the configuration of a remote daemon? For that, we can use the RemoteParam class.
The object is first initialized from the output of the Collector.locate method:
[8]:
master_ad = htcondor.Collector().locate(htcondor.DaemonTypes.Master)
print(master_ad['MyAddress'])
master_param = htcondor.RemoteParam(master_ad)
<172.17.0.2:9618?addrs=172.17.0.2-9618&alias=fa6c829ace67&noUDP&sock=master_16_de02>
Once we have the master_param object, we can treat it like a local dictionary to access the remote daemon’s configuration.
Note that the htcondor.param object attempts to infer type information for configuration values from the compile-time metadata, while the RemoteParam object does not:
[9]:
print(repr(master_param['UPDATE_INTERVAL'])) # returns a string
print(repr(htcondor.param['UPDATE_INTERVAL'])) # returns an integer
'5'
5
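Because RemoteParam returns strings, values that will be used numerically have to be converted by the caller. Below is a minimal sketch of one way to do that. The helper coerce_param_value is hypothetical (it is not part of the HTCondor bindings) and recognizes only integers, floats, and the literals true/false, following Python's own parsing rules.

```python
def coerce_param_value(raw):
    """Best-effort conversion of a configuration string (hypothetical helper).

    Tries integer, then float, then the literals 'true'/'false'
    (case-insensitive); anything else is returned unchanged.
    """
    text = raw.strip()
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass  # not parseable by this converter; try the next one
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    return raw


print(repr(coerce_param_value("5")))         # a numeric string becomes an int
print(repr(coerce_param_value("/tmp/log")))  # a path stays a string
```

With the objects from the cells above, coerce_param_value(master_param['UPDATE_INTERVAL']) would give an integer rather than the string '5'.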
In fact, we can even set the daemon’s configuration using the RemoteParam object… if we have permission. By default, this is disabled for security reasons:
[10]:
master_param['UPDATE_INTERVAL'] = '500'
---------------------------------------------------------------------------
HTCondorReplyError Traceback (most recent call last)
/tmp/ipykernel_49/743935840.py in <module>
----> 1 master_param['UPDATE_INTERVAL'] = '500'
/opt/conda/lib/python3.9/site-packages/htcondor/_lock.py in wrapper(*args, **kwargs)
68 acquired = LOCK.acquire()
69
---> 70 rv = func(*args, **kwargs)
71
72 # if the function returned a context manager,
HTCondorReplyError: Failed to set remote daemon parameter.
Logging Subsystem
The logging subsystem is available to the Python bindings; this is often useful for debugging network connection issues between the client and server.
NOTE: Jupyter notebooks discard output from library code; hence, you will not see the results of enable_debug below.
[11]:
htcondor.set_subsystem("TOOL")
htcondor.param['TOOL_DEBUG'] = 'D_FULLDEBUG'
htcondor.param['TOOL_LOG'] = '/tmp/log'
htcondor.enable_log() # Send logs to the log file (/tmp/log)
htcondor.enable_debug() # Send logs to stderr; this is ignored by the web notebook.
print(open("/tmp/log").read()) # Print the log's contents.
Sending Daemon Commands
An administrator can send administrative commands directly to the remote daemon. This is useful if you’d like a certain daemon restarted, drained, or reconfigured.
Because we have a personal HTCondor instance, we are the administrator, so we can test this out!
To send a command, use the top-level send_command function, provide a daemon location, and provide a specific command from the DaemonCommands enumeration. For example, we can reconfigure:
[12]:
print(master_ad['MyAddress'])
htcondor.send_command(master_ad, htcondor.DaemonCommands.Reconfig)
<172.17.0.2:9618?addrs=172.17.0.2-9618&alias=fa6c829ace67&noUDP&sock=master_16_de02>
09/19/23 21:41:27 SharedPortClient: sent connection request to <172.17.0.2:9618> for shared port id master_16_de02
[13]:
import time
time.sleep(1)
log_lines = open(htcondor.param['MASTER_LOG']).readlines()
print(log_lines[-4:])
['09/19/23 21:41:27 Sent SIGHUP to NEGOTIATOR (pid 20)\n', '09/19/23 21:41:27 Sent SIGHUP to SCHEDD (pid 21)\n', '09/19/23 21:41:27 Sent SIGHUP to SHARED_PORT (pid 18)\n', '09/19/23 21:41:27 Sent SIGHUP to STARTD (pid 24)\n']
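A fixed time.sleep(1) can race with the daemon writing its log. One alternative is to poll the log file until a line containing the expected text appears or a timeout expires. The sketch below uses only the standard library; the helper wait_for_log_line is hypothetical and not part of the HTCondor bindings.

```python
import time


def wait_for_log_line(path, needle, timeout=10.0, interval=0.2):
    """Poll a log file until a line containing `needle` appears.

    Returns the last matching line, or None if the timeout expires first.
    Hypothetical helper; it rereads the whole file on every poll, which is
    fine for small logs but wasteful for large ones.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with open(path) as log_file:
                matches = [line for line in log_file if needle in line]
        except FileNotFoundError:
            matches = []  # the log may not exist yet
        if matches:
            return matches[-1]
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)
```

For example, wait_for_log_line(htcondor.param['MASTER_LOG'], 'Sent SIGHUP to STARTD') would return once the master has logged that line. An older matching line from an earlier reconfig also satisfies the check, so a more careful version would record the file size before sending the command and scan only what was appended afterwards.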
We can also instruct the master to shut down a specific daemon:
[14]:
htcondor.send_command(master_ad, htcondor.DaemonCommands.DaemonOff, "SCHEDD")
time.sleep(1)
log_lines = open(htcondor.param['MASTER_LOG']).readlines()
print(log_lines[-1])
09/19/23 21:41:28 SharedPortClient: sent connection request to <172.17.0.2:9618> for shared port id master_16_de02
09/19/23 21:41:28 Can't open directory "/etc/condor/passwords.d" as PRIV_ROOT, errno: 13 (Permission denied)
09/19/23 21:41:28 Can't open directory "/etc/condor/passwords.d" as PRIV_ROOT, errno: 13 (Permission denied)
09/19/23 21:41:28 The SCHEDD (pid 21) exited with status 0
Or even turn off the whole HTCondor instance:
[15]:
htcondor.send_command(master_ad, htcondor.DaemonCommands.OffFast)
time.sleep(10)
log_lines = open(htcondor.param['MASTER_LOG']).readlines()
print(log_lines[-1])
09/19/23 21:41:29 SharedPortClient: sent connection request to <172.17.0.2:9618> for shared port id master_16_de02
09/19/23 21:41:29 **** condor_master (condor_MASTER) pid 16 EXITING WITH STATUS 0
Let’s turn HTCondor back on for future tutorials:
[16]:
import os
os.system("condor_master")
time.sleep(10) # give condor a few seconds to get started