Interacting With Daemons

In this module, we’ll look at how the HTCondor Python bindings can be used to interact with running daemons.

As usual, we start by importing the relevant modules:

[1]:
import htcondor

Configuration

The HTCondor configuration is exposed to Python in two ways:

  • The local process’s configuration is available in the module-level param object.

  • A remote daemon’s configuration may be queried using a RemoteParam object.

The param object emulates a Python dictionary:

[2]:
print(htcondor.param["SCHEDD_LOG"])   # prints the schedd's current log file
print(htcondor.param.get("TOOL_LOG")) # prints None, since TOOL_LOG isn't set by default
/home/jovyan/.condor/local/log/SchedLog
None
[3]:
htcondor.param["TOOL_LOG"] = "/tmp/log" # sets TOOL_LOG to /tmp/log
print(htcondor.param["TOOL_LOG"])       # prints /tmp/log, as set above
/tmp/log

Note that assignments to param will persist only in memory; if we use reload_config to re-read the configuration files from disk, our change to TOOL_LOG disappears:

[4]:
print(htcondor.param.get("TOOL_LOG"))
htcondor.reload_config()
print(htcondor.param.get("TOOL_LOG"))
/tmp/log
None
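
The in-memory behaviour can be pictured with a toy model in plain Python. This is only a sketch under stated assumptions, not how the bindings are implemented: on_disk and in_memory are hypothetical stand-ins for the configuration files and for the process-local assignments, and the path is made up for illustration.

```python
from collections import ChainMap

# Hypothetical stand-ins: values read from disk and assignments made in memory.
on_disk = {"SCHEDD_LOG": "/var/log/condor/SchedLog"}
in_memory = {}
param = ChainMap(in_memory, on_disk)  # lookups consult in_memory first

param["TOOL_LOG"] = "/tmp/log"   # ChainMap writes only touch the first mapping
print(param.get("TOOL_LOG"))     # the in-memory value is visible
print("TOOL_LOG" in on_disk)     # the on-disk view was never modified

in_memory.clear()                # analogous to re-reading the files from disk
print(param.get("TOOL_LOG"))     # the assignment is gone
```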

In HTCondor, a configuration setting may carry a subsystem prefix (such as SCHEDD.) to indicate that the value applies only to that daemon. By default, the Python bindings identify as the TOOL subsystem. To read the configuration as a different daemon would see it, use the set_subsystem function:

[5]:
htcondor.param["TEST_FOO"] = "foo"         # sets the default value of TEST_FOO to foo
htcondor.param["SCHEDD.TEST_FOO"] = "bar"  # the schedd has a special setting for TEST_FOO
[6]:
print(htcondor.param['TEST_FOO'])        # default access; should be 'foo'
foo
[7]:
htcondor.set_subsystem('SCHEDD')         # changes the running process to identify as a schedd.
print(htcondor.param['TEST_FOO'])        # since we now identify as a schedd, should use the special setting of 'bar'
bar
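
The precedence shown in these cells can be modelled with an ordinary dictionary. The sketch below is a toy illustration rather than HTCondor's implementation: lookup is a hypothetical helper, and it assumes only what the cells above demonstrate, namely that a SUBSYSTEM.NAME entry overrides the bare NAME for that subsystem.

```python
def lookup(config, name, subsystem="TOOL"):
    """Return the subsystem-specific value of `name` if present, else the default."""
    prefixed = f"{subsystem}.{name}"
    if prefixed in config:
        return config[prefixed]
    return config.get(name)


# The same two settings as in the cells above, held in a plain dict.
config = {"TEST_FOO": "foo", "SCHEDD.TEST_FOO": "bar"}

print(lookup(config, "TEST_FOO"))            # identifying as TOOL: falls back to the default
print(lookup(config, "TEST_FOO", "SCHEDD"))  # identifying as SCHEDD: uses the prefixed entry
```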

Between param, reload_config, and set_subsystem, we can explore the configuration of the local host.

Remote Configuration

What if we want to inspect the configuration of a remote daemon? For that, we can use the RemoteParam class.

The object is first initialized from the output of the Collector.locate method:

[8]:
master_ad = htcondor.Collector().locate(htcondor.DaemonTypes.Master)
print(master_ad['MyAddress'])
master_param = htcondor.RemoteParam(master_ad)
<172.17.0.2:9618?addrs=172.17.0.2-9618&alias=fa6c829ace67&noUDP&sock=master_16_de02>

Once we have the master_param object, we can treat it like a local dictionary to access the remote daemon’s configuration.

Note that the htcondor.param object attempts to infer type information for configuration values from compile-time metadata, while the RemoteParam object does not and returns every value as a string:

[9]:
print(repr(master_param['UPDATE_INTERVAL']))      # returns a string
print(repr(htcondor.param['UPDATE_INTERVAL']))    # returns an integer
'5'
5
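
If typed values are needed from a RemoteParam, the strings can be converted by hand. The following is a minimal sketch, assuming a simple guess-by-format rule; coerce is a hypothetical helper, not part of the bindings, and it does not reproduce the metadata-based inference that htcondor.param performs.

```python
def coerce(value):
    """Best-effort conversion of a configuration string to bool, int, or float."""
    text = value.strip()
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    return text  # anything else stays a string


print(repr(coerce("5")))
print(repr(coerce("/tmp/log")))
```

With the objects from the cells above, this could be applied as coerce(master_param['UPDATE_INTERVAL']).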

In fact, we can even set the daemon’s configuration using the RemoteParam object… if we have permission. By default, this is disabled for security reasons:

[10]:
master_param['UPDATE_INTERVAL'] = '500'
---------------------------------------------------------------------------
HTCondorReplyError                        Traceback (most recent call last)
/tmp/ipykernel_49/743935840.py in <module>
----> 1 master_param['UPDATE_INTERVAL'] = '500'

/opt/conda/lib/python3.9/site-packages/htcondor/_lock.py in wrapper(*args, **kwargs)
     68             acquired = LOCK.acquire()
     69
---> 70             rv = func(*args, **kwargs)
     71
     72             # if the function returned a context manager,

HTCondorReplyError: Failed to set remote daemon parameter.

Logging Subsystem

The logging subsystem is available to the Python bindings; this is often useful for debugging network connection issues between the client and server.

NOTE: Jupyter notebooks discard output from library code, so you will not see the results of enable_debug below.

[11]:
htcondor.set_subsystem("TOOL")
htcondor.param['TOOL_DEBUG'] = 'D_FULLDEBUG'
htcondor.param['TOOL_LOG'] = '/tmp/log'
htcondor.enable_log()    # Send logs to the log file (/tmp/log)
htcondor.enable_debug()  # Send logs to stderr; this is ignored by the web notebook.
print(open("/tmp/log").read())  # Print the log's contents.

Sending Daemon Commands

An administrator can send administrative commands directly to the remote daemon. This is useful if you’d like a certain daemon restarted, drained, or reconfigured.

Because we have a personal HTCondor instance, we are the administrator, so we can test this out!

To send a command, use the top-level send_command function, provide a daemon location, and provide a specific command from the DaemonCommands enumeration. For example, we can reconfigure:

[12]:
print(master_ad['MyAddress'])

htcondor.send_command(master_ad, htcondor.DaemonCommands.Reconfig)
<172.17.0.2:9618?addrs=172.17.0.2-9618&alias=fa6c829ace67&noUDP&sock=master_16_de02>
09/19/23 21:41:27 SharedPortClient: sent connection request to <172.17.0.2:9618> for shared port id master_16_de02
[13]:
import time

time.sleep(1)

log_lines = open(htcondor.param['MASTER_LOG']).readlines()
print(log_lines[-4:])
['09/19/23 21:41:27 Sent SIGHUP to NEGOTIATOR (pid 20)\n', '09/19/23 21:41:27 Sent SIGHUP to SCHEDD (pid 21)\n', '09/19/23 21:41:27 Sent SIGHUP to SHARED_PORT (pid 18)\n', '09/19/23 21:41:27 Sent SIGHUP to STARTD (pid 24)\n']
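
Reading the last few lines of a log recurs in the cells that follow, so it may be convenient to wrap it in a small function. This is a sketch only: tail is a hypothetical helper, not part of the bindings, and it is demonstrated on a throwaway file rather than a real daemon log.

```python
import os
import tempfile
from collections import deque


def tail(path, n=4):
    """Return the last `n` lines of the text file at `path`."""
    with open(path) as f:                # the file is closed when the block exits
        return list(deque(f, maxlen=n))  # deque keeps only the final n lines


# Demonstrate on a temporary file with known contents.
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.write("line 1\nline 2\nline 3\n")
demo_path = tmp.name

print(tail(demo_path, 2))  # ['line 2\n', 'line 3\n']
os.remove(demo_path)
```

In the notebook, the equivalent call would be tail(htcondor.param['MASTER_LOG']).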

We can also instruct the master to shut down a specific daemon:

[14]:
htcondor.send_command(master_ad, htcondor.DaemonCommands.DaemonOff, "SCHEDD")

time.sleep(1)

log_lines = open(htcondor.param['MASTER_LOG']).readlines()
print(log_lines[-1])
09/19/23 21:41:28 SharedPortClient: sent connection request to <172.17.0.2:9618> for shared port id master_16_de02
09/19/23 21:41:28 Can't open directory "/etc/condor/passwords.d" as PRIV_ROOT, errno: 13 (Permission denied)
09/19/23 21:41:28 Can't open directory "/etc/condor/passwords.d" as PRIV_ROOT, errno: 13 (Permission denied)
09/19/23 21:41:28 The SCHEDD (pid 21) exited with status 0

Or even turn off the whole HTCondor instance:

[15]:
htcondor.send_command(master_ad, htcondor.DaemonCommands.OffFast)

time.sleep(10)

log_lines = open(htcondor.param['MASTER_LOG']).readlines()
print(log_lines[-1])
09/19/23 21:41:29 SharedPortClient: sent connection request to <172.17.0.2:9618> for shared port id master_16_de02
09/19/23 21:41:29 **** condor_master (condor_MASTER) pid 16 EXITING WITH STATUS 0

Let’s turn HTCondor back on for future tutorials:

[16]:
import os
os.system("condor_master")
time.sleep(10)  # give condor a few seconds to get started
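
A fixed sleep can wait longer than necessary or not long enough. One alternative is to poll until some readiness check succeeds. The sketch below is illustrative only: wait_until and ready are hypothetical names, and the stand-in predicate simply succeeds on its third call. In practice the predicate might wrap a call such as htcondor.Collector().locate(htcondor.DaemonTypes.Master) and return False when it raises, although treating that as the readiness signal is an assumption.

```python
import time


def wait_until(predicate, timeout=30.0, interval=0.5):
    """Call `predicate` repeatedly until it returns True or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in readiness check for demonstration: succeeds on the third call.
attempts = []


def ready():
    attempts.append(1)
    return len(attempts) >= 3


print(wait_until(ready, timeout=5.0, interval=0.01))
```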