Version 23.0 LTS Releases

These are Long Term Support (LTS) versions of HTCondor. As usual, only bug fixes (and potentially, ports to new platforms) will be provided in future 23.0.y versions. New features will be added in the 23.x.y feature versions.

Warning

The configuration macros JOB_ROUTER_DEFAULTS, JOB_ROUTER_ENTRIES, JOB_ROUTER_ENTRIES_CMD, and JOB_ROUTER_ENTRIES_FILE are deprecated and will be removed in V24 of HTCondor. The new configuration syntax for the job router is defined using JOB_ROUTER_ROUTE_NAMES and JOB_ROUTER_ROUTE_<Name>. Note: The removal will occur during the lifetime of the HTCondor V23 feature series. (HTCONDOR-1968)
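
As a rough aid for spotting configurations affected by this deprecation, the sketch below scans configuration text for assignments to the four deprecated macros. It is only an illustration: the function name find_deprecated_router_macros and the sample text are hypothetical, and it assumes simple "NAME = value" lines, so it does not follow includes, conditionals, or multi-line values the way HTCondor's own configuration tools do.

```python
import re

# The four macros named in the warning above.
DEPRECATED = (
    "JOB_ROUTER_DEFAULTS",
    "JOB_ROUTER_ENTRIES",
    "JOB_ROUTER_ENTRIES_CMD",
    "JOB_ROUTER_ENTRIES_FILE",
)

# Matches "NAME =" at the start of a line (leading whitespace allowed).
_ASSIGNMENT = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_.]*)\s*=")


def find_deprecated_router_macros(config_text):
    """Return (line_number, macro_name) pairs for deprecated assignments."""
    hits = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # skip comment lines
        match = _ASSIGNMENT.match(line)
        # Names are upper-cased before comparison (assumes case-insensitive names).
        if match and match.group(1).upper() in DEPRECATED:
            hits.append((number, match.group(1).upper()))
    return hits


# Hypothetical configuration text, for illustration only.
sample = """\
# hypothetical configuration text
JOB_ROUTER_ENTRIES_FILE = /path/to/routes
JOB_ROUTER_ROUTE_NAMES = Local
"""

print(find_deprecated_router_macros(sample))  # [(2, 'JOB_ROUTER_ENTRIES_FILE')]
```

Only the deprecated name is reported; JOB_ROUTER_ROUTE_NAMES belongs to the new syntax and is left alone.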

The details of each version are described below.

Version 23.0.8

Release Notes:

  • HTCondor version 23.0.8 released on April 11, 2024.

New Features:

  • None.

Bugs Fixed:

  • Fixed a bug that caused ssh-agent processes to be leaked when using grid universe remote batch job submission over SSH. (HTCONDOR-2286)

  • Fixed a bug where DAGMan would crash when the provisioner node was given a parent node. (HTCONDOR-2291)

  • Fixed a bug that prevented the use of ftp: URLs in the file transfer plugin. (HTCONDOR-2273)

  • Fixed a bug where a job that’s matched to an offline slot ad remains idle forever. (HTCONDOR-2304)

  • Fixed a bug where the condor_shadow would not write a job termination event to the job log for a completed job if the condor_shadow failed to reconnect to the condor_starter prior to completing cleanup. This would result in DAGMan workflows being stuck waiting forever for jobs to finish. (HTCONDOR-2292)

  • Fixed a bug where the condor_shadow failed to write its job ad to JOB_EPOCH_HISTORY when it failed to reconnect to the condor_starter. (HTCONDOR-2289)

  • Fixed a bug in the Windows MSI installer that would cause installation to fail when the install path contained a space, such as when installing to C:\Program Files. (HTCONDOR-2302)

  • Fixed a bug where the USER_JOB_WRAPPER was allowed to create job event log information events with newlines in them, which broke the event log parser. (HTCONDOR-2305)

  • Fixed SyntaxWarning raised by Python 3.12 in condor_adstash. (HTCONDOR-2312)

  • Improved use of Vault for job credentials. Reject some invalid use cases and avoid redundant work with frequent job submission. (HTCONDOR-2038) (HTCONDOR-2232)

  • Fixed an issue where HTCondor could not be installed on Debian or Ubuntu platforms if there was more than one condor user in LDAP. (HTCONDOR-2306)
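
The USER_JOB_WRAPPER fix above concerns informational text with embedded newlines reaching the job event log. A wrapper that produces such text can also guard against the problem on its own side by flattening the text first. The helper below is a hypothetical sketch: the name flatten_event_text and the 256-character cap are assumptions chosen for illustration, not HTCondor behavior or limits.

```python
def flatten_event_text(text, max_len=256):
    """Collapse every whitespace run, including newlines, into one space.

    max_len is an arbitrary illustrative cap, not an HTCondor limit.
    """
    flat = " ".join(text.split())
    return flat[:max_len]


message = "stage-in finished\nwith 2 retries\r\n"
print(flatten_event_text(message))
```

Because str.split() with no arguments splits on any whitespace, carriage returns and tabs are handled the same way as newlines.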

Version 23.0.6

Release Notes:

  • HTCondor version 23.0.6 released on March 14, 2024.

New Features:

  • Sped up the starting of daemons on Linux systems configured with a very large number of file descriptors. (HTCONDOR-2270)

Bugs Fixed:

  • Fixed a bug in DAGMan where a node that had retries would incorrectly set its descendants to the Futile state if the node job got removed. (HTCONDOR-2240)

  • Fixed a bug in the event log reader that would, in rare cases, cause DAGMan to lose track of a job and wait forever for a job that had already finished. (HTCONDOR-2236)

  • Fixed condor_test_token to access the SciTokens cache as the correct user when run as root. (HTCONDOR-2241)

  • Fixed a bug that caused a crash if a configuration file or submit description file contained an empty multi-line value. (HTCONDOR-2249)

  • Fixed a bug where a submit transform or a job router route could crash on a two argument transform statement that had missing arguments. (HTCONDOR-2280)

  • Fixed error handling for the -format and -autoformat options of the condor_qusers tool when the argument to those options was not a valid expression. (HTCONDOR-2269)

  • Fixed a bug where the condor_collector generated an invalid host certificate for itself on macOS. (HTCONDOR-2272)

Version 23.0.4

Release Notes:

  • HTCondor version 23.0.4 released on February 8, 2024.

New Features:

  • The condor_starter will now set the environment variable NVIDIA_VISIBLE_DEVICES either to none or to a list of the full uuid of each GPU device assigned to the slot. (HTCONDOR-2242)

  • When the HTCondor Keyboard daemon (condor_kbdd) is installed, a configuration file is included to automatically enable user input monitoring. (HTCONDOR-2255)

  • The condor_starter can now be configured to capture the stdout and stderr of file transfer plugins and write that output into the StarterLog. (HTCONDOR-1459)

  • Updated condor_upgrade_check script for better support and maintainability. This update includes new flags/functionality and removal of old checks for upgrading between V9 and V10 of HTCondor. (HTCONDOR-2168)
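
A job can read the NVIDIA_VISIBLE_DEVICES variable described above to learn which GPUs it was assigned. The sketch below is illustrative only: the function name assigned_gpu_ids is hypothetical, the UUID-like strings are made-up placeholders, and it assumes the list is comma-separated, which the release note does not state.

```python
import os


def assigned_gpu_ids(environ=None):
    """Return the GPU identifiers listed in NVIDIA_VISIBLE_DEVICES.

    Returns an empty list when the variable is unset, empty, or "none".
    Assumes a comma-separated list (an assumption, see the lead-in).
    """
    env = os.environ if environ is None else environ
    raw = env.get("NVIDIA_VISIBLE_DEVICES", "").strip()
    if raw == "" or raw.lower() == "none":
        return []
    return [item.strip() for item in raw.split(",") if item.strip()]


# Placeholder values, not real device UUIDs.
print(assigned_gpu_ids({"NVIDIA_VISIBLE_DEVICES": "GPU-aaaa,GPU-bbbb"}))
print(assigned_gpu_ids({"NVIDIA_VISIBLE_DEVICES": "none"}))
```

Passing a dictionary instead of relying on os.environ keeps the function easy to exercise outside of a running job.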

Bugs Fixed:

  • Fixed a bug in the HTCondor Keyboard daemon where activity detected by the X Screen Saver extension was ignored. (HTCONDOR-2255)

  • Search engine timeout settings for condor_adstash now apply to all search engine operations, not just the initial request to the search engine. (HTCONDOR-2167)

  • Ensured that Perl dependencies are present for the condor_gather_info script. The script now properly reports the user login name and also reports the contents of /etc/os-release. (HTCONDOR-2094)

  • The submit language will no longer treat request_gpu_memory and request_gpus_memory as requests for a custom resource of type gpu_memory or gpus_memory respectively. (HTCONDOR-2201)

  • Fixed a bug where DAG node jobs declared inline inside a DAG file would fail to set the Job ClassAd attribute JobSubmitMethod. (HTCONDOR-2184)

  • Fixed SyntaxWarning raised by Python 3.12 in scripts packaged with the Python bindings. (HTCONDOR-2212)
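
The note above does not say which construct triggered the SyntaxWarning. One frequent cause when moving scripts to Python 3.12 is an invalid escape sequence such as \d inside a non-raw string literal, which Python 3.12 reports as a SyntaxWarning (earlier versions reported it as a DeprecationWarning). The sketch below, with the hypothetical helper name compile_warnings, shows how to detect that case and that a raw string avoids it. It is an assumption that this is related to the fix in HTCondor.

```python
import warnings


def compile_warnings(source):
    """Compile source text and return the names of any warning categories raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<example>", "exec")
    return [w.category.__name__ for w in caught]


bad = 'pattern = "\\d+"'    # the compiled text contains "\d+" in a normal string
good = 'pattern = r"\\d+"'  # the same pattern written as a raw string

print(compile_warnings(bad))
print(compile_warnings(good))
```

The category name printed for the first case depends on the interpreter version; the raw-string form compiles without a warning.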

Version 23.0.3

Release Notes:

  • HTCondor version 23.0.3 released on January 4, 2024.

  • Preliminary support for openSUSE LEAP 15. (HTCONDOR-2156)

New Features:

  • None.

Bugs Fixed:

  • The file transfer plugin documentation states that an exit code of 0 is success, 1 is failure, and 2 is reserved for future work to handle the need to refresh credentials. The definition has now changed so that any non-zero exit code is treated as an error that puts the job on hold. (HTCONDOR-2205)

  • Fixed a bug where any file I/O error (such as disk full) was ignored by the condor_starter when writing the ClassAd file that controlled file transfer plugins. As a result, in rare cases, file transfer plugins could be unknowingly given incomplete sets of files to transfer. (HTCONDOR-2203)

  • Fixed a crash in the Python bindings when job submit fails due to any reason. A common reason might be when SUBMIT_REQUIREMENT_NAMES fails. (HTCONDOR-1931)

  • There is a fixed size limit of 5120 bytes for chirp commands. The starter now returns an error, and the chirp_client prints out an error, when requested to send a chirp command over this limit. Previously, such commands were silently ignored. (HTCONDOR-2157)

  • Fixed a bug where the Python-based HTChirp client had its max line length set much shorter than is allowed by the HTCondor Chirp server. The client now also throws a relevant error when this max limit is hit while sending commands to the server. (HTCONDOR-2142)

  • Linux jobs with an invalid #! interpreter now get a better error message when the Execution Point is running as root. This was enhanced in 10.0, but a bug prevented the enhancement from fully working on a system-installed Execution Point. (HTCONDOR-1698)

  • Fixed a bug where the DAGMan job proper for a DAG with a final node could stay stuck in the removed job state. (HTCONDOR-2147)

  • Correctly identify GPUsAverageUsage and GPUsMemoryUsage as floating point values for condor_adstash. (HTCONDOR-2170)

  • Fixed a bug where condor_adstash would get wedged due to a logging failure. (HTCONDOR-2166)

  • Updated the usage and man page of the condor_drain tool to include information about the -reconfig-on-completion option. (HTCONDOR-2164)
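
The changed treatment of file transfer plugin exit codes described above can be summarized in a few lines. This is a minimal sketch of the rule as stated in the note, not HTCondor's implementation; the function name plugin_outcome and the returned labels are hypothetical.

```python
def plugin_outcome(exit_code):
    """Classify a file transfer plugin exit code per the rule in the note.

    0 means success; every non-zero code, including the previously
    reserved value 2, is treated as an error that puts the job on hold.
    """
    return "success" if exit_code == 0 else "hold"


for code in (0, 1, 2, 127):
    print(code, plugin_outcome(code))
```

A plugin author who previously relied on exit code 2 having a distinct meaning should expect it to be handled like any other failure.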

Version 23.0.2

Release Notes:

  • HTCondor version 23.0.2 released on November 20, 2023.

New Features:

  • None.

Bugs Fixed:

  • Fixed a bug where job submission could hang and then fail when HashiCorp Vault is configured to issue data transfer tokens (which is not the default). Reverted a change to condor_submit that disconnected the output stream of SEC_CREDENTIAL_STORER from the user’s console, which broke OIDC flow. (HTCONDOR-2078)

  • Fixed a bug that could result in job sandboxes not being cleaned up for batch grid jobs submitted to a remote cluster. (HTCONDOR-2073)

  • Improved cleanup of ssh-agent processes when submitting batch grid universe jobs to a remote cluster via ssh. (HTCONDOR-2118)

  • Fixed a bug where the condor_negotiator could fail to contact a condor_schedd that’s on the same private network. (HTCONDOR-2115)

  • Fixed CGROUP_MEMORY_LIMIT_POLICY = custom for cgroup v2 systems. (HTCONDOR-2133)

  • Implemented DISABLE_SWAP_FOR_JOB support for cgroup v2 systems. (HTCONDOR-2127)

  • Fixed a bug in the OAuth and Vault credmons where log files would not rotate according to the configuration. (HTCONDOR-2013)

  • Fixed a bug in the condor_schedd where it would not create a permanent User record when a queue super user submitted a job for a different owner. This bug would sometimes cause the condor_schedd to crash after a job for a new user was submitted. (HTCONDOR-2131)

  • Fixed a bug that could cause jobs to be created incorrectly when using initialdir and max_idle or max_materialize in the same submit file. (HTCONDOR-2092)

  • Fixed a bug in DAGMan where held jobs that were removed would cause a warning about the internal count of held job procs being incorrect. (HTCONDOR-2102)

  • Fixed a bug in condor_transfer_data where using the -addr flag would automatically apply the -all flag to transfer all job data back, making the use of -addr with a Job ID constraint fail. (HTCONDOR-2105)

  • Fixed warnings about use of deprecated HTCondor Python binding methods in the htcondor dag submit command. (HTCONDOR-2104)

  • Fixed several small bugs with Trust On First Use (TOFU) for SSL authentication. Added configuration parameter BOOTSTRAP_SSL_SERVER_TRUST_PROMPT_USER, which can be used to prevent tools from prompting the user about trusting the server’s SSL certificate. (HTCONDOR-2080)

  • Fixed a bug in the condor_userlog tool where it would crash when reading logs with parallel universe jobs in them. (HTCONDOR-2099)

Version 23.0.1

Release Notes:

  • HTCondor version 23.0.1 released on October 31, 2023.

  • We added an HTCondor Python wheel for Python 3.12 on PyPI. (HTCONDOR-2117)

  • The HTCondor tarballs now contain apptainer version 1.2.4. (HTCONDOR-2111)

New Features:

  • None.

Bugs Fixed:

  • Fixed a bug introduced in HTCondor 10.6.0 that prevented USE_PID_NAMESPACES from working. (HTCONDOR-2088)

  • Fixed a bug where HTCondor failed to install on Debian and Ubuntu platforms when the condor user was present and the /var/lib/condor directory was not. (HTCONDOR-2074)

  • Fixed a bug where execution times reported for ARC CE jobs were inflated by a factor of 60. (HTCONDOR-2068)

  • Fixed a bug in DAGMan where Service nodes that failed caused the DAGMan process to fail an assertion check and crash. (HTCONDOR-2051)

  • The job attributes CpusProvisioned, DiskProvisioned, and MemoryProvisioned are now updated for Condor-C and Job Router jobs. (HTCONDOR-2069)

  • Updated HTCondor Windows binaries that are statically linked to the curl library to use curl version 8.4.0. The update was due to a report of a vulnerability, CVE-2023-38545, which affects earlier versions of curl. (HTCONDOR-2084)

  • Fixed a bug on Windows where jobs would be inappropriately put on hold with an out of memory error if they returned an exit code with high bits set. (HTCONDOR-2061)

  • Fixed a bug where jobs put on hold by the shadow were not writing their ad to the job epoch history file. (HTCONDOR-2060)

  • Fixed a rare race condition where condor_rm’ing a parallel universe job would not remove the job if the rm happened after the job was matched but before it fully started. (HTCONDOR-2070)

Version 23.0.0

Release Notes:

  • HTCondor version 23.0.0 released on September 29, 2023.

New Features:

  • A condor_startd without any slot types defined will now default to a single partitionable slot rather than a number of static slots equal to the number of cores as it was in previous versions. The configuration template use FEATURE : StaticSlots was added for admins wanting the old behavior. (HTCONDOR-2026)

  • The TargetType attribute is no longer a required attribute in most ClassAds. It is still used for queries to the condor_collector, and it remains in the Job ClassAd and the Machine ClassAd because older versions of HTCondor require it to be present. (HTCONDOR-1997)

  • The -dry-run option of condor_submit will now print the output of a SEC_CREDENTIAL_STORER script. This can be useful when developing such a script. (HTCONDOR-2014)

  • Added ability to query epoch history records from the Python bindings. (HTCONDOR-2036)

  • The default value of SEC_DEFAULT_AUTHENTICATION_METHODS will now be visible in condor_config_val. The default for SEC_*_AUTHENTICATION_METHODS will inherit from this value, and thus READ and CLIENT will no longer automatically have CLAIMTOBE. (HTCONDOR-2047)

  • Added new tool condor_test_token, which will create a SciToken with configurable contents (including issuer) which will be accepted for a short period of time by the local HTCondor daemons. (HTCONDOR-1115)
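
The change in the default slot layout described in the first item above can be illustrated with a small model. This is only a sketch of the stated rule, not how the condor_startd computes slots: the function name default_slot_layout and the dictionary fields are hypothetical, it assumes each legacy static slot holds one core, and it ignores memory, disk, and other resources.

```python
def default_slot_layout(cores, static_slots=False):
    """Model the default slots for a condor_startd with no slot types defined.

    static_slots=True mimics the previous default (what the note says
    "use FEATURE : StaticSlots" restores): one static slot per core.
    static_slots=False mimics the new default: one partitionable slot.
    """
    if static_slots:
        return [{"type": "static", "cpus": 1} for _ in range(cores)]
    return [{"type": "partitionable", "cpus": cores}]


print(default_slot_layout(4))
print(len(default_slot_layout(4, static_slots=True)))
```

For a 4-core machine the new default yields a single slot holding all four cores, while the old behavior yields four separate slots.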

Bugs Fixed:

  • Fixed a bug that would cause the condor_startd to crash in rare cases when jobs go on hold. (HTCONDOR-2016)

  • Fixed a bug where if a user-level checkpoint could not be transferred from the starter to the AP, the job would go on hold. Now it will retry, or go back to idle. (HTCONDOR-2034)

  • Fixed a bug where the CommittedTime attribute was not set correctly for Docker Universe jobs doing user-level checkpointing. (HTCONDOR-2014)

  • Fixed a bug where condor_preen was deleting files named ‘OfflineAds’ in the spool directory. (HTCONDOR-2019)

  • Fixed a bug where the blahpd would incorrectly believe that an LSF batch scheduler was not working. (HTCONDOR-2003)

  • Fixed the Execution Point’s detection of whether libvirt is working properly for the VM universe. (HTCONDOR-2009)

  • Fixed a bug where container universe did not work for late materialization jobs submitted to the condor_schedd. (HTCONDOR-2031)

  • Fixed a bug where the condor_startd could crash if a new match is made at the end of a drain request. (HTCONDOR-2032)