Version 24.x LTS Releases
HTCondor 24.x transitioned from feature to LTS with the release of 24.12.13. These are Long Term Support (LTS) versions of HTCondor. As usual, only bug fixes (and potentially, ports to new platforms) will be provided in future 24.12.y versions. New features will be added in the 25.x.y feature versions.
Version 24.12.20
Release Notes:
HTCondor version 24.12.20 released on May 14, 2026.
New Features:
HTCondor tarballs now contain Pelican 7.24.2
The condor package now requires pelican-7.24.2.
Bugs Fixed:
Fixed a bug that caused attempts to refresh a job’s X.509 proxy file spooled by the AP to fail. (HTCONDOR-3660)
htcondor2.param now properly handles configuration options typed as long integers containing values greater than approximately 2 billion and no longer crashes the process when accessing the param object. (HTCONDOR-3616)
Improved messages in STARTD_LOG when the condor_startd aborts due to failing to detect the total amount of physical memory or when RESERVED_MEMORY leaves no memory for slot creation. (HTCONDOR-3667)
Fixed a bug in the parallel universe where the job attribute RemoteUserCpu would only contain the CPU usage from one proc in the parallel job. (HTCONDOR-3677)
condor_ssh_to_job now properly supports running into docker and container universe jobs with a one-shot command argument. In addition, the shell now enters the cgroup of the job for container universe jobs. (HTCONDOR-3658)
Fixed a bug where Execution Points using STARTD_ENFORCE_DISK_LIMITS without being explicitly provided a backing store (Volume Group) would fail to clean up the automatically generated backing store on shutdown. (HTCONDOR-3657)
VM Universe jobs now use the widely supported Intel e1000 network card. This is needed on Enterprise Linux 10 hosts, where the default virtual NIC model selected by libvirt is incompatible. (HTCONDOR-3642)
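The "approximately 2 billion" threshold in the htcondor2.param fix above presumably corresponds to the largest signed 32-bit integer. The following is a minimal sketch in plain Python, not HTCondor code; the helper name as_int32 is hypothetical, and the sketch only shows how a larger value wraps when it is forced into 32 bits.

```python
import ctypes

INT32_MAX = 2**31 - 1  # 2,147,483,647, i.e. "approximately 2 billion"

def as_int32(value: int) -> int:
    """Hypothetical helper: reinterpret an integer as a signed 32-bit value."""
    return ctypes.c_int32(value).value

print(as_int32(INT32_MAX))      # still fits: 2147483647
print(as_int32(INT32_MAX + 1))  # wraps to -2147483648
print(as_int32(3_000_000_000))  # wraps to -1294967296
```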
Version 24.12.19
Release Notes:
HTCondor version 24.12.19 released on April 16, 2026.
New Features:
HTCondor tarballs now contain Pelican 7.24.0
The condor package now requires pelican-7.24.0.
Bugs Fixed:
Fixed a bug that would cause the condor_startd to crash if a custom DOCKER docker wrapper emitted unexpected output to the “images” sub-command. (HTCONDOR-3596)
Fixed a bug where condor_status would not adjust the Name column width to fit data for some output formats. This bug would result in columns after the Name column not being properly aligned when the longest name exceeded 34 characters. (HTCONDOR-3255)
Fixed a bug where condor_vacate would fail to find the address of the condor_startd when provided a slot name. With this change the condor_collector now searches through both Startd Daemon Ads and Startd Slot Ads when handling a locate query. (HTCONDOR-3580)
Fixed a bug in condor_q match analysis that would always count sub-expressions that evaluated to a double or integer as not matching. (HTCONDOR-3601)
Fixed a bug where an NVIDIA GPU that has been subdivided in MIG sub-devices would not properly pass the device names in the various NVIDIA related environment variables. (HTCONDOR-3567)
Fixed a performance problem with the classad2 python bindings that caused the parseAds methods to run in n-squared time. (HTCONDOR-3632)
Fixed a bug where allowed_execute_duration would possibly not be respected for jobs that did not do input file transfer. (HTCONDOR-3633)
Fixed a bug for batch grid universe jobs where submitting to a remote system would fail if the username on the remote system contained a dot. (HTCONDOR-3581)
Fixed a bug where some of the values reported for job execution time were enormous if the Access Point daemons restarted while the job was running. (HTCONDOR-3592)
Fixed a bug where an administrator couldn’t query a daemon’s configuration using condor_config_val when authentication is required for READ level authorization. (HTCONDOR-3572)
The htcondor2.send_command() method no longer incorrectly requires a target parameter. (HTCONDOR-3597)
Version 24.12.18
Release Notes:
HTCondor version 24.12.18 released on March 12, 2026.
New Features:
Enable the in-memory cache option in newer versions of the scitokens-cpp library. This allows the library to continue operating when the on-disk cache is not useable. (HTCONDOR-3560)
HTCondor tarballs now contain Pelican 7.23.0
The condor package now requires pelican-7.23.0.
Bugs Fixed:
Improve AMD GPU detection when using ROCM6/HIP libraries and fix a potential crash when using the HIP libraries with the default detection options passed by the condor_startd. (HTCONDOR-3533)
Fixed an issue where jobs could match Execution Points utilizing STARTD_ENFORCE_DISK_LIMITS only to be kicked off immediately for lack of disk space, caused by quantization issues between HTCondor and LVM. This could cause a lot of unnecessary job churn when Execution Points are very busy. (HTCONDOR-3490)
Fixed a bug where a job with a circular Requirements expression would result in a crash in condor_q when the -better option was passed. The crash was fixed, and condor_submit was changed to report an error when the submit file has a requirements command that references Requirements. (HTCONDOR-2629)
Fixed a bug in condor_status where the Offline-GPUs column of the -gpus output was always empty. (HTCONDOR-3558)
Fixed a bug in htcondor2.Schedd.history() which prevented its since argument from being the string representation of an expression. (HTCONDOR-3435)
Fixed a bug where a backfill p-slot would have no assigned GPUs after a reconfig. (HTCONDOR-3072)
Fixed a bug in condor_submit that could expand the executable path incorrectly when using late materialization and a different executable for each job. (HTCONDOR-3546)
Fixed a bug where new STARTD_CRON values didn’t trigger an immediate update of slot ads to the collector when configured to do so. (HTCONDOR-3526)
Update Annex’s knowledge about the Delta HPC system. (HTCONDOR-3559)
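The quantization issue between HTCondor and LVM noted above comes down to rounding: LVM allocates logical volumes in whole extents, so a volume can be slightly larger than the amount requested. The sketch below is illustrative only and is not HTCondor's accounting logic; it assumes LVM's default 4 MiB extent size, and the function name and sample request are hypothetical.

```python
EXTENT_KIB = 4 * 1024  # assumption: LVM's default extent size of 4 MiB

def lvm_allocated_kib(request_kib: int, extent_kib: int = EXTENT_KIB) -> int:
    """Round a disk request (in KiB) up to a whole number of extents."""
    extents = -(-request_kib // extent_kib)  # ceiling division
    return extents * extent_kib

request = 1_000_000  # KiB requested by a hypothetical job
allocated = lvm_allocated_kib(request)
print(allocated, allocated - request)  # 1003520 3520
```

When a pool is nearly full, a few megabytes of rounding per job is enough for the space a matchmaker believes is free to differ from what can actually be allocated.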
Version 24.12.17
Release Notes:
HTCondor version 24.12.17 released on February 12, 2026.
New Features:
None.
Bugs Fixed:
Fixed htcondor2.Schedd.refreshGSIProxy() to use correct variables and not immediately fail. (HTCONDOR-3503)
Fixed bugs in condor_history that made it very slow on filesystems with rate-limits on I/O operations. (HTCONDOR-3512)
Version 24.12.16
Release Notes:
HTCondor version 24.12.16 released on January 29, 2026.
New Features:
HTCondor tarballs now contain Pelican 7.22.0
The condor package now requires pelican-7.22.0.
HTCondor tarballs now contain Apptainer 1.4.5
The condor RPM package now requires at least apptainer version 1.4.5.
Bugs Fixed:
Fixed a bug where specifying a scope or audience when storing a Vault-managed credential results in a stored token that can’t be used. (HTCONDOR-3506)
In the condor_schedd, when a transform attempts to set an immutable attribute more than once, the second attempt will be quietly ignored rather than failing job materialization. (HTCONDOR-3495)
Fixed a bug where a backfill p-slot could get into a state where it would advertise that it had capacity, but refuse every claim with a log message indicating that it had zero CPUs. (HTCONDOR-3072)
Fixed a bug in the shared-port daemon that caused new network connections to fail if only a few bytes are immediately available to read. (HTCONDOR-3420)
Fixed a bug where a job with a floating point value for RequestMemory or RequestDisk might not match the slot created to run it. (HTCONDOR-3423)
Fixed a bug that would cause a crash in the condor_rooster and condor_defrag when many slots had identical ROOSTER_UNHIBERNATE_RANK expressions, or if the expression was not constant (e.g. used random() or time()). (HTCONDOR-3436)
Removed memory leak from htcondor2.JobEventLog.events(). (HTCONDOR-3474)
Changed condor_history so that when it prints jobs in -long format it prints them with attributes sorted alphabetically like condor_q does. (HTCONDOR-3481)
Fix Logical Volume setup timeout on Execution Points enforcing disk usage for jobs that have a large request_disk value (800+ GB). (HTCONDOR-3432)
Fixed issue with Execution Points enforcing disk usage generating an excessive amount of metadata archives filling up /etc/lvm/archive. (HTCONDOR-3488)
Added FILETRANSFER_PLUGIN_CLASSAD_TIMEOUT for administrators to set a longer timeout for plugins to detect their health. (HTCONDOR-3455)
Fixed a bug where docker universe jobs would bring a file named .docker_stderror back to the AP after job completion. (HTCONDOR-3424)
Fixed a bug where the values of the attributes LocalJobsIdle and LocalJobsRunning in the submitter ad were reversed. (HTCONDOR-3456)
Version 24.12.15
Release Notes:
HTCondor version 24.12.15 released on December 15, 2025.
New Features:
condor_submit will now report an error when output_destination is not a URL and suggest an alternative. (HTCONDOR-3385)
HTCondor tarballs now contain Pelican 7.21.1
The condor package now requires pelican-7.21.1.
HTCondor tarballs now contain Apptainer 1.4.4
The condor RPM package now requires at least apptainer version 1.4.4.
Bugs Fixed:
Improve compatibility with versions of HTCondor that attempt to set attribute OsUser in the job ad. (HTCONDOR-3415)
Calling Schedd.submit() with a list of dictionaries with dissimilar keys will no longer mangle the values (and instead ignore the extra keys, as implied by the documentation). (HTCONDOR-3351)
Fixed a bug that could cause the AP to fail to read job credential files. (HTCONDOR-3377)
Fixed bugs that could cause a crash in the authentication code. (HTCONDOR-3394)
Fixed a bug where the condor_negotiator would fail to contact the condor_schedd to perform matchmaking if its ALLOW_CLIENT configuration parameter didn’t authorize the identity submit-side@matchsession. (HTCONDOR-3378)
Fixed a bug where tools like condor_status would print an incorrect value for the -af:r option when the value to be printed was a single attribute reference. (HTCONDOR-3347)
Version 24.12.14
Release Notes:
HTCondor version 24.12.14 released on November 3, 2025.
New Features:
Added a new EP configuration knob, DOCKER_TRUST_LOCAL_IMAGES, which defaults to false. Setting this to true allows users to run docker images which have been pre-staged in the EP’s docker cache even if the image does not exist in a repository, or if the user does not have permission to pull from that repository. (HTCONDOR-3315)
The condor_schedd will now include the address of a condor_credd that is running under the same condor_master in its ClassAd and address file. This allows the submission process to get the address of the condor_credd from the condor_schedd for some situations where credentials must be stored as part of job submission. The Kerberos local issuer will now use this mechanism and no longer query the collector for the address of the condor_credd. (HTCONDOR-3281)
Bugs Fixed:
Fix interoperability problem between HTCondor-CE 24 and 25 which manifests as a Job Router crash when upgrading the CE to HTCondor 25. (HTCONDOR-3355)
Fixed a bug in Schedd.submit() where single-entry itemdata could be truncated to its first character. (HTCONDOR-3272)
Fixed a bug in Schedd.submit() where the $(step) submit variable wasn’t being set. (HTCONDOR-3272)
Fixed a bug in Submit.itemdata() causing multi-entry itemdata to be returned incorrectly. (HTCONDOR-3272)
Fixed a bug where using max_idle, container_image and transfer_input_files could result in the container only being transferred along with the first job. (HTCONDOR-3092)
Changed the documentation for classad2.ClassAd.matches() and classad2.ClassAd to match their implementations; the original documentation was in error, and both are actually backwards-compatible with the first version of the Python bindings. Made other minor changes; see the ticket for details. (HTCONDOR-3328)
Calling classad2.ExprTree(None) no longer results in an invalid classad2.ExprTree, preventing segmentation faults when using the object. (HTCONDOR-3319)
Fixed a bug where condor_history could take many minutes to read a single line from the history file when the line is many megabytes long. This was causing ingestion of epoch ads into a database to time out. (HTCONDOR-3299)
Fixed a bug where if input file transfer failed, occasionally no descriptive error message about the failure would make it back into the job hold reason. (HTCONDOR-3327)
Fixed a very longstanding bug where extremely fast machines would overflow a 32 bit counter and return -1 for the KFlops slot attribute. (HTCONDOR-3288)
The gpu and gpu-debug queues at Anvil are no longer represented in htcondor annex as whole-node queues, fixing a problem where the “whole node” would have 4 GPUs and only 1 CPU. (HTCONDOR-3324)
Fix problem running PyTorch jobs on multiple GPUs with newer versions of the CUDA library by providing long GPU IDs in the CUDA_VISIBLE_DEVICES environment variable. (HTCONDOR-3350)
Calling htcondor2.Schedd.history() on an object whose corresponding daemon can’t be contacted will no longer cause a segmentation fault. (HTCONDOR-3314)
Fixed a bug where Execution Points using disk enforcement failed when the provided backing LVM_VOLUME_GROUP_NAME and LVM_THINPOOL_NAME contained hyphens. (HTCONDOR-3334)
Add a timeout for all HTTP operations in the grid universe (affects arc, ec2, and gce grid types). (HTCONDOR-3300)
Fixed a bug where administrative tools (e.g. condor_drain) can fail if authentication is not required for queries to the condor_collector. (HTCONDOR-3301)
The condor_startd now correctly removes docker containers that have escaped from docker universe jobs launched by condor_starters that have exited uncleanly. (HTCONDOR-3338)
Annexes no longer download a default configuration tarball, making them more robust. This required upgrading the default version of HTCondor run by annexes to 25.1.0, which can figure the details out on its own. On Delta, annexes now give Delta additional time to start file-transfer plug-ins, so they should be more reliably available there. (HTCONDOR-3220)
condor_adstash no longer skips entire history files when it encounters a line that cannot be decoded to UTF-8. (HTCONDOR-3335)
Fixed a bug in blahpd when /tmp is mounted with the noexec option. (HTCONDOR-3343)
Fixed a bug where, when SINGULARITY_TARGET_DIR and STARTER_NESTED_SCRATCH were both set, some environment variables that pointed into the job’s execute directory were not correct. (HTCONDOR-3265)
Fixed a bug causing some file-transfer plug-ins (box, gdrive, and onedrive) to fail immediately on start-up. (HTCONDOR-3317)
Fixed a bug where condor_qusers would treat the add option as enable when the leading dash was omitted. (HTCONDOR-3284)
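The condor_adstash fix above concerns what happens when one line of a history file is not valid UTF-8. The sketch below shows one possible way to confine the damage to the offending line; it is plain Python written for illustration and is not the condor_adstash implementation. The function name decode_lines and the sample bytes are hypothetical.

```python
def decode_lines(raw: bytes):
    """Yield (line_number, text) for each line that decodes as UTF-8."""
    for number, line in enumerate(raw.splitlines(), start=1):
        try:
            yield number, line.decode("utf-8")
        except UnicodeDecodeError:
            # Skip only this line instead of abandoning the whole file.
            continue

# Line 2 contains bytes that are never valid in UTF-8.
sample = b'Owner = "alice"\nNote = "\xff\xfe"\nJobStatus = 4\n'
good = list(decode_lines(sample))
print(good)  # [(1, 'Owner = "alice"'), (3, 'JobStatus = 4')]
```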
Version 24.12.13
Release Notes:
HTCondor version 24.12.13 released on October 9, 2025.
New Features:
HTCondor tarballs now contain Pelican 7.20.2
The condor package now requires pelican-7.20.2.
Bugs Fixed:
Fixed bug where the message Processing new events... would briefly flash while running condor_watch_q. (HTCONDOR-3244)
Fixed a bug where space and comma would be included in the list of separators for itemdata even if the itemdata had been supplied with the ASCII “unit separator”. This would cause itemdata entries containing spaces (or commas) to be incorrectly interpreted as multiple items, which could manifest as parse errors. You can work around this bug if only one of your entries has spaces and/or commas by moving that entry to the end of the line (or dictionary, if you’re submitting itemdata via Python). (HTCONDOR-3272)
Fixed false positive reporting of ClassAd unit specifier test in condor_upgrade_check. (HTCONDOR-3276)
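The separator behavior described in the itemdata fix above can be illustrated in a few lines. This is a minimal sketch in plain Python, not HTCondor's parser; the function name split_items and the sample file names are hypothetical, and the fallback rule (split on spaces and commas only when no unit separator is present) is an assumption drawn from the description.

```python
import re

UNIT_SEP = "\x1f"  # the ASCII "unit separator" character

def split_items(line: str) -> list[str]:
    """Split on the unit separator if present, else on spaces and commas."""
    if UNIT_SEP in line:
        return line.split(UNIT_SEP)
    return [field for field in re.split(r"[ ,]+", line.strip()) if field]

print(split_items("a.txt, b.txt c.txt"))
# ['a.txt', 'b.txt', 'c.txt']
print(split_items(f"my file.txt{UNIT_SEP}x,y.dat"))
# ['my file.txt', 'x,y.dat']
```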
Version 24.x Feature Releases
We release new features in these releases of HTCondor. The details of each version are described below.
Version 24.12.4
Release Notes:
HTCondor version 24.12.4 released on September 23, 2025.
New Features:
htcondor2.Collector.locate() now checks the Machine attribute (instead of Name) when trying to find condor_startd daemons, if the name argument is not the name of a slot. This allows programmers to call htcondor2.Collector.locate() using the fully-qualified domain name and get a result from a collector whether or not ENABLE_STARTD_DAEMON_AD is set. (HTCONDOR-2911)
Add new configuration knob, LOCAL_UNIVERSE_CGROUP_ENFORCEMENT, which defaults to false. When true, and running on a cgroup-enabled system, local universe jobs must specify request_memory, and if the job exceeds that limit, it will be put on hold. (HTCONDOR-3170)
HISTORY_HELPER_MAX_HISTORY default raised from 10,000 to 2,000,000,000, effectively removing the limit. We now believe that running large remote history queries will not have a negative impact on the condor_schedd. (HTCONDOR-3215)
condor_userprio no longer needs the negotiator to be running if it only needs the collector to get userprio data. (HTCONDOR-3221)
Docker universe jobs now support the config parameter DOCKER_NETWORK_NAME which defaults to “docker0”. Setting this appropriately allows condor_chirp and htcondor.htchirp to work correctly inside a docker universe job. (HTCONDOR-3197)
condor_watch_q will now exit upon key press from keyboard. (HTCONDOR-3199)
Added new NumInputTransferStarts and NumOutputTransferStarts counters to the job record. (HTCONDOR-3194)
If the file listed in log is in a directory that does not exist, HTCondor will now try to create that directory (as the user). Previously, it ran the job without a user log file. (HTCONDOR-3210)
Added new knob, STARTER_NESTED_SCRATCH, which defaults to false. When false, the job scratch directory (named dir_XXX) is the immediate child of the EXECUTE directory. When true, the job’s scratch directory is a subdirectory thereof, named “scratch”, with the intention that all condor metadata files will be moved to peer directories, and not pollute the job’s scratch directory with condor control files. (HTCONDOR-3080)
Added optional second argument for the macro expansion function $BASENAME() that specifies the file extension or suffix to remove. (HTCONDOR-3206)
condor_adstash now parses and stores per-attempt transfer error data when reading transfer epoch history. (HTCONDOR-3122)
On Windows, ResidentSetSize and thus MemoryUsage will now be updated in the startd as frequently as ImageSize so that the default PREEMPT and WANT_HOLD policy expressions will evict jobs that go over Memory in a timely manner. (HTCONDOR-3224)
Updated condor_upgrade_check to test for well known gotchas between v24 and v25 of HTCondor installations. (HTCONDOR-3209)
HTCondor tarballs now contain Pelican 7.19.3
The condor package now requires pelican-7.19.3.
The condor RPM package now requires at least apptainer version 1.4.2.
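The idea behind the new optional second argument to $BASENAME() (take the file name, then drop a given suffix) can be sketched in Python. This is modeled on the POSIX basename utility and is only an assumption about the general behavior, not a statement of the macro's exact rules; the function and the sample path are hypothetical.

```python
import posixpath

def basename(path: str, suffix: str = "") -> str:
    """Return the last path component, minus `suffix` when it matches the end."""
    name = posixpath.basename(path)
    # Like the POSIX utility, leave the name alone if it equals the suffix.
    if suffix and name != suffix and name.endswith(suffix):
        return name[: -len(suffix)]
    return name

print(basename("/data/run7/sample.tar.gz"))             # sample.tar.gz
print(basename("/data/run7/sample.tar.gz", ".tar.gz"))  # sample
```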
Bugs Fixed:
Allow condor_submit -i to work with shell. (HTCONDOR-3208)
Fixed bug when STARTER_NESTED_SCRATCH is true and SINGULARITY_TARGET_DIR was also set. (HTCONDOR-3243)
Fixed bug that prevented emails being sent to users when jobs went on hold and notification was set to error. (HTCONDOR-3219)
Fixed bug introduced in 24.11 that caused the condor_schedd to occasionally crash when DAGMan completed. (HTCONDOR-3250)
Fixed a bug where the blahpd would fail when using Python 3.12 or later. (HTCONDOR-3225)
Fixed bug where jobs held due to exceeding disk usage had the same HoldReasonCode and HoldReasonSubCode as jobs that exceeded memory usage. (HTCONDOR-3248)
Fixed a bug where Flock Collectors could be forgotten after having a connection interruption. (HTCONDOR-3200)
Fixed a bug where a condor_schedd would not attempt to flock to collectors in the FLOCK_TO list when a communication failure occurred to a collector earlier in the list. (HTCONDOR-3200)
Fixed a packaging bug in tarballs for RPM-based systems which resulted in the Pelican file-transfer plug-in not being enabled by default. (HTCONDOR-3239)
Version 24.11.2
Release Notes:
HTCondor version 24.11.2 released on August 21, 2025.
New Features:
Added new job ClassAd attributes NumVacates and NumVacatesByReason. These attributes provide counts about why a job left the running state without completing (i.e. was vacated from the execution point). (HTCONDOR-3204)
Added new “notification = start” option to condor_submit, which sends an email when the job starts for the first time. (HTCONDOR-3133)
New configuration parameter TRUSTED_VAULT_HOSTS can be used to restrict which Vault servers the condor_credd will accept credentials for. (HTCONDOR-3136)
The new Python API now includes htcondor2.ping(), which operates like htcondor.SecMan.ping() in the old API. (HTCONDOR-3180)
The htcondor annex tool now has (limited) support for AWS’ EC2 annexes. The condor_annex tool has been withdrawn; use htcondor annex instead. (HTCONDOR-1630)
Added new python method “get_claims” to the schedd object, which returns the classads of the claimed slots. (HTCONDOR-3181)
Add --version flag to the htcondor tool. (HTCONDOR-3091)
Initial support for Debian 13 (trixie). (HTCONDOR-3212)
HTCondor tarballs now contain Pelican 7.18.1
The condor package now requires pelican-7.18.1.
HTCondor tarballs now contain Apptainer 1.4.2
Bugs Fixed:
When responding to a ping request for the ALLOW authorization level, daemons no longer require authentication. (HTCONDOR-3195)
Fixed a bug with docker universe jobs that have a PostCmd. This PostCmd script was not passed the environment variables _CONDOR_MAINJOB_EXIT_CODE or _CONDOR_MAINJOB_EXIT_SIGNAL. (HTCONDOR-3185)
Fixed a bug where the ImageSize attribute in a slot on a Windows EP was a large random value while the slot was running a job. The bug was due to a change in the Win32 API where the total virtual memory of a process is no longer reported. From now on ImageSize on a Windows EP will have the same value as ResidentSetSize. (HTCONDOR-3179)
Fixed a bug where an old client (version 9.0 or earlier) with lax security settings (authentication, encryption, and integrity all disabled or optional) would fail to communicate with a daemon with stronger security settings. (HTCONDOR-3189)
condor_token_request no longer fails with an error if the token is automatically approved by the daemon. (HTCONDOR-239)
Version 24.10.3
HTCondor version 24.10.3 released on August 12, 2025.
New Features:
None.
Bugs Fixed:
Fixed a bug introduced in version 24.10.2 that caused condor_store_cred add to fail. (HTCONDOR-3213)
Version 24.10.2
Release Notes:
HTCondor version 24.10.2 released on July 28, 2025.
New Features:
In the condor_job_router, the old ClassAd-based route syntax (specified using JOB_ROUTER_ENTRIES and JOB_ROUTER_DEFAULTS) is no longer supported. (HTCONDOR-3118)
Added new condor_dag_checker tool for users to check DAG files for syntactical and logical errors prior to submission. (HTCONDOR-3088)
Improvements to condor_q for held jobs. The hold code and subcode are now displayed as part of the -hold option. A new option -hold-codes displays the first job for each unique hold code and subcode. (HTCONDOR-3127)
Added new -lvm option to condor_status to view current disk usage of slots enforcing disk limits. This option can be paired with -startd to show information about execution points enforcing disk limits. (HTCONDOR-3119)
The new Python API now includes htcondor2.disable_debug(), which is intended for interactive use (after debugging a problem). (HTCONDOR-3003)
Some errors on the EP that occurred after the AP had released the corresponding claim could cause a slot to remain claimed until the job lease timeout had expired. This change should reduce incidents of this behavior. (HTCONDOR-3028)
Improvements to observability of common files transfer, including new entries in the shadow and starter daemon logs; a new CommonFiles event in the job/user event log; and a new transfer entry in the epoch history. (HTCONDOR-3052)
HTCondor tarballs now contain Pelican 7.17.2
The condor package now requires pelican-7.17.2.
Bugs Fixed:
Fixed a bug in condor_q default output where counts of jobs could be truncated to 6 digits. (HTCONDOR-3106)
Fixed a bug introduced in HTCondor version 24.8.0 where a job in Suspended status wouldn’t change to Idle status when evicted from an EP. This resulted in the job not being considered for scheduling, among other problems. (HTCONDOR-3174)
Execution Points enforcing disk limits will now subtract the size of pre-existing logical volumes from the advertised available disk. Any logical volumes associated with HTCondor are not subtracted. (HTCONDOR-3119)
Fixed a bug where the condor_credd mistakenly thought a Vault-managed OAuth2 credential was a plain user-provided access token. (HTCONDOR-3084)
Attempting to send common files to startds whose sinful string is more than 256 characters will no longer cause a shadow exception. (HTCONDOR-3128)
Fixed a memory leak in the condor_schedd that could be triggered by checkpointing. (HTCONDOR-3104)
Fixed an issue where a job may take an additional 20 minutes to be scheduled to run after leaving cool-down mode. See configuration knob SYSTEM_ON_VACATE_COOL_DOWN for more information about job cool-down mode. (HTCONDOR-3059)
Fixed a bug where the specific vacate reason wasn’t reported in the job event log or the job ad when a job was evicted from the EP. The message Unspecified job interruption and code 1005 were used instead. (HTCONDOR-3117)
Fixed a bug that could cause the startd’s mips benchmark to run forever, consuming a CPU core and resulting in the Mips attribute being undefined. (HTCONDOR-3134)
condor_ssh_to_job now works correctly when TMP is set to longer paths, such as when running under glidein. (HTCONDOR-3163)
Fixed a bug in htcondor job status and htcondor dag status that caused some time information to be displayed incorrectly. (HTCONDOR-3112)
Fixed a bug where the Machine and Job ClassAds would fail to be written into job scratch directories on Execution Points using STARTD_ENFORCE_DISK_LIMITS. (HTCONDOR-3156)
Version 24.9.2
Release Notes:
HTCondor version 24.9.2 released on June 26, 2025.
New Features:
Added new job ClassAd attribute TransferInputFileCounts. (HTCONDOR-3024)
Added new SCHEDD_DAEMON_HISTORY file for the Schedd to periodically write historical ClassAd records into. These records can be queried via condor_history using the new -daemon option or via htcondor2.Schedd.daemonHistory(). (HTCONDOR-3061)
condor_watch_q will now display tracking information for DAGMan jobs specified via the -clusters option. (HTCONDOR-3068)
Improved logging on the EP when a slot cannot be claimed because the Start expression evaluates to false. When this happens, analysis of the slot Requirements expression will be written to the StartLog. (HTCONDOR-3033)
Initial support for Enterprise Linux 10, including the x86_64_v2 platform. (HTCONDOR-3090)
HTCondor tarballs now contain Pelican 7.17.0
The condor package now requires pelican-7.17.0.
Bugs Fixed:
Fixed a bug which could cause unnecessary activation failures if the previous job in the slot failed to transfer its output. This would manifest as slots being in the claimed/idle state for far longer than necessary. (HTCONDOR-3073)
Fixed a bug in the Vault credential monitor where access tokens were failing to be generated from Vault tokens when AUTH_SSL_CLIENT_CAFILE and/or AUTH_SSL_CLIENT_CADIR were undefined. (HTCONDOR-3086)
Fixed a bug in htcondor2.Schedd where it didn’t work to use a job_spec parameter to specify a cluster ID as an integer, as a string without a proc ID, or in a list of such strings. (HTCONDOR-2979)
The results of key in htcondor2.param and key in htcondor2.param.keys() now match for keys which are defined to have no value. (Previously, such keys would be returned by keys().) (HTCONDOR-3085)
Fixed a memory leak in the condor_schedd when late materialization is used. (HTCONDOR-3096)
Fixed a bug where the condor_master would not start up on systems where ulimit -n was close to 2^31 file descriptors. (HTCONDOR-3079)
Fixed a bug in condor_adstash that would prevent ads from being ingested to Elasticsearch or OpenSearch when a lookup of indexes did not start with a writable index. (HTCONDOR-3109)
Fixed a bug where, when DAGMan’s log file was on a full filesystem, DAGMan would not exit with the correct log file full exit code. (HTCONDOR-3066)
The submit commands kill_sig, remove_kill_sig, and hold_kill_sig are now ignored for Windows jobs. These control Unix process signals, which are not relevant on Windows. (HTCONDOR-3078)
Fixed a bug introduced in HTCondor 23.0.21, 23.10.21, and 24.0.5 that caused the condor_gridmanager to fail at startup if GRIDMANAGER_LOG_APPEND_SELECTION_EXPR was set to True. (HTCONDOR-3099)
The plug-in used to handle http and https URLs now sets the user-agent header by default (condor_curl_plugin/0.2). Web servers which return a 403/Forbidden error when the user-agent is empty will now function as expected. (HTCONDOR-3121)
Version 24.8.1
Release Notes:
HTCondor version 24.8.1 released on June 12, 2025.
New Features:
On Linux systems with cgroups enabled, jobs are now put in a “.scope” sub-cgroup of the per-job “.slice” cgroup. This makes it easier for pilot or glidein systems to further subdivide the job’s cgroup. (HTCONDOR-3008)
On Linux systems, added support to put each condor daemon in its own cgroup with the knob CGROUP_ALL_DAEMONS. (HTCONDOR-3032)
The execute point now sets the execute permission bit on the executable even when it was transferred by a plugin. This is helpful when using pelican or osdf to transfer the job’s main executable. (HTCONDOR-3020)
Add a new configuration knob, STARTER_SETS_HOME_ENV, which defaults to true. When true, the job will have the HOME environment variable set to whatever it is on the system. When false, HOME will not be set to anything. (HTCONDOR-3010)
Added new halt and resume verbs to htcondor dag as a first-class way to halt a DAG. (HTCONDOR-2898)
Added new htcondor2.DAGMan class to the python API for sending commands to a running DAGMan process. (HTCONDOR-2898)
Added DAGMAN_NODE_JOB_FAILURE_TOLERANCE to inform DAGMan when to consider a placed job list as failed when job failures occur. (HTCONDOR-3019)
htcondor ap status will now show the RecentDaemonCoreDutyCycle of each reported Access Point’s condor_schedd. (HTCONDOR-3009)
condor_adstash can now be configured to fetch a custom projection of attributes for job (epoch) ClassAds. (HTCONDOR-2680)
condor_status will now accept -totals (previously just -total) to better match other tools with the similar option. (HTCONDOR-3044)
Improve diagnostics in the shadow when it fails to activate a claim. (HTCONDOR-3035)
The directory for LOCAL_UNIV_EXECUTE is no longer made world-writable. (HTCONDOR-3036)
Augment the libvirt_simple_script.awk script to provide needed UEFI boot information for ARM virtual machines. (HTCONDOR-3006)
The condor_upgrade_check script has been folded into the main condor package. (HTCONDOR-2995)
HTCondor tarballs now contain Pelican 7.16.5
The condor package now requires pelican-7.16.5.
Pelican 7.16.5 now includes end-to-end integrity checks for clients
HTCondor tarballs now contain Apptainer 1.4.1
Bugs Fixed:
Fixed a bug in the EP preventing a claimed slot from being re-used to run multiple jobs. The ability for an AP to run multiple jobs on the same claimed slot (i.e. without needing to go back to the central manager) is a critical scalability feature in HTCSS, especially when running large numbers of short-running jobs. The bug fixed here was introduced in HTCondor version 24.5.1, so if you are running HTCondor v24.5.x, v24.6.x, or v24.7.x, and run large numbers of short jobs, please consider upgrading. See the JIRA ticket for additional workarounds if you cannot upgrade. (HTCONDOR-3045)¶
On Linux and macOS, when using dataflow jobs, HTCondor now checks the modification times of dataflow nodes with sub-second accuracy. Previously, it just used seconds, which means that it might incorrectly not skip a dataflow job that it should have skipped if the output file was written in the same second as the input file. (HTCONDOR-3027)¶
Fixed condor_watch_q to output a useful error message and not exit when one of log files associated with jobs being tracked does not exist. (HTCONDOR-2978)¶
Removed job attribute ToE. It has been replaced by job attributes VacateReason, VacateReasonCode, and VacateReasonSubCode. (HTCONDOR-2974)¶
The SlotName field in the job event log is now correct in the case where a condor_startd has a non-default name. (HTCONDOR-3047)¶
Fixed a bug where htcondor2.enable_debug() would cause the Python interpreter to exit if the debug log was configured incorrectly. (HTCONDOR-3004)¶
Removed some memory leaks from version 2 of the Python bindings. (HTCONDOR-2981)¶
Fixed a bug introduced in HTCondor 24.7.0 which would lead to the directory 0 (and subdirectories) spuriously being created in the SPOOL directory. (HTCONDOR-3026)¶
Fixed a bug introduced in 24.0.7 and 24.7.3 where, on Linux cgroup v1 systems, jobs that were killed by the out-of-memory killer were considered completed instead of being put on hold. (HTCONDOR-3094)¶
The htcondor2.Credd initializer now properly raises a TypeError if the location argument isn’t a classad2.ClassAd, rather than failing to raise a TypError. (HTCONDOR-2993)¶
Fixed bug where DAGMAN_MAX_JOBS_IDLE value was not being respected by DAGMan even if no limit was specified via condor_submit_dag’s -maxidle option. (HTCONDOR-3011)¶
Fixed some bugs with parallel universe jobs that can cause the condor_schedd to crash. (HTCONDOR-3049)¶
Fixed a bug that caused the condor_starter to crash if a job was vacated during input file transfer. (HTCONDOR-3016)¶
condor_watch_q will now correctly display the full range of job ids for each batch grouping. (HTCONDOR-2992)¶
Fixed a bug in one of condor_upgrade_check's tests that would cause the test to check for an incorrect hostname when running inside of a container. (HTCONDOR-3014)¶
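The dataflow timestamp fix above (HTCONDOR-3027) comes down to timestamp resolution. The following is a minimal Python sketch, not HTCondor's implementation: the function names and the simplified rule "the output counts as up to date only when it is strictly newer than the input" are assumptions made for illustration.

```python
import os

NS_PER_SECOND = 1_000_000_000


def newer_whole_seconds(output_mtime_ns: int, input_mtime_ns: int) -> bool:
    """Compare after truncating both timestamps to whole seconds."""
    return output_mtime_ns // NS_PER_SECOND > input_mtime_ns // NS_PER_SECOND


def newer_subsecond(output_mtime_ns: int, input_mtime_ns: int) -> bool:
    """Compare at full nanosecond resolution."""
    return output_mtime_ns > input_mtime_ns


def file_mtime_ns(path: str) -> int:
    """Return a file's modification time in nanoseconds via os.stat."""
    return os.stat(path).st_mtime_ns


# Hypothetical timestamps: input written at t = 100.2 s,
# output written at t = 100.7 s, so both fall in the same second.
input_ns = 100 * NS_PER_SECOND + 200_000_000
output_ns = 100 * NS_PER_SECOND + 700_000_000

print(newer_whole_seconds(output_ns, input_ns))  # False: output looks no newer
print(newer_subsecond(output_ns, input_ns))      # True: output is newer
```

With whole-second comparison the output file does not look newer than the input, which matches the case the note describes, where a job that should have been skipped was not.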
Version 24.7.3
Release Notes:
HTCondor version 24.7.3 released on April 22, 2025.
New Features:
Improved the ability of condor_who to query condor_startd processes when condor_who is running as root or as the same user as the Startd, and added formatting options for use when the condor_startd is running as a job on another batch system. (HTCONDOR-2927)¶
htcondor credential add oauth2 can now be used to store tokens that can be used by jobs via use_oauth_services. The user is responsible for updating tokens that can expire. (HTCONDOR-2803)¶
Added OSHomeDir to starter’s copy of the job ad. (HTCONDOR-2972)¶
Added SYSTEM_MAX_RELEASES, which implements an upper bound on the number of times any job can be released by a user or periodic expression. (HTCONDOR-2926)¶
Added the ability for an EP administrator to disable access to the network by a job, by setting NO_JOB_NETWORKING to true. (HTCONDOR-2967)¶
Added the ability for a docker universe job to fetch an authenticated image from the docker repository. (HTCONDOR-2870)¶
Improved condor_watch_q to display information about the number of jobs actively transferring input or output files. (HTCONDOR-2958)¶
The default value for DISABLE_SWAP_FOR_JOB has been changed to True. This provides a more predictable and uniform user experience for jobs running on different EPs. (HTCONDOR-2960)¶
Added the htcondor annex login verb, which opens a shared SSH connection to the named HPC system. If you’ve recently created or added an annex at a particular system, it will re-use that cached connection; otherwise, you’ll have to log in again, but that connection will then be re-usable by other htcondor annex commands. (HTCONDOR-2809)¶
Updated htcondor annex to work with Expanse’s new requirements for its gpu and gpu-shared queues. (HTCONDOR-2634)¶
Enhanced htcondor job status to also show the time to transfer the job input sandbox. (HTCONDOR-2959)¶
Jobs that use concurrency_limits can now re-use claims in the schedd. (HTCONDOR-2937)¶
Added shell for Linux systems. (HTCONDOR-2918)¶
START_VANILLA_UNIVERSE expressions may now refer to attributes in the schedd ad using the prefix SCHEDD. (HTCONDOR-2919)¶
Hold messages generated by failure to transfer output now include how many files failed to transfer. (HTCONDOR-2903)¶
Added -transfer-history flag to condor_history to query historical Input, Output, and Checkpoint transfer ClassAds stored in the JOB_EPOCH_HISTORY files. (HTCONDOR-2878)¶
Improved the parsing and handling of syntax errors in the transfer_output_remaps submit command. (HTCONDOR-2920)¶
DAGMan SERVICE nodes will no longer be removed automatically when a DAG contains a FINAL node and condor_rm is used on the DAGMan scheduler job. (HTCONDOR-2938)¶
The list of files generated by the manifest submit command now recursively includes subdirectories. (HTCONDOR-2903)¶
Added new option -extract to condor_history to copy historical ClassAd entries that match a provided constraint to a specified file. (HTCONDOR-2923)¶
EPs using disk enforcement via LVM and LVM_HIDE_MOUNT = True will now advertise HasVM = False due to VM universe jobs being incompatible with mount namespaces. (HTCONDOR-2945)¶
Added support for running Docker universe on ARM hosts. (HTCONDOR-2906)¶
The CLAIMTOBE authentication protocol now fully qualifies user names with the system’s $(UID_DOMAIN). To revert to the former semantics, set SEC_CLAIMTOBE_INCLUDE_DOMAIN to false. (HTCONDOR-2915)¶
The condor_startd now distributes the LoadAvg assigned to a partitionable slot to the idle resources of the partitionable slot, and then to the dynamic slots. Machines that have only a single partitionable slot will now have the same behavior under a use policy:DESKTOP as they did in version 23.10.18 and 24.0.1. (HTCONDOR-2901)¶
HTCondor tarballs now contain Pelican 7.15.1
The condor package now requires pelican-7.15.1. The weak dependency is no longer used, because dnf would not update to the requested pelican version.
HTCondor tarballs now contain Apptainer 1.4.0
The condor RPM package now requires at least apptainer version 1.3.6.
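As an illustration of the CLAIMTOBE change above (HTCONDOR-2915), here is a minimal Python sketch of what qualifying a user name with a domain looks like. It is not HTCondor code: the function name is hypothetical, the `user@domain` form is the usual way HTCondor writes qualified user names, and treating the former semantics as "the bare user name" is an assumption.

```python
def claimtobe_identity(user: str, uid_domain: str, include_domain: bool = True) -> str:
    """Return the identity string used in this sketch.

    include_domain mirrors the intent of SEC_CLAIMTOBE_INCLUDE_DOMAIN.
    Returning the bare name when it is False is an assumption about the
    former behavior, not something the release note states.
    """
    if include_domain and "@" not in user:
        return f"{user}@{uid_domain}"
    return user


# Hypothetical user and domain values.
print(claimtobe_identity("alice", "example.org"))         # alice@example.org
print(claimtobe_identity("alice", "example.org", False))  # alice
```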
Bugs Fixed:
Fixed a bug in the local issuer credential monitor that prevented the issuance of tokens using the WLCG profile. (HTCONDOR-2954)¶
Fixed bug where DAGMan would output an error message containing garbage when dumping failed node information to the debug log. (HTCONDOR-2899)¶
Fixed a bug where EPs using STARTD_ENFORCE_DISK_LIMITS would mark a slot as broken when the condor_starter fails to remove the ephemeral logical volume but the condor_startd successfully removes the LV. (HTCONDOR-2953)¶
Fixed a bug in the Vault credential monitor that kept credentials from being fetched if VAULT_CREDMON_PROVIDER_NAMES was unset. Introduced in HTCondor 24.3.0. (HTCONDOR-2912)¶
Fixed a bug in the local issuer credential monitor that kept credentials from being issued if LOCAL_CREDMON_TOKEN_VERSION (or named variant) was not set. (HTCONDOR-2965)¶
When using delegated cgroup v2, HTCondor no longer reports that the main job (often a pilot) has an out-of-memory condition when only the sub-job has hit an OOM. (HTCONDOR-2944)¶
Fixed a bug that could cause the condor_starter to crash when running docker universe jobs with custom volume mounts. (HTCONDOR-2890)¶
Fixed a bug preventing spooled or remote jobs using preserve_relative_paths from working. (HTCONDOR-2877)¶
The condor_kbdd now also looks in the XDG_RUNTIME_DIRECTORY when trying to find an XAuthority file to use to connect to a local X server. (HTCONDOR-2921)¶
Fixed a bug that prevented daemons from updating their ads in the condor_collector when authentication is disabled but encryption or integrity is enabled. (HTCONDOR-2888)¶
Fixed a bug in condor_adstash that caused it to fail to discover condor_startd daemons using ENABLE_STARTD_DAEMON_AD (enabled by default since HTCondor 23.9). (HTCONDOR-2908)¶
Fixed a bug with transfer_output_remaps when given an erroneous trailing semicolon. (HTCONDOR-2910)¶
Fixed some bugs with parallel universe jobs that can cause the condor_schedd to crash. (HTCONDOR-3049)¶
Fixed inflated memory usage reporting for docker universe jobs on hosts using cgroups V2. The reported memory no longer includes the cached memory. (HTCONDOR-2961)¶
Fixed a bug where specifying transfer_output_remaps from a path which didn’t exist to a file:// URL would cause HTCondor to report a useless (albeit correct) error. (HTCONDOR-2790)¶
Fixed a bug that could cause the condor_shadow daemon to crash when the transfer_input_files list was very long (thousands of characters). (HTCONDOR-2859)¶
Fixed a bug where two different condor_gridmanager processes could attempt to manage the same jobs when GRIDMANAGER_SELECTION_EXPR evaluated to UNDEFINED or an empty string for any job. (HTCONDOR-2895)¶
Fixed a rare bug in the condor_schedd that could cause a repeated restart loop when PER_JOB_HISTORY_DIR is set. (HTCONDOR-2902)¶
X.509 proxy delegation no longer fails when using OpenSSL 3.4.0 or later. (HTCONDOR-2904)¶
Fixed a bug that could cause the condor_gridmanager to crash when there were ARC CE jobs with no X509UserProxy. (HTCONDOR-2907)¶
Fixed a bug that usually prevented manifest from populating the in and out files. (HTCONDOR-2916)¶
Fixed a bug that could cause a job submission to fail if a previous job submission to the same condor_schedd failed. (HTCONDOR-2917)¶
Fixed a bug where daemons wouldn’t immediately apply new security policy to incoming commands after a reconfigure. (HTCONDOR-2929)¶
Fixed a bug where condor_history would crash when reading a history file larger than 2GB in the default mode (backwards). (HTCONDOR-2933)¶
Fixed a bug that caused the ce-audit plugin to fail. (HTCONDOR-2963)¶
Removed a scary-looking message in the log of the condor_collector about denying NEGOTIATOR-level authorization when the client wasn’t requesting that authorization level. (HTCONDOR-2964)¶
Fixed a bug that caused most updates of collector ads via UDP to be rejected. (HTCONDOR-2975)¶
Fixed a bug where the condor_shadow would wait for the job lease to expire (usually 40 minutes) before returning a job to idle status when the condor_starter failed to initialize. (HTCONDOR-2997)¶
The condor_startd now checks to see if the START expression of a static slot still evaluates to true before it allows a slot to be claimed. This helps to give an accurate reply to the condor_schedd when it tries to claim a slot with a START expression that changes frequently. (HTCONDOR-3013)¶
Version 24.6.1
Release Notes:
HTCondor version 24.6.1 released on March 27, 2025.
New Features:
None.
Bugs Fixed:
Security Item: This release of HTCondor fixes a security-related bug described at
Version 24.5.2
Release Notes:
HTCondor version 24.5.2 released on March 20, 2025.
New Features:
None.
Bugs Fixed:
The default value for STARTD_LEFTOVER_PROCS_BREAK_SLOTS has been changed to ‘False’. When ‘True’, the EP was erroneously marking slots as broken. (HTCONDOR-2946)¶
Version 24.5.1
Release Notes:
HTCondor version 24.5.1 released on March 4, 2025.
New Features:
The condor_starter now advertises StdoutMtime and StderrMtime, which represent the most recent modification times, in seconds since the epoch, of the standard output and standard error files of a job that uses file transfer. (HTCONDOR-2837)¶
The condor_startd, when running on a machine with NVIDIA GPUs, now advertises the NVIDIA driver version. (HTCONDOR-2856)¶
Increased the default width of condor_qusers output when redirected to a file or piped to another command to prevent truncation. (HTCONDOR-2861)¶
The condor_startd will no longer lose track of, and leak, logical volumes that it failed to clean up when using STARTD_ENFORCE_DISK_LIMITS. The condor_startd will now periodically retry removal of logical volumes with an exponential back off. (HTCONDOR-2852)¶
The condor_startd will now keep dynamic slots that have a SlotBrokenReason attribute in Unclaimed state rather than deleting them when they change state to Unclaimed. A new configuration variable CONTINUE_TO_ADVERTISE_BROKEN_DYNAMIC_SLOTS controls this behavior. It defaults to true but can be set to false to preserve the old behavior. This change also adds a new attribute BrokenContextAds to the daemon ad of the condor_startd. This attribute has a ClassAd for each broken resource in the startd. condor_status has been enhanced to use this new attribute to display more information about the context of broken resources when both -startd and -broken arguments are used. (HTCONDOR-2844)¶
The condor_startd will now permanently reduce the total slot resources advertised by a partitionable slot when a dynamic slot is deleted while it is marked as broken. The amount of reduction will be advertised in new attributes such as BrokenSlotCpus so that the original size of the slot can be computed. (HTCONDOR-2865)¶
Daemons will now more quickly discover when a non-responsive condor_collector has recovered and resume advertising to it. (HTCONDOR-2605)¶
Jobs can now request user credentials generated by any combination of the OAuth2, Local Issuer, and Vault credential monitors on the AP. Remote submitters can request these credentials without having any of the CREDMON-related parameters in their configuration files. (HTCONDOR-2851)¶
HTCondor tarballs now contain Pelican 7.13.0
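The logical volume cleanup item above (HTCONDOR-2852) mentions retrying with an exponential back off. The Python sketch below shows the general technique only. The initial delay, growth factor, and ceiling are hypothetical values chosen for illustration; the release note does not state the intervals the condor_startd actually uses.

```python
from typing import Iterator, List


def backoff_delays(initial: float, factor: float, cap: float) -> Iterator[float]:
    """Yield retry delays that grow geometrically up to a ceiling."""
    delay = initial
    while True:
        yield min(delay, cap)
        delay *= factor


def first_delays(count: int, initial: float = 60.0, factor: float = 2.0,
                 cap: float = 3600.0) -> List[float]:
    """Collect the first `count` delays (all defaults are hypothetical)."""
    delays = backoff_delays(initial, factor, cap)
    return [next(delays) for _ in range(count)]


print(first_delays(8))
# [60.0, 120.0, 240.0, 480.0, 960.0, 1920.0, 3600.0, 3600.0]
```

Capping the delay keeps retries from becoming arbitrarily rare while still reducing load when a removal keeps failing.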
Bugs Fixed:
Fixed a bug where the condor_gridmanager would write to log file GridmanagerLog.root after a reconfiguration. (HTCONDOR-2846)¶
htcondor annex shutdown now works again. (HTCONDOR-2808)¶
Fixed a bug where the job state table DAGMan prints to its debug file could contain a negative number for the count of failed jobs. (HTCONDOR-2872)¶
Fixed a bug where chirp would not work in container universe jobs using the docker runtime. (HTCONDOR-2866)¶
Fixed a bug where referencing htcondor2.JobEvent.cluster could crash if the processed log event was not associated with job(s) (i.e. had a negative value). (HTCONDOR-2881)¶
Fixed a bug that caused the condor_gridmanager to abort if a job that it was managing disappeared from the job queue (i.e. due to someone running condor_rm -force). (HTCONDOR-2845)¶
Fixed a bug that caused grid ads from different Access Points to overwrite each other in the collector. (HTCONDOR-2876)¶
Fixed a memory leak that can occur in any HTCondor daemon when an invalid ClassAd expression is encountered. (HTCONDOR-2847)¶
Fixed a bug that caused daemons to go into infinite recursion, eventually crashing when they ran out of stack memory. (HTCONDOR-2873)¶
Version 24.4.0
Release Notes:
HTCondor version 24.4.0 released on February 4, 2025.
New Features:
Improved validation and cleanup of EXECUTE directories. The EXECUTE directory must now be owned by the condor user when the daemons are started as root. The condor_startd will not attempt to clean an invalid EXECUTE directory nor will it alter the file permissions of an EXECUTE directory. (HTCONDOR-2789)¶
For batch grid universe jobs, the PATH environment variable values from the job ad and the worker node environment are now combined. Previously, only the PATH value from the job ad was used. The old behavior can be restored by setting blah_merge_paths=no in the blah.config file. (HTCONDOR-2793)¶
Many small improvements to condor_q -analyze and -better-analyze for pools that use partitionable slots. As a part of this, the condor_schedd was changed to provide match information for the auto-cluster of the job being analyzed, which condor_q will report if it is available. (HTCONDOR-2720)¶
The condor_startd now advertises a new attribute, SingularityUserNamespaces, which is true when apptainer or singularity work and are using Linux user namespaces, and false when it is using setuid mode. (HTCONDOR-2818)¶
The condor_startd daemon ad now contains attributes showing the average and total bytes transferred to and from jobs during its lifetime. (HTCONDOR-2721)¶
The condor_credd daemon no longer listens on port 9620 by default, but rather uses the condor_shared_port daemon. (HTCONDOR-2763)¶
DAGMan will now periodically print a table regarding states of jobs placed to the Access Point to the debug log (*.dagman.out). The rate at which this table is printed is dictated by DAGMAN_PRINT_JOB_TABLE_INTERVAL. (HTCONDOR-2794)¶
For arc grid universe jobs, the new submit command arc_data_staging can be used to supply additional elements to the DataStaging block of the ARC ADL that HTCondor constructs. (HTCONDOR-2774)¶
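The PATH change for batch grid universe jobs above (HTCONDOR-2793) says the two PATH values are now combined, without giving details. The Python sketch below shows one way such a merge could work. The ordering (job ad entries first) and the removal of duplicates are assumptions, and the function name is hypothetical; this is not the blahp implementation.

```python
from typing import List


def merge_paths(job_path: str, node_path: str, sep: str = ":") -> str:
    """Combine two PATH strings, keeping the first occurrence of each entry.

    Entries from the job ad come first in this sketch (an assumption).
    Empty entries are dropped.
    """
    merged: List[str] = []
    seen = set()
    for entry in job_path.split(sep) + node_path.split(sep):
        if entry and entry not in seen:
            seen.add(entry)
            merged.append(entry)
    return sep.join(merged)


# Hypothetical PATH values for the job ad and the worker node.
print(merge_paths("/opt/job/bin:/usr/bin", "/usr/local/bin:/usr/bin"))
# /opt/job/bin:/usr/bin:/usr/local/bin
```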
Bugs Fixed:
Changed the numeric output of htcondor job status so that the rounding to megabytes, gigabytes, etc. matches the binary definitions the rest of the tools use. (HTCONDOR-2788)¶
Fixed a bug in the negotiator that caused it to crash when matching offline ads. (HTCONDOR-2819)¶
Fixed a memory leak in the schedd that could be caused by SCHEDD_CRON scripts that generate standard error output. (HTCONDOR-2817)¶
Fixed a bug that caused the condor_schedd to crash with a segmentation fault if a condor_off -fast command was run while a schedd cron script was running. (HTCONDOR-2815)¶
Fixed issue where EPs using STARTD_ENFORCE_DISK_LIMITS would fill up the EP’s filesystem due to excessive saving of metadata to /etc/lvm/archive. (HTCONDOR-2791)¶
Fixed bug where container_service_names did not work. (HTCONDOR-2829)¶
Fixed very rare bug that could cause the condor_startd to crash when the condor_collector times out queries and DNS is running very slowly. (HTCONDOR-2831)¶
Updated condor_upgrade_check to test for use of PASSWORD authentication and warn about the authenticated identity changing. (HTCONDOR-2823)¶
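The htcondor job status rounding fix above (HTCONDOR-2788) refers to binary unit definitions, where each unit is 1024 times the previous one. The Python sketch below illustrates that convention. The function name, unit labels, and one-decimal output format are hypothetical and are not taken from the tool's actual output.

```python
UNITS = ["B", "KB", "MB", "GB", "TB"]


def format_binary(num_bytes: int) -> str:
    """Format a byte count using binary steps (1 KB = 1024 B in this sketch)."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit where the value fits, or at the largest unit.
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:.1f} {unit}"
        value /= 1024
    return f"{value:.1f} {UNITS[-1]}"  # not reached; kept for completeness


print(format_binary(1536))           # 1.5 KB
print(format_binary(1024 ** 3))      # 1.0 GB
print(format_binary(1_000_000_000))  # 953.7 MB
```

The last line shows why the definition matters: one billion bytes is 1.0 GB under decimal units but about 953.7 MB under binary units.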
Version 24.3.0
Release Notes:
HTCondor version 24.3.0 released on January 6, 2025.
New Features:
Updated the condor_credmon_oauth and created a new condor-credmon-multi RPM package which, when installed, allows user credentials added via Vault and user credentials generated via a local issuer to exist simultaneously without conflict (e.g. the Vault credmon will not attempt to refresh locally issued credentials). (HTCONDOR-2408)¶
Added singularity launcher wrapper script that runs inside the container and launches the job proper. If this fails to run, HTCondor detects there is a problem with the container runtime, not the job, and reruns the job elsewhere. Controlled by parameter SINGULARITY_USE_LAUNCHER. (HTCONDOR-1446)¶
EPs using STARTD_ENFORCE_DISK_LIMITS will now advertise IsEnforcingDiskUsage in the machine ad. (HTCONDOR-2734)¶
Added new AUTO option to LVM_HIDE_MOUNT that creates a mount namespace for ephemeral logical volumes if the job is compatible with mount hiding (i.e. not Docker jobs). The AUTO value is now the default value. (HTCONDOR-2717)¶
Added new submit command for container universe, mount_under_scratch, that allows users to create writable ephemeral directories in their otherwise read-only container images. (HTCONDOR-2728)¶
Environment variables from the job that start with PELICAN_ will now be set in the environment of the pelican file transfer plugin when it is invoked to do file transfer. This is intended to allow jobs to turn on enhanced logging in the plugin. (HTCONDOR-2674)¶
When the condor_startd interrupts a job’s execution, the specific reason is now reflected in the job attributes VacateReason and VacateReasonCode. (HTCONDOR-2713)¶
Improved performance of condor_history by using the in-memory sort order of job attributes used by the condor_schedd. (HTCONDOR-2729)¶
If the startd detects that an exited or evicted job has leftover, unkillable processes, it now marks that slot as “broken”, and will not reassign the resources for that slot to any other jobs. Disabled if STARTD_LEFTOVER_PROCS_BREAK_SLOTS is set to false. (HTCONDOR-2756)¶
Methods in htcondor2.Schedd which take job_spec arguments now accept a cluster ID in the form of an int. These functions (htcondor2.Schedd.act(), htcondor2.Schedd.edit(), htcondor2.Schedd.export_jobs(), htcondor2.Schedd.retrieve(), and htcondor2.Schedd.unexport_jobs()) now also raise TypeError if their job_spec argument is not a str, list of str, classad2.ExprTree, or int. (HTCONDOR-2745)¶
Added new knob CGROUP_POLLING_INTERVAL, which defaults to 5 (seconds), to control how often a cgroup system polls for resource usage. (HTCONDOR-2802)¶
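The job_spec type rules above (HTCONDOR-2745) can be sketched in plain Python. This is not the bindings' code: `classify_job_spec` and `ExprTreeStandIn` are hypothetical names, the stand-in class replaces classad2.ExprTree so the sketch runs without HTCondor installed, and rejecting `bool` is a choice made for this sketch only.

```python
class ExprTreeStandIn:
    """Placeholder for classad2.ExprTree (hypothetical, for illustration)."""


def classify_job_spec(job_spec) -> str:
    """Return which accepted form job_spec takes, or raise TypeError."""
    # bool is a subclass of int in Python; this sketch chooses to reject it.
    if isinstance(job_spec, bool):
        raise TypeError("job_spec must be a str, list of str, ExprTree, or int")
    if isinstance(job_spec, int):
        return "cluster id"
    if isinstance(job_spec, str):
        return "string"
    if isinstance(job_spec, list) and all(isinstance(s, str) for s in job_spec):
        return "list of strings"
    if isinstance(job_spec, ExprTreeStandIn):
        return "expression"
    raise TypeError("job_spec must be a str, list of str, ExprTree, or int")


print(classify_job_spec(1234))            # cluster id
print(classify_job_spec(["1.0", "1.1"]))  # list of strings
try:
    classify_job_spec(3.14)
except TypeError as err:
    print("rejected:", err)
```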
Bugs Fixed:
Fixed a bug introduced in 24.2.0 where the daemons failed to start if configured to use only a network interface that didn’t have an IPv6 address. Also, the daemons will no longer bind and advertise an address that doesn’t match the value of NETWORK_INTERFACE. (HTCONDOR-2799)¶
The htcondor job submit command now issues credentials like condor_submit. (HTCONDOR-2745)¶
EPs spawned by htcondor annex no longer crash on start-up. (HTCONDOR-2745)¶
When resolving a hostname to a list of IP addresses, avoid using IPv6 link-local addresses. This change was done incorrectly in 23.9.6. (HTCONDOR-2746)¶
htcondor2.Submit.from_dag() and htcondor.Submit.from_dag() now correctly raise an HTCondor exception when the processing of DAGMan options and submit time DAG commands fails. (HTCONDOR-2736)¶
Fixed confusing job hold message that would state a job requested 0.0 GB of disk via request_disk when exceeding disk usage on Execution Points using STARTD_ENFORCE_DISK_LIMITS. (HTCONDOR-2753)¶
You can now locate a collector daemon in the htcondor2 Python bindings. (HTCONDOR-2738)¶
Fixed a bug in condor_qusers tool where the add argument would always enable rather than add a user. (HTCONDOR-2775)¶
Fixed a bug where cgroup systems did not report peak memory, as intended, but current instantaneous memory instead. (HTCONDOR-2800)¶ (HTCONDOR-2804)¶
Fixed an inconsistency in cgroup v1 systems where the memory reported by condor included memory used by the kernel to cache disk pages. (HTCONDOR-2807)¶
Fixed a bug on cgroup v1 systems where jobs that were killed by the Out of Memory killer did not go on hold. (HTCONDOR-2806)¶
Fixed incompatibility of condor_adstash with v2.x of the OpenSearch Python Client. (HTCONDOR-2614)¶
The -subsystem argument of condor_status is once again case-insensitive for credd and defrag subsystem types. (HTCONDOR-2796)¶
Version 24.2.2
Release Notes:
HTCondor version 24.2.2 released on December 4, 2024.
New Features:
None.
Bugs Fixed:
If knob EXECUTE is explicitly set to a blank string in the configuration file for whatever reason, the execution point (startd) may attempt to remove all files from the root partition (everything in /) upon startup. (HTCONDOR-2760)¶
Version 24.2.1
Release Notes:
HTCondor version 24.2.1 released on November 26, 2024.
This version includes all the updates from Version 24.0.2.
The DAGMan metrics file has renamed metrics that referred to jobs so that they use the modern term, nodes. To revert to the old terminology, set DAGMAN_METRICS_FILE_VERSION = 1. (HTCONDOR-2682)¶
New Features:
DAGMan will now correctly submit late materialization jobs to an Access Point when DAGMAN_USE_DIRECT_SUBMIT = True. (HTCONDOR-2673)¶
Added new submit command primary_unix_group, which takes a string which must be one of the user’s supplemental groups, and sets the primary group to that value. (HTCONDOR-2702)¶
Improved DAGMan metrics file to use updated terminology and contain more metrics. (HTCONDOR-2682)¶
A condor_startd which has ENABLE_STARTD_DAEMON_AD enabled will no longer abort when it cannot create the required number of slots of the correct size on startup. It will now continue to run, reporting the failure to the collector in the daemon ad. Slots that can be fully provisioned will work normally. Slots that cannot be fully provisioned will exist but advertise themselves as broken. This is now the default behavior because daemon ads are enabled by default. The condor_status tool has a new option -broken which displays broken slots and their reason for being broken. Use this option with the -startd option to display machines that are fully or partly broken. (HTCONDOR-2500)¶
A new job attribute FirstJobMatchDate will be set for all jobs of a single submission to the current time when the first job of that submission is matched to a slot. (HTCONDOR-2676)¶
Added new job ad attribute InitialWaitDuration, recording the number of seconds from when a job was queued to when the first launch happened. (HTCONDOR-2666)¶
condor_ssh_to_job when entering an Apptainer container now sets the supplemental unix group ids in the same way that vanilla jobs have them set. (HTCONDOR-2695)¶
IPv6 networking is now fully supported on Windows. (HTCONDOR-2601)¶
Daemons will no longer block trying to invalidate their ads in a dead collector when shutting down. (HTCONDOR-2709)¶
Added option FAST to configuration parameter MASTER_NEW_BINARY_RESTART. This will cause the condor_master to do a fast restart of all the daemons when it detects new binaries. (HTCONDOR-2708)¶
Bugs Fixed:
None.
Version 24.1.1
Release Notes:
HTCondor version 24.1.1 released on October 31, 2024.
This version includes all the updates from Version 24.0.1.
New Features:
Added get to the htcondor credential noun, which prints the contents of a stored OAuth2 credential. (HTCONDOR-2626)¶
Added htcondor2.set_ready_state() for those brave few writing daemons in the Python bindings. (HTCONDOR-2615)¶
When blah_debug_save_submit_info is set in blah.config, the stdout and stderr of the blahp’s wrapper script are saved under the given directory. (HTCONDOR-2636)¶
The DAG command SUBMIT-DESCRIPTION and node inline submit descriptions now work when DAGMAN_USE_DIRECT_SUBMIT = False. (HTCONDOR-2607)¶
Docker universe jobs now check the Architecture field in the image, and if it doesn’t match the architecture of the EP, the job is put on hold. The new parameter DOCKER_SKIP_IMAGE_ARCH_CHECK skips this. (HTCONDOR-2661)¶
Added a configuration template, use feature:DefaultCheckpointDestination. (HTCONDOR-2403)¶
Bugs Fixed:
If HTCondor detects that an invalid checkpoint has been downloaded for a self-checkpointing job using third-party storage, that checkpoint is now marked for deletion and the job is rescheduled. (HTCONDOR-1258)¶