Version 23 Feature Releases

We release new features in these releases of HTCondor. The details of each version are described below.

Version 23.7.2

Release Notes:

  • HTCondor version 23.7.2 released on May 16, 2024.

  • This version includes all the updates from Version 23.0.10.

  • The use of multiple queue statements in a single submit description file is now deprecated. This functionality is planned to be removed during the lifetime of the V24 feature series. (HTCONDOR-2338)

  • The semantics of skip_if_dataflow have been changed to make more sense. The restrictions have been documented. (HTCONDOR-1899)

  • HTCondor tarballs now contain Pelican 7.8.2. (HTCONDOR-2399)

  • When removing a large DAG, the schedd now removes any existing child DAG jobs in a non-blocking way, making the schedd more responsive during this removal. (HTCONDOR-2364)

  • NOTE: Soon, IDTOKEN files with permissive file protections will be ignored. In particular, the /etc/condor/tokens.d directory and the tokens contained within should be only accessible by the root account.
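
The permission requirement in the note above can be checked ahead of the change. The following is a minimal sketch, not an HTCondor tool: the helper name permissive_paths is hypothetical, and it assumes that "permissive" means any group or other permission bit is set. The check HTCondor itself applies may differ, and this sketch does not verify file ownership.

```python
import os
import stat


def permissive_paths(directory):
    """Return the directory and any entries in it that have group/other permission bits set.

    Hypothetical helper for illustration; not part of HTCondor.
    """
    loose = []
    candidates = [directory] + [
        os.path.join(directory, name) for name in sorted(os.listdir(directory))
    ]
    for path in candidates:
        mode = os.stat(path).st_mode
        # Any read/write/execute bit for group or other counts as permissive here.
        if mode & (stat.S_IRWXG | stat.S_IRWXO):
            loose.append(path)
    return loose
```

Run as root, permissive_paths("/etc/condor/tokens.d") returning an empty list would suggest that the directory and its tokens are not readable by other accounts.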

New Features:

  • Periodic policy expressions like periodic_remove are now checked during input file transfer. Previously, HTCondor didn’t start running these checks until the file transfer was finished and the job proper started. (HTCONDOR-2362)

  • A local universe job can now specify a container image, and it will run in that image using the singularity or apptainer container runtime. (HTCONDOR-2180)

  • File transfer plugins that are installed on the EP can now advertise extra attributes into the STARTD ads. (HTCONDOR-1051)

  • DAGMan can now write a rescue DAG and abort when condor_dagman has been pending on nodes for DAGMAN_CHECK_QUEUE_INTERVAL seconds and the associated jobs are not found in the local condor_schedd queue. (HTCONDOR-1546)

  • In the unlikely event that a shadow exception event happens, the text is now saved in the job ad attribute LastShadowException for further debugging. (HTCONDOR-1896)

  • We now compute the path to the proper python3 interpreter for condor_watch_q at compile time. This should not change anything, but if it does break, the guilty ticket is: (HTCONDOR-1146)

  • If a collector defines a local-name, but not a COLLECTOR_NAME, the local name is now used as the default name. (HTCONDOR-1105)

  • Most daemon log messages about tasks in the STARTD_CRON_JOBLIST, BENCHMARKS_JOBLIST or SCHEDD_CRON_JOBLIST that were logged as D_FULLDEBUG messages are now logged using the new message category D_CRON. (HTCONDOR-2308)

  • A new -jobset display option was added to condor_q. If jobsets are enabled in the condor_schedd it will show information from the jobset ads. (HTCONDOR-2358)

  • If a schedd has a schedd-specific SPOOL directory (set by schedd_name.SPOOL), the schedd now creates that directory with the proper ownership and permissions. (HTCONDOR-907)

  • The file specified using the submit command starter_log is now returned on both success and on failure when the submit command when_to_transfer_output is set to ON_SUCCESS. In addition, a failure to transfer input is now treated as a failure for purposes of ON_SUCCESS. (HTCONDOR-2347)

  • Removed some of the logging while loading the security configuration and moved some of the logging to D_SECURITY:2 to make the -debug:D_SECURITY option of the various tools more useful. (HTCONDOR-2369)

Bugs Fixed:

  • Fixed a bug where condor_submit -i did not work on a cgroup v2 system. (HTCONDOR-2438)

  • Fixed a bug on cgroup v2 systems where a race condition could cause a job to run in the wrong cgroup for a very short amount of time. If this job spawned a sub-job, the child job would forever live in the wrong cgroup. (HTCONDOR-2423)

  • Fixed a bug where using output_destination would still create directories on the access point. (HTCONDOR-2353)

Version 23.6.2

  • HTCondor version 23.6.2 released on April 16, 2024.

New Features:

  • None.

Bugs Fixed:

  • Fixed a bug where the HoldReasonSubcode was not the documented value for jobs put on hold because of errors running a file transfer plugin. (HTCONDOR-2373)

Version 23.6.1

Release Notes:

  • HTCondor version 23.6.1 released on April 15, 2024.

  • NOTE: Soon, IDTOKEN files with permissive file protections will be ignored. In particular, the /etc/condor/tokens.d directory and the tokens contained within should be only accessible by the root account.

  • This version includes all the updates from Version 23.0.8.

New Features:

  • The condor_startd can now force a job that doesn’t ask to run inside a docker or apptainer container to run inside one, using the new parameters USE_DEFAULT_CONTAINER and DEFAULT_CONTAINER_IMAGE. (HTCONDOR-2317)

  • Added new submit command docker_override_entrypoint to allow docker universe jobs to override the entrypoint in the image. (HTCONDOR-2321)

  • condor_q -better-analyze now emits the units for memory and disk. (HTCONDOR-2333)

  • Updated get_htcondor to allow the aliases lts for stable and feature for current when passed to the --channel option. (HTCONDOR-775)

  • Added htcondor job out, err, and log verbs to the htcondor CLI tool. (HTCONDOR-2182)

  • The condor_startd now honors the environment variable OMP_NUM_THREADS when setting the number of cores available. This allows glideins to pass an allocated number of cores from a base batch system to the glidein easily. (HTCONDOR-727)

  • If the EP is started under another batch system that limits the amount of memory to the EP via a cgroup limit, the condor_startd now advertises this much memory available for jobs. (HTCONDOR-727)

  • Added new job ad attribute JobSubmitFile which contains the filename of the submit file, if any. (HTCONDOR-2319)

  • When the docker_network_type is set to host, docker universe now sets the hostname inside the container to the same as the host, to ease networking from inside the container to outside the container. (HTCONDOR-2294)

  • For vanilla universe jobs that are not running under container universe but manually start apptainer or singularity, the environment variables APPTAINER_CACHEDIR and SINGULARITY_CACHEDIR are now set to the scratch directory to ensure any files they create are cleaned up on job exit. (HTCONDOR-2337)

  • When condor_submit is run with the -i (interactive) flag and a submit file, it now transfers the executable to the interactive job. (HTCONDOR-2315)

  • Added the environment variable PYTHON_CPU_COUNT to the set of environment variables set for jobs to indicate how many CPU cores are provisioned. Python 3.13 uses this to override the detected count of CPU cores. (HTCONDOR-2330)

  • Added -file option to condor_token_list. (HTCONDOR-575)

  • The configuration parameter ETC can now be used to relocate files that are normally placed under /etc/condor on Unix platforms. (HTCONDOR-2290)

  • The submit file expansion $(CondorScratchDir) now works for local universe. (HTCONDOR-2324)

  • For jobs that go through the grid universe or Job Router, the terminate event will now include extended resource allocation and usage information when available. (HTCONDOR-2281)

  • The package containing the Pelican OSDF file transfer plugin is now a weak dependency for HTCondor. (HTCONDOR-2295)

  • Include a weak dependency on bash-completion so the htcondor CLI command has <TAB> completions. (HTCONDOR-2311)

  • DAGMan no longer suppresses email notifications for jobs it manages by default. To restore the previous behavior of suppressing notifications, set DAGMAN_SUPPRESS_NOTIFICATION to True. (HTCONDOR-2323)

  • Added configuration knobs GANGLIAD_WANT_RESET_METRICS and GANGLIAD_RESET_METRICS_FILE, enabling condor_gangliad to be configured to reset aggregate metrics to a value of zero when they are no longer being updated. Previously aggregate metrics published to Ganglia retained the last value published indefinitely. (HTCONDOR-2346)

  • The Job Router route keyword GridResource is now always optional. The job attribute GridResource can be set instead via a SET or similar command in the route definition. (HTCONDOR-2329)

  • The configuration variables SLOTS_CONNECTED_TO_KEYBOARD and SLOTS_CONNECTED_TO_CONSOLE now apply to partitionable slots but do not count them as slots. As a consequence of this change, when either of these variables are set equal to the number of CPUs, all slots will be connected. (HTCONDOR-2331)
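
As an illustration of how a job might consume the PYTHON_CPU_COUNT variable described above, here is a minimal sketch. The function name provisioned_cpus is hypothetical, and the fallback order (PYTHON_CPU_COUNT, then OMP_NUM_THREADS, then the detected core count) is an assumption of this example, not documented HTCondor behavior.

```python
import os


def provisioned_cpus(environ=None):
    """Best-effort count of the CPU cores provisioned to the current job.

    Hypothetical helper; the fallback order is a choice made for this sketch.
    """
    env = os.environ if environ is None else environ
    for name in ("PYTHON_CPU_COUNT", "OMP_NUM_THREADS"):
        value = env.get(name, "")
        # Ignore unset, non-numeric, or zero values and try the next source.
        if value.isdigit() and int(value) > 0:
            return int(value)
    # Fall back to the core count Python detects on the machine.
    return os.cpu_count() or 1
```

A job could size a worker pool with provisioned_cpus() so that it does not start more workers than the cores it was given.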

Bugs Fixed:

Version 23.5.3

  • HTCondor version 23.5.3 released on March 25, 2024.

  • HTCondor tarballs now contain Pelican 7.6.2.

New Features:

  • None.

Bugs Fixed:

  • None.

Version 23.5.2

Release Notes:

  • HTCondor version 23.5.2 released on March 14, 2024.

  • This version includes all the updates from Version 23.0.6.

  • The library libcondorapi has been removed from the distribution. We know of no users of this C++ event log reading code; all known users read event logs with the Python bindings, as we recommend. (HTCONDOR-2278)

New Features:

Bugs Fixed:

  • In some rare cases where docker universe could not start a container, it would not remove that container until the next time the startd restarted. Now it is removed as soon as possible. (HTCONDOR-2263)

  • In rare cases, the values of TimeSlotBusy and TimeExecute would be incorrect in the job event log when the job was disconnected or did not start properly. (HTCONDOR-2265)

  • Fixed a bug that could cause the condor_gridmanager to abort when multiple grid universe jobs share the same proxy file to authenticate with the remote job scheduling service. (HTCONDOR-2334)

Version 23.4.0

Release Notes:

  • HTCondor version 23.4.0 released on February 8, 2024.

  • This version includes all the updates from Version 23.0.4.

New Features:

  • Added configuration parameter SUBMIT_REQUEST_MISSING_UNITS, to warn or prevent submitting with RequestDisk or RequestMemory without a units suffix. (HTCONDOR-1837)

  • On RPM-based distributions, a new package condor-credmon-local is now available which provides the local SciTokens issuer credmon without installing extra packages required by the OAuth credmon. The condor-credmon-local package is now a dependency of the condor-credmon-oauth package. (HTCONDOR-2197)

  • The htcondor command line tool’s eventlog read command now optionally takes more than one eventlog to process at once. (HTCONDOR-2220)

  • Docker universe now passes --log-driver none by default when running jobs. This can be disabled with the DOCKER_LOG_DRIVER_NONE knob. (HTCONDOR-2190)

  • Jobs that are assigned nVidia GPUs now have the environment variable NVIDIA_VISIBLE_DEVICES set in addition to, and with the same value as CUDA_VISIBLE_DEVICES, as newer nVidia run-times prefer the former. (HTCONDOR-2189)

  • Added job classad attribute ContainerImageSource, a string which is set to the source of the image transfer. (HTCONDOR-1797)

  • If PER_JOB_HISTORY_DIR is set, a failure to write a historical job to that directory is now a fatal error, just as it is for the normal history file. (HTCONDOR-2027)

  • condor_submit now generates requirements expressions for condor grid universe jobs like it does for vanilla universe jobs. This can be disabled by setting the new configuration parameter SUBMIT_GENERATE_CONDOR_C_REQUIREMENTS to False. (HTCONDOR-2204)
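
The kind of check that SUBMIT_REQUEST_MISSING_UNITS enables can be sketched as follows. This is an illustration only and not HTCondor's parser: the function name classify_request is hypothetical, and the accepted suffixes (K, M, G, T, optionally followed by B) are an assumption of this example.

```python
import re

# A plain number with no suffix, for example "2048".
_BARE_NUMBER = re.compile(r"^\s*\d+(\.\d+)?\s*$")
# A number followed by an assumed unit suffix, for example "4GB" or "512 M".
_WITH_UNITS = re.compile(r"^\s*\d+(\.\d+)?\s*[KMGT]B?\s*$", re.IGNORECASE)


def classify_request(value):
    """Classify a resource request string as 'units', 'missing-units', or 'expression'."""
    if _WITH_UNITS.match(value):
        return "units"
    if _BARE_NUMBER.match(value):
        return "missing-units"
    # Anything else (such as a ClassAd expression) is left alone by this sketch.
    return "expression"
```

A wrapper around submission could warn or refuse when classify_request returns "missing-units" for a memory or disk request.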

Bugs Fixed:

Version 23.3.1

  • HTCondor version 23.3.1 released on January 23, 2024.

  • HTCondor tarballs now contain Pelican 7.4.0.

New Features:

  • None.

Bugs Fixed:

  • None.

Version 23.3.0

Release Notes:

  • HTCondor version 23.3.0 released on January 4, 2024.

  • Limited support for Enterprise Linux 7 in the 23.x feature versions. Since we are developing new features, the Enterprise Linux 7 build may drop features or be dropped entirely. In particular, Python 2 and OAuth credmon support will be removed during the 23.x development cycle. (HTCONDOR-2194)

  • This version includes all the updates from Version 23.0.3.

New Features:

  • Improved the -convertoldroutes option of condor_transform_ads and added a new -help convert option. These changes are meant to assist in the conversion of CEs away from the deprecated transform syntax. (HTCONDOR-2146)

  • Added the ability for DAGMan node script STDOUT and/or STDERR streams to be captured in a user-defined debug file. For more information, see DAGMan script DEBUG file. (HTCONDOR-2159)

  • Improved the hold message when jobs on cgroup systems exceed their memory limits. (HTCONDOR-1533)

  • The startd now advertises when jobs are running with cgroup enforcement in the slot attribute CgroupEnforced. (HTCONDOR-1532)

  • STARTD_CRON_LOG_NON_ZERO_EXIT now also logs the stderr of the startd cron job to the StartLog. (HTCONDOR-1138)

Bugs Fixed:

  • Container universe now works when file transfer is disabled or not used. (HTCONDOR-1329)

  • Removed confusing message in StartLog at shutdown about trying to kill illegal pid. (HTCONDOR-1012)

Version 23.2.0

Release Notes:

  • HTCondor version 23.2.0 released on November 29, 2023.

  • This version includes all the updates from Version 23.0.2.

New Features:

  • Added periodic_vacate to the submit language and SYSTEM_PERIODIC_VACATE to the configuration system. Historically, users combined periodic_hold and periodic_release to evict “stuck” jobs, that is, jobs that should finish in some amount of time but sometimes run for an arbitrarily long time. Users may now use the single periodic_vacate submit command instead. (HTCONDOR-2114)

  • Linux EPs now advertise the startd attribute HasRotationalScratch to be true when HTCondor detects that the execute directory is on a rotational hard disk and false when the kernel reports it to be on SSD, NVME, or tmpfs. (HTCONDOR-2085)

  • Added TimeSlotBusy and TimeExecute to the event log terminate events to indicate how much wall time a job used total (including file transfer) and just for the job execution proper, respectively. (HTCONDOR-2101)

  • Most files that HTCondor generates are now written in binary mode on Windows. As a result, each line in these files will end in just a line feed character, without a preceding carriage return character. Files written by jobs are unaffected by this change. (HTCONDOR-2098)

  • HTCondor now uses the Pelican Platform to do file transfers with the Open Science Data Federation (OSDF). (HTCONDOR-2100)

  • HTCondor now does a better job of cleaning up inner cgroups left behind by glidein pilots. (HTCONDOR-2081)

  • Added new configuration option <Keyword>_HOOK_PREPARE_JOB_ARGS to allow the passing of arguments to specified prepare job hooks. (HTCONDOR-1851)

  • The default trusted CAs for OpenSSL are now always used by default in addition to any specified by AUTH_SSL_SERVER_CAFILE, AUTH_SSL_CLIENT_CAFILE, AUTH_SSL_SERVER_CADIR, and AUTH_SSL_CLIENT_CADIR. The new configuration parameters AUTH_SSL_SERVER_USE_DEFAULT_CAS and AUTH_SSL_CLIENT_USE_DEFAULT_CAS can be used to disable use of the default CAs for OpenSSL. (HTCONDOR-2090)

  • Using condor_store_cred to set a pool password on Windows now requires ADMINISTRATOR authorization with the condor_master (instead of CONFIG authorization). (HTCONDOR-2106)

  • When condor_remote_cluster installs binaries on an EL7 machine, it now uses the latest 23.0.x release. Before, it would fail, as current feature versions of HTCondor are not available on EL7. (HTCONDOR-2125)

  • HTCondor daemons on Linux no longer run very slowly when the ulimit for the maximum number of open files is very high. (HTCONDOR-2128)

  • Somewhat improved the performance of the _DEBUG flag D_FDS. But please don’t use this unless absolutely needed. (HTCONDOR-2050)
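
The Windows line-ending change above may matter to tools that parse HTCondor-generated files. The sketch below shows the general effect of binary-mode writes in Python and a way to count line endings in a file. It is an analogy rather than HTCondor's own code, and both function names are hypothetical.

```python
def write_lines_binary(path, lines):
    """Write lines in binary mode, so each ends in a bare line feed on every platform."""
    with open(path, "wb") as handle:
        for line in lines:
            handle.write(line.encode("utf-8") + b"\n")


def count_line_endings(path):
    """Count CRLF and bare-LF line endings in a file."""
    with open(path, "rb") as handle:
        data = handle.read()
    crlf = data.count(b"\r\n")
    # Every CRLF also contains an LF, so subtract those to get the bare LFs.
    return {"crlf": crlf, "lf_only": data.count(b"\n") - crlf}
```

A parser that splits lines on "\n" and strips a trailing "\r" handles files written both before and after this change.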

Bugs Fixed:

  • None.

Version 23.1.0

Release Notes:

  • HTCondor version 23.1.0 released on October 31, 2023.

  • This version includes all the updates from Version 23.0.1.

  • Enterprise Linux 7 support is discontinued with this release.

  • We have added HTCondor Python wheels for the aarch64 CPU architecture on PyPI. (HTCONDOR-2120)

New Features:

  • Improved condor_watch_q to filter tracked jobs based on cluster IDs, either provided by the -clusters option or associated with batch names provided by the -batches option. This helps limit the number of output lines when using an aggregate/shared log file. (HTCONDOR-2046)

  • Added new -larger-than flag to condor_watch_q that filters tracked jobs to only include jobs with cluster IDs greater than or equal to the provided cluster ID. (HTCONDOR-2046)

  • The Access Point can now be told to use a non-standard ssh port when sending jobs to a remote scheduling system (such as Slurm). You can now specify an alternate ssh port with condor_remote_cluster. (HTCONDOR-2002)

  • Laid groundwork to allow an Execution Point running without root access to accurately limit the job’s usage of CPU and Memory in real time via Linux kernel cgroups. This is particularly interesting for glidein pools. Jobs running in cgroup v2 systems can now subdivide the cgroup they have been given, so that pilots can enforce sub-limits of the resources they are given. (HTCONDOR-2058)

  • HTCondor file transfers using HTTPS can now utilize CA certificates in a non-standard location. The curl_plugin tool now recognizes the environment variable X509_CERT_DIR and configures libcurl to search the given directory for CA certificates. (HTCONDOR-2065)

  • Improved performance of condor_schedd, and other daemons, by caching the value in /etc/localtime, so that debugging logs aren’t always stat’ing that file. (HTCONDOR-2064)
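
Despite its name, the -larger-than flag described above keeps jobs whose cluster ID is greater than or equal to the given value. The following sketch shows that inclusive boundary. The function name and the "cluster.proc" job ID strings are assumptions of this example, not condor_watch_q internals.

```python
def jobs_at_or_above(job_ids, cluster_threshold):
    """Keep job IDs of the form 'cluster.proc' whose cluster is >= cluster_threshold."""
    kept = []
    for job_id in job_ids:
        # The cluster ID is the part before the first dot.
        cluster = int(job_id.split(".", 1)[0])
        if cluster >= cluster_threshold:
            kept.append(job_id)
    return kept
```

For example, with a threshold of 12, jobs in cluster 12 are kept along with those in later clusters.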

Bugs Fixed:

  • None.