Version 9 Feature Releases
We release new features in these releases of HTCondor. The details of each version are described below.
HTCondor version 9.8.1 released on April 25, 2022.
Fix problem that can cause HTCondor to not start up when the network configuration is complex. Long hostnames, multiple CCB addresses, having both IPv4 and IPv6 addresses, and long private network names all contribute to complexity. (HTCONDOR-1070)
HTCondor version 9.8.0 released on April 21, 2022.
This version includes all the updates from Version 9.0.12.
Added the ability to do matchmaking and targeted resource binding of GPUs into dynamic slots while constraining on the properties of the GPUs. This new behavior is enabled by using the -nested option of condor_gpu_discovery, along with the new require_gpus keyword of condor_submit. With this change, HTCondor can now support heterogeneous GPUs in a single partitionable slot, and allow a job to require that it be assigned a specific GPU when creating a dynamic slot. (HTCONDOR-953)
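As a rough illustration of the submit side of this feature, a job might constrain the GPU it is bound to as sketched below. The executable name is hypothetical, and the GPU property names Capability and GlobalMemoryMb are assumptions about what condor_gpu_discovery -nested advertises on a given pool; check the discovery output on your execution points before relying on them.

```
# Sketch of a submit description; my_gpu_job.sh is a hypothetical executable.
executable   = my_gpu_job.sh
request_gpus = 1
# Assumed GPU property names; adjust to what your pool actually advertises.
require_gpus = Capability >= 7.0 && GlobalMemoryMb >= 8000
queue
```

The constraint is evaluated against the properties of each GPU, so a partitionable slot with mixed GPU models can hand this job only a device that satisfies it.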
Added ClassAd functions
evalInEachContext. These functions are used to support matchmaking of heterogeneous custom resources such as GPUs. (HTCONDOR-977)
Added the Reverse GAHP, which allows condor_remote_cluster to work with remote clusters that don’t allow SSH keys or require Multi-Factor Authentication for all SSH connections. (HTCONDOR-1007)
If an administrator configures additional custom docker networks on a worker node and would like jobs to be able to opt into using them, the startd knob DOCKER_NETWORKS has been added to allow additional custom networks to be added to the docker_network_type submit command. (HTCONDOR-995)
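A minimal sketch of how the two sides might fit together follows. The network name site_net and the image name are hypothetical; the network is assumed to have been created on the worker node by the administrator already.

```
# Execution point configuration (read by the condor_startd).
# site_net is a hypothetical, pre-existing custom docker network.
DOCKER_NETWORKS = site_net
```

```
# Submit description fragment for a job opting into that network.
universe            = docker
# Hypothetical image name.
docker_image        = example/image:latest
docker_network_type = site_net
queue
```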
Added the -key command-line option to condor_token_request, which allows users to ask HTCondor to use a particular signing key when creating the IDTOKEN. Added the corresponding configuration macro, SEC_TOKEN_FETCH_ALLOWED_SIGNING_KEYS, which defaults to the default key (
max functions now promote boolean values in the list being operated on to integers rather than to error. (HTCONDOR-970)
Fix for condor_gpu_discovery crash when run on Linux for Power (ppc64le) architecture. (HTCONDOR-967)
HTCondor version 9.7.1 released on April 5, 2022.
Fixed a bug introduced in HTCondor v9.7.0 where a job may go on hold without setting a
HoldReasonSubCode attributes in the job ClassAd. In particular, this could happen when file transfer using a file transfer plugin failed. (HTCONDOR-1035)
HTCondor version 9.7.0 released on March 15, 2022.
This version includes all the updates from Version 9.0.11.
Added list type configuration for periodic job policy configuration. Added
SYSTEM_PERIODIC_REMOVE_NAMES which each define a list of configuration variables to be evaluated for periodic job policy. (HTCONDOR-905)
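A hedged configuration sketch of the list-style policy is shown below. The policy names HeldTooLong and TooManyStarts are hypothetical, and the per-name variable pattern SYSTEM_PERIODIC_REMOVE_<Name> is an assumption about how each listed name maps to an expression.

```
# Hypothetical policy names; each is assumed to select the expression
# found in SYSTEM_PERIODIC_REMOVE_<Name>.
SYSTEM_PERIODIC_REMOVE_NAMES = HeldTooLong TooManyStarts

# Remove jobs that have been held (JobStatus 5) for more than 7 days.
SYSTEM_PERIODIC_REMOVE_HeldTooLong = (JobStatus == 5) && (time() - EnteredCurrentStatus > 604800)

# Remove jobs that have started more than 20 times.
SYSTEM_PERIODIC_REMOVE_TooManyStarts = NumJobStarts > 20
```

Splitting the policy into named expressions keeps each rule short and lets a site add or drop a rule by editing the name list.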
Container universe now supports running singularity jobs where the command executable is hardcoded into the runfile. We call this running the container as the job. (HTCONDOR-966)
In most situations, jobs in COMPLETED or REMOVED status will no longer transition to HELD status. Before, these jobs could transition to HELD status due to job policy expressions, the condor_rm tool, or errors encountered by the condor_shadow or condor_starter. Grid universe jobs may still transition to HELD status if the condor_gridmanager can not clean up job-related resources on remote systems. (HTCONDOR-873)
Improved performance of the condor_schedd during negotiation. (HTCONDOR-961)
For arc grid universe jobs, environment variables specified in the job ad are now included in the ADL job description given to the ARC CE REST service. Also, added new submit command arc_application, which can be used to add additional elements under the <Application> element of the ADL job description given to the ARC CE REST service. (HTCONDOR-932)
Reduce the size of the singularity test executable by not linking in libraries it doesn’t need. (HTCONDOR-927)
DAGMan now manages job submission by writing jobs directly to the condor_schedd, instead of forking a condor_submit process. This behavior is controlled by the DAGMAN_USE_DIRECT_SUBMIT configuration knob, which defaults to
If a job specifies
output_destination, the output and error logs, if requested, will now be transferred to their respective requested names, instead of
condor_qedit and the Python bindings no longer request that job ad changes be forwarded to an active condor_shadow or condor_gridmanager. If forwarding ad changes is desired (say to affect job policy evaluation), condor_qedit has a new -forward option. The Python methods Schedd.edit() and Schedd.edit_multiple() now have an optional flags argument of type TransactionFlags. (HTCONDOR-963)
Added more statistics about file transfers in the job ClassAd. (HTCONDOR-822)
When the blahp submits a job to HTCondor, it no longer requests email notification about job errors. (HTCONDOR-895)
Fixed a very rare bug in the timing subsystem that would prevent any daemon from appearing in the collector, and cause periodic expressions to be run less frequently than they should. (HTCONDOR-934)
The view server can now handle very long Accounting Group names (HTCONDOR-913)
Fixed some bugs where allowed_job_duration would be evaluated at the wrong points in a job’s lifetime. (HTCONDOR-922)
Fixed several bugs in file transfer where unexpected failures by file transfer plugins would not get handled correctly, resulting in empty Hold Reason messages and meaningless Hold Reason Subcodes reported in the job’s classad. (HTCONDOR-842)
HTCondor version 9.6.0 released on March 15, 2022.
Security Items: This release of HTCondor fixes security-related bugs described at
HTCondor version 9.5.4 released on February 8, 2022.
Improved the ability of the Access Point to detect the disappearance of an Execution Point that is running a job. Specifically, the ability of the condor_shadow to detect a problem with the condor_starter. (HTCONDOR-954)
HTCondor no longer assumes that PID 1 is always visible. Instead, it checks to see if /proc was mounted with the
1 or less, and only checks for PID 1 if it was. (HTCONDOR-944)
HTCondor version 9.5.3 released on February 1, 2022.
Added new configuration option, CCB_TIMEOUT. Added new configuration option, CCB_REQUIRED_TO_START, which if set causes HTCondor to exit if CCB_ADDRESS was set but HTCondor could not obtain one. CCB_REQUIRED_TO_START is ignored if USE_SHARED_PORT is set, which is the default. (HTCONDOR-925)
Fixed a bug that caused any daemon to crash when it was configured to report to more than one collector, and any of the collectors’ names could not be resolved by DNS. (HTCONDOR-952)
Fixed a bug introduced earlier in this series where in very rare cases, a schedd would not appear in the collector when it started up, but would appear an hour later. (HTCONDOR-931)
HTCondor version 9.5.2 released on January 25, 2022.
Fixed a bug where the condor_shadow could run indefinitely when it failed to contact the condor_startd in an attempt to kill the job. This problem could become visible to the user in several different ways, such as a job appearing to not go on hold when periodic_hold becomes true. (HTCONDOR-933)
Fixed a problem where condor_ssh_to_job may fail to connect to a job running under an HTCondor tarball installation (glidein) built from an RPM-based platform. (HTCONDOR-942)
Fixed a bug in the file transfer mechanism where URL transfers caused subsequent failures to report incorrect error messages. (HTCONDOR-915)
HTCondor version 9.5.1 released on January 18, 2022.
HTCondor now properly creates directories when transferring a directory tree out of SPOOL while preserving relative paths. This bug would manifest after a self-checkpointing job created a file in a new subdirectory of a directory in its checkpoint: when the job was rescheduled and had to download its checkpoint, it would go on hold. (HTCONDOR-923)
HTCondor version 9.5.0 released on January 13, 2022.
This version includes all the updates from Version 9.0.9.
Added new Container Universe that allows users to describe container images that can be run in Singularity or Docker or other container runtimes. (HTCONDOR-850)
Docker universe jobs can now self-checkpoint by setting checkpoint_exit_code in submit files. (HTCONDOR-841)
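A submit description along the following lines sketches how a Docker universe job might opt into self-checkpointing. The image, executable, and checkpoint file names are hypothetical; the exit code 85 is only a conventional choice, and the use of transfer_checkpoint_files here is an assumption about how the job saves its state.

```
universe                  = docker
# Hypothetical image and executable names.
docker_image              = example/image:latest
executable                = run_with_checkpoints.sh
# The job exits with this code after writing a checkpoint, asking to be resumed.
checkpoint_exit_code      = 85
# Hypothetical file holding the job's saved state.
transfer_checkpoint_files = state.dat
should_transfer_files     = YES
queue
```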
Docker universe now works with jobs that don’t transfer any files. (HTCONDOR-867)
The blahp is now included in the HTCondor Linux native packages. (HTCONDOR-838)
The tool bosco_cluster is being renamed to condor_remote_cluster. The tool can still be used via the old name, but that will stop working in a future release. (HTCONDOR-733)
condor_adstash can parse and push ClassAds from a file to Elasticsearch by using the --ad_file PATH option. (HTCONDOR-779)
Fixed a bug where if the submit file set a checkpoint_exit_code, and the administrator enabled singularity support on the execute node, the job would go on hold at checkpoint time. (HTCONDOR-837)
HTCondor version 9.4.1 released on December 21, 2021.
Added activation metrics (
Fix a bug where the error number could be cleared before being reported when a file transfer plugin fails. (HTCONDOR-889)
HTCondor version 9.4.0 released on December 2, 2021.
This version includes all the updates from Version 9.0.8.
A new configuration variable EXTENDED_SUBMIT_COMMANDS can now be used to extend the submit language by configuration in the condor_schedd. (HTCONDOR-802)
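The following is a tentative sketch of what such an extension might look like in the condor_schedd configuration. The command names LongJob and Project are hypothetical, and the convention that the value on the right-hand side indicates the expected type of the new submit command is an assumption; consult the administrator documentation for the exact rules.

```
# Hypothetical custom submit commands, declared as a multi-line ClassAd.
# Assumption: the right-hand value signals the type of each command
# (a boolean for LongJob, a string for Project).
EXTENDED_SUBMIT_COMMANDS @=end
   LongJob = true
   Project = "string"
@end
```

Under these assumptions, a user could then write LongJob = true or Project = "physics" in a submit file as if they were built-in commands.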
In a HAD configuration, the negotiator is now more robust when trying to update to collectors that may have failed. It will no longer block and time out for an extended period of time should this happen. (HTCONDOR-816)
SINGULARITY_EXTRA_ARGUMENTS can now be a ClassAd expression, so that the extra arguments can depend on the job. (HTCONDOR-570)
The Environment command in a condor submit file can now contain the string $$(CondorScratchDir), which will get expanded to the value of the scratch directory on the execute node. This is useful, for example, when transferring software packages to the job’s scratch dir, when those packages need an environment variable pointing to the root of their install. (HTCONDOR-805)
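For instance, a job that unpacks a software package into its scratch directory might point an environment variable at it as sketched here. The executable, tarball, and variable names are hypothetical.

```
# Hypothetical executable and input package.
executable           = run_analysis.sh
transfer_input_files = mypkg.tar.gz
# $$(CondorScratchDir) is expanded on the execute node to the job's scratch directory.
environment          = "MYPKG_ROOT=$$(CondorScratchDir)/mypkg"
queue
```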
The classad_eval tool now supports evaluating ClassAd expressions in the context of a match. To specify the target ad, use the new -target-file command-line option. You may also specify the context ad with -my-file, a synonym for -file. The classad_eval tool also now supports the
Added a configuration parameter HISTORY_CONTAINS_JOB_ENVIRONMENT which defaults to true. When false, the job’s environment attribute is not saved in the history file. For some sites, this can substantially reduce the size of the history file, and allow the history to contain many more jobs before rotation. (HTCONDOR-497)
Added an attribute to the job ClassAd, LastRemoteWallClockTime. It holds the wall clock time of the most recent completed job execution. (HTCONDOR-751)
SUBMIT_REQUIREMENT_* operations in the condor_schedd are now applied to late materialization job factories at submit time. (HTCONDOR-756)
Added the --rgahp-nologin option to remote_gahp, which removes the -l option normally given to bash when starting a remote blahpd or condor_ft-gahp. (HTCONDOR-734)
Herefile support was added to configuration templates, and the template use FEATURE : AssignAccountingGroup was converted from the old transform syntax to the native transform syntax, which requires that support. (HTCONDOR-796)
The GPU monitor will no longer run if use feature:GPUs is enabled but GPU discovery did not detect any GPUs. This mechanism is available for other startd cron jobs; see STARTD_CRON_<JobName>_CONDITION. (HTCONDOR-667)
Added a new feature where a user can export some of their jobs from the condor_schedd in the form of a job-queue file intended to be used by a new temporary condor_schedd. After the temporary condor_schedd runs the jobs, the results can be imported back to the original condor_schedd. This is experimental code that is not suitable for production use. (HTCONDOR-179)
When running remote_gahp interactively to start a remote condor_ft-gahp instance, the user no longer has to set a fake CONDOR_INHERIT environment variable. (HTCONDOR-819)
Fixed a bug that prevented the condor_procd (and thus all of HTCondor) from starting when running under QEMU emulation. With this fix, HTCondor can now build and run under QEMU ARM emulation. (HTCONDOR-761)
Fixed several unlikely bugs when parsing the time strings in ClassAds (HTCONDOR-814)
Fixed a bug when computing the identity of a job’s X.509 credential that isn’t a proxy. (HTCONDOR-800)
Fixed a bug that prevented file transfer from working properly on Unix systems when the job created a file to be transferred back to the submit machine containing a backslash in it. (HTCONDOR-747)
Fixed some bugs which could cause the counts of transferred files reported in the job ad to be inaccurate. (HTCONDOR-813)
HTCondor version 9.3.2 released on November 30, 2021.
Added new submit command allowed_execute_duration, which limits how long a job can run, not including file transfer, expressed in seconds. If a job exceeds this limit, it is placed on hold. (HTCONDOR-820)
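As a small sketch, a submit description might cap execution time as follows. The executable name is hypothetical.

```
# Hypothetical executable name.
executable               = simulate.sh
# Place the job on hold if execution, excluding file transfer,
# exceeds 2 hours (7200 seconds).
allowed_execute_duration = 7200
queue
```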
A problem where HTCondor would not create a directory on the execute node before trying to transfer a file into it should no longer occur. (This would cause the job which triggered this problem to go on hold.) One way to trigger this problem was by setting preserve_relative_paths and specifying the same directory in both
HTCondor version 9.3.1 released on November 9, 2021.
Added new submit command allowed_job_duration, which limits how long a job can run, expressed in seconds. If a job exceeds this limit, it is placed on hold. (HTCONDOR-794)
HTCondor version 9.3.0 released on November 3, 2021.
This version includes all the updates from Version 9.0.7.
As we transition from identity based authentication and authorization (X.509 certificates) to capability based authorization (bearer tokens), we have removed Globus GSI support from this release. (HTCONDOR-697)
Submission to ARC CE via the GridFTP interface (grid universe type nordugrid) is no longer supported. Submission to ARC CE’s REST interface can be done using the arc type in the grid universe. (HTCONDOR-697)
Revamped machine ad attribute OpSys* and configuration parameter OPSYS* values for macOS. The OS name is now macOS and the version number no longer ignores the initial
11. of the actual OS version. For example, for macOS 10.15.4, the value of machine attribute
"macOS 10.15" instead of "MacOSX 15.4". (HTCONDOR-627)
Added an example template for a custom file transfer plugin, which can be used to build new plugins. (HTCONDOR-728)
Added a new generic knob for setting the slot user for all slots. Configure NOBODY_SLOT_USER for all slots, instead of configuring a SLOT<N>_USER for each slot. (HTCONDOR-720)
Improved and simplified how HTCondor locates the blahp software. Configuration parameter GLITE_LOCATION has been replaced by
Added new attributes to the job ClassAd which record the number of files transferred between the condor_shadow and condor_starter only during the last run of the job. (HTCONDOR-741)
When declining to put a job on hold due to the temporary scratch directory disappearing, verify that the directory is expected to exist and require that the job not be local universe. (HTCONDOR-680)
HTCondor version 9.2.0 released on September 23, 2021.
This version includes all the updates from Version 9.0.6.
Added a SERVICE node type to condor_dagman: a special node which runs in parallel to a DAG for the duration of its workflow. This can be used to run tasks that monitor or report on a DAG workflow without directly impacting it. (HTCONDOR-437)
Added new configuration parameter NEGOTIATOR_MIN_INTERVAL, which sets the minimum amount of time between the start of one negotiation cycle and the next. (HTCONDOR-606)
The condor_userprio tool now accepts one or more username arguments and will report priority and usage for only those users (HTCONDOR-559)
Added a new -yes command-line argument to condor_annex, allowing it to request EC2 instances without manual user confirmation. (HTCONDOR-443)
HTCondor no longer crashes on start-up if COLLECTOR_HOST is set to a string with a colon and a port number, but no host part. (HTCONDOR-602)
Changed the default value of configuration parameter
Removed unnecessary limit on history ad polling and fixed some configuration parameter checks in condor_adstash. (HTCONDOR-629)
HTCondor version 9.1.6 limited release on September 14, 2021.
Fixed a bug that prevented Singularity jobs from running when the singularity binary emitted many warning messages to stderr. (HTCONDOR-698)
HTCondor version 9.1.5 limited release on September 8, 2021.
The number of files transferred between the condor_shadow and condor_starter is now recorded in the job ad with the new attributes. (HTCONDOR-679)
HTCondor version 9.1.4 limited release on August 31, 2021.
Jobs are no longer put on hold if a failure occurs due to the scratch execute directory unexpectedly disappearing. Instead, the jobs will return to idle status to be re-run. (HTCONDOR-664)
Fixed a problem introduced in HTCondor version 9.1.3 where X.509 proxy delegation to older versions of HTCondor would fail. (HTCONDOR-674)
HTCondor version 9.1.3 released on August 19, 2021.
This version includes all the updates from Version 9.0.5.
Globus GSI is no longer needed for X.509 proxy delegation
GSI is no longer in the list of default authentication methods. To use GSI, you must enable it by setting one or more of the SEC_<access-level>_AUTHENTICATION_METHODS configuration parameters. (HTCONDOR-518)
The semantics of undefined user job policy expressions have changed. A policy whose expression evaluates to undefined is now uniformly ignored, instead of either putting the job on hold or being treated as false. (HTCONDOR-442)
Added two new attributes to the job ClassAd,
NumHoldsByReason, that are used to provide historical information about how often this job went on hold and why. Details on all job ClassAd attributes, including these two new attributes, can be found in section: Job ClassAd Attributes (HTCONDOR-554)
The “ToE tag” entry in the job event log now includes the exit code or signal number, if and as appropriate. (HTCONDOR-429)
Docker universe jobs are now run under the built-in docker init process, which means that zombie processes are automatically reaped. This can be turned off with the knob DOCKER_RUN_UNDER_INIT = false (HTCONDOR-462)
Many services support the “S3” protocol. To reduce confusion, we’ve added new aliases for the submit-file commands
s3_secret_access_key_file. We also added support for
gs://-style Google Cloud Storage URLs, with the corresponding
gs_secret_access_key_file aliases. This support, and the aliases, use Google Cloud Storage’s “interoperability” API. The HMAC access key ID and secret keys may be obtained from the Google Cloud web console’s “Cloud Storage” section, the “Settings” menu item, under the “interoperability” tab. (HTCONDOR-453)
Added new submit command batch_extra_submit_args for grid universe jobs of type batch. This lets the user supply arbitrary command-line arguments to the submit command of the target batch system. These are supplied in addition to the command line arguments derived from other attributes of the job ClassAd. (HTCONDOR-526)
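A possible use is sketched below for a job routed to a Slurm system. The executable name and the particular sbatch argument are hypothetical illustrations, and the unquoted form of the value is an assumption about how the command is parsed.

```
universe                = grid
grid_resource           = batch slurm
# Hypothetical executable name.
executable              = my_job.sh
# Extra arguments passed through to the batch system's submit command;
# the constraint value is a hypothetical site-specific feature name.
batch_extra_submit_args = --constraint=bigmem
queue
```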
When GSI authentication is configured or used, a warning is now printed to daemon logs and the stderr of tools. These warnings can be suppressed by setting configuration parameters
Introduced a new command-line tool, htcondor (see man page), for managing HTCondor jobs and resources. This tool also includes new capabilities for running HTCondor jobs on Slurm machines which are temporarily acquired to act as HTCondor execution points. (HTCONDOR-252)
Fixed a bug where jobs could not start on Linux if the execute directory was placed under /tmp or /var/tmp. Such a placement breaks the default MOUNT_UNDER_SCRATCH option. As a result, if the administrator locates EXECUTE under /tmp or /var/tmp, HTCondor can no longer make a private /tmp or /var/tmp directory for the job. (HTCONDOR-484)
HTCondor version 9.1.2 released on July 29, 2021.
Security Items: This release of HTCondor fixes security-related bugs described at
HTCondor version 9.1.1 released on July 27, 2021 and pulled two days later when an issue was found with a patch.
HTCondor version 9.1.0 released on May 20, 2021.
This version includes all the updates from Version 9.0.1.
The condor_convert_history command was removed. (HTCONDOR-392)
Added support for submission to the ARC CE REST interface via the new grid universe type arc. (HTCONDOR-138)
Added a new option in DAGMan to put failed jobs on hold and keep them in the queue when DAGMAN_PUT_FAILED_JOBS_ON_HOLD is True. For some types of transient failures, this allows users to fix whatever caused their job to fail and then release it, allowing the DAG execution to continue. (HTCONDOR-245)
gdb and strace now work in Docker Universe jobs. (HTCONDOR-349)
The condor_startd on platforms that support Docker now runs a simple Docker container at startup to verify that docker universe completely works. This can be disabled with the knob DOCKER_PERFORM_TEST (HTCONDOR-325)
On Linux machines with performance counter support, vanilla universe jobs now report the number of machine instructions executed (HTCONDOR-390)