Stable Release Series 9.0
This is the stable release series of HTCondor. As usual, only bug fixes (and potentially, ports to new platforms) will be provided in future 9.0.x releases. New features will be added in the 9.1.x development series.
The details of each version are described below.
HTCondor version 9.0.5 released on August 18, 2021.
If the SCITOKENS authentication method succeeds (that is, the client presented a valid SciToken) but the user-mapping fails, HTCondor will try the next authentication method in the list instead of failing. (HTCONDOR-589)
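The fall-through described above can be pictured with a small, self-contained sketch. It is not HTCondor code: the `authenticate` function, the method callables, and the mapping table are all hypothetical stand-ins, and for simplicity the sketch treats every method the same way, whereas the note above only describes SCITOKENS.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-ins: each method returns an authenticated identity
# (or None on failure); the mapper turns that identity into a local user.
AuthMethod = Callable[[], Optional[str]]
Mapper = Callable[[str, str], Optional[str]]


def authenticate(
    methods: List[Tuple[str, AuthMethod]], map_user: Mapper
) -> Optional[Tuple[str, str]]:
    """Return (method_name, mapped_user) for the first method that both
    authenticates and maps to a user, or None if every method fails."""
    for name, attempt in methods:
        identity = attempt()
        if identity is None:
            continue  # authentication itself failed; try the next method
        user = map_user(name, identity)
        if user is None:
            continue  # valid credential but no user mapping; fall through
        return name, user
    return None


# Toy data: the SciToken is valid, but only the FS identity has a mapping.
methods = [
    ("SCITOKENS", lambda: "https://issuer.example,subject"),
    ("FS", lambda: "alice"),
]
mapping: Dict[Tuple[str, str], str] = {("FS", "alice"): "alice@example.org"}
result = authenticate(methods, lambda m, i: mapping.get((m, i)))
print(result)
```

In this toy run the SCITOKENS entry authenticates but has no mapping, so the loop moves on and the FS entry is the one that is returned.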
The bosco_cluster command now creates backup files when the --override option is used. (HTCONDOR-591)
Improved the detection of Red Hat Enterprise Linux based distributions. Previously, only CentOS was recognized. Now, other distributions such as Rocky should be recognized. (HTCONDOR-609)
condor-boinc package is no longer required to be installed with HTCondor, thus making
Fixed a bug on the Windows platform where condor_submit would, on rare occasions, crash after successfully submitting a job. This caused problems for programs that check the return status of condor_submit, including condor_dagman. (HTCONDOR-579)
The job attribute ExitCode is no longer missing from the job ad after
Fixed a bug where running condor_who as a non-root user on a Unix system would print a confusing warning to stderr about running as non-root. (HTCONDOR-590)
Fixed a bug where condor_gpu_discovery would not report any GPUs if any MIG-enabled GPU on the system were configured in certain ways. Fixed a bug which could cause condor_gpu_discovery’s output to become unparseable after certain errors. (HTCONDOR-476)
HTCondor no longer ignores files in a job’s spool directory if they happen to share a name with an entry in transfer_input_files. This allows jobs to specify the same file in transfer_checkpoint_files, and still resume properly after a checkpoint. (HTCONDOR-583)
Fixed a bug where jobs running on Linux machines with cgroups enabled would not count files created in /dev/shm in the MemoryUsage attribute. (HTCONDOR-586)
Fixed a bug in the condor_now tool, where the condor_schedd would not use an existing security session to run the selected job on the claimed resources. This could often lead to the job being unable to start. (HTCONDOR-603)
HTCondor version 9.0.4 released on July 29, 2021.
Security Item: This release of HTCondor fixes a security-related bug described at
HTCondor version 9.0.3 released on July 27, 2021 and pulled two days later when an issue was found with a patch.
HTCondor version 9.0.2 released on July 8, 2021.
Removed support for GRAM grid jobs. (HTCONDOR-561)
HTCondor can now be configured to only use FIPS 140-2 approved security functions by using the new configuration template: use security:FIPS. (HTCONDOR-319)
Added a new command-line flag to condor_gpu_discovery, -divide, which functions like -repeat, except that it divides the GPU attribute GlobalMemoryMb by the number of repeats (and adds the GPU attribute DeviceMemoryMb, which is the undivided total). To enable this new behavior, modify
The maximum line length for SCHEDD_CRON job output has been extended from 8k bytes to 64k bytes. (HTCONDOR-498)
Added two new commands to condor_submit -
Reduced condor_shadow memory usage by 40% or more on machines with many (more than 64) cores. This allows a correspondingly greater number of shadows and thus jobs to run on these submit machines. (HTCONDOR-540)
Added support for using an authenticated SMTP relay on port 587 to condor_mail.exe on Windows. (HTCONDOR-303)
The condor_job_router_info tool will now show info for a rootly JobRouter even when the tool is not running as root. This change affects the way jobs are matched when using the
condor_gpu_discovery now recognizes Capability 8.6 devices and reports the correct number of cores per Compute Unit. (HTCONDOR-544)
Added the command-line option --copy-ssh-key to bosco_cluster. When set to no, this option prevents bosco_cluster from installing an ssh key on the remote system, and bosco_cluster assumes passwordless ssh is already possible. (HTCONDOR-270)
HTCondor can now link against the scitokens-cpp library directly, rather than always using dlopen(). This allows SciTokens to be used with the conda-forge build of HTCondor. (HTCONDOR-541)
When the test that is run in a Singularity container before the job starts fails, the job is now put back to idle instead of being held. (HTCONDOR-539)
Fixed Munge authentication, which was broken starting with HTCondor 8.9.9. (HTCONDOR-378)
Fixed a bug in the Windows MSI installer where installation would only succeed at the default location of
Fixed a bug that prevented docker universe jobs from running on machines whose hostnames were longer than about 60 characters. (HTCONDOR-473)
Fixed a bug that prevented bosco_cluster from detecting the remote host’s platform when it is running Scientific Linux 7. (HTCONDOR-503)
Fixed a bug that caused the delete-krb options of condor_store_cred to fail. This bug also affected the Python bindings
GridJobId is no longer removed from the job ad of grid-type batch jobs when the job enters
Fixed a bug that could prevent HTCondor from noticing new events in job event logs, if those logs were being written from one machine and read from another via AFS. (HTCONDOR-463)
Using expressions for values in the ads of grid universe jobs of type batch now works correctly. (HTCONDOR-507)
Fixed a bug that prevented a personal condor from running in a private user namespace. (HTCONDOR-550)
Fixed a bug that caused the condor_master to hang for up to two minutes when shutting down, if it was configured to be a personal condor. (HTCONDOR-548)
When a grid universe job of type nordugrid fails on the remote system, the local job is now put on hold, instead of being automatically resubmitted. (HTCONDOR-535)
Fixed a bug that caused SSL authentication to crash on rare occasions. (HTCONDOR-428)
Added the missing Ceiling attribute to negotiator user priorities in the Python bindings. (HTCONDOR-560)
Fixed a bug in DAGMan where SUBMIT-DESCRIPTION statements were incorrectly logging duplicate description warnings. (HTCONDOR-511)
Added the libltdl library to the HTCondor tarball. This library was inadvertently omitted when streamlining the build process in version 8.9.12. (HTCONDOR-576)
HTCondor version 9.0.1 released on May 17, 2021.
The installer for Windows will now replace the condor_config file even on an update. You must use condor_config.local or a configuration directory to customize the configuration if you wish to preserve configuration changes across updates.
There is a known issue with the installer for Windows where it does not honor the Administrator Access list set in the MSI permissions dialog on a fresh install. Instead it will always set the Administrator access to the default value.
MUNGE security is temporarily broken.
The Windows MSI installer now sets up user-based authentication and creates an IDTOKEN for local administration. (HTCONDOR-407)
When the AssignAccountingGroup configuration template is in effect and a user submits a job with a requested accounting group that they are not permitted to use, the submit will be rejected with an error message. This configuration template has a new optional second argument that can be used to quietly ignore the requested accounting group instead. (HTCONDOR-426)
HTCondor now parses /usr/share/condor/config.d/ for configuration before /etc/condor/config.d, so that packagers have a convenient place to adjust the HTCondor configuration. (HTCONDOR-45)
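To illustrate why the parse order matters, here is a minimal sketch of ordered configuration merging. It assumes that a later definition of a knob replaces an earlier one and that files within a directory are read in sorted order; it is not HTCondor's parser (macros, includes, and templates are ignored), and the directory stand-ins, file names, and knob names are made up for the example.

```python
import tempfile
from pathlib import Path


def load_config(dirs):
    """Merge KEY = value lines from each directory, in the order given.
    Assumption: a later definition replaces an earlier one."""
    merged = {}
    for d in dirs:
        for path in sorted(Path(d).glob("*")):
            if not path.is_file():
                continue
            for line in path.read_text().splitlines():
                line = line.strip()
                # Skip blanks, comments, and anything that is not KEY = value
                if not line or line.startswith("#") or "=" not in line:
                    continue
                key, _, value = line.partition("=")
                merged[key.strip()] = value.strip()
    return merged


with tempfile.TemporaryDirectory() as root:
    # Stand-ins for /usr/share/condor/config.d and /etc/condor/config.d
    packaged = Path(root, "usr_share")
    local = Path(root, "etc")
    packaged.mkdir()
    local.mkdir()
    (packaged / "50-defaults").write_text("EXAMPLE_KNOB = packaged\nOTHER_KNOB = 1\n")
    (local / "99-site").write_text("EXAMPLE_KNOB = site\n")
    config = load_config([packaged, local])

print(config)
```

Under these assumptions, a packager's defaults apply unless the administrator sets the same knob in the directory that is read later.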
Added a boolean option LOCAL_CREDMON_TOKEN_USE_JSON for the local issuer condor_credmon_oauth that is used to decide whether or not the bare token string in a generated access token file is wrapped in JSON. Default is LOCAL_CREDMON_TOKEN_USE_JSON = true (wrap token in JSON). (HTCONDOR-367)
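A job that consumes the token file may want to accept both forms. The sketch below shows one way to do that in Python; the JSON key name `access_token` is an assumption made for illustration (the note above does not specify the layout), and `wrap_token` and `read_token` are hypothetical helpers, not part of HTCondor.

```python
import json


def wrap_token(token: str, use_json: bool = True) -> str:
    """Return file contents for a token: JSON-wrapped or bare."""
    if use_json:
        # "access_token" is an assumed key name, for illustration only.
        return json.dumps({"access_token": token})
    return token


def read_token(contents: str) -> str:
    """Accept either form and return the bare token string."""
    text = contents.strip()
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return text  # not JSON, so treat the contents as a bare token
    if isinstance(data, dict) and "access_token" in data:
        return data["access_token"]
    return text


token = "aaa.bbb.ccc"  # placeholder, not a real token
from_json = read_token(wrap_token(token, use_json=True))
from_bare = read_token(wrap_token(token, use_json=False))
print(from_json == from_bare == token)
```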
Fixed a bug with jobs that require running on a different machine after a failure by referring to MachineAttrX attributes in their requirements expression. (HTCONDOR-434)
Fixed a bug in the way AutoClusterAttrs was calculated that could cause matchmaking to ignore attributes changed by
Fixed a bug in the implementation of the submit commands success_exit_code which would cause jobs which exited on a signal to go on hold (instead of exiting or being retried). (HTCONDOR-430)
Fixed a memory leak in the job router, usually triggered when job policy expressions cause removal of the job. (HTCONDOR-408)
Fixed some bugs that caused bosco_cluster --add to fail. Allow remote_gahp to work with older Bosco installations via the --rgahp-script option. Fixed security authorization failure between condor_gridmanager and condor_ft-gahp. (HTCONDOR-433) (HTCONDOR-438) (HTCONDOR-451) (HTCONDOR-452) (HTCONDOR-487)
Fixed a bug in condor_submit when a SEC_CREDENTIAL_PRODUCER was configured that could result in condor_submit reporting that the Queue statement of a submit file was missing or invalid. (HTCONDOR-427)
Fixed a bug in the local issuer condor_credmon_oauth where SciTokens version 2.0 tokens were being generated without an “aud” claim. The “aud” claim is now set to LOCAL_ISSUER_TOKEN_AUDIENCE. The “ver” claim can be changed from the default of “scitokens:2.0” by setting
Fixed several bugs that could result in the condor_token_ tools aborting with a C++ runtime error on newer versions of Linux. (HTCONDOR-449)
HTCondor version 9.0.0 released on April 14, 2021.
The installer for Windows platforms was not ready for 9.0.0. Windows support will appear in 9.0.1.
Removed support for CREAM and Unicore grid jobs, glexec privilege separation, DRMAA, and condor_cod.
MUNGE security is temporarily broken.
The bosco_cluster command is temporarily broken.
A new tool, condor_check_config, can be used after an upgrade if you had a working HTCondor configuration before the upgrade. It will report configuration values that should be changed. In this version the tool checks for a few things related to the change to a more secure configuration by default. (HTCONDOR-384)
The condor_gpu_discovery tool now defaults to using -short-uuid form for GPU ids on machines where the CUDA driver library has support for them. A new option -by-index has been added to select index-based GPU ids. (HTCONDOR-145)
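The difference between the two id forms can be sketched as follows. The exact formats are assumptions made for illustration: the short form is taken to be "GPU-" plus the first 8 characters of the device UUID, and the index form is taken to be a "CUDA" prefix followed by the device index. The helper names and the UUIDs are made up and are not condor_gpu_discovery output.

```python
def short_uuid_id(uuid: str) -> str:
    """Assumed short form: 'GPU-' plus the first 8 characters of the UUID."""
    bare = uuid[4:] if uuid.startswith("GPU-") else uuid
    return "GPU-" + bare[:8]


def index_id(index: int, prefix: str = "CUDA") -> str:
    """Assumed index form: a prefix followed by the device index."""
    return f"{prefix}{index}"


# Made-up UUIDs, in the order a driver might enumerate the devices.
uuids = [
    "GPU-c4a646d7-aa14-1dd1-f1b0-57288cda864d",
    "GPU-6a96bd13-70bc-6494-6d62-1b77a9a7f29f",
]
short_ids = [short_uuid_id(u) for u in uuids]
index_ids = [index_id(i) for i in range(len(uuids))]
print(short_ids)
print(index_ids)
```

Under these assumptions, an id derived from the UUID stays attached to the same physical device if the enumeration order changes, while an index-based id follows the position in the list.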
Fixed a bug introduced in 8.9.12 where the condor_job_router inside a CE would crash when evaluating periodic expressions. (HTCONDOR-402)