Upgrading from a 10.0 LTS version to a 23.0 LTS version of HTCondor

Upgrading from a 10.0 LTS version of HTCondor to a 23.0 LTS version will bring new features introduced in the 10.x versions of HTCondor. These new features include the following (note that this list contains only the most significant changes; a full list of changes can be found in the version history: Version 10 Feature Releases):

  • A condor_startd without any slot types defined will now default to a single partitionable slot rather than a number of static slots equal to the number of cores as it was in previous versions. The configuration template use FEATURE : StaticSlots was added for admins wanting the old behavior. (HTCONDOR-2026)

  • In an HTCondor Execution Point started by root on Linux, the default for cgroups memory has changed to be enforcing. This means that jobs that use more than their provisioned memory will be put on hold with an appropriate hold message. The previous default can be restored by setting CGROUP_MEMORY_LIMIT_POLICY = none on the Execution Points. (HTCONDOR-1974)

  • Users can now define DAGMan save points to save the state of a DAG's progress to a file and then re-run the DAG from that saved point of progress. (HTCONDOR-1636)

  • DAGMan gives users much better control over the environment variables present in the DAGMan proper job's environment via condor_submit_dag's new flags (-include_env & -insert_env) and/or the new DAG file description command ENV. (HTCONDOR-1955) (HTCONDOR-1580)

  • Added the condor_qusers command to monitor and control users at the Access Point. Users disabled at the Access Point are no longer allowed to submit jobs. Jobs submitted before the user was disabled are allowed to run to completion. When a user is disabled, an optional reason string can be provided. (HTCONDOR-1723) (HTCONDOR-1853)

  • The condor_negotiator now supports setting a minimum floor number of cores that any given submitter should get, regardless of their fair share. This can be set or queried via the condor_userprio tool, in the same way that the ceiling can be set or queried. (HTCONDOR-557)

  • Added a -gpus option to condor_status. With this option, condor_status will show only machines that have GPUs provisioned, and it will show information about the GPU properties. (HTCONDOR-1958)

  • The output of condor_status when using the -compact option has been improved to show a separate row for the second and subsequent slot types for machines that have multiple slot types. Also, the totals now count slots that have the BackfillSlot attribute under the Backfill or BkIdle columns. (HTCONDOR-1957)

  • Container universe jobs may now specify the container_image to be an image transferred via a file transfer plugin. (HTCONDOR-1820)

  • Support for Enterprise Linux 9, Amazon Linux 2023, and Debian 12. (HTCONDOR-1285) (HTCONDOR-1742) (HTCONDOR-1938)

  • Administrators can specify a new history file for Access Points that records information about a job for each execution attempt. If enabled, this information can be queried via condor_history -epochs. (HTCONDOR-1104)

  • A single HTCondor pool can now have multiple condor_defrag daemons running and they will not interfere with each other so long as each has DEFRAG_REQUIREMENTS that select mutually exclusive subsets of the pool. (HTCONDOR-1903)

  • Added the condor_test_token tool to generate a short-lived SciToken for testing. (HTCONDOR-1115)

  • The job’s executable is no longer renamed to condor_exec.exe. (HTCONDOR-1227)
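
Two of the items above name configuration settings that restore earlier Execution Point behavior. As a minimal sketch, assuming these lines are placed in a local configuration file on each affected Execution Point (the file location is site-specific and not stated here), the settings named in the list would look like this:

```
# Sketch only: Execution Point configuration using the knobs named above.

# Restore static slots (one per core) instead of the new default of a
# single partitionable slot. (HTCONDOR-2026)
use FEATURE : StaticSlots

# Restore the previous, non-enforcing cgroups memory default. (HTCONDOR-1974)
CGROUP_MEMORY_LIMIT_POLICY = none
```

Either line can be used on its own; sites that are satisfied with the new defaults need neither.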

Upgrading from a 10.0 LTS version of HTCondor to a 23.0 LTS version will also introduce changes that administrators and users of sites running an older HTCondor version should be aware of when planning an upgrade. The items below are the ones administrators should note. To see whether any of them will affect an upgrade, run condor_upgrade_check.

  • HTCondor will no longer pass all environment variables to the DAGMan proper manager job's environment. This may cause DAGMan and its various parts (primarily PRE, POST, & HOLD scripts) to start failing or to change behavior because needed environment variables are missing. To revert to the old behavior, or to add the missing environment variables to the DAGMan proper job, set the DAGMAN_MANAGER_JOB_APPEND_GETENV configuration option. (HTCONDOR-1580)

  • We added the ability for the condor_schedd to track users over time. Once you have upgraded to HTCondor 23, you may no longer downgrade to a version before HTCondor 10.5.0 or HTCondor 10.0.4 LTS. (HTCONDOR-1432)

  • Execution Points without any administrator-defined slot configuration will now default to creating and utilizing one partitionable slot. This causes Startd RANK expressions to have no effect. To revert an Execution Point to use static slots, add use FEATURE:StaticSlots to the Execution Point configuration. (HTCONDOR-2026)

  • The configuration expression constant CpuBusyTime no longer represents a time delta but rather a timestamp of when the CPU became busy. The new expression constant CpuBusyTimer now represents the time delta of how long a CPU has been busy. (HTCONDOR-1502)

  • The configuration expression constants ActivationTimer, ConsoleBusy, CpuBusy, CpuIdle, JustCPU, KeyboardBusy, KeyboardNotBusy, LastCkpt, MachineBusy, and NonCondorLoadAvg no longer exist by default for configuration expressions. To re-enable these constants either add use FEATURE:POLICY_EXPR_FRAGMENTS or one of the desktop policies to the configuration. (HTCONDOR-1502)

  • The job router configuration macros JOB_ROUTER_DEFAULTS, JOB_ROUTER_ENTRIES, JOB_ROUTER_ENTRIES_FILE, and JOB_ROUTER_ENTRIES_CMD are deprecated and will be removed during the lifetime of the HTCondor V23 feature series. (HTCONDOR-1968)
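
Several of the items above can be addressed with a configuration change. The fragment below is a hedged sketch rather than a recommended configuration. It assumes that DAGMAN_MANAGER_JOB_APPEND_GETENV accepts either the value true (pass the whole environment, as before) or a comma-separated list of variable names; the variable names shown in the commented-out alternative are hypothetical. The DAGMan setting belongs on the Access Point and the policy fragment setting on the Execution Point.

```
# Sketch only: settings for reverting behaviors changed in HTCondor 23.0.

# Access Point: pass the submitter's environment to the DAGMan proper job
# again. (HTCONDOR-1580)
# Assumption: "true" restores the old pass-everything behavior.
DAGMAN_MANAGER_JOB_APPEND_GETENV = true
# Assumed narrower alternative, with hypothetical variable names:
# DAGMAN_MANAGER_JOB_APPEND_GETENV = MY_SITE_VAR, MY_OTHER_VAR

# Execution Point: re-enable the expression constants such as CpuBusy,
# KeyboardBusy, and MachineBusy for use in policy expressions.
# (HTCONDOR-1502)
use FEATURE : POLICY_EXPR_FRAGMENTS
```

Running condor_upgrade_check before applying any of these is the way to tell which, if any, are needed at a given site.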