htcondor
API Reference
This page is an exhaustive reference of the API exposed by the htcondor
module. It is not meant to be a tutorial for new users but rather a helpful
guide for those who already understand the basic usage of the module.
Interacting with Collectors
- class htcondor.Collector(pool)
Client object for a remote condor_collector. The Collector can be used to:
Locate a daemon.
Query the condor_collector for one or more specific ClassAds.
Advertise a new ad to the condor_collector.
- Parameters
pool (str or list[str]) – A host:port pair specified for the remote collector (or a list of pairs for HA setups). If omitted, the value of configuration parameter COLLECTOR_HOST is used.
- locate(daemon_type, name) object :
Query the condor_collector for a particular daemon.
- Parameters
daemon_type (DaemonTypes) – The type of daemon to locate.
name (str) – The name of the daemon to locate. If not specified, it searches for the local daemon.
- Returns
A minimal ClassAd of the requested daemon, sufficient only to contact the daemon; typically, this is limited to the MyAddress attribute.
- Return type
ClassAd
- locateAll(daemon_type) object :
Query the condor_collector daemon for all ClassAds of a particular type. Returns a list of matching ClassAds.
- Parameters
daemon_type (DaemonTypes) – The type of daemon to locate.
- Returns
Matching ClassAds
- Return type
list[ClassAd]
- query(ad_type=htcondor.htcondor.AdTypes.Any, constraint='', projection=[], statistics='') object :
Query the contents of a condor_collector daemon. Returns a list of ClassAds that match the constraint parameter.
- Parameters
ad_type (AdTypes) – The type of ClassAd to return. If not specified, the type will be ANY_AD.
constraint (str or ExprTree) – A constraint for the collector query; only ads matching this constraint are returned. If not specified, all matching ads of the given type are returned.
projection (list[str]) – A list of attributes to use for the projection. Only these attributes, plus a few server-managed ones, are returned in each ClassAd.
statistics (list[str]) – Statistics attributes to include, if they exist for the specified daemon.
- Returns
A list of matching ads.
- Return type
list[ClassAd]
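As an illustration of combining ad_type, constraint, and projection in Collector.query(), here is a minimal sketch. The helper name unclaimed_slot_names and the use of the State attribute in the constraint are assumptions made for this example; the collector object and the ad type are passed in so the helper itself does not depend on how the pool is reached.

```python
def unclaimed_slot_names(collector, slot_ad_type):
    """Return the sorted Name of every unclaimed slot known to the collector.

    `collector` is expected to behave like htcondor.Collector and
    `slot_ad_type` like htcondor.AdTypes.Startd (hypothetical helper).
    """
    ads = collector.query(
        slot_ad_type,
        constraint='State == "Unclaimed"',
        projection=["Name", "State"],
    )
    # The collector may add a few server-managed attributes to each ad,
    # so only the attribute we asked for is read here.
    return sorted(ad["Name"] for ad in ads)


def main():
    # Assumes the htcondor bindings are installed and a pool is reachable.
    import htcondor

    collector = htcondor.Collector()  # no pool given: uses COLLECTOR_HOST
    for name in unclaimed_slot_names(collector, htcondor.AdTypes.Startd):
        print(name)
```

Call main() on a machine with a working pool; requesting a narrow projection keeps the amount of data transferred from the collector small.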
- directQuery(daemon_type, name='', projection=[], statistics='') object :
Query the specified daemon directly for a ClassAd, instead of using the ClassAd from the condor_collector daemon. Requires the client library to first locate the daemon in the collector, then query the remote daemon.
- Parameters
daemon_type (DaemonTypes) – Specifies the type of the remote daemon to query.
name (str) – Specifies the daemon’s name. If not specified, the local daemon is used.
projection (list[str]) – A list of requested attributes, used to obtain only a subset of the attributes from the daemon’s ClassAd.
statistics (str) – Statistics attributes to include, if they exist for the specified daemon.
- Returns
The ad of the specified daemon.
- Return type
ClassAd
- advertise(ad_list, command='UPDATE_AD_GENERIC', use_tcp=True) None :
Advertise a list of ClassAds into the condor_collector.
- Parameters
ad_list (list[ClassAds]) – ClassAds to advertise.
command (str) – An advertise command for the remote condor_collector. It defaults to UPDATE_AD_GENERIC. Other commands, such as UPDATE_STARTD_AD, may require different authorization levels with the remote daemon.
use_tcp (bool) – When set to True, updates are sent via TCP. Defaults to True.
- class htcondor.DaemonTypes
An enumeration of different types of daemons available to HTCondor.
The values of the enumeration are:
- None
- Any
Any type of daemon; useful when specifying queries where all matching daemons should be returned.
- Master
Ads representing the condor_master.
- Schedd
Ads representing the condor_schedd.
- Startd
Ads representing the resources on a worker node.
- Collector
Ads representing the condor_collector.
- Negotiator
Ads representing the condor_negotiator.
- HAD
Ads representing the high-availability daemons (condor_had).
- Generic
All other ads that are not categorized as above.
- Credd
- class htcondor.AdTypes
A list of different types of ads that may be kept in the condor_collector.
The values of the enumeration are:
- None
- Any
Type representing any matching ad. Useful for queries that match everything in the collector.
- Generic
Generic ads, associated with no particular daemon.
- Slot
Slot ads produced by the condor_startd daemon. Represents the available slots managed by the startd.
- StartDaemon
Daemon ads produced by the condor_startd daemon. There is only a single daemon ad for each STARTD. Daemon ads are used for monitoring and location requests, but not for running jobs.
- Startd
Ads produced by the condor_startd daemon. Usually represents the available slots managed by the startd, but may indicate STARTD daemon ads. Use Slot or StartDaemon enum values to be explicit about which type of ads.
- StartdPrivate
The “private” ads, containing the claim IDs associated with a particular slot. These require additional authorization to read as the claim ID may be used to run jobs on the slot.
- Schedd
Schedd ads, produced by the condor_schedd daemon.
- Master
Master ads, produced by the condor_master daemon.
- Collector
Ads from the condor_collector daemon.
- Negotiator
Negotiator ads, produced by the condor_negotiator daemon.
- Submitter
Ads describing the submitters with available jobs to run; produced by the condor_schedd and read by the condor_negotiator to determine which users need a new negotiation cycle.
- Grid
Ads associated with the grid universe.
- HAD
Ads produced by the condor_had.
- License
License ads. These do not appear to be used by any modern HTCondor daemon.
- Credd
- Defrag
- Accounting
Interacting with Schedulers
- class htcondor.Schedd(location_ad)
Client object for a condor_schedd.
- Parameters
location_ad (ClassAd or DaemonLocation) – A ClassAd describing the location of the remote condor_schedd daemon, as returned by the Collector.locate() method, or a tuple of type DaemonLocation as returned by Schedd.location(). If the parameter is omitted, the local condor_schedd daemon is used.
- transaction()
Warning
Schedd.transaction() was deprecated in version 10.7.0 and will be removed in a future release. Use Schedd.submit() instead.
transaction( (Schedd)self [, (TransactionFlags)flags=0 [, (bool)continue_txn=False]]) -> Transaction :
This method is DEPRECATED. Use Schedd.submit() instead.
Start a transaction with the condor_schedd.
Starting a new transaction while one is ongoing is an error unless the continue_txn flag is set.
- param flags
Flags controlling the behavior of the transaction, defaulting to 0.
- type flags
TransactionFlags
- param bool continue_txn
Set to True if you would like this transaction to extend any pre-existing transaction; defaults to False. If this is not set, starting a transaction inside a pre-existing transaction will cause an exception to be thrown.
- return
A Transaction object.
- query(constraint='true', projection=[], callback=None, limit=-1, opts=htcondor.htcondor.QueryOpts.Default) object :
Query the condor_schedd daemon for job ads. Job ads may be quite large and there may be tens of thousands of them, so you may want to specify a projection. In memory-constrained environments, you may also need to impose a strict constraint and make more than one query.
- Parameters
constraint (str or classad.classad.ExprTree) – A query constraint. Only jobs matching this constraint will be returned. Defaults to 'true', which means all jobs will be returned.
projection (list[str]) – Attributes that will be returned for each job in the query. At least the attributes in this list will be returned, but additional ones may be returned as well. An empty list (the default) returns all attributes.
callback – A callable object; if provided, it will be invoked for each ClassAd. The return value (if not None) will be added to the returned list instead of the ad.
limit (int) – The maximum number of ads to return; the default (-1) is to return all ads.
opts (QueryOpts) – Additional flags for the query; these may affect the behavior of the condor_schedd.
- Returns
ClassAds representing the matching jobs.
- Return type
list[ClassAd]
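The advice above about projections and limits can be sketched as follows. The helper name idle_job_ids is hypothetical, and the constraint relies on the convention that a JobStatus of 1 means Idle; the schedd object is passed in so the helper can be exercised without a running pool.

```python
def idle_job_ids(schedd, limit=100):
    """Return up to `limit` idle jobs as 'cluster.proc' strings.

    `schedd` is expected to behave like htcondor.Schedd (hypothetical helper).
    """
    ads = schedd.query(
        constraint="JobStatus == 1",          # 1 means Idle
        projection=["ClusterId", "ProcId"],   # keep each returned ad small
        limit=limit,
    )
    return ["{}.{}".format(ad["ClusterId"], ad["ProcId"]) for ad in ads]


def main():
    # Assumes the htcondor bindings are installed and a local schedd exists.
    import htcondor

    for job_id in idle_job_ids(htcondor.Schedd(), limit=20):
        print(job_id)
```

Because only two attributes are projected, the memory cost per job stays low even when the queue holds many large ads.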
- xquery()
Warning
Schedd.xquery() was deprecated in version 10.7.0 and will be removed in a future release.
xquery( (Schedd)self [, (object)constraint='true' [, (list)projection=[] [, (int)limit=-1 [, (QueryOpts)opts=htcondor.htcondor.QueryOpts.Default [, (object)name=None]]]]]) -> QueryIterator :
Warning
This function is deprecated.
Query the condor_schedd daemon for job ads.
Warning
This returns an iterator of ClassAd objects, which means you may not need to hold all of the ads returned by the query in memory simultaneously. However, this method holds a connection open to the schedd, and a fork of the schedd will remain active, until you finish iterating. If you are not retrieving many large ads, consider using query() instead to reduce load on the schedd.
- param constraint
A query constraint. Only jobs matching this constraint will be returned. Defaults to 'true', which means all jobs will be returned.
- type constraint
str or ExprTree
- param projection
Attributes that will be returned for each job in the query. At least the attributes in this list will be returned, but additional ones may be returned as well. An empty list (the default) returns all attributes.
- type projection
list[str]
- param int limit
A limit on the number of matches to return. The default (-1) indicates all matching jobs should be returned.
- param opts
Additional flags for the query, from QueryOpts.
- type opts
QueryOpts
- param str name
A tag name for the returned query iterator. This string will always be returned from the QueryIterator.tag() method of the returned iterator. The default value is the condor_schedd’s name. This tag is useful to identify different queries when using the poll() function.
- return
An iterator for the matching job ads
- rtype
QueryIterator
- act(action, job_spec, reason=None) object :
Change status of job(s) in the condor_schedd daemon. The return value is a ClassAd object describing the number of jobs changed.
This will throw an exception if no jobs are matched by the constraint.
- Parameters
action (JobAction) – The action to perform; must be of the enum JobAction.
job_spec (list[str] or str) – The job specification. It can either be a list of job IDs, or an ExprTree or string specifying a constraint. Only jobs matching this description will be acted upon.
reason (str) – The reason for the action. If omitted, the reason will be “Python-initiated action”.
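A minimal sketch of calling act() with a constraint-style job_spec follows. The helper name hold_cluster and the default reason text are assumptions for this example; the action value (for instance htcondor.JobAction.Hold) is supplied by the caller.

```python
def hold_cluster(schedd, hold_action, cluster_id, reason="Held by example script"):
    """Put every job of one cluster on hold and return the result ad.

    `schedd` is expected to behave like htcondor.Schedd and `hold_action`
    like htcondor.JobAction.Hold (hypothetical helper).
    """
    # int() guards against building a constraint from arbitrary text.
    constraint = "ClusterId == {}".format(int(cluster_id))
    # act() raises an exception if the constraint matches no jobs, so
    # callers may want to wrap this call in try/except.
    return schedd.act(hold_action, constraint, reason)


def main():
    # Assumes the htcondor bindings are installed and cluster 1234 exists.
    import htcondor

    result_ad = hold_cluster(htcondor.Schedd(), htcondor.JobAction.Hold, 1234)
    print(result_ad)
```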
- edit(job_spec, attr, value, flags=0) EditResult :
Edit one or more jobs in the queue.
This will throw an exception if no jobs are matched by the job_spec constraint.
- Parameters
job_spec (list[str] or str) – The job specification. It can either be a list of job IDs or a string specifying a constraint. Only jobs matching this description will be acted upon.
attr (str) – The name of the attribute to edit.
value (str or ExprTree) – The new value of the attribute. It should be a string, which will be converted to a ClassAd expression, or an ExprTree object. Be mindful of quoting issues; to set the value to the string foo, one would set the value to '"foo"', so that the ClassAd expression itself contains the double quotes.
flags (TransactionFlags) – Flags controlling the behavior of the transaction, defaulting to 0.
- Returns
An EditResult containing the number of jobs that were edited.
- Return type
EditResult
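The quoting issue described for the value parameter can be sketched like this. The helper classad_string is a simplified, hypothetical stand-in for classad.quote() (used elsewhere on this page) so the example runs without the bindings; set_string_attr is likewise a name invented for illustration.

```python
def classad_string(text):
    """Wrap `text` in double quotes so it parses as a ClassAd string literal.

    Simplified stand-in for classad.quote(): it escapes backslashes and
    double quotes only.
    """
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def set_string_attr(schedd, job_spec, attr, text):
    """Set a string-valued attribute on the jobs matched by `job_spec`.

    `schedd` is expected to behave like htcondor.Schedd.
    """
    # Passing the bare text would be parsed as an attribute reference or
    # expression, not as a string, hence the explicit quoting.
    return schedd.edit(job_spec, attr, classad_string(text))


def main():
    # Assumes the htcondor bindings are installed and job 1234.0 exists.
    import htcondor

    set_string_attr(htcondor.Schedd(), ["1234.0"], "MyNote", "foo")
```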
- history(constraint, projection, match=-1, since=None) HistoryIterator :
Fetch history records from the condor_schedd daemon.
- Parameters
constraint (str or ExprTree) – A query constraint. Only jobs matching this constraint will be returned. None will return all jobs.
projection (list[str]) – Attributes that will be returned for each job in the query. At least the attributes in this list will be returned, but additional ones may be returned as well. An empty list returns all attributes.
match (int) – A limit on the number of jobs to include; the default (-1) indicates to return all matching jobs. The schedd may return fewer than match jobs because of its setting of HISTORY_HELPER_MAX_HISTORY (default 10,000).
since (int, str, or ExprTree) – A cluster ID, job ID, or expression. If a cluster ID (passed as an int) or job ID (passed as a str in the format {clusterID}.{procID}), only jobs recorded in the history file after (and not including) the matching ID will be returned. If an expression (passed as a str or ExprTree), jobs will be returned, most-recently-recorded first, until the expression becomes true; the job making the expression become true will not be returned. Thus, 1038 and clusterID == 1038 return the same set of jobs.
- Returns
All matching ads in the Schedd history, with attributes according to the projection keyword.
- Return type
HistoryIterator
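As a rough sketch of how constraint, projection, and match fit together in Schedd.history(), consider the helper below. The name recent_exit_codes and the choice of the Owner and ExitCode attributes are assumptions for illustration; records lacking ExitCode yield None.

```python
def recent_exit_codes(schedd, owner, max_jobs=10):
    """Return (ClusterId, ProcId, ExitCode) for up to `max_jobs` history records.

    `schedd` is expected to behave like htcondor.Schedd (hypothetical helper).
    """
    # Assumes `owner` is a plain user name without quote characters.
    constraint = 'Owner == "{}"'.format(owner)
    records = schedd.history(
        constraint,
        ["ClusterId", "ProcId", "ExitCode"],
        match=max_jobs,
    )
    # The result is an iterator, so it is consumed exactly once here.
    return [(ad["ClusterId"], ad["ProcId"], ad.get("ExitCode")) for ad in records]


def main():
    # Assumes the htcondor bindings are installed and a local schedd exists.
    import htcondor

    for row in recent_exit_codes(htcondor.Schedd(), "alice", max_jobs=5):
        print(row)
```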
- jobEpochHistory(constraint, projection, match=-1, since=None, ad_type=None) HistoryIterator :
Fetch per job run instance (epoch) history records from the condor_schedd daemon.
- Parameters
constraint (str or ExprTree) – A query constraint. Only jobs matching this constraint will be returned. None will return all jobs.
projection (list[str]) – Attributes that will be returned for each job in the query. At least the attributes in this list will be returned, but additional ones may be returned as well. An empty list returns all attributes.
match (int) – A limit on the number of jobs to include; the default (-1) indicates to return all matching jobs. The schedd may return fewer than match jobs because of its setting of HISTORY_HELPER_MAX_HISTORY (default 10,000).
since (int, str, or ExprTree) – A cluster ID, job ID, or expression. If a cluster ID (passed as an int) or job ID (passed as a str in the format {clusterID}.{procID}), only jobs recorded in the history file after (and not including) the matching ID will be returned. If an expression (passed as a str or ExprTree), jobs will be returned, most-recently-recorded first, until the expression becomes true; the job making the expression become true will not be returned. Thus, 1038 and clusterID == 1038 return the same set of jobs.
ad_type – DEPRECATED. Comma-separated string of history ad types to return. If None, normal job ClassAds are returned. Default None.
- Returns
All matching ads in the Schedd history, with attributes according to the projection keyword.
- Return type
HistoryIterator
- submit(description, count=1, spool=False, ad_results=None, itemdata=None) object :
Submit one or more jobs to the condor_schedd daemon.
This method requires the invoker to provide a Submit object that describes the jobs to submit. The return value will be a SubmitResult that contains the cluster ID and ClassAd of the submitted jobs.
For backward compatibility, this method will also accept a ClassAd that describes a single job to submit, but use of this form is DEPRECATED. If the deprecated form is used, the return value will be the cluster ID, and ad_results will optionally be the actual job ClassAds that were submitted.
- Parameters
description (Submit (or DEPRECATED ClassAd)) – The Submit description or ClassAd describing the job cluster.
count (int) – The number of jobs to submit to the job cluster. Defaults to 1.
spool (bool) – If True, jobs will be submitted in a spooling hold mode so that input files can be spooled to a remote condor_schedd daemon before starting the jobs. This parameter is necessary for jobs submitted to a remote condor_schedd that use HTCondor file transfer. When True, the job will be left in the HOLD state until the spool() method is called.
ad_results (list[ClassAd]) – DEPRECATED. If set to a list and a raw job ClassAd is passed as the first argument, the list object will contain the job ads that were submitted.
- Returns
A SubmitResult, containing the cluster ID, cluster ClassAd, and range of job IDs of the submitted job(s). If using the deprecated first argument, the return value will be an int and ad_results may contain the submitted jobs’ ClassAds.
- Return type
SubmitResult or int
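The non-deprecated submission path can be sketched as follows. The helper names sleep_description and submit_description are hypothetical, and the example assumes that the returned SubmitResult exposes the cluster ID through a cluster() method; the schedd and Submit object are passed in so the helper does not itself contact a pool.

```python
def sleep_description(seconds, log_name="sleep-$(ClusterId).log"):
    """Build key = value pairs for a /bin/sleep job; all values are strings."""
    return {
        "executable": "/bin/sleep",
        "arguments": "{}s".format(int(seconds)),
        "log": log_name,
    }


def submit_description(schedd, submit_obj, count=1):
    """Submit `count` jobs and return the new cluster ID.

    `schedd` is expected to behave like htcondor.Schedd and `submit_obj`
    like htcondor.Submit (hypothetical helper).
    """
    result = schedd.submit(submit_obj, count=count)
    # Assumption: the SubmitResult offers cluster() to read the cluster ID.
    return result.cluster()


def main():
    # Assumes the htcondor bindings are installed and a local schedd exists.
    import htcondor

    sub = htcondor.Submit(sleep_description(5))
    print("submitted cluster", submit_description(htcondor.Schedd(), sub, count=3))
```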
- submitMany(cluster_ad, proc_ads, spool=False, ad_results=None) int :
Submit multiple jobs to the condor_schedd daemon, possibly including several distinct processes.
- Parameters
cluster_ad (ClassAd) – The base ad for the new job cluster; this is the same format as in the submit() method.
proc_ads (list) – A list of 2-tuples; each tuple has the format of (proc_ad, count). For each list entry, this will result in count jobs being submitted inheriting from both cluster_ad and proc_ad.
spool (bool) – If True, the client inserts the necessary attributes into the job for it to have the input files spooled to a remote condor_schedd daemon. This parameter is necessary for jobs submitted to a remote condor_schedd that use HTCondor file transfer. When True, the job will be left in the HOLD state until the spool() method is called.
ad_results (list[ClassAd]) – If set to a list, the list object will contain the job ads resulting from the job submission.
- Returns
The newly created cluster ID.
- Return type
int
- spool(ad_list) None :
Spools the files specified in a list of job ClassAds to the condor_schedd.
- Parameters
ad_list (list[ClassAds]) – A list of job descriptions; typically, this is the list returned by the jobs() method on the submit result object.
- Raises
RuntimeError – if there are any errors.
- retrieve(arg1, arg2) None :
Retrieve the output sandbox from one or more jobs.
- Parameters
job_spec (str or list[ClassAd]) – An expression matching the list of job output sandboxes to retrieve.
- refreshGSIProxy(cluster, proc, proxy_filename, lifetime) int :
Refresh the GSI proxy of a job; the job’s proxy will be replaced with the contents of the provided proxy_filename.
Note
Depending on the lifetime of the proxy in proxy_filename, the resulting lifetime may be shorter than the desired lifetime.
- Parameters
cluster (int) – Cluster ID of the job to alter.
proc (int) – Process ID of the job to alter.
proxy_filename (str) – The name of the file containing the new proxy for the job.
lifetime (int) – Indicates the desired lifetime (in seconds) of the delegated proxy. A value of 0 specifies to not shorten the proxy lifetime. A value of -1 specifies to use the value of configuration variable DELEGATE_JOB_GSI_CREDENTIALS_LIFETIME.
- reschedule() None :
Send reschedule command to the schedd.
- export_jobs(job_spec, export_dir, new_spool_dir) object :
Export one or more job clusters from the queue to put those jobs into the externally managed state.
- Parameters
job_spec (list[str] or str or ExprTree) – The job specification. It can either be a list of job IDs or a string specifying a constraint. Only jobs matching this description will be acted upon.
export_dir (str) – The path to the directory that exported jobs will be written into.
new_spool_dir (str) – The path to the base directory that exported jobs will use as IWD while they are exported.
- Returns
A ClassAd containing information about the export operation.
- Return type
ClassAd
- import_exported_job_results(import_dir) object :
Import results from previously exported jobs, and take those jobs back out of the externally managed state.
- unexport_jobs(job_spec) object :
Unexport one or more job clusters that were previously exported from the queue.
- class htcondor.JobAction
An enumeration describing the actions that may be performed on a job in queue.
The values of the enumeration are:
- Hold
Put a job on hold, vacating a running job if necessary. A job will stay in the hold state until explicitly acted upon by the admin or owner.
- Release
Release a job from the hold state, returning it to Idle.
- Suspend
Suspend the processes of a running job (on Unix platforms, this triggers a SIGSTOP). The job’s processes stay in memory but no longer get scheduled on the CPU.
- Continue
Continue a suspended job (on Unix, SIGCONT). The processes in a previously suspended job will be scheduled to get CPU time again.
- Remove
Remove a job from the Schedd’s queue, cleaning it up first on the remote host (if running). This requires the remote host to acknowledge it has successfully vacated the job, meaning Remove may not be instantaneous.
- RemoveX
Immediately remove a job from the schedd queue, even if it means the job is left running on the remote resource.
- Vacate
Cause a running job to be killed on the remote resource and return to idle state. With Vacate, jobs may be given significant time to cleanly shut down.
- VacateFast
Vacate a running job as quickly as possible, without providing time for the job to cleanly terminate.
- class htcondor.Transaction
Warning
Transaction was deprecated in version 10.7.0 and will be removed in a future release.
DEPRECATED. An ongoing transaction in the HTCondor schedd.
- class htcondor.TransactionFlags
Enumerated flags affecting the characteristics of a transaction.
The values of the enumeration are:
- NonDurable
Non-durable transactions are changes that may be lost when the condor_schedd crashes. NonDurable is used for performance, as it eliminates extra fsync() calls.
- SetDirty
This marks the changed ClassAds as dirty, causing an update notification to be sent to the condor_shadow and the condor_gridmanager, if they are managing the job.
- ShouldLog
Causes any changes to the job queue to be logged in the relevant job event log.
- class htcondor.QueryOpts
Enumerated flags sent to the condor_schedd during a query to alter its behavior.
The values of the enumeration are:
- Default
Queries should use default behaviors, and return jobs for all users.
- AutoCluster
Instead of returning job ads, return an ad per auto-cluster.
- GroupBy
Instead of returning job ads, return an ad for each unique combination of values for the attributes in the projection. Similar to AutoCluster, but using the projection as the significant attributes for auto-clustering.
- DefaultMyJobsOnly
Queries should use all default behaviors, and return jobs only for the current user.
- SummaryOnly
Instead of returning job ads, return only the final summary ad.
- IncludeClusterAd
Query should return raw cluster ads as well as job ads if the cluster ads match the query constraint.
- IncludeJobsetAds
Query should return raw jobset ads as well as job ads if the jobset ads match the query constraint.
- ClusterAds
Query should return only raw cluster ads that match the query constraint.
- JobsetAds
Query should return only raw jobset ads that match the query constraint.
- class htcondor.BlockingMode
An enumeration that controls the behavior of query iterators once they are out of data.
The values of the enumeration are:
- Blocking
Sets the iterator to block until more data is available.
- NonBlocking
Sets the iterator to return immediately if additional data is not available.
- class htcondor.HistoryIterator
An iterator over ads in the history produced by Schedd.history().
- class htcondor.QueryIterator
An iterator class for managing results of the Schedd.query() and Schedd.xquery() methods.
- nextAdsNonBlocking() list :
Retrieve as many ads as are available to the iterator object.
If no ads are available, returns an empty list. Does not throw an exception if no ads are available or the iterator is finished.
- Returns
Zero-or-more job ads.
- Return type
list[ClassAd]
- tag() str :
Retrieve the tag associated with this iterator; when using the poll() method, this is useful to distinguish multiple iterators.
- Returns
The query’s tag.
- watch() int :
Returns an inotify-based file descriptor; if this descriptor is given to a select() instance, select will indicate this file descriptor is ready to read whenever there are more jobs ready on the iterator.
If inotify is not available on this platform, this will return -1.
- Returns
A file descriptor associated with this query.
- Return type
int
- htcondor.poll(queries, timeout_ms=20000) BulkQueryIterator :
Wait on the results of multiple query iterators.
This function returns an iterator which yields the next ready query iterator. The returned iterator stops when all results have been consumed for all iterators.
- Parameters
active_queries (list[QueryIterator]) – Query iterators as returned by xquery().
- Returns
An iterator producing the ready QueryIterator.
- Return type
BulkQueryIterator
- class htcondor.BulkQueryIterator
Returned by poll(), this iterator produces a sequence of QueryIterator objects that have ads ready to be read in a non-blocking manner.
Once there are no additional available iterators, poll() must be called again.
Submitting Jobs
- class htcondor.Submit
An object representing a job submit description. It uses the same submit language as condor_submit.
The submit description contains key = value pairs and implements the Python dictionary protocol, including the get, setdefault, update, keys, items, and values methods. Values in the submit description language have no data type; they are all stored as strings.
object __init__(tuple args, dict kwds) :
- param input
Submit descriptors as a string containing the text of a submit file, as key = value pairs in a dictionary, or as keyword arguments. Only the single multi-line string form can contain a QUEUE statement.
For example, these calls all produce identical submit descriptions:

    import classad
    import htcondor

    from_file = htcondor.Submit(
        """
        executable = /bin/sleep
        arguments = 5s
        log = $(ClusterId).log
        My.CustomAttribute = "foobar"
        """
    )

    # create an empty submit object, then populate it as a dict
    # use of classad.quote here ensures that the value is properly escaped as a classad string
    submit_dict = htcondor.Submit()
    submit_dict["executable"] = "/bin/sleep"
    submit_dict["arguments"] = "5s"
    submit_dict["log"] = "$(ClusterId).log"
    submit_dict["My.CustomAttribute"] = classad.quote("foobar")

    # initialize a submit object from a python dict
    # note that values should be strings
    mydict = {
        "executable": "/bin/sleep",
        "arguments": "5s",
        "log": "$(ClusterId).log",
        "My.CustomAttribute": classad.quote("foobar"),
    }
    from_dict = htcondor.Submit(mydict)

    # initialize a submit object from keyword arguments
    # the **{} is a trick to get a keyword argument that contains a .
    from_kwargs = htcondor.Submit(
        executable="/bin/sleep",
        arguments="5s",
        log="$(ClusterId).log",
        **{"My.CustomAttribute": classad.quote("foobar")}
    )

If a string initializer is used, it may include a single condor_submit QUEUE statement at the end. If input is omitted, the submit description is initially empty.
The arguments to the QUEUE statement will be stored in the QArgs member of this class and can be passed to Schedd.submit() as the itemdata iterator like this:

    sub = htcondor.Submit(
        """
        executable = /bin/sleep
        QUEUE arguments in (1s, 10s, 5m)
        """
    )
    schedd = htcondor.Schedd()
    schedd.submit(sub, count=1, itemdata=sub.itemdata())

- type input
dict or str
- queue()
Warning
Submit.queue() was deprecated in version 10.7.0 and will be removed in a future release. Use Schedd.submit() instead.
queue( (Submit)self, (Transaction)txn [, (int)count=0 [, (object)ad_results=None]]) -> int :
This method is DEPRECATED. Use Schedd.submit() instead.
Submit the current object to a remote queue.
- param txn
An active transaction object (see Schedd.transaction()).
- type txn
Transaction
- param int count
The number of jobs to create (defaults to 0). If not specified, or if a value of 0 is given, the QArgs member of this class is used to determine the number of procs to submit. If no QArgs were specified, one job is submitted.
- param ad_results
A list to receive the ClassAd resulting from this submit. As with Schedd.submit(), this is often used to later spool the input files.
- return
The ClusterID of the submitted job(s).
- rtype
int
- raises RuntimeError
if the submission fails.
- queue_with_itemdata()
Warning
Submit.queue_with_itemdata() was deprecated in version 10.7.0 and will be removed in a future release. Use Schedd.submit() instead.
queue_with_itemdata( (Submit)self, (Transaction)txn [, (int)count=1 [, (object)itemdata=None [, (bool)spool=False]]]) -> SubmitResult :
This method is DEPRECATED. Use Schedd.submit() instead.
Submit the current object to a remote queue.
- param txn
An active transaction object (see Schedd.transaction()).
- type txn
Transaction
- param int count
A queue count for each item from the iterator, defaults to 1.
- param itemdata
An iterator of strings or dictionaries containing the itemdata for each job, as in queue in or queue from.
- param bool spool
Modify the job ClassAds to indicate that the jobs should wait for input before starting. Defaults to False.
- return
A SubmitResult, containing the cluster ID, cluster ClassAd, and range of job IDs of the submitted job(s).
- rtype
SubmitResult
- raises RuntimeError
if the submission fails.
- expand(attr) str :
Expand all macros for the given attribute.
- jobs(count=0, itemdata=None, clusterid=1, procid=0, qdate=0, owner='') SubmitJobsIterator :
Turn the current object into a sequence of simulated job ClassAds.
- Parameters
count (int) – the queue count for each item in the itemdata list, defaults to 1
itemdata – an iterator of strings or dictionaries containing the itemdata for each job, e.g. ‘queue in’ or ‘queue from’
clusterid (int) – the value to use for ClusterId when making job ads, defaults to 1
procid (int) – the initial value for ProcId when making job ads, defaults to 0
qdate (str) – a UNIX timestamp value for the QDATE attribute of the jobs, 0 means use the current time.
owner (str) – a string value for the Owner attribute of the job
- Returns
An iterator for the resulting job ads.
- Raises
RuntimeError – if valid job ads cannot be made
- procs(count=0, itemdata=None, clusterid=1, procid=0, qdate=0, owner='') SubmitJobsIterator :
Turn the current object into a sequence of simulated job proc ClassAds. The first ClassAd will be the cluster ad plus a ProcId attribute.
- Parameters
count (int) – the queue count for each item in the itemdata list, defaults to 1
itemdata – an iterator of strings or dictionaries containing the foreach data, e.g. ‘queue in’ or ‘queue from’
clusterid (int) – the value to use for ClusterId when making job ads, defaults to 1
procid (int) – the initial value for ProcId when making job ads, defaults to 0
qdate (str) – a UNIX timestamp value for the QDATE attribute of the jobs, 0 means use the current time.
owner (str) – a string value for the Owner attribute of the job
- Returns
An iterator for the resulting job ads.
- Raises
RuntimeError – if valid job ads cannot be made
- itemdata(qargs='') QueueItemsIterator :
Create an iterator over itemdata derived from a queue statement.
For example, itemdata("matching *.dat") would return an iterator of filenames that end in .dat from the current directory. This is the same iterator used by condor_submit when processing QUEUE statements.
- Parameters
qargs (str) – a submit file queue statement, or the arguments to a submit file queue statement.
- Returns
An iterator for the resulting items
- getQArgs() str :
Returns arguments specified in the QUEUE statement passed to the constructor. These are the arguments that will be used by the Submit.itemdata() method if not overridden.
- setQArgs(args) None :
Sets the arguments to be used by subsequent calls to Submit.itemdata().
- Parameters
args (str) – The arguments to pass to the QUEUE statement.
- static from_dag(filename, options={}) Submit :
Constructs a new Submit that could be used to submit the DAG described by the file found at filename.
This static method essentially does the first half of the work that condor_submit_dag does: it produces the submit description for the DAGMan job that will execute the DAG. However, in addition to writing this submit description to disk, it also produces a Submit object with the same information that can be submitted via the normal Python bindings submit machinery.
- Parameters
filename (str) – The path to the DAG description file.
options (dict) – Additional arguments to condor_submit_dag. Supports dagman (str), force (bool), schedd-daemon-ad-file (str), schedd-address-file (str), AlwaysRunPost (bool), maxidle (int), maxjobs (int), MaxPre (int), MaxPost (int), UseDagDir (bool), debug (int), outfile_dir (str), config (str), batch-name (str), load_save (str), AutoRescue (bool), DoRescueFrom (int), AllowVersionMismatch (bool), do_recurse (bool), update_submit (bool), import_env (bool), include_env (str), insert_env (str), DumpRescue (bool), valgrind (bool), priority (int), suppress_notification (bool), DoRecov (bool).
- Returns
A Submit description for the DAG described in filename
- Return type
Submit
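As a rough illustration of the options dictionary, the sketch below checks a dict against the option names and types listed for from_dag() before passing it on. The helper names check_dag_options and submit_from_dag are made up for this example and are not part of the htcondor module; the htc parameter exists only so the sketch can be exercised without an HTCondor installation.

```python
# Option names and value types, copied from the list documented for from_dag().
DAG_OPTION_TYPES = {
    "dagman": str, "force": bool, "schedd-daemon-ad-file": str,
    "schedd-address-file": str, "AlwaysRunPost": bool, "maxidle": int,
    "maxjobs": int, "MaxPre": int, "MaxPost": int, "UseDagDir": bool,
    "debug": int, "outfile_dir": str, "config": str, "batch-name": str,
    "load_save": str, "AutoRescue": bool, "DoRescueFrom": int,
    "AllowVersionMismatch": bool, "do_recurse": bool, "update_submit": bool,
    "import_env": bool, "include_env": str, "insert_env": str,
    "DumpRescue": bool, "valgrind": bool, "priority": int,
    "suppress_notification": bool, "DoRecov": bool,
}


def check_dag_options(options):
    """Hypothetical helper: reject unknown option names and wrongly typed values."""
    for key, value in options.items():
        expected = DAG_OPTION_TYPES.get(key)
        if expected is None:
            raise KeyError(f"unknown from_dag option: {key!r}")
        # bool is a subclass of int in Python, so refuse True/False for int options.
        wrong_bool = expected is int and isinstance(value, bool)
        if wrong_bool or not isinstance(value, expected):
            raise TypeError(f"option {key!r} expects {expected.__name__}")
    return options


def submit_from_dag(dag_path, options, htc=None):
    """Build a Submit object for a DAG after checking the options."""
    if htc is None:
        import htcondor as htc  # imported lazily so the checks above work without it
    return htc.Submit.from_dag(dag_path, check_dag_options(options))
```

A call might look like submit_from_dag("diamond.dag", {"maxidle": 10, "force": True}), where diamond.dag is a placeholder file name.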
- setSubmitMethod(method_value=-1, allow_reserved_values=False) None :
Sets the job ad attribute JobSubmitMethod to the given number. Setting method_value to 100 or greater is recommended, to avoid confusion with the pre-set (reserved) values. A negative number leaves JobSubmitMethod undefined in the job ad. To use a reserved value, pass True as allow_reserved_values; any positive number, including the reserved ones, can then be set.
Note: JobSubmitMethod must be set before the job is submitted to the schedd.
- class htcondor.QueueItemsIterator
An iterator over itemdata produced by
Submit.itemdata()
.
Interacting with Negotiators
- class htcondor.Negotiator(ad)
This class provides a query interface to the condor_negotiator. It primarily allows one to query and set various parameters in the fair-share accounting.
- Parameters
location_ad (ClassAd or DaemonLocation) – A ClassAd or DaemonLocation describing the condor_negotiator location and version. If omitted, the default pool negotiator is assumed.
- deleteUser(user) None :
Delete all records of a user from the Negotiator’s fair-share accounting.
- Parameters
user (str) – A fully-qualified user name (USER@DOMAIN).
- getPriorities(rollup) list :
Retrieve the pool accounting information, one per entry. Returns a list of accounting ClassAds.
- getResourceUsage(user) list :
Get the resources (slots) used by a specified user.
- resetAllUsage() None :
Reset all usage accounting. All known user records in the negotiator are deleted.
- resetUsage(user) None :
Reset all usage accounting of the specified user.
- Parameters
user (str) – A fully-qualified user name (USER@DOMAIN).
- setBeginUsage(user, value) None :
Manually set the time that a user begins using the pool.
- setCeiling(user, ceiling) None :
Set the submitter ceiling of a specified user.
- setLastUsage(user, value) None :
Manually set the time that a user last used the pool.
- setFactor(user, factor) None :
Set the priority factor of a specified user.
- setPriority(user, prio) None :
Set the real priority of a specified user.
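The accounting ads returned by getPriorities() can be post-processed like ordinary dictionaries. The sketch below ranks submitters by priority. It assumes each accounting ad carries Name and Priority attributes, which is not stated on this page, and the function name best_priorities is invented for the example.

```python
def best_priorities(negotiator, count=5, rollup=False):
    """Return (name, priority) pairs for the `count` lowest priority values.

    `negotiator` is expected to behave like htcondor.Negotiator.
    Assumption: accounting ads expose "Name" and "Priority".
    """
    ads = negotiator.getPriorities(rollup)
    # Ads without a Priority attribute sort last.
    ranked = sorted(ads, key=lambda ad: ad.get("Priority", float("inf")))
    return [(ad.get("Name"), ad.get("Priority")) for ad in ranked[:count]]


# Hypothetical usage (needs a reachable condor_negotiator):
#   import htcondor
#   for name, priority in best_priorities(htcondor.Negotiator()):
#       print(name, priority)
```

In HTCondor's fair-share accounting a numerically lower user priority is the better one, which is why the sort is ascending.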
Managing Starters and Claims
- class htcondor.Startd(ad=None)
A class that represents a Startd.
- Parameters
location_ad – A ClassAd or DaemonLocation describing the startd location and version. If omitted, the local startd is assumed.
- drainJobs(drain_type=0, on_completion=0, check_expr='true', start_expr='false', reason='by command') str :
Begin draining jobs from the startd.
- Parameters
drain_type (DrainTypes) – How fast to drain the jobs. Defaults to DRAIN_GRACEFUL if not specified.
on_completion (int) – What the startd should do once draining is complete. Values are 0 for Nothing (remain in the drained state), 1 for Resume (start accepting jobs again), 2 for Exit, 3 for Restart. Defaults to 0.
check_expr (str or ExprTree) – An expression string that must evaluate to true for all slots for draining to begin. Defaults to 'true'.
start_expr (str or ExprTree) – The expression that the startd should use while draining.
reason (str) – A string describing the reason for draining. Defaults to 'by command'.
- Returns
An opaque request ID that can be used to cancel draining via Startd.cancelDrainJobs()
- Return type
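Because on_completion is a bare integer, a small enumeration can make drain requests easier to read. The following is only a sketch: OnCompletion and drain_for_maintenance are names invented here, with the integer values taken from the parameter description for drainJobs().

```python
import enum


class OnCompletion(enum.IntEnum):
    """Readable names for the documented on_completion values (not part of htcondor)."""
    NOTHING = 0
    RESUME = 1
    EXIT = 2
    RESTART = 3


def drain_for_maintenance(startd, reason, on_completion=OnCompletion.RESUME):
    """Start draining and return the opaque request ID.

    `startd` is expected to behave like htcondor.Startd; the drain type is
    left at its default.
    """
    return startd.drainJobs(on_completion=int(on_completion), reason=reason)


# Hypothetical usage (needs a local startd):
#   import htcondor
#   startd = htcondor.Startd()
#   request_id = drain_for_maintenance(startd, "kernel upgrade")
#   ...
#   startd.cancelDrainJobs(request_id)  # abandon the drain if plans change
```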
Security Management
- class htcondor.Credd(ad=None)
A class for sending Credential commands to a Credd, Schedd or Master.
- Parameters
location_ad (ClassAd or DaemonLocation) – A ClassAd or DaemonLocation describing the Credd, Schedd or Master location. If omitted, the local schedd is assumed.
- add_password(password, user='') None :
Store the password in the Credd for the current user (or for the given user).
- delete_password(user='') None :
Delete the password in the Credd for the current user (or for the given user).
- Parameters
user (str) – Which user to delete the credential for (defaults to the current user).
- query_password(user='') bool :
Check to see if the current user (or the given user) has a password stored in the Credd.
- Parameters
user (str) – Which user to query the credential for (defaults to the current user).
- Returns
bool
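The password methods compose naturally into a "store only if missing" step. The sketch below is one way to write that; ensure_password is a made-up helper name, and the credd argument is anything that behaves like htcondor.Credd.

```python
def ensure_password(credd, password, user=""):
    """Store `password` only if none is stored yet.

    Returns True if a password was stored by this call, False if one
    already existed. An empty `user` means the current user.
    """
    if credd.query_password(user):
        return False
    credd.add_password(password, user)
    return True


# Hypothetical usage (needs a reachable credd, schedd or master):
#   import getpass
#   import htcondor
#   ensure_password(htcondor.Credd(), getpass.getpass("Password: "))
```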
- add_user_cred(credtype, credential, user='') None :
Store a credential in the Credd for the current user (or for the given user).
- delete_user_cred(credtype, user='') None :
Delete a credential of the given credtype for the current user (or for the given user).
- query_user_cred(credtype, user='') int :
Query whether the current user (or the given user) has a credential of the given type stored.
- add_user_service_cred(credtype, credential, service, handle='', user='') None :
Store a credential in the Credd for the current user, or for the given user.
To specify multiple credentials for the same service (e.g., you want to transfer files from two different accounts that are on the same service), give each a unique handle.
- delete_user_service_cred(credtype, service, handle='', user='') None :
Delete a credential of the given credtype for service service for the current user (or for the given user).
- query_user_service_cred(credtype, service, handle='', user='') CredStatus :
Query whether the current user (or the given user) has a credential of the given credtype stored.
- Parameters
- Returns
- check_user_service_creds(credtype, services, user='') CredCheck :
Check to see if the current user (or the given user) has a given set of service credentials, and if any credentials are missing, create a temporary URL that can be used to acquire the missing service credentials.
- Parameters
credtype (CredTypes) – The type of credentials to check for.
services (List[classad.ClassAd]) – The list of services that are needed.
user (str) – Which user to check the credentials for (defaults to the current user).
- Returns
- class htcondor.CredTypes
The types of credentials that can be managed by a condor_credd.
The values of the enumeration are:
- Password
- Kerberos
- OAuth
- class htcondor.CredCheck
- class htcondor.CredStatus
- class htcondor.SecMan(arg1)
A class that represents the internal HTCondor security state.
If a security session becomes invalid (for example, because the remote daemon restarts and reuses the same port while the client continues to use the old session), then all future commands will fail with strange connection errors. This class is the only mechanism to invalidate in-memory sessions.
The SecMan can also behave as a context manager; when created, the object can be used to set temporary security configurations that only last during the lifetime of the security object.
- invalidateAllSessions() None :
Invalidate all security sessions. Any future connections to a daemon will cause a new security session to be created.
- ping(ad, command='DC_NOP') ClassAd :
Perform a test authorization against a remote daemon for a given command.
- Parameters
ad (str or ClassAd) – The ClassAd of the daemon as returned by Collector.locate(); alternately, the sinful string can be given directly as the first parameter.
command – The DaemonCore command to try; if not given, 'DC_NOP' will be used.
- Returns
An ad describing the results of the test security negotiation.
- Return type
- getCommandString(command_int) str :
Return the string name corresponding to a given integer command.
- Parameters
command_int (int) – The integer command to get the string name of.
- setConfig(key, value) None :
Set a temporary configuration variable; this will be kept for all security sessions in this thread for as long as the SecMan object is alive.
- setPoolPassword(new_pass) None :
Set the pool password.
- Parameters
new_pass (str) – Updated pool password to use for new security negotiations.
- setTag(tag) None :
Set the authentication context tag for the current thread.
All security sessions negotiated with the same tag will only be utilized when that tag is active.
For example, if thread A has a tag set to 'Joe' and thread B has a tag set to 'Jane', then all security sessions negotiated for thread A will not be used for thread B.
- Parameters
tag (str) – New tag to set.
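The context-manager behaviour of SecMan pairs well with setConfig() and ping(): temporary settings apply only inside the with block. The sketch below shows that pattern under stated assumptions. ping_with_settings is a name invented for this example, and the configuration key in the usage comment is only an illustration.

```python
def ping_with_settings(secman, location, settings, command="DC_NOP"):
    """Ping a daemon while temporary security settings are in effect.

    `secman` is expected to behave like htcondor.SecMan; `location` is a
    daemon ClassAd (for example from Collector.locate()) or a sinful string.
    Returns the ad describing the outcome of the security negotiation.
    """
    with secman:
        for key, value in settings.items():
            secman.setConfig(key, value)
        return secman.ping(location, command)


# Hypothetical usage (needs a running pool):
#   import htcondor
#   schedd_ad = htcondor.Collector().locate(htcondor.DaemonTypes.Schedd)
#   result = ping_with_settings(
#       htcondor.SecMan(), schedd_ad,
#       {"SEC_CLIENT_AUTHENTICATION_METHODS": "FS"},
#   )
#   print(result)
```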
- class htcondor.Token(contents)
A class representing a generated HTCondor authentication token.
- Parameters
contents (str) – The contents of the token.
- write(tokenfile=None) None :
Write the contents of the token into the appropriate token directory on disk.
- Parameters
tokenfile – Filename inside the user token directory where the token will be written.
- class htcondor.TokenRequest(identity='', bounding_set=None, lifetime=-1)
A class representing a request for a HTCondor authentication token.
- Parameters
- done() bool :
Check to see if the token request has completed.
- Returns
True if the request is complete; False otherwise. May throw an exception.
- Return type
- property request_id
The ID of the request at the remote daemon.
- result(timeout=0) Token :
Return the result of the token request. Will block until the token request is approved or the timeout is hit (a timeout of 0, the default, indicates this method may block indefinitely).
- Returns
The token resulting from this request.
- Return type
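Since result() can block indefinitely, a caller that needs a bounded wait can poll done() instead. The sketch below assumes the request has already been sent to a daemon (that step is not shown on this page). wait_for_token and its clock and sleep parameters are hypothetical names introduced for the example.

```python
import time


def wait_for_token(request, max_wait, poll_interval=5.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll a token request until it completes or `max_wait` seconds pass.

    `request` is expected to behave like htcondor.TokenRequest.
    Returns the resulting token, or None if the deadline passes first.
    """
    deadline = clock() + max_wait
    while not request.done():
        if clock() >= deadline:
            return None
        sleep(poll_interval)
    # The request is complete, so result() should return without waiting.
    return request.result()


# Hypothetical usage:
#   token = wait_for_token(request, max_wait=600)
#   if token is not None:
#       token.write("example_token")  # file name inside the user token directory
```

Injecting clock and sleep keeps the loop testable without real delays; in ordinary use the defaults are enough.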
Reading Job Events
The following is a complete example of submitting a job and waiting (forever) for it to finish. The next example implements a time-out.
#!/usr/bin/env python3
import htcondor

# Create a job description.  It _must_ set `log` to create a job event log.
logFileName = "sleep.log"
submit = htcondor.Submit(
    f"""
    executable = /bin/sleep
    transfer_executable = false
    arguments = 5
    log = {logFileName}
    """
)

# Submit the job description, creating the job.
result = htcondor.Schedd().submit(submit, count=1)
clusterID = result.cluster()

# Wait (forever) for the job to finish.
jel = htcondor.JobEventLog(logFileName)
for event in jel.events(stop_after=None):
    # HTCondor appends to job event logs by default, so if you run
    # this example more than once, there will be more than one job
    # in the log.  Make sure we have the right one.
    if event.cluster != clusterID or event.proc != 0:
        continue

    if event.type == htcondor.JobEventType.JOB_TERMINATED:
        if event["TerminatedNormally"]:
            print(f"Job terminated normally with return value {event['ReturnValue']}.")
        else:
            print(f"Job terminated on signal {event['TerminatedBySignal']}.")
        break

    if event.type in { htcondor.JobEventType.JOB_ABORTED,
                       htcondor.JobEventType.JOB_HELD,
                       htcondor.JobEventType.CLUSTER_REMOVE }:
        print("Job aborted, held, or removed.")
        break

    # We expect to see the first three events in this list, and
    # don't consider the others to be terminal.
    if event.type not in { htcondor.JobEventType.SUBMIT,
                           htcondor.JobEventType.EXECUTE,
                           htcondor.JobEventType.IMAGE_SIZE,
                           htcondor.JobEventType.JOB_EVICTED,
                           htcondor.JobEventType.JOB_SUSPENDED,
                           htcondor.JobEventType.JOB_UNSUSPENDED }:
        print(f"Unexpected job event: {event.type}!")
        break
The following example includes a deadline for the job to finish. To
make it quick to run the example, the deadline is only ten seconds;
real jobs will almost always take considerably longer. You can change
arguments = 20 to arguments = 5 to verify that this example
correctly detects the job finishing. For the same reason, we check
once a second to see if the deadline has expired. In practice, you
should check much less frequently, depending on how quickly your
script needs to react and how long you expect the job to last. In
most cases, even once a minute is more frequent than necessary or
appropriate on shared resources; every five minutes is better.
#!/usr/bin/env python3
import time
import htcondor

# Create a job description.  It _must_ set `log` to create a job event log.
logFileName = "sleep.log"
submit = htcondor.Submit(
    f"""
    executable = /bin/sleep
    transfer_executable = false
    arguments = 20
    log = {logFileName}
    """
)

# Submit the job description, creating the job.
result = htcondor.Schedd().submit(submit, count=1)
clusterID = result.cluster()

def waitForJob(deadline):
    jel = htcondor.JobEventLog(logFileName)
    while time.time() < deadline:
        # In real code, this should be more like stop_after=300; see above.
        for event in jel.events(stop_after=1):
            # HTCondor appends to job event logs by default, so if you run
            # this example more than once, there will be more than one job
            # in the log.  Make sure we have the right one.
            if event.cluster != clusterID or event.proc != 0:
                continue

            if event.type == htcondor.JobEventType.JOB_TERMINATED:
                if event["TerminatedNormally"]:
                    print(f"Job terminated normally with return value {event['ReturnValue']}.")
                else:
                    print(f"Job terminated on signal {event['TerminatedBySignal']}.")
                return True

            if event.type in { htcondor.JobEventType.JOB_ABORTED,
                               htcondor.JobEventType.JOB_HELD,
                               htcondor.JobEventType.CLUSTER_REMOVE }:
                print("Job aborted, held, or removed.")
                return True

            # We expect to see the first three events in this list, and
            # don't consider the others to be terminal.
            if event.type not in { htcondor.JobEventType.SUBMIT,
                                   htcondor.JobEventType.EXECUTE,
                                   htcondor.JobEventType.IMAGE_SIZE,
                                   htcondor.JobEventType.JOB_EVICTED,
                                   htcondor.JobEventType.JOB_SUSPENDED,
                                   htcondor.JobEventType.JOB_UNSUSPENDED }:
                print(f"Unexpected job event: {event.type}!")
                return True
    else:
        print("Deadline expired.")
        return False

# Wait no more than 10 seconds for the job to finish.
waitForJob(time.time() + 10)
Note that which job events are terminal, expected, or allowed may vary somewhat from job to job; for instance, it’s possible to submit a job which releases itself from certain hold conditions.
- class htcondor.JobEventLog(filename)
Reads user job event logs from filename.
By default, it blocks waiting for new events, but it may be used to poll for them:
import htcondor

jel = htcondor.JobEventLog("file.log")
# Read all currently-available events without blocking.
for event in jel.events(stop_after=0):
    print(event)
print("We found the end of the file")
A pickled JobEventLog resumes iterating over events where it left off if and only if, after being unpickled, the job event log file is identical except for appended events.
- Parameters
filename (str) – A file containing a user job event log.
- events(stop_after) object :
Return an iterator over JobEvent objects from the filename given in the constructor. By default, the iterator blocks forever waiting for new events.
- Parameters
stop_after (int) – After how many seconds should the iterator stop waiting for new events?
If None (the default), wait forever.
If 0, never wait. Does not block.
For any other value, wait (block) for that many seconds for a new event, raising StopIteration if one does not appear. (This does not invalidate the iterator.)
- close() None :
Closes any open underlying file. This object will no longer iterate.
- class htcondor.JobEvent
Represents a single job event from the job event log. Use JobEventLog to get an iterator over the job events from a file.
Because all events have type, cluster, proc, and timestamp, those are accessed via attributes (see below).
The rest of the information in the JobEvent can be accessed by key. JobEvent behaves like a read-only Python dict, with get, keys, items, and values methods, and supports len and in (if "attribute" in job_event, for example).
Attention
Although the attribute type is a JobEventType type, when acting as a dictionary, a JobEvent object returns types as if it were a ClassAd, so comparisons to enumerated values must use the == operator. (No current event type has ExprTree values.)
- type
The event type.
- Return type
- get(key, default=None) object :
As dict.get().
- keys() list :
As dict.keys().
- values() list :
As dict.values().
- items() list :
As dict.items().
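To make the split between attribute access and key access concrete, the sketch below flattens one event into a plain dictionary. The function name summarize_event is invented for this example; ReturnValue and TerminatedNormally are the keys used by the job-termination examples earlier in this section and are absent from most other event types, so they are read defensively.

```python
def summarize_event(event):
    """Flatten a JobEvent-like object into a plain dict.

    type, cluster, proc and timestamp come from attributes; everything
    else comes from dict-style access.
    """
    summary = {
        "job": f"{event.cluster}.{event.proc}",
        "type": str(event.type),
        "timestamp": event.timestamp,
    }
    # `in` works on JobEvent, so optional keys can be tested before use.
    if "ReturnValue" in event:
        summary["return_value"] = event["ReturnValue"]
    # get() returns None for keys this event type does not carry.
    summary["terminated_normally"] = event.get("TerminatedNormally")
    return summary


# Hypothetical usage:
#   import htcondor
#   for event in htcondor.JobEventLog("sleep.log").events(stop_after=0):
#       print(summarize_event(event))
```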
- class htcondor.JobEventType
The event type of a user log event; corresponds to ULogEventNumber in the C++ source.
The values of the enumeration are:
- SUBMIT
- EXECUTE
- EXECUTABLE_ERROR
- CHECKPOINTED
- JOB_EVICTED
- JOB_TERMINATED
- IMAGE_SIZE
- SHADOW_EXCEPTION
- GENERIC
- JOB_ABORTED
- JOB_SUSPENDED
- JOB_UNSUSPENDED
- JOB_HELD
- JOB_RELEASED
- NODE_EXECUTE
- NODE_TERMINATED
- POST_SCRIPT_TERMINATED
- GLOBUS_SUBMIT
- GLOBUS_SUBMIT_FAILED
- GLOBUS_RESOURCE_UP
- GLOBUS_RESOURCE_DOWN
- REMOTE_ERROR
- JOB_DISCONNECTED
- JOB_RECONNECTED
- JOB_RECONNECT_FAILED
- GRID_RESOURCE_UP
- GRID_RESOURCE_DOWN
- GRID_SUBMIT
- JOB_AD_INFORMATION
- JOB_STATUS_UNKNOWN
- JOB_STATUS_KNOWN
- JOB_STAGE_IN
- JOB_STAGE_OUT
- ATTRIBUTE_UPDATE
- PRESKIP
- CLUSTER_SUBMIT
- CLUSTER_REMOVE
- FACTORY_PAUSED
- FACTORY_RESUMED
- NONE
- FILE_TRANSFER
- RESERVE_SPACE
- RELEASE_SPACE
- FILE_COMPLETE
- FILE_USED
- FILE_REMOVED
HTCondor Configuration
- htcondor.param = <htcondor.htcondor._Param object>
Provides dictionary-like access to the HTCondor configuration.
An instance of _Param. Upon importing the htcondor module, the HTCondor configuration files are parsed and populate this dictionary-like object.
- htcondor.reload_config() None :
Reload the HTCondor configuration from disk.
- class htcondor._Param
A dictionary-like object for the local HTCondor configuration; the keys and values of this object are the keys and values of the HTCondor configuration.
The get, setdefault, update, keys, items, and values methods of this class have the same semantics as a Python dictionary.
Writing to a _Param object will update the in-memory HTCondor configuration.
- class htcondor.RemoteParam(ad)
The RemoteParam class provides a dictionary-like interface to the configuration of an HTCondor daemon. The get, setdefault, update, keys, items, and values methods of this class have the same semantics as a Python dictionary.
- Parameters
ad (ClassAd) – An ad containing the location of the remote daemon.
- refresh() None :
Rebuilds the dictionary based on the current configuration of the daemon.
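Because both htcondor.param and RemoteParam offer a dictionary-style get, they can be compared with ordinary Python code. The sketch below reports settings that differ between two such objects. config_diff is a hypothetical helper name, and the configuration keys in the usage comment are only examples.

```python
def config_diff(local, remote, keys):
    """Return {key: (local_value, remote_value)} for keys whose values differ.

    `local` and `remote` only need a dict-style get(), so htcondor.param,
    an htcondor.RemoteParam, or a plain dict all work.
    """
    diff = {}
    for key in keys:
        local_value = local.get(key)
        remote_value = remote.get(key)
        if local_value != remote_value:
            diff[key] = (local_value, remote_value)
    return diff


# Hypothetical usage (needs a running pool):
#   import htcondor
#   master_ad = htcondor.Collector().locate(htcondor.DaemonTypes.Master)
#   remote = htcondor.RemoteParam(master_ad)
#   print(config_diff(htcondor.param, remote, ["COLLECTOR_HOST", "TOOL_DEBUG"]))
```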
- htcondor.platform() str :
Returns the platform of HTCondor this module is running on.
- htcondor.version() str :
Returns the version of HTCondor this module is linked against.
HTCondor Logging
- htcondor.enable_debug() None :
Enable debugging output from HTCondor, where output is sent to stderr. The logging level is controlled by the TOOL_DEBUG parameter.
- htcondor.enable_log() None :
Enable debugging output from HTCondor, where output is sent to a file. The log level is controlled by the parameter TOOL_DEBUG, and the file used is controlled by TOOL_LOG.
- htcondor.log(level, msg) None :
Log a message using the HTCondor logging subsystem.
- class htcondor.LogLevel
The log level attribute to use with log(). Note that HTCondor mixes both a class (debug, network, all) and the header format (Timestamp, PID, NoHeader) within this enumeration.
The values of the enumeration are:
- Always
- Audit
- Config
- DaemonCore
- Error
- FullDebug
- Hostname
- Job
- Machine
- Network
- NoHeader
- PID
- Priv
- Protocol
- Security
- Status
- SubSecond
- Terse
- Timestamp
- Verbose
Esoteric Functionality
- htcondor.send_command(ad, dc, target) None :
Send a command to an HTCondor daemon specified by a location ClassAd.
- Parameters
ad (ClassAd) – Specifies the location of the daemon (typically, found by using Collector.locate()).
dc (DaemonCommands) – A command type.
target (str) – An additional command to send to a daemon. Some commands require additional arguments; for example, sending DaemonOff to a condor_master requires one to specify which subsystem to turn off.
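The DaemonOff case mentioned for target can be sketched as follows: locate the condor_master, then name the subsystem to turn off as the third argument. The function name turn_off_subsystem is made up, and the htc parameter is only there so that the sketch can be exercised with a stand-in instead of a live pool.

```python
def turn_off_subsystem(subsystem, htc=None):
    """Ask the local condor_master to turn off one daemon, e.g. "STARTD".

    Returns the master's location ad so the caller can log where the
    command was sent.
    """
    if htc is None:
        import htcondor as htc  # lazy import; pass a stand-in for dry runs
    # With no name given, locate() looks for the local daemon of that type.
    master_ad = htc.Collector().locate(htc.DaemonTypes.Master)
    # DaemonOff needs a target: the subsystem the master should stop.
    htc.send_command(master_ad, htc.DaemonCommands.DaemonOff, subsystem)
    return master_ad


# Hypothetical usage (needs administrator-level authorization on the pool):
#   turn_off_subsystem("STARTD")
```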
- class htcondor.DaemonCommands
An enumeration of various state-changing commands that can be sent to an HTCondor daemon using send_command().
The values of the enumeration are:
- DaemonOn
- DaemonOff
- DaemonOffFast
- DaemonOffPeaceful
- DaemonsOn
- DaemonsOff
- DaemonsOffFast
- DaemonsOffPeaceful
- OffFast
- OffForce
- OffGraceful
- OffPeaceful
- Reconfig
- Restart
- RestartPeacful
- SetForceShutdown
- SetPeacefulShutdown
- htcondor.send_alive(ad=None, pid=None, timeout=None) None :
Send a keep alive message to an HTCondor daemon.
This is used when the python process is run as a child daemon under the condor_master.
- Parameters
ad (ClassAd) – A ClassAd specifying the location of the daemon. This ad is typically found by using Collector.locate().
pid (int) – The process identifier for the keep alive. The default value of None uses the value from os.getpid().
timeout (int) – The number of seconds that this keep alive is valid. If a new keep alive is not received by the condor_master in time, then the process will be terminated. The default value is controlled by configuration variable NOT_RESPONDING_TIMEOUT.
- htcondor.set_subsystem(subsystem, type=htcondor.htcondor.SubsystemType(15)) None :
Set the subsystem name for the object.
The subsystem is primarily used for the parsing of the HTCondor configuration file.
- Parameters
subsystem (str) – The subsystem name.
type (SubsystemType) – The HTCondor daemon type. The default value of Auto infers the type from the subsystem name.
Exceptions
For backwards-compatibility, the exceptions in this module inherit from the built-in exceptions raised in earlier (pre-v8.9.9) versions.
- class htcondor.HTCondorException
Never raised. The parent class of all exceptions raised by this module.
- class htcondor.HTCondorEnumError
Raised when a value must be in an enumeration, but isn’t.
- class htcondor.HTCondorInternalError
Raised when HTCondor encounters an internal error.
- class htcondor.HTCondorLocateError
Raised when HTCondor cannot locate a daemon.
- class htcondor.HTCondorReplyError
Raised when HTCondor received an invalid reply from a daemon, or the daemon’s reply indicated that it encountered an error.
- class htcondor.HTCondorValueError
Raised instead of ValueError for backwards compatibility.
Thread Safety
Most of the htcondor module is protected by a lock that prevents multiple threads from executing locked functions at the same time. When two threads both want to call locked functions or methods, they will wait in line to execute them one at a time (the ordering between threads is not guaranteed beyond “first come, first served”). Examples of locked functions include Schedd.query(), Submit.queue(), and Schedd.edit().
Threads that are not trying to execute locked htcondor functions will be allowed to proceed normally.
This locking may cause unexpected slowdowns when using htcondor from multiple threads simultaneously.
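One practical consequence of the locking is that a thread pool helps only when locked htcondor calls are mixed with other work. The sketch below is a generic runner for such a mix. run_in_threads is a name invented here, and the query arguments in the usage comment are illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_threads(tasks, max_workers=4):
    """Run zero-argument callables on worker threads; return results in task order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(task) for task in tasks]
        # result() re-raises any exception from the worker thread.
        return [future.result() for future in futures]


# Hypothetical usage (needs a running schedd):
#   import htcondor
#   schedd = htcondor.Schedd()
#   results = run_in_threads([
#       # Locked calls: these two take turns rather than overlapping.
#       lambda: schedd.query(constraint="JobStatus == 2", projection=["ClusterId"]),
#       lambda: schedd.query(constraint="JobStatus == 5", projection=["ClusterId"]),
#       # Not an htcondor call, so it proceeds while the queries wait in line.
#       lambda: open("/etc/hostname").read(),
#   ])
```

If every task is a locked htcondor call, expect roughly the same elapsed time as running them one after another.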