condor_adstash

Gather schedd and/or startd job history ClassAds and push them via a search engine or file interface.

Synopsis

condor_adstash [--help ]

condor_adstash [--process_name NAME] [--standalone ] [--sample_interval SECONDS] [--checkpoint_file PATH] [--log_file PATH] [--log_level LEVEL] [--threads THREADS] [--interface {null,elasticsearch,jsonfile}] [--collectors COLLECTORS] [--schedds SCHEDDS] [--startds STARTDS] [--schedd_history ] [--startd_history ] [--ad_file PATH] [--schedd_history_max_ads NUM_ADS] [--startd_history_max_ads NUM_ADS] [--schedd_history_timeout SECONDS] [--startd_history_timeout SECONDS] [--se_host HOST[:PORT]] [--se_url_prefix PREFIX] [--se_username USERNAME] [--se_use_https ] [--se_timeout SECONDS] [--se_bunch_size NUM_DOCS] [--es_index_name INDEX_NAME] [--se_no_log_mappings] [--se_ca_certs PATH] [--json_dir PATH]

Description

condor_adstash is a tool that assists in monitoring usage by gathering job ClassAds (typically from condor_schedd and/or condor_startd history queries) and pushing the ClassAds as documents to some target (typically Elasticsearch).

Unless run in --standalone mode, condor_adstash expects to be invoked as a daemon by a condor_master, i.e. condor_adstash should be invoked in standalone mode when run on the command-line. Whether invoked by condor_master or run standalone, condor_adstash gets its configuration, in increasing priority, from the HTCondor configuration macros beginning with ADSTASH_ (when --process_name is not provided), then environment variables, and finally command-line options.

condor_adstash must be able to write its --checkpoint_file to a persistent location so that duplicate job ClassAds are not fetched from the daemons’ histories in consecutive polls.

A named Elasticsearch index will be created if it doesn’t exist, and may be modified if new fields (corresponding to ClassAd attribute names) need to be added. It is up to the administrator of the Elasticsearch instance to install rollover policies (e.g. ILM) on the named index and/or to set up the index as an alias.

Options

-h, --help

Display the help message and exit.

--process_name PREFIX

Give condor_adstash a different name for looking up HTCondor configuration and environment variable values (see examples).

--standalone

Run condor_adstash in standalone mode (runs once, does not attempt to contact condor_master)

--sample_interval SECONDS

Number of seconds between polling the list(s) of daemons (ignored in standalone mode)

--checkpoint_file PATH

Location of checkpoint file (will be created if missing)

--log_file PATH

Location of log file

--log_level LEVEL

Log level (uses Python logging library levels: CRITICAL/ERROR/WARNING/INFO/DEBUG)

--threads THREADS

Number of parallel threads to use when polling for job ClassAds and when pushing documents to Elasticsearch

--interface {null,elasticsearch,opensearch,jsonfile}

Push ads via the chosen interface

ClassAd source options

--schedd_history

Poll and push condor_schedd job histories

--startd_history

Poll and push condor_startd job histories

--ad_file PATH

Load Job ClassAds from a file instead of querying daemons (Ignores --schedd_history and --startd_history.)

Options for HTCondor daemon (Schedd, Startd, etc.) history sources

--collectors COLLECTORS

Comma-separated list of condor_collector addresses to contact to locate condor_schedd and condor_startd daemons

--schedds SCHEDDS

Comma-separated list of condor_schedd names to poll job histories from

--startds STARTDS

Comma-separated list of condor_startd machines to poll job histories from

--schedd_history_max_ads NUM_ADS

Abort after reading NUM_ADS from a condor_schedd

--startd_history_max_ads NUM_ADS

Abort after reading NUM_ADS from a condor_startd

--schedd_history_timeout SECONDS

Abort if reading from a condor_schedd takes more than this many seconds

--startd_history_timeout SECONDS

Abort if reading from a condor_startd takes more than this many seconds

Search engine (Elasticsearch, OpenSearch, etc.) interface options

--se_host HOST[:PORT]

Search engine host:port

--se_url_prefix PREFIX

Search engine URL prefix

--se_username USERNAME

Search engine username

--se_use_https

Use HTTPS when connecting to search engine

--se_timeout SECONDS

Max time to wait for search engine queries

--se_bunch_size NUM_DOCS

Group ads in bunches of this size to send to search engine

--se_index_name INDEX_NAME

Push ads to this search engine index or alias

--se_no_log_mappings

Don’t write a JSON file with mappings to the log directory

--se_ca_certs PATH

Path to root certificate authority file (will use certifi’s CA if not set)

JSON file interface options

--json_dir PATH

Directory to store JSON files, which are named by timestamp

Examples

Running condor_adstash in standalone mode on the command-line will result in condor_adstash reading its configuration from the current HTCondor configuration:

$ condor_adstash --standalone

By default, condor_adstash looks for HTCondor configuration variables with names are prefixed with ADSTASH_, e.g. ADSTASH_READ_SCHEDDS = *. These values can be overridden on the command-line:

$ condor_adstash --standalone --schedds=myschedd.localdomain

condor_adstash configuration variables can be also be named using custom prefixes, with the prefix passed in using -\-process_name=PREFIX. For example, if the HTCondor configuration contained FOO_SCHEDD_HISTORY = False and FOO_STARTD_HISTORY = True, condor_adstash can be invoked to read these instead of ADSTASH_SCHEDD_HISTORY and ADSTASH_STARTD_HISTORY:

$ condor_adstash --standalone --process_name=FOO

Providing a PREFIX to --process_name that does not match any HTCondor configuration variables will cause condor_adstash to fallback to a default set of configuration values, which may be useful in debugging.

The configuration values that condor_adstash reads from the current HTCondor configuration can be previewed by printing the help message. The values will be listed as the default values for each command-line option:

$ condor_adstash --help
$ condor_adstash --process_name=FOO --help