# htcondor.dags API Reference

Attention

This is not documentation for DAGMan itself! If you run into DAGMan jargon that isn’t explained here, see DAGMan Workflows.

## Creating DAGs

class htcondor.dags.DAG(dagman_config=None, dagman_job_attributes=None, max_jobs_by_category=None, dot_config=None, jobstate_log=None, node_status_file=None)

This object represents the entire DAGMan workflow, including both the execution graph and miscellaneous configuration options.

It contains the individual NodeLayer and SubDAG objects that are the “logical” nodes in the graph, created by the layer() and subdag() methods respectively.

describe()

Return a tabular description of the DAG’s structure.

Return type

str

property edges: Iterator[Tuple[Tuple[htcondor.dags.node.BaseNode, htcondor.dags.node.BaseNode], htcondor.dags.edges.BaseEdge]]

Iterate over ((parent, child), edge) tuples, for every edge in the graph.

Return type

Iterator[Tuple[Tuple[BaseNode, BaseNode], BaseEdge]]

final(**kwargs)

Create the FINAL node of the DAG. A DAG can only have one FINAL node; if you call this method multiple times, it will override any previous calls. To customize the FINAL node after creation, modify the FinalNode instance that it returns.

Return type

FinalNode

glob(pattern)

Return a Nodes of the nodes in the DAG whose names match the glob pattern.

Return type

Nodes
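As a rough illustration of glob-style name matching, the sketch below filters a list of names with the standard library's `fnmatch` module. The node names are hypothetical, and using `fnmatchcase` is an assumption about the matching rules, not a statement of how glob() is implemented.

```python
from fnmatch import fnmatchcase

# Hypothetical node names; in a real DAG these would be the names of its nodes.
node_names = ["prepare", "analyze_a", "analyze_b", "summarize"]


def glob_names(names, pattern):
    """Return the names that match a shell-style glob pattern (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]


matches = glob_names(node_names, "analyze_*")
print(matches)
```

The `*` wildcard matches any run of characters, so the pattern above picks out every name that begins with `analyze_`.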

layer(**kwargs)

Create a new NodeLayer in the graph with no parents or children. Keyword arguments are forwarded to NodeLayer.

Return type

NodeLayer

property leaves: htcondor.dags.node.Nodes

A Nodes of the nodes in the DAG that have no children.

Return type

Nodes

property node_to_children: Dict[htcondor.dags.node.BaseNode, htcondor.dags.node.Nodes]

Return a dictionary that maps each node to a Nodes containing its children. The Nodes will be empty if the node has no children.

Return type

Dict[BaseNode, Nodes]

property node_to_parents: Dict[htcondor.dags.node.BaseNode, htcondor.dags.node.Nodes]

Return a dictionary that maps each node to a Nodes containing its parents. The Nodes will be empty if the node has no parents.

Return type

Dict[BaseNode, Nodes]

property nodes: htcondor.dags.node.Nodes

Iterate over all of the nodes in the DAG, in no particular order.

Return type

Nodes

property roots: htcondor.dags.node.Nodes

A Nodes of the nodes in the DAG that have no parents.

Return type

Nodes
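The relationship between node_to_children, node_to_parents, roots, and leaves can be sketched with plain dictionaries. This is only a conceptual model: the node names are hypothetical strings, and ordinary lists stand in for Nodes collections.

```python
# Hypothetical diamond-shaped graph: a -> (b, c) -> d.
node_to_children = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}


def invert(child_map):
    """Build the parent map that corresponds to a child map."""
    parent_map = {node: [] for node in child_map}
    for parent, children in child_map.items():
        for child in children:
            parent_map[child].append(parent)
    return parent_map


node_to_parents = invert(node_to_children)

# Roots have no parents; leaves have no children.
roots = [node for node, parents in node_to_parents.items() if not parents]
leaves = [node for node, children in node_to_children.items() if not children]
print(roots, leaves)
```

Every node appears as a key in both maps, with an empty collection when it has no children or no parents, which matches the behaviour described for the two properties.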

select(selector)

Return a Nodes of the nodes in the DAG that satisfy selector. selector should be a function which takes a single BaseNode and returns True (will be included) or False (will not be included).

Return type

Nodes
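A selector is just a predicate over nodes. The sketch below shows the filtering pattern with a hypothetical `FakeNode` stand-in (it is not a BaseNode and carries only the two attributes used here) and a plain list in place of Nodes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeNode:
    """Hypothetical stand-in for a node; only the attributes needed below."""

    name: str
    retries: int


def select(nodes, selector):
    """Keep the nodes for which selector(node) is true."""
    return [node for node in nodes if selector(node)]


nodes = [FakeNode("prepare", 0), FakeNode("analyze", 3), FakeNode("summarize", 1)]

# Select every node that is allowed at least one retry.
retried = select(nodes, lambda node: node.retries > 0)
print([node.name for node in retried])
```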

subdag(**kwargs)

Create a new SubDAG in the graph with no parents or children. Keyword arguments are forwarded to SubDAG.

Return type

SubDAG

walk(order=WalkOrder.DEPTH_FIRST)

Iterate over all of the nodes in the DAG, starting from the roots (i.e., the nodes with no parents), in either depth-first or breadth-first order.

Sibling order is not specified, and may be different in different calls to this method.

Parameters

order (WalkOrder) – Walk depth-first (children before siblings) or breadth-first (siblings before children).

Return type

Iterator[BaseNode]

walk_ancestors(node, order=WalkOrder.DEPTH_FIRST)

Iterate over all of the ancestors (i.e., parents, parents of parents, etc.) of some node, in either depth-first or breadth-first order.

Sibling order is not specified, and may be different in different calls to this method.

Parameters
• node (BaseNode) – The node to begin walking from. It will not be included in the results.

• order (WalkOrder) – Walk depth-first (parents before siblings) or breadth-first (siblings before parents).

Return type

Iterator[BaseNode]

walk_descendants(node, order=WalkOrder.DEPTH_FIRST)

Iterate over all of the descendants (i.e., children, children of children, etc.) of some node, in either depth-first or breadth-first order.

Sibling order is not specified, and may be different in different calls to this method.

Parameters
• node (BaseNode) – The node to begin walking from. It will not be included in the results.

• order (WalkOrder) – Walk depth-first (children before siblings) or breadth-first (siblings before children).

Return type

Iterator[BaseNode]

class htcondor.dags.WalkOrder(value)

An enumeration for keeping track of which order to walk through a graph. Depth-first means that parents/children will be visited before siblings. Breadth-first means that siblings will be visited before parents/children.

BREADTH_FIRST = 'BREADTH'

DEPTH_FIRST = 'DEPTH'
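The difference between the two walk orders can be seen on a toy graph. The following is a minimal sketch over a hypothetical child map, not the library's traversal code. Sibling order here simply falls out of the data structure used, and the library makes no promise about it either.

```python
from collections import deque

# Hypothetical graph: a has children b and c; b -> d; c -> e.
children = {"a": ["b", "c"], "b": ["d"], "c": ["e"], "d": [], "e": []}


def walk(roots, child_map, depth_first=True):
    """Yield each reachable node once, starting from the roots."""
    seen = set()
    pending = deque(roots)
    while pending:
        # A stack (pop from the right) gives depth-first order;
        # a queue (pop from the left) gives breadth-first order.
        node = pending.pop() if depth_first else pending.popleft()
        if node in seen:
            continue
        seen.add(node)
        yield node
        pending.extend(child_map[node])


depth_first = list(walk(["a"], children, depth_first=True))
breadth_first = list(walk(["a"], children, depth_first=False))
print(depth_first)
print(breadth_first)
```

In the depth-first result, `e` (a child of `c`) is visited before `b` (a sibling of `c`). In the breadth-first result, both siblings `b` and `c` are visited before any of their children.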

### Nodes and Node-likes

class htcondor.dags.BaseNode(dag, *, name, dir=None, noop=False, done=False, retries=None, retry_unless_exit=None, pre=None, post=None, pre_skip_exit_code=None, priority=0, category=None, abort=None)

This is the superclass for all node-like objects (things that can be the logical nodes in a DAG, like NodeLayer and SubDAG).

Generally, you do not need to construct nodes yourself; instead, they are created by calling methods like DAG.layer(), DAG.subdag(), BaseNode.child_layer(), and so forth. These methods automatically attach the new node to the same DAG as the node you called the method on.

add_children(*nodes, edge=None)

Makes all of the nodes children of this node.

Returns

self – This method returns self.

Return type

BaseNode

add_parents(*nodes, edge=None)

Makes all of the nodes parents of this node.

Returns

self – This method returns self.

Return type

BaseNode

child_layer(edge=None, **kwargs)

Create a new NodeLayer which is a child of this node.

Returns

node_layer – The newly-created node layer.

Return type

NodeLayer

child_subdag(edge=None, **kwargs)

Create a new SubDAG which is a child of this node.

Returns

subdag – The newly-created sub-DAG.

Return type

SubDAG

property children: htcondor.dags.node.Nodes

Return a Nodes containing all of the children of this node.

Return type

Nodes

parent_layer(edge=None, **kwargs)

Create a new NodeLayer which is a parent of this node.

Returns

node_layer – The newly-created node layer.

Return type

NodeLayer

parent_subdag(edge=None, **kwargs)

Create a new SubDAG which is a parent of this node.

Returns

subdag – The newly-created sub-DAG.

Return type

SubDAG

property parents: htcondor.dags.node.Nodes

Return a Nodes containing all of the parents of this node.

Return type

Nodes

remove_children(*nodes)

Makes sure that the nodes are not children of this node.

Parameters

nodes – The nodes to remove edges from.

Returns

self – This method returns self.

Return type

BaseNode

remove_parents(*nodes)

Makes sure that the nodes are not parents of this node.

Parameters

nodes – The nodes to remove edges from.

Returns

self – This method returns self.

Return type

BaseNode

walk_ancestors(order=WalkOrder.DEPTH_FIRST)

Walk over all of the ancestors of this node, in the given order.

Return type

Iterator[BaseNode]

walk_descendants(order=WalkOrder.DEPTH_FIRST)

Walk over all of the descendants of this node, in the given order.

Return type

Iterator[BaseNode]

class htcondor.dags.NodeLayer(dag, *, submit_description=None, vars=None, **kwargs)

Represents a “layer” of actual JOB nodes that share a submit description and edge relationships. Each underlying actual node’s attributes may be customized using vars.

class htcondor.dags.SubDAG(dag, *, dag_file, **kwargs)

Represents a SUBDAG in the graph.

See “A DAG Within a DAG Is a SUBDAG” in the DAGMan documentation for more information on sub-DAGs.

class htcondor.dags.FinalNode(dag, submit_description=None, **kwargs)

Represents the FINAL node in a DAG.

See the DAGMan documentation on the FINAL node for more information.

class htcondor.dags.Nodes(*nodes)

This class represents an arbitrary collection of BaseNode objects. In many cases, especially when manipulating the structure of the graph, it can be used as a replacement for directly iterating over collections of nodes.

Parameters

nodes (Union[BaseNode, Iterable[BaseNode]]) – The logical nodes that will be in this Nodes.

add_children(*nodes, type=None)

Makes all of the nodes children of all of the nodes in this Nodes.

Returns

self – This method returns self.

Return type

Nodes

add_parents(*nodes, type=None)

Makes all of the nodes parents of all of the nodes in this Nodes.

Returns

self – This method returns self.

Return type

Nodes

child_layer(type=None, **kwargs)

Create a new NodeLayer which is a child of all of the nodes in this Nodes.

Returns

node_layer – The newly-created node layer.

Return type

NodeLayer

child_subdag(type=None, **kwargs)

Create a new SubDAG which is a child of all of the nodes in this Nodes.

Returns

subdag – The newly-created sub-DAG.

Return type

SubDAG

parent_layer(type=None, **kwargs)

Create a new NodeLayer which is a parent of all of the nodes in this Nodes.

Returns

node_layer – The newly-created node layer.

Return type

NodeLayer

parent_subdag(type=None, **kwargs)

Create a new SubDAG which is a parent of all of the nodes in this Nodes.

Returns

subdag – The newly-created sub-DAG.

Return type

SubDAG

remove_children(*nodes)

Makes sure that the nodes are not children of any of the nodes in this Nodes.

Parameters

nodes – The nodes to remove edges from.

Returns

self – This method returns self.

Return type

Nodes

remove_parents(*nodes)

Makes sure that the nodes are not parents of any of the nodes in this Nodes.

Parameters

nodes – The nodes to remove edges from.

Returns

self – This method returns self.

Return type

Nodes

walk_ancestors(order=WalkOrder.DEPTH_FIRST)

Walk over all of the ancestors of all of the nodes in this Nodes, in the given order.

walk_descendants(order=WalkOrder.DEPTH_FIRST)

Walk over all of the descendants of all of the nodes in this Nodes, in the given order.

### Edges

class htcondor.dags.BaseEdge

An abstract class that represents the edge between two logical nodes in the DAG.

abstract get_edges(parent, child, join_factory)

This abstract method is used by the writer to figure out which nodes in the parent and child should be connected by an actual DAGMan edge. It should yield (or simply return an iterable of) individual edge specifications.

Each edge specification is a tuple containing two elements: the first is a group of parent node indices, the second is a group of child node indices. Either (but not both) may be replaced by a special JoinNode object provided by JoinFactory.get_join_node(). The writer passes a JoinFactory instance into this method as join_factory; you should not create one yourself.

You may yield any number of edge specifications, but the more compact you can make the representation (i.e., fewer edge specifications, each with fewer elements), the better. This is where join nodes are helpful: they can turn “many-to-many” relationships into a significantly smaller number of actual edges (for two layers of $$N$$ nodes each, $$2N$$ edges instead of $$N^2$$).

A SubDAG and a zero-vars NodeLayer each implicitly have a single node index, 0. See the source code of ManyToMany for a simple pattern for dealing with this.

Return type

Iterable[Union[Tuple[Tuple[int], Tuple[int]], Tuple[Tuple[int], JoinNode], Tuple[JoinNode, Tuple[int]]]]
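To make the edge-count argument concrete, the sketch below counts edges for a dense connection with and without an intermediate join node. It is a counting exercise only: the string `"join"` is a hypothetical placeholder for a JoinNode, and none of these functions are part of the library.

```python
from itertools import product


def dense_edges(n_parents, n_children):
    """Connect every parent directly to every child."""
    return list(product(range(n_parents), range(n_children)))


def joined_edges(n_parents, n_children, join="join"):
    """Connect every parent to one join node, and the join node to every child."""
    into_join = [(parent, join) for parent in range(n_parents)]
    out_of_join = [(join, child) for child in range(n_children)]
    return into_join + out_of_join


n = 100
direct_count = len(dense_edges(n, n))
joined_count = len(joined_edges(n, n))
print(direct_count, joined_count)
```

The dependency structure is the same in both cases, since every child still waits for every parent, but the joined form grows linearly with the layer size rather than quadratically.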

class htcondor.dags.OneToOne

This edge connects two layers “linearly”: each underlying node in the child layer is a child of the corresponding underlying node with the same index in the parent layer. The parent and child layers must have the same number of underlying nodes.
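A one-to-one connection can be pictured as pairing equal indices. The function below is a hypothetical sketch that produces (parent indices, child indices) pairs in the shape described for get_edges(). Raising `ValueError` for mismatched layer sizes is an assumption made for this example, not the library's documented behaviour.

```python
def one_to_one_edges(n_parents, n_children):
    """Pair node i in the parent layer with node i in the child layer."""
    if n_parents != n_children:
        # Assumed error type for this sketch.
        raise ValueError("parent and child layers must have the same size")
    return [((index,), (index,)) for index in range(n_parents)]


edges = one_to_one_edges(3, 3)
print(edges)
```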

class htcondor.dags.ManyToMany

This edge connects two layers “densely”: every node in the child layer is a child of every node in the parent layer.

class htcondor.dags.Grouper(parent_chunk_size=1, child_chunk_size=1)

This edge connects two layers in “chunks”. The nodes in each layer are divided into chunks based on their respective chunk sizes (given in the constructor). Chunks are then connected like a OneToOne edge.

The number of chunks in each layer must be the same, and each layer must be evenly-divided into chunks (no leftover underlying nodes).

When both chunk sizes are 1 this is identical to a OneToOne edge, and you should use that edge instead because it produces a more compact representation.

Parameters
• parent_chunk_size (int) – The number of nodes in each chunk in the parent layer.

• child_chunk_size (int) – The number of nodes in each chunk in the child layer.
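The chunking rule can be sketched in a few lines. The helper names below are hypothetical, and the use of `ValueError` is an assumption. The sketch only mirrors the constraints stated above: layers must divide evenly into chunks, and both layers must yield the same number of chunks.

```python
def chunk_indices(n_nodes, chunk_size):
    """Split the indices 0..n_nodes-1 into consecutive, equally sized chunks."""
    if n_nodes % chunk_size != 0:
        raise ValueError("layer does not divide evenly into chunks")
    return [
        tuple(range(start, start + chunk_size))
        for start in range(0, n_nodes, chunk_size)
    ]


def grouper_edges(n_parents, n_children, parent_chunk_size=1, child_chunk_size=1):
    """Connect the k-th parent chunk to the k-th child chunk."""
    parent_chunks = chunk_indices(n_parents, parent_chunk_size)
    child_chunks = chunk_indices(n_children, child_chunk_size)
    if len(parent_chunks) != len(child_chunks):
        raise ValueError("layers must have the same number of chunks")
    return list(zip(parent_chunks, child_chunks))


# Six parents in chunks of two, feeding three children in chunks of one.
edges = grouper_edges(6, 3, parent_chunk_size=2, child_chunk_size=1)
print(edges)
```

This shape suits fan-in steps, for example where each child node merges the outputs of a fixed number of parent nodes.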

class htcondor.dags.Slicer(parent_slice=slice(None, None, None), child_slice=slice(None, None, None))

This edge connects individual nodes in the layers, selected by slices. Each node from the parent layer that is in the parent slice is joined, one-to-one, with the matching node from the child layer that is in the child slice.
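Slice-based pairing behaves much like zipping two sliced index ranges. The sketch below is a hypothetical model of that idea. In particular, the text above does not say what happens when the two slices select different numbers of nodes, so stopping at the shorter selection (the behaviour of `zip`) is an assumption.

```python
def slicer_edges(n_parents, n_children, parent_slice=slice(None), child_slice=slice(None)):
    """Pair the selected parent indices with the selected child indices, in order."""
    parents = range(n_parents)[parent_slice]
    children = range(n_children)[child_slice]
    # zip stops at the shorter selection (an assumption for this sketch).
    return list(zip(parents, children))


# Every second parent (0, 2, 4) is paired with children 0, 1, 2.
edges = slicer_edges(6, 3, parent_slice=slice(None, None, 2))
print(edges)
```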

### Node Configuration

class htcondor.dags.Script(executable, arguments=None, retry=False, retry_status=1, retry_delay=0)

class htcondor.dags.DAGAbortCondition(node_exit_value, dag_return_value=None)

Represents the configuration of a node’s DAG abort condition.

Parameters
• node_exit_value (int) – If the underlying node exits with this value, the DAG will be aborted.

• dag_return_value (Optional[int]) – If the DAG is aborted via this condition, it will exit with this code, if given. If not given, it will exit with the same return value that the node did.
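The interaction between the two parameters can be summarised as a small decision function. This is a hypothetical model of the rule described above, not code from the library or from DAGMan.

```python
from typing import Optional


def dag_exit_code(
    node_exit: int,
    node_exit_value: int,
    dag_return_value: Optional[int] = None,
) -> Optional[int]:
    """Return the DAG's exit code if the abort condition fires, else None."""
    if node_exit != node_exit_value:
        return None  # Condition not met; the DAG is not aborted by this rule.
    # Fall back to the node's own exit value when no override is given.
    return node_exit if dag_return_value is None else dag_return_value


print(dag_exit_code(3, node_exit_value=3))
print(dag_exit_code(3, node_exit_value=3, dag_return_value=0))
```

Note the explicit `is None` check: a dag_return_value of 0 is a meaningful override and must not be treated as "not given".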

### Writing a DAG to Disk

htcondor.dags.write_dag(dag, dag_dir, dag_file_name='dagfile.dag', node_name_formatter=None)

Write out the given DAG to the given directory. This includes the DAG description file itself, as well as any associated submit descriptions.

Returns

dag_file_path – The path to the DAG description file; can be passed to htcondor.Submit.from_dag() if you convert it to a string, like Submit.from_dag(str(write_dag(...))).

Return type

pathlib.Path

class htcondor.dags.NodeNameFormatter

An abstract base class that represents a certain way of formatting and parsing underlying node names.

abstract generate(layer_name, node_index)

This method should generate a single node name, given the name of the layer and the index of the underlying node inside the layer.

Return type

str

abstract parse(node_name)

This method should convert a single node name back into a layer name and underlying node index. Node names must be invertible for rescue() to work.

Return type

Tuple[str, int]

class htcondor.dags.SimpleFormatter(separator=':', index_format='{:d}', offset=0)

A no-frills NodeNameFormatter that produces underlying node names like LayerName:5 (using the default separator and index format).
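The requirement that node names be invertible amounts to a round trip between generate() and parse(). The class below is a hypothetical, standalone formatter that borrows the constructor parameter names shown above. It does not subclass NodeNameFormatter, and the way `offset` is applied (added on generate, subtracted on parse) is an assumption.

```python
class RoundTripFormatter:
    """Hypothetical formatter illustrating invertible node names."""

    def __init__(self, separator=":", index_format="{:d}", offset=0):
        self.separator = separator
        self.index_format = index_format
        self.offset = offset

    def generate(self, layer_name, node_index):
        """Build a node name from a layer name and a node index."""
        index = self.index_format.format(node_index + self.offset)
        return f"{layer_name}{self.separator}{index}"

    def parse(self, node_name):
        """Recover (layer_name, node_index) from a generated node name."""
        # Split on the last separator so layer names may contain it too.
        layer_name, _, index = node_name.rpartition(self.separator)
        return layer_name, int(index) - self.offset


formatter = RoundTripFormatter()
name = formatter.generate("LayerName", 5)
print(name, formatter.parse(name))
```

Whatever format is chosen, `parse(generate(layer, i))` should return `(layer, i)`; otherwise rescue information cannot be mapped back onto the right underlying nodes.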

## DAG Configuration

class htcondor.dags.DotConfig(path, update=False, overwrite=True, include_file=None)

A DotConfig holds the configuration options for whether and how DAGMan will produce a DOT file representing its execution graph.

Parameters
• path (Path) – The path to write the DOT file to.

• update (bool) – If True, the DOT file will be updated as the DAG executes. If False, it will be written once at startup.

• overwrite (bool) – If True, the DOT file will be updated in-place. If False, new DOT files will be created next to the original.

• include_file (Optional[Path]) – Include the contents of the file at this path in the DOT file.

class htcondor.dags.NodeStatusFile(path, update_time=None, always_update=False)

A NodeStatusFile holds the configuration options for whether and how DAGMan will write a file containing node status information.

See “Capturing the Status of Nodes in a File” in the DAGMan documentation for more information.

## Rescue DAGs

htcondor.dags can read information from a DAGMan rescue file and apply it to your DAG as it is being constructed.

htcondor.dags.rescue(dag, rescue_file, formatter=None)

Applies state recorded in a DAGMan rescue file to the dag. The dag will be modified in-place.

Warning

Running this function on a DAG replaces any existing DONE information on all of its nodes. Every node will have a dictionary for its done attribute. If you want to edit this information manually, always run this function first, then make the desired changes on top.

Warning

This function cannot detect changes in node names. If node names are different in the rescue file compared to the DAG, this function will not behave as expected.

Return type

None

htcondor.dags.find_rescue_file(dag_dir, dag_file_name='dagfile.dag')

Finds the latest rescue file in a DAG directory (just like DAGMan itself would).

Returns

rescue_file – The path to the latest rescue file found in the dag_dir.

Return type

pathlib.Path
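DAGMan names rescue files by appending `.rescue` and a zero-padded sequence number to the DAG file name (for example `dagfile.dag.rescue001`). Under that convention, finding the latest rescue file could look like the sketch below. The function name is hypothetical, this is not the library's implementation, and raising `FileNotFoundError` when nothing matches is an assumption.

```python
import re
import tempfile
from pathlib import Path


def latest_rescue_file(dag_dir, dag_file_name="dagfile.dag"):
    """Return the rescue file with the highest sequence number in dag_dir."""
    pattern = re.compile(re.escape(dag_file_name) + r"\.rescue(\d+)$")
    candidates = []
    for path in Path(dag_dir).iterdir():
        match = pattern.match(path.name)
        if match:
            candidates.append((int(match.group(1)), path))
    if not candidates:
        # Assumed behaviour for this sketch.
        raise FileNotFoundError(f"no rescue files for {dag_file_name} in {dag_dir}")
    return max(candidates, key=lambda pair: pair[0])[1]


# Demonstrate on a throwaway directory containing three rescue files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "dagfile.dag").touch()
    for number in (1, 2, 10):
        (Path(tmp) / f"dagfile.dag.rescue{number:03d}").touch()
    latest_name = latest_rescue_file(tmp).name

print(latest_name)
```

Comparing the sequence numbers as integers rather than as strings keeps the ordering correct even if the zero padding varies.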