Hold Reason Codes

| Integer HoldReasonCode [NumHoldsByReason Label] | Reason for Hold | HoldReasonSubCode | Suggestions to user to fix |
|---|---|---|---|
| 1 [UserRequest] | The user put the job on hold with condor_hold. | | |
| 3 [JobPolicy] | The PERIODIC_HOLD expression evaluated to True, or ON_EXIT_HOLD was true. | User Provided | |
| 4 [CorruptedCredential] | The credentials for the job are invalid. | | |
| 5 [JobPolicyUndefined] | A job policy expression evaluated to Undefined. | | |
| 6 [FailedToCreateProcess] | The condor_starter failed to start the executable. | Unix errno | |
| 7 [UnableToOpenOutput] | The standard output file for the job could not be opened. | Unix errno | |
| 8 [UnableToOpenInput] | The standard input file for the job could not be opened. | Unix errno | |
| 9 [UnableToOpenOutputStream] | The standard output stream for the job could not be opened. | Unix errno | |
| 10 [UnableToOpenInputStream] | The standard input stream for the job could not be opened. | Unix errno | |
| 11 [InvalidTransferAck] | An internal HTCondor protocol error was encountered when transferring files. | | |
| 12 [TransferOutputError] | An error occurred while transferring job output files or self-checkpoint files. | See note | |
| 13 [TransferInputError] | An error occurred while transferring job input files. | See note | |
| 14 [IwdError] | The initial working directory of the job cannot be accessed. | Unix errno | Verify initialdir exists and is writeable. |
| 15 [SubmittedOnHold] | The user requested the job be submitted on hold. | | |
| 16 [SpoolingInput] | Input files are being spooled. | | Wait for spooling to complete. |
| 17 [JobShadowMismatch] | A standard universe job is not compatible with the condor_shadow version available on the submitting machine. | | |
| 18 [InvalidTransferGoAhead] | An internal HTCondor protocol error was encountered when transferring files. | | |
| 19 [HookPrepareJobFailure] | <Keyword>_HOOK_PREPARE_JOB was defined but could not be executed or returned failure. | | |
| 20 [MissedDeferredExecutionTime] | The job missed its deferred execution time and therefore failed to run. | | |
| 21 [StartdHeldJob] | The job was put on hold because WANT_HOLD in the machine policy was true. | | |
| 22 [UnableToInitUserLog] | Unable to initialize job event log. | | Verify that the file given by log lives in a writeable directory. |
| 23 [FailedToAccessUserAccount] | Failed to access user account. | | |
| 24 [NoCompatibleShadow] | No compatible shadow. | | |
| 25 [InvalidCronSettings] | Invalid cron settings. | | |
| 26 [SystemPolicy] | SYSTEM_PERIODIC_HOLD evaluated to true. | | |
| 27 [SystemPolicyUndefined] | The system periodic job policy evaluated to undefined. | | |
| 32 [MaxTransferInputSizeExceeded] | The maximum total input file transfer size was exceeded. (See MAX_TRANSFER_INPUT_MB.) | | |
| 33 [MaxTransferOutputSizeExceeded] | The maximum total output file transfer size was exceeded. (See MAX_TRANSFER_OUTPUT_MB.) | | |
| 34 [JobOutOfResources] | Job resource usage exceeded the provisioned limit. | Exceeded Resource. Memory: 102, Disk: 104 | Resubmit with a larger resource request, e.g. request_memory or request_disk. |
| 35 [InvalidDockerImage] | Specified Docker image was invalid. | | Verify docker_image is correct in the submit file. |
| 36 [FailedToCheckpoint] | Job failed when sent the checkpoint signal it requested. | | |
| 37 [EC2UserError] | User error in the EC2 universe: | | |
| | Public key file not defined. | 1 | |
| | Private key file not defined. | 2 | |
| | Grid resource string missing EC2 service URL. | 4 | |
| | Failed to authenticate. | 9 | |
| | Can’t use existing SSH keypair with the given server’s type. | 10 | |
| | You, or somebody like you, cancelled this request. | 20 | |
| 38 [EC2InternalError] | Internal error in the EC2 universe: | | |
| | Grid resource type not EC2. | 3 | |
| | Grid resource type not set. | 5 | |
| | Grid job ID is not for EC2. | 7 | |
| | Unexpected remote job status. | 21 | |
| 39 [EC2AdminError] | Administrator error in the EC2 universe: | | |
| | EC2_GAHP not defined. | 6 | |
| 40 [EC2ConnectionProblem] | Connection problem in the EC2 universe: | | |
| | …while creating an SSH keypair. | 11 | |
| | …while starting an on-demand instance. | 12 | |
| | …while requesting a spot instance. | 17 | |
| 41 [EC2ServerError] | Server error in the EC2 universe: | | |
| | Abnormal instance termination reason. | 13 | |
| | Unrecognized instance termination reason. | 14 | |
| | Resource was down for too long. | 22 | |
| 42 [EC2InstancePotentiallyLost] | Instance potentially lost due to an error in the EC2 universe: | | |
| | Connection error while terminating an instance. | 15 | |
| | Failed to terminate instance too many times. | 16 | |
| | Connection error while terminating a spot request. | 17 | |
| | Failed to terminate a spot request too many times. | 18 | |
| | Spot instance request purged before instance ID acquired. | 19 | |
| 43 [PreScriptFailed] | Pre script failed. | | |
| 44 [PostScriptFailed] | Post script failed. | | |
| 45 [SingularityTestFailed] | Test of singularity runtime failed before launching a job. | | |
| 46 [JobDurationExceeded] | The job’s allowed duration was exceeded. | | |
| 47 [JobExecuteExceeded] | The job’s allowed execution time was exceeded. | | |
| 48 [HookShadowPrepareJobFailure] | Prepare job shadow hook failed when it was executed; status code indicated job should be held. | | |
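
The subcode conventions listed above can be turned into a small helper. What follows is only a sketch in Python, not part of HTCondor: the function name `explain_subcode` and both lookup tables are hypothetical, built from the entries above (a Unix errno for codes 6 to 10 and 14; 102 and 104 for code 34). It also assumes the errno numbering of the machine running the script matches that of the machine where the error occurred, which may not hold across platforms.

```python
import os

# Hold codes whose HoldReasonSubCode is documented above as a Unix errno.
ERRNO_SUBCODE_HOLDS = {6, 7, 8, 9, 10, 14}

# Subcodes documented above for hold code 34 [JobOutOfResources].
RESOURCE_SUBCODES = {102: "memory", 104: "disk"}


def explain_subcode(hold_code, subcode):
    """Return a short description of a subcode, or None if none is known.

    Hypothetical helper: it covers only the errno and resource cases
    listed in the hold reason code entries above.
    """
    if hold_code in ERRNO_SUBCODE_HOLDS:
        # Assumption: local errno numbering matches the reporting machine.
        return os.strerror(subcode)
    if hold_code == 34:
        return RESOURCE_SUBCODES.get(subcode, f"unknown resource ({subcode})")
    return None
```

For example, `explain_subcode(34, 102)` returns `"memory"`, pointing at request_memory as the value to raise, while `explain_subcode(14, 2)` returns the local system's message for errno 2.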

Note

For hold codes 12 [TransferOutputError] and 13 [TransferInputError]: file transfer may invoke file-transfer plug-ins. If it does, the hold subcode may additionally be either 62 (ETIME), if the file-transfer plug-in timed out, or otherwise the exit code of the plug-in shifted left by eight bits.
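
As a rough illustration of the note, the plug-in subcodes could be decoded as below. This is a sketch under assumptions, not HTCondor code: `describe_transfer_subcode` is a hypothetical name, and the check that the low eight bits are zero is an assumed heuristic for recognizing a shifted exit code. Subcodes that did not come from a plug-in cannot be reliably told apart this way.

```python
ETIME_SUBCODE = 62  # value given in the note for a plug-in timeout (ETIME)


def describe_transfer_subcode(hold_code, subcode):
    """Describe a subcode for hold codes 12 and 13 (hypothetical helper)."""
    if hold_code not in (12, 13):
        raise ValueError("only hold codes 12 and 13 use these subcodes")
    if subcode == ETIME_SUBCODE:
        return "file-transfer plug-in timed out (ETIME)"
    # Assumption: a shifted exit code is non-zero with empty low eight bits.
    if subcode >= 256 and subcode & 0xFF == 0:
        return f"file-transfer plug-in exited with code {subcode >> 8}"
    return f"subcode {subcode} is not a plug-in timeout or shifted exit code"
```

With this sketch, a subcode of 256 on hold code 13 is reported as a plug-in exit code of 1, since 256 shifted right by eight bits is 1.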