
I have the following Kubernetes Job configuration:

---
apiVersion: batch/v1
kind: Job
metadata:
  name: dbload
  creationTimestamp: 
spec:
  template:
    metadata:
      name: dbload
    spec:
      containers:
      - name: dbload
        image: sdvl3prox001:7001/pbench/tdload
        command: ["/opt/pbench/loadTpcdsData.sh",  "qas0063", "dbc", "dbc", "1"]
      restartPolicy: Never
      imagePullSecrets: 
        - name: pbenchregkey
status: {}

When I run `kubectl create -f dbload-deployment.yml --record`, the job and a pod are created, the Docker container runs to completion, and I get this status:

$ kubectl get job dbload
NAME      DESIRED   SUCCESSFUL   AGE
dbload    1         1            1h
$ kubectl get pods -a
NAME           READY     STATUS      RESTARTS   AGE
dbload-0mk0d   0/1       Completed   0          1h

This job is a one-time deal, but I need to be able to rerun it. If I attempt to rerun it with the `kubectl create` command I get this error:

$ kubectl create -f dbload-deployment.yml --record
Error from server: error when creating "dbload-deployment.yml": jobs.batch "dbload" already exists

Of course I can do `kubectl delete job dbload` and then run `kubectl create` again, but I'm wondering if I can somehow re-awaken the job that already exists?

Bostone

7 Answers


Simulate a rerun by replacing the job with itself:

  1. Backup your job:
  • kubectl get job "your-job" -o json > your-job.json
  2. Replace the job in place:
  • kubectl get job "your-job" -o json | kubectl replace --force -f -

If you get errors due to auto-generated labels or selectors, you can delete or edit them with jq:

  • kubectl get job "your-job" -o json | jq 'del(.spec.selector)' | jq 'del(.spec.template.metadata.labels)' | kubectl replace --force -f -

UPDATED with Jeremy Huiskamp's suggestion

Eddie Parker
F. Santiago

No. There is definitely no way to rerun a Kubernetes job. You need to delete it first.
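
In practice that means re-creating the job from its manifest; a minimal sketch using the names from the question:

    # delete the completed Job (this also removes its pods by default)
    kubectl delete job dbload
    # then create it again from the original file
    kubectl create -f dbload-deployment.yml --record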

cohadar
  • Those like me who need more details - this is a two-step process. First delete your job with `kubectl delete job ` and then `kubectl apply -f ` – Niks Jun 12 '20 at 14:58
  • It can be done in one step with `kubectl replace --force` - which will delete the job if it exists, and then (re-)create it unconditionally. See below. – Caesar Mar 31 '21 at 00:58
  • @Caesar An example? Because I get `error: must specify one of -f and -k` – DimiDak Dec 17 '21 at 14:55
  • Don't use `kubectl replace --force` by default. If you do and it fails (because the export includes autogenerated fields), the original job will already have been deleted and you will not be able to rerun the command. Use `kubectl get -o yaml` first, and at that point using `kubectl apply` is easier anyway. – Ordoshsen Apr 28 '22 at 09:26

You can also avoid the error you mentioned by specifying

  generateName: dbload

instead of simply name

In that case, each job you submit with this yaml file will get a unique name that looks something like dbloada1b2c. You can then decide whether to clean up the old jobs (see the sketch after the example output below), but you won't be forced to do it.

Here is a working yaml example:

apiVersion: batch/v1
kind: Job
metadata:
  generateName: netutils-
spec:
  parallelism: 1
  template:
    spec:
      containers:
      - image: amouat/network-utils 
        name: netutil
      restartPolicy: Never

This is the output from kubectl get job after two kubectl create -f example.yaml commands:

NAME             COMPLETIONS   DURATION   AGE
netutils-5bs2s   0/1           14s        14s
netutils-dsvfk   0/1           10s        10s
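
If the completed jobs start piling up, one rough way to clean them up in bulk (a sketch, assuming all the generated names share the netutils- prefix from the example above) is:

    # list all jobs by name, keep only the generated ones, and delete them
    kubectl get jobs -o name | grep 'netutils-' | xargs kubectl delete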

vp124
  • I believe generateName only applies to kind=pod and NOT job. – user518066 Aug 30 '17 at 14:28
  • No, it's a standard part of ObjectMeta and applies to both pod and job: [k8s reference](https://kubernetes.io/docs/resources-reference/v1.7/#objectmeta-v1-meta). I've been using it all the time, it's core to what I'm doing. – vp124 Sep 01 '17 at 14:52
  • Thank you very much for this dodge. Just for documentation: this only works with `kubectl create` – Ohmen Nov 29 '17 at 14:27
  • disagree. Tried to do this right now and it causes error "resource name may not be empty". – rudolfdobias Feb 08 '20 at 00:48
  • @rudolfdobias I suspect that the error you see is related to something different, not the Job.metadata.name parameter. I've been using that for years, and it works both on GKE and AKS. Would you be able to share your yaml config? – vp124 Feb 10 '20 at 16:20
  • Doesn't work for me either, I get "resource name may not be empty ". Can you provide a full working k8s yaml snippet with generateName ? – Balint Bako May 03 '20 at 09:00
  • @BalintBako, I have added an example yaml that I have just tested on GKE. (Added to my original answer, as there seems to be no way to add a block of code in the comment.) My guess is that you're omitting `name` attribute in `spec.containers`. Note that `generateName` attribute belongs to `metadata`. – vp124 May 04 '20 at 15:30
  • thx @vp124, this is how I tried it, and it doesn't work on every platform that runs k8s. It might be version dependent, I will check that. – Balint Bako May 04 '20 at 15:32
  • Are you creating your job with `kubectl create` or `kubectl apply`? The latter will not work, as per this reference: https://github.com/kubernetes/kubernetes/issues/44501. So you have to use `kubectl create`. I have also just tried on Azure, it works there as well. `kubectl version` shows `v1.15.10` on AKS and `v1.14.10-gke.27` on GKE, but like I said, I've been doing this for quite a while, so it worked on earlier versions, too. – vp124 May 04 '20 at 16:06
  • This is a better solution than delete/re-create in my usecase. Thanks – Jinto Lonappan May 29 '22 at 06:55

As an improvement on @F. Santiago's idea, you can simply use the value stored in the annotation "kubectl.kubernetes.io/last-applied-configuration", which holds the initially applied configuration without any auto-generated fields:

kubectl get job <jobname> -o json | \
jq -r '.metadata.annotations."kubectl.kubernetes.io/last-applied-configuration"' | \
kubectl replace --save-config --force -f -

Note: for kubectl replace, remember to pass --save-config so it updates the annotation field with the last config applied.
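
One caveat: that annotation only exists if the job was originally created via kubectl apply (or with --save-config), so it's worth checking that it is actually set before relying on it. A quick check, assuming the same <jobname> placeholder:

    # prints the saved config if present, prints nothing if the annotation is missing
    kubectl get job <jobname> -o jsonpath='{.metadata.annotations.kubectl\.kubernetes\.io/last-applied-configuration}'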

laimison
Marcelo

There is no way to rerun a job that has completed, but you can simulate a rerun by doing the following:

  1. Get yaml file of the existing job:

    kubectl get job <job_name> -o yaml > <job_name>.yaml
    
  2. Delete the existing job:

    kubectl delete job <job_name>
    
  3. Run the job again:

    kubectl apply -f <job_name>.yaml
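
If step 3 then fails because of the auto-generated controller-uid selector and labels (the same issue the top answer works around with jq), a variant of step 1 that exports JSON and strips those fields first might look like this:

    kubectl get job <job_name> -o json | jq 'del(.spec.selector, .spec.template.metadata.labels, .status)' > <job_name>.json

and then delete and re-apply <job_name>.json as in steps 2 and 3.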
    
Gerald Schneider

Based on @Marcelo's idea I made it work with the following, without any processing of the template:

kubectl get job <job-name> -o custom-columns=:metadata.annotations.kubectl\.kubernetes\.io/last-applied-configuration > job.json
kubectl delete -f job.json
kubectl apply -f job.json

Please note the escaped dots (\.) in the annotation name: kubectl\.kubernetes\.io/last-applied-configuration. Without them, the command returns <none>.


I've implemented F. Santiago's method as a kubectl plugin. I have these two files on my PATH [1] so that kubectl picks them up.

Now issuing kubectl replacejob [job name] looks like this:

[w] eddie@eddie ~ $ kubectl replacejob my_job_name
Writing out backup file: my_job_name.bak.json
job.batch "my_job_name" deleted
job.batch/my_job_name replaced

[1] Files to make the plugin work:

kubectl-replacejob.cmd: Simple wrapper to call python with the same args

@echo off

pushd . 
cd %~dp0
python kubectl-replacejob.py %*
popd 

kubectl-replacejob.py: Does the 'hard' work of replacement.

import sys
import subprocess
import json

if len( sys.argv ) < 2:
    print("Error: please specify the job you wish to replace.")
    sys.exit(-1)

job_name = sys.argv[1]

# Fetch the job as json
job_as_json_text = subprocess.check_output(f'kubectl get job {job_name} -o json', shell=True).decode()
job_as_json = json.loads(job_as_json_text)

# Save out a backup
backup_file = f'{job_name}.bak.json'
print(f"Writing out backup file: {backup_file}")
with open(backup_file, 'w') as f:
    f.write(job_as_json_text)

# Remove references to controller uid that borks replace
def remove_key_if_present(obj, *keys):
    for i in range(len(keys)):
        key = keys[i]
        if key in obj:
            if i == len(keys)-1:
                del obj[key]
            else:
                obj = obj[key]
        else:
            print(f"WARNING: Failed to remove {'.'.join(keys)}: failed finding key at {key}!")
            return 


remove_key_if_present(job_as_json, 'spec', 'selector', 'matchLabels', 'controller-uid')
remove_key_if_present(job_as_json, 'spec', 'template', 'metadata', 'labels', 'controller-uid')
job_as_json_text = json.dumps(job_as_json)

# Pretty print for testing
#print(json.dumps(job_as_json, indent=4, sort_keys=True))

# Issue the replace
subprocess.run('kubectl replace --force -f -', shell=True, input=job_as_json_text.encode())
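
The .cmd wrapper above is Windows-specific. On Linux/macOS the rough equivalent (a sketch, assuming the Python script lives next to it) is an executable file named kubectl-replacejob somewhere on PATH, for example:

#!/usr/bin/env bash
# hypothetical Unix counterpart of kubectl-replacejob.cmd:
# forward all arguments to the Python implementation next to this script
exec python3 "$(dirname "$0")/kubectl-replacejob.py" "$@"

(remember to chmod +x kubectl-replacejob so kubectl can discover and run it).
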
Eddie Parker