Troubleshooting

Mutation failed due to admission controller

There might be cases, when trying to mutate (kubectl apply, kubectl patch) an object, it will be disallowed by admission controller. In most cases, this is expected, but it can sometimes lead to blocking circumstances.

Faulty EventListener

The following example (extracted from SRVKP-2286) shows that an upgrade removed a field, but this field is validated when trying to mutate (and here delete) the object. Making the object not deletable.

'Failed to update status for "qcontrol-services": admission webhook "webhook.triggers.tekton.dev" denied the request: mutation failed: cannot decode incoming old object: json: unknown field "podTemplate"'

To work-around this issue, the idea is to be able to temporarly disable the admission controller(s), remove the faulty object and then re-enable the admission controller(s). Because the admission controller(s) (as well as most payload that makes OpenShift Pipelines) are managed by the operator, trying to delete the admission controller(s) will result of it being re-created in a minute or two by the operator. What we’ll do is : temporarly disable the operator, remove the admission controller(s), and re-enable the operator. When re-enabled, the operator will re-create the admission controller.

For the example above (aka faulty EventListener), the steps to follow in order are:

  1. Disable the operator (scale to 0)

    $ oc scale deploy -n openshift-operators openshift-pipelines-operator --replicas=0
    # […]
    # Validate that the deployment is scaled to 0
    $ oc get deploy openshift-pipelines-operator -n openshift-operators
    NAME                           READY   UP-TO-DATE   AVAILABLE   AGE
    openshift-pipelines-operator   0/0     0            0           80m
  2. Delete the triggers webhook (aka unregister it)

    $ oc delete validatingwebhookconfigurations validation.webhook.triggers.tekton.dev
    # […]
    $ oc delete validatingwebhookconfigurations config.webhook.triggers.tekton.dev
    # […]
    $ oc delete mutatingwebhookconfigurations webhook.triggers.tekton.dev
    # […]
    $ oc get validatingwebhookconfigurations
    # should not contains any `*.webhook.triggers.tekton.dev` object
    $ oc get mutatingwebhookconfigurations
    # should not contains any `webhook.triggers.tekton.dev` object
  3. Delete the eventlistener

    $ oc delete eventlistener faulty-el
    # […]
  4. Re-enable the operator (scale to 1)

    $ oc scale deploy -n openshift-operators openshift-pipelines-operator --replicas=1
    # […]
    # Validate that the deployment is scaled to 1
    $ oc get deploy openshift-pipelines-operator -n openshift-operators
    NAME                           READY   UP-TO-DATE   AVAILABLE   AGE
    openshift-pipelines-operator   0/0     0            0           80m
  5. Wait a bit and see the triggers webhook re-registered

    $ oc get validatingwebhookconfigurations
    # should contains two `*.webhook.triggers.tekton.dev` object
    NAME                                     WEBHOOKS   AGE
    # […]
    config.webhook.triggers.tekton.dev       1          5s
    # […]
    validation.webhook.triggers.tekton.dev   1          5s
    $ oc get mutatingwebhookconfigurations
    # should contains a `webhook.triggers.tekton.dev` object
    NAME                             WEBHOOKS   AGE
    # […]
    webhook.triggers.tekton.dev      1          45s
  6. Create the eventlistener (if need be)

This can be adapted to other objects as well. The main element to know here is, which admission controller affect which object.