High Availability for Tekton Pipelines Controller from OpenShift Pipelines 1.12

High Availability for Tekton Pipelines Controller

If you want to run Tekton Pipelines to support high concurrent workload scenarios, you need to follow following steps to enable high availability for tekton-pipelines-controller. With this approach, you divide the workqueue in buckets, and each replica owns a subset of those buckets and process the load if that replica is the leader of that bucket.

NOTE: This doc has been valid for OpenShift Pipelines from v1.12.0 onwards.

Enable High Availability With Operator on OpenShift:

  1. First, specify the number of buckets across which you want to distribute workqueue. The maximum supported value for bucket is 10. Also specify the number of replicas of tekton-pipelines-controller you need to share the workloads process by the controller. You need to update the config CR with the required number of buckets and replicas.

oc edit tektonconfig config
apiVersion: operator.tekton.dev/v1alpha1
kind: TektonConfig
metadata:
    name: config
spec:
    <<--->>
    pipeline:
        <<--->>
        performance:
            disable-ha: false
            buckets: < buckets required >
            replicas: < replicas required >
        <<--->>
  1. Now delete the existing leases so that old replica removes hold and new leases gets created which get acquired by different replicas.

oc delete -n openshift-pipelines $(oc get leases -n openshift-pipelines -o name | grep tekton-pipelines-controller)

Note:

  • If by chance the leases are not properly divided over multiple pods, you can try recreating them.

  • Configmap config-leader-election is by default getting used in knative controllers. So the change in data of this gets read by all other components too. It is fixed for pipelines-controller in 1.11.1 and for others in 1.13.0

Disable High Availability With Operator on OpenShift:

  1. Remove the bucket and replicas field on the TektonConfig CR to remove the workqueue distribution or set them both to 1

oc edit tektonconfig config
apiVersion: operator.tekton.dev/v1alpha1
kind: TektonConfig
metadata:
    name: config
spec:
    <<--->>
    pipeline:
        <<--->>
        performance:
            disable-ha: false
        <<--->>
  1. Next is to delete the dangling leases which were acquired previously by the replicas

oc delete -n openshift-pipelines $(oc get leases -n openshift-pipelines -o name | grep tekton-pipelines-controller)