High Availability for Tekton Pipelines Controller from OpenShift Pipelines 1.12
High Availability for Tekton Pipelines Controller
If you want to run Tekton Pipelines to support high concurrent workload
scenarios, you need to follow following steps to enable high
availability for tekton-pipelines-controller
. With this approach, you
divide the workqueue in buckets, and each replica owns a subset of those
buckets and process the load if that replica is the leader of that
bucket.
NOTE: This doc has been valid for |
Enable High Availability With Operator on OpenShift:
-
First, specify the number of buckets across which you want to distribute workqueue. The maximum supported value for bucket is
10
. Also specify the number of replicas of tekton-pipelines-controller you need to share the workloads process by the controller. You need to update theconfig
CR with the required number of buckets and replicas.
oc edit tektonconfig config
apiVersion: operator.tekton.dev/v1alpha1
kind: TektonConfig
metadata:
name: config
spec:
<<--->>
pipeline:
<<--->>
performance:
disable-ha: false
buckets: < buckets required >
replicas: < replicas required >
<<--->>
-
Now delete the existing leases so that old replica removes hold and new leases gets created which get acquired by different replicas.
oc delete -n openshift-pipelines $(oc get leases -n openshift-pipelines -o name | grep tekton-pipelines-controller)
Note:
-
If by chance the leases are not properly divided over multiple pods, you can try recreating them.
-
Configmap config-leader-election is by default getting used in knative controllers. So the change in data of this gets read by all other components too. It is fixed for pipelines-controller in 1.11.1 and for others in 1.13.0
Disable High Availability With Operator on OpenShift:
-
Remove the bucket and replicas field on the TektonConfig CR to remove the workqueue distribution or set them both to 1
oc edit tektonconfig config
apiVersion: operator.tekton.dev/v1alpha1
kind: TektonConfig
metadata:
name: config
spec:
<<--->>
pipeline:
<<--->>
performance:
disable-ha: false
<<--->>
-
Next is to delete the dangling leases which were acquired previously by the replicas
oc delete -n openshift-pipelines $(oc get leases -n openshift-pipelines -o name | grep tekton-pipelines-controller)