Convert result watcher deployment to statefulset ordinals #2616

Open: wants to merge 4 commits into base: main

@mbpavan mbpavan commented Mar 4, 2025

Changes

Converts the results watcher from a Deployment to a StatefulSet, so it can scale horizontally using StatefulSet ordinals instead of relying on leader election.
Limitation: This PR covers only the results watcher. For the results API, converting the Deployment to a StatefulSet with ordinals would not improve performance because clients hold sticky gRPC sessions; we would need to find another way to load-balance incoming requests.

Submitter Checklist

These are the criteria that every PR should meet, please check them off as you
review them:

See the contribution guide for more details.

Release Notes

@tekton-robot tekton-robot added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Mar 4, 2025
@tekton-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign concaf after the PR has been reviewed.
You can assign the PR to them by writing /assign @concaf in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tekton-robot
Contributor

Hi @mbpavan. Thanks for your PR.

I'm waiting for a tektoncd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@tekton-robot tekton-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 4, 2025
@jkhelil
Member

jkhelil commented Mar 4, 2025

/ok-to-test

@tekton-robot tekton-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 4, 2025
@tekton-robot
Contributor

The following is the coverage report on the affected files.
Say /test pull-tekton-operator-go-coverage to re-run this coverage report

| File | Old Coverage | New Coverage | Delta |
| --- | --- | --- | --- |
| pkg/reconciler/kubernetes/tektoninstallerset/client/cleanup.go | 45.8% | 40.3% | -5.5 |
| pkg/reconciler/kubernetes/tektonresult/metrics.go | 68.0% | 60.7% | -7.3 |
| pkg/reconciler/kubernetes/tektonresult/transform.go | 78.8% | 77.6% | -1.2 |
| pkg/reconciler/shared/tektonconfig/result/result.go | 71.4% | 69.7% | -1.7 |

Member

@jkhelil jkhelil left a comment

It is important to validate these scenarios:

  • Updating the config from Deployment to StatefulSet deletes the InstallerSet (IS), creates a new one, and creates the StatefulSet.
  • Updating the config from StatefulSet to Deployment (disabling statefulset-ordinals) switches the InstallerSet, deletes the StatefulSet, and creates a Deployment.
  • The upgrade scenario works fine (check that the feature does not break upgrades, both when a StatefulSet is in use and when a Deployment is in use).
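
The scenarios above can be written down as a small table, sketched here in Go. The `WorkloadKind` type and the helpers `desiredKind` and `needsRecreate` are hypothetical names introduced only to enumerate the cases; they are not functions from the operator, and the sketch assumes the workload kind is driven solely by the statefulset-ordinals flag.

```go
package main

import "fmt"

// WorkloadKind is a hypothetical enum for the kind of workload in use.
type WorkloadKind string

const (
	Deployment  WorkloadKind = "Deployment"
	StatefulSet WorkloadKind = "StatefulSet"
)

// desiredKind picks the workload kind from the statefulset-ordinals flag.
func desiredKind(statefulSetOrdinals bool) WorkloadKind {
	if statefulSetOrdinals {
		return StatefulSet
	}
	return Deployment
}

// needsRecreate reports whether the flag implies a different workload kind
// than the one currently installed. Per the review comment, that is the case
// in which the InstallerSet is expected to be replaced.
func needsRecreate(current WorkloadKind, statefulSetOrdinals bool) bool {
	return current != desiredKind(statefulSetOrdinals)
}

func main() {
	scenarios := []struct {
		name    string
		current WorkloadKind
		flag    bool
	}{
		{"deployment to statefulset", Deployment, true},
		{"statefulset to deployment", StatefulSet, false},
		{"upgrade with deployment", Deployment, false},
		{"upgrade with statefulset", StatefulSet, true},
	}
	for _, s := range scenarios {
		fmt.Printf("%-28s want=%s recreate=%t\n",
			s.name, desiredKind(s.flag), needsRecreate(s.current, s.flag))
	}
}
```

A real validation would assert on cluster state after each reconcile; the table only pins down which end state each scenario expects.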

disabled: false
is_external_db: false
options: {}
performance:
Member

I wanted to see the PipelinePerformanceProperties type being reused. We could move this type to the common package and use it wherever needed, instead of creating new types. This would ensure a more coherent configuration across the project.
@PuneetPunamiya @khrm wdyt?

Member

Now the options struct can set many attributes. This includes Deployment, StatefulSet, and even horizontalPodAutoscalers.

// additional options will be updated on the manifests
// these values will be final
type AdditionalOptions struct {
	Disabled                    *bool                                            `json:"disabled,omitempty"`
	ConfigMaps                  map[string]corev1.ConfigMap                      `json:"configMaps,omitempty"`
	Deployments                 map[string]appsv1.Deployment                     `json:"deployments,omitempty"`
	HorizontalPodAutoscalers    map[string]autoscalingv2.HorizontalPodAutoscaler `json:"horizontalPodAutoscalers,omitempty"`
	StatefulSets                map[string]appsv1.StatefulSet                    `json:"statefulSets,omitempty"`
	WebhookConfigurationOptions map[string]WebhookConfigurationOptions           `json:"webhookConfigurationOptions,omitempty"`
}

apiVersion: operator.tekton.dev/v1alpha1
kind: TektonConfig
metadata:
  name: config
spec:
  pipeline:
    options:
      disabled: false
      deployments:
        tekton-events-controller:
          spec:
            replicas: 1
      horizontalPodAutoscalers:
        tekton-pipelines-remote-resolvers:
          spec:
            maxReplicas: 5
            metrics:
              - resource:
                  name: cpu
                  target:
                    averageUtilization: 50
                    type: Utilization
                type: Resource
              - resource:
                  name: memory
                  target:
                    averageUtilization: 50
                    type: Utilization
                type: Resource
            minReplicas: 1
            scaleTargetRef:
              apiVersion: apps/v1
              kind: Deployment
              name: tekton-pipelines-remote-resolvers

Member

I learned for the first time that there is a dedicated structure called PipelinePerformanceProperties. 😆

Member

@l-qing PipelinePerformanceProperties holds performance configuration for the controllers (pipelines and resolvers), such as buckets, replicas, and statefulset-ordinals. I think it is not named well; it should be called ScalingConfiguration or something similar. It may be somehow related to what you mention under options; maybe it needs a big refactor.

@@ -54,6 +54,7 @@ type TektonResultSpec struct {
CommonSpec `json:",inline"`
ResultsAPIProperties `json:",inline"`
LokiStackProperties `json:",inline"`
Performance ResultPerformanceProperties `json:"performance,omitempty"`
Member

Please check my comment regarding reusing PipelinePerformanceProperties

-	if len(list.Items) != 1 {
-		logger.Errorf("found more than 1 installerSet for %s something fishy, cleaning up all", isType)
+	if len(list.Items) > 1 {
+		logger.Errorf("Found more than 1 installerSet for %s; cleaning up all", isType)
Member

s/Found/found

@@ -34,6 +34,7 @@ const (
InstallerTypePre = "pre"
InstallerTypePost = "post"
InstallerTypeCustom = "custom"
InstallerTypeResult = "result"
Member

We don't need this.

}

// If the TektonResult's Spec for statefulset Replicas is non-nil and greater than 0, it updates the StatefulSet's replicas.
func UpdateStatefulSetReplicasForResultWatcher(tr *v1alpha1.TektonResult) mf.Transformer {
Member

use common method
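
One possible shape for such a common method, sketched without any Kubernetes dependencies: a single helper that sets `spec.replicas` on whichever workload matches a given kind and name, so the same code serves Deployments and StatefulSets. The helper name `setReplicas`, the plain-map representation of the object, and the resource name `tekton-results-watcher` are assumptions for illustration; the operator's existing common helper may look quite different.

```go
package main

import "fmt"

// setReplicas sets spec.replicas on obj when its kind and name match and
// replicas is non-nil and greater than 0. It reports whether obj was updated.
// obj is a generic nested map standing in for an unstructured manifest object.
func setReplicas(obj map[string]interface{}, kind, name string, replicas *int32) bool {
	if replicas == nil || *replicas <= 0 {
		return false
	}
	if obj["kind"] != kind {
		return false
	}
	meta, _ := obj["metadata"].(map[string]interface{})
	if meta == nil || meta["name"] != name {
		return false
	}
	spec, ok := obj["spec"].(map[string]interface{})
	if !ok {
		// Create the spec map if the manifest does not have one yet.
		spec = map[string]interface{}{}
		obj["spec"] = spec
	}
	spec["replicas"] = int64(*replicas)
	return true
}

func main() {
	sts := map[string]interface{}{
		"kind":     "StatefulSet",
		"metadata": map[string]interface{}{"name": "tekton-results-watcher"},
		"spec":     map[string]interface{}{"replicas": int64(1)},
	}
	n := int32(3)
	updated := setReplicas(sts, "StatefulSet", "tekton-results-watcher", &n)
	fmt.Println(updated, sts["spec"])
}
```

Passing the kind as a parameter is what removes the need for a result-watcher-specific transformer.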

@tekton-robot tekton-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 10, 2025
@mbpavan mbpavan force-pushed the pbheeman-result-statefulset branch from ec956c0 to 7657020 Compare March 10, 2025 10:14
@tekton-robot tekton-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Mar 10, 2025
@tekton-robot
Contributor

The following is the coverage report on the affected files.
Say /test pull-tekton-operator-go-coverage to re-run this coverage report

| File | Old Coverage | New Coverage | Delta |
| --- | --- | --- | --- |
| pkg/reconciler/kubernetes/tektoninstallerset/client/cleanup.go | 45.8% | 40.3% | -5.5 |
| pkg/reconciler/kubernetes/tektonresult/metrics.go | 68.0% | 60.7% | -7.3 |
| pkg/reconciler/kubernetes/tektonresult/transform.go | 78.8% | 77.6% | -1.2 |
| pkg/reconciler/shared/tektonconfig/result/result.go | 71.4% | 69.7% | -1.7 |

@tekton-robot tekton-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 10, 2025
@tekton-robot
Contributor

The following is the coverage report on the affected files.
Say /test pull-tekton-operator-go-coverage to re-run this coverage report

| File | Old Coverage | New Coverage | Delta |
| --- | --- | --- | --- |
| pkg/apis/operator/v1alpha1/performance_validation.go | Do not exist | 100.0% | |
| pkg/apis/operator/v1alpha1/tektonpipeline_validation.go | 86.0% | 81.2% | -4.8 |
| pkg/apis/operator/v1alpha1/tektonresult_validation.go | 26.7% | 56.2% | 29.6 |
| pkg/reconciler/common/utils.go | 83.3% | 65.2% | -18.1 |
| pkg/reconciler/kubernetes/tektoninstallerset/client/cleanup.go | 45.8% | 40.3% | -5.5 |
| pkg/reconciler/kubernetes/tektonpipeline/transform.go | 87.6% | 87.0% | -0.6 |
| pkg/reconciler/kubernetes/tektonresult/metrics.go | 68.0% | 60.7% | -7.3 |
| pkg/reconciler/kubernetes/tektonresult/transform.go | 78.8% | 74.0% | -4.8 |
| pkg/reconciler/shared/tektonconfig/result/result.go | 71.4% | 69.7% | -1.7 |
