Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update api version, vendor & enhance testing #605

Merged

Conversation

jchunkins
Copy link
Contributor

@jchunkins jchunkins commented Mar 14, 2021

Description of the change:

The original purpose for this PR was to update the api project to v0.6.1 to pick up validator changes, but I decided to use this opportunity to add further testing enhancements and improve documentation. I wanted to run the e2e tests locally without needing to setup a TLS enabled registry, and discovered some code paths which are capable of handling "skip TLS" but did not expose the feature. I've executed unit tests and e2e tests using my local development machine (a Mac) and all run cleanly (with exception of one e2e test which requires a linux executable to function correctly, so it should work in the GitHub Actions environment).

  • update api project to v0.7.1 to pick up validator (and other) changes
  • update code paths which support skip tls but did not expose it
  • add test case and test data for annotation case sensitive validation
  • update test case to handle multiple errors
  • add .gitignore for artifacts generated from e2e tests
  • use mkcerts for better crossplatform SSL cert generation
  • create dedicated script for starting regsitry with & without SSL
  • implement e2e ctx (i.e. context) package
  • Provision impl for kind servers
  • Provision impl for kubeconfig (i.e. all other server types)
  • adjust makefile to account for kind build tag
  • allow e2e target to use focus flag
  • allow e2e target to use skip tls flag for local testing
  • adjust test.yml git action workflow to use start_registry script
  • adjust test.yml git action to remove kind (since its now dynamic)
  • add e2e documentation for all scenarios
  • update vendor

Implements #568

Closes #568

Signed-off-by: John Hunkins [email protected]

Motivation for the change:

See Description

Reviewer Checklist

  • Implementation matches the proposed design, or proposal is updated to match implementation
  • Sufficient unit test coverage
  • Sufficient end-to-end test coverage
  • Docs updated or added to /docs
  • Commit messages sensible and descriptive

Sorry, something went wrong.

@openshift-ci-robot
Copy link

Hi @jchunkins. Thanks for your PR.

I'm waiting for a operator-framework member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Mar 14, 2021
@codecov
Copy link

codecov bot commented Mar 14, 2021

Codecov Report

Merging #605 (425be61) into master (88eda1a) will not change coverage.
The diff coverage is 0.00%.

❗ Current head 425be61 differs from pull request most recent head 9fc4771. Consider uploading reports for the commit 9fc4771 to get more accurate results
Impacted file tree graph

@@           Coverage Diff           @@
##           master     #605   +/-   ##
=======================================
  Coverage   50.59%   50.59%           
=======================================
  Files          68       68           
  Lines        5483     5483           
=======================================
  Hits         2774     2774           
  Misses       2097     2097           
  Partials      612      612           
Impacted Files Coverage Δ
pkg/lib/bundle/exporter.go 0.00% <0.00%> (ø)
pkg/lib/indexer/indexer.go 6.93% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 88eda1a...9fc4771. Read the comment docs.

@dinhxuanvu
Copy link
Member

/ok-to-test

@openshift-ci-robot openshift-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 20, 2021
@jchunkins
Copy link
Contributor Author

/retest

@jchunkins
Copy link
Contributor Author

/test ?

@openshift-ci-robot
Copy link

@jchunkins: The following commands are available to trigger jobs:

  • /test e2e-aws
  • /test images
  • /test unit

Use /test all to run all jobs.

In response to this:

/test ?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jchunkins
Copy link
Contributor Author

@dinhxuanvu any idea on how to restart e2e-minikube and e2e-kind ?

@timflannagan
Copy link
Member

You need to manually kick those failed jobs. I'm not entirely sure prow has the ability to requeue GitHub actions, but I requeued them in the meantime. When you click on one of the failed checks, there's an option to re-test - It's possible that requires some level of permissions in the organization.

@jchunkins
Copy link
Contributor Author

@timflannagan Thanks for the assist. Since I am not in the org, I don't see the UI controls to allow restarting GitHub Actions. Maybe someday I'll get added. At any rate, thanks again for your help.

@timflannagan
Copy link
Member

Hmm, it looks like that recently added bundle unpack command has been producing more flakes recently. Going to retest the jobs again, but we may need to start diving into those errors.

@jchunkins
Copy link
Contributor Author

I edited the make file on my local system and added --untilItFails like so:

$(GO) run github.com/onsi/ginkgo/ginkgo --v --untilItFails --randomizeAllSpecs --randomizeSuites --race $(if $(TEST),-focus '$(TEST)') $(TAGS) ./test/e2e -- $(if $(SKIPTLS),-skip-tls true) 

And then executed the test:

KUBECONFIG="/tmp/kindconfig" DOCKER_REGISTRY_HOST=localhost:5000 make build e2e TEST='fails to unpack'

And it repeatedly runs without failing. Must be some combination of running with other tests that makes it fail?

@timflannagan
Copy link
Member

Same, I wasn't able to reproduce it consistently when I was attempting to dive into those errors locally a couple of weeks back. It seems like it pops up often enough in the e2e-[minikube,kind] tests.

@njhale Any idea why that bundle unpack command occasionally fails the e2e suite? It seems like we're expecting that command will purposefully fail when feeding it an invalid bundle, but the error output is nil.

@jchunkins
Copy link
Contributor Author

@timflannagan @njhale I am not sure if the unpack flake is due to TLS in some oddball circumstances? Or maybe it only happens on linux? Since @timflannagan saw the issue previously it seems unlikely to be related to my changes in this PR.

While I am thinking about TLS, I have included some changes relating to skip-tls handling in some of the code paths of this PR as an enhancement for testing. I think it is worthwhile discussing the actual argument to enable/disable TLS when running the actual commands. Note that at this line the parent command (in this case it would be the root command) gets its flag setup for skip-tls.

This effectively makes the skip-tls argument apply to the root and all of its children, which as far as I can tell is the intended behavior. It just seems odd to do it in the index command.... if it is truly an argument that applies to all sub commands (i.e. global) then why not just define it in cmd/opm/root/cmd.go? This would make it much more obvious. If it is supposed to be global then there is no need to define the same variable in other places like add.go and unpack.go

To see all of the places where skip-tls shows up, use this query https://github.com/operator-framework/operator-registry/search?q=%22skip-tls%22

Is there anyone else I should tag on this PR to discuss this? I can always open up a separate PR to handle this sort of change if we think its worthwhile.

@jchunkins
Copy link
Contributor Author

Regarding the flakes in fails to unpack, I tried to see if this was a linux vs. Mac difference and setup a ubuntu 20.04 system and used -untilItFails like I indicated earlier and it also runs without errors continuously. Getting the feeling that maybe running other tests at the same time somehow causes the environment to become dirty in some way, leading to failure depending on what test might proceed fails to unpack. Seems possible given that --randomizeAllSpecs and --randomizeSuites is used. Obviously this is just a total guess on my part.

@timflannagan
Copy link
Member

Yeah, agreed and we just merged a fix that disabled those problematic/flaky tests, with the intention of re-enabling in the next couple of weeks. It looks like GitHub actions will run out-of-date workflows/tests, so we may need to open/close this PR and re-apply any labels if the flakes are too overwhelming, or force an empty commit.

@timflannagan
Copy link
Member

But just to clarify, those bundle unpack errors are known and tracked, and we won't block merging PR content with e2e failures that fail solely for something that's difficult to reproduce locally.

@jchunkins
Copy link
Contributor Author

OK... I updated github.com/operator-framework/api v0.7.1 since that was now available and rebased / squashed commits. Hopefully the tests run cleanly and this can get reviewed / merged.

@jchunkins
Copy link
Contributor Author

Excellent, all tests pass now. Now we just need review labels.

@jchunkins
Copy link
Contributor Author

@hasbro17 or @anik120 Can either of you review this PR please?

Copy link
Member

@joelanford joelanford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly a bunch of non-blocking nits, though I'm interested in the choice to match error strings in their entirety.

Also, is it necessary to include the api and k8s bump in the same PR as the other changes? It seems like it could be a good idea to make that dependency bump a standalone PR.

@timflannagan timflannagan reopened this Apr 14, 2021
@timflannagan
Copy link
Member

/hold cancel

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 14, 2021
@timflannagan
Copy link
Member

Guess I don't understand what I should do to fix the coverage problem. I'll just wait til your change is complete.

@jchunkins Looks like the checks are being populated with a Required status now, but there's still an issue with our tide configuration somewhere where it's still not merging a failed, yet optional check. When I dived into that codecov/project, I tried to re-test using a more up-to-date master commit but it looks like I'm either getting rate limited or simply don't have the requisite permissions for that action. Can you rebase this PR against master, and I'll re-apply the lgtm?

@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Apr 14, 2021
@jchunkins
Copy link
Contributor Author

@timflannagan rebased and pushed

@timflannagan
Copy link
Member

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Apr 14, 2021
@jchunkins
Copy link
Contributor Author

@timflannagan can you poke the git actions for e2e-kind and e2e-minikube it appears Podman installs are not happening via apt-get. Hopefully whatever was wrong with the repos is fixed or its a transient issue 🤷

@timflannagan
Copy link
Member

@jchunkins I'm working on fixing that in the background right now as it's affecting other PRs too. Ever since I joined the team, that podman dependency has been real problematic, so hopefully it's as simple as reverting the changes (i.e. workarounds) that were made over the last month or so.

@timflannagan
Copy link
Member

@jchunkins Alright those fixes landed in #635 - I re-spun the e2e-* jobs but they're still running tests against a stale master commit, which makes sense as you're also introducing changes to those same workflows that were touched. Can you rebase again 😆?

@timflannagan
Copy link
Member

As an aside, this is one of the obvious pain points of migrating from largely prow-based tests to GH actions, but hopefully, we're almost there with merging this.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

2 similar comments
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@timflannagan
Copy link
Member

/hold

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 15, 2021
@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Apr 15, 2021
@jchunkins
Copy link
Contributor Author

OK merged again... we'll see how it goes

Verified

This commit was signed with the committer’s verified signature.
jchunkins John Hunkins
- update api project to v0.7.1 to pick up validator (and other) changes
- update code paths which support skip tls but did not expose it
- add test case and test data for annotation case sensitive validation
- update test case to handle multiple errors
- add .gitignore for artifacts generated from e2e tests
- use mkcerts for better crossplatform SSL cert generation
- create dedicated script for starting regsitry with & without SSL
- implement e2e ctx (i.e. context) package
- Provision impl for kind servers
- Provision impl for kubeconfig (i.e. all other server types)
- adjust makefile to account for `kind` build tag
- allow e2e target to use focus flag
- allow e2e target to use skip tls flag for local testing
- adjust test.yml git action workflow to use start_registry script
- adjust test.yml git action to remove kind (since its now dynamic)
- add e2e documentation for all scenarios
- use regex to determine possible kind control plane names
- add --name argument for kind load docker-image call
- update vendor

Implements operator-framework#568

Signed-off-by: John Hunkins <[email protected]>
@timflannagan
Copy link
Member

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Apr 15, 2021
@timflannagan
Copy link
Member

/hold cancel

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 15, 2021
@openshift-merge-robot openshift-merge-robot merged commit fb26b32 into operator-framework:master Apr 15, 2021
@jchunkins
Copy link
Contributor Author

@timflannagan Thank you very much for your help getting this across the finish line.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Detect common invalid annotations: e.g. olm.skiprange
7 participants