Sorting order of operator bundles not being honored in 1.12.x of opm #328

Closed
jdockter opened this issue May 14, 2020 · 13 comments
Labels
triage/unresolved Indicates an issue that can not or will not be resolved.

Comments

@jdockter

I've been building an example catalog with opm 1.11.1 using five bundles and five different channels, with the following two-step process:

  1. build empty catalog
opmv1.11.1 index add --bundles '' --tag docker.io/camdockr/ibm-operator-catalog:latest --binary-image quay.io/operator-framework/upstream-registry-builder:v1.11.1 --container-tool podman
  2. build from empty catalog
opmv1.11.1 index add --bundles docker.io/ibmcom/ibm-sample-panamax-operator-bundle@sha256:75f32c70457d533093d5842f6d499424bf2fbf274f259ea5446740eacb10ee5a,docker.io/ibmcom/ibm-sample-panamax-operator-bundle@sha256:22ac5a2cb32974430312bc1f2ea8f3f2434afd0295f0379037d5bf0e821c85a9,docker.io/ibmcom/ibm-sample-panamax-operator-bundle@sha256:02a81cc0f5ad790b4eee1b69eaa989fe07da999137a77647dbe8aae34cefb824,docker.io/ibmcom/ibm-sample-panamax-operator-bundle@sha256:cd130a3a84fc8775dafd0483b66433cacb6b66eebe1841ea5b4383bc980cbb9c,docker.io/ibmcom/ibm-sample-panamax-operator-bundle@sha256:8003642768538cea7f1f28284a6a030ef96d645df6e7b1fb74dd1e4c55f73f45
 --tag docker.io/camdockr/ibm-operator-catalog:2020-05-14-145213     --binary-image quay.io/operator-framework/upstream-registry-builder:v1.11.1 --container-tool podman --mode replaces --from-index docker.io/camdockr/ibm-operator-catalog:latest

When I check the registry from the API I see the following:

{
  "name": "ibm-sample-panamax",
  "channels": [
    {
      "name": "V0.0",
      "csvName": "ibm-sample-panamax.v0.0.12"
    },
    {
      "name": "V1.0",
      "csvName": "ibm-sample-panamax.v1.0.2"
    },
    {
      "name": "V1.1",
      "csvName": "ibm-sample-panamax.v1.1.1"
    },
    {
      "name": "V1.2",
      "csvName": "ibm-sample-panamax.v1.2.1"
    },
    {
      "name": "V2.0",
      "csvName": "ibm-sample-panamax.v2.0.2"
    }
  ],
  "defaultChannelName": "V2.0"
}

The operatorbundle table also seems to be in order:

--operatorbundle--
+ sqlite3 catalog-build-2020-05-14-145213/index-2020-05-14-145213.db 'select name,bundlepath,length(csv),length(bundle) from operatorbundle;'
ibm-sample-panamax.v0.0.12|docker.io/ibmcom/ibm-sample-panamax-operator-bundle@sha256:75f32c70457d533093d5842f6d499424bf2fbf274f259ea5446740eacb10ee5a|5416|7319
ibm-sample-panamax.v1.0.2|docker.io/ibmcom/ibm-sample-panamax-operator-bundle@sha256:22ac5a2cb32974430312bc1f2ea8f3f2434afd0295f0379037d5bf0e821c85a9|5457|7360
ibm-sample-panamax.v1.1.1|docker.io/ibmcom/ibm-sample-panamax-operator-bundle@sha256:02a81cc0f5ad790b4eee1b69eaa989fe07da999137a77647dbe8aae34cefb824|5457|7360
ibm-sample-panamax.v1.2.1|docker.io/ibmcom/ibm-sample-panamax-operator-bundle@sha256:cd130a3a84fc8775dafd0483b66433cacb6b66eebe1841ea5b4383bc980cbb9c|5457|7360
ibm-sample-panamax.v2.0.2|docker.io/ibmcom/ibm-sample-panamax-operator-bundle@sha256:8003642768538cea7f1f28284a6a030ef96d645df6e7b1fb74dd1e4c55f73f45|11491|13394

However, after moving to opm 1.12.x in my testing, I get random results for the default channel, and the bundles are not kept in the order I provide them; see below.

opmv1.12.3 index add --bundles docker.io/ibmcom/ibm-sample-panamax-operator-bundle@sha256:75f32c70457d533093d5842f6d499424bf2fbf274f259ea5446740eacb10ee5a,docker.io/ibmcom/ibm-sample-panamax-operator-bundle@sha256:22ac5a2cb32974430312bc1f2ea8f3f2434afd0295f0379037d5bf0e821c85a9,docker.io/ibmcom/ibm-sample-panamax-operator-bundle@sha256:02a81cc0f5ad790b4eee1b69eaa989fe07da999137a77647dbe8aae34cefb824,docker.io/ibmcom/ibm-sample-panamax-operator-bundle@sha256:cd130a3a84fc8775dafd0483b66433cacb6b66eebe1841ea5b4383bc980cbb9c,docker.io/ibmcom/ibm-sample-panamax-operator-bundle@sha256:8003642768538cea7f1f28284a6a030ef96d645df6e7b1fb74dd1e4c55f73f45
 --tag docker.io/camdockr/ibm-operator-catalog:2020-05-14-150203     --binary-image quay.io/operator-framework/upstream-opm-builder:v1.12.3 --container-tool podman --mode replaces --from-index docker.io/camdockr/ibm-operator-catalog:latest
{
  "name": "ibm-sample-panamax",
  "channels": [
    {
      "name": "V0.0",
      "csvName": "ibm-sample-panamax.v0.0.12"
    },
    {
      "name": "V1.0",
      "csvName": "ibm-sample-panamax.v1.0.2"
    },
    {
      "name": "V1.1",
      "csvName": "ibm-sample-panamax.v1.1.1"
    },
    {
      "name": "V1.2",
      "csvName": "ibm-sample-panamax.v1.2.1"
    },
    {
      "name": "V2.0",
      "csvName": "ibm-sample-panamax.v2.0.2"
    }
  ],
  "defaultChannelName": "V1.2"
}
--operatorbundle--
+ sqlite3 catalog-build-2020-05-14-150203/index-2020-05-14-150203.db 'select name,bundlepath,length(csv),length(bundle) from operatorbundle;'
ibm-sample-panamax.v2.0.2|docker.io/ibmcom/ibm-sample-panamax-operator-bundle@sha256:8003642768538cea7f1f28284a6a030ef96d645df6e7b1fb74dd1e4c55f73f45|11491|13394
ibm-sample-panamax.v0.0.12|docker.io/ibmcom/ibm-sample-panamax-operator-bundle@sha256:75f32c70457d533093d5842f6d499424bf2fbf274f259ea5446740eacb10ee5a|5416|7319
ibm-sample-panamax.v1.0.2|docker.io/ibmcom/ibm-sample-panamax-operator-bundle@sha256:22ac5a2cb32974430312bc1f2ea8f3f2434afd0295f0379037d5bf0e821c85a9|5457|7360
ibm-sample-panamax.v1.1.1|docker.io/ibmcom/ibm-sample-panamax-operator-bundle@sha256:02a81cc0f5ad790b4eee1b69eaa989fe07da999137a77647dbe8aae34cefb824|5457|7360
ibm-sample-panamax.v1.2.1|docker.io/ibmcom/ibm-sample-panamax-operator-bundle@sha256:cd130a3a84fc8775dafd0483b66433cacb6b66eebe1841ea5b4383bc980cbb9c|5457|7360

I have also tried building without --from-index and with --mode semver, but neither keeps the bundle version order, so the default channel is set randomly. Please provide any guidance as to why this might be happening.

Attached files in zip
opm-testing.zip
index-2020-05-14-145213.db is from opm 1.11.1 test
index-2020-05-14-150203.db is from opm 1.12.3 test
opm1.11.1-build.log
opm1.12.3-build.log

Using podman version 1.9.2

@cdjohnson
Contributor

@shawn-hurley @ecordell This seems like what the --mode semver would do if we had it set, which we don't, so it feels like a regression caused by that feature.

@kevinrizza
Member

@cdjohnson Trying to understand the context of this, are you expecting the bundles to be inserted in a specific order for some reason? The grpc API for the registry is the source of truth for this data, not the contents of the database itself. I would not expect that we would retain an ordering here, since that seems arbitrary.

@cdjohnson
Contributor

@kevinrizza Yes. The order DOES matter because the Default Channel is set by the last added bundle.

Also, when using --mode replaces (the default), the order matters; otherwise the replaces logic will fail because the bundles are added out of order.

So we need the array order to be honored, as if you had run opm registry add multiple times in a row (which is itself inefficient).
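
To make the ordering concern concrete, here is a minimal toy model of "last added bundle sets the default channel". It is not opm's code. It assumes, hypothetically, that each of the five bundles from the report declares its own channel as the package default, and that processing order matches the row order seen in the operatorbundle table.

```python
# Toy model only: this is NOT opm's implementation. It assumes (hypothetically)
# that each bundle declares its own channel as the package default and that
# the package keeps whatever the most recently processed bundle declared.
BUNDLES = [
    ("ibm-sample-panamax.v0.0.12", "V0.0"),
    ("ibm-sample-panamax.v1.0.2", "V1.0"),
    ("ibm-sample-panamax.v1.1.1", "V1.1"),
    ("ibm-sample-panamax.v1.2.1", "V1.2"),
    ("ibm-sample-panamax.v2.0.2", "V2.0"),
]


def default_channel_after_add(bundles):
    """Return the default channel left behind after processing in order."""
    default = None
    for _name, declared_default in bundles:
        default = declared_default  # last writer wins
    return default


# Order given on the command line in the report.
as_given = default_channel_after_add(BUNDLES)

# Row order seen in the 1.12.3 operatorbundle table: v2.0.2 first.
as_observed = default_channel_after_add([BUNDLES[4]] + BUNDLES[:4])

print(as_given, as_observed)
```

Under these assumptions the two orders yield V2.0 and V1.2, which is consistent with the two defaultChannelName values in the report, though that match is only suggestive of the cause.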

@kevinrizza
Member

The insert was updated to use an algorithm that determines the correct order in which to generate the graph. The order of that database table itself is orthogonal to "did the tip of the graph determine the package level metadata" -- that behavior should have been maintained. Additionally, the order of the elements in that table does not describe the graph's understanding of what replaces what -- that is currently described by the channel_entry table, which has keys pointing to what bundles replace what other bundles in specific channels.

The intention was just to allow arbitrary ordering to the -b flag so that users didn't need to specify the correct ordering (#285). If there has been some regression in behavior around how metadata is ingested then I would be very interested in any information you have there, but the database table being in a different order itself doesn't lead me to think that is the case.
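
As a rough illustration of why row order need not matter for the replaces graph, the sketch below finds a channel head from replaces links alone. The table is a deliberately simplified, hypothetical stand-in for channel_entry (text names instead of the real keys), and the channel and bundle names are made up for the example.

```python
import sqlite3


def channel_head(rows):
    """Return the bundles that no other entry replaces (the channel head)."""
    conn = sqlite3.connect(":memory:")
    # Simplified, hypothetical schema: not the real channel_entry layout.
    conn.execute(
        "CREATE TABLE channel_entry ("
        " channel_name TEXT, bundle_name TEXT, replaces TEXT)"
    )
    conn.executemany("INSERT INTO channel_entry VALUES (?, ?, ?)", rows)
    found = conn.execute(
        "SELECT bundle_name FROM channel_entry"
        " WHERE bundle_name NOT IN ("
        "   SELECT replaces FROM channel_entry WHERE replaces IS NOT NULL)"
    ).fetchall()
    conn.close()
    return [name for (name,) in found]


# Hypothetical single-channel upgrade chain, inserted out of order.
rows = [
    ("stable", "example.v1.2.0", "example.v1.1.0"),
    ("stable", "example.v1.0.0", None),
    ("stable", "example.v1.1.0", "example.v1.0.0"),
]

print(channel_head(rows))
print(channel_head(list(reversed(rows))))
```

Both calls report the same head because the answer comes from the replaces links, not from insertion order. This says nothing about how package level metadata such as the default channel is chosen, which is the separate question raised in this thread.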

@cdjohnson
Contributor

cdjohnson commented May 21, 2020

@kevinrizza If we run the same command multiple times, we get a different default channel each time. That's the problem. Before the update, the LAST element in the bundle array set the default channel and it was predictable.

Right now we have NO WAY of setting the default channel predictably unless we call opm index add multiple times, adding the bundles in semver order (assuming that the highest semver is what we want as the default channel).

I have opened issue #330 to discuss that problem, but there doesn't appear to be a valid workaround yet other than using the old initializer approach, which uses the defaultChannel from the package.yaml.

If we can get #330 resolved by introducing a way to set the default channel, the order of the bundle array won't matter.
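
For the semver-ordered workaround, a minimal sketch of the ordering step might look like the following. The helper semver_key is hypothetical and only handles plain MAJOR.MINOR.PATCH suffixes like the CSV names in this report; it does not handle pre-release or build metadata, and it does not invoke opm.

```python
import re


def semver_key(csv_name):
    """Extract (major, minor, patch) from a name such as 'pkg.v1.2.3'."""
    match = re.search(r"\.v(\d+)\.(\d+)\.(\d+)$", csv_name)
    if match is None:
        raise ValueError(f"no semver suffix in {csv_name!r}")
    return tuple(int(part) for part in match.groups())


# CSV names from the report, in the row order seen with opm 1.12.3.
names = [
    "ibm-sample-panamax.v2.0.2",
    "ibm-sample-panamax.v0.0.12",
    "ibm-sample-panamax.v1.0.2",
    "ibm-sample-panamax.v1.1.1",
    "ibm-sample-panamax.v1.2.1",
]

# Numeric comparison, so 0.0.12 sorts after a hypothetical 0.0.9.
ordered = sorted(names, key=semver_key)

for name in ordered:
    print(name)
```

The sorted list could then drive one add per bundle, lowest version first, so that the highest version is added last.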

@kevinrizza
Member

Okay, so that is definitely a bug. The "latest" package in a set should be the last one to define package level metadata. We'll definitely take a look at that.

For the other issue you linked, I agree that it would be ideal if there were package level flags used as input to supersede this bundle level metadata or make it optional. Part of the reason those things don't exist yet is backwards compatibility. Definitely happy to address those in your other open issue.

@jdockter
Author

@kevinrizza any idea whether this is a major fix or just a matter of bringing functionality that was in 1.11.1 back into 1.12.x? If the fix is days away I can wait, but if it is weeks away we will have to move forward with 1.11.1. Ideally we would like to get the benefits of the smaller image with 1.12.x.

@stale

stale bot commented Jul 21, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Jul 21, 2020
@kevinrizza
Member

Let's leave this open for now

@stale stale bot removed the wontfix label Jul 21, 2020
@stale

stale bot commented Sep 19, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Sep 19, 2020
@openshift-ci-robot openshift-ci-robot added triage/unresolved Indicates an issue that can not or will not be resolved. and removed wontfix labels Sep 20, 2020
@stale

stale bot commented Nov 19, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@cdjohnson
Contributor

@njhale I noticed this PR:
#479

Does this PR help with choosing the default channel at all? Or is this still an outstanding problem?

That is: will the last entry in the array set the default channel, or is it still non-deterministic?

@ecordell
Member

1.15.3 includes fixes for ordering and default channel selection - see #503 for more details.
