operatorhub-catalog in crashloop backoff without clear error message #1087
Comments
Thanks @mjpitz - could you share the logs from the operatorhubio pod?
@ecordell: here they are
Looking at the event log, it seems that the liveness probe is causing the restarts. Given that both probes use the same gRPC health check, I assume it's reporting the global health of the service. Is there a way to change the verbosity of the logs?
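For anyone who wants to check the same thing, both the probe configuration and the restarts they trigger are visible through kubectl. This is only a minimal sketch, assuming the default `olm` namespace and the `olm.catalogSource=operatorhubio-catalog` label used elsewhere in this thread:

```sh
# Show the liveness and readiness probes configured on the catalog pod
kubectl -n olm get pod -l olm.catalogSource=operatorhubio-catalog \
  -o jsonpath='{.items[*].spec.containers[*].livenessProbe}{"\n"}{.items[*].spec.containers[*].readinessProbe}{"\n"}'

# Probe failures and the resulting restarts show up in the pod's events
kubectl -n olm describe pod -l olm.catalogSource=operatorhubio-catalog
```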
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I am facing the exact same issue, and the logs don't tell much. The image that I am using is:

I tried re-deploying the catalog multiple times, but no luck. @mjpitz were you able to get around this issue?
No. We wound up uninstalling OLM and working with upstream helm-charts instead. Sorry.
I had the same issue, and it caused namespace deletion to hang. There was a helpful PR for uninstalling; too bad it was never approved and merged.
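For the stuck-namespace symptom specifically (this is a generic sketch, not the contents of the uninstall PR mentioned above), the following shows what is keeping a namespace in Terminating:

```sh
# Show why the namespace won't finish deleting (finalizer conditions)
kubectl get namespace olm -o jsonpath='{.status.conditions}{"\n"}'

# List every namespaced resource still left in the namespace
kubectl api-resources --verbs=list --namespaced -o name \
  | xargs -n 1 kubectl get -n olm --ignore-not-found --show-kind
```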
I see this error on minikube 1.7.1, 1.5.2, and OpenShift 4:
Experiencing this issue; it appears to be related to the liveness probe, as detailed in the comment above. Any tips for fixing it would be appreciated!
Same issue with >= 0.14.2 on Kubernetes v1.16.
Same issue on k3s.
I'm seeing the same problem with Minikube v1.12.0 using the olm addon.

$ minikube version
minikube version: v1.12.0
commit: c83e6c47124b71190e138dbc687d2556d31488d6

$ minikube start --vm-driver=hyperkit --addons=olm
😄 minikube v1.12.0 on Darwin 10.15.5
...
🌟 Enabled addons: default-storageclass, olm, storage-provisioner
🏄 Done! kubectl is now configured to use "minikube"

$ operator-sdk version
operator-sdk version: "v0.19.0", commit: "8e28aca60994c5cb1aec0251b85f0116cc4c9427", kubernetes version: "v1.18.2", go version: "go1.14.4 darwin/amd64"

$ operator-sdk olm status
INFO[0000] Fetching CRDs for version "0.14.1"
INFO[0001] Fetching resources for version "0.14.1"
INFO[0001] Successfully got OLM status for version "0.14.1"
NAME NAMESPACE KIND STATUS
olm Namespace Installed
operatorgroups.operators.coreos.com CustomResourceDefinition Installed
catalogsources.operators.coreos.com CustomResourceDefinition Installed
subscriptions.operators.coreos.com CustomResourceDefinition Installed
installplans.operators.coreos.com CustomResourceDefinition Installed
aggregate-olm-edit ClusterRole Installed
catalog-operator olm Deployment Installed
olm-operator olm Deployment Installed
operatorhubio-catalog olm CatalogSource Installed
olm-operators olm OperatorGroup Installed
aggregate-olm-view ClusterRole Installed
operators Namespace Installed
global-operators operators OperatorGroup Installed
olm-operator-serviceaccount olm ServiceAccount Installed
packageserver olm ClusterServiceVersion Installed
system:controller:operator-lifecycle-manager ClusterRole Installed
clusterserviceversions.operators.coreos.com CustomResourceDefinition Installed
olm-operator-binding-olm olm ClusterRoleBinding Installed

The status of the operatorhubio-catalog pod:

$ kubectl get pod -l olm.catalogSource=operatorhubio-catalog -n olm -w
NAME READY STATUS RESTARTS AGE
operatorhubio-catalog-vxxhr 0/1 Pending 0 0s
operatorhubio-catalog-vxxhr 0/1 Pending 0 0s
operatorhubio-catalog-vxxhr 0/1 ContainerCreating 0 0s
operatorhubio-catalog-vxxhr 0/1 Terminating 0 1s
operatorhubio-catalog-vxxhr 0/1 Terminating 0 1s
operatorhubio-catalog-88d7p 0/1 Pending 0 0s
operatorhubio-catalog-88d7p 0/1 Pending 0 0s
operatorhubio-catalog-88d7p 0/1 ContainerCreating 0 0s
operatorhubio-catalog-88d7p 0/1 Running 0 9s
operatorhubio-catalog-88d7p 1/1 Running 0 20s
operatorhubio-catalog-88d7p 0/1 OOMKilled 0 33s # <- ?
operatorhubio-catalog-88d7p 0/1 Running 1 34s
operatorhubio-catalog-88d7p 1/1 Running 1 40s
...

The resource specs for the catalog pod's container:

$ kubectl get pod -lolm.catalogSource=operatorhubio-catalog -ojsonpath='{.items[*].spec.containers[*].resources}'
map[limits:map[cpu:100m memory:100Mi] requests:map[cpu:10m memory:50Mi]]

It feels like the catalog from operatorhub.io might be bumping into these limits, as discussed in this comment.
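If you see the same OOMKilled status, the container's last termination state records the reason, which is a quick way to confirm that the memory limit is the culprit. A sketch, using the same label and namespace as the output above:

```sh
# Why was the previous container instance killed? "OOMKilled" confirms the memory-limit theory
kubectl -n olm get pod -l olm.catalogSource=operatorhubio-catalog \
  -o jsonpath='{.items[*].status.containerStatuses[*].lastState.terminated.reason}{"\n"}'

# The limits the pod is currently running with (same query as above, for reference)
kubectl -n olm get pod -l olm.catalogSource=operatorhubio-catalog \
  -o jsonpath='{.items[*].spec.containers[*].resources}{"\n"}'
```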
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Facing the same issue. I am trying it on IBM's PPC architecture, and it seems the image is unsupported on that platform. How can I build it locally?
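If the problem is that the published catalog image has no manifest for your architecture, a first step is to check what the image actually publishes; building your own index image is then possible with opm from operator-framework/operator-registry. This is only a sketch: the CatalogSource name is taken from this thread, while the bundle and catalog tags below are placeholders, not real images.

```sh
# Find the image the failing CatalogSource points at, then check which
# architectures its manifest actually publishes
kubectl -n olm get catalogsource operatorhubio-catalog -o jsonpath='{.spec.image}{"\n"}'
docker manifest inspect <image-from-previous-command>

# Build a small index image locally from your own bundle with opm
# (bundle and catalog tags are placeholders)
opm index add \
  --bundles quay.io/example/my-operator-bundle:v0.1.0 \
  --tag quay.io/example/my-catalog:latest
docker push quay.io/example/my-catalog:latest
```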
We have a few PRs merged that should resolve this issue (including operator-framework/operator-registry#227), so I think it may have been resolved. Could you retest on a newer version of OLM (0.19.0+) to see if the issue still exists?
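For anyone retesting, a couple of read-only checks show which OLM release is actually running (a sketch; assumes the default olm namespace):

```sh
# The packageserver CSV carries the OLM release version
kubectl -n olm get csv packageserver -o jsonpath='{.spec.version}{"\n"}'

# The olm-operator image tag is another indicator
kubectl -n olm get deployment olm-operator \
  -o jsonpath='{.spec.template.spec.containers[0].image}{"\n"}'
```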
Hi,
I am facing the same issue with minikube.
I have the same issue, but only on k8s nodes running Rocky 9/CentOS 9. On those nodes the catalog pod takes more than 75s to start.
@joelanford FYI
Bug Report
I quite frequently find the operatorhub-catalog pod in CrashLoopBackOff with no useful information for debugging how it ended up in that state. It would be nice to get more diagnostic information about why the pod winds up there.
What did you do?
Deployed OLM using the provided helm template and waited for some time.
What did you expect to see?
I would expect the process to stay healthy unless there is an actual issue. Upon encountering an issue, I would expect the operatorhub-catalog logs to communicate what is wrong with the pod instead of serving only a single "starting gRPC server" message.
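Until the catalog image itself logs more, about the only diagnostics available come from Kubernetes rather than from the process. A sketch of what can be gathered today, assuming the default olm namespace and the catalog labels used in the comments above:

```sh
# Restart count and current state of the catalog pod
kubectl -n olm get pod -l olm.catalogSource=operatorhubio-catalog

# Logs from the previous (crashed) container instance, if any were written
kubectl -n olm logs -l olm.catalogSource=operatorhubio-catalog --previous

# Events often name the proximate cause (failed probes, back-off restarting, image pull errors)
kubectl -n olm get events --field-selector involvedObject.kind=Pod
```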
Environment
Possible Solution
Additional context