Sync "olm" failed: no catalog sources available #740

Closed
denyskril opened this issue Mar 5, 2019 · 17 comments

@denyskril

Hello!
I installed OLM on my Kubernetes cluster and I'm seeing errors in the catalog-operator logs:

time="2019-03-05T13:52:24Z" level=info msg="retrying olm"
E0305 13:52:24.962371 1 queueinformer_operator.go:155] Sync "olm" failed: no catalog sources available
time="2019-03-05T13:52:25Z" level=info msg="building connection to registry" currentSource="{olm-operators olm}" id=8q7aJ source=olm-operators
time="2019-03-05T13:52:25Z" level=info msg="client hasn't yet become healthy, attempt a health check" currentSource="{olm-operators olm}" id=8q7aJ source=olm-operators

and operators can't be installed.
Please help.

Kubernetes 1.12.1
olm - https://github.com/operator-framework/operator-lifecycle-manager/blob/master/deploy/upstream/quickstart/olm.yaml
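(For context, installing that quickstart presumably amounts to applying the CRD and OLM manifests from the linked directory; a sketch, assuming the standard raw-GitHub paths:)

```
kubectl apply -f https://raw.githubusercontent.com/operator-framework/operator-lifecycle-manager/master/deploy/upstream/quickstart/crds.yaml
kubectl apply -f https://raw.githubusercontent.com/operator-framework/operator-lifecycle-manager/master/deploy/upstream/quickstart/olm.yaml
```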

@njhale
Member

njhale commented Mar 5, 2019

@denyskril thanks for letting us know about your issue! Could you provide some more information about:

  • What CatalogSources are on your cluster and in what namespaces (manifests please)
  • What operator(s) are you attempting to install?
  • What Subscriptions have you created and in what namespace (manifests please)

@denyskril
Author

denyskril commented Mar 5, 2019

I just created https://github.com/operator-framework/operator-lifecycle-manager/blob/master/deploy/upstream/quickstart/olm.yaml, nothing else.
kubectl -n olm get Subscription -o yaml

```
apiVersion: v1
items:
- apiVersion: operators.coreos.com/v1alpha1
  kind: Subscription
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"operators.coreos.com/v1alpha1","kind":"Subscription","metadata":{"annotations":{},"name":"packageserver","namespace":"olm"},"spec":{"channel":"alpha","name":"packageserver","source":"olm-operators","sourceNamespace":"olm"}}
    creationTimestamp: "2019-03-05T14:06:35Z"
    generation: 1
    name: packageserver
    namespace: olm
    resourceVersion: "73295346"
    selfLink: /apis/operators.coreos.com/v1alpha1/namespaces/olm/subscriptions/packageserver
    uid: e33d61b3-3f4f-11e9-a5b7-005056b2b12a
  spec:
    channel: alpha
    name: packageserver
    source: olm-operators
    sourceNamespace: olm
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
```

kubectl -n olm get CatalogSource -o yaml

```
apiVersion: v1
items:
- apiVersion: operators.coreos.com/v1alpha1
  kind: CatalogSource
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"operators.coreos.com/v1alpha1","kind":"CatalogSource","metadata":{"annotations":{},"name":"olm-operators","namespace":"olm"},"spec":{"configMap":"olm-operators","displayName":"OLM Operators","publisher":"Red Hat","sourceType":"internal"}}
    creationTimestamp: "2019-03-05T14:06:35Z"
    generation: 1
    name: olm-operators
    namespace: olm
    resourceVersion: "73304294"
    selfLink: /apis/operators.coreos.com/v1alpha1/namespaces/olm/catalogsources/olm-operators
    uid: e2db89c7-3f4f-11e9-a5b7-005056b2b12a
  spec:
    configMap: olm-operators
    displayName: OLM Operators
    publisher: Red Hat
    sourceType: internal
  status:
    configMapReference:
      name: olm-operators
      namespace: olm
      resourceVersion: "73295390"
      uid: e2b9ac53-3f4f-11e9-a5b7-005056b2b12a
    lastSync: "2019-03-05T14:42:38Z"
    registryService:
      createdAt: "2019-03-05T14:42:38Z"
      port: "50051"
      protocol: grpc
      serviceName: olm-operators
      serviceNamespace: olm
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
```


and I tried to install the Redis Enterprise operator:


```
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: operatorhubio-catalog
  namespace: olm
spec:
  sourceType: grpc
  image: quay.io/operatorframework/upstream-community-operators:latest
  displayName: Community Operators
  publisher: OperatorHub.io
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: my-redis-enterprise
  namespace: operators
spec:
  channel: alpha
  name: redis-enterprise
  source: operatorhubio-catalog
  sourceNamespace: olm
```

@denyskril
Author

and the packageserver cannot start :(

@njhale
Member

njhale commented Mar 6, 2019

@denyskril could you check whether the operatorhubio-catalog CatalogSource has a status? If it does, please check whether an operatorhubio-catalog-* pod exists. Also, please post the logs from the failing packageserver pods.
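(A minimal sketch of those checks; the `<suffix>` placeholder stands in for the generated pod-name suffix:)

```
kubectl -n olm get catalogsource operatorhubio-catalog -o yaml   # does it have a status block?
kubectl -n olm get pods | grep operatorhubio-catalog             # is a registry pod present?
kubectl -n olm logs packageserver-<suffix>                       # logs from a failing packageserver pod
```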

@denyskril
Author

denyskril commented Mar 6, 2019

logs from operatorhubio-catalog-zdmdl:
time="2019-03-05T18:15:34Z" level=info msg="serving registry" database=bundles.db port=50051

kubectl -n olm get CatalogSource operatorhubio-catalog -o yaml

```
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  creationTimestamp: "2019-03-05T14:54:28Z"
  generation: 1
  name: operatorhubio-catalog
  namespace: olm
  resourceVersion: "73682607"
  selfLink: /apis/operators.coreos.com/v1alpha1/namespaces/olm/catalogsources/operatorhubio-catalog
  uid: 93b226e2-3f56-11e9-a5b7-005056b2b12a
spec:
  displayName: Community Operators
  image: quay.io/operatorframework/upstream-community-operators:latest
  publisher: OperatorHub.io
  sourceType: grpc
status:
  lastSync: "2019-03-06T15:41:13Z"
  registryService:
    createdAt: "2019-03-06T15:41:12Z"
    port: "50051"
    protocol: grpc
    serviceName: operatorhubio-catalog
    serviceNamespace: olm
```

@denyskril
Author

denyskril commented Mar 6, 2019

In the olm namespace, the packageserver deployment isn't created.

and
kubectl -n olm get event

```
5m9s        Warning   Unhealthy   Pod    Liveness probe failed: timeout: failed to connect service "localhost:50051" within 1s
14m         Warning   Unhealthy   Pod    Readiness probe failed: timeout: failed to connect service "localhost:50051" within 1s
```

@njhale
Member

njhale commented Mar 6, 2019

@denyskril Do olm-operators-* pods exist? These are different from olm-operator-* pods. If they do, please grab the logs from one via kubectl logs -f ....

@denyskril
Author

denyskril commented Mar 7, 2019

kubectl -n olm logs olm-operators-pp52b

time="2019-03-07T04:05:05Z" level=info msg="loading CRDs" configmap=olm-operators ns=olm
time="2019-03-07T04:05:05Z" level=info msg="loading Bundles" configmap=olm-operators ns=olm
time="2019-03-07T04:05:05Z" level=info msg="loading Packages" configmap=olm-operators ns=olm
time="2019-03-07T04:05:05Z" level=info msg="extracting provided API information" configmap=olm-operators ns=olm
time="2019-03-07T04:05:05Z" level=info msg="serving registry" configMapName=olm-operators configMapNamespace=olm port=50051

@ramukima

I am having the same issue. Events show "12m Warning Unhealthy Pod Liveness probe failed: timeout: failed to connect service "localhost:50051" within 1s"

Is there a way to change the service registry to an external repo instead of the local one?

@ecordell
Member

I suspect something about the OLM install may have gone awry. Could you try with the newer install instructions and see if you still have an issue? https://github.com/operator-framework/operator-lifecycle-manager/releases/tag/0.10.0

@kirankt

kirankt commented Jun 18, 2019

I have seen the same issue as well. This happens on the latest 0.10.0 release.

```
$ kubectl get pods -n olm
NAME                                READY   STATUS             RESTARTS   AGE
catalog-operator-6c6dd66dc9-kqfb8   1/1     Running            0          2m19s
olm-operator-54bfd7d46c-62hwb       1/1     Running            0          2m19s
olm-operators-6h76b                 0/1     CrashLoopBackOff   6          7m54s
operatorhubio-catalog-cstb2         0/1     CrashLoopBackOff   6          7m38s
```

```
$ kubectl describe pod olm-operators-6h76b
...snip...
Type     Reason     Age                     From                        Message
----     ------     ----                    ----                        -------
Normal   Scheduled  8m45s                   default-scheduler           Successfully assigned olm/olm-operators-6h76b to node1.example.com
Normal   Started    8m19s (x2 over 8m44s)   kubelet, node1.example.com  Started container configmap-registry-server
Warning  Unhealthy  7m51s (x6 over 8m41s)   kubelet, node1.example.com  Readiness probe failed: timeout: failed to connect service "localhost:50051" within 1s
Normal   Pulling    7m50s (x3 over 8m45s)   kubelet, node1.example.com  Pulling image "quay.io/operator-framework/configmap-operator-registry:latest"
Warning  Unhealthy  7m50s (x6 over 8m40s)   kubelet, node1.example.com  Liveness probe failed: timeout: failed to connect service "localhost:50051" within 1s
Normal   Killing    7m50s (x2 over 8m20s)   kubelet, node1.example.com  Container configmap-registry-server failed liveness probe, will be restarted
Normal   Pulled     7m49s (x3 over 8m44s)   kubelet, node1.example.com  Successfully pulled image "quay.io/operator-framework/configmap-operator-registry:latest"
Normal   Created    7m49s (x3 over 8m44s)   kubelet, node1.example.com  Created container configmap-registry-server
Warning  BackOff    3m41s (x13 over 6m20s)  kubelet, node1.example.com  Back-off restarting failed container
```

Any ideas?

@jberkus

jberkus commented Jul 14, 2019

Verified that this also happens on 0.10.1. It has not been fixed.

Steps to reproduce:

  1. Create a 2-node Katacoda scenario on Kubernetes 1.14.1 (the current standard for upstream Katacoda setups)
  2. Attempt to install OLM 0.10.1, either using setup.sh or the "manual" method (both have the same result)

Symptoms are identical to those described above.
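(For reference, the "manual" method is assumed here to mean applying the release manifests directly; these URLs follow the release-asset convention and are a sketch, not a verified procedure:)

```
kubectl apply -f https://github.com/operator-framework/operator-lifecycle-manager/releases/download/0.10.1/crds.yaml
kubectl apply -f https://github.com/operator-framework/operator-lifecycle-manager/releases/download/0.10.1/olm.yaml
```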

@sunbinnnnn

The same problem reproduces on v0.10.1 with a 2-node K8s cluster.

@sunbinnnnn

Em... I've fixed it; it was caused by my own setup. I launched 2 K8s nodes in an OpenStack cluster without configuring the security group, which caused a network problem between the master and the minion. It worked once I fixed the network problem.
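(A sketch of the kind of fix described, assuming the nodes share an OpenStack security group; the group name `k8s-nodes` and the CIDR are placeholders:)

```
# allow all TCP and UDP traffic between cluster nodes (adjust the CIDR to your node network)
openstack security group rule create --protocol tcp --remote-ip 10.0.0.0/24 k8s-nodes
openstack security group rule create --protocol udp --remote-ip 10.0.0.0/24 k8s-nodes
```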

@taylanerden

Hi,
We have encountered the same problem and we also think we have the same network problem. Could you please explain how you solved it?
Thanks

@marcusportmann

marcusportmann commented Apr 24, 2020

Hi,

I have a similar problem. It seems related to the way in which the liveness check is configured. If I run the liveness check manually with localhost as the hostname, as per the pod configuration, I receive the same error.

```
root@devops:~/operator-lifecycle-manager# kubectl -n olm exec -it operatorhubio-catalog-lnh9k -- grpc_health_probe -addr=localhost:50051 -connect-timeout 10s -v
parsed options:
> addr=localhost:50051 conn_timeout=10s rpc_timeout=1s
> tls=false
establishing connection
timeout: failed to connect service "localhost:50051" within 10s
command terminated with exit code 2
```

If I run the liveness check manually without the hostname specified, it works:

```
root@devops:~/operator-lifecycle-manager# kubectl -n olm exec -it operatorhubio-catalog-lnh9k -- grpc_health_probe -addr=:50051 -connect-timeout 10s -v
parsed options:
> addr=:50051 conn_timeout=10s rpc_timeout=1s
> tls=false
establishing connection
time elapsed: connect=567.314µs rpc=844.357µs
status: SERVING
```

I looked at the liveness checks for all the other pods; while they are all http-get, none of them specify a hostname, e.g.

```
Liveness:   http-get https://:5443/healthz delay=0s timeout=1s period=10s #success=1 #failure=3
Readiness:  http-get https://:5443/healthz delay=0s timeout=1s period=10s #success=1 #failure=3
```

Is it possible to configure OLM not to specify localhost as the hostname for the liveness checks on the registry-server?
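(For illustration, the probe being described would look roughly like this in the registry pod spec; this is reconstructed from the error messages above, not a verified dump:)

```
livenessProbe:
  exec:
    command:
    - grpc_health_probe
    - -addr=localhost:50051   # changing this to -addr=:50051 is what works manually above
  timeoutSeconds: 1           # matches the "within 1s" in the probe failures
```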

@brnavneet

Hi, were you able to resolve this issue? I have the same problem.
