Flatcar Azure images are booting with broken /etc/resolv.conf #1346

Closed
invidian opened this issue Nov 28, 2023 · 6 comments · Fixed by #1347
Labels
area/provider/azure: Issues or PRs related to azure provider
kind/bug: Categorizes issue or PR as related to a bug.

Comments

@invidian
Member

I'm opening this issue for visibility and to discuss what mitigation we should provide to users.

As described in flatcar/Flatcar#1265, Flatcar images built by image-builder from Flatcar stable 3602.2.2 currently boot with broken DNS resolution.

We could possibly mitigate this at the image-builder level with the following patch:

diff --git images/capi/packer/azure/packer.json images/capi/packer/azure/packer.json
index a472e5950..553923f70 100644
--- images/capi/packer/azure/packer.json
+++ images/capi/packer/azure/packer.json
@@ -192,7 +192,7 @@
       ],
       "inline": [
         "if [[ $BUILD_NAME != \"flatcar\"* ]]; then exit 0; fi",
-        "sudo bash -c \"/usr/share/oem/python/bin/python /usr/share/oem/bin/waagent -force -deprovision+user && sync\""
+        "sudo bash -c \"/usr/share/oem/python/bin/python /usr/share/oem/bin/waagent -force -deprovision+user && ln -sf ../run/systemd/resolve/resolv.conf /etc/resolv.conf && sync\""
       ],
       "inline_shebang": "/bin/bash -x",
       "remote_folder": "{{user `provisioner_remote_folder`}}",

Right now CAPZ e2e tests are failing because of this. With this patch, we should be able to rebuild the Azure images and fix the tests without a temporary workaround in the Flatcar templates.
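For illustration, the symlink restoration from the patch can be sketched as a small standalone helper. This is a minimal sketch, not part of image-builder or Flatcar: the function name `restore_resolv_conf` and its optional root-directory argument are hypothetical, added only so the logic can be tried against a scratch directory instead of a live system. The link target is the same relative path used in the patch.

```shell
#!/bin/bash
# Hypothetical helper (not part of image-builder): make sure etc/resolv.conf
# under a given root is a symlink to the systemd-resolved file, as the patch
# does with `ln -sf` after waagent deprovisioning.
# $1 is an optional root directory; when omitted, /etc/resolv.conf is used.
restore_resolv_conf() {
  rrc_root="${1:-}"
  rrc_link="${rrc_root}/etc/resolv.conf"
  rrc_target="../run/systemd/resolve/resolv.conf"

  # Already a symlink with the expected target: nothing to do.
  if [ -L "$rrc_link" ] && [ "$(readlink "$rrc_link")" = "$rrc_target" ]; then
    echo "ok"
    return 0
  fi

  # Missing file or regular file: (re)create the relative symlink.
  ln -sf "$rrc_target" "$rrc_link" || return 1
  echo "restored"
}
```

Called without an argument, the helper targets `/etc/resolv.conf` and would need root privileges, just like the `sudo bash -c` command in the patch.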

/kind bug
[One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels]

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Nov 28, 2023
@AverageMarcus
Member

Just to confirm, based on the linked issue this seems to be a problem with the waagent command removing the file, is that correct?

If so, this is limited to Azure I think as it's the only place it's used. Do you know the purpose of this command? I'm not familiar with it.

@invidian
Member Author

> Just to confirm, based on the linked issue this seems to be a problem with the waagent command removing the file, is that correct?
> If so, this is limited to Azure I think as it's the only place it's used.

Yes, right now the impact is limited to Azure images.

> Do you know the purpose of this command? I'm not familiar with it.

As explained in the Flatcar issue:

> With image-builder project, as part of building Flatcar CAPI images, we run /usr/share/oem/bin/waagent -force -deprovision+user before shutting down building machine to ensure resulting OS image can be used for booting multiple instances.

@AverageMarcus
Member

> ensure resulting OS image can be used for booting multiple instances.

That's the bit I don't understand, sorry, I'm not that experienced with Azure. Why is this required for Azure and not other providers? (I just want to understand, not trying to block this 😄)

/cc @mboersma as you might be able to offer better insight from an Azure perspective.

/label area/provider/azure

@k8s-ci-robot
Contributor

@AverageMarcus: The label(s) /label area/provider/azure cannot be applied. These labels are supported: api-review, tide/merge-method-merge, tide/merge-method-rebase, tide/merge-method-squash, team/katacoda, refactor. Is this label configured under labels -> additional_labels or labels -> restricted_labels in plugin.yaml?


@AverageMarcus AverageMarcus added the area/provider/azure Issues or PRs related to azure provider label Nov 28, 2023
invidian added a commit to kinvolk/image-builder that referenced this issue Nov 29, 2023
@invidian
Member Author

I created #1347 with a fix, as it will be faster than waiting for a new Flatcar release with a proper fix. I'm still hesitant to remove the current, broken images and rebuild them, so I plan to discuss this in flatcar/Flatcar#1270.

@AverageMarcus
Member

@invidian Do you think we should keep this issue open to keep track until the upstream issue is resolved?
