Multi controller install fails on OL9 #648
Comments
This is probably unrelated, but:
- src: /var/lib/k0s/images/k0s-airgap-bundle-v1.28.5.tar
  dstDir: /var/lib/k0s/images/
  perm: 075
That will make the permissions |
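For reference, an octal mode such as the `perm: 075` value above can be decoded into the familiar `ls`-style string. The following is a minimal Python sketch, assuming the value is interpreted as octal; the helper name `describe_mode` is hypothetical and not part of k0sctl.

```python
import stat


def describe_mode(perm: str) -> str:
    """Render an octal permission string (e.g. "075") as ls-style text."""
    mode = int(perm, 8)
    # S_IFREG marks the mode as a regular file, so the first character is "-".
    return stat.filemode(stat.S_IFREG | mode)


print(describe_mode("075"))   # ----rwxr-x (owner has no access)
print(describe_mode("0755"))  # -rwxr-xr-x
```

With `075` the owner gets no access at all, while `0755` gives the owner full access and everyone else read and execute.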
In the logs, I see these:
Based on that, it seems the second controller is having a tough time joining the cluster. I'd look into the status of k0s on that node to see if there are any hints as to why. Log into that machine and look into the logs:
@kke would it be possible/make sense if k0sctl could do something like this automatically when it sees k0s is not getting up as expected? |
I re-ran it and noticed that the script failed a bit faster, while trying to acquire a lock. In addition, there were no logs on the nodes, because the run never reached the step where the node is installed or set up.
|
Hmm, interesting idea, so it would try to dig up some diagnostics logs on failure 🤔 That could be handy. |
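As a rough illustration of the idea of digging up diagnostics on failure, here is a hedged Python sketch that runs a couple of diagnostic commands on a node over SSH and collects their output. It is not how k0sctl works; the command list, the `k0scontroller` unit name, the default `root` user, and the function names are assumptions made for this example.

```python
import subprocess

# Assumed diagnostic commands; adjust for the node's role and init system.
DIAGNOSTIC_COMMANDS = [
    "k0s status",
    "journalctl -u k0scontroller --no-pager -n 100",
]


def build_ssh_commands(host, user="root"):
    """Return one ssh argv list per diagnostic command."""
    return [["ssh", f"{user}@{host}", cmd] for cmd in DIAGNOSTIC_COMMANDS]


def collect_diagnostics(host, user="root", runner=subprocess.run):
    """Run each diagnostic command on the host and map command -> output."""
    reports = {}
    for argv in build_ssh_commands(host, user):
        result = runner(argv, capture_output=True, text=True, timeout=30)
        # Keep stderr too, since failing commands often explain themselves there.
        reports[argv[-1]] = result.stdout + result.stderr
    return reports
```

A caller could invoke `collect_diagnostics("192.168.15.216")` after a failed apply and print the returned report; the `runner` parameter exists only so the function can be exercised without a real SSH connection.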
INFO ==> Running phase: Initialize the k0s cluster
INFO [ssh] 192.168.15.216:22: installing k0s controller
INFO * Running clean-up for phase: Acquire exclusive host lock
INFO * Running clean-up for phase: Initialize the k0s cluster
INFO [ssh] 192.168.15.216:22: cleaning up
INFO ==> Apply failed
No error displayed? That's not nice. The lock file is just for avoiding two instances of k0sctl operating at the same time; maybe it should be quieter about it. The actual problem is somewhere else. |
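To illustrate the general idea of a lock file that keeps two instances of a tool from operating at the same time, here is a minimal Python sketch. It is a generic pattern and not k0sctl's actual implementation; the name `exclusive_lock` is hypothetical.

```python
import os
from contextlib import contextmanager


@contextmanager
def exclusive_lock(path):
    """Hold an exclusive lock file at `path` for the duration of the block."""
    # O_EXCL makes creation fail with FileExistsError if the file already
    # exists, i.e. if another instance currently holds the lock.
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        os.write(fd, str(os.getpid()).encode())
        yield path
    finally:
        # Release the lock so the next run can acquire it.
        os.close(fd)
        os.remove(path)
```

A second process entering `with exclusive_lock(...)` on the same path while the first still holds it gets a `FileExistsError`, which is a signal to wait or abort rather than an indication of the underlying failure.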
New discovery! I copied the install command from the logs and ran it standalone (without escaping) on the server the logs pointed to, and it causes a Null Pointer Exception. Log line:
The current content of /etc/k0s/k0s.yaml is the following:
apiVersion: k0s.k0sproject.io/v1beta1
kind: ClusterConfig
spec:
  api:
    address: 192.168.15.216
    sans:
    - 192.168.15.216
    - 192.168.14.186
    - 192.168.15.88
    - 127.0.0.1
  controllerManager: {}
  extensions: null
  installConfig: null
  konnectivity:
    adminPort: 8133
    agentPort: 8132
  network:
    calico:
      mode: vxlan
      mtu: 0
      overlay: always
      vxlanPort: 4789
      vxlanVNI: 4096
      wireguard: true
    clusterDomain: cluster.local
    dualStack: {}
    kubeProxy:
      mode: iptables
    podCIDR: 10.244.0.0/16
    provider: calico
    serviceCIDR: 10.96.0.0/12
  podSecurityPolicy:
    defaultPolicy: 00-k0s-privileged
  scheduler: {}
  telemetry:
    enabled: false
status: {}
Command without escaping:
I think this is great as we are not running blind anymore. |
Great news! I was able to solve the issues. The above issue is due to a validation error while processing the YAML configuration, basically
After this I faced another issue with etcd, where I noticed that the systemd service was using the OS hostname and not the hostname from the configuration. I'm also using this value in the config, but I'm not sure whether the value is used internally in k0s or whether it sets the hostname in the OS. Please feel free to close this issue or keep it open to track the validations. |
That should already be happening here - I haven't figured out yet why it isn't.
That should be validated already: https://github.com/k0sproject/k0sctl/blob/main/phase/validate_hosts.go#L54-L60
That is only used as
K0s does not look at that when starting etcd but will always use
It looks like you found two k0s bugs 🥇 |
When attempting to install k0s via k0sctl with a multi-controller setup, the installation fails. This doesn't happen when there is only one controller (or controller+worker) and the rest of the nodes are workers. I have tested with node-local load balancing and with no load balancing, but the same issue arises in both cases.
System Information:
os-release
kernel:
Linux fwd-oracle 5.14.0-362.13.1.el9_3.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Dec 21 22:34:57 PST 2023 x86_64 x86_64 x86_64 GNU/Linux
k0sctl config:
logs:
k0sctl.log
Based on this issue: k0sproject/k0s#3337 (comment)