
Using cilium in Docker requires patching mounts #5286

Open · 4 tasks done
Jip-Hop opened this issue Nov 22, 2024 · 4 comments
Labels
bug Something isn't working

Comments

Jip-Hop commented Nov 22, 2024

Before creating an issue, make sure you've checked the following:

  • You are running the latest released version of k0s
  • Make sure you've searched for existing issues, both open and closed
  • Make sure you've searched for PRs too, a fix might've been merged already
  • You're looking at docs for the released version; "main" branch docs are usually ahead of released versions.

Platform

`docker info`
Client:
 Version:    27.3.1
 Context:    desktop-linux
 Debug Mode: false
 Plugins:
  ai: Ask Gordon - Docker Agent (Docker Inc.)
    Version:  v0.1.0
    Path:     /Users/redacted/.docker/cli-plugins/docker-ai
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.18.0-desktop.2
    Path:     /Users/redacted/.docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.30.3-desktop.1
    Path:     /Users/redacted/.docker/cli-plugins/docker-compose
  debug: Get a shell into any image or container (Docker Inc.)
    Version:  0.0.37
    Path:     /Users/redacted/.docker/cli-plugins/docker-debug
  desktop: Docker Desktop commands (Alpha) (Docker Inc.)
    Version:  v0.0.15
    Path:     /Users/redacted/.docker/cli-plugins/docker-desktop
  dev: Docker Dev Environments (Docker Inc.)
    Version:  v0.1.2
    Path:     /Users/redacted/.docker/cli-plugins/docker-dev
  extension: Manages Docker extensions (Docker Inc.)
    Version:  v0.2.27
    Path:     /Users/redacted/.docker/cli-plugins/docker-extension
  feedback: Provide feedback, right in your terminal! (Docker Inc.)
    Version:  v1.0.5
    Path:     /Users/redacted/.docker/cli-plugins/docker-feedback
  init: Creates Docker-related starter files for your project (Docker Inc.)
    Version:  v1.4.0
    Path:     /Users/redacted/.docker/cli-plugins/docker-init
  sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
    Version:  0.6.0
    Path:     /Users/redacted/.docker/cli-plugins/docker-sbom
  scout: Docker Scout (Docker Inc.)
    Version:  v1.15.0
    Path:     /Users/redacted/.docker/cli-plugins/docker-scout

Server:
Containers: 1
Running: 1
Paused: 0
Stopped: 0
Images: 3
Server Version: 27.3.1
Storage Driver: overlayfs
driver-type: io.containerd.snapshotter.v1
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 472731909fa34bd7bc9c087e4c27943f9835f111
runc version: v1.1.13-0-g58aa920
init version: de40ad0
Security Options:
seccomp
Profile: unconfined
cgroupns
Kernel Version: 6.10.14-linuxkit
Operating System: Docker Desktop
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 7.656GiB
Name: docker-desktop
ID: redacted
Docker Root Dir: /var/lib/docker
Debug Mode: false
HTTP Proxy: http.docker.internal:3128
HTTPS Proxy: http.docker.internal:3128
No Proxy: hubproxy.docker.internal
Labels:
com.docker.desktop.address=unix:///Users/redacted/Library/Containers/com.docker.docker/Data/docker-cli.sock
Experimental: false
Insecure Registries:
hubproxy.docker.internal:5555
127.0.0.0/8
Live Restore Enabled: false

WARNING: daemon is not using the default seccomp profile

Version

v1.31.2+k0s.0

Sysinfo

`k0s sysinfo`
Total memory: 7.7 GiB (pass)
File system of /var/lib/k0s: ext4 (pass)
Disk space available for /var/lib/k0s: 53.3 GiB (pass)
Relative disk space available for /var/lib/k0s: 91% (pass)
Name resolution: localhost: [::1 127.0.0.1] (pass)
Operating system: Linux (pass)
  Linux kernel release: 6.10.14-linuxkit (pass)
  Max. file descriptors per process: current: 1048576 / max: 1048576 (pass)
  AppArmor: unavailable (pass)
  Executable in PATH: modprobe: /sbin/modprobe (pass)
  Executable in PATH: mount: /bin/mount (pass)
  Executable in PATH: umount: /bin/umount (pass)
  /proc file system: mounted (0x9fa0) (pass)
  Control Groups: version 2 (pass)
    cgroup controller "cpu": available (is a listed root controller) (pass)
    cgroup controller "cpuacct": available (via cpu in version 2) (pass)
    cgroup controller "cpuset": available (is a listed root controller) (pass)
    cgroup controller "memory": available (is a listed root controller) (pass)
    cgroup controller "devices": available (device filters attachable) (pass)
    cgroup controller "freezer": available (cgroup.freeze exists) (pass)
    cgroup controller "pids": available (is a listed root controller) (pass)
    cgroup controller "hugetlb": available (is a listed root controller) (pass)
    cgroup controller "blkio": available (via io in version 2) (pass)
  CONFIG_CGROUPS: Control Group support: built-in (pass)
    CONFIG_CGROUP_FREEZER: Freezer cgroup subsystem: built-in (pass)
    CONFIG_CGROUP_PIDS: PIDs cgroup subsystem: built-in (pass)
    CONFIG_CGROUP_DEVICE: Device controller for cgroups: built-in (pass)
    CONFIG_CPUSETS: Cpuset support: built-in (pass)
    CONFIG_CGROUP_CPUACCT: Simple CPU accounting cgroup subsystem: built-in (pass)
    CONFIG_MEMCG: Memory Resource Controller for Control Groups: built-in (pass)
    CONFIG_CGROUP_HUGETLB: HugeTLB Resource Controller for Control Groups: built-in (pass)
    CONFIG_CGROUP_SCHED: Group CPU scheduler: built-in (pass)
      CONFIG_FAIR_GROUP_SCHED: Group scheduling for SCHED_OTHER: built-in (pass)
        CONFIG_CFS_BANDWIDTH: CPU bandwidth provisioning for FAIR_GROUP_SCHED: built-in (pass)
    CONFIG_BLK_CGROUP: Block IO controller: built-in (pass)
  CONFIG_NAMESPACES: Namespaces support: built-in (pass)
    CONFIG_UTS_NS: UTS namespace: built-in (pass)
    CONFIG_IPC_NS: IPC namespace: built-in (pass)
    CONFIG_PID_NS: PID namespace: built-in (pass)
    CONFIG_NET_NS: Network namespace: built-in (pass)
  CONFIG_NET: Networking support: built-in (pass)
    CONFIG_INET: TCP/IP networking: built-in (pass)
      CONFIG_IPV6: The IPv6 protocol: built-in (pass)
    CONFIG_NETFILTER: Network packet filtering framework (Netfilter): built-in (pass)
      CONFIG_NETFILTER_ADVANCED: Advanced netfilter configuration: built-in (pass)
      CONFIG_NF_CONNTRACK: Netfilter connection tracking support: built-in (pass)
      CONFIG_NETFILTER_XTABLES: Netfilter Xtables support: built-in (pass)
        CONFIG_NETFILTER_XT_TARGET_REDIRECT: REDIRECT target support: built-in (pass)
        CONFIG_NETFILTER_XT_MATCH_COMMENT: "comment" match support: built-in (pass)
        CONFIG_NETFILTER_XT_MARK: nfmark target and match support: built-in (pass)
        CONFIG_NETFILTER_XT_SET: set target and match support: built-in (pass)
        CONFIG_NETFILTER_XT_TARGET_MASQUERADE: MASQUERADE target support: built-in (pass)
        CONFIG_NETFILTER_XT_NAT: "SNAT and DNAT" targets support: built-in (pass)
        CONFIG_NETFILTER_XT_MATCH_ADDRTYPE: "addrtype" address type match support: built-in (pass)
        CONFIG_NETFILTER_XT_MATCH_CONNTRACK: "conntrack" connection tracking match support: built-in (pass)
        CONFIG_NETFILTER_XT_MATCH_MULTIPORT: "multiport" Multiple port match support: built-in (pass)
        CONFIG_NETFILTER_XT_MATCH_RECENT: "recent" match support: built-in (pass)
        CONFIG_NETFILTER_XT_MATCH_STATISTIC: "statistic" match support: built-in (pass)
      CONFIG_NETFILTER_NETLINK: built-in (pass)
      CONFIG_NF_NAT: built-in (pass)
      CONFIG_IP_SET: IP set support: built-in (pass)
        CONFIG_IP_SET_HASH_IP: hash:ip set support: built-in (pass)
        CONFIG_IP_SET_HASH_NET: hash:net set support: built-in (pass)
      CONFIG_IP_VS: IP virtual server support: built-in (pass)
        CONFIG_IP_VS_NFCT: Netfilter connection tracking: built-in (pass)
        CONFIG_IP_VS_SH: Source hashing scheduling: built-in (pass)
        CONFIG_IP_VS_RR: Round-robin scheduling: built-in (pass)
        CONFIG_IP_VS_WRR: Weighted round-robin scheduling: built-in (pass)
      CONFIG_NF_CONNTRACK_IPV4: IPv4 connetion tracking support (required for NAT): unknown (warning)
      CONFIG_NF_REJECT_IPV4: IPv4 packet rejection: built-in (pass)
      CONFIG_NF_NAT_IPV4: IPv4 NAT: unknown (warning)
      CONFIG_IP_NF_IPTABLES: IP tables support: built-in (pass)
        CONFIG_IP_NF_FILTER: Packet filtering: built-in (pass)
          CONFIG_IP_NF_TARGET_REJECT: REJECT target support: built-in (pass)
        CONFIG_IP_NF_NAT: iptables NAT support: built-in (pass)
        CONFIG_IP_NF_MANGLE: Packet mangling: built-in (pass)
      CONFIG_NF_DEFRAG_IPV4: built-in (pass)
      CONFIG_NF_CONNTRACK_IPV6: IPv6 connetion tracking support (required for NAT): unknown (warning)
      CONFIG_NF_NAT_IPV6: IPv6 NAT: unknown (warning)
      CONFIG_IP6_NF_IPTABLES: IP6 tables support: built-in (pass)
        CONFIG_IP6_NF_FILTER: Packet filtering: built-in (pass)
        CONFIG_IP6_NF_MANGLE: Packet mangling: built-in (pass)
        CONFIG_IP6_NF_NAT: ip6tables NAT support: built-in (pass)
      CONFIG_NF_DEFRAG_IPV6: built-in (pass)
    CONFIG_BRIDGE: 802.1d Ethernet Bridging: built-in (pass)
      CONFIG_LLC: built-in (pass)
      CONFIG_STP: built-in (pass)
  CONFIG_EXT4_FS: The Extended 4 (ext4) filesystem: built-in (pass)
  CONFIG_PROC_FS: /proc file system support: built-in (pass)

What happened?

I've tried the latest k0s version with the new entrypoint script by @twz123. I was hoping it would fix the issues I'm having with running k0s + cilium, but unfortunately the entrypoint doesn't solve them. The log is full of these errors:

failed to generate spec: path \\\"/sys/fs/bpf\\\" is mounted on \\\"/sys\\\" but it is not a shared mount\"" component=containerd stream=stderr

And the cilium pods never start.
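For context: the "not a shared mount" error apparently comes from OCI runtime spec generation, which refuses the `/sys/fs/bpf` bind mount because `/sys` inside the container has private mount propagation. Linux reports propagation in the optional fields of `/proc/self/mountinfo` (a `shared:N` tag). A small sketch of how to check for it (my own illustration, not from k0s; in the mountinfo format, field 5 is the mount point and the optional fields run from field 7 up to the `-` separator):

```shell
#!/bin/sh
# Report whether a mount point carries the "shared" propagation flag,
# given /proc/self/mountinfo-formatted text on stdin.
is_shared() {
  awk -v mp="$1" '
    $5 == mp { for (i = 7; i <= NF && $i != "-"; i++)
                 if ($i ~ /^shared:/) found = 1 }
    END { exit !found }'
}

# A private /sys, as seen inside the failing container:
echo '24 30 0:22 / /sys rw,nosuid,nodev,noexec - sysfs sysfs rw' \
  | is_shared /sys && echo "/sys is shared" || echo "/sys is private"

# After "mount --make-rshared /" the same entry gains a shared:N tag:
echo '24 30 0:22 / /sys rw,nosuid,nodev,noexec shared:7 - sysfs sysfs rw' \
  | is_shared /sys && echo "/sys is shared" || echo "/sys is private"
```

In a live container, the same check can be run directly against `/proc/self/mountinfo`.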

Steps to reproduce

Run this script to reproduce the issue:

#!/bin/bash
set -euo pipefail

KUBE_API_SERVER=https://localhost:6443

docker compose -f - up --build -d <<EOF
services:
  k0s:
    build:
        context: .
        dockerfile_inline: |
            FROM docker.io/k0sproject/k0s:v1.31.2-k0s.0
            # Use the new entrypoint script
            ADD --chmod=0755 https://raw.githubusercontent.com/k0sproject/k0s/18d3545594b8abac7e50aa70720dea44337f25fa/docker-entrypoint.sh /entrypoint.sh
    command: k0s controller --single --disable-components metrics-server --config=/etc/k0s/config.yaml
    volumes:
      - /var/lib/k0s
      - /var/log/pods
      - /lib/modules:/lib/modules:ro # required to get cilium working
    container_name: k0s
    hostname: k0s
    privileged: true
    tmpfs:
      - /run
      - /tmp
    ports:
      - 80:80
      - 443:443
      - 6443:6443
    network_mode: "bridge"
    restart: "no"
    environment:
      K0S_CONFIG: |-
        apiVersion: k0s.k0sproject.io/v1beta1
        kind: ClusterConfig
        metadata:
          name: k0s
        spec:
          telemetry:
            enabled: false
          network:
            kubeProxy:
              disabled: true
            provider: custom
          api:
            sans:
            - localhost
EOF

while [ ! "$(curl -k -s -o /dev/null -w "%{http_code}" https://localhost:6443)" -eq 401 ]; do
    echo "Sleep..."
    sleep 1
done

KUBECONFIG=$(mktemp -t kubeconfig)
export KUBECONFIG

docker exec k0s k0s kubeconfig admin >"$KUBECONFIG"
kubectl config set clusters.local.server "$KUBE_API_SERVER"

echo "Written kubeconfig to: $KUBECONFIG"

kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/standard/gateway.networking.k8s.io_gatewayclasses.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/standard/gateway.networking.k8s.io_gateways.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/standard/gateway.networking.k8s.io_httproutes.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/standard/gateway.networking.k8s.io_referencegrants.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/standard/gateway.networking.k8s.io_grpcroutes.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/experimental/gateway.networking.k8s.io_tlsroutes.yaml

cilium install --version 1.16.4 \
    --set k8sServiceHost=localhost \
    --set k8sServicePort=6443 \
    --set kubeProxyReplacement=true \
    --set gatewayAPI.enabled=true \
    --set gatewayAPI.hostNetwork.enabled=true \
    --set envoy.securityContext.capabilities.keepCapNetBindService=true \
    --set cgroup.autoMount.enabled=false \
    --set cgroup.hostRoot=/sys/fs/cgroup

cilium status --wait --wait-duration=10m

# Apply the echo example from https://docs.cilium.io/en/stable/network/servicemesh/gateway-api/splitting/
kubectl apply -f https://raw.githubusercontent.com/cilium/cilium/1.16.4/examples/kubernetes/gateway/echo.yaml
kubectl apply -f https://raw.githubusercontent.com/cilium/cilium/1.16.4/examples/kubernetes/gateway/splitting.yaml

# Wait for details deployments to be ready
kubectl rollout status deployment echo-1
kubectl rollout status deployment echo-2

sleep 1

curl --fail -s http://localhost/echo && echo "Cilium working as expected!"

Expected behavior

For cilium pods to come up and the script to continue past cilium status --wait --wait-duration=10m.

Actual behavior

The script times out waiting for cilium pods to come up.

Screenshots and logs

No response

Additional context

The version of the script below has the workarounds in place to make cilium work:

  • cgroup: host in the compose file
  • mount --make-rshared / as part of the entrypoint

With just mount --make-rshared / and without cgroup: host, I still run into these errors (and the cilium pods don't start):

kubelet.go:1566] \"Failed to start ContainerManager\" err=\"cannot enter cgroupv2 \\\"/sys/fs/cgroup/kubepods\\\" with domain controllers -- it is in an invalid state\"" component=kubelet stream=stderr
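As I understand it, this kubelet error is the cgroup v2 "no internal processes" rule: a cgroup can't both contain processes and have domain controllers enabled for its children. Entrypoints like k3d's work around it by evacuating the cgroup root into a leaf before enabling controllers. A rough, unprivileged dry-run of that idea (my own sketch against a temp dir, not the real /sys/fs/cgroup):

```shell
#!/bin/sh
# Dry-run of the cgroup v2 nesting fix: move every PID out of the cgroup
# root into an init/ leaf, then enable all controllers for child cgroups.
# CGROOT would be /sys/fs/cgroup in the real container; a temp dir here
# so the sketch runs without privileges.
CGROOT=$(mktemp -d)
echo "cpu memory pids" > "$CGROOT/cgroup.controllers"
printf '1\n42\n'       > "$CGROOT/cgroup.procs"

mkdir -p "$CGROOT/init"
# One write per PID (the real cgroup.procs interface requires this).
xargs -rn1 < "$CGROOT/cgroup.procs" >> "$CGROOT/init/cgroup.procs"
# "cpu memory pids" -> "+cpu +memory +pids"
sed -e 's/ / +/g' -e 's/^/+/' < "$CGROOT/cgroup.controllers" \
  > "$CGROOT/cgroup.subtree_control"

cat "$CGROOT/cgroup.subtree_control"   # +cpu +memory +pids
```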

Working script:

#!/bin/bash
set -euo pipefail

KUBE_API_SERVER=https://localhost:6443

docker compose -f - up --build -d <<EOF
services:
  k0s:
    cgroup: host # still seems to be required even with the new entrypoint magic
    build:
        context: .
        dockerfile_inline: |
            FROM docker.io/k0sproject/k0s:v1.31.2-k0s.0
            # Use the new entrypoint script
            ADD --chmod=0755 https://raw.githubusercontent.com/k0sproject/k0s/18d3545594b8abac7e50aa70720dea44337f25fa/docker-entrypoint.sh /entrypoint.sh
    command: |-
      sh -c '

        # This command is required in order to fix cilium inside the container
        mount --make-rshared /

        k0s controller --single \
            --disable-components metrics-server \
            --config=/etc/k0s/config.yaml \
      '
    volumes:
      - /var/lib/k0s
      - /var/log/pods
      - /lib/modules:/lib/modules:ro # required to get cilium working
    container_name: k0s
    hostname: k0s
    privileged: true
    tmpfs:
      - /run
      - /tmp
    ports:
      - 80:80
      - 443:443
      - 6443:6443
    network_mode: "bridge"
    restart: "no"
    environment:
      K0S_CONFIG: |-
        apiVersion: k0s.k0sproject.io/v1beta1
        kind: ClusterConfig
        metadata:
          name: k0s
        spec:
          telemetry:
            enabled: false
          network:
            kubeProxy:
              disabled: true
            provider: custom
          api:
            sans:
            - localhost
EOF

while [ ! "$(curl -k -s -o /dev/null -w "%{http_code}" https://localhost:6443)" -eq 401 ]; do
    echo "Sleep..."
    sleep 1
done

KUBECONFIG=$(mktemp -t kubeconfig)
export KUBECONFIG

docker exec k0s k0s kubeconfig admin >"$KUBECONFIG"
kubectl config set clusters.local.server "$KUBE_API_SERVER"

echo "Written kubeconfig to: $KUBECONFIG"

kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/standard/gateway.networking.k8s.io_gatewayclasses.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/standard/gateway.networking.k8s.io_gateways.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/standard/gateway.networking.k8s.io_httproutes.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/standard/gateway.networking.k8s.io_referencegrants.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/standard/gateway.networking.k8s.io_grpcroutes.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/experimental/gateway.networking.k8s.io_tlsroutes.yaml

cilium install --version 1.16.4 \
    --set k8sServiceHost=localhost \
    --set k8sServicePort=6443 \
    --set kubeProxyReplacement=true \
    --set gatewayAPI.enabled=true \
    --set gatewayAPI.hostNetwork.enabled=true \
    --set envoy.securityContext.capabilities.keepCapNetBindService=true \
    --set cgroup.autoMount.enabled=false \
    --set cgroup.hostRoot=/sys/fs/cgroup

cilium status --wait --wait-duration=10m

# Apply the echo example from https://docs.cilium.io/en/stable/network/servicemesh/gateway-api/splitting/
kubectl apply -f https://raw.githubusercontent.com/cilium/cilium/1.16.4/examples/kubernetes/gateway/echo.yaml
kubectl apply -f https://raw.githubusercontent.com/cilium/cilium/1.16.4/examples/kubernetes/gateway/splitting.yaml

# Wait for details deployments to be ready
kubectl rollout status deployment echo-1
kubectl rollout status deployment echo-2

sleep 1

curl --fail -s http://localhost/echo && echo "Cilium working as expected!"

@Jip-Hop Jip-Hop added the bug Something isn't working label Nov 22, 2024
@twz123 twz123 changed the title Using cilium requires patching mounts Using cilium in Docker requires patching mounts Nov 25, 2024
twz123 (Member) commented Nov 25, 2024

The new entrypoint script will only try to remount the cgroup fs if it's read-only. Since you're running the container with privileged: true, I guess it's read-write, so it should be no different from the previous version. You might want to try to set the environment variable K0S_ENTRYPOINT_REMOUNT_CGROUP2FS=1 and check if that makes any difference.
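A sketch of the kind of check described here (my own illustration, not the actual docker-entrypoint.sh code): decide from the options column of /proc/mounts whether the cgroup2 fs is mounted read-only and would need a remount:

```shell
#!/bin/sh
# Decide whether the cgroup2 fs would need a rw remount, by checking for
# the "ro" option in /proc/mounts-formatted input
# (field 2 = mount point, field 4 = comma-separated options).
cgroupfs_is_ro() {
  awk '$2 == "/sys/fs/cgroup" {
         n = split($4, opts, ",")
         for (k = 1; k <= n; k++) if (opts[k] == "ro") found = 1
       }
       END { exit !found }'
}

echo 'cgroup /sys/fs/cgroup cgroup2 ro,nosuid,nodev,noexec 0 0' \
  | cgroupfs_is_ro && echo "read-only: would remount rw" || echo "read-write: leave as-is"
```

With privileged: true the mount shows up as rw, so a check like this is a no-op, which matches the behavior described above.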

That said, the script and the docs don't cover anything BPF-related; they're tailored to make the parts of stock k0s work. So if you know what's necessary to make cilium work inside Docker, this might be worth an additional section in the docs.

Jip-Hop (Author) commented Nov 26, 2024

Thanks for replying!

Adding K0S_ENTRYPOINT_REMOUNT_CGROUP2FS=1 (without using cgroup: host) causes an error about not being able to find /sys/fs/cgroup in /proc/mounts (even though it's there, according to my debug logging via cat /proc/mounts).

+ mount --make-rslave /
+ cat /proc/mounts
+ grep sys
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup cgroup2 rw,nosuid,nodev,noexec,relatime 0 0
+ mount -o remount,rw /sys/fs/cgroup
mount: can't find /sys/fs/cgroup in /proc/mounts
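When given only a target, mount has to look up the matching source and fstype itself. Roughly (my own illustration, not BusyBox's actual code):

```shell
#!/bin/sh
# Resolve a mount target to its source and fstype from
# /proc/mounts-formatted input, as "mount -o remount <target>" must.
lookup_mount() {
  awk -v mp="$1" '$2 == mp { print $1, $3; found = 1 } END { exit !found }'
}

echo 'cgroup /sys/fs/cgroup cgroup2 rw,nosuid,nodev,noexec 0 0' \
  | lookup_mount /sys/fs/cgroup    # prints: cgroup cgroup2
```

The error above suggests BusyBox mount's version of this lookup came up empty even though the entry is visibly present. If so, one possible (untested) workaround would be to pass the source and type explicitly so no lookup is needed, e.g. `mount -t cgroup2 -o remount,rw cgroup /sys/fs/cgroup`.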

When I don't use the k0s entrypoint at all and instead rely on the k3d entrypoints, my cilium setup works as expected (without using cgroup: host).

#!/bin/bash
set -euo pipefail

KUBE_API_SERVER=https://localhost:6443

docker compose --verbose -f - up --build -d <<EOF
services:
  k0s:
    # Don't use the included k0s entrypoint  
    entrypoint: sh

    # Add the k3d entrypoints to fix the cgroups and mounts
    build:
        context: .
        dockerfile_inline: |
            FROM docker.io/k0sproject/k0s:v1.31.2-k0s.0
            # Add k3d entrypoints
            ADD --chmod=0755 https://raw.githubusercontent.com/k3d-io/k3d/60695db835ea8d3f0f6c95cc320a068d8aa5c44d/pkg/types/fixes/assets/k3d-entrypoint-cgroupv2.sh /entrypoint-cgroupv2.sh
            ADD --chmod=0755 https://raw.githubusercontent.com/k3d-io/k3d/8d54019838f3a516e6f28fdb5ac15aff2246986e/pkg/types/fixes/assets/k3d-entrypoint-mounts.sh /entrypoint-mounts.sh

    command: |-
      -c '
        set -euxo pipefail

        # Run the k3d entrypoints to fix the cgroups and mounts
        ./entrypoint-cgroupv2.sh
        ./entrypoint-mounts.sh

        k0s controller --single \
            --disable-components metrics-server \
            --config=/etc/k0s/k0s.yaml \
      '
    volumes:
      - /var/lib/k0s
      - /var/log/pods
      - /lib/modules:/lib/modules:ro # required to get cilium working
    container_name: k0s
    hostname: k0s
    privileged: true
    tmpfs:
      - /run
      - /tmp
    ports:
      - 80:80
      - 443:443
      - 6443:6443
    network_mode: "bridge"
    restart: "no"
    configs:
      - source: k0s.yaml
        target: /etc/k0s/k0s.yaml
configs:
  k0s.yaml:
    content: |
      apiVersion: k0s.k0sproject.io/v1beta1
      kind: ClusterConfig
      metadata:
        name: k0s
      spec:
        telemetry:
          enabled: false
        network:
          kubeProxy:
            disabled: true
          provider: custom
        api:
          sans:
          - localhost
EOF

while [ ! "$(curl -k -s -o /dev/null -w "%{http_code}" https://localhost:6443)" -eq 401 ]; do
  echo "Sleep..."
  sleep 1
done

KUBECONFIG=$(mktemp -t kubeconfig)
export KUBECONFIG

docker exec k0s k0s kubeconfig admin >"$KUBECONFIG"
kubectl config set clusters.local.server "$KUBE_API_SERVER"

echo "Written kubeconfig to: $KUBECONFIG"

kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/standard/gateway.networking.k8s.io_gatewayclasses.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/standard/gateway.networking.k8s.io_gateways.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/standard/gateway.networking.k8s.io_httproutes.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/standard/gateway.networking.k8s.io_referencegrants.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/standard/gateway.networking.k8s.io_grpcroutes.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/experimental/gateway.networking.k8s.io_tlsroutes.yaml

cilium install --version 1.16.4 \
  --set k8sServiceHost=localhost \
  --set k8sServicePort=6443 \
  --set kubeProxyReplacement=true \
  --set gatewayAPI.enabled=true \
  --set gatewayAPI.hostNetwork.enabled=true \
  --set envoy.securityContext.capabilities.keepCapNetBindService=true \
  --set cgroup.autoMount.enabled=false \
  --set cgroup.hostRoot=/sys/fs/cgroup

cilium status --wait --wait-duration=10m

# Apply the echo example from https://docs.cilium.io/en/stable/network/servicemesh/gateway-api/splitting/
kubectl apply -f https://raw.githubusercontent.com/cilium/cilium/1.16.4/examples/kubernetes/gateway/echo.yaml
kubectl apply -f https://raw.githubusercontent.com/cilium/cilium/1.16.4/examples/kubernetes/gateway/splitting.yaml

# Wait for details deployments to be ready
kubectl rollout status deployment echo-1
kubectl rollout status deployment echo-2

sleep 1

curl --fail -s http://localhost/echo && echo "Cilium working as expected!"

Could the steps the k3d entrypoints perform (or something functionally equivalent) be added to the k0s entrypoint?

Jip-Hop (Author) commented Nov 26, 2024

Turns out the new entrypoint already covers fixing the cgroups through the enable_cgroupv2_nesting function. It just wasn't triggered in my case: auto-detection didn't work because I used sh -c to run mount --make-rshared / before starting k0s.
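Purely as an illustration of why the wrapper defeats detection (this is a hypothetical sketch, not the actual entrypoint logic): an entrypoint that infers the role from the container command only sees "sh -c …" instead of "k0s controller …":

```shell
#!/bin/sh
# Hypothetical role detection based on the first two words of the
# container command, as a simple entrypoint might do it.
detect_role() {
  case "$1 $2" in
    "k0s controller") echo controller ;;
    "k0s worker")     echo worker ;;
    *)                echo unknown ;;
  esac
}

detect_role k0s controller --single          # controller
detect_role sh -c 'k0s controller --single'  # unknown: hidden behind sh -c
```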

So the only addition required to the entrypoint to make cilium work in Docker is running:

mount --make-rshared /

Which is what k3d has been doing for a while now.

Could this please be included in the new entrypoint?

Here's a working example with the new entrypoint, without cgroup: host, achieved by explicitly setting K0S_ENTRYPOINT_ROLE: controller+worker and running mount --make-rshared /.

#!/bin/bash
set -euo pipefail

KUBE_API_SERVER=https://localhost:6443

docker compose --verbose -f - up --build -d <<EOF
services:
  k0s:

    build:
        context: .
        dockerfile_inline: |
            FROM docker.io/k0sproject/k0s:v1.31.2-k0s.0
            # Use the new entrypoint script
            ADD --chmod=0755 https://raw.githubusercontent.com/k0sproject/k0s/18d3545594b8abac7e50aa70720dea44337f25fa/docker-entrypoint.sh /entrypoint.sh

    command: |-
      sh -c '
        set -euxo pipefail

        # This command is required in order to fix cilium inside the container
        mount --make-rshared /

        k0s controller --single \
            --disable-components metrics-server \
            --config=/etc/k0s/k0s.yaml \
      '
    volumes:
      - /var/lib/k0s
      - /var/log/pods
      - /lib/modules:/lib/modules:ro # required to get cilium working
    container_name: k0s
    hostname: k0s
    privileged: true
    tmpfs:
      - /run
      - /tmp
    ports:
      - 80:80
      - 443:443
      - 6443:6443
    network_mode: "bridge"
    restart: "no"
    environment:
      # Role currently can't be auto-detected since I'm using sh -c to run additional commands
      # before starting k0s itself
      K0S_ENTRYPOINT_ROLE: controller+worker
    configs:
      - source: k0s.yaml
        target: /etc/k0s/k0s.yaml
configs:
  k0s.yaml:
    content: |
      apiVersion: k0s.k0sproject.io/v1beta1
      kind: ClusterConfig
      metadata:
        name: k0s
      spec:
        telemetry:
          enabled: false
        network:
          kubeProxy:
            disabled: true
          provider: custom
        api:
          sans:
          - localhost
EOF

while [ ! "$(curl -k -s -o /dev/null -w "%{http_code}" https://localhost:6443)" -eq 401 ]; do
  echo "Sleep..."
  sleep 1
done

KUBECONFIG=$(mktemp -t kubeconfig)
export KUBECONFIG

docker exec k0s k0s kubeconfig admin >"$KUBECONFIG"
kubectl config set clusters.local.server "$KUBE_API_SERVER"

echo "Written kubeconfig to: $KUBECONFIG"

kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/standard/gateway.networking.k8s.io_gatewayclasses.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/standard/gateway.networking.k8s.io_gateways.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/standard/gateway.networking.k8s.io_httproutes.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/standard/gateway.networking.k8s.io_referencegrants.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/standard/gateway.networking.k8s.io_grpcroutes.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/experimental/gateway.networking.k8s.io_tlsroutes.yaml

cilium install --version 1.16.4 \
  --set k8sServiceHost=localhost \
  --set k8sServicePort=6443 \
  --set kubeProxyReplacement=true \
  --set gatewayAPI.enabled=true \
  --set gatewayAPI.hostNetwork.enabled=true \
  --set envoy.securityContext.capabilities.keepCapNetBindService=true \
  --set cgroup.autoMount.enabled=false \
  --set cgroup.hostRoot=/sys/fs/cgroup

cilium status --wait --wait-duration=10m

# Apply the echo example from https://docs.cilium.io/en/stable/network/servicemesh/gateway-api/splitting/
kubectl apply -f https://raw.githubusercontent.com/cilium/cilium/1.16.4/examples/kubernetes/gateway/echo.yaml
kubectl apply -f https://raw.githubusercontent.com/cilium/cilium/1.16.4/examples/kubernetes/gateway/splitting.yaml

# Wait for details deployments to be ready
kubectl rollout status deployment echo-1
kubectl rollout status deployment echo-2

sleep 1

curl --fail -s http://localhost/echo && echo "Cilium working as expected!"

Jip-Hop (Author) commented Nov 26, 2024

Perhaps like this?

Jip-Hop@dc0402e

I haven't had time yet to create a PR and meet all the contributor requirements.
