Kubernetes Part 13: Replacing docker(shim) with containerd

This post is a little sidestep: I was intending to write a post explaining how to configure a Plex deployment on your Kubernetes home cluster. Last weekend, however, I decided to make my Kubernetes cluster "future-proof" by replacing docker with containerd as the container runtime engine.

As you may have noticed, Kubernetes will drop support for docker as a container runtime engine (see here), and the reason is purely technical. It basically comes down to this (or at least that's how I understand it): the docker runtime isn't fully compatible with the Kubernetes Container Runtime Interface (CRI), so an additional adapter called dockershim had to be written. Maintaining dockershim costs time, while other container runtime engines (like containerd and CRI-O) get the exact same job done without the effort of maintaining a separate interface. The container formats used by docker, containerd and CRI-O all follow the same standard from the Open Container Initiative, which basically means that there are zero compatibility issues between the container runtimes for the used images.

For a detailed explanation of this topic, I can recommend this excellent video from JustMe & Opensource.

Now back to our K8S cluster. In the procedure below I describe how I replaced the docker engine with a containerd engine on my Kubernetes cluster. I will first describe how to do it on a worker node, and after that on the master node. I would advise starting with a worker node, just to get acquainted with the procedure.

To start we want to get an overview of our cluster, so type the command below on your master server.


kubectl get nodes

You should see something like this:

(screenshot: output of kubectl get nodes, listing the master and worker nodes with status Ready)

Now open an ssh session with your first worker node (for example k8s-worker-01).

A. Replacing the docker engine on a worker node

The following commands need to be run on the worker node. The exception is the kubectl drain command in step 1, which (just like the uncordon in step 12) you run on the master server.

Step 1 - Drain the worker node
The first thing we do is to put the worker node in drain mode. This moves all pods to the other nodes, so no containers are left running on the node. As in previous posts, the red values are example values.


kubectl drain k8s-worker-01 --ignore-daemonsets --delete-emptydir-data
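
If you want to double-check that the node is empty before continuing, you can list the pods that are still assigned to it (the grep pattern is just my example node name; daemonset pods may still show up, which is fine):

kubectl get pods --all-namespaces -o wide | grep k8s-worker-01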

Step 2 - Stop kubernetes on the (worker) node
Run the next command to stop kubernetes (the kubelet service) running on the node:

sudo systemctl stop kubelet

Step 3 - Remove docker from the node
You can remove docker from the worker node via the following commands:

sudo apt remove docker-ce docker-ce-cli
sudo apt autoremove
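
To confirm that the docker packages are really gone, a quick check like this should show no remaining installed (ii) docker-ce packages:

dpkg -l | grep docker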

Step 4 - Configure the prerequisites for containerd
For containerd to run properly with kubernetes, the kernel modules overlay and br_netfilter need to be loaded.

The following command will create the file containerd.conf, which loads these modules at startup.

cat <<EOF | sudo tee /etc/modules-load.d/containerd.conf
overlay
br_netfilter
EOF

You can verify if the file has been created properly via the command

cat /etc/modules-load.d/containerd.conf

The next commands will load the required modules immediately:

sudo modprobe overlay
sudo modprobe br_netfilter
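
To verify that both modules are actually loaded, you can check with lsmod; each module should appear in the output:

lsmod | grep overlay
lsmod | grep br_netfilter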

Another file needs to be created for additional settings that will be applied at startup.

cat <<EOF | sudo tee /etc/sysctl.d/99-kubernetes-cri.conf
net.bridge.bridge-nf-call-iptables  = 1
net.ipv4.ip_forward                 = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF

You can verify if the file has been created properly via the command

cat /etc/sysctl.d/99-kubernetes-cri.conf

Now, apply these settings without a reboot

sudo sysctl --system
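
You can verify that the three settings are active by querying them directly; each should report the value 1:

sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward net.bridge.bridge-nf-call-ip6tables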

Install the following packages to allow apt to use a repository over HTTPS. On my system these packages were already installed, but it doesn't hurt to run this command anyway.

sudo apt-get update && sudo apt-get install -y apt-transport-https ca-certificates curl software-properties-common

The next step is to add Docker's official GPG key. Now you might think "Why do I need to do that?". This is because we will install containerd from Docker's own package repository (the containerd.io package is published by Docker), and apt needs the GPG key to trust that repository.

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key --keyring /etc/apt/trusted.gpg.d/docker.gpg add -

The next step is to add the docker repository. Before you do that, check whether the repository has not been added previously, otherwise you will see errors when you run "sudo apt update". You can check this by verifying whether the file /etc/apt/sources.list.d/docker.list already exists; if it does, do not run the add-apt-repository command below.
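
A quick way to check is listing the file; if ls reports "No such file or directory", the repository is not there yet and it is safe to proceed:

ls -l /etc/apt/sources.list.d/docker.list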

sudo add-apt-repository "deb [arch=arm64] https://download.docker.com/linux/ubuntu focal stable"

Step 5 - Install containerd

sudo apt-get update && sudo apt-get install -y containerd.io
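
You can confirm the installation by asking containerd for its version:

containerd --version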

Step 6 - Configure containerd

With the commands below you will create a config folder and place the default containerd configuration in there.

sudo mkdir -p /etc/containerd
sudo containerd config default | sudo tee /etc/containerd/config.toml

Step 7 - Make containerd use the systemd cgroup driver

You configure this by modifying /etc/containerd/config.toml, via the command

sudo nano /etc/containerd/config.toml

Now look for the following section, and add the text in red (the SystemdCgroup = true line), just like in the example below.


[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  ...
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    SystemdCgroup = true
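
If you prefer not to edit the file by hand, and your generated config already contains the line SystemdCgroup = false (newer containerd versions include it by default), you can flip it with a one-liner and verify the result with grep:

sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
grep SystemdCgroup /etc/containerd/config.toml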

Step 8 - Start containerd

Now, enable and start containerd as a service, by entering the following commands. If containerd was already started automatically during installation, use "sudo systemctl restart containerd" instead of start, so that the modified config is picked up.

sudo systemctl enable containerd
sudo systemctl start containerd
sudo systemctl status containerd

Step 9 - Test containerd (optional)

It is not required, but if you want to test containerd, you can list the locally available images with the command below.

sudo ctr image ls

You should see something like this (a list of the downloaded images):

(screenshot: output of ctr image ls)

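
For example, to test whether containerd can pull and run a container on its own (outside kubernetes), you could pull and run the hello-world image from Docker Hub; the image name is just an example, any small image will do:

sudo ctr image pull docker.io/library/hello-world:latest
sudo ctr run --rm docker.io/library/hello-world:latest test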
If you want more examples to test, you can find them here.

Step 10 - Configure kubelet to use containerd

REPLACE (do not append to) the file /var/lib/kubelet/kubeadm-flags.env. Open it with the following command.
 
sudo nano /var/lib/kubelet/kubeadm-flags.env

And REPLACE the content of the file with the following line

KUBELET_KUBEADM_ARGS="--container-runtime=remote --container-runtime-endpoint=/run/containerd/containerd.sock"

Step 11 - Start Kubernetes (fingers crossed!)

sudo systemctl daemon-reload
sudo systemctl start kubelet

Run the command 

kubectl get nodes



If the migration went OK, you should see that the status of the worker node (k8s-worker-01) is "Ready,SchedulingDisabled". If it went wrong, the status is NotReady, and you need to check whether you have executed all steps properly.
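
When troubleshooting a NotReady node, it can help to check on the worker whether kubelet can actually reach containerd. Assuming crictl is present (it normally comes with kubeadm as part of the cri-tools package), the command below should list the running containers without errors:

sudo crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps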


Step 12 - Enable scheduling on the worker node

If the status of the worker node (k8s-worker-01) is Ready, you can enable scheduling again so that it is able to host pods/containers.

You do this by running the following command on the master server

kubectl uncordon k8s-worker-01

By running the kubectl get nodes command again, you should see that the master and worker nodes all have (only) the status Ready.
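
A nice extra verification is the wide output of kubectl get nodes; the CONTAINER-RUNTIME column should now show containerd:// (instead of docker://) for the migrated node:

kubectl get nodes -o wide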

If everything went OK, repeat the same steps for all worker nodes.


B. Replacing the docker engine on the master node

After all the worker nodes have had their container runtime engine replaced by containerd, the last step is to replace the container engine on the master. This is basically the same procedure, with a few additions.


Step 1 - Stop kubernetes on the master node

Since you can't drain the master node, you only need to stop kubernetes on the master node. Your worker nodes and pods will continue to function, but they won't be managed in the meantime.

sudo systemctl stop kubelet

Now execute steps 3 to 11, as described in the procedure for the worker node.


Step 12 - Modify the kubernetes master config file

The final step is to modify the kubernetes config of the master node. Although the master and worker nodes at this stage are already using the containerd runtime engine, you also need to adjust this config setting, otherwise you will get errors when running kubeadm commands (for example to upgrade kubernetes to a newer version).

To do this run the following command on the master node.

kubectl edit node k8s-master

Change the following line

kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock 
to

kubeadm.alpha.kubernetes.io/cri-socket: /run/containerd/containerd.sock
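
If you want to verify the annotation without opening the editor, you could query it directly with a jsonpath expression (the dots in the annotation key are escaped with backslashes):

kubectl get node k8s-master -o jsonpath='{.metadata.annotations.kubeadm\.alpha\.kubernetes\.io/cri-socket}'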

You can check whether kubeadm is working properly by running the command below; it only checks whether your cluster can be upgraded, it doesn't change anything. If you haven't made the change described above, you will get an error; otherwise the modification went OK.

sudo kubeadm upgrade plan
This should generate something similar to this:

(screenshot: output of kubeadm upgrade plan)

I hope this article was helpful. If you have any questions, do not hesitate to leave a comment.

All commands can be found on my GitHub here.

Comments

1. "sudo nano /var/lib/kubelet/kubeadm-flag.env" should be "sudo nano /var/lib/kubelet/kubeadm-flags.env" - missed an 's'.

   Reply from author: Thanks for the information. I have corrected it.

2. I'd also recommend doing a "sudo systemctl disable kubelet" and rebooting before you enable/start kubelet again; at least on my CentOS 8 boxes this was required.

   Reply from author: On my Ubuntu 20.04 it wasn't required, but a reboot is recommended. Another tip is to clean up docker completely by removing the following directories (after a reboot):

   /var/lib/docker
   /var/lib/dockershim

   Thanks for your response, and I hope the blog was helpful.

3. You're missing a step to cordon the node prior to draining it.

   Reply from author: The drain command automatically sets the cordon mode for the worker, so you don't have to enter this command separately.