
I am trying to migrate from Flannel to Calico in a k8s cluster. I was able to do it successfully in a 3-node cluster: live migration from Flannel to Calico works as described in the documentation.

But migration from Flannel to Calico on a single-node k8s cluster is not supported, as per this issue.

I have to do a live migration from Flannel to Calico on a single node. Any suggestions on approaches are appreciated.

Siddharood
  • It's clearly stated that "Flannel migration controller would need to run on a node which is not currently migrating. It won't work on a single node cluster.". In this issue, there is also a possible workaround posted - "one thing you might be able to try is to launch a second node for the duration of the upgrade, and then scale back down to 1". – p10l Jan 25 '22 at 10:42
  • Yes, I am not planning to do this with the migration job mentioned in the documentation. I want to carry out this operation manually, probably by writing my own migration logic. Adding another node to the cluster and scaling back down to 1 is not a feasible option for me; I must do it on a single node. – Siddharood Jan 25 '22 at 12:18
  • At this point, why not just create a new cluster with calico, and move resources to it? – p10l Jan 25 '22 at 12:23
  • Thank you for the suggestions. I will have to create a new cluster with Calico, and that should work fine. However, we do have to support live migration on a single node, as we cannot lose the existing data. – Siddharood Jan 27 '22 at 04:31

2 Answers


As you already found out, migrating from Flannel to Calico on a single-node cluster is not supported. This is because the Flannel migration controller needs to be scheduled on a node that is not currently migrating, which is impossible in a single-node cluster.

This can be worked around by creating a temporary second node and scaling back down to 1 after the migration is complete.
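For reference, a minimal sketch of what adding and later removing such a temporary node might look like. It assumes a kubeadm-built cluster; the node name temp-node is hypothetical. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: assumes the cluster was created with kubeadm.
DRY_RUN="${DRY_RUN:-1}"
TEMP_NODE="${TEMP_NODE:-temp-node}"   # hypothetical name of the temporary node

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# On the existing node: print a "kubeadm join ..." line to run on the new node.
print_join_command() {
  run kubeadm token create --print-join-command
}

# After the migration has finished: evict pods from the temporary node
# and remove it from the cluster, scaling back down to 1.
remove_temp_node() {
  run kubectl drain "$1" --ignore-daemonsets --delete-emptydir-data
  run kubectl delete node "$1"
}

print_join_command
remove_temp_node "$TEMP_NODE"
```

Run the live migration from the documentation between the two calls, once the temporary node reports Ready.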

Another solution is to create a completely new cluster, install Calico, and move resources from the old cluster to the new one.

As a last resort, you can try to uninstall Flannel manually and install Calico over it.


Warning: None of the below is guaranteed to work as intended. Doing things this way is not supported by either Flannel or Calico. It may break and render your cluster unusable. Try this solution in a testing environment first, adjust it to your environment, and only then try it on prod.
You have been warned.


  1. Remove Flannel with kubectl delete -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml
  2. SSH into your node.
  3. Stop the kubelet service: systemctl stop kubelet
  4. Stop the container runtime: systemctl stop containerd (replace containerd with docker if Docker Engine is used)
  5. Remove any CNI-related directories:
    rm -rf /var/lib/cni
    rm -rf /run/flannel
    rm -rf /etc/cni
    
  6. Look for any CNI/Flannel-related interfaces and remove them. List the interfaces with
    ip link
    
    and for each such interface do the following:
    ifconfig <name of the interface from ip link> down
    ip link delete <name of the interface from ip link>
    
  7. Restart the container runtime: systemctl start containerd
  8. Restart kubelet: systemctl start kubelet
  9. Install Calico as you would on a new cluster.
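
Steps 1 to 8 can be sketched as a single script. This is a rough sketch, not a tested procedure. It assumes kubectl is configured on the node itself and that the leftover interfaces are named flannel.<n> and cni0, which is what Flannel's VXLAN backend and the bridge plugin usually create. Check ip link on your node and adjust. It uses ip link set ... down in place of ifconfig, and it only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch of the manual Flannel removal. Dry-run by default.
DRY_RUN="${DRY_RUN:-1}"
RUNTIME="${RUNTIME:-containerd}"   # assumption: set RUNTIME=docker for Docker Engine
FLANNEL_MANIFEST="https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml"

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Assumption: these name patterns cover the interfaces Flannel left behind.
is_cni_iface() {
  case "$1" in
    flannel.*|cni0) return 0 ;;
    *) return 1 ;;
  esac
}

remove_flannel() {
  run kubectl delete -f "$FLANNEL_MANIFEST"
  run systemctl stop kubelet
  run systemctl stop "$RUNTIME"
  run rm -rf /var/lib/cni /run/flannel /etc/cni

  # On Linux, /sys/class/net has one entry per network interface.
  for path in /sys/class/net/*; do
    iface="${path##*/}"
    if is_cni_iface "$iface"; then
      run ip link set "$iface" down
      run ip link delete "$iface"
    fi
  done

  run systemctl start "$RUNTIME"
  run systemctl start kubelet
}

remove_flannel
```

Review the dry-run output first, then run it as root with DRY_RUN=0 and continue with the Calico installation.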
p10l
  • Hello @Siddharood and welcome to ServerFault! Please remember to [react to answers for your questions](https://stackoverflow.com/help/someone-answers). That way we know if the answers were helpful and other community members could also benefit from them. Try to [accept answer](https://stackoverflow.com/help/accepted-answer) that is the final solution for your issue, upvote answers that are helpful and comment on those which could be improved or require additional attention. Enjoy your stay! – Wytrzymały Wiktor Jan 27 '22 at 10:25
  • @p10l thanks for the suggestions. I followed most of your steps, with some deviations. I am posting an answer below with what worked for me. – Siddharood Jan 31 '22 at 04:57

Below are the steps that worked for me while migrating from Flannel to Calico. I followed most of the steps mentioned in @p10l's answer.

  1. Remove Flannel

  2. Stop kubelet

  3. Bring the Flannel-related interfaces down with ifconfig and delete them with ip link

  4. Restart kubelet

  5. Install Calico

At first, the Calico installation did not create the calico-node pod, and the calico-kube-controllers-** pod was stuck in the PodInitializing or ContainerCreating state.

I referred to a blog that suggests updating the CIDR range in the Calico YAML, and I also had to configure IP_AUTODETECTION_METHOD as suggested here.

Finally, I restarted my node.
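
A rough sketch of those two changes, not the exact commands I ran. It assumes a manifest-based install where calico.yaml still contains the commented-out CALICO_IPV4POOL_CIDR entry with its default 192.168.0.0/16 value and where calico-node runs in kube-system. The CIDR 10.244.0.0/16 (Flannel's default) and the interface name eth0 are placeholders for your own values, and calico-patched.yaml is a hypothetical output file name.

```shell
#!/bin/sh
# Sketch only: adjust POD_CIDR and NODE_IFACE to your cluster.
DRY_RUN="${DRY_RUN:-1}"
POD_CIDR="${POD_CIDR:-10.244.0.0/16}"   # assumption: Flannel's default pod network
NODE_IFACE="${NODE_IFACE:-eth0}"        # assumption: the node's primary interface
MANIFEST="${MANIFEST:-calico.yaml}"     # assumption: locally downloaded Calico manifest

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Read a manifest on stdin, uncomment CALICO_IPV4POOL_CIDR and set it to $1.
set_pool_cidr() {
  sed -e 's|# - name: CALICO_IPV4POOL_CIDR|- name: CALICO_IPV4POOL_CIDR|' \
      -e "s|#   value: \"192.168.0.0/16\"|  value: \"$1\"|"
}

# Tell calico-node which interface to detect the node IP on.
set_autodetect() {
  run kubectl set env daemonset/calico-node -n kube-system \
    "IP_AUTODETECTION_METHOD=interface=$1"
}

if [ -f "$MANIFEST" ]; then
  set_pool_cidr "$POD_CIDR" < "$MANIFEST" > calico-patched.yaml
  run kubectl apply -f calico-patched.yaml
fi
set_autodetect "$NODE_IFACE"
```

The pool CIDR should fall within the cluster's pod network range, so set POD_CIDR to whatever your cluster was created with.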

Siddharood