Life of a Packet in Kubernetes — Part 2

Topics — Part 2

  1. Requirements
  2. Modules and their functions
  3. Routing modes
  4. Installation (Calico and calicoctl)

CNI Requirements

  1. Create a veth pair and move one end into the container network namespace (see the sketch below)
  2. Identify the right pod CIDR
  3. Create a CNI configuration file
  4. Assign and manage IP addresses
  5. Add default routes inside the container
  6. Advertise the routes to all the peer nodes (not applicable for VXLAN)
  7. Add routes on the host server
  8. Enforce network policy
(Figure: basic Kubernetes network requirements)
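
Most of these requirements can be reproduced by hand with plain ip(8) commands. Below is a minimal sketch of requirements 1, 4, 5, and 7, with a network namespace standing in for the container; the names (pod1, veth0, cali0) and the addresses are illustrative, not what Calico actually generates:

# Requirement 1: create a veth pair and move one end into the "container"
ip netns add pod1
ip link add cali0 type veth peer name veth0
ip link set veth0 netns pod1

# Requirement 4: assign the pod IP and bring both ends up
ip netns exec pod1 ip addr add 10.0.2.11/32 dev veth0
ip netns exec pod1 ip link set veth0 up
ip link set cali0 up

# Requirement 5: default route inside the container via a link-local gateway
ip netns exec pod1 ip route add 169.254.1.1 dev veth0 scope link
ip netns exec pod1 ip route add default via 169.254.1.1

# Requirement 7: host-side route to the pod, plus proxy ARP so the host
# answers the pod's ARP request for 169.254.1.1 (examined later)
ip route add 10.0.2.11/32 dev cali0 scope link
echo 1 > /proc/sys/net/ipv4/conf/cali0/proxy_arp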

BIRD (BGP)

BIRD is a per-node BGP daemon that exchanges route information with the BGP daemons running on the other nodes. The common topology is a node-to-node mesh, where each node's BGP daemon peers with every other node's (the calicoctl node status output in the demo below shows exactly this). A full mesh needs N*(N-1)/2 BGP sessions for N nodes, which is why route reflectors are recommended for large clusters.

ConfD

ConfD is a simple configuration management tool that runs in the calico-node container. It reads values (the BIRD configuration for Calico) from etcd and writes them to files on disk. It loops through the IP pools (networks and subnetworks) to apply the configuration data (CIDR keys) and assembles them in a way BIRD can use, so whenever the network changes, BIRD detects it and propagates routes to the other nodes.
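
ConfD's output can be inspected in a running calico-node pod; a minimal sketch, assuming the standard confd layout and borrowing a pod name from the demo below:

# Show the BIRD configuration that ConfD rendered from the datastore
master $ kubectl exec -n kube-system calico-node-bzttq -- cat /etc/calico/confd/config/bird.cfg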

Felix

The Calico Felix daemon runs in the calico-node container and ties the solution together by taking several actions:

  • Reads information from the Kubernetes datastore (etcd)
  • Builds the routing table
  • Programs iptables rules (when kube-proxy runs in iptables mode)
  • Programs IPVS rules (when kube-proxy runs in IPVS mode)
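
Once Calico is installed (see the demo below), Felix's work is easy to spot on a node: every chain it programs carries a cali- prefix. A small sketch, assuming the default iptables mode:

# Count the Calico-managed rules, then list one of Felix's dispatch chains
master $ iptables-save | grep -c -- '-A cali-'
master $ iptables -t filter -L cali-FORWARD
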
(Figure: deployment with a 'NoSchedule' toleration)
  1. A pod on the master tries to ping the IP address 10.0.2.11.
  2. The pod sends an ARP request to its gateway (169.254.1.1).
  3. It gets an ARP response with a MAC address.
  4. Wait, who sent the ARP response?

master $ cat /proc/sys/net/ipv4/conf/cali123/proxy_arp
1

It was the host: the host side of the veth pair (cali123) has proxy ARP enabled, so the node itself answers the ARP request with its own MAC address, and the pod sends all its traffic to the host.

  5. The packet reaches the worker node kernel.
  6. The kernel puts the packet into the destination pod's cali123 interface.

Routing Modes

Calico supports three routing modes; in this section, we will look at the pros and cons of each and where to use them.

  • IP-in-IP: default; encapsulated
  • Direct/NoEncapMode: unencapsulated (Preferred)
  • VXLAN: encapsulated (No BGP)

IP-in-IP (Default)

IP-in-IP is a simple form of encapsulation achieved by putting an IP packet inside another. A transmitted packet contains an outer header with host source and destination IPs and an inner header with pod source and destination IPs.
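
IP-in-IP is IP protocol 4, so the encapsulation can be watched on the node's underlay interface; a sketch, borrowing the interface name and addresses from the demo below:

# Outer header carries the node IPs, inner header the pod IPs
node01 $ tcpdump -ni ens3 ip proto 4
# e.g. IP 172.17.0.32 > 172.17.0.40: IP 192.168.49.66 > 192.168.196.131: ICMP echo request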

NoEncapMode

In this mode, packets are sent out of the node as if they came directly from the pod, with no outer header. Since there is no encapsulation and de-capsulation overhead, this mode is highly performant; the underlying network, however, must be able to route the pod IPs.

VXLAN

VXLAN routing is supported in Calico 3.7 and later. It encapsulates pod traffic in UDP and needs no BGP to distribute routes, which makes it useful in environments where IP-in-IP or BGP is not allowed.
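
VXLAN rides on UDP (port 4789 by default in Calico), so it can be observed on the underlay the same way; a sketch with the same illustrative interface name:

# Pod-to-pod traffic shows up as UDP datagrams between the node IPs
node01 $ tcpdump -ni ens3 udp port 4789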

Demo — IPIP and NoEncapMode

Check the cluster state before the Calico installation.

master $ kubectl get nodes
NAME STATUS ROLES AGE VERSION
controlplane NotReady master 40s v1.18.0
node01 NotReady <none> 9s v1.18.0

master $ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-66bff467f8-52tkd 0/1 Pending 0 32s
kube-system coredns-66bff467f8-g5gjb 0/1 Pending 0 32s
kube-system etcd-controlplane 1/1 Running 0 34s
kube-system kube-apiserver-controlplane 1/1 Running 0 34s
kube-system kube-controller-manager-controlplane 1/1 Running 0 34s
kube-system kube-proxy-b2j4x 1/1 Running 0 13s
kube-system kube-proxy-s46lv 1/1 Running 0 32s
kube-system kube-scheduler-controlplane 1/1 Running 0 33s
master $ cd /etc/cni
-bash: cd: /etc/cni: No such file or directory
master $ cd /opt/cni/bin
master $ ls
bridge dhcp flannel host-device host-local ipvlan loopback macvlan portmap ptp sample tuning vlan
master $ ip route
default via 172.17.0.1 dev ens3
172.17.0.0/16 dev ens3 proto kernel scope link src 172.17.0.32
172.18.0.0/24 dev docker0 proto kernel scope link src 172.18.0.1 linkdown
curl https://docs.projectcalico.org/manifests/calico.yaml -O
kubectl apply -f calico.yaml
cni_network_config: |-
  {
    "name": "k8s-pod-network",
    "cniVersion": "0.3.1",
    "plugins": [
      {
        "type": "calico",                        >>> Calico's CNI plugin
        "log_level": "info",
        "log_file_path": "/var/log/calico/cni/cni.log",
        "datastore_type": "kubernetes",
        "nodename": "__KUBERNETES_NODE_NAME__",
        "mtu": __CNI_MTU__,
        "ipam": {
            "type": "calico-ipam"                >>> Calico's IPAM instead of the default IPAM
        },
        "policy": {
            "type": "k8s"
        },
        "kubernetes": {
            "kubeconfig": "__KUBECONFIG_FILEPATH__"
        }
      },
      {
        "type": "portmap",
        "snat": true,
        "capabilities": {"portMappings": true}
      },
      {
        "type": "bandwidth",
        "capabilities": {"bandwidth": true}
      }
    ]
  }
# Enable IPIP
- name: CALICO_IPV4POOL_IPIP
  value: "Always"    >> Set this to 'Never' to disable IP-in-IP
# Enable or Disable VXLAN on the default IP pool.
- name: CALICO_IPV4POOL_VXLAN
  value: "Never"
master $ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-799fb94867-6qj77 0/1 ContainerCreating 0 21s
kube-system calico-node-bzttq 0/1 PodInitializing 0 21s
kube-system calico-node-r6bwj 0/1 PodInitializing 0 21s
kube-system coredns-66bff467f8-52tkd 0/1 Pending 0 7m5s
kube-system coredns-66bff467f8-g5gjb 0/1 ContainerCreating 0 7m5s
kube-system etcd-controlplane 1/1 Running 0 7m7s
kube-system kube-apiserver-controlplane 1/1 Running 0 7m7s
kube-system kube-controller-manager-controlplane 1/1 Running 0 7m7s
kube-system kube-proxy-b2j4x 1/1 Running 0 6m46s
kube-system kube-proxy-s46lv 1/1 Running 0 7m5s
kube-system kube-scheduler-controlplane 1/1 Running 0 7m6s
master $ kubectl get nodes
NAME STATUS ROLES AGE VERSION
controlplane Ready master 7m30s v1.18.0
node01 Ready <none> 6m59s v1.18.0
master $ cd /etc/cni/net.d/
master $ ls
10-calico.conflist calico-kubeconfig
master $
master $
master $ cat 10-calico.conflist
{
  "name": "k8s-pod-network",
  "cniVersion": "0.3.1",
  "plugins": [
    {
      "type": "calico",
      "log_level": "info",
      "log_file_path": "/var/log/calico/cni/cni.log",
      "datastore_type": "kubernetes",
      "nodename": "controlplane",
      "mtu": 1440,
      "ipam": {
          "type": "calico-ipam"
      },
      "policy": {
          "type": "k8s"
      },
      "kubernetes": {
          "kubeconfig": "/etc/cni/net.d/calico-kubeconfig"
      }
    },
    {
      "type": "portmap",
      "snat": true,
      "capabilities": {"portMappings": true}
    },
    {
      "type": "bandwidth",
      "capabilities": {"bandwidth": true}
    }
  ]
}
master $ cd /opt/cni/bin
master $ ls
bandwidth bridge calico calico-ipam dhcp flannel host-device host-local install ipvlan loopback macvlan portmap ptp sample tuning vlan
master $
master $ cd /usr/local/bin/
master $ curl -O -L https://github.com/projectcalico/calicoctl/releases/download/v3.16.3/calicoctl
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   633  100   633    0     0   3087      0 --:--:-- --:--:-- --:--:--  3087
100 38.4M  100 38.4M    0     0  5072k      0  0:00:07  0:00:07 --:--:-- 4325k
master $ chmod +x calicoctl
master $ export DATASTORE_TYPE=kubernetes
master $ export KUBECONFIG=~/.kube/config
# Check the endpoints; it will be empty, as we haven't deployed any pods yet
master $ calicoctl get workloadendpoints
WORKLOAD NODE NETWORKS INTERFACE
master $
master $ calicoctl node status
Calico process is running.
IPv4 BGP status
+--------------+-------------------+-------+----------+-------------+
| PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
+--------------+-------------------+-------+----------+-------------+
| 172.17.0.40 | node-to-node mesh | up | 00:24:04 | Established |
+--------------+-------------------+-------+----------+-------------+
Deploy two busybox replicas; the 'NoSchedule' toleration lets one of them land on the master node:

cat > busybox.yaml <<"EOF"
apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox-deployment
spec:
  selector:
    matchLabels:
      app: busybox
  replicas: 2
  template:
    metadata:
      labels:
        app: busybox
    spec:
      tolerations:
      - key: "node-role.kubernetes.io/master"
        operator: "Exists"
        effect: "NoSchedule"
      containers:
      - name: busybox
        image: busybox
        command: ["sleep"]
        args: ["10000"]
EOF
master $ kubectl apply -f busybox.yaml
deployment.apps/busybox-deployment created
master $ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
busybox-deployment-8c7dc8548-btnkv 1/1 Running 0 6s 192.168.196.131 node01 <none> <none>
busybox-deployment-8c7dc8548-x6ljh 1/1 Running 0 6s 192.168.49.66 controlplane <none> <none>
master $ calicoctl get workloadendpoints
WORKLOAD NODE NETWORKS INTERFACE
busybox-deployment-8c7dc8548-btnkv node01 192.168.196.131/32 calib673e730d42
busybox-deployment-8c7dc8548-x6ljh controlplane 192.168.49.66/32 cali9861acf9f07
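
Each pod received a /32 out of a per-node /26 block handed out by Calico IPAM (the blackhole routes further below cover exactly these blocks). The allocations can be listed directly; a sketch, assuming the --show-blocks flag of recent calicoctl releases:

master $ calicoctl ipam show --show-blocks
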
master $ ifconfig cali9861acf9f07
cali9861acf9f07: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1440
inet6 fe80::ecee:eeff:feee:eeee prefixlen 64 scopeid 0x20<link>
ether ee:ee:ee:ee:ee:ee txqueuelen 0 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 5 bytes 446 (446.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
master $ kubectl exec busybox-deployment-8c7dc8548-x6ljh -- ifconfig
eth0 Link encap:Ethernet HWaddr 92:7E:C4:15:B9:82
inet addr:192.168.49.66 Bcast:192.168.49.66 Mask:255.255.255.255
UP BROADCAST RUNNING MULTICAST MTU:1440 Metric:1
RX packets:5 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:446 (446.0 B) TX bytes:0 (0.0 B)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
master $ kubectl exec busybox-deployment-8c7dc8548-x6ljh -- ip route
default via 169.254.1.1 dev eth0
169.254.1.1 dev eth0 scope link

master $ kubectl exec busybox-deployment-8c7dc8548-x6ljh -- arp
master $
master $ ip route
default via 172.17.0.1 dev ens3
172.17.0.0/16 dev ens3 proto kernel scope link src 172.17.0.32
172.18.0.0/24 dev docker0 proto kernel scope link src 172.18.0.1 linkdown
blackhole 192.168.49.64/26 proto bird
192.168.49.65 dev calic22dbe57533 scope link
192.168.49.66 dev cali9861acf9f07 scope link
192.168.196.128/26 via 172.17.0.40 dev tunl0 proto bird onlink

Note the last route: traffic for node01's pod block leaves via tunl0, i.e., it is IP-in-IP encapsulated, while the blackhole route drops packets for unallocated addresses in the node's own /26 block.
master $ kubectl exec busybox-deployment-8c7dc8548-x6ljh -- ping 192.168.196.131 -c 1
PING 192.168.196.131 (192.168.196.131): 56 data bytes
64 bytes from 192.168.196.131: seq=0 ttl=62 time=0.823 ms
master $ kubectl exec busybox-deployment-8c7dc8548-x6ljh -- arp
? (169.254.1.1) at ee:ee:ee:ee:ee:ee [ether] on eth0
master $ cat /proc/sys/net/ipv4/conf/cali9861acf9f07/proxy_arp
1
node01 $ ip route
default via 172.17.0.1 dev ens3
172.17.0.0/16 dev ens3 proto kernel scope link src 172.17.0.40
172.18.0.0/24 dev docker0 proto kernel scope link src 172.18.0.1 linkdown
192.168.49.64/26 via 172.17.0.32 dev tunl0 proto bird onlink
blackhole 192.168.196.128/26 proto bird
192.168.196.129 dev calid4f00d97cb5 scope link
192.168.196.130 dev cali257578b48b6 scope link
192.168.196.131 dev calib673e730d42 scope link
master $ calicoctl get ippool default-ipv4-ippool -o yaml > ippool.yaml
master $ vi ippool.yaml    # change ipipMode from 'Always' to 'Never'
master $ calicoctl apply -f ippool.yaml
Successfully applied 1 'IPPool' resource(s)
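
For reference, the edited pool looks roughly like this; a sketch of the v3 IPPool resource, where fields such as natOutgoing reflect common defaults and may differ in your cluster:

apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: default-ipv4-ippool
spec:
  cidr: 192.168.0.0/16
  ipipMode: Never      # was "Always"; this switches the pool to NoEncapMode
  vxlanMode: Never
  natOutgoing: true
  nodeSelector: all()
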
master $ ip route
default via 172.17.0.1 dev ens3
172.17.0.0/16 dev ens3 proto kernel scope link src 172.17.0.32
172.18.0.0/24 dev docker0 proto kernel scope link src 172.18.0.1 linkdown
blackhole 192.168.49.64/26 proto bird
192.168.49.65 dev calic22dbe57533 scope link
192.168.49.66 dev cali9861acf9f07 scope link
192.168.196.128/26 via 172.17.0.40 dev ens3 proto bird

The same route now points at the peer node via ens3 instead of tunl0, so pod packets leave the host unencapsulated.
master $ kubectl exec busybox-deployment-8c7dc8548-x6ljh -- ping 192.168.196.131 -c 1
PING 192.168.196.131 (192.168.196.131): 56 data bytes
64 bytes from 192.168.196.131: seq=0 ttl=62 time=0.653 ms
--- 192.168.196.131 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.653/0.653/0.653 ms

Note: the source/destination check must be disabled on AWS instances to use this mode.

Demo — VXLAN

Re-initialize the cluster and download the calico.yaml file again, then apply the following changes:

  1. Remove -bird-live and -bird-ready from the livenessProbe and readinessProbe (BIRD does not run in this mode)

livenessProbe:
  exec:
    command:
    - /bin/calico-node
    - -felix-live
    - -bird-live        >> Remove this
  periodSeconds: 10
  initialDelaySeconds: 10
  failureThreshold: 6
readinessProbe:
  exec:
    command:
    - /bin/calico-node
    - -felix-ready
    - -bird-ready       >> Remove this
  2. Change calico_backend to "vxlan" in the calico-config ConfigMap

kind: ConfigMap
apiVersion: v1
metadata:
  name: calico-config
  namespace: kube-system
data:
  # Typha is disabled.
  typha_service_name: "none"
  # Configure the backend to use.
  calico_backend: "vxlan"
  3. Disable IPIP and enable VXLAN on the default IP pool

# Disable IPIP
- name: CALICO_IPV4POOL_IPIP
  value: "Never"      >> 'Never' disables IP-in-IP
# Enable or Disable VXLAN on the default IP pool.
- name: CALICO_IPV4POOL_VXLAN
  value: "Always"     >> 'Always' enables VXLAN
master $ ip route
default via 172.17.0.1 dev ens3
172.17.0.0/16 dev ens3 proto kernel scope link src 172.17.0.15
172.18.0.0/24 dev docker0 proto kernel scope link src 172.18.0.1 linkdown
192.168.49.65 dev calif5cc38277c7 scope link
192.168.49.66 dev cali840c047460a scope link
192.168.196.128/26 via 192.168.196.128 dev vxlan.calico onlink
master $ ifconfig vxlan.calico
vxlan.calico: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1440
inet 192.168.196.128 netmask 255.255.255.255 broadcast 192.168.196.128
inet6 fe80::64aa:99ff:fe2f:dc24 prefixlen 64 scopeid 0x20<link>
ether 66:aa:99:2f:dc:24 txqueuelen 0 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 11 overruns 0 carrier 0 collisions 0
master $ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
busybox-deployment-8c7dc8548-8bxnw 1/1 Running 0 11s 192.168.49.67 controlplane <none> <none>
busybox-deployment-8c7dc8548-kmxst 1/1 Running 0 11s 192.168.196.130 node01 <none> <none>
master $ kubectl exec busybox-deployment-8c7dc8548-8bxnw -- ip route
default via 169.254.1.1 dev eth0
169.254.1.1 dev eth0 scope link
master $ kubectl exec busybox-deployment-8c7dc8548-8bxnw -- arp
master $ kubectl exec busybox-deployment-8c7dc8548-8bxnw -- ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: seq=0 ttl=116 time=3.786 ms
^C
master $ kubectl exec busybox-deployment-8c7dc8548-8bxnw -- arp
? (169.254.1.1) at ee:ee:ee:ee:ee:ee [ether] on eth0
master $

Disclaimer

This article does not provide any technical advice or recommendations; all views expressed are my own and not those of my employer.

References

https://docs.projectcalico.org/
https://www.openstack.org/videos/summits/vancouver-2018/kubernetes-networking-with-calico-deep-dive
https://kubernetes.io/
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.2/manage_network/calico.html
https://github.com/coreos/flannel
