Life of a Packet in Kubernetes — Part 3

This is part 3 of the series on Life of a Packet in Kubernetes. This time we'll look at the role of the kube-proxy component in a Kubernetes environment and how it uses iptables to control traffic between Services and Pods.

Note: There are many other plugins/tools to control the traffic flow, but in this article we will look at the kube-proxy + iptables combo.

We’ll start with the various communication models provided by Kubernetes and their implementation. If you are already familiar with the ‘Service’, ‘ClusterIP’ and ‘NodePort’ concepts, feel free to jump to the kube-proxy/iptables section.

Part 1 Basic container networking

Part 2 — Calico CNI

Part 3:

Pod-to-Pod

kube-proxy is not involved in Pod-to-Pod communication, as the CNI configures the Node and Pod with the required routes. All containers can communicate with all other containers without NAT; all nodes can communicate with all containers (and vice-versa) without NAT.

Note: A Pod’s IP address is not static (there are ways to get a static IP, but the default configuration doesn’t guarantee one). The CNI allocates a new IP address on Pod restart, because the CNI doesn’t maintain a mapping between Pods and IP addresses. Also, the Pod name itself is not static in the case of a Deployment.

In practice, the Pods in a Deployment should sit behind a load-balancer type of entity to expose the application, since the app is stateless and more than one Pod hosts it. In Kubernetes, that load-balancer entity is called a ‘Service’.

Pod-to-external

For traffic that goes from a Pod to an external address, Kubernetes uses SNAT: it replaces the Pod’s internal source IP:port with the host’s IP:port. When the return packet comes back to the host, it rewrites the destination to the Pod’s IP:port and forwards it to the original Pod. The whole process is transparent to the Pod, which is unaware of the address translation.
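As a rough illustration, this SNAT is usually implemented with a MASQUERADE rule in the nat table’s POSTROUTING chain. The rule below is only a sketch; the pod CIDR (10.244.0.0/16) is an assumption, and the real rules are installed by kube-proxy/your CNI rather than by hand:

# Masquerade traffic leaving the pod network for destinations outside the cluster (sketch)
node-1# iptables -t nat -A POSTROUTING -s 10.244.0.0/16 ! -d 10.244.0.0/16 -j MASQUERADE
# Inspect what is actually installed on a node
node-1# iptables -t nat -S POSTROUTING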

Pod-to-Service

ClusterIP

Kubernetes has a concept called “service,” which is simply an L4 load balancer in front of pods. There are several different types of services. The most basic type is called ClusterIP. This type of service has a unique VIP address that is only routable inside the cluster.

It would not be easy to send traffic to a particular application using just pod IPs. The dynamic nature of a Kubernetes cluster means pods can be moved, restarted, upgraded, or scaled in and out of existence. Additionally, some services will have many replicas, so we need some way to load balance between them.

Kubernetes solves this problem with Services. A Service is an API object that maps a single virtual IP (VIP) to a set of pod IPs. Additionally, Kubernetes provides a DNS entry for each service’s name and virtual IP so that services can be easily addressed by name.

The mapping of virtual IPs to pod IPs within the cluster is coordinated by the kube-proxy process on each node. This process sets up either iptables or IPVS to automatically translate VIPs into pod IPs before sending the packet out to the cluster network. Individual connections are tracked, so packets can be properly de-translated when they return. IPVS and iptables can both load-balance a single Service virtual IP across multiple pod IPs, though IPVS has much more flexibility in the load-balancing algorithms it can use. The virtual IP doesn’t actually exist on any system interface; it lives in iptables.
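To see which mode kube-proxy is using on a node, one option (assuming default ports; the ConfigMap name applies to kubeadm-style clusters) is:

# kube-proxy reports its mode on the metrics port (default 10249)
node-1# curl -s http://localhost:10249/proxyMode
iptables
# On kubeadm-based clusters, the mode is also visible in the kube-proxy ConfigMap
master # kubectl -n kube-system get configmap kube-proxy -o yaml | grep "mode:"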

‘Service’ definition from the Kubernetes document — An abstract way to expose an application running on a set of Pods as a network service. With Kubernetes you don’t need to modify your application to use an unfamiliar service discovery mechanism. Kubernetes gives Pods their own IP addresses and a single DNS name for a set of Pods, and can load-balance across them.

FrontEnd Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
  labels:
    app: webapp
spec:
  replicas: 2
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 80

Backend Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: auth
  labels:
    app: auth
spec:
  replicas: 2
  selector:
    matchLabels:
      app: auth
  template:
    metadata:
      labels:
        app: auth
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 80

Service:

---
apiVersion: v1
kind: Service
metadata:
  name: frontend
  labels:
    app: frontend
spec:
  ports:
  - port: 80
    protocol: TCP
  type: ClusterIP
  selector:
    app: webapp
---
apiVersion: v1
kind: Service
metadata:
  name: backend
  labels:
    app: backend
spec:
  ports:
  - port: 80
    protocol: TCP
  type: ClusterIP
  selector:
    app: auth
...

Now the FrontEnd Pods can connect to the backend via the ClusterIP or the DNS entry added by Kubernetes. A cluster-aware DNS server, such as CoreDNS, watches the Kubernetes API for new Services and creates a set of DNS records for each one. If DNS has been enabled throughout your cluster, all Pods should automatically resolve Services by their DNS name.
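For example, from inside one of the frontend Pods (the Pod name and the resolved address below are placeholders):

master # kubectl exec -it <frontend-pod> -- nslookup backend
Name:    backend.default.svc.cluster.local
Address: <backend ClusterIP>
master # kubectl exec -it <frontend-pod> -- curl http://backend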

NodePort (External-to-Pod)

Now we have DNS that can be used for communication between the Services inside the cluster. However, external requests can’t reach a Service that lives inside the cluster, as its IP address is virtual and private.

Let’s try to reach the FrontEnd Pod IP address from an external server. (Note: at this point, no externally reachable Service has been created for the FrontEnd application.)

NodePort 1.1

We can’t reach the Pod IP, as it is a private IP address that isn’t routable from outside the cluster.

Let’s create a NodePort Service to expose the FrontEnd service to the external world. If you set the type field to NodePort, the Kubernetes control plane allocates a port from the range specified by the --service-node-port-range flag (default: 30000-32767). Each node proxies that port (the same port number on every node) into your Service. Your Service reports the allocated port in its .spec.ports[*].nodePort field.

---
apiVersion: v1
kind: Service
metadata:
  name: frontend
spec:
  type: NodePort
  selector:
    app: webapp
  ports:
    # By default and for convenience, the `targetPort` is set to the same value as the `port` field.
    - port: 80
      targetPort: 80
      # Optional field
      # By default and for convenience, the Kubernetes control plane will allocate a port from a range (default: 30000-32767)
      nodePort: 31380
...
NodePort 1.2

Now we can access the frontend service via <anyClusterNode>:<nodePort>. If you want a specific port number, you can specify a value in the nodePort field. The control plane will either allocate you that port or report that the API transaction failed. This means that you need to take care of possible port collisions yourself. You also have to use a valid port number, one that's inside the range configured for NodePort use.
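For example, using the NodePort from the manifest above (the node IP is a placeholder):

# Any cluster node's IP works as the entry point
external-host $ curl http://<anyClusterNodeIP>:31380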

External Traffic Policy

externalTrafficPolicy denotes whether this Service desires to route external traffic to node-local or cluster-wide endpoints. “Local” preserves the client source IP and avoids a second hop for NodePort-type Services, but risks potentially imbalanced traffic spreading. “Cluster” obscures the client source IP and may cause a second hop to another node, but should have good overall load-balancing.

Cluster Traffic Policy

This is the default external traffic policy for Kubernetes Services. The assumption here is that you always want to route traffic to all pods (across all the nodes) running a service with equal distribution.

One of the caveats of using this policy is that you may see unnecessary network hops between nodes as you ingress external traffic. For example, if you receive external traffic via a NodePort, the NodePort SVC may (randomly) route traffic to a pod on another host when it could have routed traffic to a pod on the same host, avoiding that extra hop out to the network.

The packet flow with the Cluster traffic policy is as follows:

NodePort 1.3

Local Traffic Policy

With this external traffic policy, kube-proxy adds rules for the NodePort (30000–32767) that target only the Pods that exist on the same node (local), instead of every Pod backing the Service regardless of where it is placed.

You’ll notice that if you try to set externalTrafficPolicy: Local on your Service, the Kubernetes API requires that the Service be of type LoadBalancer or NodePort. This is because the “Local” external traffic policy is only relevant for external traffic, which only applies to those two types.

If you set service.spec.externalTrafficPolicy to the value Local, kube-proxy only proxies requests to local endpoints and does not forward traffic to other nodes. This approach preserves the original source IP address. If there are no local endpoints, packets sent to the node are dropped, so you can rely on the correct source IP in any packet-processing rules you might apply to a packet that makes it through to the endpoint.

---
apiVersion: v1
kind: Service
metadata:
  name: frontend
spec:
  type: NodePort
  externalTrafficPolicy: Local
  selector:
    app: webapp
  ports:
    # By default and for convenience, the `targetPort` is set to the same value as the `port` field.
    - port: 80
      targetPort: 80
      # Optional field
      # By default and for convenience, the Kubernetes control plane will allocate a port from a range (default: 30000-32767)
      nodePort: 31380
...

The packet flow with the Local traffic policy is as follows:

NodePort 1.4
NodePort 1.5

Local traffic policy in LoadBalancer Service type

If you’re running on Google Kubernetes Engine/GCE, setting service.spec.externalTrafficPolicy to Local forces nodes without Service endpoints to remove themselves from the list of nodes eligible for load-balanced traffic by deliberately failing health checks, so there won’t be any traffic drops. This model is great for applications that ingress a lot of external traffic and want to avoid unnecessary hops on the network to reduce latency. We can also preserve true client IPs, since we no longer need to SNAT traffic from a proxying node! However, the biggest downside of the “Local” external traffic policy, as mentioned in the Kubernetes docs, is that traffic to your application may be imbalanced.
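Those health checks target a dedicated port that Kubernetes allocates only for LoadBalancer Services with externalTrafficPolicy: Local; assuming a Service named frontend, you can read it with:

# Health-check node port used by the cloud load balancer's probes
master # kubectl get svc frontend -o jsonpath='{.spec.healthCheckNodePort}'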

Kube-Proxy (iptables mode)

The component in Kubernetes that implements ‘Service’ is called kube-proxy. It sits on every node and programs complicated iptables rules to do all kinds of filtering and NAT between pods and services. If you go to a Kubernetes node and type iptables-save, you’ll see the rules inserted by Kubernetes or other programs. The most important chains are KUBE-SERVICES, KUBE-SVC-* and KUBE-SEP-*.

For DNAT, conntrack kicks in and tracks the connection state using a state machine. The state is needed because the kernel has to remember the destination address it changed, and change it back when the return packet comes through. iptables can also rely on the conntrack state (ctstate) to decide the fate of a packet. Four conntrack states are especially important: NEW (the first packet of a connection conntrack hasn’t seen before), ESTABLISHED (conntrack has seen traffic in both directions), RELATED (the packet starts a new connection but is associated with an existing one, such as an ICMP error or an FTP data connection), and INVALID (the packet can’t be matched to any tracked connection).
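If the conntrack command-line tool is available on the node, you can inspect these tracked translations directly; a small sketch (the VIP 10.103.46.104 is taken from the webapp Service used later in this article):

# List tracked connections whose original destination is the Service VIP
node-1# conntrack -L -d 10.103.46.104
# Each entry shows the original tuple (pod -> VIP:80) and the reply tuple
# (backend pod IP -> pod), which is how the DNAT is reversed on the way back.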

This is how a TCP connection works between a Pod and a Service; the sequence of events is shown in the GIF visualization below.

GIF visualization:

iptables

In the Linux operating system, firewalling is taken care of by netfilter, a kernel module that decides which packets are allowed in or out. iptables is just the interface to netfilter. The two are often thought of as the same thing; a better perspective is to think of netfilter as the backend and iptables as the frontend.

chains

Each chain is responsible for a specific task. The built-in chains are PREROUTING, INPUT, FORWARD, OUTPUT and POSTROUTING, corresponding to the points where netfilter hooks into the packet path.

The FORWARD chain only works if ip_forward is enabled on the Linux server; that’s the reason the following command is important while setting up and debugging a Kubernetes cluster.

node-1# sysctl -w net.ipv4.ip_forward=1
net.ipv4.ip_forward = 1
node-1# cat /proc/sys/net/ipv4/ip_forward
1

The above change is not persistent. To permanently enable the IP forwarding on your Linux system, edit /etc/sysctl.conf and add the following line:

net.ipv4.ip_forward = 1
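Then reload the settings from the file (or reboot) to apply them:

node-1# sysctl -p
net.ipv4.ip_forward = 1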

tables

We are going to focus on the NAT table, but the following tables are available: filter (the default table, where packets are accepted or dropped), nat (for SNAT/DNAT), mangle (for altering packet headers), raw (for marking packets that should bypass connection tracking) and security (used by SELinux for packet marking).

Please read THIS article for more detailed info on iptables.

iptables configuration in Kubernetes

Let’s deploy an Nginx application with a replica count of two in minikube and dump the iptables rules.
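One minimal way to reproduce a similar setup (names chosen to match the output below; note that kubectl expose picks a random NodePort, so the exact port 31380 shown later would need to be set in the Service manifest):

master # kubectl create deployment webapp --image=nginx
master # kubectl scale deployment webapp --replicas=2
master # kubectl expose deployment webapp --type=NodePort --port=80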

ServiceType: NodePort

master # kubectl get svc webapp
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
webapp NodePort 10.103.46.104 <none> 80:31380/TCP 3d13h
master # kubectl get ep webapp
NAME ENDPOINTS AGE
webapp 10.244.120.102:80,10.244.120.103:80 3d13h
master #

The ClusterIP doesn’t exist anywhere as a real interface address; it’s a virtual IP that exists only in iptables, and Kubernetes adds a DNS entry for it in CoreDNS.

master # kubectl exec -i -t dnsutils -- nslookup webapp.default
Server:    10.96.0.10
Address:   10.96.0.10#53

Name:      webapp.default.svc.cluster.local
Address:   10.103.46.104

To hook into packet filtering and NAT, Kubernetes creates a custom chain KUBE-SERVICES in iptables and redirects all PREROUTING and OUTPUT traffic to it; refer to the output below.

$ sudo iptables -t nat -L PREROUTING | column -t
Chain PREROUTING (policy ACCEPT)
target prot opt source destination
cali-PREROUTING all -- anywhere anywhere /* cali:6gwbT8clXdHdC1b1 */
KUBE-SERVICES all -- anywhere anywhere /* kubernetes service portals */
DOCKER all -- anywhere anywhere ADDRTYPE match dst-type LOCAL

After hooking into packet filtering and NAT with the KUBE-SERVICES chain, Kubernetes can inspect traffic to its Services and apply SNAT/DNAT accordingly. At the end of the KUBE-SERVICES chain, it installs another custom chain, KUBE-NODEPORTS, to handle traffic for Services of type NodePort.

If the traffic is for the ClusterIP, the KUBE-SVC-2IRACUALRELARSND chain processes it; otherwise, the next chain, KUBE-NODEPORTS, handles it.

$ sudo iptables -t nat -L KUBE-SERVICES | column -t
Chain KUBE-SERVICES (2 references)
target prot opt source destination
KUBE-MARK-MASQ tcp -- !10.244.0.0/16 10.103.46.104 /* default/webapp cluster IP */ tcp dpt:www
KUBE-SVC-2IRACUALRELARSND tcp -- anywhere 10.103.46.104 /* default/webapp cluster IP */ tcp dpt:www
KUBE-NODEPORTS all -- anywhere anywhere /* kubernetes service nodeports; NOTE: this must be the last rule in this chain */ ADDRTYPE match dst-type LOCAL

Let’s check which chains are part of KUBE-NODEPORTS:

$ sudo iptables -t nat -L KUBE-NODEPORTS | column -t
Chain KUBE-NODEPORTS (1 references)
target prot opt source destination
KUBE-MARK-MASQ tcp -- anywhere anywhere /* default/webapp */ tcp dpt:31380
KUBE-SVC-2IRACUALRELARSND tcp -- anywhere anywhere /* default/webapp */ tcp dpt:31380

From this point, the processing is the same for ClusterIP and NodePort. Please take a look at the iptables flow that follows.

# statistic  mode  random -> Random load-balancing between endpoints.
$ sudo iptables -t nat -L KUBE-SVC-2IRACUALRELARSND | column -t
Chain KUBE-SVC-2IRACUALRELARSND (2 references)
target prot opt source destination
KUBE-SEP-AO6KYGU752IZFEZ4 all -- anywhere anywhere /* default/webapp */ statistic mode random probability 0.50000000000
KUBE-SEP-PJFBSHHDX4VZAOXM all -- anywhere anywhere /* default/webapp */

$ sudo iptables -t nat -L KUBE-SEP-AO6KYGU752IZFEZ4 | column -t
Chain KUBE-SEP-AO6KYGU752IZFEZ4 (1 references)
target prot opt source destination
KUBE-MARK-MASQ all -- 10.244.120.102 anywhere /* default/webapp */
DNAT tcp -- anywhere anywhere /* default/webapp */ tcp to:10.244.120.102:80

$ sudo iptables -t nat -L KUBE-SEP-PJFBSHHDX4VZAOXM | column -t
Chain KUBE-SEP-PJFBSHHDX4VZAOXM (1 references)
target prot opt source destination
KUBE-MARK-MASQ all -- 10.244.120.103 anywhere /* default/webapp */
DNAT tcp -- anywhere anywhere /* default/webapp */ tcp to:10.244.120.103:80

$ sudo iptables -t nat -L KUBE-MARK-MASQ | column -t
Chain KUBE-MARK-MASQ (24 references)
target prot opt source destination
MARK all -- anywhere anywhere MARK or 0x4000

Note: Trimmed the output to show only the required rules for readability.

ClusterIP:

KUBE-SERVICES → KUBE-SVC-XXX → KUBE-SEP-XXX

NodePort:

KUBE-SERVICES → KUBE-NODEPORTS → KUBE-SVC-XXX → KUBE-SEP-XXX

Note: The NodePort service will have a ClusterIP assigned to handle internal and external traffic.

Visual representation of the above iptables rules:

ExternalTrafficPolicy: Local

As discussed before, using “externalTrafficPolicy: Local” will preserve the source IP and drop packets arriving at a node that has no local endpoint. Let’s take a look at the iptables rules on a node with no local endpoint.

master # kubectl get nodes
NAME STATUS ROLES AGE VERSION
minikube Ready master 6d1h v1.19.2
minikube-m02 Ready <none> 85m v1.19.2

Deploy Nginx with externalTrafficPolicy Local.
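One way to switch an existing Service to the Local policy (Service name taken from the earlier example):

master # kubectl patch svc webapp -p '{"spec":{"externalTrafficPolicy":"Local"}}'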

master # kubectl get pods nginx-deployment-7759cc5c66-p45tz -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-deployment-7759cc5c66-p45tz 1/1 Running 0 29m 10.244.120.111 minikube <none> <none>

Check externalTrafficPolicy,

master # kubectl get svc webapp -o wide -o jsonpath={.spec.externalTrafficPolicy}
Local

Get the service,

master # kubectl get svc webapp -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
webapp NodePort 10.111.243.62 <none> 80:30080/TCP 29m app=webserver

Let’s check the iptables rules on node minikube-m02; there should be a DROP rule to drop the packets, as there is no local endpoint.

$ sudo iptables -t nat -L KUBE-NODEPORTS
Chain KUBE-NODEPORTS (1 references)
target                     prot opt source       destination
KUBE-MARK-MASQ             tcp  --  127.0.0.0/8  anywhere     /* default/webapp */ tcp dpt:30080
KUBE-XLB-2IRACUALRELARSND  tcp  --  anywhere     anywhere     /* default/webapp */ tcp dpt:30080

Check KUBE-XLB-2IRACUALRELARSND chain,

$ sudo iptables -t nat -L KUBE-XLB-2IRACUALRELARSND
Chain KUBE-XLB-2IRACUALRELARSND (1 references)
target                     prot opt source         destination
KUBE-SVC-2IRACUALRELARSND  all  --  10.244.0.0/16  anywhere     /* Redirect pods trying to reach external loadbalancer VIP to clusterIP */
KUBE-MARK-MASQ             all  --  anywhere       anywhere     /* masquerade LOCAL traffic for default/webapp LB IP */ ADDRTYPE match src-type LOCAL
KUBE-SVC-2IRACUALRELARSND  all  --  anywhere       anywhere     /* route LOCAL traffic for default/webapp LB IP to service chain */ ADDRTYPE match src-type LOCAL
KUBE-MARK-DROP             all  --  anywhere       anywhere     /* default/webapp has no local endpoints */

If you take a closer look, there is no issue with the cluster-level traffic (via the ClusterIP); only the NodePort traffic will be dropped on this node.
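A quick way to see this from outside the cluster (node IPs are placeholders):

# Node without a local endpoint: NodePort traffic is dropped, the connection times out
external-host $ curl --connect-timeout 5 http://<minikube-m02 IP>:30080
# Node hosting the pod: the request is answered
external-host $ curl http://<minikube IP>:30080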

The ‘minikube’ node’s iptables rules:

$ sudo iptables -t nat -L KUBE-NODEPORTS
Chain KUBE-NODEPORTS (1 references)
target                     prot opt source       destination
KUBE-MARK-MASQ             tcp  --  127.0.0.0/8  anywhere     /* default/webapp */ tcp dpt:30080
KUBE-XLB-2IRACUALRELARSND  tcp  --  anywhere     anywhere     /* default/webapp */ tcp dpt:30080

$ sudo iptables -t nat -L KUBE-XLB-2IRACUALRELARSND
Chain KUBE-XLB-2IRACUALRELARSND (1 references)
target                     prot opt source         destination
KUBE-SVC-2IRACUALRELARSND  all  --  10.244.0.0/16  anywhere     /* Redirect pods trying to reach external loadbalancer VIP to clusterIP */
KUBE-MARK-MASQ             all  --  anywhere       anywhere     /* masquerade LOCAL traffic for default/webapp LB IP */ ADDRTYPE match src-type LOCAL
KUBE-SVC-2IRACUALRELARSND  all  --  anywhere       anywhere     /* route LOCAL traffic for default/webapp LB IP to service chain */ ADDRTYPE match src-type LOCAL
KUBE-SEP-5T4S2ILYSXWY3R2J  all  --  anywhere       anywhere     /* Balancing rule 0 for default/webapp */

$ sudo iptables -t nat -L KUBE-SVC-2IRACUALRELARSND
Chain KUBE-SVC-2IRACUALRELARSND (3 references)
target                     prot opt source    destination
KUBE-SEP-5T4S2ILYSXWY3R2J  all  --  anywhere  anywhere     /* default/webapp */

Headless Services

-Copied from Kubernetes documentation-

Sometimes you don’t need load-balancing and a single service IP. In this case, you can create what is termed “headless” Services by explicitly specifying "None" for the cluster IP (.spec.clusterIP).

You can use a headless Service to interface with other service discovery mechanisms without being tied to Kubernetes’ implementation.

For headless Services, a cluster IP is not allocated, kube-proxy does not handle these Services, and there is no load balancing or proxying done by the platform. How DNS is automatically configured depends on whether the Service has selectors defined:

With selectors

For headless services that define selectors, the endpoints controller creates Endpoints records in the API, and modifies the DNS configuration to return records (addresses) that point directly to the Pods backing the Service.
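A sketch of such a headless Service is below (the selector is assumed to match the webapp Pods shown in the output that follows):

apiVersion: v1
kind: Service
metadata:
  name: webapp-hs
spec:
  clusterIP: None      # 'None' makes the Service headless
  selector:
    app: webapp
  ports:
  - port: 80
    protocol: TCP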

master # kubectl get svc webapp-hs
NAME        TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
webapp-hs   ClusterIP   None         <none>        80/TCP    24s
master # kubectl get ep webapp-hs
NAME        ENDPOINTS                             AGE
webapp-hs   10.244.120.109:80,10.244.120.110:80   31s

Without selectors

For headless services that do not define selectors, the endpoints controller does not create Endpoints records. However, the DNS system looks for and configures either CNAME records for ExternalName-type Services, or A records for any Endpoints that share a name with the Service, for all other types.

If there are external IPs that route to one or more cluster nodes, Kubernetes Services can be exposed on those externalIPs. Traffic that ingresses into the cluster with the external IP (as the destination IP) on the Service port will be routed to one of the Service endpoints. externalIPs are not managed by Kubernetes and are the responsibility of the cluster administrator.
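A hedged sketch of such a Service (the address is a placeholder for an IP that really routes to one of your nodes):

apiVersion: v1
kind: Service
metadata:
  name: frontend
spec:
  selector:
    app: webapp
  ports:
  - port: 80
    protocol: TCP
  externalIPs:
  - 198.51.100.32    # placeholder; must be routed to a cluster node by your network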

Network Policy

By now, you might have an idea of how network policy is implemented in Kubernetes. Yes, it’s iptables again; this time, the CNI takes care of implementing the network policy, not kube-proxy. This section should have been part of the Calico article (Part 2); however, I feel this is the right place for the network policy details.

Let’s create three services — frontend, backend, and db.
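One minimal way to create them (nginx is used as a stand-in image for all three tiers; names are assumptions chosen to match the output below):

master # kubectl create deployment frontend --image=nginx
master # kubectl expose deployment frontend --port=80
master # kubectl create deployment backend --image=nginx
master # kubectl expose deployment backend --port=80
master # kubectl create deployment db --image=nginx
master # kubectl expose deployment db --port=80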

By default, pods are non-isolated; they accept traffic from any source.

However, there should be a traffic policy to isolate the DB pods from the FrontEnd pods to avoid any traffic flow between them.

I would suggest you read THIS article to understand the Network Policy configuration. This section will focus on how the network policy is implemented in Kubernetes instead of configuration deep dive.

I have applied a network policy to isolate db from the frontend pods; this results in no connection between the frontend and db pods.

Note: The above picture shows the ‘service’ symbol instead of the ‘pod’ symbol to keep things simple, as there can be many Pods in a given Service. But the actual rules are applied per Pod.

master # kubectl exec -it frontend-8b474f47-zdqdv -- /bin/sh
# curl backend
backend-867fd6dff-mjf92
# curl db
curl: (7) Failed to connect to db port 80: Connection timed out

However, the backend can reach the db service without any issue.

master # kubectl exec -it backend-867fd6dff-mjf92 -- /bin/sh
# curl db
db-8d66ff5f7-bp6kf

Let’s take a look at the NetworkPolicy: it allows ingress to the db Pods only from Pods that have the label ‘networking/allow-db-access’ set to ‘true’.

---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-db-access
spec:
  podSelector:
    matchLabels:
      app: "db"
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          networking/allow-db-access: "true"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
  labels:
    app: backend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
        networking/allow-db-access: "true"
    spec:
      volumes:
      - name: workdir
        emptyDir: {}
      containers:
      - name: nginx
        image: nginx:1.14.2
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 80
        volumeMounts:
        - name: workdir
          mountPath: /usr/share/nginx/html
      initContainers:
      - name: install
        image: busybox
        imagePullPolicy: IfNotPresent
        command: ['sh', '-c', "echo $HOSTNAME > /work-dir/index.html"]
        volumeMounts:
        - name: workdir
          mountPath: "/work-dir"
...

Calico converts the Kubernetes network policy into Calico’s native format,

master # calicoctl get networkPolicy --output yaml
apiVersion: projectcalico.org/v3
items:
- apiVersion: projectcalico.org/v3
  kind: NetworkPolicy
  metadata:
    creationTimestamp: "2020-11-05T05:26:27Z"
    name: knp.default.allow-db-access
    namespace: default
    resourceVersion: /53872
    uid: 1b3eb093-b1a8-4429-a77d-a9a054a6ae90
  spec:
    ingress:
    - action: Allow
      destination: {}
      source:
        selector: projectcalico.org/orchestrator == 'k8s' && networking/allow-db-access == 'true'
    order: 1000
    selector: projectcalico.org/orchestrator == 'k8s' && app == 'db'
    types:
    - Ingress
kind: NetworkPolicyList
metadata:
  resourceVersion: 56821/56821

The iptables rules play an important role in enforcing the policy by using the ‘filter’ table. It’s hard to reverse-engineer them, as Calico uses advanced concepts like ipset. From the iptables rules, I can see that packets are allowed to the db Pod only if they come from the backend, and that’s exactly what our network policy specifies.

Get the workload endpoint details from calicoctl.

master # calicoctl get workloadEndpoint
WORKLOAD NODE NETWORKS INTERFACE
backend-867fd6dff-mjf92 minikube 10.88.0.27/32 cali2b1490aa46a
db-8d66ff5f7-bp6kf minikube 10.88.0.26/32 cali95aa86cbb2a
frontend-8b474f47-zdqdv minikube 10.88.0.24/32 cali505cfbeac50

cali95aa86cbb2a is the host-side end of the veth pair used by the db Pod.

Let’s get the iptables rules related to this interface.

$ sudo iptables-save | grep cali95aa86cbb2a
:cali-fw-cali95aa86cbb2a - [0:0]
:cali-tw-cali95aa86cbb2a - [0:0]
-A cali-from-wl-dispatch -i cali95aa86cbb2a -m comment --comment "cali:R489GtivXlno-SCP" -g cali-fw-cali95aa86cbb2a
-A cali-fw-cali95aa86cbb2a -m comment --comment "cali:3XN24uu3MS3PMvfM" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A cali-fw-cali95aa86cbb2a -m comment --comment "cali:xyfc0rlfldUi6JAS" -m conntrack --ctstate INVALID -j DROP
-A cali-fw-cali95aa86cbb2a -m comment --comment "cali:wG4_76ot8e_QgXek" -j MARK --set-xmark 0x0/0x10000
-A cali-fw-cali95aa86cbb2a -p udp -m comment --comment "cali:Ze6pH1ZM5N1pe76G" -m comment --comment "Drop VXLAN encapped packets originating in pods" -m multiport --dports 4789 -j DROP
-A cali-fw-cali95aa86cbb2a -p ipencap -m comment --comment "cali:3bjax7tRUEJ2Uzew" -m comment --comment "Drop IPinIP encapped packets originating in pods" -j DROP
-A cali-fw-cali95aa86cbb2a -m comment --comment "cali:0pCFB_VsKq1qUOGl" -j cali-pro-kns.default
-A cali-fw-cali95aa86cbb2a -m comment --comment "cali:mbgUOxlInVlwb2Ie" -m comment --comment "Return if profile accepted" -m mark --mark 0x10000/0x10000 -j RETURN
-A cali-fw-cali95aa86cbb2a -m comment --comment "cali:I7GVOQegh6Wd9EMv" -j cali-pro-ksa.default.default
-A cali-fw-cali95aa86cbb2a -m comment --comment "cali:g5ViWVLiyVrKX91C" -m comment --comment "Return if profile accepted" -m mark --mark 0x10000/0x10000 -j RETURN
-A cali-fw-cali95aa86cbb2a -m comment --comment "cali:RBmQDo38EoPmxJ0I" -m comment --comment "Drop if no profiles matched" -j DROP
-A cali-to-wl-dispatch -o cali95aa86cbb2a -m comment --comment "cali:v3sEoNToLYUOg7M6" -g cali-tw-cali95aa86cbb2a
-A cali-tw-cali95aa86cbb2a -m comment --comment "cali:eCrqwxNk3cKw9Eq6" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A cali-tw-cali95aa86cbb2a -m comment --comment "cali:_krp5nzavhAu5avJ" -m conntrack --ctstate INVALID -j DROP
-A cali-tw-cali95aa86cbb2a -m comment --comment "cali:Cu-tVtfKKu413YTT" -j MARK --set-xmark 0x0/0x10000
-A cali-tw-cali95aa86cbb2a -m comment --comment "cali:leBL64hpAXM9y4nk" -m comment --comment "Start of policies" -j MARK --set-xmark 0x0/0x20000
-A cali-tw-cali95aa86cbb2a -m comment --comment "cali:pm-LK-c1ra31tRwz" -m mark --mark 0x0/0x20000 -j cali-pi-_tTE-E7yY40ogArNVgKt
-A cali-tw-cali95aa86cbb2a -m comment --comment "cali:q_zG8dAujKUIBe0Q" -m comment --comment "Return if policy accepted" -m mark --mark 0x10000/0x10000 -j RETURN
-A cali-tw-cali95aa86cbb2a -m comment --comment "cali:FUDVBYh1Yr6tVRgq" -m comment --comment "Drop if no policies passed packet" -m mark --mark 0x0/0x20000 -j DROP
-A cali-tw-cali95aa86cbb2a -m comment --comment "cali:X19Z-Pa0qidaNsMH" -j cali-pri-kns.default
-A cali-tw-cali95aa86cbb2a -m comment --comment "cali:Ljj0xNidsduxDGUb" -m comment --comment "Return if profile accepted" -m mark --mark 0x10000/0x10000 -j RETURN
-A cali-tw-cali95aa86cbb2a -m comment --comment "cali:0z9RRvvZI9Gud0Wv" -j cali-pri-ksa.default.default
-A cali-tw-cali95aa86cbb2a -m comment --comment "cali:pNCpK-SOYelSULC1" -m comment --comment "Return if profile accepted" -m mark --mark 0x10000/0x10000 -j RETURN
-A cali-tw-cali95aa86cbb2a -m comment --comment "cali:sMkvrxvxj13WlTMK" -m comment --comment "Drop if no profiles matched" -j DROP
$ sudo iptables-save -t filter | grep cali-pi-_tTE-E7yY40ogArNVgKt
:cali-pi-_tTE-E7yY40ogArNVgKt - [0:0]
-A cali-pi-_tTE-E7yY40ogArNVgKt -m comment --comment "cali:M4Und37HGrw6jUk8" -m set --match-set cali40s:LrVD8vMIGQDyv8Y7sPFB1Ge src -j MARK --set-xmark 0x10000/0x10000
-A cali-pi-_tTE-E7yY40ogArNVgKt -m comment --comment "cali:sEnlfZagUFRSPRoe" -m mark --mark 0x10000/0x10000 -j RETURN

Checking the ipset makes it clear that ingress to the db Pod is allowed only from the backend Pod IP, 10.88.0.27.

[root@minikube /]# ipset list
Name: cali40s:LrVD8vMIGQDyv8Y7sPFB1Ge
Type: hash:net
Revision: 6
Header: family inet hashsize 1024 maxelem 1048576
Size in memory: 408
References: 3
Number of entries: 1
Members:
10.88.0.27

I’ll update Part 2 of this series with more detailed steps to decode the calico iptables rules.

References:

https://kubernetes.io
https://www.projectcalico.org/
https://rancher.com/
http://www.netfilter.org/

dineshkumarr.ramasamy@gmail.com