Life of a Packet in Kubernetes — Part 4

May 10, 2021

This is part 4 of the Life of a Packet in Kubernetes series, in which we’ll tackle the Kubernetes Ingress resource and Ingress controller. An Ingress controller is a controller that watches the Kubernetes API server for updates to the Ingress resource and reconfigures the Ingress load balancer accordingly.

Nginx Controller and LoadBalancer/Proxy

An ingress controller is usually an application that runs as a pod in a Kubernetes cluster and configures a load balancer according to Ingress resources. The load balancer can be a software load balancer running in the cluster, or a hardware or cloud load balancer running externally. Different load balancers require different ingress controllers.

The basic idea behind the Ingress is to provide a way of describing higher-level traffic management constraints, specifically for HTTP(S). With Ingress we can define rules for routing traffic without creating a bunch of Load Balancers or exposing each service on the node. It can be configured to give services externally-reachable URLs, load balance traffic, terminate SSL/TLS, and offer name-based virtual hosting and content-based routing.

Configuration Options

The Kubernetes Ingress Controller uses ingress classes to filter Kubernetes Ingress objects and other resources before converting them into configuration. This allows it to coexist with other ingress controllers and/or other deployments of the Kubernetes Ingress Controller in the same cluster: a Kubernetes Ingress Controller will only process configuration marked for its use.
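The class-based filtering described above can be sketched in a few lines of Python. This is only an illustration, not how any particular controller is implemented: the `Ingress` dataclass and `select_for_controller` function are hypothetical names, and the `ingress_class` field simply mirrors the `spec.ingressClassName` field of an Ingress object. Handling of a default class for unmarked objects is deliberately left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Ingress:
    """Hypothetical, trimmed-down view of an Ingress object."""
    name: str
    ingress_class: Optional[str]  # mirrors spec.ingressClassName; None if unset


def select_for_controller(ingresses: List[Ingress], my_class: str) -> List[Ingress]:
    """Return only the Ingress objects marked for this controller's class."""
    return [ing for ing in ingresses if ing.ingress_class == my_class]


ingresses = [
    Ingress("public-web", "nginx"),
    Ingress("internal-api", "contour"),
    Ingress("unclassified", None),
]

# A controller configured for class "nginx" ignores everything else.
print([ing.name for ing in select_for_controller(ingresses, "nginx")])  # ['public-web']
```

Because each controller keeps only the objects that match its own class, two controllers can watch the same API server without stepping on each other's configuration.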

Prefix Based

Host-Based

Host + Prefix
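To make the three rule styles listed above concrete, here is a minimal Python sketch of how a proxy might match a request against them. It is an assumption-laden illustration: the `Rule` type, the service names, the example hostnames, and the precedence (an explicit host beats a wildcard, then the longest prefix wins) are all chosen for this example and are not taken from any specific controller.

```python
from typing import List, NamedTuple, Optional


class Rule(NamedTuple):
    host: Optional[str]  # None means "any host"
    prefix: str          # path prefix, e.g. "/shop"
    service: str         # hypothetical backend Service name


def path_matches(prefix: str, path: str) -> bool:
    """Match whole path elements: "/v1" matches "/v1" and "/v1/x", not "/v10"."""
    if prefix == "/":
        return True
    prefix = prefix.rstrip("/")
    return path == prefix or path.startswith(prefix + "/")


def route(rules: List[Rule], host: str, path: str) -> Optional[str]:
    """Pick a backend; explicit hosts beat wildcards, then the longest prefix wins."""
    candidates = [
        r for r in rules
        if (r.host is None or r.host == host) and path_matches(r.prefix, path)
    ]
    if not candidates:
        return None
    best = max(candidates, key=lambda r: (r.host is not None, len(r.prefix)))
    return best.service


rules = [
    Rule(None, "/shop", "shop-svc"),               # prefix based
    Rule("blog.example.com", "/", "blog-svc"),     # host based
    Rule("api.example.com", "/v1", "api-v1-svc"),  # host + prefix
]

print(route(rules, "www.example.com", "/shop/cart"))  # shop-svc
print(route(rules, "blog.example.com", "/posts/1"))   # blog-svc
print(route(rules, "api.example.com", "/v1/users"))   # api-v1-svc
print(route(rules, "api.example.com", "/v2"))         # None
```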

Ingress is one of the built-in APIs that doesn’t have a built-in controller, so an ingress controller is needed to actually implement the Ingress API. There are many ingress controllers out there; we will look at Nginx and Contour.

Ingress is made up of an Ingress API object and an Ingress controller. As mentioned earlier, Kubernetes Ingress is an API object that describes the desired state for exposing services deployed to a Kubernetes cluster. To make it work, you need an Ingress controller: the actual implementation of the Ingress API, which reads and processes the Ingress resource’s information.

Since the Ingress API is actually just metadata, the Ingress controller does the heavy lifting. Various ingress controllers are available and it is important to choose the right one carefully for each use case.

It’s also possible to have multiple ingress controllers in the same cluster and to set the desired ingress controller for each Ingress. Usually, we end up using a combination of these controllers for different scenarios in the same cluster. For example, we may have one that handles the external traffic coming into the cluster, including bindings to SSL certificates, and another internal one with no SSL binding that handles in-cluster traffic.

Deployment options

Contour + Envoy

The Contour Ingress controller is a collaboration between:

Envoy, which provides the high-performance reverse proxy.
Contour, which acts as a management server for Envoy and provides it with configuration.

These containers are deployed separately, Contour as a Deployment and Envoy as a DaemonSet, although other configurations are possible. Contour is a client of the Kubernetes API. Contour watches Ingress, HTTPProxy, Secret, Service, and Endpoint objects, and acts as the management server for its Envoy sibling by translating its cache of objects into the relevant JSON stanzas (Service objects for CDS, Ingress for RDS, Endpoint objects for EDS, and so on).

The following example shows the Envoy proxy with host networking enabled (listening on 0.0.0.0:80).

Nginx

The goal of the Nginx Ingress controller is the assembly of a configuration file (nginx.conf). The main implication of this requirement is the need to reload Nginx after any change in the configuration file. It is important to note, though, that we don’t reload Nginx on changes that affect only an upstream configuration (i.e., Endpoints changes when you deploy your app). We use lua-nginx-module to achieve this.

On every endpoint change, the controller fetches endpoints from all the services it sees and generates corresponding Backend objects. It then sends these objects to a Lua handler running inside Nginx. The Lua code in turn stores those backends in a shared memory zone. Then, for every request, Lua code running in the balancer_by_lua context determines which endpoints it should choose the upstream peer from and applies the configured load balancing algorithm to choose the peer. Nginx takes care of the rest. This way we avoid reloading Nginx on endpoint changes. In a relatively big cluster with frequently deploying apps, this feature saves a significant number of Nginx reloads, which can otherwise affect response latency, load balancing quality (after every reload Nginx resets the state of load balancing), and so on.
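The reload-free update path described above can be approximated in Python. This is a rough sketch, not the controller's actual Lua code: `BackendStore` is a hypothetical stand-in for the shared memory zone, `sync` plays the role of the handler that receives new backends, and `pick` stands in for the per-request selection, using plain round robin as the assumed algorithm.

```python
import threading
from typing import Dict, List


class BackendStore:
    """Hypothetical stand-in for the shared memory zone that holds backends."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._endpoints: Dict[str, List[str]] = {}
        self._counters: Dict[str, int] = {}

    def sync(self, service: str, endpoints: List[str]) -> None:
        """Called on every endpoint change; swaps the list in place, no reload."""
        with self._lock:
            self._endpoints[service] = list(endpoints)

    def pick(self, service: str) -> str:
        """Per-request peer selection, here plain round robin."""
        with self._lock:
            endpoints = self._endpoints.get(service, [])
            if not endpoints:
                raise LookupError(f"no endpoints for {service}")
            i = self._counters.get(service, 0)
            self._counters[service] = i + 1
            return endpoints[i % len(endpoints)]


store = BackendStore()
store.sync("web", ["10.0.0.1:80", "10.0.0.2:80"])
print([store.pick("web") for _ in range(3)])
# ['10.0.0.1:80', '10.0.0.2:80', '10.0.0.1:80']

# A new deployment changes the endpoints; the next request sees them at once.
store.sync("web", ["10.0.0.3:80"])
print(store.pick("web"))  # 10.0.0.3:80
```

Note that the round-robin counter survives the `sync` call, which is the point of the design: the balancing state is not reset the way it would be after a full reload.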

Nginx + Keepalived: High Availability Deployment

The keepalived daemon can be used to monitor services or systems and to automatically fail over to a standby if problems occur. We configure a floating IP address that can be moved between worker nodes. If a worker goes down, the floating IP is moved to another worker automatically, and the Nginx instance on that worker serves traffic for the address.
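The failover behaviour can be modelled with a small Python sketch. It assumes a simplified election in which the healthy node with the highest priority holds the floating IP, which is the general idea behind the VRRP protocol that keepalived implements; the `Node` type, node names, and priorities are made up for illustration, and real health checks and advertisement timers are omitted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    priority: int        # higher wins in this simplified election
    healthy: bool = True


def elect_owner(nodes: List[Node]) -> Optional[Node]:
    """Return the healthy node with the highest priority, or None if all are down."""
    healthy = [n for n in nodes if n.healthy]
    return max(healthy, key=lambda n: n.priority, default=None)


nodes = [Node("worker-1", 150), Node("worker-2", 100), Node("worker-3", 50)]
print(elect_owner(nodes).name)  # worker-1

# worker-1 fails its health check; the floating IP moves to the next candidate.
nodes[0].healthy = False
print(elect_owner(nodes).name)  # worker-2
```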

MetalLB: Nginx with LoadBalancer Service (for private clusters with a small public IP address pool)

MetalLB hooks into your Kubernetes cluster and provides a network load-balancer implementation. In short, it allows you to create Kubernetes services of type “LoadBalancer” in clusters that don’t run on a cloud provider. In a cloud-enabled Kubernetes cluster, you request a load balancer, and your cloud platform assigns an IP address to you. In a bare-metal cluster, MetalLB is responsible for that allocation. Once MetalLB has assigned an external IP address to a service, it needs to make the network beyond the cluster aware that the IP “lives” in the cluster. MetalLB uses standard networking or routing protocols to achieve this: ARP, NDP, or BGP.
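The allocation step can be pictured as handing out addresses from a fixed pool. The Python sketch below is a simplified model, not MetalLB's code: `AddressPool` is a hypothetical class, the pool range comes from a documentation-only network, and using `hosts()` (which skips the network and broadcast addresses) is a simplification chosen for this example.

```python
import ipaddress
from typing import Dict


class AddressPool:
    """Hypothetical pool that assigns one external IP per service."""

    def __init__(self, cidr: str) -> None:
        # iter() keeps this working whether hosts() yields a generator or a list.
        self._free = iter(ipaddress.ip_network(cidr).hosts())
        self._assigned: Dict[str, str] = {}

    def assign(self, service: str) -> str:
        """Return the service's IP, allocating a new one on first request."""
        if service in self._assigned:
            return self._assigned[service]
        try:
            ip = str(next(self._free))
        except StopIteration:
            raise RuntimeError("address pool exhausted") from None
        self._assigned[service] = ip
        return ip


pool = AddressPool("198.51.100.0/30")  # only two usable addresses
print(pool.assign("ingress-nginx"))    # 198.51.100.1
print(pool.assign("ingress-nginx"))    # 198.51.100.1 (same service, same IP)
print(pool.assign("other-svc"))        # 198.51.100.2
```

A third distinct service would raise `RuntimeError` here, which shows in miniature why a small public pool is usually spent on one shared ingress service rather than on every service in the cluster.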

In layer 2 mode, one machine in the cluster takes ownership of the service and uses standard address discovery protocols (ARP for IPv4, NDP for IPv6) to make those IPs reachable on the local network. From the LAN’s point of view, the announcing machine simply has multiple IP addresses.

In BGP mode, all machines in the cluster establish BGP peering sessions with nearby routers that you control and tell those routers how to forward traffic to the service IPs. Using BGP allows for true load balancing across multiple nodes, and fine-grained traffic control thanks to BGP’s policy mechanisms.
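The practical difference between the two modes is where traffic for a service IP can land. The Python sketch below models this under stated assumptions: in layer 2 mode one node owns the IP, so every flow goes there, while in BGP mode the upstream router is assumed to hash each flow across all advertising nodes. The function names, the CRC32 hash, and the addresses are illustrative only; real routers use their own hashing, and MetalLB's owner selection is its own mechanism.

```python
import zlib
from typing import List


def l2_owner(nodes: List[str], service_ip: str) -> str:
    """Layer 2 mode: a single node answers for the service IP, whatever the flow."""
    ordered = sorted(nodes)
    return ordered[zlib.crc32(service_ip.encode()) % len(ordered)]


def bgp_next_hop(nodes: List[str], flow: str) -> str:
    """BGP mode: the router is assumed to hash each flow across all nodes."""
    ordered = sorted(nodes)
    return ordered[zlib.crc32(flow.encode()) % len(ordered)]


nodes = ["worker-1", "worker-2", "worker-3"]
vip = "203.0.113.10"
flows = [f"192.0.2.{i}:{40000 + i}->{vip}:80" for i in range(50)]

# Layer 2: every flow reaches the same announcing node.
print({l2_owner(nodes, vip) for _ in flows})

# BGP: flows are spread over several nodes.
print(sorted({bgp_next_hop(nodes, f) for f in flows}))
```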

MetalLB Pods:

controller is the cluster-wide MetalLB controller, in charge of IP assignment (deployment)
speaker is the per-node daemon that advertises services with assigned IPs using various advertising strategies (daemonset)

Note: MetalLB can be used by any service in the cluster by setting the service type to ‘LoadBalancer’, but it is rarely practical to obtain a public IP address pool large enough to give every service its own address.

Disclaimer

This article does not provide any technical advice or recommendation; if you feel it does, it is my personal view, not that of the company I work for.