Kubernetes Internals: Inside The Mind of A Monster

Illustrated guides to become better programmers

Mar 28, 2023

Kubernetes automates the deployment, scaling, and management of containerized applications. This post tackles Kubernetes internals, because knowing the command-line arguments alone is not enough to implement systems in practice. As always, it contains a generous serving of illustrations.

Kubernetes, or K8s, is the backbone of modern infrastructure. If you are interested in System Design or Backend Engineering, you’ll come across it sooner or later.

This article, though in-depth, can be shared as introductory, holistic material for the Kubernetes beginner. It is also useful for understanding config files.

Basics

Kubernetes has lots of concepts of its own. A pod is one of them. A pod can contain one or more containers. If you don’t know about containers, read this awesome introduction and this other one to learn about them in depth.

K8s also has nodes. A node can be a virtual machine or a physical one, and it has its own resources. A node runs pods, and the pods share the node’s resources.

A cluster is a collection of nodes.
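The relationships described so far (a cluster holds nodes, a node runs pods, a pod holds containers, and pods draw on their node's resources) can be sketched as a small data model. This is only an illustration, not how Kubernetes represents these objects internally; the class names and the CPU fields are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Pod:
    name: str
    containers: List[str]           # container names; a pod has one or more
    cpu_request_millicores: int     # hypothetical: CPU the pod asks for


@dataclass
class Node:
    name: str
    cpu_capacity_millicores: int    # the node's own resources
    pods: List[Pod] = field(default_factory=list)

    def free_cpu(self) -> int:
        """CPU left on the node after accounting for the pods it runs."""
        used = sum(p.cpu_request_millicores for p in self.pods)
        return self.cpu_capacity_millicores - used


@dataclass
class Cluster:
    nodes: List[Node] = field(default_factory=list)

    def pod_count(self) -> int:
        """Total number of pods across every node in the cluster."""
        return sum(len(n.pods) for n in self.nodes)


node = Node("node-1", cpu_capacity_millicores=2000)
node.pods.append(Pod("web", ["nginx"], cpu_request_millicores=500))
node.pods.append(Pod("api", ["app", "log-shipper"], cpu_request_millicores=250))
cluster = Cluster([node])
```

Here the two pods together request 750 millicores, so `node.free_cpu()` reports what remains of the node's 2000.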

Resources are grouped into namespaces for access control and better organization.

There are two types of nodes: worker nodes and the control plane (master) node.

K8s uses ConfigMaps to store non-confidential configuration data as key-value pairs that pods can consume.
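As a minimal sketch of what a ConfigMap holds, the snippet below builds one as a plain dictionary and serializes it to JSON, a format Kubernetes accepts alongside YAML. The name `app-config` and the keys under `data` are made up for illustration.

```python
import json

config_map = {
    "apiVersion": "v1",
    "kind": "ConfigMap",
    "metadata": {"name": "app-config", "namespace": "default"},
    # Values under "data" are strings, even when they look like numbers.
    "data": {"LOG_LEVEL": "info", "MAX_CONNECTIONS": "100"},
}

manifest = json.dumps(config_map, indent=2)
print(manifest)
```

Saved to a file, a manifest like this could be applied with `kubectl apply -f <file>`.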

Basic nodes / Worker nodes

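A basic node has several components. The first is the kubelet, the agent that manages the pods on the node. Kubernetes supports different container runtimes (such as containerd or CRI-O; Docker Engine has needed an adapter since Kubernetes 1.24). The Container Runtime Interface (CRI) describes how to interact with supported runtimes.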

The container runtime is used to create, delete or stop containers.
Note: The service and pod illustration is part of the node but drawn outside.

Kubelet also communicates with the api-server found in the master node.

Kube-proxy, another component, handles load balancing to pods and configures network access.

A load balancer is just a piece of software that decides which request hits which server. It distributes the requests so that no single server or pod is overwhelmed by the number of requests.
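To make the idea concrete, here is a minimal sketch of one simple strategy, round robin, where requests are handed to backends in a fixed rotation. It is not kube-proxy's actual algorithm, which depends on the mode it runs in; the class name and pod names are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backends in a fixed rotation."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        # cycle() repeats the list endlessly: a, b, c, a, b, c, ...
        self._rotation = cycle(list(backends))

    def pick(self):
        """Return the backend that should receive the next request."""
        return next(self._rotation)


balancer = RoundRobinBalancer(["pod-a", "pod-b", "pod-c"])
assignments = [balancer.pick() for _ in range(6)]
```

Six requests spread over three pods means each pod receives exactly two of them.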

Network policies define which connections to and from pods are allowed.

A node also has node-status, which reports the state of the node to the api-server.

Worker nodes have a volume to persist container data. Since pods can be destroyed and recreated, anything that needs to persist is stored in the node’s volume.

Master node / Control plane

The control plane provides the api-server so that we can interact with the cluster. The other control plane components are used internally.

Etcd is a distributed key-value store. It persists its data to disk and replicates it to every etcd member, using the Raft algorithm to reach consensus among them. It stores internal information such as configs and cluster state.
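Raft only makes progress when a majority of members agree, which determines how many failures a cluster can survive. The arithmetic is sketched below; the function names are my own and not part of etcd.

```python
def quorum(members: int) -> int:
    """Smallest number of members that forms a majority."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many members can be lost while a majority still remains."""
    return members - quorum(members)


# Map each cluster size to (quorum, tolerated failures).
sizes = {n: (quorum(n), tolerated_failures(n)) for n in (1, 3, 4, 5)}
```

A four-member cluster tolerates one failure, the same as a three-member one, which is why odd cluster sizes are the usual choice.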

The controller manager is responsible for managing several Kubernetes objects:

ReplicaSets ensure that a specified number of replicas are running at all times. If pods are missing, the ReplicaSet re-creates them.
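The behaviour described here is a reconciliation loop: compare the desired count with what is running and close the gap. Below is a minimal sketch of one such pass. The pod naming is hypothetical (real pods get generated names), and a real controller works against the api-server rather than a list.

```python
def reconcile(desired: int, running: list) -> list:
    """Return the pod list after one reconciliation pass."""
    pods = list(running)
    candidate = 0
    # Too few pods: create replacements, skipping names already in use.
    while len(pods) < desired:
        name = f"pod-{candidate}"
        if name not in pods:
            pods.append(name)
        candidate += 1
    # Too many pods: keep only the desired number.
    return pods[:desired]


healed = reconcile(3, ["pod-0", "pod-2"])   # one pod went missing
scaled_down = reconcile(1, healed)          # desired count lowered
```

Running `reconcile` again on an already healthy list changes nothing, which is what lets the controller run it repeatedly.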

A Deployment concerns itself with creating and scaling ReplicaSets.

A StatefulSet ensures that stateful components can be scaled and updated without disruption. Not all components are stateless. Databases, for example, cannot just be created and deleted. A StatefulSet ensures that each pod in the set has a unique and persistent identity. The identity remains the same even if the pod is recreated or moved to another node.
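One visible part of that persistent identity is the pod name: StatefulSet pods are named after the set plus an ordinal, by default starting at 0. The sketch below simulates a pod moving to another node while keeping its name. The set name `db`, the node names, and both helper functions are assumptions for illustration.

```python
def stateful_pod_name(set_name: str, ordinal: int) -> str:
    """Name of the pod with the given ordinal, e.g. db-0, db-1, ..."""
    return f"{set_name}-{ordinal}"


def reschedule(placement: dict, pod_name: str, new_node: str) -> dict:
    """Simulate a pod being recreated elsewhere: the node changes, the name does not."""
    updated = dict(placement)
    updated[pod_name] = new_node
    return updated


# Three replicas, each initially on its own (hypothetical) node.
placement = {stateful_pod_name("db", i): f"node-{i}" for i in range(3)}
after = reschedule(placement, "db-1", "node-2")
```

After the move, the set still contains `db-0`, `db-1`, and `db-2`; only the node recorded for `db-1` differs.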

A DaemonSet ensures that a specific pod runs on every node (or on a selected subset of nodes) in a cluster. It is useful for running, well, daemons or similar processes such as logging or monitoring tools. If a new node is added to the cluster, the pod is automatically created on it.
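The "one pod per node" rule can be sketched as a reconciliation over the list of nodes. This is a simplified model under assumed names: real DaemonSet pods get generated names, and the controller watches the api-server instead of receiving lists.

```python
def reconcile_daemonset(nodes: list, pods_by_node: dict, daemon: str) -> dict:
    """Ensure exactly one daemon pod per node."""
    # Drop pods whose node has left the cluster.
    result = {n: p for n, p in pods_by_node.items() if n in nodes}
    # Create a pod on every node that does not have one yet.
    for node in nodes:
        result.setdefault(node, f"{daemon}-on-{node}")
    return result


state = reconcile_daemonset(["node-1", "node-2"], {}, "log-agent")
# A new node joins the cluster; the next pass gives it a pod too.
state = reconcile_daemonset(["node-1", "node-2", "node-3"], state, "log-agent")
```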

A Job creates one or more pods to run a task to completion, retrying if a pod fails.

A CronJob creates Jobs on a schedule.

Networking

A service has a stable address and is used to communicate with the pods behind it. Just like pods, services have unique IPs. Pods are also assigned default DNS names.
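The default DNS names follow a predictable pattern, sketched below for services and for pods with an IPv4 address. The sketch assumes the common default cluster domain `cluster.local`, which a cluster can override; the helper names and the example service and namespace are hypothetical.

```python
def service_dns_name(service: str, namespace: str = "default",
                     cluster_domain: str = "cluster.local") -> str:
    """DNS name under which a service is reachable inside the cluster."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def pod_dns_name(pod_ip: str, namespace: str = "default",
                 cluster_domain: str = "cluster.local") -> str:
    """DNS name derived from a pod's IPv4 address (dots become dashes)."""
    return f"{pod_ip.replace('.', '-')}.{namespace}.pod.{cluster_domain}"


web = service_dns_name("web", namespace="shop")
pod = pod_dns_name("10.1.2.3")
```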

Endpoint vs EndpointSlice

We create Service objects (SVC) in front of pods for reliable networking. Behind services there are Endpoint objects that list the pods.

The Endpoint object has been deprecated and replaced with the EndpointSlice object, which scales better as the number of pods grows: internally, Kubernetes was carrying a lot of data around because every pod change meant updating the whole Endpoint object.
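The scaling benefit comes from splitting one large list into bounded chunks, so a change to one pod touches one small object instead of the whole list. The sketch below shows that chunking; by default Kubernetes keeps EndpointSlices at no more than 100 endpoints each. The function name and the generated addresses are illustrative only.

```python
def slice_endpoints(endpoints: list, max_per_slice: int = 100) -> list:
    """Split a flat endpoint list into slices of bounded size."""
    if max_per_slice < 1:
        raise ValueError("max_per_slice must be positive")
    return [
        endpoints[i:i + max_per_slice]
        for i in range(0, len(endpoints), max_per_slice)
    ]


# 250 made-up pod addresses behind one service.
endpoints = [f"10.0.{i // 256}.{i % 256}" for i in range(250)]
slices = slice_endpoints(endpoints)
slice_sizes = [len(s) for s in slices]
```

With 250 endpoints, an update to a single pod rewrites a slice of at most 100 entries rather than one object holding all 250.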

Ingress

Ingress is a component that gives external users access to services. It routes requests to the correct service or app. Let’s take our kube-proxy example.

An ingress can be configured to route traffic based on a number of rules, such as domain name or URL.

It can also be used to route different URLs to completely different services.

However, an ingress by itself does nothing unless there is an ingress controller to handle it. A controller reads the ingress rules, interprets them, and decides how to route the traffic.
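The kind of rule matching a controller performs can be sketched as a lookup on host and path prefix, with the longest matching prefix winning. This is a simplified model, not any particular controller's implementation; the hosts, paths, and service names are hypothetical.

```python
# (host, path prefix, backend service), all hypothetical
RULES = [
    ("shop.example.com", "/api", "api-service"),
    ("shop.example.com", "/", "web-service"),
    ("blog.example.com", "/", "blog-service"),
]


def path_matches(path: str, prefix: str) -> bool:
    """Match whole path elements: /api matches /api/orders but not /apix."""
    if prefix == "/":
        return path.startswith("/")
    prefix = prefix.rstrip("/")
    return path == prefix or path.startswith(prefix + "/")


def route(host: str, path: str, rules=RULES):
    """Return the backend for a request, or None if no rule applies."""
    matches = [
        (prefix, service)
        for rule_host, prefix, service in rules
        if rule_host == host and path_matches(path, prefix)
    ]
    if not matches:
        return None
    # The longest matching prefix is the most specific rule.
    return max(matches, key=lambda m: len(m[0]))[1]
```

For example, `route("shop.example.com", "/api/orders")` resolves to the API backend, while `/cart` on the same host falls through to the catch-all `/` rule.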

Ingress controllers: Istio

Istio is one such controller. Istio is a service mesh, which handles service-to-service communication. A service mesh is an infrastructure layer of its own and uses sidecar containers (containers running alongside the application containers in a pod) to handle communication between services.

Istio injects the Envoy proxy as a sidecar to intercept all traffic.

It can then provide load balancing, traffic management, and service discovery.

Conclusion

Kubernetes is a complex solution with complex concepts, complex scenarios, and a complex ecosystem to handle the complexity. It requires some study to grasp all aspects of it. But it’s also incredibly powerful and, being declarative, allows us to version control our setup (config files are version controlled).