Kubernetes Design Principles

When I started using Kubernetes, there was a lot I didn't know, and I don't consider myself especially quick, so I didn't pick up on the Kubernetes ideas right away. For me, memorizing all the little how-to details doesn't really work. What does work is understanding why things behave the way they do; that sticks in my head and has given me a much better grasp of how Kubernetes actually works. Some of this may be material you're already familiar with, but it's still worth revisiting and applying to the patterns we'll look at later.

So, what's in it for me in this newsletter? A deeper understanding of Kubernetes. Learning an important tool involves understanding the problem and the "why" behind it, not just the "what." Trying to memorize every detail from a vast collection of "whats" is impossible. However, when you comprehend the fundamentals of how things work, it becomes easier to extrapolate and develop your own understanding of how things should function.

What is Kubernetes?

Before diving into Kubernetes, let's explore the problem space that existed prior to its emergence. Distributed systems used to be deployed primarily on bare metal servers or virtual machines. With the rise of containerization, a new approach emerged, offering consistent, repeatable, and reliable deployments. Containers made it possible to run multiple applications on the same machines without worrying about conflicting dependencies.

While containers offered significant advantages, deploying them at scale presented challenges. This is where Kubernetes came into the picture. Instead of building your own system to manually SSH into each machine, start Docker containers, and manage monitoring services, Kubernetes stepped in as a comprehensive solution. It addressed the need for a robust container orchestration platform.

The fundamentals

The traditional approach

When I first started working with Kubernetes, my mental model came from prior experience with the familiar master-slave pattern: a master and one or more slaves, where the master dictates tasks to the slaves and directs them on what to do. Applying this mindset to Kubernetes, I imagined a user selecting a machine and instructing it to start a container.

However, this traditional approach came with its own set of challenges. For example, what would happen if the container unexpectedly crashed? In the pre-Kubernetes era, when manually SSH-ing into machines to start containers, issues such as container failure or node instability could occur.

Imagine a scenario where the SSH connection dropped, preventing the successful launch of the container. In such cases, custom recovery logic had to be implemented to monitor the service and application, ensuring their continuous operation. This resulted in writing extensive amounts of custom logic solely to keep the application running smoothly.

This is where the #1 principle of Kubernetes comes in:

Kubernetes APIs are declarative rather than imperative

In Kubernetes, you don't explicitly tell the system to start a specific container on a particular machine. Instead, you define the desired state you want to achieve. You express your intention by stating that you want a container to be running, and Kubernetes takes care of making it happen.

Think of the difference between a pilot manually flying an airplane and engaging the autopilot. When flying manually, the pilot constantly provides input, monitors the situation, and decides where the aircraft should go. With the autopilot engaged, the computer takes over and the control systems keep the aircraft at the requested altitude. Kubernetes works the same way: its declarative API lets you specify the desired state you want to achieve, and the system manages the actions needed to reach that state.

Think of Kubernetes as a platform that relieves you from the need to provide step-by-step instructions and continuously monitor the system. Instead, you can simply state what you want to happen, and Kubernetes handles the orchestration. This fundamental principle forms the essence of Kubernetes—shifting from manual management to a declarative approach, streamlining the deployment and management of workloads.
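To make that concrete, here is a minimal sketch of the declarative version of "I want an nginx container running" (the object name and image tag are placeholders chosen for illustration). Instead of SSH-ing into a machine and running the container yourself, you hand Kubernetes a manifest describing the end state:

apiVersion: v1
kind: Pod
metadata:
  name: hello-nginx          # illustrative name
spec:
  containers:
  - name: nginx
    image: nginx:1.25        # "this container should be running"

You would typically submit it with kubectl apply -f, and from that point on it is Kubernetes' job, not yours, to make the cluster match it.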

How do you deploy your workload?

The Kubernetes way!

You: create an API object that is persisted on the Kubernetes API server until it is deleted.

Kubernetes: all components work in parallel to drive the cluster toward that state.

Let's make this more concrete and look at how operations are performed against the Kubernetes API. When you want to create something or perform any operation, you start by creating an API object. This object is persisted on the Kubernetes API server until it is explicitly deleted.

Once you create the API object, all the components within Kubernetes work in parallel to drive the system towards the desired state defined by that object. For example, when you want to run a workload, you can use one of the fundamental building blocks: the ReplicaSet. This is a familiar concept for many; it specifies a container to run, such as an nginx container, and the desired number of replicas that should be running across the cluster.
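As a rough sketch (the name, label, and replica count are illustrative), a ReplicaSet expressing "three copies of nginx should be running somewhere in the cluster" looks like this:

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx-rs               # illustrative name
spec:
  replicas: 3                  # desired number of copies
  selector:
    matchLabels:
      app: nginx
  template:                    # pod template the replicas are created from
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest

Notice that nothing in it names a machine; where the three pods land is left entirely to Kubernetes.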

Once the replica set object is created, the Kubernetes system takes over and determines the necessary steps to achieve the desired state. It schedules the workload appropriately, distributing it across the available resources in the cluster. This orchestration process involves coordination between various components within Kubernetes to ensure the successful deployment and management of the workload.

At this point, you can be mostly hands-off, as you don't need to constantly monitor the status of your workloads. Kubernetes takes care of that for you. Once you have defined the desired state through API objects and initiated the necessary operations, Kubernetes autonomously manages the workloads and ensures their proper functioning.

Why declarative over imperative?

The primary benefit is automatic recovery. If something happens to your application, such as the container crashing or the node it runs on going down, Kubernetes automatically recovers the application for you, moving it around as needed.

Let’s revisit this and dive into it a little deeper.

In Kubernetes, when you create a ReplicaSet object on the Kubernetes API server, you are requesting that pods matching a specific definition be created on your cluster. But how does a node know that it's supposed to run this workload, and how does all of this coordination work?

At first glance, it might seem intuitive to have the Kubernetes master API server call out to the selected node and ask it to start the container. After all, this resembles a traditional client-server setup, where the client instructs the server on what to do and the server carries out the action. However, this approach presents a series of challenges.

One significant problem is handling failures. Imagine if a container or a node crashes, or the node becomes unavailable at the moment the master tries to issue the command. How would Kubernetes recover from such situations? The master would need to maintain the state of every component it's responsible for and continuously track and resolve discrepancies between expected and actual states. This would make the master overly complex, fragile, and challenging to extend.

To address these issues, Kubernetes follows a declarative approach instead of an imperative one. In the declarative model, you describe the desired state of your application, and Kubernetes takes care of maintaining that state on your behalf.

Here's how the declarative model works:

  1. You define the desired state of your application, specifying the number of replicas, the pod definition, and other configuration in the ReplicaSet object.

  2. The Kubernetes control plane (master) constantly monitors the cluster's actual state, comparing it to the desired state you've defined.

  3. If there are any discrepancies between the actual and desired states (e.g., a pod dies, a node fails), Kubernetes automatically takes corrective action to reconcile the state.

  4. Kubernetes schedules pods on appropriate nodes based on resource availability and constraints, without the need for the master to issue direct commands.

By adopting the declarative model, Kubernetes achieves a more robust and scalable architecture. The control plane doesn't need to actively track the state of every individual component; it focuses on reconciling discrepancies as they occur.
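A quick way to see reconciliation in the API itself: every object carries both a spec (the state you declared) and a status (the state the controllers have observed). The numbers below are illustrative, but the fields are real ReplicaSet fields:

spec:
  replicas: 3            # desired: three pods
status:
  replicas: 2            # observed: only two exist right now
  readyReplicas: 2

The ReplicaSet controller sees that status is behind spec, perhaps because a pod died or its node failed, creates a replacement pod, and then goes back to watching.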

So this is where the #2 Principle of Kubernetes comes in

The Kubernetes control plane is transparent. There are no hidden internal APIs.

In Kubernetes, the power of its architecture lies in a fundamental principle: there are no hidden internal APIs. The same declarative API exposed to end users is also used by all internal Kubernetes components to interact with one another. Let's explore why this approach is so powerful.

With no hidden internal APIs, the benefits of the declarative API extend to the components themselves. In contrast to an imperative model, where the master directly instructs individual nodes on what to do, Kubernetes takes a different approach: when a component, such as a node, comes online, it autonomously queries the Kubernetes API server to understand its role and responsibilities.

This decentralized decision-making process brings several advantages:

  1. Autonomous Recovery: Since components independently monitor the Kubernetes API server for their desired state, they can easily recover from failures. If a component crashes and comes back online, it simply checks the API server to understand its purpose and proceeds accordingly.

  2. Level-Triggered Approach: Kubernetes operates on a level-triggered approach rather than an edge-triggered one. Components converge on the desired state recorded in the API server rather than reacting to a stream of individual events. An edge-triggered system has to observe every transition, which leads to complexity and missed updates when a component is temporarily unreachable.

  3. Robustness and Reliability: By distributing decision-making to individual components, Kubernetes becomes more robust and resilient. Since each component knows what it should be doing without relying on a central master, the system can better tolerate failures.

  4. Extensibility: As the Kubernetes ecosystem grows, new components can be seamlessly added without drastically increasing the complexity of the master control plane. Each component can independently operate based on the desired state set in the API server.

  5. No Single Point of Failure: In traditional models, if the central master goes down, the entire cluster could become inaccessible. However, with the Kubernetes API server acting as the central point, individual components can continue operating based on the last known state. When the API server is back online, they adapt to the updated state.
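To see this principle in practice, consider that a node is itself just another object in the same API. When the kubelet starts, it registers a Node object like the sketch below (abbreviated, with illustrative values), and every other component, including the scheduler, learns about the node by reading that object rather than through some private channel:

apiVersion: v1
kind: Node
metadata:
  name: node-b                     # illustrative name
  labels:
    env: development               # the label used in the scheduling example below
    kubernetes.io/hostname: node-b
status:
  capacity:
    cpu: "4"
    memory: 8Gi
  conditions:
  - type: Ready
    status: "True"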

Scheduling and Node Registration

When you create a new Pod in Kubernetes, it is initially unscheduled, meaning it doesn't have a specific node assigned to it yet. The scheduling process starts once the Pod's definition is created in the Kubernetes API server. Let's walk through the steps involved, using this simple Pod definition:

apiVersion: v1
kind: Pod
metadata:
  name: my-pod               # the name this Pod will be known by in the API
spec:
  containers:
  - name: my-container
    image: nginx:latest      # the container image the chosen node will pull and run

  1. Pod Creation: You, as an end user, interact with the Kubernetes API server (for example via kubectl apply) to create a new Pod with this definition.

  2. Pod Submission to the API Server: The Pod definition is submitted to the Kubernetes API server, which stores it in its etcd data store.

  3. Unscheduled Pod: At this point, the Pod is considered unscheduled because it doesn't have a node assigned to it yet.

  4. Scheduler Watches for Unscheduled Pods: The scheduler component in the Kubernetes control plane constantly monitors the API server for unscheduled Pods.

  5. Scheduler Selects a Node: The scheduler's primary role is to intelligently select a suitable node for the Pod based on various factors, such as resource requirements, node capacity, and any user-defined constraints (affinities and taints/tolerations). For this example, let's assume we have two nodes available in the cluster with the following labels:

    • Node A: env=production

    • Node B: env=development

  6. Node Selection for Scheduling: The scheduler looks for nodes that meet the Pod's resource requirements and satisfy any constraints. In this case, the Pod definition does not specify any constraints, so the scheduler is free to choose any node.

  7. Node Registration: The scheduler can only consider nodes that have already registered themselves. When the kubelet on each node started, it registered a Node object with the Kubernetes API server describing that machine, so Node B is known to the cluster along with its env=development label.

  8. Scheduler Reads the Node's Labels: The scheduler learns about Node B the same way every other component does: by reading the Node object from the API server, where it sees the env=development label along with the node's capacity.

  9. Node Scheduling Decision: The scheduler cross-references the nodes' labels and capacity with the Pod's requirements. Since this Pod does not specify any node affinity or nodeSelector, both nodes are candidates; after filtering and scoring them, the scheduler decides to schedule the Pod onto Node B. (See the nodeSelector sketch after this list for how you would pin the Pod to development nodes instead.)

  10. Pod Scheduling Complete: The scheduler records its decision on the Pod object in the API server, binding the Pod to Node B. At this point, the Pod is considered scheduled.

  11. Node Executes Pod: The kubelet on Node B, which watches the API server for Pods bound to its node, sees the new Pod, pulls the required container image (in this case, nginx:latest), and launches the containers.
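If you did want to constrain where the Pod runs, you would express that declaratively too, rather than telling a specific machine to start the container. Here is a minimal sketch (reusing the illustrative labels from above) that restricts the Pod to nodes labeled env=development:

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  nodeSelector:
    env: development         # the scheduler will only consider nodes with this label
  containers:
  - name: my-container
    image: nginx:latest

The shape of the workflow stays the same: you record the constraint on the object, and the scheduler, reading the same API as everyone else, honors it.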