A Developer's Guide to Container Orchestration - Kubernetes
Kubernetes Architecture, Components and Principles
Kubernetes is a flexible, scalable, and open-source application orchestrator. Nowadays, some programs are so huge that they must run on many different machines in the cloud. These programs are called distributed systems, and their parts are spread all over the cloud, each doing its own job. We call this kind of setup a 'microservices architecture,' and each part of the system is called a 'service.' These services usually run inside something called a 'container.' Containers are great because they're easy to set up, manage, update, and remove.
Kubernetes' main job is to take care of these containers. Think of it as the manager of your program, making sure all the pieces work well together. This whole process is called orchestration. But remember, Kubernetes isn't a traditional platform for running software. It's more like the conductor of an orchestra, making sure all the instruments (containers) play together harmoniously.
What Does Kubernetes Offer?
Kubernetes is capable of creating highly resilient distributed systems. It can also make the system bigger or smaller as needed. It has lots of tools to help make sure the system is reliable and stays up and running. Here are some of the things it can do.
Service Discovery: Kubernetes can expose containers using DNS names or their own IP addresses.
Load Balancing: When traffic to an application is high, Kubernetes can distribute it across containers so that the application stays stable.
Automatic Rollout and Rollback: Kubernetes controls the state of the distributed system. A user defines the desired state of the system, and Kubernetes changes the actual state to match the desired state at a controlled rate. If a change goes wrong, the rollout can be reverted to the previous state.
Self Healing: Kubernetes creates, manages, and destroys Pods as required. If a Pod is not working as expected, Kubernetes automatically replaces it with a new Pod.
Secret and Configuration Management: Secrets and configurations can be updated without rebuilding the container images and without exposing passwords or SSH keys in the stack configuration.
Horizontal Scaling: Kubernetes is extremely scalable. Pod replicas can be added or removed with a command or automatically based on usage, and nodes can be added to or removed from the cluster as well.
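The desired-state, self-healing, and scaling behaviour in the list above can be pictured as a reconciliation loop. The following is a minimal sketch in plain Python, not Kubernetes code: the Pod and Cluster classes, the reconcile function, and the "web-N" naming scheme are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Pod:
    name: str
    healthy: bool = True


@dataclass
class Cluster:
    """Toy stand-in for the actual state of a cluster (not a Kubernetes API)."""
    pods: list = field(default_factory=list)
    _ids: count = field(default_factory=count, repr=False)

    def create_pod(self) -> Pod:
        # Hypothetical naming scheme: web-0, web-1, ...
        pod = Pod(name=f"web-{next(self._ids)}")
        self.pods.append(pod)
        return pod


def reconcile(cluster: Cluster, desired_replicas: int) -> None:
    """Move the actual state toward the desired replica count."""
    # Self-healing: discard Pods that are no longer healthy.
    cluster.pods = [p for p in cluster.pods if p.healthy]
    # Scale up: create Pods until the desired count is reached.
    while len(cluster.pods) < desired_replicas:
        cluster.create_pod()
    # Scale down: remove any surplus Pods.
    del cluster.pods[desired_replicas:]


cluster = Cluster()
reconcile(cluster, desired_replicas=3)
cluster.pods[1].healthy = False          # simulate a failing Pod
reconcile(cluster, desired_replicas=3)   # the failed Pod is replaced
print([p.name for p in cluster.pods])    # ['web-0', 'web-2', 'web-3']
```

The caller only states how many replicas it wants. Replacing a failed Pod and scaling up or down are both handled by the same comparison of actual against desired state.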
Kubernetes Architecture
Deploying Kubernetes gives you a cluster. A cluster is a group of machines that work together like a single system. A Kubernetes cluster consists of a Control Plane and Nodes. The Control Plane is responsible for managing the nodes in the cluster. In a production environment, it is advisable to use at least three Control Plane nodes.
Nodes are the machines where Kubernetes creates Pods. Pods are the building blocks of the application workload. Each Pod can contain one or more containers. It's generally recommended to use only one container in a Pod. A Kubernetes Cluster incorporates multiple nodes to ensure high availability and fault tolerance.
Control Plane Components:
Components of the Control Plane make the decisions in a Kubernetes Cluster. These components are as follows.
kube-apiserver: Kube API Server exposes the Kubernetes API. It is a gateway to the Kubernetes cluster. A user or system communicates with Kubernetes using the API Server. For example, the kubectl command forwards requests to the API Server.
etcd: etcd is a highly available distributed key-value store. It is considered the single source of truth for the Kubernetes cluster at any given time. It is fault tolerant and can survive the failure of individual machines.
kube-scheduler: The scheduler watches for newly created Pods that have no node assigned and selects a suitable node for each one to run on. When kubectl runs a Pod creation command, the API server receives the request and records the new Pod in etcd as a key-value pair. The cluster now has to bring the actual state in line with the desired state. The scheduler is notified about the new Pod and assigns it to a suitable node, and the kubelet on that node starts the Pod's containers. Once the Pod is running, the desired and actual states of the cluster match.
kube-controller-manager: This is not a single controller but a combination of the node controller, job controller, EndpointSlice controller, ServiceAccount controller, and many more. All controllers are compiled into a single binary and run in a single process to reduce complexity.
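How the scheduler selects a suitable node can be sketched as a filter step followed by a scoring step. This is a simplified illustration in plain Python under stated assumptions: the Node and PodRequest classes, the schedule function, and the "most CPU left over" scoring rule are hypothetical, and the real kube-scheduler weighs many more factors.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    free_cpu_millicores: int
    free_memory_mib: int


@dataclass
class PodRequest:
    name: str
    cpu_millicores: int
    memory_mib: int


def schedule(pod: PodRequest, nodes: List[Node]) -> Optional[str]:
    """Return the name of the chosen node, or None if the Pod cannot be placed."""
    # Filtering: keep only nodes with enough free CPU and memory.
    feasible = [
        n for n in nodes
        if n.free_cpu_millicores >= pod.cpu_millicores
        and n.free_memory_mib >= pod.memory_mib
    ]
    if not feasible:
        return None  # the Pod would stay unscheduled

    # Scoring (assumed rule): prefer the node with the most CPU left over.
    best = max(feasible, key=lambda n: n.free_cpu_millicores - pod.cpu_millicores)

    # Reserve the resources so the next decision sees the updated node.
    best.free_cpu_millicores -= pod.cpu_millicores
    best.free_memory_mib -= pod.memory_mib
    return best.name


nodes = [Node("node-a", 500, 1024), Node("node-b", 2000, 4096)]
print(schedule(PodRequest("api", 1000, 512), nodes))  # node-b
```

In this sketch node-a is filtered out because it has only 500 millicores free, so node-b is the only feasible choice.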
Node Components:
Every node in the Kubernetes cluster runs these components. They make sure the Kubernetes runtime environment is running and Pods are working as expected.
kubelet: The kubelet is an agent that runs on every node in the Kubernetes cluster. It is responsible for running the containers within a Pod. It checks that the running containers match those described in the PodSpecs, and it monitors the health of the containers as well.
kube-proxy: kube-proxy is a network proxy that implements part of the Kubernetes Service concept on each node. It maintains network rules on the node, and these rules allow network traffic to reach Pods from inside or outside the cluster. If the operating system's packet filtering layer is available, kube-proxy uses it; otherwise, it forwards the traffic itself.
Container Runtime: Early versions of Kubernetes used Docker Engine as the container runtime. Built-in support for it was removed in Kubernetes 1.24, and Kubernetes now works with any runtime that implements its Container Runtime Interface (CRI), such as containerd (which originated as a component of Docker) or CRI-O. The container runtime manages container operations such as pulling images and creating, starting, stopping, and removing containers.
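The Service routing that kube-proxy is responsible for can be pictured as a table that maps each Service to the Pods currently backing it. The following toy model in plain Python is only a sketch: the ServiceTable class, its methods, and the addresses are hypothetical, and round-robin selection is an assumption made for simplicity. A real kube-proxy usually programs rules into the operating system instead of picking a backend in its own process.

```python
from itertools import cycle


class ServiceTable:
    """Toy model of a Service-to-Pod routing table (not kube-proxy's real design)."""

    def __init__(self):
        self._endpoints = {}  # Service name -> list of Pod addresses
        self._cursors = {}    # Service name -> round-robin iterator

    def set_endpoints(self, service, pod_addresses):
        # Called whenever the set of Pods behind a Service changes.
        self._endpoints[service] = list(pod_addresses)
        self._cursors[service] = cycle(self._endpoints[service])

    def route(self, service):
        # Pick the next backend Pod for a connection to the Service.
        if not self._endpoints.get(service):
            raise LookupError(f"no endpoints for service {service!r}")
        return next(self._cursors[service])


table = ServiceTable()
table.set_endpoints("web", ["10.0.0.1:8080", "10.0.0.2:8080"])
print([table.route("web") for _ in range(3)])

# When a Pod is replaced, only the table changes; clients keep using "web".
table.set_endpoints("web", ["10.0.0.3:8080"])
print(table.route("web"))
```

The point of the indirection is that clients address the Service by name, while the set of Pods behind it can change at any time.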
What is Container Orchestration?
Container orchestration is the automated deployment, provisioning, scaling, and management of containerized applications, without the need to worry about the underlying infrastructure. Any system that supports containerization can be used for container orchestration. The application that carries out the orchestration is called a container orchestrator.
Container orchestrators use a declarative approach: a user defines the desired outcome instead of specifying how to achieve it. The user writes a configuration file, usually in YAML, and the orchestrator brings the application to the desired state described in that file. The file covers settings such as the location of container images, container storage and resources, how to establish secure network connections, and many more.
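To make the declarative idea concrete, here is a small sketch that builds a Kubernetes Deployment manifest as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The helper name make_deployment, the application name "web", the image "nginx:1.25", and the resource figures are assumptions chosen for illustration.

```python
import json


def make_deployment(name, image, replicas):
    """Build a Deployment manifest that describes a desired state."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            # Desired number of identical Pods.
            "replicas": replicas,
            # The Deployment manages the Pods that carry these labels.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            # Location of the container image.
                            "image": image,
                            # Resources requested for the container (assumed values).
                            "resources": {
                                "requests": {"cpu": "250m", "memory": "128Mi"}
                            },
                        }
                    ]
                },
            },
        },
    }


manifest = make_deployment("web", "nginx:1.25", replicas=3)
print(json.dumps(manifest, indent=2))
```

Nothing in the manifest says how to start the containers or on which node. Saved to a file (for example deployment.json, a hypothetical name), it could be submitted with kubectl apply -f deployment.json, and the orchestrator would work out the steps needed to reach that state.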
Different Container Orchestrators
Many orchestrators are available in the market. The following list represents the most popular and reliable ones.
Kubernetes: This article has already described what Kubernetes is and its internal architecture. It is one of the most versatile and configurable orchestrators available in the market right now.
Red Hat OpenShift: OpenShift is based on OKD, the open-source Kubernetes distribution sponsored by Red Hat. It is a hybrid cloud platform powered by Kubernetes.
Google Kubernetes Engine (GKE): A scalable, managed Kubernetes service provided by Google. GKE offers a hands-off mode of operation called Autopilot, in which Google manages the cluster’s underlying compute. With Autopilot, users pay only for their running Pods.
Amazon Elastic Kubernetes Service (Amazon EKS): Elastic Kubernetes Service is a managed Kubernetes service. EKS runs Pods in the AWS cloud and in on-premises data centers.
Azure Kubernetes Service (AKS): AKS creates production-ready Kubernetes clusters in the Azure cloud. AKS is well known for its security and fast delivery.
Docker Swarm: Docker Swarm works with Docker containers and orchestrates only Docker containers. A Swarm manager controls the activities in the cluster. Docker Swarm and Kubernetes once competed to be the go-to solution for container orchestration, but Kubernetes has clearly won. Still, the complexity and exhaustive configuration of Kubernetes can be overkill for some projects, and Docker Swarm can be a balanced choice for small to medium-sized ones.
It is now clear that almost all of the managed orchestrator services use Kubernetes as their backbone, so learning Kubernetes gives a good understanding of how these services run. Moreover, when needed, a user can make complex configuration changes in these managed services with confidence and without breaking the deployment environment.