What Is Kubernetes? Understanding Container Orchestration From Scratch
You learned Docker. You're running containers. The application comes up and works the same everywhere.
Now you want to go to production. A single container isn't enough — high traffic requires multiple instances. When a container crashes, it should automatically restart. When traffic spikes, new instances should be added. While a new version deploys, the old version should still be running. The health of all these containers needs to be monitored.
Doing this by hand isn't viable. Kubernetes solves this problem.
What Is Kubernetes?
Kubernetes (K8s) is an open-source orchestration platform that automates the deployment, scaling, and management of containers. Developed by Google and open-sourced in 2014, it is today the de facto standard for container orchestration in cloud-native software.
Kubernetes' core promise: you declare the desired state of your application, and Kubernetes figures out how to reach and maintain it.
```yaml
# "Always have 3 instances running" — Kubernetes handles it
replicas: 3
```
If an instance crashes, Kubernetes starts a new one. If a node (server) goes down, it moves containers to other nodes. When traffic increases, it adds new instances.
Core Concepts
Pod: The smallest deployable unit in Kubernetes. Contains one or more containers. Containers in the same pod share the same network and storage.
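A pod can be described directly in a manifest. A minimal sketch (the image name `my-app:v1.0` is reused from the Deployment example later in this article; in practice you rarely create bare pods like this, as the next section explains):

```yaml
# A minimal standalone Pod: illustrative only. In real deployments,
# pods are almost always created and managed by a Deployment.
apiVersion: v1
kind: Pod
metadata:
  name: my-app-pod
  labels:
    app: my-app
spec:
  containers:
    - name: app
      image: my-app:v1.0   # hypothetical image name
      ports:
        - containerPort: 3000
```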
Node: The physical or virtual server where pods run. Kubernetes manages nodes in a cluster — it decides which pod runs on which node (scheduling).
Cluster: The Kubernetes environment made up of multiple nodes. A control plane (master node) manages the cluster, worker nodes run the workloads.
Cluster:
┌─────────────────────────────────────────┐
│ Control Plane │
│ (API Server, Scheduler, etcd) │
├────────────┬────────────┬───────────────┤
│ Node 1 │ Node 2 │ Node 3 │
│ Pod A │ Pod A │ Pod B │
│ Pod B │ Pod C │ Pod C │
└────────────┴────────────┴───────────────┘
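Scheduling decisions like the diagram above can be inspected directly with kubectl. A quick sketch (assumes a running cluster and a configured kubeconfig):

```bash
# List the nodes in the cluster
kubectl get nodes

# Show which node each pod was scheduled onto
kubectl get pods -o wide
```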
Deployment: Managing Pods
Instead of creating pods directly, you typically use a Deployment. A Deployment defines how many pods should run, which image to use, and the update strategy.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: app
          image: my-app:v1.0
          resources:
            requests:
              memory: "128Mi"
              cpu: "250m"
            limits:
              memory: "256Mi"
              cpu: "500m"
          readinessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
```
readinessProbe is critical: Kubernetes uses it to know whether a pod is actually ready to serve requests. If the probe fails, the pod is removed from the Service's endpoints and no traffic is sent to it.
Rolling update: When you deploy a new image, Kubernetes incrementally starts new pods and shuts down old ones, keeping enough pods serving traffic throughout. The version is updated with zero downtime.
```bash
# Deploy new version
kubectl set image deployment/my-app app=my-app:v2.0

# Watch status
kubectl rollout status deployment/my-app

# Roll back if problems arise
kubectl rollout undo deployment/my-app
```
Service: Reaching Pods
Pod IP addresses change constantly — a pod gets a new IP when it restarts. A Service places a stable endpoint in front of pods.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app-service
spec:
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 3000
  type: ClusterIP
```
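Inside the cluster, other pods reach the Service by its DNS name rather than by any pod IP. A sketch of how to verify this (assumes the default `default` namespace, cluster DNS, and a throwaway debug pod):

```bash
# Start a temporary pod and call the service by its DNS name;
# the pod is deleted automatically when the command exits
kubectl run tmp --rm -it --image=busybox --restart=Never -- \
  wget -qO- http://my-app-service.default.svc.cluster.local
```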
Ingress: Opening to the Outside World
A LoadBalancer Service provisions a separate external IP (and, on cloud providers, a separate load balancer) for each service, which gets expensive. An Ingress routes traffic to multiple services through a single external endpoint.
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
spec:
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /users
            pathType: Prefix
            backend:
              service:
                name: user-service
                port:
                  number: 80
          - path: /orders
            pathType: Prefix
            backend:
              service:
                name: order-service
                port:
                  number: 80
```
One IP, multiple services.
HorizontalPodAutoscaler: Automatic Scaling
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```
When average CPU utilization across the pods exceeds 70%, Kubernetes automatically adds new pods. When it drops, it scales back down. Without you doing a thing.
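The scaling decisions can be watched as they happen. A sketch (assumes the metrics-server add-on is installed, which the HPA needs in order to read CPU metrics):

```bash
# Current vs. target utilization and replica count, refreshed live
kubectl get hpa my-app-hpa --watch

# The same HPA can also be created imperatively:
kubectl autoscale deployment my-app --cpu-percent=70 --min=2 --max=10
```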
When Is Kubernetes Actually Needed?
Kubernetes is powerful but costly — steep learning curve, operational complexity, resource requirements.
You don't need Kubernetes yet: Small application running on a single server, team smaller than 5, simple deployment needs.
Consider Kubernetes: Multiple services need to scale independently, high availability required, team has reached CI/CD maturity, cloud-native infrastructure being built.
Moving to Kubernetes without understanding Docker is flying blind. But once you understand Docker, Kubernetes is the inevitable next step for managing containers in production.