What Is Kubernetes? Understanding Container Orchestration From Scratch
You learned Docker. You're running containers. The application comes up and works the same everywhere.
Now you want to go to production. A single container isn't enough — high traffic requires multiple instances. When a container crashes, it should automatically restart. When traffic spikes, new instances should be added. While a new version deploys, the old version should still be running. The health of all these containers needs to be monitored.
Doing this by hand isn't viable. Kubernetes solves this problem.
What Is Kubernetes?
Kubernetes (K8s) is an open-source orchestration platform that automates the deployment, scaling, and management of containers. Developed by Google and open-sourced in 2014, it is today the de facto standard for container orchestration in cloud-native software.
Kubernetes' core promise: you declare the desired state of your application, and Kubernetes figures out how to reach and maintain it.
```yaml
# "Always have 3 instances running" — Kubernetes handles it
replicas: 3
```
If an instance crashes, Kubernetes starts a new one. If a node (server) goes down, it moves containers to other nodes. When traffic increases, it adds new instances.
Core Concepts
Pod: The smallest deployable unit in Kubernetes. Contains one or more containers. Containers in the same pod share the same network and storage.
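A pod can be described directly in a manifest. A minimal sketch (the image name `my-app:v1.0` is reused from the Deployment example later in this article; in practice you rarely create bare pods like this, as the next section explains):

```yaml
# A minimal standalone Pod: illustrative only. In real deployments,
# pods are almost always created and managed by a Deployment.
apiVersion: v1
kind: Pod
metadata:
  name: my-app-pod
  labels:
    app: my-app
spec:
  containers:
    - name: app
      image: my-app:v1.0   # hypothetical image name
      ports:
        - containerPort: 3000
```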
Node: The physical or virtual server where pods run. Kubernetes manages nodes in a cluster — it decides which pod runs on which node (scheduling).
Cluster: The Kubernetes environment made up of multiple nodes. A control plane (master node) manages the cluster, worker nodes run the workloads.
Cluster:
┌─────────────────────────────────────────┐
│ Control Plane │
│ (API Server, Scheduler, etcd) │
├────────────┬────────────┬───────────────┤
│ Node 1 │ Node 2 │ Node 3 │
│ Pod A │ Pod A │ Pod B │
│ Pod B │ Pod C │ Pod C │
└────────────┴────────────┴───────────────┘
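Scheduling decisions like the diagram above can be inspected directly with kubectl. A quick sketch (assumes a running cluster and a configured kubeconfig):

```bash
# List the nodes in the cluster
kubectl get nodes

# Show which node each pod was scheduled onto
kubectl get pods -o wide
```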
Deployment: Managing Pods
Instead of creating pods directly, you typically use a Deployment. A Deployment defines how many pods should run, which image to use, and the update strategy.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: app
          image: my-app:v1.0
          resources:
            requests:
              memory: "128Mi"
              cpu: "250m"
            limits:
              memory: "256Mi"
              cpu: "500m"
          readinessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
```
readinessProbe is critical: Kubernetes uses it to know whether a pod is actually ready to serve requests. If the probe fails, the pod is removed from the Service's endpoints and no traffic is sent to it.
Rolling update: When you deploy a new image, Kubernetes incrementally starts new pods and shuts down old ones, keeping enough pods serving traffic throughout. The version is updated with zero downtime.
```bash
# Deploy new version
kubectl set image deployment/my-app app=my-app:v2.0

# Watch status
kubectl rollout status deployment/my-app

# Roll back if problems arise
kubectl rollout undo deployment/my-app
```
Service: Reaching Pods
Pod IP addresses change constantly — a pod gets a new IP when it restarts. A Service places a stable endpoint in front of pods.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app-service
spec:
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 3000
  type: ClusterIP
```
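Inside the cluster, other pods reach the Service by its DNS name rather than by any pod IP. A sketch of how to verify this (assumes the default `default` namespace, cluster DNS, and a throwaway debug pod):

```bash
# Start a temporary pod and call the service by its DNS name;
# the pod is deleted automatically when the command exits
kubectl run tmp --rm -it --image=busybox --restart=Never -- \
  wget -qO- http://my-app-service.default.svc.cluster.local
```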
Ingress: Opening to the Outside World
A LoadBalancer Service provisions a separate external IP (and, on cloud providers, a separate load balancer) for each service, which gets expensive. An Ingress routes traffic to multiple services through a single external endpoint.
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
spec:
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /users
            pathType: Prefix
            backend:
              service:
                name: user-service
                port:
                  number: 80
          - path: /orders
            pathType: Prefix
            backend:
              service:
                name: order-service
                port:
                  number: 80
```
One IP, multiple services.
HorizontalPodAutoscaler: Automatic Scaling
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```
When average CPU utilization across the pods exceeds 70%, Kubernetes automatically adds new pods. When it drops, it scales back down. Without you doing a thing.
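The scaling decisions can be watched as they happen. A sketch (assumes the metrics-server add-on is installed, which the HPA needs in order to read CPU metrics):

```bash
# Current vs. target utilization and replica count, refreshed live
kubectl get hpa my-app-hpa --watch

# The same HPA can also be created imperatively:
kubectl autoscale deployment my-app --cpu-percent=70 --min=2 --max=10
```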
When Is Kubernetes Actually Needed?
Kubernetes is powerful but costly — steep learning curve, operational complexity, resource requirements.
You don't need Kubernetes yet: Small application running on a single server, team smaller than 5, simple deployment needs.
Consider Kubernetes: Multiple services need to scale independently, high availability required, team has reached CI/CD maturity, cloud-native infrastructure being built.
Moving to Kubernetes without understanding Docker is flying blind. But once you understand Docker, Kubernetes is the inevitable next step for managing containers in production.