What Is Docker? Actually Understanding Container Technology
"It works on my machine" — perhaps the most classic sentence in software development. Docker buried that sentence in history.
When Docker emerged in 2013, it fundamentally changed how software is packaged and distributed. Today, almost every modern application's infrastructure includes Docker or container technology. But most developers describe Docker as "kind of like a virtual machine" and leave it there. This article goes far beyond that description.
The Problem: Environment Inconsistency
Code alone isn't enough for an application to run. Does it need Python 3.9 or 3.11? Which system libraries does it depend on? What environment variables does it expect? Which port will it run on?
On the developer's machine, everything is installed and configured one way. On the CI/CD server, it's slightly different. In production, more different still. This inconsistency is the most insidious source of bugs: the code is correct, but the environment is wrong.
Docker solves this problem by packaging the application with everything it needs to run. Code, runtime, libraries, system tools, configuration — all together. Wherever you move it, it runs in the same environment.
Containers vs Virtual Machines: The Critical Difference
Docker containers are often confused with virtual machines (VMs). Both provide isolation but work very differently.
A virtual machine simulates an entire physical computer. It has its own OS kernel, its own memory management, its own everything. That's why it's heavy — several gigabytes of disk space, takes minutes to start.
A container shares the host operating system's kernel. It only packages the libraries and files the application needs. Megabytes in size, starts in seconds.
```
Virtual Machine:              Container:
┌──────────────┐              ┌──────────────┐
│ Application  │              │ Application  │
├──────────────┤              ├──────────────┤
│   Guest OS   │              │  Libraries   │
├──────────────┤              ├──────────────┤
│  Hypervisor  │              │  Container   │
├──────────────┤              │    Engine    │
│   Host OS    │              ├──────────────┤
├──────────────┤              │   Host OS    │
│   Hardware   │              ├──────────────┤
└──────────────┘              │   Hardware   │
                              └──────────────┘
```
Image and Container: Two Core Concepts
Understanding Docker requires clearly distinguishing two concepts: image and container.
An image is a template. It defines how the application is packaged. Read-only, immutable. Think of it like a class definition.
A container is an instance running from an image. Like an object created from a class. You can run dozens of containers from the same image, each independent.
```shell
# Pull an image
docker pull nginx:latest

# Create and run a container from the image
docker run -d -p 8080:80 --name my-nginx nginx:latest

# See running containers
docker ps

# Stop the container
docker stop my-nginx

# Remove the container
docker rm my-nginx
```
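To make the class/instance analogy concrete, here is a sketch (the container names `web-1` and `web-2` and the host ports are illustrative) that runs two independent containers from the same nginx image:

```shell
# Two independent containers from the same image,
# each with its own state, mapped to different host ports
docker run -d -p 8081:80 --name web-1 nginx:latest
docker run -d -p 8082:80 --name web-2 nginx:latest

# Both containers appear in the list; the image is stored only once
docker ps
docker image ls nginx
```

Changes made inside `web-1` (files written, processes started) are invisible to `web-2`; both read from the same immutable image layers.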
Dockerfile: How to Build an Image
A Dockerfile defines, step by step, how an image is built. Each instruction creates a layer, and Docker caches these layers, which dramatically reduces rebuild times.
```dockerfile
# Base image: Node.js 20 (Alpine Linux - small size)
FROM node:20-alpine

# Set working directory
WORKDIR /app

# Copy only package.json files first (cache optimization)
COPY package*.json ./

# Install dependencies
RUN npm ci --only=production

# Copy application code
COPY . .

# Indicate which port the app uses
EXPOSE 3000

# Command to run when container starts
CMD ["node", "src/index.js"]
```
Why copy package.json first? Thanks to Docker's layer cache, the npm ci step doesn't re-run as long as dependencies haven't changed. Only when code changes does the rebuild start from the COPY . . layer. This saves minutes on large projects.
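Layer caching works best when the build context stays small and stable. A `.dockerignore` file (a sketch; the entries below are typical for a Node.js project, adjust to yours) keeps files out of the `COPY . .` step that would otherwise bloat the image or needlessly invalidate the cache:

```
node_modules
dist
.git
.env
*.log
Dockerfile
docker-compose.yml
```

Excluding `node_modules` is especially important: dependencies are installed inside the image by `npm ci`, so copying the host's copy in would both slow the build and risk platform-specific binaries leaking across.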
Multi-Stage Build: Production-Ready Images
Tools needed for development (TypeScript compiler, test frameworks, build tools) shouldn't go into the production image. With multi-stage builds, you separate the build environment from the production environment.
```dockerfile
# Stage 1: Builder
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
# Compile TypeScript
RUN npm run build

# Stage 2: Production
FROM node:20-alpine
WORKDIR /app

# Install only production dependencies
COPY package*.json ./
RUN npm ci --only=production

# Get only the compiled code from the builder stage
COPY --from=builder /app/dist ./dist

USER node
EXPOSE 3000
CMD ["node", "dist/index.js"]
```
Result: ~150MB production image instead of ~800MB including build tools.
Docker Compose: Managing Multiple Containers
Real applications aren't made of a single container. Backend API, PostgreSQL database, Redis cache, Nginx reverse proxy — all running in separate containers that need to talk to each other.
docker-compose.yml defines these containers together and brings them all up with a single command.
```yaml
version: '3.8'

services:
  # Backend API
  api:
    build: .
    ports:
      - "3000:3000"
    environment:
      - DATABASE_URL=postgresql://user:password@db:5432/myapp
      - REDIS_URL=redis://cache:6379
    depends_on:
      - db
      - cache
    volumes:
      - .:/app             # hot reload for development
      - /app/node_modules

  # PostgreSQL Database
  db:
    image: postgres:15-alpine
    environment:
      POSTGRES_DB: myapp
      POSTGRES_USER: user
      POSTGRES_PASSWORD: password
    volumes:
      - postgres_data:/var/lib/postgresql/data  # persist data

  # Redis Cache
  cache:
    image: redis:7-alpine
    volumes:
      - redis_data:/data

volumes:
  postgres_data:
  redis_data:
```
```shell
docker-compose up -d     # start all services in background
docker-compose logs -f   # follow logs
docker-compose down      # stop all services
docker-compose down -v   # also delete volumes (clear data)
```
Volumes: Making Data Persistent
Containers are ephemeral. When a container is deleted, its data goes with it. Database data, uploaded files, logs — volumes are used to prevent these from being lost.
A named volume is storage Docker manages outside the container's filesystem; a bind mount maps a host directory directly into the container. Either way, even if the container is deleted, the data remains.
```shell
# Create a named volume
docker volume create myapp-data

# Mount the volume into a container
docker run -v myapp-data:/app/data myapp

# For development: bind-mount the code directory directly (hot reload)
docker run -v $(pwd):/app -v /app/node_modules myapp
```
Networking: How Containers Talk to Each Other
Containers created with Docker Compose automatically join the same network and can reach each other by service name. The api service connects to the db service using the hostname db instead of localhost.
This isolation matters — the container network is separate from the outside world. You decide which ports are exposed externally. You can provide access only to your API without exposing your database to the outside.
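For example, in a compose sketch like the following (service names as above), only the API is published to the host; the database has no `ports:` entry, so it is reachable solely over the internal network:

```yaml
services:
  api:
    build: .
    ports:
      - "3000:3000"   # published: reachable from the host
  db:
    image: postgres:15-alpine
    # no "ports:" entry: reachable only as "db" inside the compose network
```

An attacker scanning the host machine sees port 3000 and nothing else; Postgres on 5432 simply isn't there from the outside.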
What You Lose Without Docker
Learning Docker can feel like overhead at first. But think about what it provides: developer environment setup drops to minutes, the "works on my machine" problem disappears, you develop in the same environment as production, scaling comes down to a single command.
Orchestration platforms like Kubernetes, Docker Swarm, and AWS ECS are all built on top of container technology. Moving to these platforms without understanding containers is flying blind.
Docker is less about learning a tool and more about adopting a way of thinking: everything an application needs to run is packaged with the code, and the environment travels from the inside, not the outside.