# Without AI — Part 4: Docker Deep Dive

**Track:** Advanced (No AI)
**Time for this section:** 1–2 weeks

Docker is the technology that runs every service in this homelab. Understanding it
deeply — not just copying compose files — is what separates someone who can maintain
and troubleshoot a homelab from someone who hopes nothing breaks.

---

## What Docker Actually Is

Most explanations say "containers are like lightweight VMs." That is wrong and leads
to confusion. Here is what a container actually is:

**A container is a Linux process with isolation applied.**

Two Linux kernel features provide that isolation:

**Namespaces** — the container gets its own view of:
- Filesystem (it sees `/` but it is a different tree than the host's `/`)
- Network interfaces (its own `eth0`, its own IP on the Docker network)
- Process list (it can only see its own processes, not the host's)
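- User IDs (optional: with user-namespace remapping or rootless Docker, "root" inside maps to an unprivileged user on the host; by default, root in a container is still UID 0 on the host, restricted mainly by dropped capabilities)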

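**cgroups (control groups)** limit how much of the host's resources the container can use:
- CPU cores and usage limits
- RAM limits
- Disk I/O limits
- Process count (PID) limits

Network bandwidth is not something Docker limits through cgroups; throttling it takes separate tooling such as `tc`.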

**Result:** No second kernel, no hardware emulation, no hypervisor. The nginx process
in your `homepage` container is a regular Linux process on your machine — it just
thinks it is alone.
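
To make the cgroup side concrete, here is a minimal Compose sketch that caps one service. The service name `limited-demo` and the specific values are illustrative assumptions, not settings from this homelab; `cpus`, `mem_limit`, and `pids_limit` are standard Compose service fields that Docker translates into cgroup settings.

```yaml
# Illustrative only: the service name and limit values are assumptions.
services:
  limited-demo:
    image: nginx:alpine
    cpus: "1.5"        # at most 1.5 CPU cores' worth of time
    mem_limit: 512m    # hard RAM cap; exceeding it gets a process OOM-killed
    pids_limit: 200    # maximum number of processes/threads in the container
```

After `docker compose up -d`, `docker stats --no-stream` should show the 512 MiB cap in the `MEM USAGE / LIMIT` column for that container.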

---

## Images vs Containers

```
Image                        Container
─────────────────────────    ─────────────────────────────────────────
A recipe                     A running instance made from the recipe
Read-only, immutable         Has a writable layer on top of the image
Stored in layers             One writable layer per container
Shared across containers     Separate per container
Survives container deletion  Deleted with the container (unless volume)
```

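**Layers:** Docker images are built in layers. Each `Dockerfile` instruction that changes the filesystem (`RUN`, `COPY`, `ADD`) creates a layer; instructions such as `ENV` or `CMD` only add metadata. Layers are identified by content hash, so when you pull an update, Docker downloads only the layers it does not already have. This is why pulling an update is usually fast: most layers are already local.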

```bash
docker image ls                         # List local images
docker image inspect nginx:alpine       # See image metadata and layers
docker image history nginx:alpine       # See how the image was built, layer by layer
docker image pull postgres:16-alpine    # Download an image explicitly
docker image rm nginx:alpine            # Remove a local image
```

---

## Docker Networks — In Depth

Docker provides several networking modes:

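**bridge (default):** Container gets its own virtual network interface with a private IP
(typically in the 172.x.x.x range). Containers on the same bridge network can reach each
other by IP. On a user-defined bridge network (one you create, such as `kitestacks`) they
can also reach each other by name via Docker's built-in DNS; the default `bridge` network
has no name resolution. Containers on different bridge networks are isolated from each other.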

**host:** Container shares the host's network namespace entirely. `--network host` means
no isolation — the container sees all host network interfaces and binds directly to
host ports. Used for kitestacks-metrics-api so psutil can see real network stats.

**none:** No external networking; the container has only a loopback interface. Rarely used.
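
The three modes can be compared side by side in one compose file. This is a sketch under assumptions: the service names are placeholders, and the `kitestacks` network is assumed to exist already. The two `alpine` services simply print the interfaces they can see and exit.

```yaml
# Illustrative only: service names are placeholders.
services:
  web:
    image: nginx:alpine
    networks:
      - kitestacks            # user-defined bridge: private IP plus name resolution

  host-view:
    image: alpine
    command: ["ip", "addr"]   # prints the host's interfaces, then exits
    network_mode: host        # no ports: or networks: allowed alongside this

  offline-view:
    image: alpine
    command: ["ip", "addr"]   # prints only the loopback interface, then exits
    network_mode: "none"

networks:
  kitestacks:
    external: true            # created beforehand with: docker network create kitestacks
```

Running `docker compose up` in the foreground shows both `ip addr` outputs in the logs, which makes the difference in isolation easy to see.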

```bash
# Create a named bridge network
docker network create kitestacks

# See all networks
docker network ls

# Inspect a network — see which containers are connected and their IPs
docker network inspect kitestacks

# Connect a running container to a network
docker network connect kitestacks my-container

# Disconnect
docker network disconnect kitestacks my-container
```

**The DNS trick:** On a user-defined bridge network, Docker runs an embedded DNS server
that every attached container reaches at `127.0.0.11`. Container names (and Compose
service names) resolve to the containers' internal IPs. This is why `cloudflared` can
connect to `http://grafana:3000`: Docker DNS resolves `grafana` to the grafana container's IP.

```bash
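# Verify DNS works from inside the network.
# The official cloudflared image ships without a shell, nslookup, or curl,
# so use throwaway containers attached to the same network instead.
docker run --rm --network kitestacks alpine nslookup grafana
docker run --rm --network kitestacks curlimages/curl -s http://grafana:3000/api/health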
```
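
Docker's DNS resolves more than container names: on a user-defined network, network aliases resolve too. A minimal sketch, assuming a hypothetical alias `dashboards` that is not part of this homelab:

```yaml
# Illustrative only: the alias name is an assumption.
services:
  grafana:
    image: grafana/grafana
    container_name: grafana
    networks:
      kitestacks:
        aliases:
          - dashboards        # http://dashboards:3000 now reaches this container too

networks:
  kitestacks:
    external: true
```

An alias is useful when you want a stable name that can later point at a different container without reconfiguring its clients.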

---

## Volumes — Persisting Data

Containers are ephemeral. When you delete a container, its writable layer is gone.
To keep data, you use volumes.

**Bind mount:** You choose the path on the host.
```yaml
volumes:
  - ./data:/forgejo-data           # host path : container path
  - /home/kenpat/books:/books:ro   # :ro = read-only
```
Data is at `./data` on the host. You can navigate there with `cd`. You can back it up.

**Named volume:** Docker manages the path.
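```yaml
services:
  uptime-kuma:
    volumes:
      - uptime-kuma:/app/data   # volume name : container path

volumes:
  uptime-kuma:                  # define the named volume
```
Data is at `/var/lib/docker/volumes/<volume-name>/_data/` on the host (Docker manages this).
When Compose creates the volume it prefixes the project name, so `uptime-kuma` usually
becomes something like `<project>_uptime-kuma` unless you set `name:` explicitly. Check
`docker volume ls` for the real name.

```bash
docker volume ls                            # List named volumes
docker volume inspect uptime-kuma           # See where it is stored (use the name shown by ls)
docker volume rm uptime-kuma                # Delete a volume (and its data!)
```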

**Access a named volume from a one-off container:**
```bash
docker run --rm -v uptime-kuma:/data alpine ls /data
```

This is the pattern used throughout this homelab to inspect or modify a volume. Reads are
safe while the service is running; stop the service before writing so it does not see
files change underneath it.
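
The same one-off pattern can live in a compose file so it does not have to be retyped. A sketch, assuming the volume really is named `uptime-kuma` on the host (check `docker volume ls`) and using a hypothetical `volume-peek` service behind a hypothetical `tools` profile so it never starts with a normal `up`:

```yaml
# Illustrative only: the service and profile names are assumptions.
services:
  volume-peek:
    image: alpine
    profiles: ["tools"]              # skipped by a plain `docker compose up`
    command: ["ls", "-la", "/data"]
    volumes:
      - uptime-kuma:/data:ro         # read-only, so the running service is not disturbed

volumes:
  uptime-kuma:
    external: true                   # reuse the existing volume instead of creating one
```

Run it with `docker compose run --rm volume-peek`. Dropping `:ro` turns it into a write-capable helper, which should only be used after stopping the service that owns the volume.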

---

## Docker Compose — The Full Picture

Docker Compose reads a YAML file and manages the lifecycle of multiple containers.

```yaml
services:
  forgejo:
    image: codeberg.org/forgejo/forgejo:latest
    container_name: forgejo           # Fixed name (not random)
    restart: unless-stopped           # Restart on crash or host reboot
    env_file: .env                    # Load environment variables from file
    environment:
      FORGEJO__server__DOMAIN: gitforge.kitestacks.com   # Override one env var
    volumes:
      - ./data:/data                  # Bind mount: ./data on host → /data in container
    ports:
      - "127.0.0.1:2222:22"          # Bind host 127.0.0.1:2222 to container port 22 (SSH)
    networks:
      - kitestacks
    depends_on:
      - authentik-postgres             # Start this service first; it must be defined in this same compose file
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/api/healthz"]
      interval: 30s
      timeout: 10s
      retries: 3

networks:
  kitestacks:
    external: true                    # Use existing network (don't create a new one)
```

**Key fields explained:**

`restart: unless-stopped`
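- `no`: never restart
- `always`: restart whenever the container exits; a container you stopped manually stays stopped until the Docker daemon restarts (for example after a reboot), then starts again
- `on-failure`: restart only if the exit code is non-zero
- `unless-stopped`: like `always`, except a container you stopped manually stays stopped even after the daemon restarts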

`env_file: .env`
Reads `KEY=VALUE` pairs from a file. The `.env` file is in `.gitignore` so secrets
never get committed to git. Always use this for passwords, tokens, and secrets.

`depends_on`
Starts services in dependency order. Does NOT wait for a service to be "ready" —
just waits for the container to START. If you need to wait for a database to be ready,
add a health check and use `condition: service_healthy`.
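
A minimal sketch of that readiness pattern follows. The service names `db` and `app` and the password are placeholders, not part of this homelab; `pg_isready` ships with the official Postgres image.

```yaml
# Illustrative only: service names and the password are placeholders.
services:
  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_PASSWORD: example-only
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 3s
      retries: 10

  app:
    image: alpine
    command: ["sh", "-c", "echo 'database is ready' && sleep 3600"]
    depends_on:
      db:
        condition: service_healthy   # wait for the healthcheck to pass, not just for container start
```

With the short list form of `depends_on`, `app` would start as soon as the `db` container exists. The long form shown here holds it back until `pg_isready` succeeds.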

**Common commands:**
```bash
docker compose up -d              # Start all services in background
docker compose down               # Stop and remove containers (not volumes)
docker compose down -v            # Stop, remove containers AND volumes (data loss!)
docker compose restart forgejo    # Restart one service (does not apply compose file changes; use up -d for that)
docker compose pull               # Pull latest images
docker compose logs -f forgejo    # Follow logs for one service
docker compose ps                 # Show service status
docker compose exec forgejo bash  # Open shell in running service
docker compose config             # Validate and show merged config
```

---

## Port Mappings — When to Use Them

```yaml
ports:
  - "3005:3000"           # host_port:container_port (binds all host interfaces by default)
  - "127.0.0.1:3005:3000" # bind to localhost only (not reachable from other machines)
  - "0.0.0.0:9100:9100"   # explicit form of the default: all IPv4 interfaces
```

**In this homelab, most services do NOT expose host ports** — they only communicate
through the Docker network. Cloudflare Tunnel connects directly to the container via
the Docker bridge network, so no host ports are needed for public services.

Only two services publish host ports:
- `node-exporter` on kscloud1 (so Prometheus on monk can scrape it via public IP)
- `portainer` on 9443 (HTTPS)

`kitestacks-metrics-api` needs no `ports:` entry because it uses `network_mode: host` and
binds host ports directly.
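
The "no host ports" layout for a public service can be sketched as below. This is one common way to run a token-based tunnel, not a copy of this homelab's actual files; the `TUNNEL_TOKEN` value is assumed to come from `.env`.

```yaml
# Illustrative only: a sketch of the layout, not this homelab's real compose file.
services:
  grafana:
    image: grafana/grafana
    container_name: grafana
    networks:
      - kitestacks
    # no ports: section, so nothing is published on the host

  cloudflared:
    image: cloudflare/cloudflared
    command: tunnel --no-autoupdate run
    environment:
      TUNNEL_TOKEN: ${TUNNEL_TOKEN}   # assumed to be set in .env
    networks:
      - kitestacks                    # reaches http://grafana:3000 over the bridge

networks:
  kitestacks:
    external: true
```

Because `grafana` has no published port, the only way in from outside the host is through the tunnel.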

---

## Inspecting and Debugging

```bash
# See everything about a container
docker inspect forgejo

# See just its IP address on each network
docker inspect forgejo --format '{{range $name, $net := .NetworkSettings.Networks}}{{println $name $net.IPAddress}}{{end}}'

# See its environment variables (careful — this shows secrets!)
docker inspect forgejo --format '{{range .Config.Env}}{{println .}}{{end}}'

# See its mounts
docker inspect forgejo --format '{{json .Mounts}}' | python3 -m json.tool

# See resource usage
docker stats                    # Live, all containers
docker stats forgejo --no-stream # One snapshot for one container

# See what the container's filesystem looks like
docker exec forgejo ls /
docker exec forgejo cat /data/gitea/conf/app.ini   # default config location in the standard Forgejo image
docker exec forgejo find /data -name "*.db" 2>/dev/null
```

---

## Common Gotchas

**Containers share the host's kernel:** An image supplies userland only. If software in
an image relies on kernel features or syscalls newer than your host kernel provides, it
can fail in confusing ways. Rare but real.

**Named volumes are invisible by default:** They live under `/var/lib/docker/volumes/`,
not in your project directory, so newcomers often spend hours wondering where the data
went after deleting a container. Named volumes survive `docker compose down`.
They do NOT survive `docker compose down -v`.

**Order vs readiness:** `depends_on` does not mean "wait until ready." A Postgres
container starts almost immediately, but PostgreSQL inside it can take several seconds
(longer on first run, while it initializes its data directory) to accept connections.
Use healthchecks with `condition: service_healthy` for real readiness checking.

**Port conflicts:** Two containers cannot bind the same host port. If you get
`Bind for 0.0.0.0:3000 failed: port is already allocated`, something else is already
using that host port.

**network_mode: host and named networks cannot coexist:**
```yaml
network_mode: host    # This means the container has NO network isolation
# You cannot also add networks: [...] — they conflict
```

---

## Practice Exercises

1. Pull the `nginx:alpine` image and run it: `docker run -d -p 8080:80 nginx:alpine`
   Visit `http://localhost:8080`. Then exec into it and find the nginx config.

2. Run two containers (`alpine`) on the same custom network and verify they can
   ping each other by container name.

3. Create a named volume and mount it in two different containers. Write a file from
   one container and read it from the other.

4. Write a `docker-compose.yml` with three services: one nginx, one redis, one alpine
   that waits for redis to be healthy before starting.

5. Use `docker inspect` to find the IP address of your `forgejo` container on the
   `kitestacks` network. Confirm it matches what Docker DNS resolves.

---

**Next:** [Part 5 — Networking](05-networking.md)