kitestacks-homelab/homelab-mastery/architecture/overview.md
kenpat 0d3fc4051c merge: add homelab-mastery as subdir
Moved homelab-mastery repo content into homelab-mastery/ subdirectory.
Covers architecture, concepts, certifications, interview-prep, and learning-path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-19 00:33:54 -05:00

KiteStacks Architecture — Full System Overview

The Big Picture

                        INTERNET
                           │
                    ┌──────▼──────┐
                    │  Cloudflare  │  DNS + TLS termination
                    │   (edge)     │  Zero Trust Tunnel
                    └──────┬──────┘
                           │  HTTPS (443) only
          ┌────────────────┼────────────────┐
          │ connector 1    │ connector 2    │ connector 3
          │                │               │
   ┌──────▼──────┐         │        ┌──────▼──────┐
   │    MONK     │         │        │   KSCLOUD1  │
   │ (home PC)   │         │        │ (Hetzner VPS│
   │             │  Active │        │  5.78.x.x)  │
   │ All 9       │  Active │        │             │
   │ services    │         │        │ All 9       │
   │             │         │        │ services    │
   └──────┬──────┘         │        └──────┬──────┘
          │                │               │
          └────────────────┼───────────────┘
                     TAILSCALE VPN
                    (100.x.x.x range)
                           │
                  ┌────────▼────────┐
                  │  SHARED DB LAYER │
                  │  on kscloud1    │
                  │  Postgres :5432  │
                  │  Redis    :6379  │
                  │  (Tailscale     │
                  │   only, private)│
                  └─────────────────┘

Every Service and What It Does

The Nine Public Services

| Service | Container Name | What It Does | Why It's Here |
|---|---|---|---|
| Portal | homepage | The public website (kitestacks.com): custom nginx serving static HTML/CSS/JS with a cyberpunk theme | Front door to everything. Shows system stats, recent activity, links to all services |
| Authentik | authentik | Identity provider: handles all logins via OIDC/OAuth2 SSO | Single place to manage all user accounts and access control |
| Forgejo | forgejo | Self-hosted Git platform (like GitHub but yours) | Store all homelab code, config, and documentation |
| OpenProject | openproject | Project management (like Jira) | Task tracking, project planning |
| Open WebUI | kite-openwebui | ChatGPT-like AI chat interface | Access multiple AI models through one interface |
| Karakeep | karakeep | Bookmark and read-it-later manager | Save links, articles, and content |
| Kavita | kavita | eBook and manga reader | Personal digital library |
| Grafana | grafana | Monitoring dashboards | Visualize CPU, RAM, network, uptime across both hosts |
| Uptime Kuma | uptime-kuma | Status page and uptime monitoring | Monitor that all 9 services are up and alert if they go down |

The Infrastructure Services (Not Public-Facing)

| Service | What It Does |
|---|---|
| cloudflared | Cloudflare Tunnel connector: creates an encrypted outbound tunnel to the Cloudflare edge |
| prometheus | Metrics collection: scrapes system stats from both monk and kscloud1 every 15 seconds |
| node-exporter | Exposes host system metrics (CPU, RAM, disk, network) for Prometheus to scrape |
| kite-litellm | LLM proxy gateway: routes AI requests to OpenRouter (multiple free models) |
| portainer | Docker management UI: visual interface to manage all containers |
| kitestacks-metrics-api | Python FastAPI service: serves real-time system stats, weather, and Forgejo activity to the portal |

How Traffic Flows

When Someone Visits www.kitestacks.com

1. Browser sends HTTPS request to www.kitestacks.com
2. DNS resolves to Cloudflare's anycast IP (not your home IP)
3. Cloudflare terminates TLS at its edge, so your home router never receives an inbound HTTPS connection
4. Cloudflare routes the request through the tunnel to one of the
   connected cloudflared connectors (on monk or kscloud1)
5. cloudflared resolves "homepage" via Docker DNS
6. Request hits the nginx container serving the static portal
7. Portal's JavaScript fetches /api/metrics and /api/activity
   from the kitestacks-metrics-api container via nginx proxy
8. Page renders with live system stats and recent git activity

When Someone Clicks "Sign In with Authentik"

1. App (e.g., Grafana) redirects browser to auth.kitestacks.com/application/o/authorize/
2. Authentik presents login page
3. User enters credentials — Authentik validates against its database
   (stored on kscloud1's Postgres, shared over Tailscale)
4. Authentik generates an authorization code and redirects back to Grafana
5. Grafana's backend calls auth.kitestacks.com/application/o/token/
   to exchange the code for an access token
6. Authentik validates the code (found in shared DB) and returns the tokens, including an ID token in JWT form
7. Grafana reads the user's email/name from the ID token's claims and logs them in

The critical detail: Steps 1 and 5 can hit different tunnel connectors (monk vs kscloud1), and therefore different Authentik instances. The authorization code from step 4 must exist in whichever database the instance serving step 5 reads. That's why the Authentik instances on both hosts point to the SAME Postgres on kscloud1. With separate databases, step 5 would return invalid_grant whenever it landed on the other host, because the code would not be found.
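
The failure mode can be reproduced with a toy model. The sketch below is not Authentik's code: `AuthServer`, `authorize`, and `token` are hypothetical names, and a plain dict stands in for the Postgres table that holds authorization codes. It only shows why a code issued by one instance is useless to another unless both share a store.

```python
import secrets


class AuthServer:
    """Toy stand-in for one Authentik instance (hypothetical, for illustration)."""

    def __init__(self, code_store: dict):
        # code_store plays the role of the database holding authorization codes.
        self.code_store = code_store

    def authorize(self, user: str) -> str:
        """Step 4: issue a one-time authorization code for a logged-in user."""
        code = secrets.token_urlsafe(16)
        self.code_store[code] = user
        return code

    def token(self, code: str) -> dict:
        """Steps 5-6: exchange the code; it must exist in this instance's store."""
        user = self.code_store.pop(code, None)  # codes are single-use
        if user is None:
            return {"error": "invalid_grant"}
        return {"access_token": secrets.token_urlsafe(16), "user": user}


# Separate databases: the code issued on monk is unknown to kscloud1.
monk, kscloud1 = AuthServer({}), AuthServer({})
split_result = kscloud1.token(monk.authorize("alice"))

# Shared database: both instances read and write the same store.
shared: dict = {}
monk, kscloud1 = AuthServer(shared), AuthServer(shared)
shared_result = kscloud1.token(monk.authorize("alice"))

print(split_result)           # the error response
print(shared_result["user"])  # alice
```

In the split case the exchange fails even though the user logged in correctly, which matches the invalid_grant symptom described above.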


The Two Hosts in Detail

Monk (Primary Home Machine)

  • Role: Primary production host
  • Network: Home LAN, no open ports on router (Cloudflare Tunnel handles all inbound)
  • Services: All 9 public services + all infrastructure services
  • Data: Each service has its own local database/storage, except Authentik
  • Authentik DB: Points to kscloud1's Postgres over Tailscale (100.x.x.x)

kscloud1 (Hetzner VPS)

  • Role: Permanent cloud replica — always on, even when monk is off (travel, power outage, etc.)
  • Network: Public IP, Cloudflare Tunnel connector 3
  • Services: Full replica of all 9 public services (separate databases except Authentik)
  • Hosts: The shared Authentik Postgres + Redis (bound to Tailscale interface only)
  • Resources: 3 vCPU, 3.7 GB RAM — tight but functional

What's the Same Across Both

  • Same Cloudflare Tunnel token (different connector IDs assigned automatically)
  • Same Authentik database (shared via Tailscale)
  • Same Authentik secret key (required for JWT validation)
  • Same kavita.db (one-time sync — users and OIDC config)

What's Different Across Both

  • Forgejo data (separate repos — accepted inconsistency)
  • OpenProject data (separate projects)
  • Karakeep bookmarks (separate)
  • Kavita book files (monk has them, kscloud1 doesn't; covers are synced, book files are not)

The Docker Network

Every container joins the kitestacks external Docker bridge network:

docker network create kitestacks

This is what makes Cloudflare Tunnel work. The cloudflared container is also on this network, so when Cloudflare tells cloudflared to route http://grafana:3000, Docker's internal DNS resolves grafana to the grafana container's IP on that network.

Without this shared network, cloudflared can't reach the service containers by name.
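
As a rough mental model, name resolution on a user-defined Docker network behaves like a lookup scoped to the networks two containers share. The sketch below is a toy, not Docker's resolver: the `NETWORKS` table, the `isolated` network, the `other-app` container, and all IP addresses are made up for illustration.

```python
class NameNotFound(Exception):
    """Raised when the source container shares no network with the target."""


# network name -> {container name -> IP on that network}; all values are made up.
NETWORKS = {
    "kitestacks": {"cloudflared": "172.18.0.2", "grafana": "172.18.0.3"},
    "isolated": {"other-app": "172.19.0.2"},
}


def resolve(source: str, target: str) -> str:
    """Return target's IP if source and target sit on at least one common network."""
    for members in NETWORKS.values():
        if source in members and target in members:
            return members[target]
    raise NameNotFound(f"{source} cannot resolve {target}")


print(resolve("cloudflared", "grafana"))  # 172.18.0.3
```

Calling `resolve("cloudflared", "other-app")` raises `NameNotFound`, which is the situation a service ends up in if it is left off the kitestacks network.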


Why No Open Ports on the Router

Traditional homelab: open port 80/443 on home router → NAT to home server → expose home IP.

Problems with that:

  • Your home IP is public (DDoS risk, targeted attacks)
  • Router configuration is fragile
  • ISP can change your IP (dynamic IP)
  • Some ISPs block port 80/443

Cloudflare Tunnel approach:

  • cloudflared container makes an OUTBOUND connection to Cloudflare
  • Cloudflare holds that connection open
  • Inbound requests come through Cloudflare, over that existing outbound tunnel
  • Your home IP is never exposed
  • Works on any network or ISP that allows outbound connections to Cloudflare

This is why you can run a public website from a home PC with zero router configuration.
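
The outbound-connection idea can be demonstrated on localhost with plain sockets. This is a minimal sketch of the pattern, not of cloudflared's actual protocol: the function names `edge`, `connector`, and `demo` are hypothetical, and the one-line "request" format is invented for the example. The point is that the side playing the home server never listens on a port, yet still receives a request.

```python
import socket
import threading


def edge(listener: socket.socket, results: list) -> None:
    """Plays Cloudflare: waits for the connector to dial in, then sends a request down that link."""
    conn, _ = listener.accept()
    with conn, conn.makefile("rwb") as stream:
        stream.write(b"GET /\n")           # the "inbound" request rides the existing connection
        stream.flush()
        results.append(stream.readline())  # the response comes back the same way


def connector(edge_port: int) -> None:
    """Plays cloudflared: dials OUT and never opens a listening port."""
    with socket.create_connection(("127.0.0.1", edge_port)) as conn, conn.makefile("rwb") as stream:
        request = stream.readline().strip()
        stream.write(b"200 OK for " + request + b"\n")
        stream.flush()


def demo() -> bytes:
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    results: list = []
    edge_thread = threading.Thread(target=edge, args=(listener, results))
    edge_thread.start()
    connector(port)
    edge_thread.join()
    listener.close()
    return results[0]


print(demo())  # b'200 OK for GET /\n'
```

Only the edge side calls `listen()`; the connector side uses nothing but an outbound `create_connection`, which is why no router port forwarding is needed.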


Tailscale — The Private Backbone

Tailscale creates a private overlay network (VPN mesh) across all your devices:

monk (100.x.x.x) ←—— encrypted ——→ kscloud1 (100.x.x.x)
monk (100.x.x.x) ←—— encrypted ——→ pixel-6 (100.x.x.x)

Used in this project for:

  1. Shared Authentik DB: kscloud1's Postgres binds to its Tailscale IP, not its public IP. Only devices on the tailnet can connect. Monk points to that address.
  2. Forgejo activity feed: On kscloud1, the metrics API fetches recent commits from monk's Forgejo via monk's Tailscale IP — so both portal instances show the same activity feed.
  3. SSH/Admin access: You can SSH into any device on the tailnet from anywhere.
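
Tailscale assigns device addresses from the 100.64.0.0/10 block, so "100.x.x.x" here means 100.64.0.0 through 100.127.255.255 rather than every address starting with 100. A small check like the one below can guard against binding Postgres or Redis to the wrong interface. The helper names and the sample addresses are illustrative assumptions, not part of the project.

```python
import ipaddress

# Tailscale's IPv4 address block (the carrier-grade NAT range).
TAILSCALE_RANGE = ipaddress.ip_network("100.64.0.0/10")


def is_tailnet_address(addr: str) -> bool:
    """True if addr falls inside Tailscale's IPv4 range."""
    return ipaddress.ip_address(addr) in TAILSCALE_RANGE


def is_private_db_bind(addr: str) -> bool:
    """True if binding a database here keeps it off public interfaces."""
    ip = ipaddress.ip_address(addr)
    return ip.is_loopback or ip in TAILSCALE_RANGE


print(is_tailnet_address("100.101.102.103"))  # True
print(is_private_db_bind("0.0.0.0"))          # False: listens on every interface
print(is_private_db_bind("203.0.113.10"))     # False: a public-style address
```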

The Monitoring Stack

node-exporter (monk)  →  prometheus (monk)  →  grafana (monk)
node-exporter (kscloud1) ↗       (scrapes 5.78.x.x:9100)

Prometheus scrapes metrics every 15 seconds from:

  • node-exporter:9100 for monk's own node-exporter (resolved via Docker DNS)
  • 5.78.x.x:9100 for kscloud1's node-exporter (reached over its public IP; the port is published on 0.0.0.0)

Grafana visualizes both, letting you switch between hosts in the instance picker.
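
Each scrape is an HTTP GET of the exporter's /metrics endpoint, which returns plain text with one sample per line. The sketch below parses that text format in a simplified way: it ignores optional timestamps and treats the metric name plus its labels as one opaque key. The `SAMPLE` values are invented, and `parse_metrics` is a hypothetical helper, not something Prometheus exposes.

```python
# Invented sample in the Prometheus text exposition format.
SAMPLE = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
node_memory_MemAvailable_bytes 2.1e+09
node_cpu_seconds_total{cpu="0",mode="idle"} 12345.6
"""


def parse_metrics(text: str) -> dict[str, float]:
    """Map each series (name plus labels) to its value, skipping comments."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # HELP/TYPE lines and blanks carry no sample
        series, value = line.rsplit(None, 1)  # the value is the last field
        samples[series] = float(value)
    return samples


metrics = parse_metrics(SAMPLE)
print(metrics["node_load1"])  # 0.42
```

Prometheus stores every such sample with a timestamp and an instance label, which is what lets Grafana separate monk's series from kscloud1's.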


The Portal Architecture

The portal is NOT gethomepage or any pre-built dashboard. It's a custom-built static site:

nginx (container: "homepage")
  ├── /         → serves static HTML/CSS/JS from ./public/
  └── /api/*    → proxy_pass to kitestacks-metrics-api:8000 (host)

kitestacks-metrics-api (network_mode: host, pid: host)
  ├── GET /api/metrics   → psutil reads HOST's CPU/RAM/disk/network
  ├── GET /api/weather   → wttr.in API → current weather by IP geolocation
  ├── GET /api/activity  → Forgejo API → recent commits
  └── GET /api/health    → {"ok": true}

The metrics API runs with network_mode: host and pid: host so that psutil sees the host's network interfaces and process table rather than the container's isolated namespaces. Without these settings, the portal would show the container's view of the system instead of the host's.
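
Network counters are a concrete case. On Linux they come from /proc/net/dev, which lists only the interfaces in the current network namespace, so a container on a bridge network sees its own virtual interface instead of the host's. The real service uses psutil; the stdlib-only sketch below parses the same file format to show what is being read. `SAMPLE` is invented data and `network_totals` is a hypothetical helper.

```python
# Invented sample in the /proc/net/dev layout: two header lines, then one row per interface.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
  eth0:    5000      50    0    0    0     0          0         0     3000      30    0    0    0     0       0          0
"""


def network_totals(proc_net_dev: str) -> dict[str, tuple[int, int]]:
    """Return {interface: (bytes_received, bytes_sent)} from /proc/net/dev text."""
    totals = {}
    for line in proc_net_dev.splitlines()[2:]:  # skip the two header lines
        name, _, counters = line.partition(":")
        fields = counters.split()
        if len(fields) >= 9:
            # Field 0 is received bytes; field 8 is transmitted bytes.
            totals[name.strip()] = (int(fields[0]), int(fields[8]))
    return totals


print(network_totals(SAMPLE)["eth0"])  # (5000, 3000)
```

On a real host the same function can be fed the contents of /proc/net/dev; the set of interfaces it returns depends on which network namespace the process runs in.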