KiteStacks Architecture Overview

Last updated: 2026-06-18
Status: Active production homelab


High-Level Architecture

┌──────────────────────────────────────────────────┐
│               Public Internet                    │
│           (via Cloudflare Tunnel)                │
└───────────────────────┬──────────────────────────┘
                        │
          ┌─────────────▼──────────────┐
          │     Cloudflare Zero Trust  │
          │     Active-Active Tunnel   │
          └──────┬────────────┬────────┘
                 │            │
    ┌────────────▼───┐  ┌─────▼───────────────┐
    │   monk (home)  │  │   kscloud1 (Hetzner)│
    │ cloudflared    │  │ cloudflared         │
    │ All services   │  │ Replica services    │
    │ Tailscale mesh │  │ Shared Authentik DB │
    └────────────────┘  └─────────────────────┘
           │                    │
           └────────────────────┘
              Tailscale overlay
              (private network)

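Both machines run cloudflared with the same Cloudflare Tunnel token, so each registers as a connector for the same tunnel and Cloudflare routes requests to whichever connector is available. If monk goes offline, kscloud1 keeps serving the replicated services (portal and wiki) within seconds; services that run only on monk stay unreachable until it returns.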


Service Map

Identity & Access

Service               Host      URL                  Purpose
Authentik server      monk      auth.kitestacks.com  SSO identity provider
Authentik worker      monk      (internal)           Background jobs, flow execution
Authentik LDAP        monk      (internal)           LDAP proxy for non-OIDC apps
Authentik PostgreSQL  kscloud1  (Tailscale only)     Shared auth database
Authentik Redis       kscloud1  (Tailscale only)     Session cache

Infrastructure

Service      Host             URL                       Purpose
cloudflared  monk + kscloud1  (no UI)                   CF Tunnel connector
Portainer    monk             portainer.kitestacks.com  Docker container management
Forgejo      monk             gitforge.kitestacks.com   Self-hosted Git (repos + CI)
Uptime Kuma  monk             status.kitestacks.com     Service uptime monitoring

Observability

Service            Host  URL                     Purpose
Prometheus         monk  (internal)              Metrics collection
Grafana            monk  grafana.kitestacks.com  Metrics dashboards
Node Exporter      monk  (internal)              Host OS metrics
Blackbox Exporter  monk  (internal)              External endpoint probing

Knowledge & Productivity

Service    Host             URL                    Purpose
BookStack  monk + kscloud1  wiki.kitestacks.com    Internal wiki / documentation
Karakeep   monk             links.kitestacks.com   Bookmark manager
Kavita     monk             kavita.kitestacks.com  Ebook/manga reader
osTicket   monk             tasks.kitestacks.com   Help desk / ticket system
ntfy       monk             (push notifications)   Push notifications

AI Stack

Service     Host  URL                Purpose
Open WebUI  monk  ai.kitestacks.com  Chat interface (GPT-4, Claude, local)
LiteLLM     monk  (internal)         LLM API proxy / model router

Portal

Service            Host             URL                 Purpose
KiteStacks Portal  monk + kscloud1  www.kitestacks.com  Custom homepage / service launcher
Metrics API        monk             (internal at /api)  FastAPI service: live stats for portal

Authentication Flow

Services authenticate through Authentik SSO via OIDC/OAuth2 (apps without OIDC support go through the Authentik LDAP proxy instead):

Browser → https://service.kitestacks.com
    │
    └─► Service: "Not logged in" → redirect to Authentik
             │
             ▼
        https://auth.kitestacks.com/if/flow/...
             │
             ├─ User logs in with username + password
             ├─ Authentik validates credentials
             └─ Issues authorization code → redirect back to service
                         │
                         ▼
                   Service exchanges code for tokens
                   Validates the ID token (JWT), reads user info (email)
                   Creates local session
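
The redirect and code-return steps of the flow above can be sketched with the standard library alone. This is an illustrative sketch, not the configuration of any service in this repo: the authorize endpoint path, client ID, and redirect URI below are assumptions, and the token exchange itself (an HTTPS POST to the provider) is left out.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse

# Assumed values for illustration only; none of them are taken from the deployment.
AUTHORIZE_ENDPOINT = "https://auth.kitestacks.com/application/o/authorize/"
CLIENT_ID = "example-client-id"
REDIRECT_URI = "https://service.kitestacks.com/oidc/callback"


def build_authorize_url(state: str) -> str:
    """URL the service redirects an unauthenticated browser to."""
    params = {
        "response_type": "code",  # authorization code flow
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": "openid email profile",
        "state": state,  # random value echoed back by the provider
    }
    return f"{AUTHORIZE_ENDPOINT}?{urlencode(params)}"


def extract_code(callback_url: str, expected_state: str) -> str:
    """Read the authorization code from the redirect back to the service."""
    query = parse_qs(urlparse(callback_url).query)
    if query.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch, aborting login")
    codes = query.get("code")
    if not codes:
        raise ValueError("no authorization code in callback")
    return codes[0]


# Simulate one round trip with a made-up authorization code.
state = secrets.token_urlsafe(16)
login_url = build_authorize_url(state)
callback = f"{REDIRECT_URI}?{urlencode({'code': 'abc123', 'state': state})}"
code = extract_code(callback, state)
```

Checking the state value before accepting the code is what ties the callback to the login attempt that started it.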

BookStack-specific note: OIDC_ISSUER_DISCOVER=true and OIDC_ISSUER must point to the per-app URL (/application/o/bookstack/), not the global Authentik URL. The Authentik provider must have issuer_mode='per_provider'.
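
To illustrate why the issuer has to be the per-app URL: OIDC discovery appends a well-known path to the issuer, and the issuer value in the returned metadata has to match the configured one. The sketch below assumes the issuer is the per-app path from the note joined to auth.kitestacks.com; the metadata dictionaries are hypothetical stand-ins for what the provider would return.

```python
# Issuer assembled from the note above (per-app URL on the Authentik host).
CONFIGURED_ISSUER = "https://auth.kitestacks.com/application/o/bookstack/"


def discovery_url(issuer: str) -> str:
    """OIDC discovery document location for a given issuer."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"


def issuer_matches(configured: str, metadata: dict) -> bool:
    """The issuer in the discovery document must equal the configured issuer."""
    return metadata.get("issuer") == configured


# Hypothetical discovery responses, for illustration only.
per_provider_metadata = {"issuer": "https://auth.kitestacks.com/application/o/bookstack/"}
global_metadata = {"issuer": "https://auth.kitestacks.com/"}

print(discovery_url(CONFIGURED_ISSUER))
print(issuer_matches(CONFIGURED_ISSUER, per_provider_metadata))
print(issuer_matches(CONFIGURED_ISSUER, global_metadata))
```

With a global issuer the comparison fails, which is the mismatch the per-provider issuer mode avoids.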


Network Architecture

External Access

All public traffic enters via Cloudflare Tunnel. No ports are open on monk's router. kscloud1 (Hetzner) has no firewall rules open for HTTP/HTTPS either; all access to it goes through the same tunnel.
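
One way to spot-check the "no open ports" claim is a plain TCP connect test run from a machine outside the network. This is a minimal sketch; the function name is hypothetical and the host to probe (the home or Hetzner public IP) is left to the reader.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Run against the public IP of either machine, a check of ports 80 and 443 would be expected to return False if the setup matches the description above.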

Internal Networking

  • All Docker containers attach to the kitestacks bridge network
  • Containers reach each other by container name, which doubles as a DNS hostname (e.g., bookstack-db, prometheus)
  • Docker's embedded DNS server (127.0.0.11) resolves container names automatically
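
The name resolution described in the list above can be checked from inside any container on the network. The helper below is a small sketch with a hypothetical name; the resolver is injectable so the function can be exercised outside Docker, and the stub addresses are made up.

```python
import socket
from typing import Callable, Optional


def resolve_container(
    name: str, resolver: Callable[[str], str] = socket.gethostbyname
) -> Optional[str]:
    """Return the IP address for a container name, or None if it does not resolve."""
    try:
        return resolver(name)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


# Stand-in resolver with made-up addresses, so the sketch runs outside Docker.
_fake_records = {"bookstack-db": "172.18.0.5"}


def fake_resolver(name: str) -> str:
    if name not in _fake_records:
        raise socket.gaierror(f"unknown host: {name}")
    return _fake_records[name]


print(resolve_container("bookstack-db", fake_resolver))
print(resolve_container("missing-container", fake_resolver))
```

Inside a container attached to the kitestacks network, calling resolve_container("bookstack-db") with the default resolver would go through Docker's embedded DNS.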

Tailscale Overlay

Tailscale creates an encrypted mesh between monk and kscloud1:

  • Used for: Authentik PostgreSQL/Redis access, SSH to kscloud1, Prometheus scraping kscloud1 metrics
  • Not used for: public traffic (that goes through Cloudflare)
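
Since the Authentik database and Redis are meant to be reachable over Tailscale only, it can help to verify that a configured host address really is a tailnet address. The sketch below relies on Tailscale assigning IPv4 addresses from the 100.64.0.0/10 range; the function name and sample addresses are illustrative.

```python
import ipaddress

# Tailscale assigns node IPv4 addresses from the CGNAT range.
TAILSCALE_IPV4_RANGE = ipaddress.ip_network("100.64.0.0/10")


def is_tailscale_ipv4(address: str) -> bool:
    """True if the address falls inside the Tailscale IPv4 range."""
    return ipaddress.ip_address(address) in TAILSCALE_IPV4_RANGE


print(is_tailscale_ipv4("100.101.102.103"))  # inside 100.64.0.0/10
print(is_tailscale_ipv4("192.168.1.10"))     # ordinary LAN address
```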

Storage Layout

monk

~/kitestacks-live/docker/
├── authentik/          # media, custom-templates
├── bookstack/          # config/, db/
├── cloudflared/        # .env (TUNNEL_TOKEN)
├── forgejo/            # data/
├── grafana/            # grafana_data volume
├── karakeep/           # data/
├── kavita/             # config/
├── kitestacks-portal/  # static HTML + nginx
├── osticket/           # db/, uploads/
├── portainer/          # portainer_data volume
└── prometheus/         # prometheus.yml, prometheus_data volume

kscloud1

/opt/kitestacks/docker/
├── authentik/          # postgresql data volume, redis data
├── bookstack/          # config/, db/
├── cloudflared/        # .env (same TUNNEL_TOKEN)
└── ...                 # replica services

Resilience Model

Scenario                   Impact                                                             Recovery
monk goes offline          All monk services unreachable; kscloud1 serves portal + wiki       Automatic (CF Tunnel failover)
kscloud1 goes offline      Authentik logins may fail (DB unreachable); all other services up  Restart kscloud1 or point Authentik to local postgres
Cloudflare Tunnel down     All public access lost; Tailscale still works                      Check CF dashboard; restart cloudflared
MariaDB crash (BookStack)  BookStack down                                                     docker restart bookstack-db, then docker restart bookstack
Portainer lockout          No Docker UI                                                       Use portainer/helper-reset-password
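
The host-failure rows above follow directly from which hosts run each public service. The sketch below models that with a subset of the service map; it is a simplification that deliberately ignores backend dependencies such as the Authentik database living on kscloud1.

```python
# Public URLs and the hosts that run them, copied from the service map (subset).
SERVICE_HOSTS = {
    "www.kitestacks.com": {"monk", "kscloud1"},
    "wiki.kitestacks.com": {"monk", "kscloud1"},
    "auth.kitestacks.com": {"monk"},
    "grafana.kitestacks.com": {"monk"},
    "gitforge.kitestacks.com": {"monk"},
}


def reachable(online_hosts: set[str]) -> list[str]:
    """Public URLs that still have at least one online host behind the tunnel."""
    return sorted(
        url for url, hosts in SERVICE_HOSTS.items() if hosts & online_hosts
    )


# monk offline: only the replicated services remain.
print(reachable({"kscloud1"}))
# kscloud1 offline: everything hosted on monk is still served.
print(reachable({"monk"}))
```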

Key Design Decisions

Why Cloudflare Tunnel instead of port-forwarding? Port-forwarding exposes your home IP, requires a static IP or dynamic DNS, and cannot fail over on its own. Cloudflare Tunnel is available on the free plan, hides origin IPs, and supports multi-origin failover by running more than one connector.

Why active-active instead of active-passive? Active-passive requires detecting failure and switching over. With active-active (same token, two connectors), Cloudflare handles routing automatically. This is simpler and gives near-zero recovery time (RTO) for the replicated services.

Why Authentik over Keycloak or Authelia? Authentik is easier to self-host (Docker Compose, sensible defaults), has a good UI, and supports LDAP + OIDC + SAML. Authelia lacks SAML. Keycloak is heavier and more complex.

Why BookStack over Notion/Confluence? BookStack is self-hosted, makes no external API calls, supports Markdown editing, and integrates with OIDC SSO. Data stays in-house.

Why Forgejo over GitLab? Forgejo is lightweight (~200 MB RAM vs GitLab's 4 GB+). It is a full Git server with CI runners, issues, and PRs. GitLab is overkill for a homelab.