From 4f91c427803ee0f8b6074cc3d6c40e2d8b6a9b79 Mon Sep 17 00:00:00 2001
From: kenpat
Date: Thu, 18 Jun 2026 18:46:47 -0500
Subject: [PATCH] Rewrite architecture overview and build guide in simple plain-English

Both docs now use everyday analogies (Cloudflare = post office, Authentik = doorman)
instead of technical jargon, making them accessible to anyone learning the project.

Co-Authored-By: Claude Sonnet 4.6
---
 architecture/overview.md | 249 +++++++++++++++++++--------------------
 build-guide/README.md    |  53 +++++----
 2 files changed, 145 insertions(+), 157 deletions(-)

diff --git a/architecture/overview.md b/architecture/overview.md
index 0868a0f..092aa89 100644
--- a/architecture/overview.md
+++ b/architecture/overview.md
@@ -1,186 +1,173 @@
-# KiteStacks Architecture Overview
+# KiteStacks Architecture — How It All Works

-**Last updated:** 2026-06-18
-**Status:** Active production homelab
+**Last updated:** 2026-06-18

 ---

-## High-Level Architecture
+## The Simple Version
+
+KiteStacks is two computers working together to run a bunch of websites.
+
+- **monk** — your home machine. Runs almost everything.
+- **kscloud1** — a rented computer in Germany (Hetzner). Backs everything up.
+
+People visit the websites through **Cloudflare**, which acts like a secret post-office.
+Cloudflare knows where monk and kscloud1 are, but the rest of the internet doesn't.
+That means your home address never gets exposed.

 ```
-┌──────────────────────────────────────────────────┐
-│                 Public Internet                  │
-│             (via Cloudflare Tunnel)              │
-└───────────────────────┬──────────────────────────┘
-                        │
-          ┌─────────────▼──────────────┐
-          │   Cloudflare Zero Trust    │
-          │    Active-Active Tunnel    │
-          └──────┬────────────┬────────┘
-                 │            │
-    ┌────────────▼───┐  ┌─────▼──────────────┐
-    │ monk (home)    │  │  kscloud1 (Hetzner)│
-    │ cloudflared    │  │  cloudflared       │
-    │ All services   │  │  Replica services  │
-    │ Tailscale mesh │  │ Shared Authentik DB │
-    └────────────────┘  └─────────────────────┘
-            │                    │
-            └────────────────────┘
-              Tailscale overlay
-              (private network)
+You (on any device) → Cloudflare (the post office) → monk or kscloud1
 ```

-The two machines share one Cloudflare Tunnel token, so Cloudflare load-balances across both connectors automatically. If monk goes offline, kscloud1 continues serving all public subdomains within seconds.
+If monk goes offline, Cloudflare automatically sends traffic to kscloud1 instead.
+Both are always ready to handle requests — this is called **active-active**.

 ---

-## Service Map
+## What Each Service Does

-### Identity & Access
-| Service | Host | URL | Purpose |
-|---------|------|-----|---------|
-| Authentik server | monk | auth.kitestacks.com | SSO identity provider |
-| Authentik worker | monk | (internal) | Background jobs, flow execution |
-| Authentik LDAP | monk | (internal) | LDAP proxy for non-OIDC apps |
-| Authentik PostgreSQL | kscloud1 | (Tailscale only) | Shared auth database |
-| Authentik Redis | kscloud1 | (Tailscale only) | Session cache |
+### Login (Identity)
+| Service | What it does |
+|---------|-------------|
+| **Authentik** | The doorman — checks who you are before letting you into any site |
+| Authentik worker | Runs background jobs for Authentik |
+| Authentik PostgreSQL | The address book — stores all usernames and passwords (on kscloud1) |
+| Authentik Redis | Fast memory — remembers who is logged in so you don't need to log in again |

 ### Infrastructure
-| Service | Host | URL | Purpose |
-|---------|------|-----|---------|
-| cloudflared | monk + kscloud1 | (no UI) | CF Tunnel connector |
-| Portainer | monk | portainer.kitestacks.com | Docker container management |
-| Forgejo | monk | gitforge.kitestacks.com | Self-hosted Git (repos + CI) |
-| Uptime Kuma | monk | status.kitestacks.com | Service uptime monitoring |
+| Service | What it does |
+|---------|-------------|
+| **cloudflared** | Runs on both machines — creates the secret tunnel to Cloudflare |
+| **Portainer** | A control panel to manage all the little program-boxes (containers) |
+| **Forgejo** | Like GitHub but yours — stores all the code and scripts |
+| **Uptime Kuma** | A watchdog — alerts when any service goes down |

-### Observability
-| Service | Host | URL | Purpose |
-|---------|------|-----|---------|
-| Prometheus | monk | (internal) | Metrics collection |
-| Grafana | monk | grafana.kitestacks.com | Metrics dashboards |
-| Node Exporter | monk | (internal) | Host OS metrics |
-| Blackbox Exporter | monk | (internal) | External endpoint probing |
+### Monitoring
+| Service | What it does |
+|---------|-------------|
+| **Prometheus** | Collects numbers (CPU, memory, disk) from both machines every 15 seconds |
+| **Grafana** | Turns those numbers into charts you can watch |
+| **Node Exporter** | Runs on each machine and reports its health to Prometheus |

-### Knowledge & Productivity
-| Service | Host | URL | Purpose |
-|---------|------|-----|---------|
-| BookStack | monk + kscloud1 | wiki.kitestacks.com | Internal wiki / documentation |
-| Karakeep | monk | links.kitestacks.com | Bookmark manager |
-| Kavita | monk | kavita.kitestacks.com | Ebook/manga reader |
-| OSTicket | monk | tasks.kitestacks.com | Help desk / ticket system |
-| ntfy | monk | (push notifications) | Push notifications |
-
-### AI Stack
-| Service | Host | URL | Purpose |
-|---------|------|-----|---------|
-| Open WebUI | monk | ai.kitestacks.com | Chat interface (GPT-4, Claude, local) |
-| LiteLLM | monk | (internal) | LLM API proxy / model router |
-
-### Portal
-| Service | Host | URL | Purpose |
-|---------|------|-----|---------|
-| KiteStacks Portal | monk + kscloud1 | www.kitestacks.com | Custom homepage / service launcher |
-| Metrics API | monk | (internal at /api) | FastAPI — live stats for portal |
+### Apps
+| Service | What it does |
+|---------|-------------|
+| **BookStack** | A private wiki — all notes and guides live here |
+| **Karakeep** | Saves bookmarks and website archives |
+| **Kavita** | Reads ebooks and manga |
+| **OSTicket** | Help-desk system — tracks tasks and tickets |
+| **Open WebUI** | Chat with AI (GPT-4, Claude, or local models) |
+| **LiteLLM** | Routes AI requests to the right model |
+| **KiteStacks Portal** | The homepage at www.kitestacks.com |

 ---

-## Authentication Flow
+## How Login Works (SSO)

-Every service uses Authentik SSO via OIDC or OAuth2:
+Every website on KiteStacks uses **Authentik** for login. You log in once, and every
+website trusts that. This is called **Single Sign-On (SSO)**.
+
+Here's what happens when you visit a site:

 ```
-Browser → https://service.kitestacks.com
-    │
-    └─► Service: "Not logged in" → redirect to Authentik
-              │
-              ▼
-         https://auth.kitestacks.com/if/flow/...
-              │
-              ├─ User logs in with username + password
-              ├─ Authentik validates credentials
-              └─ Issues authorization code → redirect back to service
-                        │
-                        ▼
-              Service exchanges code for tokens
-              Decodes JWT to get user info (email)
-              Creates local session
+1. You go to wiki.kitestacks.com (BookStack)
+2. BookStack checks: "Are you logged in?" — No.
+3. BookStack sends you to auth.kitestacks.com (Authentik)
+4. Authentik asks for your username and password
+5. You log in — Authentik issues a proof-token
+6. Authentik sends you back to BookStack with the proof
+7. BookStack reads the proof and creates your session
+8. You're in!
 ```

-**BookStack-specific note:** `OIDC_ISSUER_DISCOVER=true` and `OIDC_ISSUER` must point to the per-app URL (`/application/o/bookstack/`), not the global Authentik URL. The Authentik provider must have `issuer_mode='per_provider'`.
+This system uses a standard called **OIDC** (OpenID Connect). Every website speaks OIDC,
+so they all work the same way with Authentik as the login source.

 ---

-## Network Architecture
+## How the Network Works

-### External Access
-All public traffic enters via Cloudflare Tunnel. No ports are open on monk's router. kscloud1 (Hetzner) has no firewall rules open for HTTP/HTTPS either — all access via the same tunnel.
+### Public traffic (the websites)
+All public traffic enters through **Cloudflare Tunnel**.

-### Internal Networking
-- All Docker containers attach to the `kitestacks` bridge network
-- Containers communicate using container names as DNS (e.g., `bookstack-db`, `prometheus`)
-- Docker's embedded DNS server (`127.0.0.11`) resolves container names automatically
+- Both monk and kscloud1 run a small program called `cloudflared`
+- `cloudflared` connects outward to Cloudflare — no ports need to be open on your router
+- Cloudflare sends visitor traffic through whichever connector is healthy
+- If monk is off, kscloud1 handles everything within seconds

-### Tailscale Overlay
-Tailscale creates an encrypted mesh between monk and kscloud1:
-- Used for: Authentik PostgreSQL/Redis access, SSH to kscloud1, Prometheus scraping kscloud1 metrics
-- Not used for: public traffic (that goes through Cloudflare)
+### Private traffic (machine-to-machine)
+monk and kscloud1 talk to each other through **Tailscale** — a private encrypted network.
+
+Tailscale is used for:
+- monk reaching the database (PostgreSQL) on kscloud1 for Authentik logins
+- SSH from monk to kscloud1 for management
+- Prometheus on monk scraping metrics from kscloud1
+
+Nothing on Tailscale is visible to the public internet.

 ---

-## Storage Layout
+## Where Files Live

-### monk
+### On monk
 ```
 ~/kitestacks-live/docker/
-├── authentik/          # media, custom-templates
-├── bookstack/          # config/, db/
-├── cloudflared/        # .env (TUNNEL_TOKEN)
-├── forgejo/            # data/
-├── grafana/            # grafana_data volume
-├── karakeep/           # data/
-├── kavita/             # config/
-├── kitestacks-portal/  # static HTML + nginx
-├── osticket/           # db/, uploads/
-├── portainer/          # portainer_data volume
-└── prometheus/         # prometheus.yml, prometheus_data volume
+├── authentik/          ← login system
+├── bookstack/          ← wiki + its database
+├── cloudflared/        ← cloudflare tunnel connector
+├── forgejo/            ← git server
+├── grafana/            ← monitoring charts
+├── karakeep/           ← bookmarks
+├── kavita/             ← ebook reader
+├── kitestacks-portal/  ← homepage
+├── osticket/           ← help desk
+├── portainer/          ← container dashboard
+└── prometheus/         ← metrics collector
 ```

-### kscloud1
+### On kscloud1
 ```
 /opt/kitestacks/docker/
-├── authentik/          # postgresql data volume, redis data
-├── bookstack/          # config/, db/
-├── cloudflared/        # .env (same TUNNEL_TOKEN)
-└── ...                 # replica services
+├── authentik/          ← PostgreSQL + Redis (shared with monk's Authentik)
+├── bookstack/          ← backup wiki
+└── cloudflared/        ← backup tunnel connector
 ```

 ---

-## Resilience Model
+## What Happens When Things Break

-| Scenario | Impact | Recovery |
-|----------|--------|----------|
-| monk goes offline | All monk services unreachable; kscloud1 serves portal + wiki | Automatic (CF Tunnel failover) |
-| kscloud1 goes offline | Authentik logins may fail (DB unreachable); all other services up | Restart kscloud1 or point Authentik to local postgres |
-| Cloudflare Tunnel down | All public access lost; Tailscale still works | Check CF dashboard; restart cloudflared |
-| MariaDB crash (BookStack) | BookStack down | `docker restart bookstack-db` then `docker restart bookstack` |
-| Portainer lockout | No Docker UI | Use `portainer/helper-reset-password` |
+| What breaks | What users see | Comes back automatically? |
+|-------------|----------------|--------------------------|
+| monk offline | monk services down; portal + wiki still work on kscloud1 | Yes — Cloudflare switches to kscloud1 |
+| kscloud1 offline | Authentik logins may fail (database unreachable) | No — restart kscloud1 or switch to local DB |
+| Cloudflare tunnel down | All public websites unreachable | No — check CF dashboard, restart cloudflared |
+| BookStack database crashes | BookStack shows an error | Run: `docker restart bookstack-db && docker restart bookstack` |
+| Portainer lockout | Can't manage containers from the web | Run the password reset helper (see RUNBOOK.md) |

 ---

 ## Key Design Decisions

-**Why Cloudflare Tunnel instead of port-forwarding?**
-Port-forwarding exposes your home IP, requires a static IP, and can't failover. CF Tunnel is free, hides your IPs, and trivially supports multi-origin failover.
+**Why Cloudflare Tunnel instead of opening router ports?**
+Opening ports exposes your home IP address. Anyone can then scan it, try to break in,
+or use it to locate you. Cloudflare Tunnel creates a private outbound connection — your
+IP stays hidden. It's also free and supports automatic failover.

 **Why active-active instead of active-passive?**
-Active-passive requires detecting failure and switching. Active-active — same token, two connectors — Cloudflare handles routing automatically. Simpler and zero RPO.
+Active-passive requires detecting failure and switching over, which takes time. Active-active
+is simpler — both machines are always handling traffic, so Cloudflare just stops sending
+to the broken one automatically.

-**Why Authentik over Keycloak or Authelia?**
-Authentik is easier to self-host (Docker Compose, sensible defaults), has a good UI, and supports LDAP + OIDC + SAML. Authelia lacks SAML. Keycloak is heavier and more complex.
+**Why Authentik for login instead of passwords per app?**
+If every app has its own password, you manage dozens of credentials and each app stores
+its own user database. Authentik is one place — one login to change, one place to block
+a user. Every app just asks Authentik "is this person who they say they are?"

-**Why BookStack over Notion/Confluence?**
-Self-hosted, no external API calls, Markdown-first, OIDC SSO. Data stays in-house.
+**Why Forgejo instead of just GitHub?**
+GitHub can disappear, change pricing, or expose your private repos. Forgejo is
+self-hosted — runs on monk, uses almost no RAM, and keeps everything in-house.

-**Why Forgejo over GitLab?**
-Forgejo is lightweight (~200MB RAM vs GitLab's 4GB+). Full git server with CI runners, issues, PRs. GitLab is overkill for a homelab.
+**Why BookStack instead of Notion?**
+Notion is a third-party service that can change pricing or lose your data. BookStack is
+self-hosted — the data is on your machine, and you own it completely.
diff --git a/build-guide/README.md b/build-guide/README.md
index 749e17f..7ff6d5d 100644
--- a/build-guide/README.md
+++ b/build-guide/README.md
@@ -1,52 +1,53 @@
 # KiteStacks Build Guide

-This guide walks you through rebuilding the entire KiteStacks homelab from scratch on a blank machine. Two paths are available — choose the one that fits how you work.
+This guide walks you through rebuilding the entire KiteStacks homelab from scratch
+on a blank machine. Two paths are available — choose the one that fits how you work.

 ---

 ## Choose Your Path

 ### Path A — With AI (Claude Code)
-You provide the high-level goals, Claude Code writes the configs, debugs the errors, and explains every decision. Fastest path. Best for learning while doing.
+Tell Claude Code what you want to build. Claude writes the configs, debugs errors,
+and explains every decision as it goes. Fastest path. Great for learning while doing.

 → [Build with AI](./with-ai/README.md)

-### Path B — Manual (No AI)
-Step-by-step instructions you follow yourself. Every command, every config, every file. Best for deep understanding and exam prep (answering "how does this work" in interviews).
+### Path B — Do It Yourself
+Step-by-step instructions where you type every command yourself. Every config, every
+file, explained. Best for really understanding how things work — great for exam prep.

 → [Build Manually](./without-ai/README.md)

 ---

-## Prerequisites (Both Paths)
+## What You Need Before Starting (Both Paths)

-Before starting either path, have the following ready:
-
-| Requirement | Details |
-|-------------|---------|
-| A Linux machine | Ubuntu 24.04+ or CachyOS/Arch recommended. At least 16GB RAM, 500GB SSD |
-| A Cloudflare account | Free tier is fine. You need a domain pointed to Cloudflare |
+| What you need | Details |
+|---------------|---------|
+| A Linux computer | Ubuntu 24.04 recommended. At least 16GB RAM, 500GB SSD |
+| A Cloudflare account | Free tier. You need a domain name pointed to Cloudflare |
 | A domain name | Any registrar works — point nameservers to Cloudflare |
-| A Hetzner account (optional) | For the cloud replica (kscloud1). CAX11 or CX22 works |
-| A Tailscale account | Free tier — needed for the private overlay network |
-| Docker + Docker Compose | Install before starting either path |
+| A Hetzner account (optional) | For the cloud backup machine (kscloud1). Any small VPS works |
+| A Tailscale account | Free — creates the private network between machines |
+| Docker installed | The foundation everything runs on |

 ---

-## High-Level Build Order
+## Build Order (Both Paths Follow This)

-Regardless of which path you take, build in this order:
+Build in this order — each step depends on the one before it:

 ```
-1. Docker + networking foundation
-2. Cloudflare Tunnel (cloudflared)
-3. Authentik (SSO identity provider)
-4. Core services (Portainer, Forgejo, BookStack)
-5. Monitoring (Prometheus, Node Exporter, Grafana)
-6. Application services (Karakeep, Kavita, OSTicket)
-7. AI services (Open WebUI, LiteLLM)
-8. Portal (homepage + metrics API)
-9. kscloud1 cloud replica
+Step 1: Install Docker and set up networking
+Step 2: Set up Cloudflare Tunnel (the secret post-office connection)
+Step 3: Set up Authentik (the single login system)
+Step 4: Set up core services (Portainer, Forgejo, BookStack)
+Step 5: Set up monitoring (Prometheus, Node Exporter, Grafana)
+Step 6: Set up app services (Karakeep, Kavita, OSTicket)
+Step 7: Set up AI services (Open WebUI, LiteLLM)
+Step 8: Set up the portal (main homepage)
+Step 9: Add the cloud backup machine (kscloud1)
 ```

-Each layer depends on the one before it. Don't skip ahead.
+Don't skip ahead — if you skip Authentik, none of the SSO logins will work.
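
The failover row in the "What Happens When Things Break" table (monk offline, portal + wiki still served from kscloud1) can be sanity-checked with a small model. This is only a minimal sketch: `SERVICE_HOSTS` and `reachable_services` are hypothetical names, and the placement of each service is an assumption read off the service tables in this patch, not taken from any real configuration.

```python
# Hypothetical model of the failover table in architecture/overview.md.
# Host names come from the patch; the service placement below is an
# assumption based on the old "Host" column and the new failover table.
SERVICE_HOSTS = {
    "portal": {"monk", "kscloud1"},
    "wiki": {"monk", "kscloud1"},
    "grafana": {"monk"},
    "forgejo": {"monk"},
}


def reachable_services(healthy_hosts):
    """Return the services that run on at least one healthy host, sorted by name."""
    healthy = set(healthy_hosts)
    return sorted(name for name, hosts in SERVICE_HOSTS.items() if hosts & healthy)


if __name__ == "__main__":
    # Both connectors healthy: every service is reachable.
    print(reachable_services({"monk", "kscloud1"}))
    # monk offline: only the services replicated on kscloud1 remain.
    print(reachable_services({"kscloud1"}))
```

With only kscloud1 healthy, the model keeps just the replicated services, which is the behavior the table describes for a monk outage.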