docs: comprehensive homelab-mastery rewrite with full build guides

Complete documentation suite for KiteStacks covering all 11 services across the 2-host active-active architecture. Includes a beginner track (with AI, 8 files) and an advanced track (without AI, 7 files) with time estimates, real troubleshooting cases, and command-by-command explanations. Updates the certifications roadmap to reflect the July 7, 2026 A+ Core 2 exam goal.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-19 01:08:43 -05:00 · 1e8319ee75
commit 1e8319ee75
parent e3cfa80d98
24 changed files with 5243 additions and 298 deletions
--- a/homelab-mastery/README.md
+++ b/homelab-mastery/README.md
@@ -1,48 +1,109 @@
-# Homelab Mastery — KiteStacks Learning Guide
+# KiteStacks Homelab — Master Guide

 **Owner:** kenpat  
-**Purpose:** Everything needed to understand, explain, rebuild, and build a career around the KiteStacks homelab project.
+**Domain:** kitestacks.com  
+**Status:** Live and running  
+**Last Updated:** 2026-06-19

 ---

-## Your Current Status
+## What Is KiteStacks?

-| Milestone | Status |
-|-----------|--------|
-| CompTIA A+ Core 1 | ✅ Passed — highest score in class (22 people) |
-| CompTIA A+ Core 2 | 🔄 In progress |
-| CCNA | 📅 Next |
-| Cloud / AI certs | 📅 After CCNA |
+KiteStacks is a self-hosted homelab — a real, production web platform running on two computers
+that serves eleven public websites to the internet, 24 hours a day, even when the home machine
+is off.
+
+It is not a tutorial project. It is not a demo. It runs at a real domain, with real users,
+real uptime monitoring, and real failover. Every service is protected by single sign-on (SSO),
+meaning one account unlocks everything. All traffic goes through Cloudflare's global network —
+no ports are open on the home router, and the home IP address is never exposed.
+
+### The One-Paragraph Summary
+
+> *KiteStacks is a self-hosted homelab running eleven public-facing services behind Cloudflare
+> Tunnel with no open ports on the home router. All logins are handled by Authentik — a
+> self-hosted identity provider using OIDC/OAuth2, so one account unlocks every service.
+> A Hetzner cloud VPS (kscloud1) acts as a permanent cloud replica: if the home machine (monk)
+> goes offline, kscloud1 keeps everything running with zero downtime. Both hosts share a single
+> Postgres and Redis database over a private Tailscale VPN, so SSO logins always work regardless
+> of which server answers. Monitoring runs via Prometheus, Grafana, Uptime Kuma, and a desktop
+> Conky widget that shows live kscloud1 service health at a glance.*

 ---

-## What This Repo Is
+## The Two Computers

-You built a production homelab — a real multi-host, highly available web platform with SSO, monitoring, cloud failover, and AI services. Most people learning DevOps do tutorials with fake projects. You have a real one running at a real domain.
+| Name | What It Is | Role |
+|------|-----------|------|
+| **monk** | Home PC (ThinkPad T14s) | Development machine. Code and configs are built here, then pushed to kscloud1. |
+| **kscloud1** | Hetzner VPS in Germany | Always-live production server. Receives what monk pushes. Stays up even if monk is off. |

-This repo exists so you can:
-1. **Understand** what everything does at the conceptual level
-2. **Explain it** confidently to a hiring manager, recruiter, or LinkedIn connection
-3. **Rebuild it** from scratch on a new machine if you ever need to
-4. **Map it** to real certifications and career paths
+A third machine — the **Samurai desktop** — will eventually join as a second home connector,
+adding more redundancy when it is running.
+
+---
+
+## The Eleven Public Services
+
+| Service | URL | What It Does |
+|---------|-----|-------------|
+| **Portal** | www.kitestacks.com | The homepage — links to everything, live system stats |
+| **Authentik** | auth.kitestacks.com | SSO login provider — one account for all services |
+| **Forgejo** | gitforge.kitestacks.com | Self-hosted Git — stores all code and documentation |
+| **Open WebUI** | ai.kitestacks.com | AI chat interface (ChatGPT-style, self-hosted) |
+| **Karakeep** | links.kitestacks.com | Bookmark and read-it-later manager |
+| **Kavita** | kavita.kitestacks.com | eBook and manga library |
+| **Grafana** | grafana.kitestacks.com | Monitoring dashboards — CPU, RAM, network |
+| **Uptime Kuma** | status.kitestacks.com | Service uptime status page |
+| **BookStack** | wiki.kitestacks.com | Self-hosted wiki and documentation platform |
+| **OSTicket** | tasks.kitestacks.com | Help desk and ticket tracking system |
+| **Portainer** | portainer.kitestacks.com | Docker container management dashboard |

 ---

 ## Navigation

-| Section | What's Inside |
-|---------|--------------|
-| [certifications/](certifications/roadmap.md) | Full cert roadmap for cloud engineering, what each cert proves, study order |
-| [architecture/](architecture/overview.md) | How the entire system works, why it was built this way |
-| [concepts/](concepts/) | Deep dives on every technology: Docker, networking, OAuth2, Tailscale, etc. |
-| [build-guide/](build-guide/README.md) | Step-by-step rebuild from a blank machine, with explanations of every decision |
-| [interview-prep/](interview-prep/explain-the-project.md) | Exactly what to say to hiring managers, common questions + model answers |
-| [learning-path/](learning-path/README.md) | Structured study plan, free resources, what to learn in what order |
+| Section | What Is Inside |
+|---------|---------------|
+| [architecture/overview.md](architecture/overview.md) | How the whole system is wired together — diagrams, traffic flow |
+| [architecture/services.md](architecture/services.md) | Every service: container name, port, volume, command reference |
+| [architecture/decisions.md](architecture/decisions.md) | Why each technology was chosen over the alternatives |
+| [build-guide/README.md](build-guide/README.md) | How to build this from scratch — choose beginner (AI) or advanced |
+| [concepts/docker.md](concepts/docker.md) | What Docker actually is and how containers work |
+| [concepts/networking.md](concepts/networking.md) | DNS, ports, TLS, Tailscale, Cloudflare Tunnel, firewalls |
+| [concepts/oauth2-oidc.md](concepts/oauth2-oidc.md) | How SSO works — OAuth2, OIDC, JWTs explained simply |
+| [concepts/linux.md](concepts/linux.md) | Linux commands, file ownership, sudo, SSH tunnels |
+| [certifications/roadmap.md](certifications/roadmap.md) | Cert path from A+ to CKA — what to study and in what order |
+| [interview-prep/explain-the-project.md](interview-prep/explain-the-project.md) | What to say to hiring managers — model answers |
+| [learning-path/README.md](learning-path/README.md) | Structured study plan, free resources, daily habits |

 ---

-## The One-Paragraph Project Summary
+## Where to Start

-> *KiteStacks is a self-hosted homelab running nine public-facing services behind Cloudflare Tunnel, with full SSO via Authentik (OIDC/OAuth2), active-active cloud failover on a Hetzner VPS, private networking over Tailscale, and real-time monitoring via Prometheus and Grafana. The platform serves a public domain (kitestacks.com) and stays online even when the primary home machine is off — all running on commodity hardware with no open ports on the home router.*
+**If you want to understand what you built:**
+→ [architecture/overview.md](architecture/overview.md)

-That is what you built. Now learn to own every word of it.
+**If you want to rebuild it from scratch:**
+→ [build-guide/README.md](build-guide/README.md) — pick your track
+
+**If you have an interview coming up:**
+→ [interview-prep/explain-the-project.md](interview-prep/explain-the-project.md)
+
+**If you want to understand the tech behind it:**
+→ Pick a topic in [concepts/](concepts/)
+
+**If you want to know what certifications to study next:**
+→ [certifications/roadmap.md](certifications/roadmap.md)
+
+---
+
+## Certification Progress
+
+| Cert | Status |
+|------|--------|
+| CompTIA A+ Core 1 | ✅ Passed — highest score in class (22 people) |
+| CompTIA A+ Core 2 | 🔄 In progress — exam goal July 7, 2026 |
+| CCNA | 📅 Next after A+ Core 2 |
+| AWS Solutions Architect Associate | 📅 After CCNA |
+| CKA (Kubernetes) | 📅 After AWS certs |
--- a/homelab-mastery/architecture/decisions.md
+++ b/homelab-mastery/architecture/decisions.md
@@ -1,12 +1,16 @@
 # Architecture Decisions — The Why Behind Every Choice

-For every technology choice, there was a reason. Understanding the "why" is what separates someone who copied commands from someone who designed a system.
+For every technology choice, there was a reason. Understanding the "why" is what separates
+someone who copied commands from someone who designed a system.
+
+**Last Updated:** 2026-06-19

 ---

 ## Why Docker Instead of Running Services Directly?

-**Problem:** Running 15+ services directly on a Linux host creates dependency hell — different Python versions, conflicting library versions, services affecting each other.
+**Problem:** Running 15+ services directly on a Linux host creates dependency conflicts —
+different Python versions, conflicting library versions, services that break each other on updates.

 **Options considered:**
 - Bare metal: install each app directly on the OS
@@ -16,13 +20,15 @@ For every technology choice, there was a reason. Understanding the "why" is what
 **Decision:** Docker

 **Why:**
-- Each container has its own filesystem, dependencies, and runtime — they can't conflict
-- Starting/stopping/updating one service doesn't affect others
-- The `docker-compose.yml` file IS the documentation — it shows exactly what the service needs to run
+- Each container has its own filesystem and runtime — they can't conflict
+- Starting, stopping, or updating one service doesn't affect others
+- The `docker-compose.yml` file IS the documentation — it shows exactly what the service needs
 - Portability: move the same compose file to a new machine and it works identically
-- Isolation: if Karakeep gets compromised, it can't easily touch Forgejo's data
+- `restart: unless-stopped` means containers self-heal after a crash or host reboot

-**What you'd say to a hiring manager:** *"I containerized every service using Docker and Docker Compose so each has isolated dependencies and the entire deployment is reproducible from a single YAML file."*
+**What to say in an interview:**
+> *"I containerized every service using Docker Compose so each has isolated dependencies
+> and the entire deployment is reproducible from a single YAML file."*

 ---

@@ -30,170 +36,247 @@ For every technology choice, there was a reason. Understanding the "why" is what

 **Problem:** How do you make home services accessible from the internet?

-**Traditional approach:** Open port 80 and 443 on the home router, configure NAT, point DNS to home IP.
+**Traditional approach:** Open ports 80 and 443 on the home router, configure NAT,
+point DNS to your home IP address.

 **Problems with that:**
-- Exposes your home IP address publicly (DDoS risk, can be found, ISP tracks it)
-- Dynamic home IP means DNS breaks every time IP changes
-- Some ISPs block residential port 80/443
-- Router configuration is error-prone and varies by hardware
+- Your home IP is public (DDoS risk, can be scanned and targeted)
+- Dynamic home IP means DNS breaks every time the ISP changes it
+- Some ISPs block residential ports 80 and 443
+- Router configuration is fragile and varies by hardware

 **Decision:** Cloudflare Tunnel (cloudflared)

 **Why:**
-- cloudflared makes an OUTBOUND connection to Cloudflare — no inbound ports needed
-- Home IP never exposed
-- Works regardless of ISP restrictions
-- Cloudflare handles TLS/HTTPS — you don't manage SSL certificates
+- cloudflared makes an outbound connection to Cloudflare — no inbound ports needed at all
+- Home IP is never exposed to the public internet
+- Works behind NAT, CGNAT, and ISP port blocks, as long as outbound connections are allowed
+- Cloudflare handles TLS certificates automatically (no Let's Encrypt setup)
 - Free tier covers everything needed
-- Bonus: built-in DDoS protection
+- Built-in DDoS protection at Cloudflare's edge

-**The trade-off:** You depend on Cloudflare. If Cloudflare has an outage, your site goes down even if your hardware is fine. This is acceptable — Cloudflare's uptime is better than most home internet connections.
+**The tradeoff:** You depend on Cloudflare. If Cloudflare has an outage, your site goes down
+even if your hardware is fine. Acceptable — Cloudflare's uptime exceeds most home ISPs.

 ---

-## Why Authentik for SSO Instead of Separate Logins Per App?
+## Why Authentik for SSO?

-**Problem:** 9 services means 9 different usernames and passwords to manage. Adding a user requires going into 9 admin panels. Removing access means 9 places to deactivate.
+**Problem:** Eleven services means eleven separate usernames and passwords. Adding a user
+means eleven admin panels. Removing access means eleven places to deactivate.

 **Options:**
-- Separate logins per service (no SSO)
-- Authelia (simpler, forward-auth proxy only)
-- Authentik (full OIDC provider, more complex)
-- Keycloak (enterprise-grade, very heavy)
+- No SSO — separate logins per service
+- Authelia — simpler, forward-auth proxy only
+- Authentik — full OIDC provider, more complex to set up
+- Keycloak — enterprise-grade, very heavy on RAM

 **Decision:** Authentik

 **Why:**
 - One account controls access to everything
-- Apps that support native OIDC (Grafana, Kavita, Open WebUI, Karakeep) get real SSO — the user is authenticated inside the app
-- Can restrict which groups can access which applications (Portainer restricted to homelab-admin group)
-- Self-hosted — user data stays on your infrastructure
-- Authentik supports both native OIDC (for apps that support it) and proxy provider (for apps that don't)
+- Apps that support native OIDC (Grafana, Kavita, Karakeep, Open WebUI, Portainer, BookStack,
+  Forgejo) get real SSO — user is authenticated inside the app with a JWT, not just at a proxy
+- Access policies per application (Portainer restricted to `homelab-admin` group only)
+- Self-hosted — user data never leaves your infrastructure

-**The trade-off:** Authentik is complex to set up and has a significant memory footprint. Authelia would be simpler. But Authelia only does forward-auth proxy — it can't give an app a real JWT. Authentik does both.
+**Why not Authelia:** Authelia is primarily a forward-auth proxy. Forward auth blocks access
+to the app until the user is authenticated, but the app itself does not receive a signed
+identity token. Authentik issues a real JWT with the user's email and name, so apps can
+create user accounts automatically on first login.
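
To make "a real JWT with user email and name" concrete, here is a minimal sketch of what an app sees when it receives an ID token. The token below is built locally for illustration and the claim values (`sub`, `email`, `name`) are made-up examples, not output captured from Authentik. The sketch deliberately skips signature verification; a real application must verify the signature against the provider's published keys before trusting any claim.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode the claims segment of a JWT WITHOUT verifying the signature."""
    _header, payload_b64, _signature = token.split(".")
    # JWT segments are base64url without padding; restore padding before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def _b64url(obj: dict) -> str:
    """Encode a dict the way JWT segments are encoded (illustration helper)."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Hypothetical sample token: header.payload.signature
sample_token = ".".join([
    _b64url({"alg": "none", "typ": "JWT"}),
    _b64url({"sub": "abc123", "email": "user@example.com", "name": "Example User"}),
    "signature-placeholder",
])

claims = decode_jwt_payload(sample_token)
print(claims["email"], claims["name"])
```

The point of the sketch is the contrast with forward auth: the identity travels inside the token itself, so the app can create or look up a local account from `email` and `name` on first login.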

 ---

 ## Why a Shared Postgres Instead of Separate Authentik Databases?

-**Problem:** After setting up active-active failover, users kept getting `invalid_grant` errors when signing in through SSO.
+**Problem:** After deploying two Cloudflare Tunnel connectors, users got `invalid_grant`
+errors when signing in through SSO — roughly 50% of the time.

-**Root cause:** OAuth2 authorization codes are rows in a database. The flow is:
-1. `/authorize` → code stored in Database A (monk's Authentik)
-2. `/token` → looks for code in Database B (kscloud1's Authentik)
-3. Code not found → `invalid_grant`
+**Root cause:** OAuth2 authorization codes are short-lived rows in a database.

-Cloudflare Tunnel load-balances between monk and kscloud1 for every HTTP request. Steps 1 and 2 of the OAuth flow can hit different hosts.
+```
+Step 1: /authorize → creates code → stored in monk's Authentik DB
+Step 2: /token     → looks for code → hits kscloud1's Authentik DB → NOT FOUND
+```
+
+Cloudflare load-balances every HTTP request independently. Steps 1 and 2 of the OAuth2
+flow can hit completely different hosts. The code exists in one database but not the other.
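
The failure mode above can be reproduced with a toy simulation. This is a sketch under stated assumptions: `AuthServer` is a hypothetical stand-in for an Authentik instance (not its API), a plain dict stands in for the database table of authorization codes, and the load balancer is modeled as a uniform random choice per request.

```python
import random


class AuthServer:
    """Toy stand-in for one identity-provider instance (hypothetical)."""

    def __init__(self, code_store: dict):
        self.code_store = code_store  # acts as the table of authorization codes

    def authorize(self, user: str) -> str:
        code = f"code-{user}"
        self.code_store[code] = user
        return code

    def token(self, code: str) -> str:
        user = self.code_store.pop(code, None)  # codes are single-use
        return f"token-for-{user}" if user else "invalid_grant"


def run_logins(hosts, attempts: int, rng: random.Random) -> int:
    """Route /authorize and /token independently and count failed logins."""
    failures = 0
    for i in range(attempts):
        code = rng.choice(hosts).authorize(f"user{i}")
        if rng.choice(hosts).token(code) == "invalid_grant":
            failures += 1
    return failures


rng = random.Random(0)

# Two hosts, each with its own database: the original setup.
separate = [AuthServer({}), AuthServer({})]

# Two hosts pointing at one database: the fix.
shared_db = {}
shared = [AuthServer(shared_db), AuthServer(shared_db)]

separate_failures = run_logins(separate, 1000, rng)
shared_failures = run_logins(shared, 1000, rng)
print(separate_failures, shared_failures)
```

With separate stores, roughly half of the logins fail because the second request lands on the host that never saw the code. With a shared store, the failure count drops to zero regardless of routing.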

 **Options:**
-- Sync databases continuously (complex, slow, conflict-prone)
+- Sync both databases continuously (complex, slow, conflict-prone)
 - Use sticky sessions (Cloudflare paid feature)
-- Share one database (simple, reliable)
+- Share one database between both Authentik instances

-**Decision:** Shared Postgres on kscloud1, accessible only over Tailscale
+**Decision:** Single shared Postgres + Redis hosted on kscloud1, accessible only over Tailscale

 **Why:**
-- Both monk and kscloud1 Authentik read/write the same database — authorization codes always found
-- Tailscale binding means the database is never exposed to the public internet (security)
-- Simple: one line change in each `docker-compose.yml` to point to a different host
-- Cost: free (already paying for kscloud1)
+- Both connectors' Authentik instances read and write the same database
+- Authorization codes are always found regardless of which host handles which request
+- Database is bound to kscloud1's Tailscale IP — never reachable from the public internet
+- Simple configuration change: one environment variable pointing to the shared host

-**The trade-off:** If kscloud1 goes down and Tailscale connectivity breaks, monk's Authentik can't start. Rollback procedure: restore monk's compose to use a local Postgres.
+**The tradeoff:** If kscloud1 or the Tailscale link goes down, monk's Authentik can't reach
+the database and fails to start. Rollback: restore local Postgres in monk's compose file.

 ---

 ## Why Tailscale Instead of WireGuard or OpenVPN?

-**Problem:** Need private networking between monk (home) and kscloud1 (Hetzner cloud) without exposing the Authentik database to the public internet.
+**Problem:** Need private networking between monk (home) and kscloud1 (Hetzner cloud).
+The shared Authentik database must not be exposed to the public internet.

 **Options:**
-- WireGuard: manual key exchange, manual routing, technical to configure
-- OpenVPN: even more complex, slower
+- WireGuard: manual key exchange, manual routing, hard to configure through NAT
+- OpenVPN: complex, slower, more overhead
 - Tailscale: managed WireGuard, automatic key exchange, works behind NAT

 **Decision:** Tailscale

 **Why:**
-- Works instantly — install, authenticate, done
-- Handles NAT traversal automatically (monk is behind home router NAT)
-- Devices get stable 100.x.x.x IPs regardless of actual network location
+- Works in minutes: install, authenticate, done
+- Handles NAT traversal automatically — monk is behind home router NAT
+- Every device gets a stable `100.x.x.x` IP regardless of location
 - Free for up to 100 devices
-- Uses WireGuard under the hood — same encryption, much easier configuration
+- WireGuard underneath — same encryption, much easier operation
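
The `100.x.x.x` addresses come from the carrier-grade NAT block `100.64.0.0/10`, which Tailscale uses for tailnet addresses. A small sanity check like the one below can confirm that a database bind address really is a tailnet address and not a wildcard. This is an illustrative sketch: the function names and the sample addresses are hypothetical, and it is not part of the KiteStacks tooling.

```python
import ipaddress

# Tailscale assigns addresses from the CGNAT block 100.64.0.0/10.
TAILSCALE_RANGE = ipaddress.ip_network("100.64.0.0/10")


def is_tailscale_ip(address: str) -> bool:
    """Return True if the address falls inside the Tailscale range."""
    return ipaddress.ip_address(address) in TAILSCALE_RANGE


def check_bind_address(address: str) -> str:
    """Classify a service bind address by who could reach it."""
    ip = ipaddress.ip_address(address)
    if ip.is_unspecified:
        return "all interfaces"  # 0.0.0.0 listens everywhere, including public NICs
    if ip.is_loopback:
        return "local only"
    if ip in TAILSCALE_RANGE:
        return "tailnet only"
    return "other interface"


print(check_bind_address("100.101.102.103"))
print(check_bind_address("0.0.0.0"))
```

Note that not every `100.x.x.x` address qualifies: the block ends at `100.127.255.255`, so `100.128.0.1` is outside it.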

-**The trade-off:** Tailscale is a managed service — you trust Tailscale's coordination servers. The actual data is encrypted peer-to-peer (Tailscale can't see it), but they control device authentication. Self-hosted alternative: Headscale.
+**The tradeoff:** You trust Tailscale's coordination servers to manage device authentication.
+Actual data is encrypted peer-to-peer (Tailscale never sees it), but they control who can
+join your network. Self-hosted alternative if needed: Headscale.

 ---

-## Why Active-Active Instead of Active-Passive Failover?
+## Why Active-Active Failover Instead of Active-Passive?

-**The context:** The user travels. When away from home, monk might be inaccessible (home network down, ISP outage, power). kscloud1 should keep the site running.
+**The situation:** The user travels. When away from home, monk may be unreachable.
+kscloud1 must keep the site running.

-**Active-Passive:** kscloud1 only starts serving if monk is detected as down. Cloudflare would need health checks and failover rules.
+**Active-Passive:** kscloud1 only starts serving if Cloudflare detects monk as down.
+Requires health checks, failover rules, and a delay before traffic switches.

-**Active-Active:** Both monk and kscloud1 are always in the Cloudflare Tunnel rotation. Every request might hit either host.
+**Active-Active:** Both monk and kscloud1 are always in the Cloudflare Tunnel rotation.
+Every request may hit either host at any time.

 **Decision:** Active-Active

 **Why:**
-- Simpler: no health checks to configure, no failover logic
-- Instant: if monk goes down, kscloud1 is already handling 50% of traffic
-- Free: Cloudflare Tunnel active-active is free; health-check-based failover requires paid plans
+- No failover logic needed — both are always live
+- Instant: if monk goes down, kscloud1 is already handling traffic
+- Free: Cloudflare Tunnel active-active is included; health-check-based failover is paid

-**The trade-off:** Stateful apps (Forgejo, OpenProject, Kavita) have separate databases on each host. A user might see different data depending on which host answers. This was explicitly accepted: the point is uptime, not data consistency across hosts.
+**The tradeoff:** Stateful apps with separate databases (Kavita, Karakeep) may show
+different data depending on which host answers. Explicitly accepted — the priority is
+uptime, not data consistency across hosts. Forgejo and Authentik share databases so
+they are consistent.

 ---

-## Why nginx for the Portal Instead of a Pre-Built Dashboard?
+## Why a Custom Portal Instead of a Pre-Built Dashboard?

 **Options:**
-- gethomepage (what was used before) — nice but limited customization
+- Homepage (gethomepage) — nice but limited customization
 - Heimdall — similar limitations
-- Custom static site + nginx — full control
+- Custom static HTML/CSS/JS + nginx — full control, full ownership

-**Decision:** Custom static HTML/CSS/JS + nginx
+**Decision:** Custom static site

 **Why:**
-- Complete visual control — the cyberpunk theme, the layout, every pixel
-- Static files served by nginx are extremely fast and reliable
-- Can proxy the metrics API for real-time stats without CORS issues
-- No framework dependencies — no Node.js, no build step, just files
+- Complete visual control — the cyberpunk theme, layout, every card, every color
+- Static files + nginx are extremely fast and reliable (no Node.js, no build step)
+- nginx proxies the `/api/*` endpoints to the metrics API without CORS issues
+- No dependency on external frameworks that can change or break

-**The trade-off:** More work to build and maintain than a pre-built dashboard. But you now understand every line of it.
+**The tradeoff:** More work to build and maintain. But you understand every line of it,
+and you can explain exactly why every piece is there.

 ---

 ## Why Python + FastAPI for the Metrics API?

-**Problem:** The portal needs real-time system stats (CPU, RAM, network), weather, and Forgejo activity. These can't come from static HTML files.
+**Problem:** The portal needs live system stats (CPU, RAM, network), weather, and
+Forgejo git activity. Static HTML can't provide these.

-**Options:**
-- Shell scripts + cron → write stats to a JSON file the frontend reads
-- Node.js + Express
-- Python + FastAPI
-
-**Decision:** Python FastAPI
+**Decision:** Python FastAPI with `psutil`

 **Why:**
-- Python's `psutil` library reads system metrics with one line of code
-- FastAPI is modern, fast, and automatically documents the API
+- `psutil` reads host system metrics in one line of Python
+- FastAPI auto-generates API documentation and handles async requests well
+- Python is readable — easy to understand and modify
 - `async/await` means the API doesn't block while waiting for weather API responses
-- Python is readable — you can understand and modify the code

-**The special requirement:** The container needs `network_mode: host` and `pid: host`. Without these:
-- `network_mode: host`: the container can see the host's network interfaces and report real network throughput (not container-level)
-- `pid: host`: psutil can read the host's `/proc` filesystem, showing actual system stats instead of container stats
+**Special requirements:**
+- `network_mode: host` — container shares host network namespace so psutil sees real
+  network interfaces, not the container's virtual interface
+- `pid: host` — container can read the host's `/proc` filesystem for accurate process stats
+
+Without these flags, the API would report container-level stats instead of actual host stats.
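
Network throughput is a derived number: `psutil.net_io_counters()` returns cumulative `bytes_sent` and `bytes_recv` totals, so a rate has to be computed from two samples taken some seconds apart. The sketch below shows that arithmetic with a plain `NamedTuple` standing in for the psutil result, so it runs without psutil installed. The function name and sample values are hypothetical, and this is not necessarily how the KiteStacks metrics API computes it.

```python
from typing import NamedTuple


class NetCounters(NamedTuple):
    """Stand-in for the cumulative counters psutil reports."""
    bytes_sent: int
    bytes_recv: int


def throughput(prev: NetCounters, curr: NetCounters, seconds: float) -> dict:
    """Convert two cumulative samples into bits-per-second rates."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return {
        "upload_bps": (curr.bytes_sent - prev.bytes_sent) * 8 / seconds,
        "download_bps": (curr.bytes_recv - prev.bytes_recv) * 8 / seconds,
    }


# Two hypothetical samples taken 2 seconds apart.
earlier = NetCounters(bytes_sent=1_000, bytes_recv=5_000)
later = NetCounters(bytes_sent=2_000, bytes_recv=9_000)
rates = throughput(earlier, later, 2.0)
print(rates)
```

Which interfaces those counters cover depends on the network namespace the process runs in, which is why `network_mode: host` matters for this API.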

 ---

-## Why the Forgejo Repo for Documentation?
+## Why Forgejo Instead of GitHub or GitLab?

-You could keep documentation in Notion, Google Docs, or a wiki.
+**Problem:** Need to store all homelab code, configs, and documentation in version control.

-**Why Forgejo:**
-- It's self-hosted — you own the data
-- Git tracks every change with a timestamp and message
-- The documentation lives alongside the configs it describes
-- Hiring managers can see the commit history and read your documentation directly
+**Options:**
+- GitHub: free, reliable, but your configs and docs are on someone else's server
+- GitLab: self-hostable but heavy (4GB+ RAM for full install)
+- Forgejo: lightweight GitHub-like self-hosted Git, fork of Gitea

-**What this shows to a hiring manager:** You treat documentation like code — version-controlled, structured, maintained.
+**Decision:** Forgejo
+
+**Why:**
+- Self-hosted — configs and documentation stay on your infrastructure
+- Very lightweight — uses less than 100MB RAM
+- GitHub-like API and UI: a familiar workflow, though not every GitHub-specific tool works unchanged
+- Full UI with code review, issues, CI/CD (Forgejo Actions)
+- Shows commit history and documentation to anyone you give access to
+
+**The tradeoff:** You maintain it yourself. If Forgejo goes down, git operations fail.
+Mitigated by kscloud1 running a replica and the shared Postgres.
+
+---
+
+## Why OSTicket for the Help Desk?
+
+**What it replaced:** OpenProject (project management tool on tasks.kitestacks.com)
+
+**Why OpenProject was removed:**
+- OpenProject CE (Community Edition) requires an Enterprise Edition license for SSO
+- The SSO button simply does not appear in CE — it is a hard paywall with no workaround
+- OpenProject is also resource-heavy for what it provides
+
+**Why OSTicket:**
+- Lightweight and runs well on the existing stack
+- Email integration works (SMTP via Gmail app password — confirmed working)
+- Handles the ticket/task tracking use case without the licensing barrier
+
+---
+
+## Why BookStack for the Wiki?
+
+**Problem:** Need a place for long-form documentation that's more structured than markdown files.
+
+**Decision:** BookStack
+
+**Why:**
+- Clean, organized UI: Shelves → Books → Chapters → Pages hierarchy
+- WYSIWYG editor — easy to write docs without markdown syntax
+- Authentik OIDC SSO works natively
+- API available — docs can be pushed programmatically from scripts or CI
+
+**Key gotcha:** Cache directory must be writable by the container user.
+`chown -R abc:users /config/www/framework/cache/` is required after first install.
+
+---
+
+## Why a Shared Postgres for Forgejo?
+
+**Problem:** With two connectors in active-active, Forgejo on monk and kscloud1 had
+separate SQLite databases. Repos created on one weren't visible on the other.
+
+**Fix:** Migrated both Forgejo instances to a single shared PostgreSQL database on kscloud1
+(same shared server as Authentik's Postgres). Both connectors now serve identical Forgejo data.
+
+**How it was done:**
+- `forgejo dump --database postgres` — exported clean SQL from monk's Forgejo
+- Dropped the pgloader schema (had wrong structure), reloaded the clean SQL
+- Both compose files point to `authentik-postgres:5432` database `forgejo`, user `forgejo`
+- kscloud1's Forgejo joined the `authentik_default` Docker network to reach authentik-postgres
--- a/homelab-mastery/architecture/overview.md
+++ b/homelab-mastery/architecture/overview.md
@@ -1,138 +1,169 @@
 # KiteStacks Architecture — Full System Overview

+**Last Updated:** 2026-06-19
+
+---
+
 ## The Big Picture

 ```
-                        INTERNET
-                           │
-                    ┌──────▼──────┐
-                    │  Cloudflare  │  DNS + TLS termination
-                    │   (edge)     │  Zero Trust Tunnel
-                    └──────┬──────┘
-                           │  HTTPS (443) only
-          ┌────────────────┼────────────────┐
-          │ connector 1    │ connector 2    │ connector 3
-          │                │               │
-   ┌──────▼──────┐         │        ┌──────▼──────┐
-   │    MONK     │         │        │   KSCLOUD1  │
-   │ (home PC)   │         │        │ (Hetzner VPS│
-   │             │  Active │        │  5.78.x.x)  │
-   │ All 9       │  Active │        │             │
-   │ services    │         │        │ All 9       │
-   │             │         │        │ services    │
-   └──────┬──────┘         │        └──────┬──────┘
-          │                │               │
-          └────────────────┼───────────────┘
-                     TAILSCALE VPN
-                    (100.x.x.x range)
-                           │
-                  ┌────────▼────────┐
-                  │  SHARED DB LAYER │
-                  │  on kscloud1    │
-                  │  Postgres :5432  │
-                  │  Redis    :6379  │
-                  │  (Tailscale     │
-                  │   only, private)│
-                  └─────────────────┘
+                          INTERNET
+                             │
+                      ┌──────▼──────┐
+                      │  Cloudflare  │  DNS + TLS termination
+                      │   (edge)     │  Tunnel routing
+                      └──────┬──────┘
+                             │  HTTPS only — home IP never exposed
+              ┌──────────────┴──────────────┐
+              │ connector 1                 │ connector 2
+              │                             │
+       ┌──────▼──────┐               ┌──────▼──────┐
+       │    MONK     │               │   KSCLOUD1  │
+       │ (ThinkPad   │               │ (Hetzner VPS│
+       │  T14s, home)│               │  Germany)   │
+       │             │               │             │
+       │ Development │               │ ALWAYS LIVE │
+       │ Pushes to → │               │ Receives ←  │
+       │ kscloud1    │               │ from monk   │
+       └──────┬──────┘               └──────┬──────┘
+              │                             │
+              └─────────── TAILSCALE ───────┘
+                         (100.x.x.x range)
+                         Encrypted peer-to-peer
+                                 │
+                    ┌────────────▼────────────┐
+                    │    SHARED DATABASE LAYER │
+                    │    hosted on kscloud1    │
+                    │                         │
+                    │  PostgreSQL  :5432       │
+                    │  Redis       :6379       │
+                    │                         │
+                    │  Bound to Tailscale IP   │
+                    │  only — not public       │
+                    └─────────────────────────┘
+```
+
+**The key idea:** Cloudflare holds two persistent outbound connections — one from monk,
+one from kscloud1. Every request to kitestacks.com arrives at Cloudflare, which routes
+it to whichever connector responds. If monk goes offline, kscloud1 handles everything.
+Your home IP is never involved.
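
The routing idea can be sketched in a few lines. This is a simplified model, not Cloudflare's actual selection logic: it assumes the edge picks uniformly at random among connected connectors, and the `route` helper is hypothetical.

```python
import random


def route(connectors: dict, rng: random.Random):
    """Pick any connected connector; return None if none are connected."""
    healthy = [name for name, up in connectors.items() if up]
    return rng.choice(healthy) if healthy else None


rng = random.Random(42)
connectors = {"monk": True, "kscloud1": True}

# Both connectors up: requests are spread across both hosts.
both_up = {route(connectors, rng) for _ in range(200)}

# monk goes offline: every request lands on kscloud1, with no failover step.
connectors["monk"] = False
monk_down = {route(connectors, rng) for _ in range(200)}

print(sorted(both_up), sorted(monk_down))
```

Because kscloud1 is already in the rotation, nothing has to be detected or switched when monk disappears. The set of candidates simply shrinks.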
+
+---
+
+## How Work Flows Between the Two Hosts
+
+```
+monk (dev)  ──push──►  kscloud1 (prod, always live)
+```
+
+- **monk** is where changes are made: editing config files, testing new services, writing code
+- **kscloud1** receives those changes and is always serving live traffic
+- If monk is off, kscloud1 continues serving the last pushed state — users see no downtime
+- A third machine (Samurai desktop) is planned as a future second home connector
+
+---
+
+## The Eleven Public Services
+
+| Service | Container | URL | What It Does |
+|---------|-----------|-----|-------------|
+| Portal | `homepage` | www.kitestacks.com | Custom homepage — links, live stats, cyberpunk theme |
+| Authentik | `authentik` | auth.kitestacks.com | SSO identity provider — handles all logins |
+| Forgejo | `forgejo` | gitforge.kitestacks.com | Self-hosted Git (like GitHub) |
+| Open WebUI | `kite-openwebui` | ai.kitestacks.com | AI chat interface |
+| Karakeep | `karakeep` | links.kitestacks.com | Bookmark and read-it-later manager |
+| Kavita | `kavita` | kavita.kitestacks.com | eBook and manga reader |
+| Grafana | `grafana` | grafana.kitestacks.com | Monitoring dashboards |
+| Uptime Kuma | `uptime-kuma` | status.kitestacks.com | Public status page and uptime monitoring |
+| BookStack | `bookstack` | wiki.kitestacks.com | Self-hosted wiki / docs platform |
+| OSTicket | `osticket-app` | tasks.kitestacks.com | Help desk ticketing system |
+| Portainer | `portainer` | portainer.kitestacks.com | Docker management dashboard |
+
+## The Infrastructure Services (Internal Only)
+
+| Container | What It Does |
+|-----------|-------------|
+| `cloudflared` | Cloudflare Tunnel connector — outbound connection to Cloudflare edge |
+| `prometheus` | Metrics collector — scrapes node-exporter every 15 seconds |
+| `node-exporter` | Exposes host CPU/RAM/disk/network metrics for Prometheus |
+| `blackbox-exporter` | HTTP probe monitor — checks endpoints are returning 200 |
+| `kite-litellm` | LLM proxy — routes AI requests to OpenRouter (many free models) |
+| `kitestacks-metrics-api` | Python FastAPI — serves live stats and Forgejo activity to portal |
+| `ntfy` | Push notification server — sends alerts to phone |
+| `flux` | GitOps controller — watches Forgejo, deploys changes automatically |
+| `authentik-worker` | Background job processor for Authentik |
+| `authentik-ldap` | LDAP proxy layer for Authentik |
+
+---
+
+## How Traffic Flows — Step by Step
+
+### Someone visits www.kitestacks.com
+
+```
+1. Browser → DNS lookup "www.kitestacks.com"
+2. DNS returns Cloudflare's anycast IP (not your home IP)
+3. Browser → HTTPS request to Cloudflare edge
+4. Cloudflare reads Host header: "www.kitestacks.com"
+5. Cloudflare routes request through active tunnel connector
+   (monk or kscloud1 — whichever responds first)
+6. cloudflared resolves "homepage" via Docker DNS
+7. Request hits nginx in the homepage container
+8. nginx serves static HTML/CSS/JS from ./public/
+9. Browser JavaScript calls /api/metrics and /api/activity
+10. nginx proxies those to kitestacks-metrics-api (Python, host network)
+11. metrics-api reads CPU/RAM via psutil (sees real host, not container)
+12. metrics-api calls Forgejo API for recent commits
+13. Browser renders complete page with live stats
+```
+
+### Someone clicks "Sign In with Authentik"
+
+```
+1. App (e.g. Grafana) redirects browser to:
+   https://auth.kitestacks.com/application/o/authorize/
+   ?client_id=grafana&redirect_uri=...&response_type=code
+
+2. Cloudflare routes this to a cloudflared connector
+3. Authentik shows login page
+4. User enters username + password
+5. Authentik validates against shared Postgres (on kscloud1, over Tailscale)
+6. Authentik creates an authorization code (row in DB) and redirects:
+   https://grafana.kitestacks.com/login/generic_oauth?code=abc123
+
+7. Grafana backend POSTs to auth.kitestacks.com/application/o/token/
+   with code=abc123 and client_secret
+
+8. THIS REQUEST may hit a DIFFERENT connector than step 2 did
+   → This is why the shared DB matters: the code must exist in one DB,
+     not two separate ones that might be out of sync
+
+9. Authentik finds code=abc123 in shared Postgres, validates it
+10. Authentik returns JWT (access_token + id_token)
+11. Grafana reads user's email from JWT, creates/updates local user
+12. User is logged in — never re-enters password for other SSO apps
 ```

 ---

-## Every Service and What It Does
+## The Shared Database — Why It Exists

-### The Nine Public Services
+After deploying two connectors (monk + kscloud1), users got `invalid_grant` errors when
+signing in. The cause: each host had its own separate Authentik database. The OAuth2 flow
+makes two separate HTTP requests:

-| Service | Container Name | What It Does | Why It's Here |
-|---------|---------------|--------------|---------------|
-| **Portal** | `homepage` | The public website (kitestacks.com) — custom nginx serving static HTML/CSS/JS with a cyberpunk theme | Front door to everything. Shows system stats, recent activity, links to all services |
-| **Authentik** | `authentik` | Identity provider — handles all logins via OIDC/OAuth2 SSO | Single place to manage all user accounts and access control |
-| **Forgejo** | `forgejo` | Self-hosted Git platform (like GitHub but yours) | Store all homelab code, config, and documentation |
-| **OpenProject** | `openproject` | Project management (like Jira) | Task tracking, project planning |
-| **Open WebUI** | `kite-openwebui` | ChatGPT-like AI chat interface | Access multiple AI models through one interface |
-| **Karakeep** | `karakeep` | Bookmark and read-it-later manager | Save links, articles, and content |
-| **Kavita** | `kavita` | eBook and manga reader | Personal digital library |
-| **Grafana** | `grafana` | Monitoring dashboards | Visualize CPU, RAM, network, uptime across both hosts |
-| **Uptime Kuma** | `uptime-kuma` | Status page and uptime monitoring | Monitor that all 9 services are up and alert if they go down |
+1. `/authorize` → creates authorization code → stored in Database A
+2. `/application/o/token/` → looks up authorization code → hits Database B → **not found**

-### The Infrastructure Services (Not Public-Facing)
+Cloudflare load-balances requests, so steps 1 and 2 can hit different hosts.

-| Service | What It Does |
-|---------|-------------|
-| `cloudflared` | Cloudflare Tunnel connector — creates encrypted outbound tunnel to Cloudflare edge |
-| `prometheus` | Metrics collection — scrapes system stats from both monk and kscloud1 every 15 seconds |
-| `node-exporter` | Exposes host system metrics (CPU, RAM, disk, network) for Prometheus to scrape |
-| `kite-litellm` | LLM proxy gateway — routes AI requests to OpenRouter (multiple free models) |
-| `portainer` | Docker management UI — visual interface to manage all containers |
-| `kitestacks-metrics-api` | Python FastAPI service — serves real-time system stats, weather, and Forgejo activity to the portal |
+**Fix:** Both connectors point to a single shared Postgres+Redis hosted on kscloud1.
+It is bound only to kscloud1's Tailscale IP (`100.123.x.x`) — never the public IP.
+Only devices on the Tailscale network can connect.
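
The failure and the fix can be reproduced in a toy model. This is a minimal sketch, not Authentik's implementation: the `AuthHost` class, its method names, and the dict standing in for Postgres are all hypothetical, chosen only to show why a code issued on one host must be visible to the other.

```python
import secrets


class AuthHost:
    """Toy stand-in for one Authentik instance; `store` plays the role of its database."""

    def __init__(self, name, store):
        self.name = name
        self.store = store  # dict: authorization code -> client_id

    def authorize(self, client_id):
        # Step 1: issue a one-time code and persist it in this host's store.
        code = secrets.token_urlsafe(16)
        self.store[code] = client_id
        return code

    def token(self, code, client_id):
        # Step 2: redeem the code; pop() makes it single-use.
        if self.store.pop(code, None) != client_id:
            return {"error": "invalid_grant"}
        return {"access_token": secrets.token_urlsafe(24)}


# Before the fix: each host has its own store.
monk, kscloud1 = AuthHost("monk", {}), AuthHost("kscloud1", {})
code = monk.authorize("grafana")
before = kscloud1.token(code, "grafana")

# After the fix: both hosts point at the same store.
shared = {}
monk, kscloud1 = AuthHost("monk", shared), AuthHost("kscloud1", shared)
code = monk.authorize("grafana")
after = kscloud1.token(code, "grafana")

print(before)                    # the code was never written to kscloud1's store
print("access_token" in after)   # the shared store makes the code visible to both
```

With separate stores the exchange fails whenever the two requests land on different hosts; with one store the routing decision no longer matters.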

---
-
-## How Traffic Flows
-
-### When Someone Visits www.kitestacks.com
-
-```
-1. Browser sends HTTPS request to www.kitestacks.com
-2. DNS resolves to Cloudflare's anycast IP (not your home IP)
-3. Cloudflare terminates TLS — your home router never sees HTTPS
-4. Cloudflare routes the request through the tunnel to whichever
-   cloudflared connector responds first (monk or kscloud1)
-5. cloudflared resolves "homepage" via Docker DNS
-6. Request hits the nginx container serving the static portal
-7. Portal's JavaScript fetches /api/metrics and /api/activity
-   from the kitestacks-metrics-api container via nginx proxy
-8. Page renders with live system stats and recent git activity
-```
-
-### When Someone Clicks "Sign In with Authentik"
-
-```
-1. App (e.g., Grafana) redirects browser to auth.kitestacks.com/application/o/authorize/
-2. Authentik presents login page
-3. User enters credentials — Authentik validates against its database
-   (stored on kscloud1's Postgres, shared over Tailscale)
-4. Authentik generates an authorization code and redirects back to Grafana
-5. Grafana's backend calls auth.kitestacks.com/application/o/token/
-   to exchange the code for an access token
-6. Authentik validates the code (found in shared DB) and returns a JWT
-7. Grafana reads the user's email/name from the JWT and logs them in
-```
-
-**The critical detail:** Steps 1 and 5 can hit different tunnel connectors (monk vs kscloud1). The authorization code from step 4 must exist in whichever database step 5 hits. That's why both connectors point to the SAME Postgres on kscloud1 — otherwise step 5 returns `invalid_grant` because the code isn't found.
-
---
-
-## The Two Hosts in Detail
-
-### Monk (Primary Home Machine)
-
- **Role:** Primary production host
- **Network:** Home LAN, no open ports on router (Cloudflare Tunnel handles all inbound)
- **Services:** All 9 public services + all infrastructure services
- **Data:** Each service has its own database/storage
- **Authentik DB:** Points to kscloud1's Postgres over Tailscale (100.x.x.x)
-
-### kscloud1 (Hetzner VPS)
-
- **Role:** Permanent cloud replica — always on, even when monk is off (travel, power outage, etc.)
- **Network:** Public IP, Cloudflare Tunnel connector 3
- **Services:** Full replica of all 9 public services (separate databases except Authentik)
- **Hosts:** The shared Authentik Postgres + Redis (bound to Tailscale interface only)
- **Resources:** 3 vCPU, 3.7 GB RAM — tight but functional
-
-### What's the Same Across Both
-
- Same Cloudflare Tunnel token (different connector IDs assigned automatically)
- Same Authentik database (shared via Tailscale)
- Same Authentik secret key (required for JWT validation)
- Same kavita.db (one-time sync — users and OIDC config)
-
-### What's Different Across Both
-
- Forgejo data (separate repos — accepted inconsistency)
- OpenProject data (separate projects)
- Karakeep bookmarks (separate)
- Kavita book files (monk has them, kscloud1 doesn't — covers synced, books not)
+**Forgejo** also uses this shared Postgres (separate database on the same server).
+Both monk's and kscloud1's Forgejo read from the same data, so git repos are consistent
+regardless of which connector serves the request.

 ---

@ -141,81 +172,109 @@
 Every container joins the `kitestacks` external Docker bridge network:

 ```bash
+# Create once on each host:
 docker network create kitestacks
 ```

-This is what makes Cloudflare Tunnel work. The cloudflared container is also on this network, so when Cloudflare tells cloudflared to route `http://grafana:3000`, Docker's internal DNS resolves `grafana` to the grafana container's IP on that network.
+All service containers and the cloudflared container join this network. Docker provides
+built-in DNS: when cloudflared needs to route to Grafana, it resolves the hostname `grafana`
+to that container's IP address on the bridge network.

-Without this shared network, cloudflared can't reach the service containers by name.
+```
+cloudflared → "grafana" → Docker DNS → 172.x.x.x:3000 → grafana container
+```
+
+Without this shared network, cloudflared cannot reach services by name.
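
A quick way to see this name resolution at work is to attempt the lookup yourself. The helper below is an illustrative sketch (the function name `resolve` is an assumption, not part of the repo); it uses only the standard library and is meant to be run from inside a container attached to the `kitestacks` network.

```python
import socket


def resolve(hostname):
    """Return the IPv4 address for hostname, or None if the lookup fails."""
    try:
        return socket.gethostbyname(hostname)
    except socket.gaierror:
        return None


# Inside a container on the kitestacks network: resolve("grafana") returns a bridge IP.
# Outside that network (for example on the host): the same call typically returns None.
```

If `resolve("grafana")` returns `None` from inside the cloudflared container, the first thing to check is whether both containers are attached to the same network.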

 ---

-## Why No Open Ports on the Router
+## Why No Open Ports on the Home Router

-Traditional homelab: open port 80/443 on home router → NAT to home server → expose home IP.
+Traditional approach: open port 80 and 443 on the router → NAT to home server → home IP in DNS.

-Problems with that:
- Your home IP is public (DDoS risk, targeted attacks)
- Router configuration is fragile
- ISP can change your IP (dynamic IP)
- Some ISPs block port 80/443
+Problems:
+- Home IP is exposed publicly (a target for DDoS attacks and port scans)
+- Dynamic home IP breaks DNS when it changes
+- Some ISPs block residential port 80/443
+- Router misconfiguration = exposed server

-Cloudflare Tunnel approach:
- cloudflared container makes an OUTBOUND connection to Cloudflare
- Cloudflare holds that connection open
- Inbound requests come through Cloudflare, over that existing outbound tunnel
- Your home IP is never exposed
- Works on any network, any ISP, any firewall
+**Cloudflare Tunnel approach:**
+- cloudflared makes one outbound HTTPS connection to Cloudflare edge servers
+- Cloudflare holds that connection open permanently
+- All inbound traffic arrives over that existing outbound connection
+- The home router sees only one outbound HTTPS connection — nothing unusual
+- Home IP is never in DNS, never exposed

-This is why you can run a public website from a home PC with zero router configuration.
+**Result:** A public website running on a home PC with zero router configuration and
+no exposed home IP address.

 ---

 ## Tailscale — The Private Backbone

-Tailscale creates a private overlay network (VPN mesh) across all your devices:
+Tailscale creates an encrypted overlay network across all your devices.
+Every device gets a stable `100.x.x.x` IP regardless of physical location.

 ```
-monk (100.x.x.x) ←—— encrypted ——→ kscloud1 (100.x.x.x)
-monk (100.x.x.x) ←—— encrypted ——→ pixel-6 (100.x.x.x)
+monk       100.85.x.x  ←── WireGuard ───► 100.123.x.x  kscloud1
+samurai    100.74.x.x  ←── WireGuard ───► 100.123.x.x  kscloud1
+phone      100.x.x.x   ←── WireGuard ───► 100.123.x.x  kscloud1
 ```

-Used in this project for:
-1. **Shared Authentik DB:** kscloud1's Postgres binds to its Tailscale IP, not its public IP. Only devices on the tailnet can connect. Monk points to that address.
-2. **Forgejo activity feed:** On kscloud1, the metrics API fetches recent commits from monk's Forgejo via monk's Tailscale IP — so both portal instances show the same activity feed.
-3. **SSH/Admin access:** You can SSH into any device on the tailnet from anywhere.
+Used in this homelab for:
+
+1. **Shared Authentik DB:** kscloud1 Postgres and Redis are bound to `100.123.x.x` only.
+   Monk's Authentik connects to that address. Traffic is encrypted peer-to-peer.
+
+2. **SSH admin access:** SSH to kscloud1 from anywhere using its Tailscale IP.
+   This works even behind a hotel firewall or on mobile data, because Tailscale
+   relays the connection when a direct path is blocked.
+
+3. **Uptime monitoring:** The Conky desktop widget on monk reads Uptime Kuma status
+   from kscloud1 directly via Tailscale (not through Cloudflare), so it shows the
+   true kscloud1-side status.
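
Tailscale assigns addresses from the CGNAT block `100.64.0.0/10`, which is why every device above starts with `100.`. A small guard like the following could be used to confirm that a database bind address is a tailnet address before it goes into a config file. It is a sketch only; the function names are hypothetical and nothing here is part of the KiteStacks scripts.

```python
import ipaddress

# Tailscale assigns addresses from the CGNAT block 100.64.0.0/10.
TAILNET = ipaddress.ip_network("100.64.0.0/10")


def is_tailnet_address(addr):
    """True if addr falls inside the range Tailscale assigns from."""
    return ipaddress.ip_address(addr) in TAILNET


def safe_bind(addr):
    """Reject wildcard or public bind addresses for the shared database."""
    if not is_tailnet_address(addr):
        raise ValueError(f"{addr} is not a tailnet address; refusing to bind")
    return addr
```

Note that `0.0.0.0` and a public address such as `5.78.x.x` both fail the check, which is the point: the shared Postgres and Redis should never listen on either.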

 ---

 ## The Monitoring Stack

 ```
-node-exporter (monk)  →  prometheus (monk)  →  grafana (monk)
-node-exporter (kscloud1) ↗       (scrapes 5.78.x.x:9100)
+                  ┌──────────────┐
+monk's            │  node-exporter│ ← exposes CPU/RAM/disk/network
+node-exporter     │  port 9100    │
+                  └──────┬───────┘
+                         │ scrape every 15s
+                  ┌──────▼───────┐
+kscloud1's  ───► │  prometheus   │ (also scrapes kscloud1:9100 via public IP)
+metrics           └──────┬───────┘
+                         │
+                  ┌──────▼───────┐
+                  │   grafana    │ ← visualize both hosts, switch via instance picker
+                  └──────────────┘
+
+Uptime Kuma → HTTP checks every 60s → all 11 public service URLs
+Conky widget → reads Uptime Kuma API on kscloud1 → shows live dot per service
 ```

-Prometheus scrapes metrics every 15 seconds from:
- `node-exporter:9100` — monk's own node-exporter (via Docker DNS)
- `5.78.x.x:9100` — kscloud1's node-exporter (via public IP, port exposed 0.0.0.0)
-
-Grafana visualizes both, letting you switch between hosts in the instance picker.
-
 ---

 ## The Portal Architecture

-The portal is NOT gethomepage or any pre-built dashboard. It's a custom-built static site:
+The portal is a custom static site — not a pre-built dashboard:

 ```
-nginx (container: "homepage")
-  ├── /         → serves static HTML/CSS/JS from ./public/
-  └── /api/*    → proxy_pass to kitestacks-metrics-api:8000 (host)
+nginx container ("homepage")
+  ├── /           → static HTML/CSS/JS (cyberpunk theme, service cards)
+  └── /api/*      → proxy_pass → kitestacks-metrics-api on host

-kitestacks-metrics-api (network_mode: host, pid: host)
-  ├── GET /api/metrics   → psutil reads HOST's CPU/RAM/disk/network
-  ├── GET /api/weather   → wttr.in API → current weather by IP geolocation
-  ├── GET /api/activity  → Forgejo API → recent commits
+kitestacks-metrics-api (Python FastAPI, network_mode: host, pid: host)
+  ├── GET /api/metrics   → psutil reads HOST CPU/RAM/disk/network
+  ├── GET /api/weather   → wttr.in API → current conditions
+  ├── GET /api/activity  → Forgejo API → recent commits across all repos
  └── GET /api/health    → {"ok": true}
 ```

-The metrics API runs with `network_mode: host` and `pid: host` so it reads the HOST machine's process table and `/proc` filesystem — not the container's. Without this, it would report container stats, not laptop stats.
+`network_mode: host`: the container shares the host's network namespace, so psutil's
+network counters cover the host's real interfaces. Without it, they would cover only
+the container's own virtual interface.
+
+`pid: host`: the container shares the host's PID namespace, so `/proc` lists every
+process on the host. Without it, process counts would cover only the container.
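
The effect of both settings can be checked without psutil by reading `/proc` directly. The sketch below is illustrative only (the function names are assumptions, and it is not the metrics API's code): run it once in a default container and once with `network_mode: host` and `pid: host` to compare.

```python
import os


def count_processes(proc_root="/proc"):
    """Count numeric entries under /proc; each one is a visible process."""
    return sum(1 for entry in os.listdir(proc_root) if entry.isdigit())


def list_interfaces(net_dev_text):
    """Extract interface names from the text of /proc/net/dev."""
    names = []
    for line in net_dev_text.splitlines()[2:]:  # first two lines are headers
        name, _, _ = line.partition(":")
        if name.strip():
            names.append(name.strip())
    return names
```

In a default container you would expect a handful of processes and interfaces such as `lo` and `eth0`; with both host settings you should see the host's full process count and its real interfaces.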
--- a/homelab-mastery/architecture/services.md
+++ b/homelab-mastery/architecture/services.md
@ -0,0 +1,388 @@
+# KiteStacks — Complete Service Reference
+
+Every service that runs in KiteStacks: what it does, where it lives, how to manage it,
+and what commands to use. This is the day-to-day operations reference.
+
+**Last Updated:** 2026-06-19
+
+---
+
+## Quick Reference — All Containers on monk
+
+```bash
+docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"
+```
+
+| Container | Purpose | Public URL |
+|-----------|---------|-----------|
+| `homepage` | Portal / main website | www.kitestacks.com |
+| `authentik` | SSO identity provider | auth.kitestacks.com |
+| `authentik-worker` | Authentik background jobs | — |
+| `authentik-ldap` | LDAP interface for Authentik | — |
+| `authentik-ldap-proxy` | LDAP proxy | — |
+| `forgejo` | Git platform | gitforge.kitestacks.com |
+| `kite-openwebui` | AI chat | ai.kitestacks.com |
+| `kite-litellm` | LLM proxy gateway | — |
+| `karakeep` | Bookmarks | links.kitestacks.com |
+| `karakeep-chrome` | Headless browser for Karakeep | — |
+| `karakeep-meilisearch` | Search engine for Karakeep | — |
+| `kavita` | eBook reader | kavita.kitestacks.com |
+| `grafana` | Monitoring dashboards | grafana.kitestacks.com |
+| `uptime-kuma` | Status page | status.kitestacks.com |
+| `bookstack` | Wiki / docs | wiki.kitestacks.com |
+| `bookstack-db` | MariaDB for BookStack | — |
+| `osticket-app` | Help desk | tasks.kitestacks.com |
+| `osticket-db` | MySQL for OSTicket | — |
+| `portainer` | Docker management UI | portainer.kitestacks.com |
+| `cloudflared` | Tunnel connector | — |
+| `prometheus` | Metrics collector | — |
+| `node-exporter` | Host metrics exporter | — |
+| `blackbox-exporter` | HTTP probe monitor | — |
+| `kitestacks-metrics-api` | System stats API for portal | — |
+| `ntfy` | Push notifications | — |
+| `flux` | GitOps controller | — |
+
+---
+
+## Service Deep Dives
+
+### homepage — Portal
+
+**What it is:** Custom-built static website served by nginx.
+**Directory:** `~/kitestacks-live/docker/kitestacks-portal/`
+**Public files:** `./public/index.html` — edit this to change what the portal shows
+**Config:** `./nginx.conf` — nginx routing rules
+
+```bash
+# Restart portal
+cd ~/kitestacks-live/docker/kitestacks-portal
+docker compose restart homepage
+
+# Edit the portal
+nano public/index.html
+
+# View nginx logs
+docker logs homepage -f
+```
+
+**Ports:** 3005:3000 (host:container). Cloudflare Tunnel uses container port 3000 directly.
+
+---
+
+### authentik — SSO Identity Provider
+
+**What it is:** Self-hosted OAuth2/OIDC identity provider. Handles all logins for every service.
+**Directory:** `~/kitestacks-live/docker/authentik/`
+**Database:** Shared PostgreSQL on kscloud1 at `100.123.x.x:5432`, database `authentik`
+**Redis:** Shared Redis on kscloud1 at `100.123.x.x:6379`
+
+```bash
+cd ~/kitestacks-live/docker/authentik
+
+# Start all Authentik services
+docker compose up -d
+
+# Check health (wait for "healthy" before testing SSO)
+docker inspect --format '{{.State.Health.Status}}' authentik
+docker inspect --format '{{.State.Health.Status}}' authentik-worker
+
+# Open an interactive Django shell (admin tasks, user management)
+docker exec -it authentik ak shell
+
+# View logs
+docker logs authentik -f
+docker logs authentik-worker -f
+```
+
+**SSO apps configured in Authentik:**
+- Grafana, Forgejo, Kavita, Karakeep, Open WebUI, Portainer, BookStack
+
+**Key Authentik admin panel:** https://auth.kitestacks.com/if/admin/
+
+**Important:** OAuth2 code TTL is set to 10 minutes (increased from default 1 minute)
+to allow monk's Authentik to finish starting up after a reconnect before codes expire.
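
To make the effect of the longer TTL concrete, here is a small sketch of an expiry check. It is not Authentik's code; the function name and the three-minute delay are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def code_is_valid(issued_at, now, ttl=timedelta(minutes=10)):
    """True while the authorization code is younger than ttl."""
    return now - issued_at < ttl


issued = datetime(2026, 6, 19, 12, 0, tzinfo=timezone.utc)
slow_exchange = issued + timedelta(minutes=3)  # token request arrives 3 minutes later

print(code_is_valid(issued, slow_exchange, ttl=timedelta(minutes=1)))   # False
print(code_is_valid(issued, slow_exchange, ttl=timedelta(minutes=10)))  # True
```

A token exchange delayed by a slow startup fails under the one-minute default but succeeds once the TTL is raised.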
+
+---
+
+### forgejo — Git Platform
+
+**What it is:** Self-hosted Git. Stores all homelab code, configs, and documentation.
+**Directory:** `~/kitestacks-live/docker/forgejo/`
+**Database:** Shared PostgreSQL on kscloud1, database `forgejo`, user `forgejo`
+**Data volume:** `./data/` (repositories, avatars, attachments)
+
+```bash
+cd ~/kitestacks-live/docker/forgejo
+
+# Start
+docker compose up -d
+
+# Admin commands
+docker exec -u git forgejo forgejo admin user list
+docker exec -u git forgejo forgejo admin user create --username newuser --password pass --email e@mail.com --admin
+
+# View logs
+docker logs forgejo -f
+
+# API token for automation
+# Token: stored in .env — used by kitestacks-metrics-api for activity feed
+```
+
+**API base URL:** `https://gitforge.kitestacks.com/api/v1/`
+**Web UI (via Cloudflare):** gitforge.kitestacks.com
+
+---
+
+### kite-openwebui — AI Chat
+
+**What it is:** Self-hosted ChatGPT-like interface connected to LiteLLM proxy.
+**Directory:** `~/kitestacks-live/docker/kite-openwebui/`
+**Backend:** `kite-litellm` — routes to OpenRouter (many models, free tier available)
+
+```bash
+cd ~/kitestacks-live/docker/kite-openwebui
+docker compose up -d
+docker logs kite-openwebui -f
+docker logs kite-litellm -f
+```
+
+**SSO:** Authentik OIDC — "Sign in with Authentik" on login page.
+
+---
+
+### karakeep — Bookmarks
+
+**What it is:** Bookmark manager and read-it-later tool. Saves full page content.
+**Directory:** `~/kitestacks-live/docker/karakeep/`
+**Depends on:** `karakeep-chrome` (headless Chromium for page capture) + `karakeep-meilisearch` (search)
+
+```bash
+cd ~/kitestacks-live/docker/karakeep
+docker compose up -d
+
+# SSO callback URL: https://links.kitestacks.com/api/auth/callback/custom
+# (NextAuth.js uses "custom" as the provider ID, not "authentik")
+```
+
+**SSO:** Authentik OAuth2 — redirect URI must be `/api/auth/callback/custom` (not `/callback/authentik`)
+
+---
+
+### kavita — eBook Reader
+
+**What it is:** eBook, manga, and comic library.
+**Directory:** `~/kitestacks-live/docker/kavita/`
+**Book files:** `./library/books/` — add books here, then scan library in Kavita UI
+**Config/DB:** `./config/kavita.db` (SQLite)
+
+```bash
+cd ~/kitestacks-live/docker/kavita
+docker compose up -d
+docker logs kavita -f
+
+# If you change OIDC settings, use the Kavita UI at kavita.kitestacks.com/settings
+# Do NOT edit kavita.db directly for OIDC config — Kavita overwrites it on restart
+# Use SSH port-forward to access kscloud1's Kavita directly if needed:
+# ssh -L 5099:localhost:5000 kenpat@kscloud1-tailscale-ip
+# Then visit http://localhost:5099
+```
+
+**SSO:** Authentik OIDC — Authority URL must end with trailing slash:
+`https://auth.kitestacks.com/application/o/kavita/`
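
One plausible reason the trailing slash matters is standard relative-URL resolution: the OIDC discovery document lives under the authority path, and without the slash the last path segment is dropped. The sketch below shows the general rule with Python's standard library; whether Kavita builds its discovery URL exactly this way is an assumption.

```python
from urllib.parse import urljoin

DISCOVERY = ".well-known/openid-configuration"

with_slash = urljoin("https://auth.kitestacks.com/application/o/kavita/", DISCOVERY)
without_slash = urljoin("https://auth.kitestacks.com/application/o/kavita", DISCOVERY)

print(with_slash)     # .../application/o/kavita/.well-known/openid-configuration
print(without_slash)  # .../application/o/.well-known/openid-configuration
```

The second URL no longer contains the `kavita` application slug, so a lookup against it would miss the provider's configuration.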
+
+---
+
+### grafana — Monitoring Dashboards
+
+**What it is:** Visualizes metrics collected by Prometheus.
+**Directory:** `~/kitestacks-live/docker/grafana/`
+**Provisioning:** `./provisioning/` — auto-loads datasource (Prometheus) and dashboard (Node Exporter Full)
+**Data:** Named Docker volume `grafana-data`
+
+```bash
+cd ~/kitestacks-live/docker/grafana
+docker compose up -d
+docker logs grafana -f
+```
+
+**Dashboards auto-loaded:**
+- Node Exporter Full (id 1860) — CPU, RAM, disk, network for both monk and kscloud1
+- Switch between hosts using the "instance" variable at top of dashboard
+
+**SSO:** Authentik OAuth2. Local admin login also works.
+
+---
+
+### uptime-kuma — Status Page
+
+**What it is:** Uptime monitoring with a public status page.
+**Directory:** `~/kitestacks-live/docker/uptime-kuma/`
+**Database:** Named Docker volume `uptime-kuma` (SQLite kuma.db)
+**Status page slug:** `homelab` → https://status.kitestacks.com/status/homelab
+
+```bash
+cd ~/kitestacks-live/docker/uptime-kuma
+docker compose up -d
+docker logs uptime-kuma -f
+
+# To push kuma.db to kscloud1 after changes (monk → kscloud1):
+# See scripts/sync-kuma.sh (or follow the sqlite backup pattern)
+```
+
+**Monitors configured:** All 11 public services + kscloud1 ping + Monk ping + Samurai ping.
+
+**Conky widget:** Reads kscloud1's Uptime Kuma directly via Tailscale IP at
+`http://100.123.x.x:3001/api/status-page/homelab`. This means the widget shows
+kscloud1's health, not monk's — which is what matters for production status.
+
+---
+
+### bookstack — Wiki
+
+**What it is:** Self-hosted documentation wiki with a clean UI.
+**Directory:** `~/kitestacks-live/docker/bookstack/`
+**Database:** MariaDB container `bookstack-db`
+**Config:** `.env` file (APP_URL, DB settings, OIDC config)
+
+```bash
+cd ~/kitestacks-live/docker/bookstack
+docker compose up -d
+docker logs bookstack -f
+
+# BookStack API (used to push docs from Forgejo):
+# Token created via: DB injection + bcrypt hash for API key
+# Token ID/secret stored in .env
+```
+
+**SSO:** Authentik OIDC. Key config:
+- `OIDC_ISSUER=https://auth.kitestacks.com/application/o/bookstack/`
+- `OIDC_ISSUER_DISCOVER=true`
+- Cache dir must be writable: `chown -R abc:users /config/www/framework/cache/`
+
+---
+
+### osticket-app — Help Desk
+
+**What it is:** OSTicket help desk and ticketing system.
+**Directory:** `~/kitestacks-live/docker/osticket/`
+**Database:** MySQL container `osticket-db`
+**URL:** tasks.kitestacks.com (took over from OpenProject)
+
+```bash
+cd ~/kitestacks-live/docker/osticket
+docker compose up -d
+docker logs osticket-app -f
+```
+
+**SMTP:** Configured for smtp.gmail.com:587 using kitestacks.helpdesk@gmail.com.
+App password stored in `ost_email` table (smtp_auth_creds=1 for all email entries).
+**Confirmed working:** Email delivery verified 2026-06-19.
+
+---
+
+### portainer — Docker Management
+
+**What it is:** Web UI for managing Docker containers on both monk and kscloud1.
+**Directory:** `~/kitestacks-live/docker/portainer/`
+**URL:** portainer.kitestacks.com
+
+```bash
+cd ~/kitestacks-live/docker/portainer
+docker compose up -d
+```
+
+**SSO:** Authentik OAuth2 (AuthenticationMethod=3). User kenpat7177@gmail.com pre-created as admin.
+**Security:** Authentik PolicyBinding restricts Portainer app to `homelab-admin` group only.
+
+---
+
+### cloudflared — Tunnel Connector
+
+**What it is:** Creates the outbound tunnel to Cloudflare. This is what makes all
+public services reachable without opening ports on the router.
+**Directory:** `~/kitestacks-live/docker/cloudflared/`
+**Token:** Read from `.env` file as `TUNNEL_TOKEN` (never hardcoded in docker-compose.yml)
+
+```bash
+cd ~/kitestacks-live/docker/cloudflared
+docker compose up -d
+docker logs cloudflared -f
+
+# To rotate the token (runs on both monk and kscloud1):
+# ~/kitestacks-homelab/scripts/rollout-cloudflared-token.sh '<new-token>'
+```
+
+**Tunnel ID:** 5e60ea8e-a543-49b6-bab5-325f39441e00
+**Account:** Cloudflare dashboard → Zero Trust → Networks → Tunnels
+
+---
+
+### prometheus + node-exporter — Metrics
+
+**What it is:** Prometheus collects time-series metrics. node-exporter exposes host stats.
+**Directory:** `~/kitestacks-live/docker/prometheus/`
+**Config:** `./prometheus.yml` — defines scrape targets
+
+```bash
+cd ~/kitestacks-live/docker/prometheus
+docker compose up -d
+docker logs prometheus -f
+
+# Scrape targets configured:
+# - node-exporter:9100 (monk, via Docker DNS)
+# - 5.78.x.x:9100     (kscloud1, via public IP — node-exporter exposed on 0.0.0.0)
+```
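
What Prometheus pulls from each target on every scrape is plain text, one sample per line. The parser below is a rough sketch for reading that output by hand when debugging a target. It assumes samples carry no trailing timestamp (node-exporter does not emit one), and the function name is hypothetical.

```python
def parse_samples(text):
    """Parse Prometheus text-format lines into (metric_with_labels, value) pairs."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE comments
            continue
        # The value is the last space-separated field (assumes no timestamp).
        name, _, value = line.rpartition(" ")
        samples.append((name, float(value)))
    return samples


sample = """\
# HELP node_memory_MemAvailable_bytes Memory information field MemAvailable_bytes.
# TYPE node_memory_MemAvailable_bytes gauge
node_memory_MemAvailable_bytes 2.1e+09
node_cpu_seconds_total{cpu="0",mode="idle"} 12345.67
"""

for name, value in parse_samples(sample):
    print(name, value)
```

Feeding it the body returned by a target's `/metrics` endpoint gives a quick view of what Prometheus would record from that host.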
+
+---
+
+## Common Operations
+
+### Restart a single service
+```bash
+cd ~/kitestacks-live/docker/<service-name>
+docker compose restart <container-name>
+```
+
+### View live logs
+```bash
+docker logs <container-name> -f
+# -f = follow (live tail). Ctrl+C to stop.
+```
+
+### Update a service to latest image
+```bash
+cd ~/kitestacks-live/docker/<service-name>
+docker compose pull
+docker compose up -d
+```
+
+### Check all container health at once
+```bash
+docker ps --format "table {{.Names}}\t{{.Status}}"
+```
+
+### Enter a container's shell
+```bash
+docker exec -it <container-name> bash
+# or sh if bash isn't available:
+docker exec -it <container-name> sh
+```
+
+### Check disk and memory usage
+```bash
+docker system df        # Docker disk usage
+free -h                 # RAM usage
+df -h                   # Disk usage
+```
+
+### Push a kuma.db update to kscloud1
+```bash
+# 1. Make changes to monk's Uptime Kuma (add monitors, etc.)
+# 2. Backup monk's db:
+docker run --rm -v uptime-kuma:/src:ro -v /tmp:/out python:3-alpine \
+  python3 -c "import sqlite3; s=sqlite3.connect('/src/kuma.db'); b=sqlite3.connect('/out/kuma.db.push'); s.backup(b); b.close(); s.close()"
+# 3. Transfer and restore on kscloud1:
+gzip -c /tmp/kuma.db.push | ssh -i ~/.ssh/id_ed25519_kscloud1 kenpat@100.123.x.x \
+  "gunzip > /home/kenpat/kuma.db.push"
+# Then on kscloud1: stop uptime-kuma, restore via same sqlite.backup() pattern, restart
+```
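
The one-liner in step 2 is easier to follow when unpacked into a function. This sketch uses the same `sqlite3` online backup call; the function name and the idea of reusing it for the restore with the paths swapped are assumptions, not an existing script in the repo.

```python
import sqlite3


def backup_sqlite(src_path, dst_path):
    """Copy a SQLite database to dst_path using the online backup API."""
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        src.backup(dst)  # takes a consistent snapshot, even if src is in use
    finally:
        dst.close()
        src.close()
```

On kscloud1 the restore would be the same call in the other direction, for example `backup_sqlite("/out/kuma.db.push", "/src/kuma.db")` with the container stopped, where both paths are placeholders for the mounted locations.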
--- a/homelab-mastery/build-guide/README.md
+++ b/homelab-mastery/build-guide/README.md
@ -0,0 +1,85 @@
+# KiteStacks Build Guide — Choose Your Path
+
+This guide teaches you how to build the entire KiteStacks homelab from a blank machine.
+There are two tracks. Pick the one that fits where you are right now.
+
+---
+
+## Track A — With AI (Beginner)
+
+**Who this is for:** Someone with zero or very little tech experience.
+You do not need to know Linux, Docker, or networking. You just need to be able to
+follow instructions and copy commands.
+
+**How it works:** You use an AI assistant (Claude, ChatGPT, or similar) as your guide
+throughout the build. The AI explains what each command does in plain language before
+you run it. You never copy something without understanding what it does — the AI makes
+sure of that.
+
+**Time to complete:** 2–4 weeks of evenings and weekends (2–3 hours per session).
+
+**What you will have at the end:** A fully working homelab identical to KiteStacks.
+
+→ **[Start the AI-Assisted Build](with-ai/01-what-you-need.md)**
+
+---
+
+## Track B — Without AI (Advanced)
+
+**Who this is for:** Someone who wants to understand everything deeply and build skills
+along the way — not just copy commands but know what every line does and why.
+
+**How it works:** You build the homelab from scratch, learning Bash scripting, Python,
+Docker internals, Linux administration, and networking as you go. Every command is
+explained in full. No shortcuts.
+
+**Time to complete:** 3–6 months of consistent part-time study and building
+(evenings and weekends). Full-time: 6–10 weeks.
+
+**What you will learn:** Linux, Bash scripting, Python, Docker, networking (DNS, ports,
+TLS, firewalls), OAuth2/OIDC, infrastructure design, and troubleshooting methodology.
+
+→ **[Start the Advanced Build](without-ai/01-linux-foundations.md)**
+
+---
+
+## What Both Tracks Build
+
+By the end of either track you will have:
+
+- ✅ A public domain (e.g. kitestacks.com) serving real websites
+- ✅ Eleven self-hosted services running in Docker
+- ✅ Single sign-on — one account for everything
+- ✅ A cloud VPS as a permanent backup — site stays up when your home PC is off
+- ✅ Private networking between home and cloud via Tailscale VPN
+- ✅ Real-time monitoring with Grafana and Uptime Kuma
+- ✅ A desktop widget showing live service status
+
+---
+
+## Hardware and Accounts Needed (Both Tracks)
+
+### Hardware
+- Any PC or laptop running Linux (or you can install Linux on it) — minimum 8GB RAM, 100GB disk
+- A domain name — buy from Cloudflare Registrar, Namecheap, or similar (~$10–15/year)
+- A credit card for the cloud VPS (~€4–5/month on Hetzner — less than a coffee)
+
+### Accounts to Create
+- **Cloudflare** — free account at cloudflare.com
+- **Hetzner** — cloud VPS provider at hetzner.com (or any VPS: DigitalOcean, Vultr, Linode)
+- **Tailscale** — free at tailscale.com (up to 100 devices)
+- **OpenRouter** — free AI model access at openrouter.ai (for the AI chat service)
+
+### What You Are Building On
+```
+Home PC (monk)
+  └── Ubuntu or similar Linux OS
+  └── Docker + Docker Compose
+  └── ~20 containers running
+
+Cloud VPS (kscloud1)
+  └── Ubuntu Linux
+  └── Docker + Docker Compose
+  └── Same ~20 containers running (replica)
+  └── Shared PostgreSQL + Redis
+```
--- a/homelab-mastery/build-guide/with-ai/01-what-you-need.md
+++ b/homelab-mastery/build-guide/with-ai/01-what-you-need.md
@ -0,0 +1,182 @@
+# Step 1 — What You Need Before You Start
+
+**Track:** With AI (Beginner)  
+**Time for this step:** 1–2 hours
+
+Welcome. You are about to build a real, working homelab that serves websites to the
+actual internet. It sounds complicated, but with an AI assistant helping you every step
+of the way, you can absolutely do this even if you have never used Linux before.
+
+---
+
+## How to Use This Guide
+
+Throughout this build, whenever you see a command like this:
+
+```bash
+docker ps
+```
+
+That is something you type into a terminal (a black window where you type commands).
+Before you type any command, **ask your AI assistant what it does**. Say:
+
+> "What does this command do: `docker ps`"
+
+The AI will explain it in plain language. Never run a command you do not understand.
+That is the rule throughout this entire build.
+
+---
+
+## What You Need
+
+### 1. A Computer to Run Everything On
+
+You need a PC or laptop that will be your home server. This will be called **monk**
+throughout this guide (that is just a nickname — you can call it whatever you want).
+
+Minimum specs:
+- **RAM:** 8 GB (16 GB recommended, because you will run about 20 containers at once)
+- **Storage:** 100 GB free space
+- **Operating system:** Linux (Ubuntu 22.04 or 24.04 recommended)
+
+If your computer currently runs Windows, you have two options:
+- Install Ubuntu alongside Windows (dual boot)
+- Replace Windows with Ubuntu entirely (easier and recommended, but it erases everything on the disk, so back up your files first)
+
+**Ask your AI:** "How do I install Ubuntu 24.04 on my computer?"
+
+---
+
+### 2. A Domain Name
+
+A domain name is your address on the internet — for example, `kitestacks.com`.
+You need to buy one. It costs about $10–15 per year.
+
+**Where to buy:** Cloudflare Registrar (registrar.cloudflare.com) is recommended
+because you will use Cloudflare for everything else and it keeps things in one place.
+
+**Tips for picking a domain:**
+- Keep it short and memorable
+- `.com` is most professional
+- Avoid hyphens and numbers
+
+**Ask your AI:** "How do I buy a domain name on Cloudflare Registrar?"
+
+---
+
+### 3. A Cloudflare Account
+
+Cloudflare is the service that sits between the internet and your home computer.
+It hides your home IP address, handles all the security, and routes traffic to
+your services. Best part: everything you need is on their free plan.
+
+Go to cloudflare.com and create a free account.
+
+If you bought your domain from Cloudflare Registrar, your account is already set up.
+If you bought it elsewhere, you will need to point its nameservers at Cloudflare. Step 2
+walks through this, and your AI can help.
+
+---
+
+### 4. A Cloud VPS (Virtual Private Server)
+
+A VPS is a small virtual computer that you rent in a data center. It runs 24 hours a day
+even when your home computer is off. This is what keeps your websites online when
+you are travelling or when your home internet goes down.
+
+**Recommended provider:** Hetzner (hetzner.com) — excellent value, based in Germany.
+**Plan to choose:** CX22 — 2 vCPU, 4 GB RAM, 40 GB disk — approximately €4/month.
+
+Create a Hetzner account, then ask your AI: "How do I create a new CX22 VPS on Hetzner
+with Ubuntu 24.04?"
+
+This second computer will be called **kscloud1** throughout this guide.
+
+---
+
+### 5. A Tailscale Account
+
+Tailscale is a free service that creates a private, encrypted connection between your
+home computer and your cloud VPS. Think of it as a private tunnel that only your
+devices can use.
+
+Go to tailscale.com and create a free account.
+
+---
+
+### 6. An OpenRouter Account (for AI services)
+
+OpenRouter gives you access to dozens of AI models for free (with rate limits) or
+for very low cost. Your KiteStacks AI service will use this.
+
+Go to openrouter.ai and create a free account.
+
+---
+
+## Setting Up Your Home Computer (monk)
+
+Once Ubuntu is installed on your home computer, open a terminal. On Ubuntu,
+press `Ctrl + Alt + T` to open one.
+
+You will see something like:
+```
+kenpatmonk@monk:~$
+```
+
+That `$` means you are ready to type commands.
+
+**First, update your system. Ask your AI what this does, then run it:**
+
+```bash
+sudo apt update && sudo apt upgrade -y
+```
+
+**Then install some tools you will need:**
+
+```bash
+sudo apt install -y curl git nano wget
+```
+
+**Ask your AI:** "What does `sudo apt install` do and why do I need curl, git, nano, and wget?"
+
+---
+
+## Setting Up Your Cloud VPS (kscloud1)
+
+After creating your VPS on Hetzner, you will get an IP address (something like `203.0.113.10`).
+You connect to it using a tool called SSH.
+
+**Ask your AI:** "What is SSH and how do I connect to my VPS from Ubuntu?"
+
+The basic command looks like this:
+```bash
+ssh root@YOUR_VPS_IP
+```
+
+Replace `YOUR_VPS_IP` with the actual IP Hetzner gave you.
+
+Once connected, update the VPS just like you did on your home computer:
+```bash
+apt update && apt upgrade -y
+```
+
+---
+
+## Checkpoint
+
+Before moving to Step 2, make sure you have:
+
+- [ ] Ubuntu installed and running on your home computer
+- [ ] A domain name purchased (you will point it at Cloudflare in Step 2)
+- [ ] A Cloudflare account (free)
+- [ ] A Hetzner VPS created with Ubuntu (noted your VPS IP address)
+- [ ] A Tailscale account (free)
+- [ ] An OpenRouter account (free)
+- [ ] You can open a terminal on your home computer
+- [ ] You can SSH into your VPS
+
+If any of these are not done, stop here and ask your AI for help completing them
+before moving on. Every future step assumes all of these are in place.
+
+---
+
+**Next:** [Step 2 — DNS and Cloudflare Setup](02-dns-and-cloudflare.md)
--- a/homelab-mastery/build-guide/with-ai/02-dns-and-cloudflare.md
+++ b/homelab-mastery/build-guide/with-ai/02-dns-and-cloudflare.md
@ -0,0 +1,129 @@
+# Step 2 — DNS and Cloudflare Setup
+
+**Track:** With AI (Beginner)  
+**Time for this step:** 1–2 hours
+
+In this step you will set up Cloudflare so your domain points to Cloudflare's servers,
+and you will create the Cloudflare Tunnel that allows the internet to reach your home
+computer without exposing your home IP address.
+
+---
+
+## What Is Happening Here?
+
+When someone types `www.kitestacks.com` into a browser, their computer asks a system
+called DNS: "What is the IP address for www.kitestacks.com?"
+
+Normally, that answer would be your home IP address. But we do NOT want that: your
+home IP can change, can be targeted by attackers, and many ISPs block incoming
+connections to home networks.
+
+Instead, the DNS answer will be Cloudflare's IP address. Traffic goes to Cloudflare,
+Cloudflare sends it to your computer through a tunnel, and visitors never see your home IP.
+
+**Ask your AI:** "Can you explain in simple terms how Cloudflare Tunnel works?"
+
+---
+
+## Step 2A — Add Your Domain to Cloudflare
+
+If you bought your domain from Cloudflare Registrar, skip to Step 2B.
+
+If you bought it elsewhere (Namecheap, GoDaddy, etc.):
+
+1. Log in to Cloudflare at cloudflare.com
+2. Click "Add a site"
+3. Enter your domain name
+4. Choose the Free plan
+5. Cloudflare will give you two nameserver addresses (like `vera.ns.cloudflare.com`)
+6. Go to your domain registrar's website and replace the nameservers with Cloudflare's
+
+**Ask your AI:** "How do I change nameservers on [your registrar]?"
+
+It can take up to 24 hours for nameserver changes to propagate worldwide, but usually
+it happens within an hour.
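+
+To see whether the change has taken effect, you can ask DNS which nameservers it
+currently reports for your domain. This is a minimal sketch that assumes the `dig`
+tool is available (on Ubuntu it comes from the `bind9-dnsutils` package) and uses
+`yourdomain.com` as a placeholder:
+
+```bash
+# List the nameservers DNS currently reports for the domain.
+# Once the change has propagated, the names should end in ns.cloudflare.com
+dig +short NS yourdomain.com
+```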
+
+---
+
+## Step 2B — Create Your Cloudflare Tunnel
+
+A Cloudflare Tunnel is the invisible connection between your home computer and Cloudflare.
+Your home computer reaches out to Cloudflare (outbound connection). Cloudflare holds that
+connection open. When someone visits your website, Cloudflare sends the request back through
+that existing connection. Your home router never needs to be configured.
+
+**To create a tunnel:**
+
+1. In your Cloudflare dashboard, go to: **Zero Trust → Networks → Tunnels**
+2. Click **"Create a tunnel"**
+3. Choose **"Cloudflared"** as the connector type
+4. Name your tunnel (e.g., `kitestacks-tunnel`)
+5. Cloudflare will show you a token — a long string of characters starting with `eyJ`
+6. **Save this token somewhere safe** — you will need it in Step 3
+
+---
+
+## Step 2C — Add Public Hostnames to the Tunnel
+
+A public hostname tells Cloudflare: "When someone visits this URL, send the traffic
+to this container on my home computer."
+
+You will set up hostnames for all eleven of your services. For each one:
+
+1. In the tunnel settings, click **"Public Hostnames"**
+2. Click **"Add a public hostname"**
+
+Add all of these (you will complete the services in later steps, but adding the
+hostnames now means they are ready):
+
+| Subdomain | Domain | Service (internal address) | Public URL (result) |
+|-----------|--------|----------------------------|---------------------|
+| www | yourdomain.com | http://homepage:3000 | www.yourdomain.com |
+| auth | yourdomain.com | http://authentik:9000 | auth.yourdomain.com |
+| gitforge | yourdomain.com | http://forgejo:3000 | gitforge.yourdomain.com |
+| ai | yourdomain.com | http://kite-openwebui:8080 | ai.yourdomain.com |
+| links | yourdomain.com | http://karakeep:3000 | links.yourdomain.com |
+| kavita | yourdomain.com | http://kavita:5000 | kavita.yourdomain.com |
+| grafana | yourdomain.com | http://grafana:3000 | grafana.yourdomain.com |
+| status | yourdomain.com | http://uptime-kuma:3001 | status.yourdomain.com |
+| wiki | yourdomain.com | http://bookstack:80 | wiki.yourdomain.com |
+| tasks | yourdomain.com | http://osticket-app:80 | tasks.yourdomain.com |
+| portainer | yourdomain.com | https://portainer:9443 | portainer.yourdomain.com |
+
+For the `portainer` entry, enable **"No TLS Verify"** (Portainer uses its own self-signed certificate internally).
+
+Replace `yourdomain.com` with your actual domain throughout.
+
+**Ask your AI:** "What does the 'service' field in a Cloudflare Tunnel hostname mean?
+Why do I use `http://homepage:3000` instead of an IP address?"
+
+---
+
+## Step 2D — Create the Docker Network
+
+Everything in this homelab runs in Docker (covered in the next step), and all the
+containers need to be able to talk to each other and to the Cloudflare connector.
+They do this by being on the same Docker network.
+
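+On your **home computer**, run the command below. It needs Docker, which you install in
+Step 3, so if the `docker` command is not found yet, come back and run it right after
+the installation: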
+```bash
+docker network create kitestacks
+```
+
+You will also run this on your **cloud VPS** in Step 3, before starting `cloudflared` there.
+
+**Ask your AI:** "What is a Docker network and why do all containers need to be on the same one?"
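+
+If you are not sure whether the network already exists, a small guard avoids an
+"already exists" error. This is only a sketch of one way to do it, and it needs
+Docker to be installed first:
+
+```bash
+# Create the network only when `docker network inspect` cannot find it
+docker network inspect kitestacks >/dev/null 2>&1 || docker network create kitestacks
+
+# Confirm it is listed
+docker network ls --filter name=kitestacks
+```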
+
+---
+
+## Checkpoint
+
+Before moving to Step 3, make sure:
+
+- [ ] Your domain is on Cloudflare (nameservers changed or bought from Cloudflare)
+- [ ] You created a Cloudflare Tunnel and saved the tunnel token
+- [ ] You added all 11 public hostnames to the tunnel
+- [ ] You ran `docker network create kitestacks` on your home computer (or will run it right after installing Docker in Step 3)
+
+---
+
+**Next:** [Step 3 — Installing Docker](03-docker-setup.md)
--- a/homelab-mastery/build-guide/with-ai/03-docker-setup.md
+++ b/homelab-mastery/build-guide/with-ai/03-docker-setup.md
@ -0,0 +1,196 @@
+# Step 3 — Installing Docker
+
+**Track:** With AI (Beginner)  
+**Time for this step:** 30–60 minutes (on both your home computer and your VPS)
+
+Docker is the technology that runs all your services. Think of it like a machine that
+can run many small, isolated programs at the same time — each program thinks it is
+the only one on the computer, even though they are all sharing the same hardware.
+
+Each program is called a **container**. You will have about 20 containers running by the end of Step 5.
+
+---
+
+## What Is Docker? (Plain English)
+
+Imagine you want to run fifteen different apps on your computer. If you installed them
+all directly, they might conflict — one app needs Python version 3.9, another needs 3.11,
+and they fight over which one to use. Docker solves this by giving each app its own
+little bubble where it has exactly what it needs, completely separate from everything else.
+
+A **container** is one of those bubbles.
+A **Docker image** is the recipe for making a bubble.
+**Docker Compose** is a tool that lets you describe multiple containers in one file
+and start them all with one command.
+
+**Ask your AI:** "Can you explain Docker containers vs Docker images using a simple analogy?"
+
+---
+
+## Installing Docker on Your Home Computer (monk)
+
+Run these commands one at a time. Before each one, ask your AI what it does.
+
+```bash
+# Install required packages
+sudo apt install -y ca-certificates curl
+
+# Add Docker's official GPG key (proves the software is authentic)
+sudo install -m 0755 -d /etc/apt/keyrings
+sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
+sudo chmod a+r /etc/apt/keyrings/docker.asc
+
+# Add Docker's package source
+echo \
+  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] \
+  https://download.docker.com/linux/ubuntu \
+  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
+  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
+
+# Update package list and install Docker
+sudo apt update
+sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
+```
+
+Now let Docker start automatically when your computer boots:
+```bash
+sudo systemctl enable docker
+sudo systemctl start docker
+```
+
+Add yourself to the Docker group so you do not need `sudo` every time:
+```bash
+sudo usermod -aG docker $USER
+```
+
+**Log out and log back in** (or reboot) for this change to take effect.
+
+Test that Docker is installed:
+```bash
+docker --version
+docker compose version
+```
+
+You should see version numbers printed. If you see errors, ask your AI to help.
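+
+Version numbers only prove the programs are installed. As an optional extra check,
+Docker's small `hello-world` test image confirms that containers can actually start:
+
+```bash
+# Downloads a tiny test image, runs it once, and removes the container afterwards
+docker run --rm hello-world
+```
+
+If this fails with a "permission denied" message, the group change above has probably
+not taken effect yet, so log out and back in and try again.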
+
+---
+
+## Installing Docker on Your Cloud VPS (kscloud1)
+
+SSH into your VPS and run the same installation commands as above. You are logged in
+as `root` there, so you can skip the `usermod` step.
+
+```bash
+ssh root@YOUR_VPS_IP
+```
+
+Then run all the same installation commands.
+
+---
+
+## Your First Container — Cloudflared (Tunnel Connector)
+
+The first container you will run is `cloudflared` — this is what creates the tunnel
+between your computer and Cloudflare. Without this, nothing else can be reached from
+the internet.
+
+**On your home computer**, create a folder for it:
+```bash
+mkdir -p ~/kitestacks-live/docker/cloudflared
+cd ~/kitestacks-live/docker/cloudflared
+```
+
+Create a file called `.env` that holds your tunnel token:
+```bash
+nano .env
+```
+
+Inside the file, type:
+```
+TUNNEL_TOKEN=paste-your-token-here
+```
+
+Replace `paste-your-token-here` with the token you saved from Step 2.
+Press `Ctrl+X`, then `Y`, then `Enter` to save.
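+
+The tunnel token works like a password, so it is reasonable to make the file readable
+only by your own user. A minimal sketch, assuming you are still in the folder that
+contains `.env`:
+
+```bash
+# Make sure the file exists (this leaves an existing file's contents unchanged)
+touch .env
+
+# 600 = the owner can read and write, nobody else can read it
+chmod 600 .env
+
+# The permissions column should now show -rw-------
+ls -l .env
+```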
+
+Now create the `docker-compose.yml` file:
+```bash
+nano docker-compose.yml
+```
+
+Paste this content:
+```yaml
+services:
+  cloudflared:
+    image: cloudflare/cloudflared:latest
+    container_name: cloudflared
+    restart: unless-stopped
+    command: tunnel --no-autoupdate run
+    environment:
+      - TUNNEL_TOKEN=${TUNNEL_TOKEN:?set TUNNEL_TOKEN in .env}
+    networks:
+      - default
+      - kitestacks
+
+networks:
+  kitestacks:
+    external: true
+```
+
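+Save and close the file. The compose file expects the `kitestacks` network from Step 2D
+to exist, so create it first with `docker network create kitestacks` if you have not
+already. Then start it: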
+```bash
+docker compose up -d
+```
+
+Check that it is running:
+```bash
+docker ps
+```
+
+You should see `cloudflared` in the list with a status of `Up`.
+
+Check the logs to confirm it connected:
+```bash
+docker logs cloudflared
+```
+
+You should see something like "Connection established" or "Registered tunnel connection".
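+
+The log can be long. To pull out just the connection lines, you can filter it. This
+sketch assumes the message text "Registered tunnel connection" mentioned above, which
+may differ between `cloudflared` versions:
+
+```bash
+# docker logs writes to both stdout and stderr, so merge them before filtering
+docker logs cloudflared 2>&1 | grep -i "registered tunnel connection"
+```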
+
+**Ask your AI:** "What does `restart: unless-stopped` mean in a Docker Compose file?"
+
+---
+
+## Run Cloudflared on Your VPS Too
+
+SSH into your VPS and do the exact same thing. Use the **same tunnel token** — Cloudflare
+will register this as a second connector for the same tunnel. If your home computer goes
+offline, the VPS will keep serving traffic.
+
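+```bash
+# The external network must exist on the VPS before cloudflared can join it
+docker network create kitestacks
+mkdir -p /opt/kitestacks/docker/cloudflared
+cd /opt/kitestacks/docker/cloudflared
+```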
+
+Create the same `.env` and `docker-compose.yml` files, then:
+```bash
+docker compose up -d
+docker logs cloudflared
+```
+
+---
+
+## Checkpoint
+
+Before moving to Step 4:
+
+- [ ] Docker is installed on your home computer
+- [ ] Docker is installed on your VPS
+- [ ] `docker ps` shows `cloudflared` running on both machines
+- [ ] `docker logs cloudflared` shows successful connection on both
+
+Go to your Cloudflare Tunnel dashboard. Under your tunnel, you should now see
+**2 connectors** listed — one from your home computer and one from your VPS.
+If you only see one, wait a few minutes and refresh.
+
+---
+
+**Next:** [Step 4 — Core Services](04-core-services.md)
--- a/homelab-mastery/build-guide/with-ai/04-core-services.md
+++ b/homelab-mastery/build-guide/with-ai/04-core-services.md
@ -0,0 +1,298 @@
+# Step 4 — Core Services: Portal, Forgejo, and Authentik
+
+**Track:** With AI (Beginner)  
+**Time for this step:** 3–5 hours
+
+These three services form the foundation of KiteStacks:
+- **Portal** — the homepage that links to everything
+- **Forgejo** — stores all your code and configurations in Git
+- **Authentik** — handles all logins for every service (SSO)
+
+Set these up first. Everything else depends on them.
+
+---
+
+## How Docker Compose Files Work
+
+Every service in this homelab has its own folder with a `docker-compose.yml` file.
+That file describes the service: what image to use, what environment variables to set,
+what folders to use for data, and what network to join.
+
+You will create these files using `nano` (a simple text editor in the terminal).
+
+**Ask your AI:** "Can you explain what each section of a docker-compose.yml file does:
+services, image, container_name, restart, environment, volumes, networks?"
+
+---
+
+## Service 1 — The Portal (Homepage)
+
+The portal is your home page at `www.yourdomain.com`. It shows links to all your
+services and displays live system stats.
+
+```bash
+mkdir -p ~/kitestacks-live/docker/kitestacks-portal/public
+cd ~/kitestacks-live/docker/kitestacks-portal
+```
+
+Create `docker-compose.yml`:
+```yaml
+services:
+  homepage:
+    image: nginx:alpine
+    container_name: homepage
+    restart: unless-stopped
+    volumes:
+      - ./public:/usr/share/nginx/html:ro
+      - ./nginx.conf:/etc/nginx/conf.d/default.conf:ro
+    networks:
+      - kitestacks
+
+networks:
+  kitestacks:
+    external: true
+```
+
+Create a basic `nginx.conf`:
+```nginx
+server {
+    listen 3000;
+    root /usr/share/nginx/html;
+    index index.html;
+
+    location / {
+        try_files $uri $uri/ /index.html;
+    }
+}
+```
+
+Create a basic `public/index.html` to test:
+```html
+<!DOCTYPE html>
+<html>
+<head><title>KiteStacks</title></head>
+<body>
+  <h1>KiteStacks is live!</h1>
+</body>
+</html>
+```
+
+Start it:
+```bash
+docker compose up -d
+docker ps
+```
+
+Visit `www.yourdomain.com` in a browser. You should see your page.
+If it works, you have confirmed the tunnel is routing correctly.
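+
+If the page does not load, it helps to find out whether the problem is the container
+or the tunnel. One way, sketched here under the assumption that the public
+`curlimages/curl` image is acceptable to use, is to request the page from inside
+the `kitestacks` network, bypassing Cloudflare entirely:
+
+```bash
+# Start a throwaway curl container on the same network and fetch the portal directly
+docker run --rm --network kitestacks curlimages/curl -s http://homepage:3000
+```
+
+If this prints your HTML, the container is fine and the issue lies in the tunnel
+or its public hostname settings.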
+
+**Ask your AI:** "I want to build a proper homepage for my homelab. It should have a
+dark cyberpunk theme with cards for each of my services. Can you help me write the HTML?"
+
+Work with your AI to build the portal you want. The KiteStacks portal source is in
+`~/kitestacks-homelab/apps/kitestacks-portal/` as reference.
+
+---
+
+## Service 2 — Forgejo (Git)
+
+Forgejo stores all your code. You will push your homelab configs to it so everything
+is version-controlled and you never lose your work.
+
+First, set up the shared PostgreSQL server. Forgejo and Authentik each use their own
+database on it, and the `.env` below creates the `authentik` user and database:
+
+```bash
+mkdir -p ~/kitestacks-live/docker/postgres
+cd ~/kitestacks-live/docker/postgres
+```
+
+Create `.env`:
+```
+POSTGRES_USER=authentik
+POSTGRES_PASSWORD=choose-a-strong-password-here
+POSTGRES_DB=authentik
+```
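+
+For the password placeholder, a randomly generated value is safer than one you invent.
+A minimal sketch using `openssl`, which is normally preinstalled on Ubuntu; the length
+of 24 bytes is an arbitrary choice:
+
+```bash
+# Print 24 random bytes as 48 hexadecimal characters.
+# Hex output contains only 0-9 and a-f, so it is safe to paste into a .env file.
+openssl rand -hex 24
+```
+
+Run it once per password you need and paste each result into the matching `.env` file.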
+
+Create `docker-compose.yml`:
+```yaml
+services:
+  authentik-postgres:
+    image: postgres:16-alpine
+    container_name: authentik-postgres
+    restart: unless-stopped
+    env_file: .env
+    volumes:
+      - ./data:/var/lib/postgresql/data
+    networks:
+      - kitestacks
+
+networks:
+  kitestacks:
+    external: true
+```
+
+```bash
+docker compose up -d
+```
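+
+The `.env` above only creates the `authentik` user and database, while the Forgejo
+settings below expect a separate `forgejo` user and database. One way to create them,
+sketched here on the assumption that the container is named `authentik-postgres` and
+the admin user is `authentik` as configured above:
+
+```bash
+# Feed two SQL statements to psql inside the running PostgreSQL container.
+# Replace the password with the one you will put in Forgejo's .env file.
+docker exec -i authentik-postgres psql -U authentik -d authentik <<'SQL'
+CREATE ROLE forgejo WITH LOGIN PASSWORD 'choose-a-strong-password-here';
+CREATE DATABASE forgejo OWNER forgejo;
+SQL
+```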
+
+Now create the Forgejo service:
+```bash
+mkdir -p ~/kitestacks-live/docker/forgejo
+cd ~/kitestacks-live/docker/forgejo
+```
+
+Create `.env`:
+```
+# Forgejo reads settings named FORGEJO__<section>__<KEY>
+FORGEJO__database__DB_TYPE=postgres
+FORGEJO__database__HOST=authentik-postgres:5432
+FORGEJO__database__NAME=forgejo
+FORGEJO__database__USER=forgejo
+FORGEJO__database__PASSWD=choose-a-strong-password-here
+FORGEJO__server__DOMAIN=gitforge.yourdomain.com
+FORGEJO__server__SSH_DOMAIN=gitforge.yourdomain.com
+FORGEJO__server__ROOT_URL=https://gitforge.yourdomain.com
+```
+
+Create `docker-compose.yml`:
+```yaml
+services:
+  forgejo:
+    image: codeberg.org/forgejo/forgejo:latest
+    container_name: forgejo
+    restart: unless-stopped
+    env_file: .env
+    volumes:
+      - ./data:/data
+    networks:
+      - kitestacks
+
+networks:
+  kitestacks:
+    external: true
+```
+
+```bash
+docker compose up -d
+docker logs forgejo -f
+```
+
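+Wait for it to finish starting (about 30 seconds), then visit `gitforge.yourdomain.com`.
+You will see a Forgejo setup page. Check that the database fields match your `.env`
+(PostgreSQL, host `authentik-postgres:5432`, database and user `forgejo`), then follow
+the on-screen instructions to create your admin account.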
+
+**Ask your AI:** "How do I create a repository on Forgejo and push my local files to it?"
+
+---
+
+## Service 3 — Authentik (Single Sign-On)
+
+Authentik is the most complex service to set up, but it is worth it — once done,
+you log in once and every other service recognizes you automatically.
+
+First, set up Redis (Authentik needs this for session management):
+```bash
+mkdir -p ~/kitestacks-live/docker/redis
+cd ~/kitestacks-live/docker/redis
+```
+
+Create `docker-compose.yml`:
+```yaml
+services:
+  authentik-redis:
+    image: redis:alpine
+    container_name: authentik-redis
+    restart: unless-stopped
+    networks:
+      - kitestacks
+
+networks:
+  kitestacks:
+    external: true
+```
+
+```bash
+docker compose up -d
+```
+
+Now create Authentik:
+```bash
+mkdir -p ~/kitestacks-live/docker/authentik
+cd ~/kitestacks-live/docker/authentik
+```
+
+Generate a secret key (run this and save the output):
+```bash
+openssl rand -base64 60 | tr -d '\n'
+```
+
+Create `.env` (replace the values):
+```
+PG_PASS=same-postgres-password-from-above
+AUTHENTIK_SECRET_KEY=paste-the-generated-key-here
+AUTHENTIK_BOOTSTRAP_EMAIL=your@email.com
+AUTHENTIK_BOOTSTRAP_PASSWORD=choose-a-strong-admin-password
+AUTHENTIK_POSTGRESQL__HOST=authentik-postgres
+AUTHENTIK_POSTGRESQL__USER=authentik
+AUTHENTIK_POSTGRESQL__NAME=authentik
+AUTHENTIK_POSTGRESQL__PASSWORD=same-postgres-password-from-above
+AUTHENTIK_REDIS__HOST=authentik-redis
+```
+
+Create `docker-compose.yml`:
+```yaml
+services:
+  authentik:
+    image: ghcr.io/goauthentik/server:latest
+    container_name: authentik
+    restart: unless-stopped
+    command: server
+    env_file: .env
+    networks:
+      - kitestacks
+
+  authentik-worker:
+    image: ghcr.io/goauthentik/server:latest
+    container_name: authentik-worker
+    restart: unless-stopped
+    command: worker
+    env_file: .env
+    networks:
+      - kitestacks
+
+networks:
+  kitestacks:
+    external: true
+```
+
+```bash
+docker compose up -d
+```
+
+Authentik takes about 2 minutes to start on first run (it sets up the database).
+Watch the logs:
+```bash
+docker logs authentik -f
+```
+
+When you see "Starting authentik server" it is ready.
+Visit `auth.yourdomain.com` and log in with the bootstrap email and password you set.
+
+**Ask your AI:** "I have Authentik running. How do I create an OAuth2 provider for Grafana
+so it can use SSO? Walk me through the steps in the Authentik admin panel."
+
+Use the same process (with your AI's help) to create OAuth2 providers for each service
+as you add them in the next steps.
+
+---
+
+## Checkpoint
+
+Before moving to Step 5:
+
+- [ ] Portal is live at `www.yourdomain.com`
+- [ ] Forgejo is live at `gitforge.yourdomain.com` with your admin account created
+- [ ] Authentik is live at `auth.yourdomain.com` and you can log in
+- [ ] You can see `homepage`, `forgejo`, `authentik`, `authentik-worker`, `authentik-postgres`, and `authentik-redis` in `docker ps`
+
+---
+
+**Next:** [Step 5 — All Remaining Services](05-all-services.md)
--- a/homelab-mastery/build-guide/with-ai/05-all-services.md
+++ b/homelab-mastery/build-guide/with-ai/05-all-services.md
@ -0,0 +1,266 @@
+# Step 5 — All Remaining Services
+
+**Track:** With AI (Beginner)  
+**Time for this step:** 4–8 hours (take breaks — deploy one service at a time)
+
+In this step you will deploy the remaining eight services. For each one:
+1. Create the folder
+2. Create the `docker-compose.yml` file
+3. Run `docker compose up -d`
+4. Verify it is working
+5. Move on to the next one
+
+For each service, ask your AI to explain the docker-compose file before you run it.
+
+---
+
+## How to Use Your AI for Each Service
+
+For every service in this step, you can say to your AI:
+
+> "I am setting up [service name] in my KiteStacks homelab. It is a self-hosted [description].
+> Can you give me a docker-compose.yml for it that joins a network called 'kitestacks'?
+> I want to understand each part before I run it."
+
+Then ask follow-up questions about anything you do not understand.
+
+---
+
+## Service 4 — Open WebUI + LiteLLM (AI Chat)
+
+Open WebUI is your ChatGPT-style interface. LiteLLM sits behind it and routes your
+AI requests to OpenRouter (where you have free model access).
+
+```bash
+mkdir -p ~/kitestacks-live/docker/kite-openwebui
+mkdir -p ~/kitestacks-live/docker/kite-litellm
+```
+
+**Ask your AI:**
+> "I want to set up Open WebUI (ghcr.io/open-webui/open-webui) with LiteLLM as the
+> backend. LiteLLM should route to OpenRouter. Can you give me docker-compose files
+> for both? Container names: kite-openwebui and kite-litellm. Network: kitestacks."
+
+Work with your AI to get the right environment variables (you will need your OpenRouter
+API key from openrouter.ai).
+
+Start both:
+```bash
+cd ~/kitestacks-live/docker/kite-litellm && docker compose up -d
+cd ~/kitestacks-live/docker/kite-openwebui && docker compose up -d
+```
+
+Visit `ai.yourdomain.com` and create your admin account.
+
+---
+
+## Service 5 — Karakeep (Bookmarks)
+
+Karakeep saves bookmarks, articles, and links. It uses a headless Chrome browser
+to capture the full content of pages you save.
+
+```bash
+mkdir -p ~/kitestacks-live/docker/karakeep
+```
+
+**Ask your AI:**
+> "I want to set up Karakeep (ghcr.io/karakeep/karakeep) for bookmark management.
+> It needs a headless Chrome container (browserless/chrome) for page capture and
+> a Meilisearch container for search. Container names: karakeep, karakeep-chrome,
+> karakeep-meilisearch. All on the 'kitestacks' network. Give me one docker-compose.yml
+> for all three."
+
+```bash
+cd ~/kitestacks-live/docker/karakeep && docker compose up -d
+```
+
+Visit `links.yourdomain.com`.
+
+**Important:** When you set up SSO for Karakeep in Step 6, note that Karakeep uses
+NextAuth.js with the provider ID `custom` — so the OAuth2 redirect URL will be
+`https://links.yourdomain.com/api/auth/callback/custom` (not `/callback/authentik`).
+This is a common mistake. Make a note of it now.
+
+---
+
+## Service 6 — Kavita (eBook Reader)
+
+Kavita lets you read ebooks, manga, and comics from a library you maintain.
+
+```bash
+mkdir -p ~/kitestacks-live/docker/kavita/library/books
+mkdir -p ~/kitestacks-live/docker/kavita/config
+```
+
+**Ask your AI:**
+> "I want to set up Kavita (jvmilazz0/kavita) as an ebook reader. Container name: kavita.
+> The library should be mounted from ./library/books into the container. Config directory
+> at ./config. Network: kitestacks. Give me the docker-compose.yml."
+
+```bash
+cd ~/kitestacks-live/docker/kavita && docker compose up -d
+```
+
+Visit `kavita.yourdomain.com` and create your admin account. Add your books by placing
+ebook files in `~/kitestacks-live/docker/kavita/library/books/` and scanning the library
+in Kavita's settings.
+
+**Important for SSO:** Kavita's OIDC settings must be configured through the Kavita web UI,
+not by editing files directly. The Authority URL must end with a trailing slash:
+`https://auth.yourdomain.com/application/o/kavita/`
+
+---
+
+## Service 7 — Grafana (Monitoring Dashboards)
+
+Grafana shows you beautiful graphs of your server's CPU, RAM, network, and disk usage.
+
+```bash
+mkdir -p ~/kitestacks-live/docker/grafana/provisioning/datasources
+mkdir -p ~/kitestacks-live/docker/grafana/provisioning/dashboards
+```
+
+**Ask your AI:**
+> "I want to set up Grafana (grafana/grafana) with Prometheus as the data source.
+> I want the 'Node Exporter Full' dashboard (id 1860) to auto-load via provisioning.
+> Container name: grafana. Network: kitestacks. Give me the docker-compose.yml and
+> the provisioning YAML files for the datasource and dashboard."
+
+```bash
+cd ~/kitestacks-live/docker/grafana && docker compose up -d
+```
+
+Visit `grafana.yourdomain.com`.
+
+**Also set up Prometheus and node-exporter (Grafana needs these for data):**
+
+**Ask your AI:**
+> "I want to set up Prometheus to scrape metrics from node-exporter running on the same
+> host. Container names: prometheus and node-exporter. Network: kitestacks. Give me the
+> docker-compose.yml and prometheus.yml config file."
+
+---
+
+## Service 8 — Uptime Kuma (Status Page)
+
+Uptime Kuma monitors all your services and shows a public status page.
+
+```bash
+mkdir -p ~/kitestacks-live/docker/uptime-kuma
+```
+
+**Ask your AI:**
+> "Set up Uptime Kuma (louislam/uptime-kuma). Container name: uptime-kuma. Network: kitestacks.
+> Use a named volume called 'uptime-kuma' for data. Give me the docker-compose.yml."
+
+```bash
+cd ~/kitestacks-live/docker/uptime-kuma && docker compose up -d
+```
+
+Visit `status.yourdomain.com`, create your admin account, then add HTTP monitors for
+each of your eleven services. Set each monitor to check every 60 seconds.
+
+**Add a status page:**
+- In Uptime Kuma → Status Pages → New Status Page
+- Slug: `homelab`
+- Add all your monitors to it
+- Your public status page will be at `status.yourdomain.com/status/homelab`
+
+---
+
+## Service 9 — BookStack (Wiki)
+
+BookStack is a clean wiki for writing and organizing documentation.
+
+```bash
+mkdir -p ~/kitestacks-live/docker/bookstack
+```
+
+**Ask your AI:**
+> "Set up BookStack (lscr.io/linuxserver/bookstack) with its own MariaDB database.
+> Container names: bookstack and bookstack-db. APP_URL should be https://wiki.yourdomain.com.
+> Network: kitestacks. Give me the docker-compose.yml."
+
+```bash
+cd ~/kitestacks-live/docker/bookstack && docker compose up -d
+```
+
+BookStack takes about a minute to start on first run. Visit `wiki.yourdomain.com`.
+Default login: `admin@admin.com` / `password` — change this immediately.
+
+---
+
+## Service 10 — OSTicket (Help Desk)
+
+OSTicket is a help desk and ticketing system.
+
+```bash
+mkdir -p ~/kitestacks-live/docker/osticket
+```
+
+**Ask your AI:**
+> "Set up OSTicket using the docker image campbellsoftwaresolutions/osticket with its
+> own MySQL database. Container names: osticket-app and osticket-db. Network: kitestacks.
+> What environment variables do I need? Give me the docker-compose.yml."
+
+```bash
+cd ~/kitestacks-live/docker/osticket && docker compose up -d
+```
+
+Visit `tasks.yourdomain.com` to complete the web-based setup.
+
+---
+
+## Service 11 — Portainer (Docker Management)
+
+Portainer gives you a visual dashboard to manage all your containers.
+
+```bash
+mkdir -p ~/kitestacks-live/docker/portainer
+```
+
+**Ask your AI:**
+> "Set up Portainer CE (portainer/portainer-ce). Container name: portainer. Port 9443 (HTTPS).
+> Mount the Docker socket (/var/run/docker.sock) so it can manage containers.
+> Network: kitestacks. Give me the docker-compose.yml."
+
+```bash
+cd ~/kitestacks-live/docker/portainer && docker compose up -d
+```
+
+Visit `portainer.yourdomain.com`. Create your admin account.
+
+---
+
+## Checkpoint
+
+Run this to see all your containers:
+```bash
+docker ps --format "table {{.Names}}\t{{.Status}}"
+```
+
+You should see all of these running:
+- cloudflared
+- homepage
+- forgejo
+- authentik + authentik-worker
+- kite-openwebui + kite-litellm
+- karakeep + karakeep-chrome + karakeep-meilisearch
+- kavita
+- grafana + prometheus + node-exporter
+- uptime-kuma
+- bookstack + bookstack-db
+- osticket-app + osticket-db
+- portainer
+- authentik-postgres + authentik-redis
+
+If any are missing or show as unhealthy, check their logs:
+```bash
+docker logs <container-name>
+```
+
+Ask your AI to help diagnose any errors.
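+
+The comparison against the list above can also be scripted. The following is a minimal
+sketch, assuming your container names match this checkpoint; `missing_containers` is a
+hypothetical helper name, not a Docker command.
+
+```bash
+# Print every expected container name that is not currently running.
+# Running names arrive on stdin (one per line); expected names are the arguments.
+missing_containers() {
+  running="$(cat)"
+  for name in "$@"; do
+    # -x matches the whole line, -F treats the name as a literal string
+    printf '%s\n' "$running" | grep -qxF -- "$name" || echo "$name"
+  done
+}
+```
+
+Usage: `docker ps --format '{{.Names}}' | missing_containers cloudflared homepage forgejo portainer`
+prints only the names that are not running, and nothing if all of them are up.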
+
+---
+
+**Next:** [Step 6 — Single Sign-On (SSO)](06-sso.md)
--- a/homelab-mastery/build-guide/with-ai/06-sso.md
+++ b/homelab-mastery/build-guide/with-ai/06-sso.md
@ -0,0 +1,242 @@
+# Step 6 — Single Sign-On (SSO)
+
+**Track:** With AI (Beginner)  
+**Time for this step:** 3–5 hours
+
+SSO (Single Sign-On) means one login for everything. After this step, you will log in
+with your Authentik account once and every service will recognize you automatically.
+No more logging in to each service separately.
+
+---
+
+## How SSO Works (Plain English)
+
+Without SSO:
+```
+You → Grafana login page → type username + password → logged in to Grafana
+You → Forgejo login page → type username + password → logged in to Forgejo
+(repeat for every service)
+```
+
+With SSO:
+```
+You → Grafana "Sign in with Authentik" button
+    → Authentik asks for login (once, or already remembered)
+    → Authentik tells Grafana "this is kenpat, let them in"
+    → Logged in to Grafana
+
+You → Forgejo "Sign in with Authentik"
+    → Already logged into Authentik → instantly logged in to Forgejo
+```
+
+The technology behind this is called **OAuth2** and **OIDC**. For now, you do not
+need to know the details — just follow the steps. (The concepts file explains it
+deeply if you are curious: [concepts/oauth2-oidc.md](../../concepts/oauth2-oidc.md))
+
+---
+
+## The Process for Each Service
+
+For every service, you follow the same steps:
+
+**In Authentik:**
+1. Create an OAuth2 Provider for the service
+2. Create an Application that links to that Provider
+3. (Optional) Add a Policy to restrict who can access it
+
+**In the service:**
+4. Enter the Authentik credentials (client ID, client secret, URLs)
+
+Your AI will guide you through each one. Use this prompt template:
+
+> "I want to configure SSO for [service name] using Authentik as the OIDC provider.
+> The service is at https://[service].yourdomain.com. Walk me through:
+> 1. Creating an OAuth2 provider in Authentik's admin panel
+> 2. What redirect URI to use
+> 3. How to configure the service to use Authentik for login"
+
+---
+
+## SSO for Grafana
+
+**In Authentik admin panel (auth.yourdomain.com/if/admin/):**
+1. Go to **Applications → Providers → Create**
+2. Choose **OAuth2/OpenID Provider**
+3. Name: `Grafana`
+4. Client type: `Confidential`
+5. Redirect URIs: `https://grafana.yourdomain.com/login/generic_oauth`
+6. Scopes: openid, email, profile
+7. Save — note the **Client ID** and **Client Secret**
+
+8. Go to **Applications → Applications → Create**
+9. Name: `Grafana`, Slug: `grafana`
+10. Provider: select the Grafana provider you just created
+11. Save
+
+**In Grafana's `.env` or `docker-compose.yml` environment:**
+```
+GF_AUTH_GENERIC_OAUTH_ENABLED=true
+GF_AUTH_GENERIC_OAUTH_NAME=Authentik
+GF_AUTH_GENERIC_OAUTH_CLIENT_ID=paste-client-id-here
+GF_AUTH_GENERIC_OAUTH_CLIENT_SECRET=paste-client-secret-here
+GF_AUTH_GENERIC_OAUTH_SCOPES=openid email profile
+GF_AUTH_GENERIC_OAUTH_AUTH_URL=https://auth.yourdomain.com/application/o/authorize/
+GF_AUTH_GENERIC_OAUTH_TOKEN_URL=https://auth.yourdomain.com/application/o/token/
+GF_AUTH_GENERIC_OAUTH_API_URL=https://auth.yourdomain.com/application/o/userinfo/
+GF_AUTH_GENERIC_OAUTH_ROLE_ATTRIBUTE_PATH=contains(groups, 'homelab-admin') && 'Admin' || 'Viewer'
+```
+
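+Recreate the Grafana container so it picks up the new variables: `docker compose up -d grafana`
+(`docker compose restart` keeps the old environment and would ignore these changes.)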
+
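+Visit `grafana.yourdomain.com`. You should see a "Sign in with Authentik" button.
+If Authentik reports a redirect URI mismatch, check that `GF_SERVER_ROOT_URL` is set to
+`https://grafana.yourdomain.com`, because Grafana builds its callback URL from that value.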
+
+---
+
+## SSO for Forgejo
+
+**In Authentik:** Create an OAuth2 Provider with:
+- Redirect URI: `https://gitforge.yourdomain.com/user/oauth2/authentik/callback`
+
+**In Forgejo:**
+- Site Administration → Authentication Sources → Add Authentication Source
+- Type: OAuth2
+- Name: `authentik`
+- OAuth2 Provider: OpenID Connect
+- Client ID and Secret from Authentik
+- OpenID Connect Discovery URL: `https://auth.yourdomain.com/application/o/forgejo/.well-known/openid-configuration`
+
+**Ask your AI:** "Walk me through adding an OAuth2 authentication source in Forgejo's admin panel."
+
+---
+
+## SSO for Karakeep
+
+**Important:** Karakeep uses NextAuth.js internally. The redirect URI is NOT the usual
+`/callback/authentik` — it is `/api/auth/callback/custom`.
+
+**In Authentik:** Create OAuth2 Provider with:
+- Redirect URI: `https://links.yourdomain.com/api/auth/callback/custom`
+
+**In Karakeep's environment:**
+```
+NEXTAUTH_URL=https://links.yourdomain.com
+NEXTAUTH_SECRET=generate-a-random-secret
+OAUTH_WELLKNOWN_URL=https://auth.yourdomain.com/application/o/karakeep/.well-known/openid-configuration
+OAUTH_CLIENT_ID=paste-client-id
+OAUTH_CLIENT_SECRET=paste-client-secret
+OAUTH_PROVIDER_NAME=Authentik
+OAUTH_ALLOW_DANGEROUS_EMAIL_ACCOUNT_LINKING=true
+```
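+
+`NEXTAUTH_SECRET` should be a long random string rather than a phrase you make up. One way
+to generate it is sketched below, assuming `openssl` is installed; `generate_secret` is a
+hypothetical helper name.
+
+```bash
+# Generate 48 random bytes and print them base64-encoded on a single line.
+generate_secret() {
+  openssl rand -base64 48 | tr -d '\n'
+}
+```
+
+Usage: `echo "NEXTAUTH_SECRET=$(generate_secret)"` prints a line you can paste into
+Karakeep's environment.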
+
+---
+
+## SSO for Kavita
+
+**In Authentik:** Create OAuth2 Provider with:
+- Redirect URI: `https://kavita.yourdomain.com/api/auth/callback`
+
+**In Kavita:** Go to Settings → OIDC (must be done through the UI, not by editing files):
+- Authority: `https://auth.yourdomain.com/application/o/kavita/` ← trailing slash required
+- Client ID and Client Secret from Authentik
+- Enabled: on
+
+**Critical:** The trailing slash in the Authority URL is required. Without it, Kavita
+gives an "issuer does not match" error.
+
+---
+
+## SSO for Open WebUI
+
+**In Authentik:** Create OAuth2 Provider with:
+- Redirect URI: `https://ai.yourdomain.com/oauth/oidc/callback`
+
+**In Open WebUI's environment:**
+```
+ENABLE_OAUTH_SIGNUP=true
+OAUTH_PROVIDER_NAME=Authentik
+OPENID_PROVIDER_URL=https://auth.yourdomain.com/application/o/openwebui/.well-known/openid-configuration
+OAUTH_CLIENT_ID=paste-client-id
+OAUTH_CLIENT_SECRET=paste-client-secret
+```
+
+---
+
+## SSO for BookStack
+
+**In Authentik:** Create OAuth2 Provider with:
+- Redirect URI: `https://wiki.yourdomain.com/oidc/callback`
+- Issuer mode: **Per Provider** (important — set this in Authentik's provider settings)
+
+**In BookStack's `.env`:**
+```
+AUTH_METHOD=oidc
+AUTH_AUTO_INITIATE=false
+OIDC_NAME=Authentik
+OIDC_DISPLAY_NAME_CLAIMS=name
+OIDC_CLIENT_ID=paste-client-id
+OIDC_CLIENT_SECRET=paste-client-secret
+OIDC_ISSUER=https://auth.yourdomain.com/application/o/bookstack/
+OIDC_ISSUER_DISCOVER=true
+```
+
+After setting this up, the BookStack cache directory needs to be writable:
+```bash
+docker exec bookstack chown -R abc:users /config/www/framework/cache/
+docker compose restart bookstack
+```
+
+---
+
+## SSO for Portainer
+
+**In Authentik:** Create OAuth2 Provider with:
+- Redirect URI: `https://portainer.yourdomain.com`
+
+**In Portainer:** Settings → Authentication → OAuth:
+- Provider: Custom
+- Client ID and Secret from Authentik
+- Authorization URL: `https://auth.yourdomain.com/application/o/authorize/`
+- Token URL: `https://auth.yourdomain.com/application/o/token/`
+- Userinfo URL: `https://auth.yourdomain.com/application/o/userinfo/`
+- Redirect URL: `https://portainer.yourdomain.com`
+- Scopes: `openid email profile`
+
+**Security note:** In Authentik, add a Policy Binding to the Portainer application
+to restrict access to your admin group only. Without it, any user with an Authentik
+account could sign in to the Docker management panel.
+
+---
+
+## Restricting Access by Group (Security)
+
+For sensitive services like Portainer, you want only administrators to access them:
+
+1. In Authentik, go to **Directory → Groups → Create**
+2. Name: `homelab-admin`
+3. Add yourself to this group
+
+4. Go to **Applications → Applications → [Portainer] → Policy Bindings**
+5. Add a binding: Group → `homelab-admin` → Allow
+
+Now only members of `homelab-admin` can use the Portainer application through SSO.
+
+---
+
+## Checkpoint
+
+Test SSO for each service:
+- [ ] Grafana — "Sign in with Authentik" works
+- [ ] Forgejo — OAuth2 login works
+- [ ] Karakeep — SSO login works
+- [ ] Kavita — "Sign in with Authentik" works
+- [ ] Open WebUI — SSO login works
+- [ ] BookStack — OIDC login works
+- [ ] Portainer — OAuth login works
+
+If any fail, check the error message and ask your AI: "I'm getting this error when
+signing in to [service] with Authentik: [paste the error]. What does it mean and
+how do I fix it?"
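+
+Issuer and discovery-URL errors (such as Kavita's "issuer does not match") can often be
+narrowed down by reading the discovery document that Authentik actually serves. This is a
+minimal sketch, assuming the application slug is `kavita`; `issuer_from_discovery` is a
+hypothetical helper, and it uses `sed` rather than a real JSON parser.
+
+```bash
+# Print the "issuer" value from an OIDC discovery document read on stdin.
+issuer_from_discovery() {
+  sed -n 's/.*"issuer"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
+}
+```
+
+Usage: `curl -fsS https://auth.yourdomain.com/application/o/kavita/.well-known/openid-configuration | issuer_from_discovery`.
+The printed value should match the issuer or Authority configured in the service character
+for character, including any trailing slash.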
+
+---
+
+**Next:** [Step 7 — Cloud Failover (kscloud1)](07-cloud-failover.md)
--- a/homelab-mastery/build-guide/with-ai/07-cloud-failover.md
+++ b/homelab-mastery/build-guide/with-ai/07-cloud-failover.md
@ -0,0 +1,202 @@
+# Step 7 — Cloud Failover (kscloud1)
+
+**Track:** With AI (Beginner)  
+**Time for this step:** 4–6 hours
+
+Right now, if your home computer is turned off, your entire website goes offline. This step
+fixes that. You will turn your cloud VPS (kscloud1) into a full mirror of your homelab,
+so that when your home computer is off, kscloud1 keeps everything running.
+
+---
+
+## What You Are Building
+
+```
+Home (monk)    ←—— always developing ——→ pushes to ——→   Cloud (kscloud1)
+                                                          always live
+                                                          never goes down
+
+Cloudflare routes traffic to whichever host responds.
+If monk is off, kscloud1 handles everything by itself.
+```
+
+---
+
+## Step 7A — Set Up Tailscale on Both Machines
+
+Tailscale creates a private, encrypted connection between your home computer and your VPS.
+You need this so both machines can share a database securely.
+
+**On your home computer:**
+```bash
+curl -fsSL https://tailscale.com/install.sh | sh
+sudo tailscale up
+```
+
+Follow the link it gives you to authenticate in your browser.
+
+**On your VPS (via SSH):**
+```bash
+curl -fsSL https://tailscale.com/install.sh | sh
+sudo tailscale up
+```
+
+Authenticate again.
+
+After both are connected, check their Tailscale IPs:
+```bash
+tailscale ip -4
+```
+
+Write down both IPs — they look like `100.x.x.x`. You will use these in the next steps.
+
+**Ask your AI:** "I have Tailscale installed on two machines. How do I verify they can
+reach each other using their Tailscale IPs?"
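+
+One way to check connectivity yourself is sketched below, assuming you run it on one machine
+and pass the other machine's Tailscale IP; `check_peer` is a hypothetical helper name.
+
+```bash
+# Check that a peer answers over Tailscale's own ping first, then over normal ICMP ping.
+check_peer() {
+  peer_ip="$1"
+  tailscale ping -c 3 "$peer_ip" && ping -c 3 "$peer_ip"
+}
+```
+
+Usage: `check_peer 100.x.x.x` with the real IP filled in. `tailscale status` also lists
+every machine on your tailnet along with its IP.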
+
+---
+
+## Step 7B — Move the Shared Databases to kscloud1
+
+For SSO to work properly across both machines, both Authentik instances must share
+one database. If they have separate databases, logins will fail roughly half the time.
+
+This means:
+- Move (or start fresh) Postgres and Redis on kscloud1
+- Configure both monk and kscloud1's Authentik to point to kscloud1's database over Tailscale
+
+**On kscloud1**, create the database containers. Use the same passwords you used on monk:
+
+```bash
+mkdir -p /opt/kitestacks/docker/authentik
+cd /opt/kitestacks/docker/authentik
+```
+
+Create `docker-compose.yml` with Postgres and Redis bound to the Tailscale IP:
+
+```yaml
+services:
+  authentik-postgres:
+    image: postgres:16-alpine
+    container_name: authentik-postgres
+    restart: unless-stopped
+    environment:
+      POSTGRES_PASSWORD: your-db-password
+      POSTGRES_USER: authentik
+      POSTGRES_DB: authentik
+    ports:
+      - "100.123.x.x:5432:5432"   # bind to Tailscale IP only
+    volumes:
+      - ./postgres:/var/lib/postgresql/data
+    networks:
+      - kitestacks
+
+  authentik-redis:
+    image: redis:alpine
+    container_name: authentik-redis
+    restart: unless-stopped
+    ports:
+      - "100.123.x.x:6379:6379"   # bind to Tailscale IP only
+    networks:
+      - kitestacks
+
+networks:
+  kitestacks:
+    external: true
+```
+
+Replace `100.123.x.x` with kscloud1's actual Tailscale IP.
+
+```bash
+docker compose up -d
+```
+
+**On monk**, update Authentik's environment to point to kscloud1's database:
+```
+AUTHENTIK_POSTGRESQL__HOST=100.123.x.x   # kscloud1's Tailscale IP
+AUTHENTIK_REDIS__HOST=100.123.x.x
+```
+
+Restart Authentik on monk:
+```bash
+cd ~/kitestacks-live/docker/authentik
+docker compose down
+docker compose up -d
+```
+
+**Ask your AI:** "I need to migrate my Authentik database from one host to another.
+How do I dump the data from my current Postgres and restore it on the new host?"
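+
+A common approach is a plain SQL dump and restore. This is a sketch, assuming the container
+is named `authentik-postgres` and the database and user are both `authentik` on both hosts
+(as in the compose file above); the function names and the dump filename are placeholders.
+
+```bash
+# Dump the Authentik database from the Postgres container on this host into a file.
+dump_authentik_db() {
+  docker exec authentik-postgres pg_dump -U authentik -d authentik > "$1"
+}
+
+# Load a dump file into the (empty) Authentik database on this host.
+restore_authentik_db() {
+  docker exec -i authentik-postgres psql -U authentik -d authentik < "$1"
+}
+```
+
+Stop the Authentik server and worker first so nothing writes during the dump. On monk run
+`dump_authentik_db authentik.sql`, copy the file to kscloud1 (for example with `scp` over
+Tailscale), then on kscloud1 run `restore_authentik_db authentik.sql`.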
+
+---
+
+## Step 7C — Deploy All Services on kscloud1
+
+Now deploy the same services on kscloud1. SSH into your VPS and create the same
+folder structure and docker-compose files that you have on monk.
+
+```bash
+mkdir -p /opt/kitestacks/docker
+```
+
+For each service (forgejo, homepage, karakeep, kavita, grafana, etc.):
+
+1. Create the folder: `mkdir -p /opt/kitestacks/docker/<service>`
+2. Copy your docker-compose.yml from monk (with any path changes for `/opt/kitestacks/`)
+3. Copy your .env files
+4. Run `docker compose up -d`
+
+The fastest way is to have your AI help you:
+
+> "I have all my services running on my home computer at ~/kitestacks-live/docker/.
+> I want to replicate them on my VPS at /opt/kitestacks/docker/. Can you help me
+> go through each service and identify what needs to change for the VPS environment?"
+
+**Important differences on kscloud1:**
+- Authentik already points to the shared Postgres/Redis (same as monk now)
+- Forgejo should also use the shared Postgres (add a `forgejo` database to it)
+- Paths use `/opt/kitestacks/` instead of `~/kitestacks-live/`
+
+---
+
+## Step 7D — Verify Failover Works
+
+With both machines running and both cloudflared connectors active, test that failover works:
+
+1. In your Cloudflare Tunnel dashboard, you should see **2 connectors**
+2. Visit your website from your phone (not connected to home WiFi)
+3. Everything should work
+4. Now stop monk's cloudflared: `cd ~/kitestacks-live/docker/cloudflared && docker compose stop`
+5. Visit your website again from your phone
+6. Everything should still work (kscloud1 is serving it)
+7. Restart monk's cloudflared: `docker compose start cloudflared`
+
+If step 6 works, your cloud failover is complete.
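+
+The phone test can be repeated quickly from any machine outside your home network. The
+following is a minimal sketch; `check_urls` is a hypothetical helper and the hostnames you
+pass in are your own.
+
+```bash
+# Print the HTTP status code for each URL given as an argument.
+check_urls() {
+  for url in "$@"; do
+    # -o /dev/null discards the page body, -m 10 gives up after 10 seconds
+    code="$(curl -s -o /dev/null -m 10 -w '%{http_code}' "$url")"
+    echo "$code $url"
+  done
+}
+```
+
+Usage: `check_urls https://www.yourdomain.com https://wiki.yourdomain.com`. A `200` or a
+redirect such as `302` means a live origin answered. A 5xx code or `000` (no response)
+suggests the request did not reach a running host.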
+
+---
+
+## Step 7E — Set Up Uptime Kuma on kscloud1
+
+Your Conky desktop widget reads Uptime Kuma from kscloud1 (not monk). Set it up there:
+
+Deploy uptime-kuma on kscloud1 the same way you did on monk. Then push your monitors
+from monk to kscloud1 by copying the database.
+
+**Ask your AI:** "How do I copy a SQLite database from one Docker container to another
+on a different machine, safely and without data corruption?"
+
+The trick is using Python's `sqlite3.Connection.backup()` method, which creates a
+consistent copy even while the database is in use.
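+
+The online backup can be sketched as a small shell function that calls Python. This assumes
+`python3` is installed on the host and that you have located the Uptime Kuma database file
+yourself; `backup_sqlite` is a hypothetical helper name.
+
+```bash
+# Copy a SQLite database using the online backup API: backup_sqlite <source> <destination>
+backup_sqlite() {
+  python3 - "$1" "$2" <<'PY'
+import sqlite3
+import sys
+
+src = sqlite3.connect(sys.argv[1])
+dst = sqlite3.connect(sys.argv[2])
+with dst:
+    src.backup(dst)  # copies every page into dst as one consistent snapshot
+dst.close()
+src.close()
+PY
+}
+```
+
+Usage: `backup_sqlite kuma.db kuma-copy.db` writes a copy that can then be transferred to
+kscloud1. The filenames here are placeholders.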
+
+---
+
+## Checkpoint
+
+- [ ] Tailscale is installed on both machines and they can reach each other
+- [ ] Shared Postgres and Redis are running on kscloud1's Tailscale IP
+- [ ] Both Authentik instances (monk and kscloud1) point to the shared database
+- [ ] All 11 services are running on kscloud1
+- [ ] Cloudflare Tunnel shows 2 connectors
+- [ ] Website works when monk's cloudflared is stopped
+
+---
+
+**Next:** [Step 8 — Monitoring](08-monitoring.md)
--- a/homelab-mastery/build-guide/with-ai/08-monitoring.md
+++ b/homelab-mastery/build-guide/with-ai/08-monitoring.md
@ -0,0 +1,229 @@
+# Step 8 — Monitoring
+
+**Track:** With AI (Beginner)  
+**Time for this step:** 2–3 hours
+
+Monitoring means knowing when something is wrong before your users tell you.
+In this step you will set up three layers of monitoring:
+
+1. **Grafana** — beautiful dashboards showing CPU, RAM, disk, and network over time
+2. **Uptime Kuma** — checks every 60 seconds that each service responds correctly
+3. **Conky** — a desktop widget on your home computer showing live kscloud1 status
+
+---
+
+## Monitoring Layer 1 — Grafana + Prometheus
+
+You already deployed Grafana and Prometheus in Step 5. Now configure them properly.
+
+### Edit the Prometheus Config
+
+Prometheus needs to know where to collect metrics from. Tell it about both machines:
+
+```bash
+nano ~/kitestacks-live/docker/prometheus/prometheus.yml
+```
+
+Add this content:
+```yaml
+global:
+  scrape_interval: 15s
+
+scrape_configs:
+  - job_name: 'monk-node'
+    static_configs:
+      - targets: ['node-exporter:9100']
+        labels:
+          instance: 'monk'
+
+  - job_name: 'kscloud1-node'
+    static_configs:
+      - targets: ['YOUR_VPS_IP:9100']
+        labels:
+          instance: 'kscloud1'
+```
+
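+Replace `YOUR_VPS_IP` with your VPS's public IP address. Because both machines are already
+on Tailscale, you can use kscloud1's Tailscale IP instead and keep port 9100 off the public internet.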
+
+**On kscloud1**, make sure node-exporter is configured to be reachable publicly:
+```yaml
+# In node-exporter's docker-compose.yml on kscloud1
+ports:
+  - "0.0.0.0:9100:9100"
+```
+
+Restart Prometheus:
+```bash
+cd ~/kitestacks-live/docker/prometheus
+docker compose restart prometheus
+```
+
+### Configure Grafana Provisioning
+
+Tell Grafana to automatically load Prometheus as a data source and load the
+Node Exporter Full dashboard:
+
+Create `~/kitestacks-live/docker/grafana/provisioning/datasources/prometheus.yml`:
+```yaml
+apiVersion: 1
+datasources:
+  - name: Prometheus
+    type: prometheus
+    uid: "000000001"
+    url: http://prometheus:9090
+    isDefault: true
+```
+
+Create `~/kitestacks-live/docker/grafana/provisioning/dashboards/dashboards.yml`:
+```yaml
+apiVersion: 1
+providers:
+  - name: default
+    folder: KiteStacks
+    type: file
+    options:
+      path: /etc/grafana/provisioning/dashboards
+```
+
+The Node Exporter Full dashboard (id 1860) can be imported from Grafana's dashboard library:
+1. Log in to grafana.yourdomain.com
+2. Left menu → Dashboards → Import
+3. Enter ID: `1860`
+4. Select your Prometheus datasource
+5. Import
+
+You should now see CPU, RAM, disk, and network graphs for both monk and kscloud1.
+Switch between them using the "instance" dropdown at the top of the dashboard.
+
+---
+
+## Monitoring Layer 2 — Uptime Kuma
+
+You set up Uptime Kuma in Step 5. Now add monitors for all your services.
+
+Log in to `status.yourdomain.com` and add an HTTP monitor for each service:
+
+| Monitor Name | URL | Check Interval |
+|-------------|-----|----------------|
+| Main Website | https://www.yourdomain.com | 60s |
+| Authentik | https://auth.yourdomain.com | 60s |
+| Forgejo | https://gitforge.yourdomain.com | 60s |
+| KiteAI | https://ai.yourdomain.com | 60s |
+| Karakeep | https://links.yourdomain.com | 60s |
+| Kavita | https://kavita.yourdomain.com | 60s |
+| Grafana | https://grafana.yourdomain.com | 60s |
+| BookStack | https://wiki.yourdomain.com | 60s |
+| OSTicket | https://tasks.yourdomain.com | 60s |
+| Portainer | https://portainer.yourdomain.com | 60s |
+| kscloud1 | (ping to kscloud1 IP) | 60s |
+| Monk | (ping to monk's Tailscale IP) | 60s |
+
+Then create a Status Page:
+1. Status Pages → New Status Page
+2. Title: "KiteStacks Status"
+3. Slug: `homelab`
+4. Add all monitors to it
+
+**Push Uptime Kuma to kscloud1:**
+
+The Conky widget on your desktop reads kscloud1's Uptime Kuma, not monk's. Push monk's
+database to kscloud1 after setting up monitors:
+
+**Ask your AI:** "How do I copy a Docker named volume's SQLite database from one machine
+to another using Python's sqlite3.backup() method?"
+
+---
+
+## Monitoring Layer 3 — Conky Desktop Widget
+
+Conky is a program that draws information on your desktop background in real time.
+Your KiteStacks widget shows whether each service on kscloud1 is up (green dot) or
+down (red dot), refreshed every 15 seconds.
+
+### Install Conky
+
+```bash
+sudo apt install conky-all
+```
+
+### Install the Widget Script
+
+The widget script reads Uptime Kuma's API and formats the output for Conky.
+It lives at `apps/conky/kitestacks-uptime-widget.sh` in the homelab repo.
+
+Copy it to your machine:
+```bash
+mkdir -p ~/.local/bin
+cp ~/kitestacks-homelab/apps/conky/kitestacks-uptime-widget.sh ~/.local/bin/
+chmod +x ~/.local/bin/kitestacks-uptime-widget.sh
+```
+
+Edit the script to use your kscloud1's Tailscale IP:
+```bash
+nano ~/.local/bin/kitestacks-uptime-widget.sh
+```
+
+Change the `KUMA_URL` line:
+```bash
+KUMA_URL="http://100.123.x.x:3001"   # kscloud1's Tailscale IP
+```
+
+### Enable the Conky Config
+
+```bash
+mkdir -p ~/.config/conky
+cp ~/kitestacks-homelab/apps/conky/kitestacks-uptime.conf ~/.config/conky/kitestacks-uptime.conf
+conky -c ~/.config/conky/kitestacks-uptime.conf -d
+```
+
+The widget should appear in the top-right corner of your desktop, showing a dot for
+each service — green for up, red for down.
+
+**Ask your AI:** "How do I make Conky start automatically when I log in to my Ubuntu desktop?"
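+
+On Ubuntu's default desktop, one option is an XDG autostart entry. This is a sketch, assuming
+the config path used above; the file name `kitestacks-conky.desktop` and the function name
+are arbitrary choices.
+
+```bash
+# Write an autostart entry so the desktop session launches Conky at login.
+install_conky_autostart() {
+  mkdir -p "$HOME/.config/autostart"
+  # $HOME is expanded now, because .desktop files do not expand ~ themselves
+  cat > "$HOME/.config/autostart/kitestacks-conky.desktop" <<EOF
+[Desktop Entry]
+Type=Application
+Name=KiteStacks Conky
+Exec=conky -c $HOME/.config/conky/kitestacks-uptime.conf -d
+EOF
+}
+```
+
+Run `install_conky_autostart` once, then log out and back in to confirm the widget appears.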
+
+---
+
+## Setting Up Alerts
+
+Uptime Kuma can send you a notification on your phone when a service goes down.
+
+**Option 1: ntfy (recommended — self-hosted)**
+If you run ntfy as a container, set up an ntfy notification in Uptime Kuma:
+- Notification Type: ntfy
+- URL: your ntfy server URL
+- Topic: choose a topic name (e.g., `homelab-alerts`)
+
+Install the ntfy app on your phone and subscribe to your topic.
+
+**Option 2: Email**
+Configure email notifications in Uptime Kuma using your email address.
+
+**Ask your AI:** "How do I configure Uptime Kuma to send notifications via ntfy?"
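+
+Before wiring ntfy into Uptime Kuma, it can help to confirm that the topic delivers to your
+phone. ntfy accepts a message as the body of an HTTP POST to the topic URL. This is a sketch;
+`send_ntfy` is a hypothetical helper, and the server URL and topic are placeholders.
+
+```bash
+# Publish a message to an ntfy topic: send_ntfy <server-url> <topic> <message>
+send_ntfy() {
+  curl -fsS -d "$3" "$1/$2"
+}
+```
+
+Usage: `send_ntfy https://ntfy.yourdomain.com homelab-alerts "Test alert"`. If the phone app
+is subscribed to the same topic, the message should arrive within a few seconds.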
+
+---
+
+## Checkpoint
+
+- [ ] Prometheus is collecting metrics from both monk and kscloud1
+- [ ] Grafana shows Node Exporter Full dashboard with both hosts
+- [ ] Uptime Kuma has monitors for all 11 services
+- [ ] Uptime Kuma status page is live at status.yourdomain.com/status/homelab
+- [ ] Uptime Kuma database has been pushed to kscloud1
+- [ ] Conky widget is showing on your desktop with live service status
+- [ ] You receive a notification when a monitored service goes down (test by stopping one container)
+
+---
+
+## Congratulations — Your Homelab Is Complete
+
+You have built a production homelab with:
+- 11 self-hosted services running in Docker
+- Single sign-on via Authentik
+- Cloud failover on a Hetzner VPS
+- Private networking over Tailscale
+- Real-time monitoring via Grafana and Uptime Kuma
+- A live desktop status widget
+
+Everything you built here maps directly to enterprise cloud engineering skills.
+Every concept has a certification that covers it in depth.
+
+**Your next step:** [certifications/roadmap.md](../../certifications/roadmap.md)
--- a/homelab-mastery/build-guide/without-ai/01-linux-foundations.md
+++ b/homelab-mastery/build-guide/without-ai/01-linux-foundations.md
@ -0,0 +1,321 @@
+# Without AI — Part 1: Linux Foundations
+
+**Track:** Advanced (No AI)  
+**Time for this section:** 1–2 weeks of evenings and weekends
+
+Before you touch Docker or any service, you need a solid foundation in Linux.
+Every command you run in this homelab is a Linux command. If you skip this,
+you will be copying without understanding — which means you cannot debug when
+things go wrong.
+
+---
+
+## Total Build Time Estimate (Without AI)
+
+Before you start, here is an honest breakdown of how long this entire homelab
+takes to build from scratch — assuming you are learning as you go, working
+2–3 hours on evenings and weekends:
+
+| Phase | What You Are Learning / Building | Estimated Time |
+|-------|----------------------------------|---------------|
+| 1 — Linux Foundations | Shell, filesystem, permissions, SSH | 1–2 weeks |
+| 2 — Bash Scripting | Variables, loops, conditionals, scripts | 1–2 weeks |
+| 3 — Python Basics | Data structures, sqlite3, HTTP requests | 1–2 weeks |
+| 4 — Docker Deep Dive | Images, volumes, networks, compose | 1–2 weeks |
+| 5 — Networking | DNS, ports, TLS, Tailscale, firewalls | 1–2 weeks |
+| 6 — Full Build | Deploying all 11 services + cloud failover | 4–8 weeks |
+| 7 — Troubleshooting | Debugging, production issues, fixes | Ongoing |
+| Documentation | Writing what you built and why | 1 week |
+
+**Total: approximately 3–6 months** working part-time (evenings + weekends).
+
+**Full-time (8 hours/day):** 6–10 weeks.
+
+The wide ranges reflect the honest reality: some people hit a DNS issue that takes
+3 hours to debug. Some services take a day to configure SSO for. Budget extra time.
+The troubleshooting you will do along the way is not wasted time — it is where most
+of the real learning happens.
+
+---
+
+## What Is Linux?
+
+Linux is an operating system — like Windows or macOS — but open source, free, and
+used to run most of the internet. Your home server, your cloud VPS, and the large
+majority of public web servers all run Linux.
+
+**Why Linux and not Windows Server?**
+- Free — no licensing cost
+- More control — no hidden processes you can't see or stop
+- Docker runs natively on Linux (on Windows, Docker runs inside a hidden Linux VM)
+- The entire cloud engineering industry is Linux-first
+
+You will use **Ubuntu 24.04 LTS**, one of the most widely used Linux distributions for servers.
+
+---
+
+## The Terminal
+
+The terminal (also called the shell or command line) is where you work. There is no
+graphical interface for most server tasks. You type a command, press Enter, read the
+output, and type the next command.
+
+Open a terminal on Ubuntu: `Ctrl + Alt + T`
+
+You will see a prompt like:
+```
+kenpat@monk:~$
+```
+
+Breaking that down:
+- `kenpat` — your username
+- `monk` — the machine name (hostname)
+- `~` — your current directory (`~` means your home directory, `/home/kenpat`)
+- `$` — indicates you are a regular user (not root/admin)
+
+---
+
+## The Filesystem
+
+Linux organizes everything in a tree of directories (folders) starting at `/` (root).
+
+```
+/
+├── home/          ← user home directories
+│   └── kenpat/    ← your home directory (~)
+├── etc/           ← system configuration files
+├── var/           ← variable data (logs, databases)
+├── usr/           ← installed programs
+├── tmp/           ← temporary files (cleared on reboot)
+├── opt/           ← optional software (we use this for kscloud1)
+└── proc/          ← virtual filesystem — represents running processes
+```
+
+**Key commands:**
+
+```bash
+pwd                    # Print Working Directory — where am I right now?
+ls                     # List files in current directory
+ls -la                 # List all files, including hidden ones, with permissions
+cd /home/kenpat        # Change Directory — move to a specific path
+cd ~                   # Go to your home directory
+cd ..                  # Go up one level
+mkdir mydir            # Make a new directory
+mkdir -p a/b/c         # Make directories including parents (-p = parents)
+rm file.txt            # Remove a file
+rm -rf mydir/          # Remove a directory and everything inside it (-r = recursive, -f = force)
+cp file.txt backup.txt # Copy a file
+mv file.txt newname.txt # Move or rename a file
+cat file.txt           # Print the contents of a file
+less file.txt          # View a file page by page (q to quit)
+nano file.txt          # Open a file in the nano text editor
+```
+
+**Practice:** Run these commands. Navigate around the filesystem. Understand what you see.
+
+```bash
+pwd                    # Where are you?
+ls /                   # What is in the root directory?
+ls /home               # What home directories exist?
+ls -la ~               # What files are in YOUR home directory? (hidden files too)
+cd /var/log            # Go to the log directory
+ls                     # What log files exist?
+cat /etc/hostname      # What is this machine's hostname?
+cd ~                   # Go back home
+```
+
+---
+
+## File Permissions
+
+Every file in Linux has permissions that control who can read it, write to it, or
+execute it. This is crucial — misconfigured permissions are a common source of bugs.
+
+```
+-rw-r--r-- 1 kenpat kenpat 1234 Jun 19 10:00 myfile.txt
+```
+
+Breaking it down:
+- `-` — file type (`d` = directory, `-` = regular file, `l` = symlink)
+- `rw-` — owner permissions: read, write, no execute
+- `r--` — group permissions: read only
+- `r--` — everyone else: read only
+- `kenpat kenpat` — owner and group
+
+```bash
+chmod 644 myfile.txt   # rw-r--r-- (owner read/write, others read)
+chmod 755 myscript.sh  # rwxr-xr-x (owner full, others read+execute)
+chmod +x myscript.sh   # Add execute permission for everyone
+chown kenpat:kenpat file.txt  # Change owner to kenpat, group to kenpat
+chown -R 1000:1000 /mydir/    # Change owner recursively for entire directory
+```
+
+**Why this matters in Docker:** Docker containers run as specific user IDs.
+If a container expects to own a file (e.g., UID 1000) but the file is owned by
+root, the container cannot write to it. Many Docker setup issues come down to
+file permission mistakes.
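+
+To connect the octal numbers to what `ls -la` prints, and to see which numeric UID owns a
+file, `stat` can print all three at once. This is a small sketch using GNU `stat` (the
+version on Ubuntu); `show_mode` is a hypothetical helper name.
+
+```bash
+# Print a file's octal mode, symbolic mode, and owner UID.
+show_mode() {
+  stat -c '%a %A %u' "$1"
+}
+```
+
+After `chmod 644 myfile.txt`, `show_mode myfile.txt` prints `644 -rw-r--r--` followed by
+the owner's UID, which you can compare with the UID a container runs as.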
+
+---
+
+## Users and sudo
+
+Linux separates regular users from the administrator (called `root`).
+Root can do anything — delete system files, stop critical services, change any setting.
+Regular users cannot.
+
+`sudo` lets a trusted user run a single command as root:
+
+```bash
+sudo apt update           # Run apt update as root
+sudo systemctl restart docker   # Restart Docker as root
+sudo nano /etc/hosts      # Edit a system file as root
+```
+
+**Non-interactive sudo** (used in scripts when there is no terminal to type a password):
+```bash
+echo mypassword | sudo -S apt update
+# -S reads the password from stdin (standard input).
+# Note: the password ends up in your shell history, so avoid this on shared machines.
+```
+
+**Become root entirely** (use carefully):
+```bash
+sudo -i    # Opens a root shell. Prompt changes from $ to #
+exit       # Return to regular user
+```
+
+---
+
+## SSH — Connecting to Remote Machines
+
+SSH (Secure Shell) lets you control a remote machine over an encrypted connection.
+
+```bash
+ssh kenpat@192.168.1.100          # Connect to a local machine
+ssh root@5.78.x.x                 # Connect to your VPS as root
+ssh -i ~/.ssh/mykey kenpat@host   # Connect using a specific private key
+ssh -L 5099:localhost:5000 kenpat@host  # Local port forward
+```
+
+### SSH Keys (Better Than Passwords)
+
+Instead of typing a password every time, you generate a key pair:
+- **Private key** (`~/.ssh/id_ed25519`) — stays on your machine, never shared
+- **Public key** (`~/.ssh/id_ed25519.pub`) — put this on the server
+
+```bash
+# Generate a new key pair
+ssh-keygen -t ed25519 -C "monk-to-kscloud1" -f ~/.ssh/id_ed25519_kscloud1
+
+# Copy your public key to the server
+ssh-copy-id -i ~/.ssh/id_ed25519_kscloud1.pub kenpat@your-vps-ip
+
+# Connect using the key
+ssh -i ~/.ssh/id_ed25519_kscloud1 kenpat@your-vps-ip
+```
+
+### SSH Local Port Forwarding
+
+Sometimes a service is running on a remote machine but not exposed publicly.
+You can forward a local port to a remote port through the SSH connection:
+
+```bash
+ssh -L 5099:localhost:5000 kenpat@kscloud1-tailscale-ip
+```
+
+This means: "On MY machine, port 5099 forwards to kscloud1's localhost:5000."
+Now visiting `http://localhost:5099` in your browser reaches kscloud1's port 5000.
+
+Used in this homelab to access kscloud1's Kavita directly (bypassing Cloudflare)
+when configuring OIDC settings.
+
+---
+
+## Package Management (apt)
+
+Ubuntu uses `apt` to install, update, and remove software:
+
+```bash
+sudo apt update              # Refresh the list of available packages
+sudo apt upgrade -y          # Install all available updates
+sudo apt install -y curl git # Install specific packages
+sudo apt remove package      # Remove a package
+sudo apt search keyword      # Search for a package by name
+dpkg -l | grep docker        # List installed packages matching "docker"
+```
+
+---
+
+## Processes and Services
+
+```bash
+ps aux                        # List all running processes
+ps aux | grep docker          # Find processes matching "docker"
+top                           # Live process monitor (q to quit)
+htop                          # Better live monitor (install with: sudo apt install htop)
+kill 1234                     # Ask process ID 1234 to exit (sends SIGTERM)
+kill -9 1234                  # Force kill (SIGKILL; the process cannot catch or ignore it)
+pkill conky                   # Signal every process whose name matches "conky"
+
+systemctl status docker       # Check if Docker service is running
+systemctl start docker        # Start it
+systemctl stop docker         # Stop it
+systemctl restart docker      # Restart it
+systemctl enable docker       # Make it start automatically on boot
+systemctl disable docker      # Prevent it from starting on boot
+```
+
+---
+
+## Reading Logs
+
+When something breaks, you read the logs to find out why:
+
+```bash
+journalctl -u docker          # System logs for the Docker service
+journalctl -f                 # Follow all system logs live
+cat /var/log/syslog           # System log file
+tail -f /var/log/syslog       # Follow (live tail) the system log
+sudo dmesg | tail -20         # Kernel messages, last 20 lines (Ubuntu restricts dmesg to root)
+```
+
+---
+
+## Essential Tools
+
+```bash
+curl -s https://example.com           # Make an HTTP GET request
+curl -s https://example.com | head    # Pipe output through head (first 10 lines)
+wget https://example.com/file.zip     # Download a file
+grep "error" /var/log/syslog          # Search a file for a pattern
+grep -r "TUNNEL_TOKEN" ~/kitestacks-live/  # Search recursively in a directory
+find ~ -name "*.env" 2>/dev/null      # Find all .env files in home dir
+find /opt -name "docker-compose.yml"  # Find all compose files
+wc -l file.txt                        # Count lines in a file
+cut -d= -f2 file.env                  # Cut: split by = and take field 2
+tr -d '\n'                            # Remove newlines from input
+|                                     # Pipe: send output of one command to another
+>                                     # Redirect: write output to a file (overwrites)
+>>                                    # Redirect: append output to a file
+2>/dev/null                           # Redirect error output to /dev/null (discard errors)
+```
+
+---
+
+## Practice Exercises
+
+Do these before moving on:
+
+1. Navigate to `/var/log` and read the last 20 lines of `syslog`
+2. Create a directory structure: `~/practice/a/b/c/`
+3. Create a file in `c/` with your name in it using `echo "your name" > ~/practice/a/b/c/name.txt`
+4. Read it with `cat`
+5. Check its permissions with `ls -la`
+6. Change its permissions to read-only: `chmod 444 ~/practice/a/b/c/name.txt`
+7. Try to edit it — what happens?
+8. Find all `.conf` files in `/etc/` that contain the word "ubuntu"
+9. Generate an SSH key pair with `ssh-keygen`
+10. SSH into your VPS
+
+---
+
+**Next:** [Part 2 — Bash Scripting](02-bash-scripting.md)
--- a/homelab-mastery/build-guide/without-ai/02-bash-scripting.md
+++ b/homelab-mastery/build-guide/without-ai/02-bash-scripting.md
@ -0,0 +1,333 @@
+# Without AI — Part 2: Bash Scripting
+
+**Track:** Advanced (No AI)  
+**Time for this section:** 1–2 weeks
+
+Bash is the language of the Linux shell. Almost every automation script in this
+homelab is a Bash script. You do not need to master it — you need to be able to
+read it, write simple scripts, and understand what a script does before you run it.
+
+---
+
+## What Is a Script?
+
+A script is a text file containing a sequence of shell commands. Instead of typing
+commands one by one, you put them in a file and run the file.
+
+```bash
+#!/usr/bin/env bash
+# This is a comment
+
+echo "Hello from my script"
+```
+
+The first line (`#!/usr/bin/env bash`) is called the **shebang**. It tells Linux
+which interpreter to use to run this file. Without it, Linux may use the wrong shell.
+
+To run a script:
+```bash
+chmod +x myscript.sh    # Make it executable
+./myscript.sh           # Run it
+```
+
+Or without making it executable:
+```bash
+bash myscript.sh
+```
+
+---
+
+## Variables
+
+Variables store values you want to reuse:
+
+```bash
+name="kenpat"
+port=3000
+greeting="Hello, $name"
+
+echo $name          # prints: kenpat
+echo $port          # prints: 3000
+echo $greeting      # prints: Hello, kenpat
+echo "${name}s"     # prints: kenpats (braces needed when appending)
+```
+
+**Special variables:**
+```bash
+$0        # The script's own filename
+$1 $2 $3  # Command-line arguments (first, second, third)
+$#        # Number of arguments passed
+$?        # Exit code of the last command (0 = success, non-zero = error)
+$$        # Current process ID (PID)
+$HOME     # Your home directory path
+$USER     # Your username
+```
+
+**Environment variables:**
+```bash
+export MY_VAR="value"    # Make available to child processes
+printenv                 # List all environment variables
+printenv MY_VAR          # Print one variable
+```
+
+---
+
+## Conditionals (if/else)
+
+```bash
+if [[ condition ]]; then
+    # commands if true
+elif [[ other_condition ]]; then
+    # commands if second condition is true
+else
+    # commands if nothing was true
+fi
+```
+
+**Common conditions:**
+```bash
+[[ -f /path/to/file ]]     # True if file exists and is a regular file
+[[ -d /path/to/dir ]]      # True if directory exists
+[[ -s /path/to/file ]]     # True if file exists and is non-empty
+[[ -z "$var" ]]            # True if variable is empty
+[[ -n "$var" ]]            # True if variable is NOT empty
+[[ "$a" == "$b" ]]         # True if strings are equal
+[[ "$a" != "$b" ]]         # True if strings are NOT equal
+[[ $n -eq 5 ]]             # True if number equals 5
+[[ $n -gt 5 ]]             # True if number is greater than 5
+[[ $n -lt 5 ]]             # True if number is less than 5
+```
+
+**Real example from the homelab:**
+```bash
+if [[ $# -ne 1 ]]; then
+    echo "Usage: $0 '<cloudflare_tunnel_token>'" >&2
+    exit 2
+fi
+```
+
+This checks that exactly one argument was provided (`$# -ne 1` means "number of args
+is not equal to 1"). If not, it prints usage instructions and exits with code 2 (error).
+The `>&2` sends the message to stderr (error output) instead of stdout (normal output).
+
+---
+
+## Loops
+
+**For loop — iterate over a list:**
+```bash
+for item in one two three; do
+    echo "Item: $item"
+done
+
+# Iterate over files
+for file in *.yml; do
+    echo "Found compose file: $file"
+done
+
+# Iterate over a range of numbers
+for i in {1..10}; do
+    echo "Number: $i"
+done
+```
+
+**While loop — repeat while a condition is true:**
+```bash
+count=0
+while [[ $count -lt 5 ]]; do
+    echo "Count: $count"
+    count=$(( count + 1 ))
+done
+
+# Wait until a container is healthy
+while [[ "$(docker inspect --format '{{.State.Health.Status}}' authentik)" != "healthy" ]]; do
+    echo "Waiting for authentik..."
+    sleep 5
+done
+echo "Authentik is healthy"
+```
+
+---
+
+## Functions
+
+```bash
+greet() {
+    local name="$1"    # local = only exists inside this function
+    echo "Hello, $name"
+}
+
+greet "kenpat"   # prints: Hello, kenpat
+greet "world"    # prints: Hello, world
+```
+
+**Why local variables matter:** Without `local`, variables are global and can
+accidentally overwrite values from other parts of the script.
+
+---
+
+## Error Handling
+
+```bash
+set -euo pipefail
+```
+
+Put this near the top of every script you write. It sets three behaviors:
+- `-e` — exit immediately if any command fails (returns non-zero exit code)
+- `-u` — exit if you use an undefined variable
+- `-o pipefail` — if any command in a pipeline fails, the whole pipeline fails
+
+Without this, a script can silently continue after an error, potentially causing
+damage downstream (like deleting data after a failed backup).
+
+**Checking a command's result:**
+```bash
+if curl -fs https://example.com > /dev/null; then   # -f: HTTP errors (4xx/5xx) count as failure
+    echo "Site is up"
+else
+    echo "Site is down"
+fi
+```
+
+**Exit codes:**
+```bash
+exit 0    # Success
+exit 1    # Generic error
+exit 2    # Misuse (bad arguments)
+```
+
+---
+
+## String Manipulation
+
+```bash
+var="TUNNEL_TOKEN=abc123"
+
+# Split by delimiter, take field 2
+echo "$var" | cut -d= -f2        # prints: abc123
+
+# If the value itself contains = signs, -f2 would cut it short; -f2- keeps the rest
+echo "TUNNEL_TOKEN=abc=123=" | cut -d= -f2-   # prints: abc=123= (f2- = from field 2 to end)
+
+# Remove trailing newline
+echo "hello" | tr -d '\n'
+
+# Convert to lowercase
+echo "HELLO" | tr '[:upper:]' '[:lower:]'
+
+# Replace text
+echo "hello world" | sed 's/world/there/'   # prints: hello there
+echo "aabbcc" | sed 's/b/B/g'               # prints: aaBBcc (g = all occurrences)
+
+# Extract with grep
+echo "addr: 192.168.1.1" | grep -oP '\d+\.\d+\.\d+\.\d+'  # prints: 192.168.1.1
+```
+
+---
+
+## Here Documents (heredoc)
+
+A heredoc lets you write multi-line strings inline:
+
+```bash
+cat <<'EOF'
+This is line one
+This is line two
+Variables like $HOME are NOT expanded (because of the quotes around EOF)
+EOF
+
+cat <<EOF
+This is line one
+HOME is: $HOME   (expanded because no quotes)
+EOF
+```
+
+Used in this homelab to write multi-line content to files:
+```bash
+cat > /tmp/fix.sql <<'EOF'
+BEGIN;
+UPDATE ServerSetting SET Value='{"enabled":true}' WHERE "Key"=40;
+COMMIT;
+EOF
+```
+
+---
+
+## Real Scripts in This Homelab
+
+### The Token Rotation Script
+
+`~/kitestacks-homelab/scripts/rollout-cloudflared-token.sh`:
+```bash
+#!/usr/bin/env bash
+set -euo pipefail
+
+if [[ $# -ne 1 ]]; then
+  echo "Usage: $0 '<cloudflare_tunnel_token>'" >&2
+  exit 2
+fi
+
+token="$1"
+monk_dir="${MONK_CLOUDFLARED_DIR:-$HOME/kitestacks-live/docker/cloudflared}"
+kscloud1_host="${KSCLOUD1_HOST:?set KSCLOUD1_HOST, for example user@host}"
+kscloud1_key="${KSCLOUD1_KEY:-$HOME/.ssh/id_ed25519_kscloud1}"
+kscloud1_dir="${KSCLOUD1_CLOUDFLARED_DIR:-/opt/kitestacks/docker/cloudflared}"
+```
+
+Walking through each line:
+- `set -euo pipefail` — fail fast and safely
+- `$# -ne 1` — check exactly one argument was given
+- `${MONK_CLOUDFLARED_DIR:-$HOME/...}` — use the environment variable if it is set and non-empty, otherwise use the default
+- `${KSCLOUD1_HOST:?...}` — if `KSCLOUD1_HOST` is unset or empty, print that message and exit with an error
+
+This is a real production script. Read it in full at that path.
+
+---
+
+## Writing Your Own Scripts
+
+**Template for any script:**
+```bash
+#!/usr/bin/env bash
+set -euo pipefail
+
+# --- Configuration (change these) ---
+MY_VAR="${MY_ENV_VAR:-default_value}"
+TARGET_DIR="${1:?Usage: $0 <directory>}"
+
+# --- Functions ---
+log() {
+    echo "[$(date '+%H:%M:%S')] $*"
+}
+
+die() {
+    echo "ERROR: $*" >&2
+    exit 1
+}
+
+# --- Main ---
+log "Starting..."
+
+if [[ ! -d "$TARGET_DIR" ]]; then
+    die "Directory does not exist: $TARGET_DIR"
+fi
+
+log "Done."
+```
+
+---
+
+## Practice Exercises
+
+1. Write a script that checks if Docker is running and prints "Docker is up" or "Docker is down"
+2. Write a script that takes a service name as an argument and shows its logs:
+   `./show-logs.sh forgejo`
+3. Write a script that loops through all directories in `~/kitestacks-live/docker/`
+   and prints the service name and whether it has a `.env` file
+4. Write a script that checks if a URL returns 200 OK and prints "UP" or "DOWN":
+   `./check-url.sh https://gitforge.kitestacks.com`
+5. Read and understand every line of `scripts/rollout-cloudflared-token.sh`
+
+---
+
+**Next:** [Part 3 — Python Basics](03-python-basics.md)
--- a/homelab-mastery/build-guide/without-ai/03-python-basics.md
+++ b/homelab-mastery/build-guide/without-ai/03-python-basics.md
@ -0,0 +1,347 @@
+# Without AI — Part 3: Python Basics
+
+**Track:** Advanced (No AI)  
+**Time for this section:** 1–2 weeks
+
+Python is used in this homelab for:
+1. **Database operations** — copying SQLite databases safely between machines
+2. **HTTP requests** — hitting APIs to configure services
+3. **The metrics API** — the Python FastAPI service that feeds live stats to the portal
+4. **One-off automation** — scripts that are too complex for Bash
+
+You do not need to be a Python developer. You need to read Python code, understand
+what it does, modify it for your situation, and write simple scripts.
+
+---
+
+## Installing Python
+
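+Ubuntu 24.04 comes with Python 3 already installed:
+```bash
+python3 --version    # Should show 3.12.x or similar
+pip3 --version       # Package manager for Python (if missing: sudo apt install -y python3-pip python3-venv)
+```
+
+Ubuntu 24.04 marks the system Python as externally managed, so a bare `pip3 install`
+is refused. Install the packages used in this homelab into a virtual environment
+instead (`~/.venvs/homelab` is only an example path):
+```bash
+python3 -m venv ~/.venvs/homelab && source ~/.venvs/homelab/bin/activate
+pip install requests fastapi uvicorn psutil
+```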
+
+---
+
+## Python Syntax Basics
+
+Python uses indentation (spaces) to define blocks of code instead of `{}` like many
+other languages. This is critical — wrong indentation causes errors.
+
+```python
+# This is a comment
+
+name = "kenpat"              # string
+port = 3000                  # integer
+price = 4.99                 # float
+is_running = True            # boolean
+
+print(name)                  # prints: kenpat
+print(f"Port is {port}")     # f-string: prints: Port is 3000
+print(f"{name!r}")           # repr: prints: 'kenpat' (with quotes)
+```
+
+---
+
+## Data Structures
+
+```python
+# List (ordered, mutable)
+services = ["forgejo", "grafana", "authentik"]
+services.append("portainer")         # add to end
+services[0]                          # "forgejo" (zero-indexed)
+services[-1]                         # "portainer" (last item)
+len(services)                        # 4
+
+for service in services:
+    print(service)
+
+# Dictionary (key-value pairs, like JSON)
+monitor = {
+    "name": "Forgejo",
+    "url": "https://gitforge.kitestacks.com",
+    "id": 16,
+    "active": True
+}
+
+monitor["name"]                      # "Forgejo"
+monitor.get("missing", "default")    # "default" (safe get with fallback)
+monitor.keys()                       # dict_keys(['name', 'url', 'id', 'active'])
+
+for key, value in monitor.items():
+    print(f"{key}: {value}")
+
+# List of dicts (very common in API responses)
+monitors = [
+    {"id": 16, "name": "Forgejo"},
+    {"id": 17, "name": "Grafana"},
+]
+for m in monitors:
+    print(m["id"], m["name"])
+```
+
+---
+
+## Functions and Conditionals
+
+```python
+def check_service(name, url):
+    """Check if a service URL is reachable."""
+    if not url.startswith("https://"):
+        return False
+    print(f"Checking {name} at {url}")
+    return True
+
+result = check_service("Grafana", "https://grafana.kitestacks.com")
+print(result)   # True
+```
+
+**Conditionals:**
+```python
+status = 200
+
+if status == 200:
+    print("OK")
+elif status in (301, 302):
+    print("Redirect")
+elif status >= 500:
+    print("Server error")
+else:
+    print(f"Unexpected status: {status}")
+```
+
+---
+
+## Working with JSON
+
+Almost every API in this homelab sends and receives JSON (JavaScript Object Notation).
+Python's `json` module converts between JSON strings and Python dicts/lists:
+
+```python
+import json
+
+# JSON string to Python dict
+data = json.loads('{"name": "Forgejo", "id": 16}')
+print(data["name"])   # Forgejo
+
+# Python dict to JSON string
+obj = {"monitors": [1, 2, 3]}
+json_str = json.dumps(obj, indent=2)
+print(json_str)
+# {
+#   "monitors": [1, 2, 3]
+# }
+
+# Read JSON from a file
+with open("/tmp/kuma.meta.json") as f:
+    kuma_data = json.load(f)
+
+# Parse Uptime Kuma heartbeat data
+for monitor_id, heartbeats in kuma_data.get("heartbeatList", {}).items():
+    if heartbeats:
+        last = heartbeats[-1]
+        status = "UP" if last["status"] == 1 else "DOWN"
+        print(f"Monitor {monitor_id}: {status}")
+```
+
+---
+
+## HTTP Requests with `requests`
+
+The `requests` library makes HTTP calls easy:
+
+```python
+import requests
+
+# GET request
+response = requests.get("https://gitforge.kitestacks.com/api/v1/repos/search",
+                        headers={"Authorization": "token your-api-token"},
+                        timeout=5)
+
+print(response.status_code)   # 200
+data = response.json()        # Parse JSON response body
+print(data["data"][0]["name"])  # First repo name
+
+# POST request with JSON body
+response = requests.post(
+    "https://auth.kitestacks.com/api/v3/core/tokens/",
+    headers={"Authorization": "Bearer your-admin-token"},
+    json={"identifier": "my-token", "user": "kenpat"},
+    timeout=5
+)
+
+if response.ok:    # True for 2xx status codes
+    print("Token created:", response.json()["key"])
+else:
+    print(f"Failed: {response.status_code} {response.text}")
+```
+
+---
+
+## SQLite — The Key Database Skill in This Homelab
+
+SQLite is a database that lives in a single file. Uptime Kuma, Kavita, and other services
+use SQLite. This homelab uses Python's `sqlite3` module to copy those databases safely between machines.
+
+```python
+import sqlite3
+
+# Connect to a database file
+conn = sqlite3.connect("/path/to/kuma.db")
+
+# Run a query
+cursor = conn.execute("SELECT id, name, url FROM monitor ORDER BY id")
+rows = cursor.fetchall()     # Get all results
+for row in rows:
+    print(row[0], row[1], row[2])
+
+# Insert data
+conn.execute(
+    "INSERT INTO monitor (name, type, url, active) VALUES (?, ?, ?, ?)",
+    ("BookStack", "http", "https://wiki.kitestacks.com", 1)
+)
+conn.commit()    # Save changes (without commit, nothing is written)
+
+# Use a transaction explicitly (safer for multiple changes)
+conn.execute("BEGIN")
+conn.execute("UPDATE monitor SET active=1 WHERE id=26")
+conn.execute("UPDATE monitor SET active=1 WHERE id=27")
+conn.execute("COMMIT")
+
+conn.close()
+```
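+
+The explicit `BEGIN`/`COMMIT` pair above does not undo anything by itself if a statement
+in the middle fails. A minimal sketch of one alternative, assuming a throwaway in-memory
+database and a simplified `monitor` table (not Uptime Kuma's real schema): using the
+connection as a context manager commits on success and rolls back if an exception escapes.
+
+```python
+import sqlite3
+
+conn = sqlite3.connect(":memory:")   # throwaway database, nothing touches disk
+conn.execute("CREATE TABLE monitor (id INTEGER PRIMARY KEY, name TEXT, active INTEGER)")
+conn.executemany(
+    "INSERT INTO monitor (id, name, active) VALUES (?, ?, ?)",
+    [(26, "BookStack", 0), (27, "Kavita", 0)],   # hypothetical rows for the demo
+)
+conn.commit()
+
+try:
+    with conn:   # commits if the block succeeds, rolls back if it raises
+        conn.execute("UPDATE monitor SET active=1 WHERE id=26")
+        raise RuntimeError("simulated failure before the second update")
+except RuntimeError:
+    pass   # the first UPDATE has been rolled back
+
+active = [row[0] for row in conn.execute("SELECT active FROM monitor ORDER BY id")]
+print(active)   # [0, 0]
+conn.close()
+```
+
+Both rows stay inactive because the failed block was rolled back as a unit.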
+
+### The `backup()` Method — Copying Databases Safely
+
+SQLite databases in WAL mode (write-ahead log) cannot be copied safely with a plain file
+copy while they are in use. The `Connection.backup()` method creates a consistent snapshot:
+
+```python
+import sqlite3
+
+def safe_backup(source_path, dest_path):
+    """Copy a SQLite database safely, even if it's in use."""
+    src = sqlite3.connect(source_path)
+    dst = sqlite3.connect(dest_path)
+    src.backup(dst)       # Creates a consistent copy
+    dst.close()
+    src.close()
+    print(f"Backed up {source_path} to {dest_path}")
+
+safe_backup("/src/kuma.db", "/out/kuma.db.backup")
+```
+
+**Why a plain `cp` can fail:** SQLite in WAL mode has two extra files:
+`kuma.db-wal` (recent writes not yet merged into the main file) and `kuma.db-shm`
+(a shared-memory index for the WAL). If you copy the main file without the WAL, or while
+a write is in progress, the copy can be missing recent data or be corrupted.
+`Connection.backup()` handles all of this correctly.
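+
+It is worth checking a backup rather than trusting it. The sketch below is one way to do
+that, under stated assumptions: it builds a small WAL-mode database in a temporary
+directory (the `live.db` and `copy.db` names and the one-column `monitor` table are made
+up for the demo), backs it up while the original connection is still open, and runs
+SQLite's `PRAGMA integrity_check` on the copy.
+
+```python
+import os
+import sqlite3
+import tempfile
+
+def backup_and_verify(source_path, dest_path):
+    """Back up source_path to dest_path and return the copy's integrity_check result."""
+    src = sqlite3.connect(source_path)
+    dst = sqlite3.connect(dest_path)
+    try:
+        src.backup(dst)   # consistent snapshot, includes data still sitting in the WAL
+        return dst.execute("PRAGMA integrity_check").fetchone()[0]
+    finally:
+        dst.close()
+        src.close()
+
+# --- Demo with a throwaway WAL-mode database ---
+workdir = tempfile.mkdtemp()
+live_path = os.path.join(workdir, "live.db")
+copy_path = os.path.join(workdir, "copy.db")
+
+live = sqlite3.connect(live_path)
+live.execute("PRAGMA journal_mode=WAL")
+live.execute("CREATE TABLE monitor (id INTEGER PRIMARY KEY, name TEXT)")
+live.execute("INSERT INTO monitor (name) VALUES ('Forgejo')")
+live.commit()
+
+# The "service" connection (live) is still open while the backup runs
+result = backup_and_verify(live_path, copy_path)
+
+check = sqlite3.connect(copy_path)
+copied_rows = check.execute("SELECT COUNT(*) FROM monitor").fetchone()[0]
+check.close()
+live.close()
+
+print(result, copied_rows)   # ok 1
+```
+
+An `ok` result plus the expected row count is a reasonable sanity check before the copy
+is moved to another machine.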
+
+---
+
+## Writing a Simple FastAPI Service
+
+The kitestacks-metrics-api is a Python FastAPI service. Understanding it helps you
+modify or extend it:
+
+```python
+from fastapi import FastAPI
+import psutil
+
+app = FastAPI()
+
+@app.get("/api/health")
+def health():
+    return {"ok": True}
+
+@app.get("/api/metrics")
+def metrics():
+    return {
+        "cpu_percent": psutil.cpu_percent(interval=1),
+        "ram_percent": psutil.virtual_memory().percent,
+        "ram_used_gb": psutil.virtual_memory().used / 1e9,
+        "disk_percent": psutil.disk_usage("/").percent,
+    }
+```
+
+Run it (assuming the code is saved as `myapi.py`):
+```bash
+uvicorn myapi:app --host 0.0.0.0 --port 8000
+```
+
+On Linux, `psutil` reads these values from the `/proc` filesystem. CPU and memory figures
+there are system-wide even inside a container; running with `pid: host` also exposes the
+HOST's process list.
+
+---
+
+## Environment Variables in Python
+
+```python
+import os
+
+token = os.environ.get("FORGEJO_TOKEN")           # None if not set
+token = os.environ.get("FORGEJO_TOKEN", "")       # Empty string if not set
+token = os.environ["FORGEJO_TOKEN"]               # KeyError if not set (explicit)
+
+# Check and fail clearly
+token = os.environ.get("FORGEJO_TOKEN")
+if not token:
+    raise ValueError("FORGEJO_TOKEN environment variable is required")
+```
+
+---
+
+## File Operations
+
+```python
+import os
+
+# Read a file
+with open("/tmp/kuma.json") as f:
+    content = f.read()
+
+# Write a file
+with open("/tmp/output.sql", "w") as f:
+    f.write("UPDATE ServerSetting SET Value='test' WHERE \"Key\"=40;\n")
+
+# Check if a file exists
+if os.path.exists("/data/kuma.db"):
+    print("Database found")
+
+# Delete a file safely
+for fname in ["/data/kuma.db-shm", "/data/kuma.db-wal"]:
+    if os.path.exists(fname):
+        os.remove(fname)
+        print(f"Removed {fname}")
+
+# List files in a directory
+for filename in os.listdir("/app/data"):
+    print(filename)
+```
+
+---
+
+## Practice Exercises
+
+1. Write a Python script that reads `monitors.json` from Uptime Kuma's API response
+   and prints each monitor's name and status
+
+2. Write a script that connects to a SQLite database, lists all tables, and prints
+   the first 5 rows of the `monitor` table
+
+3. Write a script that uses `requests` to check if all 11 KiteStacks URLs return
+   a status code between 200 and 399, and prints a summary
+
+4. Read the kitestacks-metrics-api source code and understand what each endpoint does
+
+5. Modify the `safe_backup()` function to also delete `-shm` and `-wal` files from
+   the destination before writing (prevents WAL conflicts after restore)
+
+---
+
+**Next:** [Part 4 — Docker Deep Dive](04-docker-deep-dive.md)
--- a/homelab-mastery/build-guide/without-ai/04-docker-deep-dive.md
+++ b/homelab-mastery/build-guide/without-ai/04-docker-deep-dive.md
@ -0,0 +1,303 @@
+# Without AI — Part 4: Docker Deep Dive
+
+**Track:** Advanced (No AI)  
+**Time for this section:** 1–2 weeks
+
+Docker is the technology that runs every service in this homelab. Understanding it
+deeply — not just copying compose files — is what separates someone who can maintain
+and troubleshoot a homelab from someone who hopes nothing breaks.
+
+---
+
+## What Docker Actually Is
+
+Most explanations say "containers are like lightweight VMs." That is wrong and leads
+to confusion. Here is what a container actually is:
+
+**A container is a Linux process with isolation applied.**
+
+Two Linux kernel features provide that isolation:
+
+**Namespaces** — the container gets its own view of:
+- Filesystem (it sees `/` but it is a different tree than the host's `/`)
+- Network interfaces (its own `eth0`, its own IP on the Docker network)
+- Process list (it can only see its own processes, not the host's)
+- User IDs (with user-namespace remapping enabled, it can be "root" inside without being root on the host; Docker leaves this off by default)
+
+**cgroups (control groups)** — limits how much of the host's resources the container can use:
+- CPU cores and usage limits
+- RAM limits
+- Disk I/O limits
+- Network bandwidth limits
+
+**Result:** No second kernel, no hardware emulation, no hypervisor. The nginx process
+in your `homepage` container is a regular Linux process on your machine — it just
+thinks it is alone.
+
+---
+
+## Images vs Containers
+
+```
+Image                        Container
+─────────────────────────    ─────────────────────────────────────────
+A recipe                     A running instance made from the recipe
+Read-only, immutable         Has a writable layer on top of the image
+Stored in layers             One writable layer per container
+Shared across containers     Separate per container
+Survives container deletion  Deleted with the container (unless volume)
+```
+
+**Layers:** Docker images are built in layers. Each filesystem-changing instruction in a
+`Dockerfile` (`RUN`, `COPY`, `ADD`) creates a layer. When an image is updated, only the layers
+that changed are re-downloaded. This is why pulling an update is fast: most layers are already local.
+
+```bash
+docker image ls                         # List local images
+docker image inspect nginx:alpine       # See image metadata and layers
+docker image history nginx:alpine       # See how the image was built, layer by layer
+docker image pull postgres:16-alpine    # Download an image explicitly
+docker image rm nginx:alpine            # Remove a local image
+```
+
+---
+
+## Docker Networks — In Depth
+
+Docker provides several networking modes:
+
+**bridge (default):** Container gets its own virtual network interface with a private IP
+(172.x.x.x range). Containers on the same bridge network can reach each other by IP, and on
+user-defined bridge networks also by name (via Docker's built-in DNS; the default `bridge`
+network has no name resolution). Containers on different bridge networks are isolated.
+
+**host:** Container shares the host's network namespace entirely. `--network host` means
+no isolation — the container sees all host network interfaces and binds directly to
+host ports. Used for kitestacks-metrics-api so psutil can see real network stats.
+
+**none:** No networking at all. Rarely used.
+
+```bash
+# Create a named bridge network
+docker network create kitestacks
+
+# See all networks
+docker network ls
+
+# Inspect a network — see which containers are connected and their IPs
+docker network inspect kitestacks
+
+# Connect a running container to a network
+docker network connect kitestacks my-container
+
+# Disconnect
+docker network disconnect kitestacks my-container
+```
+
+**The DNS trick:** When two containers are on the same user-defined bridge network, Docker
+provides a DNS server at `127.0.0.11` inside each container. Container names resolve to their
+internal IPs. This is why `cloudflared` can connect to `http://grafana:3000`:
+Docker DNS resolves `grafana` to the grafana container's IP.
+
+```bash
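+# Verify DNS works from inside a container (the image must include these tools)
+docker exec cloudflared nslookup grafana
+docker exec cloudflared curl -s http://grafana:3000/api/health
+# If the image is too minimal, use a throwaway container on the same network
+docker run --rm --network kitestacks alpine nslookup grafana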
+```
+
+---
+
+## Volumes — Persisting Data
+
+Containers are ephemeral. When you delete a container, its writable layer is gone.
+To keep data, you use volumes.
+
+**Bind mount:** You choose the path on the host.
+```yaml
+volumes:
+  - ./data:/forgejo-data           # host path : container path
+  - /home/kenpat/books:/books:ro   # :ro = read-only
+```
+Data is at `./data` on the host. You can navigate there with `cd`. You can back it up.
+
+**Named volume:** Docker manages the path.
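+```yaml
+volumes:                    # inside the service definition
+  - uptime-kuma:/app/data
+
+volumes:                    # top level of the compose file
+  uptime-kuma:              # define the named volume
+```
+Data is at `/var/lib/docker/volumes/uptime-kuma/_data/` on the host (Docker manages this).
+Compose normally prefixes the volume name with the project name, so check `docker volume ls`
+for the exact name.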
+
+```bash
+docker volume ls                            # List named volumes
+docker volume inspect uptime-kuma           # See where it is stored
+docker volume rm uptime-kuma                # Delete a volume (and its data!)
+```
+
+**Access a named volume from a one-off container:**
+```bash
+docker run --rm -v uptime-kuma:/data alpine ls /data
+```
+
+This is the pattern used throughout this homelab to read or modify volumes without
+stopping the running service (for reads) or after stopping it (for writes).
+
+---
+
+## Docker Compose — The Full Picture
+
+Docker Compose reads a YAML file and manages the lifecycle of multiple containers.
+
+```yaml
+services:
+  forgejo:
+    image: codeberg.org/forgejo/forgejo:latest
+    container_name: forgejo           # Fixed name (not random)
+    restart: unless-stopped           # Restart on crash or host reboot
+    env_file: .env                    # Load environment variables from file
+    environment:
+      FORGEJO__server__DOMAIN: gitforge.kitestacks.com   # Override one env var
+    volumes:
+      - ./data:/data                  # Bind mount: ./data on host → /data in container
+    ports:
+      - "127.0.0.1:2222:22"          # Bind host 127.0.0.1:2222 to container port 22 (SSH)
+    networks:
+      - kitestacks
+    depends_on:
+      - authentik-postgres             # Start this service before forgejo
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
+      interval: 30s
+      timeout: 10s
+      retries: 3
+
+networks:
+  kitestacks:
+    external: true                    # Use existing network (don't create a new one)
+```
+
+**Key fields explained:**
+
+`restart: unless-stopped`
+- `no` — never restart
+- `always` — always restart; a container you stopped manually comes back when the Docker daemon restarts
+- `on-failure` — restart only if exit code is non-zero
+- `unless-stopped` — restart on crash or reboot, but not if you manually stopped it
+
+`env_file: .env`
+Reads `KEY=VALUE` pairs from a file. The `.env` file is in `.gitignore` so secrets
+never get committed to git. Always use this for passwords, tokens, and secrets.
+
+`depends_on`
+Starts services in dependency order. Does NOT wait for a service to be "ready" —
+just waits for the container to START. If you need to wait for a database to be ready,
+add a health check and use `condition: service_healthy`.
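+
+When a dependent program cannot rely on a Compose health check, it can wait for the
+dependency itself. A minimal sketch, assuming plain TCP reachability is an acceptable
+readiness signal (the `wait_for_port` helper and its defaults are made up for this
+example; an open port does not prove a database is ready to answer queries):
+
+```python
+import socket
+import time
+
+def wait_for_port(host, port, timeout=30.0, interval=0.5):
+    """Return True once a TCP connection to host:port succeeds, False on timeout."""
+    deadline = time.monotonic() + timeout
+    while time.monotonic() < deadline:
+        try:
+            with socket.create_connection((host, port), timeout=interval):
+                return True          # something accepted the connection
+        except OSError:
+            time.sleep(interval)     # refused or unreachable: retry until the deadline
+    return False
+
+if __name__ == "__main__":
+    # Demo against a throwaway local listener instead of a real database
+    server = socket.socket()
+    server.bind(("127.0.0.1", 0))    # port 0 = let the OS pick a free port
+    server.listen(1)
+    demo_port = server.getsockname()[1]
+    print(wait_for_port("127.0.0.1", demo_port, timeout=2))   # True
+    server.close()
+```
+
+Inside a container on the `kitestacks` network, the host argument would be the service
+name, for example `wait_for_port("authentik-postgres", 5432)`, assuming PostgreSQL
+listens on its default port.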
+
+**Common commands:**
+```bash
+docker compose up -d              # Start all services in background
+docker compose down               # Stop and remove containers (not volumes)
+docker compose down -v            # Stop, remove containers AND volumes (data loss!)
+docker compose restart forgejo    # Restart one service
+docker compose pull               # Pull latest images
+docker compose logs -f forgejo    # Follow logs for one service
+docker compose ps                 # Show service status
+docker compose exec forgejo bash  # Open shell in running service
+docker compose config             # Validate and show merged config
+```
+
+---
+
+## Port Mappings — When to Use Them
+
+```yaml
+ports:
+  - "3005:3000"           # host_port:container_port (binds all interfaces by default)
+  - "127.0.0.1:3005:3000" # bind to localhost only (not accessible from outside host)
+  - "0.0.0.0:9100:9100"   # bind on all interfaces (accessible from outside)
+```
+
+**In this homelab, most services do NOT expose host ports** — they only communicate
+through the Docker network. Cloudflare Tunnel connects directly to the container via
+the Docker bridge network, so no host ports are needed for public services.
+
+The exceptions:
+- `node-exporter` on kscloud1 publishes a host port (so Prometheus on monk can scrape it via public IP)
+- `portainer` publishes 9443 (HTTPS)
+- `kitestacks-metrics-api` publishes no ports at all because it uses `network_mode: host`
+
+---
+
+## Inspecting and Debugging
+
+```bash
+# See everything about a container
+docker inspect forgejo
+
+# See just its IP address on each network
+docker inspect forgejo --format '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}'
+
+# See its environment variables (careful — this shows secrets!)
+docker inspect forgejo --format '{{range .Config.Env}}{{println .}}{{end}}'
+
+# See its mounts
+docker inspect forgejo --format '{{json .Mounts}}' | python3 -m json.tool
+
+# See resource usage
+docker stats                    # Live, all containers
+docker stats forgejo --no-stream # One snapshot for one container
+
+# See what the container's filesystem looks like
+docker exec forgejo ls /
+docker exec forgejo cat /etc/forgejo/app.ini
+docker exec forgejo find /data -name "*.db" 2>/dev/null
+```
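+
+The Go-template `--format` strings above get awkward for anything nested. As a hedged
+alternative, the JSON that `docker inspect` prints can be parsed in Python. The sketch
+below runs on a trimmed, made-up sample (the IP address is invented) so it works without
+Docker; `network_ips` is a hypothetical helper name.
+
+```python
+import json
+
+def network_ips(inspect_json):
+    """Map network name -> container IP from `docker inspect <container>` output."""
+    containers = json.loads(inspect_json)   # docker inspect prints a JSON array
+    networks = containers[0]["NetworkSettings"]["Networks"]
+    return {name: details["IPAddress"] for name, details in networks.items()}
+
+# Trimmed, invented sample shaped like the output of `docker inspect forgejo`
+sample = """
+[{"Name": "/forgejo",
+  "NetworkSettings": {"Networks": {"kitestacks": {"IPAddress": "172.18.0.5"}}}}]
+"""
+
+print(network_ips(sample))   # {'kitestacks': '172.18.0.5'}
+```
+
+With real data, the input string could come from
+`subprocess.run(["docker", "inspect", "forgejo"], capture_output=True, text=True).stdout`.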
+
+---
+
+## Common Gotchas
+
+**Containers share the host's kernel:** If an image needs kernel features or syscalls
+that your host kernel is too old to provide, it may fail in confusing ways. Rare but real.
+
+**Named volumes are invisible by default:** New developers spend hours wondering where
+data went after deleting a container. Named volumes survive `docker compose down`.
+They do NOT survive `docker compose down -v`.
+
+**Order vs readiness:** `depends_on` does not mean "wait until ready." A Postgres
+container starts in milliseconds, but PostgreSQL inside it takes 3–5 seconds to accept
+connections. Use healthchecks for real readiness checking.
+
+**Port conflicts:** Two containers cannot bind the same host port. If you get
+`Bind for 0.0.0.0:3000 failed: port is already allocated`, something else is already
+using that host port.
+
+**network_mode: host and named networks cannot coexist:**
+```yaml
+network_mode: host    # This means the container has NO network isolation
+# You cannot also add networks: [...] — they conflict
+```
+
+---
+
+## Practice Exercises
+
+1. Pull the `nginx:alpine` image and run it: `docker run -d -p 8080:80 nginx:alpine`
+   Visit `http://localhost:8080`. Then exec into it and find the nginx config.
+
+2. Run two containers (`alpine`) on the same custom network and verify they can
+   ping each other by container name
+
+3. Create a named volume and mount it in two different containers. Write a file from
+   one container and read it from the other
+
+4. Write a `docker-compose.yml` with three services: one nginx, one redis, one alpine
+   that waits for redis to be healthy before starting
+
+5. Use `docker inspect` to find the IP address of your `forgejo` container on the
+   `kitestacks` network. Confirm it matches what Docker DNS resolves.
+
+---
+
+**Next:** [Part 5 — Networking](05-networking.md)
--- a/homelab-mastery/build-guide/without-ai/05-networking.md
+++ b/homelab-mastery/build-guide/without-ai/05-networking.md
@ -0,0 +1,352 @@
+# Without AI — Part 5: Networking
+
+**Track:** Advanced (No AI)  
+**Time for this section:** 1–2 weeks
+
+Networking is the hardest part to learn and the most important. Every problem in this
+homelab ultimately involves a packet trying to get somewhere. If you understand how
+packets travel, you can debug anything.
+
+---
+
+## IP Addresses
+
+Every device on a network has an IP address — a number that identifies it.
+
+**IPv4:** Four octets (0–255) separated by dots: `192.168.1.205`
+
+**Reserved address ranges:**
+
+| Range | Type | Used For |
+|-------|------|----------|
+| `10.0.0.0/8` | Private | Corporate networks, VPNs |
+| `172.16.0.0/12` | Private | Docker's default bridge networks |
+| `192.168.0.0/16` | Private | Home networks (your router) |
+| `100.64.0.0/10` | Shared (carrier-grade NAT) | Tailscale uses this range |
+| Most other addresses | Public | Routable on the internet |
+
+Private addresses are not routable on the internet. Your home router uses NAT
+(Network Address Translation) to let private-addressed devices reach the internet.
+
+---
+
+## Subnetting and CIDR Notation
+
+CIDR (Classless Inter-Domain Routing) notation describes a range of IP addresses:
+```
+192.168.1.0/24
+              │
+              └── prefix length: how many bits are fixed
+```
+
+An IPv4 address is 32 bits. A `/24` means the first 24 bits are fixed (the network),
+leaving 8 bits for hosts. `2^8 = 256` addresses, minus network (`.0`) and broadcast (`.255`)
+= 254 usable host addresses.
+
+| CIDR | Addresses | Usable | Example |
+|------|-----------|--------|---------|
+| `/32` | 1 | 1 | A single IP |
+| `/31` | 2 | 2 | Point-to-point link |
+| `/30` | 4 | 2 | Small link |
+| `/29` | 8 | 6 | Small subnet |
+| `/28` | 16 | 14 | |
+| `/27` | 32 | 30 | |
+| `/26` | 64 | 62 | |
+| `/25` | 128 | 126 | |
+| `/24` | 256 | 254 | Typical home/office LAN |
+| `/16` | 65,536 | 65,534 | Large network |
+| `/12` | 1,048,576 | — | Docker range: 172.16.0.0/12 |
+| `/8` | 16,777,216 | — | 10.x.x.x range |
+
+**Subnetting practice:** Calculating the host range of `172.17.0.0/16`:
+- Fixed: `172.17` (first 16 bits)
+- Variable: last 16 bits
+- Host range: `172.17.0.1` to `172.17.255.254`
+- This covers all of `172.17.x.x`
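+
+The same arithmetic can be scripted. This is a minimal sketch in plain Bash, assuming
+prefixes between `/1` and `/30`. The function names (`ip_to_int`, `int_to_ip`, `host_range`)
+are made up for this example and are not part of any tool.
+
+```bash
+# Convert a dotted-quad IPv4 address to a 32-bit integer.
+ip_to_int() {
+  local a b c d
+  IFS=. read -r a b c d <<<"$1"
+  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
+}
+
+# Convert a 32-bit integer back to dotted-quad form.
+int_to_ip() {
+  local n=$1
+  echo "$(( (n >> 24) & 255 )).$(( (n >> 16) & 255 )).$(( (n >> 8) & 255 )).$(( n & 255 ))"
+}
+
+# Print "first-usable last-usable usable-count" for a CIDR block.
+host_range() {
+  local ip=${1%/*} prefix=${1#*/}
+  local size=$(( 1 << (32 - prefix) ))                          # total addresses
+  local mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))  # fixed (network) bits
+  local network=$(( $(ip_to_int "$ip") & mask ))
+  local broadcast=$(( network + size - 1 ))
+  echo "$(int_to_ip $(( network + 1 ))) $(int_to_ip $(( broadcast - 1 ))) $(( size - 2 ))"
+}
+```
+
+For example, `host_range 172.17.0.0/16` prints `172.17.0.1 172.17.255.254 65534`, which matches
+the range worked out by hand above.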
+
+**Why `/12` covers Docker's default bridge networks:**
+`172.16.0.0/12` covers `172.16.0.0` through `172.31.255.255`.
+By default, Docker creates bridge networks in the `172.17.x.x`, `172.18.x.x`, etc. ranges.
+All of those are inside `172.16.0.0/12`, so one ufw rule covers them.
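+
+To check whether a given container address falls inside that block, compare the network
+bits of both addresses. A rough sketch, with hypothetical helper names (`ip_to_int`, `in_cidr`):
+
+```bash
+# Convert a dotted-quad IPv4 address to a 32-bit integer.
+ip_to_int() {
+  local a b c d
+  IFS=. read -r a b c d <<<"$1"
+  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
+}
+
+# Succeed (exit 0) when the address in $1 lies inside the CIDR block in $2.
+in_cidr() {
+  local ip=$1 net=${2%/*} prefix=${2#*/}
+  local mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
+  # Same network bits on both sides means the address is inside the block.
+  (( ($(ip_to_int "$ip") & mask) == ($(ip_to_int "$net") & mask) ))
+}
+```
+
+`in_cidr 172.18.0.5 172.16.0.0/12` succeeds, while `in_cidr 172.32.0.1 172.16.0.0/12` fails.
+A subnet allocated outside this block (for example through a custom address pool in the
+Docker daemon configuration) would not match the rule.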
+
+---
+
+## Ports
+
+A port is a 16-bit number (0–65535) that identifies which service on a host should
+handle a connection.
+
+```
+IP address = the building
+Port       = the apartment number
+```
+
+**Common ports (0–1023 is the "well-known" range; 5432 and 6379 are registered ports):**
+| Port | Protocol | Service |
+|------|----------|---------|
+| 22 | TCP | SSH |
+| 25 | TCP | SMTP (email sending) |
+| 53 | UDP/TCP | DNS |
+| 80 | TCP | HTTP |
+| 443 | TCP | HTTPS |
+| 5432 | TCP | PostgreSQL |
+| 6379 | TCP | Redis |
+
+**Ephemeral ports:** The OS picks one of these as the source port of each outbound connection. IANA suggests 49152–65535; Linux defaults to 32768–60999.
+
+**In Docker:**
+```yaml
+ports:
+  - "9100:9100"   # host:container — both the same number
+```
+Container port 9100 is mapped to host port 9100.
+External systems connect to the host IP on port 9100.
+Internally, containers on the Docker network use the container port directly.
+
+---
+
+## DNS (Domain Name System)
+
+DNS is a distributed database that maps names to IP addresses.
+
+**The hierarchy:**
+```
+. (root)
+└── com
+    └── kitestacks
+        ├── www      →  Cloudflare anycast IP
+        ├── auth     →  Cloudflare anycast IP
+        └── grafana  →  Cloudflare anycast IP
+```
+
+**Resolution process for `grafana.kitestacks.com`:**
+1. Browser checks local cache — not found
+2. Browser asks the OS resolver (on Ubuntu, systemd-resolved at `127.0.0.53`)
+3. OS asks the configured DNS server (your home router, or 8.8.8.8)
+4. Resolver asks root nameservers: "who handles `.com`?"
+5. Root says: "Ask Verisign's servers"
+6. Resolver asks Verisign: "who handles `kitestacks.com`?"
+7. Verisign says: "Ask Cloudflare's nameservers (`vera.ns.cloudflare.com`)"
+8. Resolver asks Cloudflare: "what is `grafana.kitestacks.com`?"
+9. Cloudflare returns: "Cloudflare's anycast IP: 104.x.x.x"
+10. Browser connects to 104.x.x.x on port 443
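+
+The order of questions in steps 4 to 8 follows the labels of the name from right to left.
+A small sketch that prints that chain for any name (`delegation_chain` is a hypothetical
+helper, and the last line is simply the full name being asked about, not necessarily its own zone):
+
+```bash
+# Print the names a recursive resolver walks through, from the root down.
+delegation_chain() {
+  local name=${1%.}          # drop a trailing dot if present
+  local zone="" i
+  local -a labels
+  IFS=. read -r -a labels <<<"$name"
+  echo "."                   # the root comes first
+  for (( i = ${#labels[@]} - 1; i >= 0; i-- )); do
+    zone="${labels[i]}.${zone}"
+    echo "$zone"
+  done
+}
+```
+
+`delegation_chain grafana.kitestacks.com` prints `.`, `com.`, `kitestacks.com.`, and
+`grafana.kitestacks.com.` on separate lines. Running `dig +trace grafana.kitestacks.com`
+shows the real servers answering at each of those levels.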
+
+**Internal Docker DNS:**
+Inside the `kitestacks` Docker network, Docker runs a DNS server at `127.0.0.11`.
+When cloudflared resolves `grafana`, Docker DNS returns the container's bridge IP.
+
+```bash
+# Check what an external name resolves to
+dig grafana.kitestacks.com
+
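+# Check DNS from inside the Docker network.
+# The cloudflared image ships without a shell or DNS tools, so use a throwaway container.
+docker run --rm --network kitestacks alpine nslookup grafana
+docker run --rm --network kitestacks alpine cat /etc/resolv.conf   # Shows the DNS server: 127.0.0.11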
+```
+
+---
+
+## HTTP and HTTPS
+
+**HTTP:** Plain text request/response protocol. Anyone who can see the traffic can read it.
+
+```
+GET /api/health HTTP/1.1
+Host: grafana.kitestacks.com
+Accept: application/json
+
+HTTP/1.1 200 OK
+Content-Type: application/json
+
+{"ok": true}
+```
+
+**HTTPS:** HTTP inside a TLS-encrypted tunnel. The connection is encrypted from the client to
+Cloudflare's edge, and the tunnel from the edge to cloudflared is encrypted as well. Only the
+last hop, from cloudflared to your containers inside the Docker network, is plain HTTP. This
+is fine because that traffic never leaves the host.
+
+**TLS handshake (simplified):**
+1. Client says "hello, I support these cipher suites"
+2. Server sends its certificate (proves it is `kitestacks.com`)
+3. Client verifies certificate against trusted Certificate Authorities
+4. Both sides agree on encryption keys (Diffie-Hellman key exchange)
+5. Encrypted connection established
+6. HTTP requests flow inside this encrypted tunnel
+
+In this homelab, Cloudflare handles TLS entirely. Your containers never see TLS.
+
+---
+
+## Cloudflare Tunnel — Technical Details
+
+**What cloudflared actually does:**
+
+```bash
+# Watch cloudflared connect
+docker logs cloudflared -f
+# Look for lines reporting a registered connection, with fields such as connIndex=0 location=ORD
+# ORD = Chicago data center (or nearest Cloudflare POP to you)
+```
+
+cloudflared establishes persistent, multiplexed outbound connections to Cloudflare's
+edge network. When a request comes in:
+
+```
+Internet user → Cloudflare edge → tunnel (QUIC or HTTP/2) → cloudflared
+                                                                 ↓
+cloudflared reads Ingress rules from Cloudflare API:
+  grafana.kitestacks.com → http://grafana:3000
+
+cloudflared → Docker DNS → grafana container IP → sends request
+```
+
+The tunnel connection uses QUIC (UDP-based) when possible and falls back to HTTP/2 over TCP.
+
+**Active-Active with two connectors:**
+Each connector registers separately. Cloudflare maintains a list of active connectors.
+Incoming requests are distributed across connectors by Cloudflare — no configuration
+needed on your end. If one connector drops, the others take all traffic within seconds.
+
+---
+
+## Tailscale — WireGuard Under the Hood
+
+Tailscale is a managed WireGuard VPN. Understanding WireGuard explains Tailscale.
+
+**WireGuard:**
+- Modern VPN protocol, designed in 2016
+- Uses UDP only, and is generally faster than OpenVPN (which can run over UDP or TCP)
+- Cryptography: Curve25519 key exchange, ChaCha20-Poly1305 encryption
+- Each peer has a public/private key pair (like SSH keys)
+- Configured via static peer lists with IP allowances
+
+**The NAT problem:** Home machines are behind NAT. Their public IP is the router's IP,
+not their own. Two NAT-ed machines cannot easily make direct connections.
+
+**Tailscale's solution — UDP hole punching:**
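+1. Both machines connect to Tailscale's coordination server, which exchanges their public
+   keys and candidate endpoints. Until a direct path exists, traffic flows through
+   Tailscale's DERP relay servers, still end-to-end encrypted
+2. Tailscale orchestrates a "hole punch": both machines send packets to each other
+   simultaneously, which opens NAT mappings on both routers
+3. Direct WireGuard connection established peer-to-peer
+4. The DERP relays drop out of the data path. If hole punching fails, traffic stays on DERP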
+
+```bash
+# Check Tailscale status
+tailscale status
+
+# See your device's Tailscale IP
+tailscale ip -4
+
+# Check connectivity to kscloud1
+tailscale ping 100.123.x.x
+
+# See if connection is direct or via relay
+tailscale status --json | python3 -m json.tool | grep -A5 "kscloud1"
+```
+
+**Why Tailscale IPs are stable:** Each device's `100.x.x.x` IP is tied to its machine
+identity in Tailscale's database. It does not change when you move networks or reconnect.
+
+---
+
+## Firewalls (ufw)
+
+ufw (Uncomplicated Firewall) is a frontend for iptables/nftables.
+
+**kscloud1's firewall configuration:**
+```bash
+# View current rules
+sudo ufw status verbose
+
+# Default policies
+sudo ufw default deny incoming    # Block all inbound by default
+sudo ufw default allow outgoing   # Allow all outbound
+
+# Allow specific services
+sudo ufw allow ssh                 # Allow SSH (port 22)
+sudo ufw allow from 172.16.0.0/12 to any port 8000 proto tcp  # Docker → metrics API
+
+# Why 172.16.0.0/12 and not just the specific Docker subnet?
+# Docker assigns each new bridge network the next free subnet from its default pools,
+# which start at 172.17.0.0/16. /12 covers every default 172.x subnet, so this rule keeps working.
+```
+
+**The ufw/Docker conflict:** Docker modifies iptables rules directly, bypassing ufw.
+This means Docker's port mappings (`-p 9100:9100`) are accessible regardless of ufw rules.
+Only services running in `network_mode: host` are controlled by ufw.
+
+kscloud1's metrics API uses `network_mode: host`, so it needs an explicit ufw allow rule
+for Docker containers to reach it.
+
+---
+
+## Reverse Proxies
+
+A reverse proxy receives requests on behalf of backend services:
+
+```
+Client → Reverse Proxy → Backend A
+                      → Backend B
+                      → Backend C
+```
+
+In this homelab:
+- **Cloudflare + cloudflared** — the primary reverse proxy routing by hostname
+- **nginx (homepage container)** — secondary proxy forwarding `/api/*` to metrics API
+
+nginx config that proxies API calls:
+```nginx
+location /api/ {
+    proxy_pass http://host.docker.internal:8000/;
+    proxy_set_header Host $host;
+    proxy_set_header X-Real-IP $remote_addr;
+}
+```
+
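+`host.docker.internal` resolves to the host machine's IP from inside a Docker container.
+On Linux this name is not defined by default: the container needs
+`extra_hosts: ["host.docker.internal:host-gateway"]` in its compose file.
+This lets the nginx container reach the metrics API running in `network_mode: host`.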
+
+---
+
+## Diagnosing Network Problems
+
+**"I can't reach the service from outside"**
+```bash
+# Is cloudflared running and connected?
+docker logs cloudflared | tail -20
+
+# Is the target container running and on the right network?
+docker inspect homepage --format '{{range $name, $net := .NetworkSettings.Networks}}{{println $name $net.IPAddress}}{{end}}'
+
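+# Can another container on the same network reach it?
+# (The cloudflared image has no curl, so test from a throwaway container.)
+docker run --rm --network kitestacks alpine wget -qO- http://homepage:3000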
+```
+
+**"Two containers can't talk to each other"**
+```bash
+# Are they on the same network?
+docker network inspect kitestacks | grep -A5 "Containers"
+
+# DNS resolution working?
+docker exec service-a nslookup service-b
+
+# Is the target port open inside the container?
+docker exec service-b ss -tlnp
+```
+
+**"The database won't accept connections"**
+```bash
+# Is Postgres accepting connections?
+docker exec authentik-postgres pg_isready -U authentik
+
+# From another container, can we reach it?
+docker exec authentik nc -zv authentik-postgres 5432
+
+# Is it published on the right host interface on kscloud1?
+docker port authentik-postgres
+# Should show: 5432/tcp -> 100.123.x.x:5432
+# (127.0.0.1 would block monk; 0.0.0.0 would expose it on every interface)
+```
+
+---
+
+**Next:** [Part 6 — Full Build](06-full-build.md)
--- a/homelab-mastery/build-guide/without-ai/06-full-build.md
+++ b/homelab-mastery/build-guide/without-ai/06-full-build.md
@ -0,0 +1,478 @@
+# Without AI — Part 6: Full Build
+
+**Track:** Advanced (No AI)  
+**Time for this section:** 4–8 weeks
+
+You now have the foundations: Linux, Bash, Python, Docker, and Networking.
+This section builds the entire KiteStacks homelab from scratch — command by command,
+with every command explained.
+
+---
+
+## Before You Start
+
+You need:
+- Ubuntu 24.04 installed on your home PC (monk) and your VPS (kscloud1)
+- A domain name with DNS managed by Cloudflare
+- SSH key access to kscloud1
+- Tailscale account and CLI installed on both machines
+- Cloudflare account with a tunnel created (token saved)
+
+---
+
+## Phase 1 — Prepare Both Machines
+
+Run on **both monk and kscloud1**:
+
+```bash
+# Update the system
+sudo apt update && sudo apt upgrade -y
+
+# Install essential tools
+sudo apt install -y curl git nano wget python3 python3-pip ufw
+
+# Install Docker
+sudo apt install -y ca-certificates curl
+sudo install -m 0755 -d /etc/apt/keyrings
+sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg \
+  -o /etc/apt/keyrings/docker.asc
+sudo chmod a+r /etc/apt/keyrings/docker.asc
+echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] \
+  https://download.docker.com/linux/ubuntu \
+  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
+  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
+sudo apt update
+sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
+
+# Enable and start Docker
+sudo systemctl enable docker
+sudo systemctl start docker
+
+# Add your user to the docker group (avoids sudo for every docker command)
+sudo usermod -aG docker "$USER"
+# Log out and back in (or run `newgrp docker`) before continuing; the next command fails until you do
+
+# Create the shared Docker network
+docker network create kitestacks
+```
+
+On **kscloud1** specifically, set up the firewall:
+
+```bash
+sudo ufw default deny incoming
+sudo ufw default allow outgoing
+sudo ufw allow ssh
+# Allow Docker bridge networks to reach host port 8000 (metrics API)
+sudo ufw allow from 172.16.0.0/12 to any port 8000 proto tcp
+sudo ufw --force enable
+sudo ufw status verbose
+```
+
+Install Tailscale on both machines:
+```bash
+curl -fsSL https://tailscale.com/install.sh | sh
+sudo tailscale up
+# Follow the URL to authenticate
+tailscale ip -4   # Note this IP — you will use it throughout the build
+```
+
+---
+
+## Phase 2 — Cloudflared (Tunnel Connector)
+
+Run on **monk**:
+
+```bash
+mkdir -p ~/kitestacks-live/docker/cloudflared
+cd ~/kitestacks-live/docker/cloudflared
+
+cat > .env <<'EOF'
+TUNNEL_TOKEN=your-tunnel-token-from-cloudflare
+EOF
+
+cat > docker-compose.yml <<'EOF'
+services:
+  cloudflared:
+    image: cloudflare/cloudflared:latest
+    container_name: cloudflared
+    restart: unless-stopped
+    command: tunnel --no-autoupdate run
+    environment:
+      - TUNNEL_TOKEN=${TUNNEL_TOKEN:?set TUNNEL_TOKEN in .env}
+    networks:
+      - default
+      - kitestacks
+
+networks:
+  kitestacks:
+    external: true
+EOF
+
+docker compose up -d
+docker logs cloudflared   # Confirm the tunnel connections registered without errors
+```
+
+**Why `${TUNNEL_TOKEN:?set TUNNEL_TOKEN in .env}`:**
+The `:?` syntax means: if the variable is unset or empty, exit with the given error message.
+This prevents silently running cloudflared with no token (which would produce a confusing error).
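+
+Compose borrows this syntax from the POSIX shell, so you can try the behavior in Bash
+before relying on it. A minimal sketch (the function names are made up for this example):
+
+```bash
+# Succeeds only when TUNNEL_TOKEN is set AND non-empty.
+# The parentheses run the check in a subshell, because a failed ${VAR:?}
+# expansion terminates a non-interactive shell.
+require_token() {
+  ( : "${TUNNEL_TOKEN:?set TUNNEL_TOKEN in .env}" )
+}
+
+# Without the colon, only an unset variable fails; an empty value passes.
+require_token_set() {
+  ( : "${TUNNEL_TOKEN?set TUNNEL_TOKEN in .env}" )
+}
+```
+
+With `TUNNEL_TOKEN` unset or empty, `require_token` fails and prints the message to stderr.
+With `TUNNEL_TOKEN=""`, `require_token_set` still succeeds, which is why the guide uses the
+`:?` form.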
+
+Repeat on **kscloud1** using the same token, same docker-compose.yml, at `/opt/kitestacks/docker/cloudflared/`.
+
+---
+
+## Phase 3 — Shared Database Layer (on kscloud1)
+
+The shared Postgres and Redis will run on kscloud1. Both monk's and kscloud1's Authentik
+will point to these. Forgejo will use the same Postgres (different database).
+
+On **kscloud1**:
+
+```bash
+# Get kscloud1's Tailscale IP
+TAILSCALE_IP=$(tailscale ip -4)
+echo "Tailscale IP: $TAILSCALE_IP"
+
+mkdir -p /opt/kitestacks/docker/authentik
+cd /opt/kitestacks/docker/authentik
+
+# Generate a strong Postgres password
+PG_PASS=$(openssl rand -base64 32 | tr -d '/+=')
+echo "Postgres password: $PG_PASS"  # Save this
+
+cat > .env <<EOF
+PG_PASS=${PG_PASS}
+EOF
+
+cat > docker-compose.yml <<EOF
+services:
+  authentik-postgres:
+    image: postgres:16-alpine
+    container_name: authentik-postgres
+    restart: unless-stopped
+    environment:
+      POSTGRES_PASSWORD: \${PG_PASS}
+      POSTGRES_USER: authentik
+      POSTGRES_DB: authentik
+    ports:
+      - "${TAILSCALE_IP}:5432:5432"
+    volumes:
+      - ./postgres:/var/lib/postgresql/data
+    networks:
+      - kitestacks
+      - authentik_default
+
+  authentik-redis:
+    image: redis:7-alpine
+    container_name: authentik-redis
+    restart: unless-stopped
+    ports:
+      - "${TAILSCALE_IP}:6379:6379"
+    networks:
+      - kitestacks
+      - authentik_default
+
+networks:
+  kitestacks:
+    external: true
+  authentik_default:
+    name: authentik_default
+EOF
+
+docker compose up -d
+docker ps   # Confirm both containers are Up
+
+# Verify the host publishes Postgres on the Tailscale IP only (NOT 0.0.0.0)
+docker port authentik-postgres
+# Expected: 5432/tcp -> 100.x.x.x:5432
+```
+
+**Why the Tailscale IP binding matters:**
+`"${TAILSCALE_IP}:5432:5432"` tells Docker: bind host port 5432 only on the Tailscale
+interface. If you used `"5432:5432"` (or `"0.0.0.0:5432:5432"`), Postgres would be
+reachable from the public internet — a serious security risk. Only devices on your
+Tailscale network can reach `100.x.x.x:5432`.
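+
+The binding is only as safe as the value of `TAILSCALE_IP`. If `tailscale ip -4` printed
+nothing (for example because Tailscale was not up yet), the mapping written into the compose
+file would no longer name a specific address. A guard like the following could run before the
+file is generated. It is a sketch: `in_tailscale_range` is a hypothetical helper that only
+checks the `100.64.0.0/10` range.
+
+```bash
+# Succeed only for a well-formed IPv4 address inside 100.64.0.0/10.
+in_tailscale_range() {
+  local a b c d octet
+  IFS=. read -r a b c d <<<"$1"
+  for octet in "$a" "$b" "$c" "$d"; do
+    [[ $octet =~ ^[0-9]{1,3}$ ]] || return 1
+  done
+  # 10# forces base 10 so an octet like "08" is not parsed as octal.
+  (( 10#$a == 100 && 10#$b >= 64 && 10#$b <= 127 && 10#$c <= 255 && 10#$d <= 255 ))
+}
+
+# Intended use, right after TAILSCALE_IP is set:
+#   in_tailscale_range "$TAILSCALE_IP" || { echo "not a Tailscale address: '$TAILSCALE_IP'" >&2; exit 1; }
+```
+
+An empty string, a LAN address such as `192.168.1.205`, and `0.0.0.0` are all rejected.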
+
+Create the Forgejo database:
+```bash
+docker exec -e PGPASSWORD="${PG_PASS}" authentik-postgres \
+  psql -U authentik -c "CREATE USER forgejo WITH PASSWORD 'forgejo-password-here';"
+docker exec -e PGPASSWORD="${PG_PASS}" authentik-postgres \
+  psql -U authentik -c "CREATE DATABASE forgejo OWNER forgejo;"
+```
+
+---
+
+## Phase 4 — Authentik (SSO)
+
+On **monk** first:
+
+```bash
+mkdir -p ~/kitestacks-live/docker/authentik
+cd ~/kitestacks-live/docker/authentik
+
+# Get kscloud1's Tailscale IP
+KSCLOUD1_TAILSCALE=100.123.x.x   # Replace with your actual value
+
+# Generate Authentik secret key (must be same on both hosts)
+SECRET_KEY=$(openssl rand -base64 60 | tr -d '\n')
+echo "Secret key: $SECRET_KEY"    # Save this — both hosts need the SAME key
+
+cat > .env <<EOF
+PG_PASS=your-postgres-password-from-phase-3
+AUTHENTIK_SECRET_KEY=${SECRET_KEY}
+AUTHENTIK_POSTGRESQL__HOST=${KSCLOUD1_TAILSCALE}
+AUTHENTIK_POSTGRESQL__USER=authentik
+AUTHENTIK_POSTGRESQL__NAME=authentik
+AUTHENTIK_POSTGRESQL__PASSWORD=your-postgres-password-from-phase-3
+AUTHENTIK_REDIS__HOST=${KSCLOUD1_TAILSCALE}
+AUTHENTIK_BOOTSTRAP_EMAIL=your@email.com
+AUTHENTIK_BOOTSTRAP_PASSWORD=choose-strong-password
+EOF
+
+cat > docker-compose.yml <<'EOF'
+services:
+  authentik:
+    image: ghcr.io/goauthentik/server:latest
+    container_name: authentik
+    restart: unless-stopped
+    command: server
+    env_file: .env
+    networks:
+      - kitestacks
+
+  authentik-worker:
+    image: ghcr.io/goauthentik/server:latest
+    container_name: authentik-worker
+    restart: unless-stopped
+    command: worker
+    env_file: .env
+    volumes:
+      - /var/run/docker.sock:/var/run/docker.sock
+    networks:
+      - kitestacks
+
+networks:
+  kitestacks:
+    external: true
+EOF
+
+docker compose up -d
+
+# Wait for Authentik to be healthy (takes ~2 minutes on first boot)
+until [[ "$(docker inspect --format '{{.State.Health.Status}}' authentik)" == "healthy" ]]; do
+  echo "Waiting for Authentik... $(docker inspect --format '{{.State.Health.Status}}' authentik)"
+  sleep 10
+done
+echo "Authentik is healthy"
+```
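+
+The loop above waits forever if Authentik never becomes healthy. One way to bound it is a
+small retry helper. This is a sketch: `wait_until` and `authentik_is_healthy` are names
+invented here, and the retry counts are arbitrary.
+
+```bash
+# wait_until MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
+# Runs COMMAND until it succeeds, giving up after MAX_TRIES attempts.
+wait_until() {
+  local max_tries=$1 delay=$2 attempt
+  shift 2
+  for (( attempt = 1; attempt <= max_tries; attempt++ )); do
+    if "$@"; then
+      return 0
+    fi
+    # Pause between attempts, but not after the final one.
+    if (( attempt < max_tries )); then
+      sleep "$delay"
+    fi
+  done
+  return 1
+}
+
+# Probe for the container name used in this guide.
+authentik_is_healthy() {
+  [[ "$(docker inspect --format '{{.State.Health.Status}}' authentik 2>/dev/null)" == "healthy" ]]
+}
+
+# Example: up to 30 attempts, 10 seconds apart, then show logs on failure.
+#   wait_until 30 10 authentik_is_healthy || docker logs authentik --tail 50
+```
+
+Keeping the probe separate from the retry logic means the same helper can wait on
+Postgres, Forgejo, or any other command that returns 0 when ready.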
+
+**What happens on first boot:** Authentik runs database migrations (creates all tables),
+generates cryptographic keys, and starts the server. The worker process handles
+background jobs (email, background flows). Both need the same `.env` file.
+
+**Why `AUTHENTIK_REDIS__HOST` and not just `REDIS_HOST`:**
+Authentik uses a config format where `__` in environment variable names means "nested key".
+`AUTHENTIK_POSTGRESQL__HOST` maps to `authentik.postgresql.host` in the config tree.
+
+On **kscloud1**, create the same Authentik setup pointing to the local Postgres:
+```bash
+# On kscloud1, AUTHENTIK_POSTGRESQL__HOST should be authentik-postgres
+# (via the Docker network), not the Tailscale IP
+# kscloud1's Authentik is on the same Docker network as Postgres
+```
+
+---
+
+## Phase 5 — Forgejo
+
+On **monk**:
+
+```bash
+mkdir -p ~/kitestacks-live/docker/forgejo
+cd ~/kitestacks-live/docker/forgejo
+
+KSCLOUD1_TAILSCALE=100.123.x.x   # kscloud1's Tailscale IP
+
+cat > .env <<EOF
+FORGEJO__database__DB_TYPE=postgres
+FORGEJO__database__HOST=${KSCLOUD1_TAILSCALE}:5432
+FORGEJO__database__NAME=forgejo
+FORGEJO__database__USER=forgejo
+FORGEJO__database__PASSWD=forgejo-password-from-phase-3
+FORGEJO__server__DOMAIN=gitforge.yourdomain.com
+FORGEJO__server__ROOT_URL=https://gitforge.yourdomain.com
+FORGEJO__server__SSH_DOMAIN=gitforge.yourdomain.com
+EOF
+
+cat > docker-compose.yml <<'EOF'
+services:
+  forgejo:
+    image: codeberg.org/forgejo/forgejo:latest
+    container_name: forgejo
+    restart: unless-stopped
+    env_file: .env
+    volumes:
+      - ./data:/data
+    networks:
+      - kitestacks
+
+networks:
+  kitestacks:
+    external: true
+EOF
+
+docker compose up -d
+docker logs forgejo -f   # Watch for errors
+```
+
+Visit `gitforge.yourdomain.com`. Complete the initial setup, then create your admin account.
+
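+On **kscloud1**: Same configuration. Both Forgejo instances point to the same Postgres `forgejo` database, so users, settings, and repository metadata are identical on both. The Git repositories themselves are stored on disk under each host's `./data` volume, not in Postgres, so that directory must also be kept in sync between the hosts.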
+
+---
+
+## Phase 6 — All Remaining Services
+
+For each remaining service, the pattern is the same:
+
+1. `mkdir -p ~/kitestacks-live/docker/<service>`
+2. Create `.env` with secrets
+3. Create `docker-compose.yml`
+4. `docker compose up -d`
+5. Verify with `docker ps` and `docker logs <container>`
+
+Detailed compose files for each service are in `~/kitestacks-homelab/apps/<service>/`.
+Use those as your reference — read each file before running it.
+
+Key services and their main configuration points:
+
+**Karakeep:** Provider ID is `custom` (not `authentik`) — OAuth redirect URI is
+`https://links.yourdomain.com/api/auth/callback/custom`.
+
+**Kavita:** OIDC must be configured via web UI (Settings → OIDC), not by file editing.
+Authority URL requires trailing slash.
+
+**BookStack:** After first start, fix cache permissions:
+```bash
+docker exec bookstack chown -R abc:users /config/www/framework/cache/
+docker compose restart bookstack
+```
+
+**kitestacks-metrics-api:**
+```yaml
+services:
+  kitestacks-metrics-api:
+    image: your-metrics-api-image   # Build from apps/kitestacks-metrics-api/
+    container_name: kitestacks-metrics-api
+    restart: unless-stopped
+    network_mode: host    # Must be host — not kitestacks network
+    pid: host             # Must be host — reads /proc for real stats
+    environment:
+      - FORGEJO_API_BASE=https://gitforge.yourdomain.com
+      - FORGEJO_TOKEN=your-forgejo-api-token
+```
+
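+Note: `network_mode: host` and `networks:` cannot coexist. The metrics API is reachable
+at `host.docker.internal:8000` from other containers, provided they map that name with
+`extra_hosts: ["host.docker.internal:host-gateway"]` (it is not defined by default on Linux).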
+
+---
+
+## Phase 7 — SSO Configuration
+
+For each service, in Authentik admin panel (`auth.yourdomain.com/if/admin/`):
+
+1. **Applications → Providers → Create → OAuth2/OpenID Provider**
+   - Client type: Confidential
+   - Redirect URIs: service-specific (see SSO guide)
+   - Signing key: authentik Self-signed Certificate
+   - Scopes: openid, email, profile
+
+2. **Applications → Applications → Create**
+   - Provider: the one you just created
+   - Launch URL: the service's public URL
+
+3. (For sensitive services) **Policy Binding** → restrict to `homelab-admin` group
+
+OAuth2 code TTL — increase to prevent `invalid_grant` during monk reconnect:
+```bash
+# Connect to shared Postgres from kscloud1
+docker exec -it authentik-postgres psql -U authentik -d authentik
+
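+-- Typed at the psql prompt. Increase code lifetime for all providers to 10 minutes.
+-- Authentik stores this value as a duration string such as 'minutes=10'.
+UPDATE authentik_providers_oauth2_oauth2provider
+SET access_code_validity = 'minutes=10';
+
+-- Restart both Authentik instances after this
+\q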
+```
+
+---
+
+## Phase 8 — Push Everything to kscloud1
+
+With monk as the source, push configurations to kscloud1:
+
+```bash
+# For each service, copy the docker-compose.yml and .env (with paths adjusted)
+# The standard pattern:
+for service in forgejo karakeep kavita grafana uptime-kuma bookstack osticket portainer; do
+  ssh -i ~/.ssh/id_ed25519_kscloud1 kenpat@100.123.x.x \
+    "mkdir -p /opt/kitestacks/docker/$service"
+  scp -i ~/.ssh/id_ed25519_kscloud1 \
+    ~/kitestacks-live/docker/$service/docker-compose.yml \
+    ~/kitestacks-live/docker/$service/.env \
+    kenpat@100.123.x.x:/opt/kitestacks/docker/$service/
+done
+```
+
+Then on kscloud1, start each service:
+```bash
+for service in forgejo karakeep kavita grafana uptime-kuma bookstack osticket portainer; do
+  # The subshell keeps the cd local; && skips compose if the directory is missing
+  (cd "/opt/kitestacks/docker/$service" && docker compose up -d)
+done
+```
+
+Verify all 11 services return the expected status:
+```bash
+for url in www auth gitforge ai links kavita grafana status wiki tasks portainer; do
+  code=$(curl -s -o /dev/null -w "%{http_code}" "https://${url}.yourdomain.com" --max-time 5)
+  echo "${url}.yourdomain.com: ${code}"
+done
+```
+
+All should return 200 or 302 (redirect to login).
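+
+When checking eleven hostnames, it helps to label each code instead of reading numbers.
+A possible sketch follows. `classify_status` and its category names are invented for this
+example, and the grouping of codes is an assumption rather than something KiteStacks defines.
+
+```bash
+# Map an HTTP status code (as printed by curl's %{http_code}) to a short label.
+classify_status() {
+  case "$1" in
+    200|302) echo "ok" ;;            # page served, or redirect to login
+    000)     echo "unreachable" ;;   # curl prints 000 when no HTTP response arrived
+    401|403) echo "auth-blocked" ;;
+    5??)     echo "server-error" ;;  # origin or tunnel problem
+    *)       echo "unexpected" ;;
+  esac
+}
+
+# Inside the loop above, the echo line could become:
+#   echo "${url}.yourdomain.com: ${code} ($(classify_status "$code"))"
+```
+
+Anything other than `ok` is a prompt to check `docker logs` for that service on both hosts.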
+
+---
+
+## Committing Everything to Forgejo
+
+Once your homelab is working, commit all configurations:
+
+```bash
+cd ~/kitestacks-live
+git init -b main   # Name the first branch main so the push below matches
+git remote add origin https://gitforge.yourdomain.com/kenpat/kitestacks-live.git
+
+# Add a .gitignore BEFORE adding files — never commit secrets
+cat > .gitignore <<'EOF'
+**/.env
+**/data/
+**/postgres/
+**/config/
+**/*.db
+**/*.db-shm
+**/*.db-wal
+EOF
+
+git add .gitignore docker/*/docker-compose.yml
+git commit -m "initial: all service compose files"
+git push origin main
+```
+
+Your `.env` files (which contain passwords and tokens) must NEVER be committed.
+The `.gitignore` above prevents this.
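+
+A `.gitignore` does not help if a file is added with `git add -f` or was tracked before
+the rule existed. As a second line of defense, a check like this could run before each commit.
+It is a sketch: `find_env_files` is a hypothetical helper that only looks at file names.
+
+```bash
+# Read one path per line on stdin; print every path whose file name is exactly ".env".
+# Exit status is 0 if at least one was found, 1 otherwise.
+find_env_files() {
+  local path found=1
+  while IFS= read -r path; do
+    if [[ "${path##*/}" == ".env" ]]; then
+      echo "$path"
+      found=0
+    fi
+  done
+  return "$found"
+}
+
+# Example: refuse to continue if a .env file is staged.
+#   if git diff --cached --name-only | find_env_files; then
+#     echo "Unstage the files listed above before committing" >&2
+#   fi
+```
+
+The helper reads plain paths, so it can be tested without a repository by piping in a list
+of file names.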
+
+---
+
+**Next:** [Part 7 — Troubleshooting](07-troubleshooting.md)
--- a/homelab-mastery/build-guide/without-ai/07-troubleshooting.md
+++ b/homelab-mastery/build-guide/without-ai/07-troubleshooting.md
@ -0,0 +1,389 @@
+# Without AI — Part 7: Troubleshooting
+
+**Track:** Advanced (No AI)  
+**Time for this section:** Ongoing (this is a reference you return to)
+
+Troubleshooting is not a step you complete — it is a skill you build over time.
+This section teaches the methodology and documents the real issues encountered
+building KiteStacks, with full explanations of how each was diagnosed and fixed.
+
+---
+
+## The Troubleshooting Mindset
+
+Before running any command, form a hypothesis. Before Googling, read the error.
+
+**The diagnostic loop:**
+1. **Observe** — what exactly is failing? URL? Error message? Which service?
+2. **Hypothesize** — what could cause this? List 2–3 possibilities
+3. **Test** — run the simplest command to prove or disprove your hypothesis
+4. **Narrow** — eliminate possibilities until one remains
+5. **Fix** — apply the fix
+6. **Verify** — confirm the fix worked
+7. **Document** — write what broke and what fixed it
+
+The most common mistake: jumping to step 5 without completing steps 2–4.
+
+---
+
+## Diagnostic Commands to Know Cold
+
+```bash
+# Container status
+docker ps                          # All running containers
+docker ps -a                       # All containers (including stopped)
+docker inspect <container>         # Full container config and state
+
+# Logs
+docker logs <container>            # All logs
+docker logs <container> --tail 50  # Last 50 lines
+docker logs <container> -f         # Follow live
+docker logs <container> --since 5m # Last 5 minutes
+
+# Network
+docker exec <container> curl -s http://other-container:port/health
+docker exec <container> nslookup other-container
+docker exec <container> ss -tlnp
+docker network inspect kitestacks
+
+# Disk and resources
+docker system df                   # Docker disk usage
+docker stats --no-stream           # One-shot resource usage
+df -h                              # Host disk usage
+free -h                            # Host RAM
+
+# DNS and HTTP from host
+curl -sv https://grafana.kitestacks.com  # -v = verbose (shows headers, TLS)
+dig grafana.kitestacks.com               # DNS lookup
+```
+
+---
+
+## Real Issues Encountered Building KiteStacks
+
+### Issue 1 — SSO: `invalid_grant` on OAuth Login (50% of the time)
+
+**Symptom:** Clicking "Sign in with Authentik" in Grafana, Kavita, etc. sometimes
+worked and sometimes showed `invalid_grant: The provided authorization grant is invalid`.
+Happened roughly 50% of the time. No correlation to time of day.
+
+**Observation:** The error appeared specifically after the authorization code redirect,
+during the token exchange step.
+
+**Hypothesis:**
+1. Authentik configuration wrong (but then it would fail 100% of the time)
+2. Network issue (but HTTP 400 means request reached Authentik)
+3. The code created in step 1 is not found in step 2
+
+**Testing:**
+```bash
+# Run on EACH host: count the authorization codes in that host's local Authentik database
+docker exec authentik-postgres psql -U authentik -d authentik -c "SELECT count(*) FROM authentik_providers_oauth2_authorizationcode;"
+# Monk's Authentik: count = 3
+# kscloud1's Authentik: count = 1
+# Different! Step 1 created the code in one DB, step 2 looked in the other.
+```
+
+**Root cause:** Two Authentik instances, two separate Postgres databases. Cloudflare
+routes `/application/o/authorize/` and `/application/o/token/` independently, so they can hit different hosts.
+
+**Fix:** Migrate both Authentik instances to a single shared Postgres, hosted on kscloud1,
+bound to the Tailscale IP only.
+
+```bash
+# 1. Dump monk's Authentik DB
+docker exec authentik-postgres pg_dump -U authentik authentik --clean --if-exists \
+  > /tmp/authentik_dump.sql
+
+# 2. Restore to kscloud1's new shared Postgres
+scp /tmp/authentik_dump.sql kenpat@100.123.x.x:/tmp/
+ssh kenpat@100.123.x.x "docker exec -i authentik-postgres psql -U authentik -d authentik \
+  < /tmp/authentik_dump.sql"
+
+# 3. Update monk's Authentik .env to point to kscloud1's Tailscale IP
+#    (edit the file so it contains these lines; they are not shell commands)
+#    AUTHENTIK_POSTGRESQL__HOST=100.123.x.x
+#    AUTHENTIK_REDIS__HOST=100.123.x.x
+
+# 4. Retire monk's local Postgres and Redis
+docker stop authentik-postgres authentik-redis   # Stop, don't delete (keep data as backup)
+
+# 5. Restart monk's Authentik
+docker compose up -d
+```
+
+**Verification:** Logged in from a browser with DevTools open, watching Network tab.
+`/authorize` returned 302 with a code. `/token` returned 200 with a JWT. Done.
+
+**Lesson:** Stateful services with active-active routing need shared state. Any session,
+token, or code stored in one instance's database is invisible to the other instance.
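+
+The "roughly 50%" figure falls out of simple counting. Assuming each of the two OAuth
+requests is routed independently and evenly across the connectors (an assumption made here
+for illustration, not a documented Cloudflare guarantee), the sketch below enumerates every
+pairing. `mismatch_ratio` is a made-up name.
+
+```bash
+# Count (authorize-host, token-host) pairs where the two requests land on different hosts.
+mismatch_ratio() {
+  local hosts=$1 auth token total=0 mismatched=0
+  for (( auth = 0; auth < hosts; auth++ )); do
+    for (( token = 0; token < hosts; token++ )); do
+      total=$(( total + 1 ))
+      if (( auth != token )); then
+        mismatched=$(( mismatched + 1 ))
+      fi
+    done
+  done
+  echo "$mismatched/$total"
+}
+```
+
+`mismatch_ratio 2` prints `2/4`: with two hosts and separate databases, half of all logins
+exchange the code on a host that never saw it. With a shared database the host no longer
+matters, so the failure disappears.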
+
+---
+
+### Issue 2 — Phantom Third Connector in Cloudflare Dashboard
+
+**Symptom:** Cloudflare Tunnel showed 3 active connectors when only 2 were expected
+(monk + kscloud1). Which was the third?
+
+**Investigation:**
+```bash
+# Check running Docker containers for cloudflared
+docker ps | grep cloudflared
+# Shows: one cloudflared container — expected
+
+# Check for non-Docker cloudflared processes
+ps aux | grep '[c]loudflared'   # [c] keeps grep from matching its own process
+# Shows: TWO processes!
+# /usr/bin/cloudflared (system-installed, running as a systemd service)
+# /usr/local/bin/cloudflared (Docker container)
+```
+
+**Root cause:** A cloudflared systemd service was installed separately from the Docker
+container. Both connected to the same tunnel with the same token, registering as separate connectors.
+
+```bash
+# Verify the systemd service
+sudo systemctl status cloudflared
+
+# Fix: disable the systemd service
+sudo systemctl stop cloudflared
+sudo systemctl disable cloudflared
+
+# Verify only one connector process remains
+ps aux | grep '[c]loudflared'   # [c] keeps grep from matching its own process
+```
+
+**Verification:** Cloudflare dashboard refreshed to show 2 connectors within 30 seconds.
+
+**Lesson:** A service installed via package manager AND in Docker is a recipe for duplicate
+processes. Check both `docker ps` and `ps aux` when troubleshooting unexpected behavior.
+
+---
+
+### Issue 3 — Karakeep SSO "Redirect URI Error"
+
+**Symptom:** After configuring Authentik OAuth2 for Karakeep, clicking "Sign in"
+showed "Redirect URI Error: The provided redirect_uri does not match any of the
+allowed redirect URIs" from Authentik.
+
+**Investigation:**
+```bash
+# Check which redirect URI appeared in the OAuth2 request, according to Authentik's logs
+# (2>&1 so grep also sees anything the container wrote to stderr)
+docker logs authentik --tail 100 2>&1 | grep "redirect_uri"
+# Shows: redirect_uri=https://links.kitestacks.com/api/auth/callback/authentik
+```
+
+**Root cause:** Karakeep uses NextAuth.js internally with provider ID `custom`.
+NextAuth constructs callback URLs as `/api/auth/callback/<provider-id>`.
+The provider ID is `custom`, not `authentik`.
+
+So the callback is `/api/auth/callback/custom`, not `/api/auth/callback/authentik`.
+
+**Fix:**
+```bash
+# Update Authentik's OAuth2 provider for Karakeep in the shared Postgres.
+# The heredoc feeds the SQL to psql non-interactively (hence -i rather than -it).
+docker exec -i authentik-postgres psql -U authentik -d authentik <<'SQL'
+BEGIN;
+UPDATE authentik_providers_oauth2_oauth2provider
+SET _redirect_uris = '["https://links.kitestacks.com/api/auth/callback/custom"]'
+WHERE name = 'Karakeep';
+COMMIT;
+
+-- Verify
+SELECT name, _redirect_uris FROM authentik_providers_oauth2_oauth2provider WHERE name = 'Karakeep';
+SQL
+```
+
+Restart Authentik on both hosts:
+```bash
+docker compose restart authentik authentik-worker
+# Wait for healthy before testing
+```
+
+**Lesson:** When you get a redirect URI mismatch, always check what URI the APP is
+actually sending — not what you think it should send. The app's logs or browser DevTools
+Network tab show the actual request.
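+
+One way to see the URI exactly as the app sent it is to pull the `redirect_uri` parameter out
+of the `/authorize` URL copied from the DevTools Network tab. A minimal sketch, assuming bash;
+`extract_redirect_uri` is a hypothetical helper (not part of Karakeep or Authentik):
+
+```bash
+#!/usr/bin/env bash
+# Hypothetical helper: print the URL-decoded redirect_uri from an OAuth2 /authorize URL.
+extract_redirect_uri() {
+  local url="$1" query pair value
+  local -a pairs
+  query="${url#*\?}"                    # keep only the part after the first '?'
+  IFS='&' read -ra pairs <<< "$query"   # split the query string on '&'
+  for pair in "${pairs[@]}"; do
+    if [[ "$pair" == redirect_uri=* ]]; then
+      value="${pair#redirect_uri=}"
+      value="${value//+/ }"             # '+' encodes a space in query strings
+      printf '%b\n' "${value//%/\\x}"   # rewrite %XX as \xXX and let printf decode it
+      return 0
+    fi
+  done
+  return 1                              # no redirect_uri parameter found
+}
+```
+
+Paste the full request URL as the only argument, then compare the printed value character by
+character with the provider's allowed redirect URIs.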
+
+---
+
+### Issue 4 — Kavita OIDC Config Gets Wiped on Restart
+
+**Symptom:** Configured Kavita's OIDC settings by editing `kavita.db` directly
+(using sqlite3). Settings looked correct in the DB. After `docker compose restart kavita`,
+the OIDC config was reset to empty/disabled.
+
+**Investigation:**
+```bash
+# Check the ServerSetting row before and after restart
+docker exec -it kavita sqlite3 /kavita/config/kavita.db \
+  "SELECT Value, RowVersion FROM ServerSetting WHERE \"Key\"=40;"
+# Before restart: {"enabled":true,"authority":"...","clientId":"kavita",...}, RowVersion=8
+# After restart: {"enabled":false,"authority":"","clientId":"","clientSecret":"",...}, RowVersion=10
+# RowVersion incremented by 2 — Kavita wrote to the row twice during startup
+```
+
+**Root cause:** Kavita validates and resets `ServerSetting` rows during startup from
+its own defaults. Any value that does not pass Kavita's internal validation (including
+OIDC config with the wrong format) gets reset to defaults. Direct SQL writes do not
+go through Kavita's validation pipeline, so they get overwritten.
+
+**Fix:** Use Kavita's own Settings UI via SSH port forwarding to bypass Cloudflare
+and reach kscloud1's Kavita directly:
+
+```bash
+# Forward kscloud1's Kavita port to localhost
+ssh -L 5099:localhost:5000 -i ~/.ssh/id_ed25519_kscloud1 kenpat@100.123.x.x -N &
+# Now visit http://localhost:5099 in browser
+# Log in with your Kavita credentials
+# Settings → OIDC → configure there
+# Click Save → changes survive restart
+```
+
+**Verification:** After saving in the UI, confirmed that `RowVersion` no longer incremented across a restart.
+
+**Lesson:** Do not write directly to application databases unless you know the app does not
+reinitialize those values on startup. Use the application's own APIs or UI.
+
+**Critical detail:** The Authority URL MUST have a trailing slash:
+`https://auth.kitestacks.com/application/o/kavita/`
+Without it: "issuer does not match" error, because Authentik's `openid-configuration`
+returns an `issuer` field that includes the trailing slash, and Kavita compares them exactly.
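+
+The exact-match rule can be checked before touching Kavita. The sketch below is an assumption
+about how one might script it, not how Kavita performs the comparison; `fetch_issuer` and
+`issuer_matches` are hypothetical helper names, and `fetch_issuer` assumes `curl` and `jq`
+are installed:
+
+```bash
+#!/usr/bin/env bash
+# Hypothetical helper: read the issuer that the discovery document advertises.
+fetch_issuer() {
+  # ${1%/} drops one trailing slash so the discovery URL is built the same either way
+  curl -fsS "${1%/}/.well-known/openid-configuration" | jq -r '.issuer'
+}
+
+# Hypothetical helper: succeed only when the two strings are byte-for-byte identical.
+issuer_matches() {
+  [[ "$1" == "$2" ]]
+}
+
+# Example usage (needs network access, so it is left commented out):
+# authority='https://auth.kitestacks.com/application/o/kavita/'
+# issuer_matches "$authority" "$(fetch_issuer "$authority")" || echo "issuer mismatch"
+```
+
+If the comparison fails only when the trailing slash is removed from the authority, the
+configured value is the problem rather than Authentik.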
+
+---
+
+### Issue 5 — SSO Login Fails After monk Reconnects
+
+**Symptom:** When monk went offline and came back, SSO logins failed for 5–10 minutes
+with `invalid_grant`, then started working again.
+
+**Investigation:**
+Timeline reconstruction:
+- T+0: monk goes offline (power or network)
+- T+0: kscloud1 handles all traffic solo — SSO works fine, codes stored in shared DB
+- T+5min: monk comes back online, cloudflared reconnects
+- T+5min to T+8min: monk's Authentik is still starting (container startup takes ~3–4 min)
+- During this window: Cloudflare routes some `/authorize` to kscloud1, some `/token` to monk
+- Monk's Authentik hasn't finished starting — it responds with errors or invalid state
+
+**Root cause:** The OAuth2 authorization code has a 1-minute TTL (default). Monk's Authentik
+takes 3–5 minutes to fully start. During startup, Cloudflare is already routing traffic to
+monk's cloudflared (which is running), but monk's Authentik is not ready.
+
+Codes created on kscloud1 expire before monk's Authentik is healthy enough to serve them.
+
+**Fix:** Increase the OAuth2 code TTL from 1 minute to 10 minutes:
+
+```bash
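+# Raise the authorization code lifetime in the shared Postgres (heredoc, so -i rather than -it).
+# There is no WHERE clause: this applies to every OAuth2 provider.
+docker exec -i authentik-postgres psql -U authentik -d authentik <<'SQL'
+UPDATE authentik_providers_oauth2_oauth2provider
+SET access_code_validity = '00:10:00';
+SQL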
+```
+
+Restart both Authentik instances. Now codes have a 10-minute window — enough for monk
+to finish starting before the code expires.
+
+**Alternative/additional fix:** Add a health check to monk's cloudflared or Authentik
+that keeps cloudflared from accepting traffic until Authentik is healthy.
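+
+A rough sketch of that gating idea, assuming bash: retry a readiness probe and start
+cloudflared only once it succeeds. `wait_until_ready` is a hypothetical helper. The probe URL
+in the usage comment assumes Authentik's HTTP port (9000) is reachable on the host and that
+the connector container is named `cloudflared`; adjust both to the real setup.
+
+```bash
+#!/usr/bin/env bash
+# wait_until_ready MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
+# Runs COMMAND until it exits 0, at most MAX_TRIES times, sleeping between attempts.
+wait_until_ready() {
+  local max_tries="$1" delay="$2" i
+  shift 2
+  for ((i = 1; i <= max_tries; i++)); do
+    if "$@" >/dev/null 2>&1; then
+      return 0                      # probe succeeded: the service is ready
+    fi
+    if ((i < max_tries)); then
+      sleep "$delay"                # wait before the next attempt
+    fi
+  done
+  return 1                          # gave up after MAX_TRIES attempts
+}
+
+# Example usage (assumed port and container name, left commented out):
+# wait_until_ready 60 5 curl -fsS http://localhost:9000/-/health/ready/ \
+#   && docker start cloudflared
+```
+
+With this in place the connector would only register with Cloudflare after Authentik answers
+its readiness probe, which closes the startup window described above.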
+
+---
+
+### Issue 6 — kscloud1 SSH Key Auth Broken After Long Absence
+
+**Symptom:** After not connecting to kscloud1 for several weeks, `ssh kenpat@kscloud1`
+returned "Permission denied (publickey)".
+
+**Investigation:**
+```bash
+ssh -v -i ~/.ssh/id_ed25519_kscloud1 kenpat@100.123.x.x
+# Verbose output showed: offered key was not accepted
+# No other errors — key was being offered but rejected
+```
+
+**Root cause:** The `authorized_keys` file on kscloud1 had somehow been reset or corrupted
+(possibly from a VPS maintenance event or snapshot restore).
+
+**Fix:** Use Hetzner's console (web-based terminal that does not require SSH):
+1. Hetzner dashboard → Server → Console
+2. Log in as root (reset root password via Hetzner UI if needed)
+3. Restore the public key:
+
+```bash
+# On kscloud1 via Hetzner console
+mkdir -p /home/kenpat/.ssh
+cat >> /home/kenpat/.ssh/authorized_keys << 'EOF'
+ssh-ed25519 AAAA... your-public-key-here
+EOF
+chmod 700 /home/kenpat/.ssh
+chmod 600 /home/kenpat/.ssh/authorized_keys
+chown -R kenpat:kenpat /home/kenpat/.ssh
+```
+
+**Lesson:** Always keep your public key backed up. Cloud providers (Hetzner, AWS, DigitalOcean)
+all have web-based console access for exactly this situation. Never rely only on SSH for
+access to a remote server.
+
+---
+
+### Issue 7 — ufw Blocking Docker Container to Host Port
+
+**Symptom:** The portal homepage on kscloud1 showed "0%" and "Offline" for the System Status
+widget. On monk it showed real values.
+
+**Investigation:**
+```bash
+# Test the metrics API directly from inside the homepage container on kscloud1
+docker exec homepage-backup curl -s http://host.docker.internal:8000/api/metrics
+# No response after timeout
+
+# Test from host directly
+curl -s http://localhost:8000/api/metrics
+# Returns real metrics immediately
+
+# Check ufw rules
+sudo ufw status verbose
+# default deny incoming — no specific rule for port 8000
+```
+
+**Root cause:** The `kitestacks-metrics-api` container runs with `network_mode: host`, so it
+listens on the host's own network stack. When `homepage-backup` calls
+`host.docker.internal:8000`, the packet arrives at the host with a source IP on the Docker
+bridge network (`172.x.x.x`), and ufw's `default deny incoming` blocks it.
+
+Docker's iptables bypass (which lets published ports work despite ufw) does not apply here
+because the traffic is addressed to the host itself rather than forwarded to a container's
+published port.
+
+**Fix:**
+```bash
+sudo ufw allow from 172.16.0.0/12 to any port 8000 proto tcp
+sudo ufw status verbose   # Verify rule added
+```
+
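+`172.16.0.0/12` spans 172.16.0.0 through 172.31.255.255, which includes Docker's default
+172.x bridge subnets. A network created with a subnet outside that range would need its own rule.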
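+
+To confirm that a particular container address falls inside the allowed range, the CIDR
+arithmetic can be reproduced in bash. This is an illustrative sketch; `ip_to_int` and
+`in_cidr` are hypothetical helper names, and it handles IPv4 only:
+
+```bash
+#!/usr/bin/env bash
+# Convert a dotted-quad IPv4 address to a 32-bit integer.
+ip_to_int() {
+  local a b c d
+  IFS=. read -r a b c d <<< "$1"
+  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
+}
+
+# in_cidr IP NETWORK/PREFIX -> exit 0 if IP lies inside the network, 1 otherwise.
+in_cidr() {
+  local ip="$1" net="${2%/*}" bits="${2#*/}" mask
+  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))   # e.g. /12 -> 0xFFF00000
+  (( ($(ip_to_int "$ip") & mask) == ($(ip_to_int "$net") & mask) ))
+}
+
+# Example usage:
+# in_cidr 172.17.0.2 172.16.0.0/12 && echo "covered by the ufw rule"
+```
+
+The container's address can be read from `docker network inspect kitestacks` and passed in as
+the first argument.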
+
+**Verification:**
+```bash
+docker exec homepage-backup curl -s http://host.docker.internal:8000/api/metrics
+# Now returns: {"cpu_percent": 4.2, "ram_percent": 71.3, ...}
+```
+
+---
+
+## General Troubleshooting Cheatsheet
+
+| Symptom | First Commands to Run |
+|---------|----------------------|
+| Container won't start | `docker logs <container>` |
+| Container starts then crashes | `docker logs <container> --tail 30` |
+| Can't reach service from browser | `docker exec cloudflared curl -s http://<service>:<port>` |
+| SSL/TLS error in browser | `curl -sv https://yourdomain.com` (check Cloudflare is resolving) |
+| SSO failing with invalid_grant | Check both Authentik instances point to same shared Postgres |
+| Database error | Check data directory permissions: `ls -la ./data/` |
+| Port already in use | `sudo ss -tlnp \| grep :<port>` |
+| Out of disk space | `df -h` and `docker system df` |
+| Out of RAM | `free -h` and `docker stats --no-stream` |
+| Can't ping between containers | `docker network inspect kitestacks` |
+| Forgejo 502 | `docker logs forgejo` — likely DB connection issue |
+| Authentik won't start | Check it can reach `$KSCLOUD1_TAILSCALE:5432` (Tailscale up?) |
--- a/homelab-mastery/certifications/roadmap.md
+++ b/homelab-mastery/certifications/roadmap.md
@ -132,12 +132,12 @@ Given where you are today:

 | Timeframe | Milestone |
 |-----------|-----------|
-| Next 1–2 months | CompTIA A+ Core 2 ✅ |
-| Months 3–8 | CCNA |
-| Months 9–11 | AWS SAA-C03 |
-| Months 12–14 | AWS SysOps Associate |
-| Months 15–18 | CKA (or CompTIA Cloud+) |
-| Months 18+ | AI/ML certs |
+| **July 7, 2026** | **CompTIA A+ Core 2** — exam goal (hard deadline July 12) |
+| Months 1–6 after A+ | CCNA |
+| Months 7–9 after A+ | AWS SAA-C03 |
+| Months 10–12 after A+ | AWS SysOps Associate |
+| Months 13–16 after A+ | CKA (or CompTIA Cloud+) |
+| Months 16+ after A+ | AI/ML certs |

 ---

--- a/homelab-mastery/concepts/oauth2-oidc.md
+++ b/homelab-mastery/concepts/oauth2-oidc.md
@ -6,16 +6,16 @@ This is the concept that most people get wrong. Understanding it cold will impre

 ## The Problem SSO Solves

-Without SSO: 9 services = 9 separate user databases. To add a friend:
+Without SSO: 11 services = 11 separate user databases. To add a friend:
 - Create account in Forgejo
 - Create account in Grafana
 - Create account in Open WebUI
 - Create account in Kavita
- ... 9 times
+- ... 11 times

-To remove their access: 9 places to deactivate.
+To remove their access: 11 places to deactivate.

-With SSO: 1 account in Authentik. Access to all 9 services. Deactivate once.
+With SSO: 1 account in Authentik. Access to all 11 services. Deactivate once.

 ---

@ -168,4 +168,4 @@ Authentik acts as a reverse proxy in front of the app. The user authenticates wi

 ## What to Say About SSO

-> *"I implemented single sign-on across all nine services using Authentik as the OIDC identity provider. Each service is registered as an OAuth2 client with a unique client ID and redirect URI. The OAuth2 authorization code flow means user credentials only ever go to Authentik — other services receive a signed JWT and never see the password. I hit a distributed systems issue in production where authorization codes were being invalidated by active-active load balancing across two hosts — I diagnosed it by tracing the OAuth2 flow and fixed it by sharing a single Postgres database between both Authentik instances over a private Tailscale network."*
+> *"I implemented single sign-on across all eleven services using Authentik as the OIDC identity provider. Each service is registered as an OAuth2 client with a unique client ID and redirect URI. The OAuth2 authorization code flow means user credentials only ever go to Authentik — other services receive a signed JWT and never see the password. I hit a distributed systems issue in production where authorization codes were being invalidated by active-active load balancing across two hosts — I diagnosed it by tracing the OAuth2 flow and fixed it by sharing a single Postgres database between both Authentik instances over a private Tailscale network."*
--- a/homelab-mastery/interview-prep/explain-the-project.md
+++ b/homelab-mastery/interview-prep/explain-the-project.md
@ -2,19 +2,19 @@

 ## The 30-Second Version (LinkedIn DM, recruiter screen)

-> *"I built a self-hosted homelab running a public website at kitestacks.com with nine services — including a Git platform, AI assistant, eBook library, monitoring stack, and SSO. It runs on my home PC with a Hetzner cloud VPS as a live failover, connected through Cloudflare Tunnel so no ports are exposed on my home network. Everything is containerized with Docker and documented in a private Forgejo repo."*
+> *"I built a self-hosted homelab running a public website at kitestacks.com with eleven services — including a Git platform, AI assistant, eBook library, bookmark manager, wiki, help desk, monitoring stack, and SSO. It runs on my home PC with a Hetzner cloud VPS as a live failover, connected through Cloudflare Tunnel so no ports are exposed on my home network. Everything is containerized with Docker and documented in a private Forgejo repo."*

 ---

 ## The 2-Minute Version (phone screen, LinkedIn intro)

-> *"I built KiteStacks — a multi-host self-hosted platform running at kitestacks.com. The core is nine services containerized with Docker: a Forgejo Git instance, Grafana monitoring, Authentik for single sign-on, Open WebUI for AI access, Kavita for reading, Karakeep for bookmarks, OpenProject for tasks, Uptime Kuma for monitoring, and a custom portal I built myself.*
+> *"I built KiteStacks — a multi-host self-hosted platform running at kitestacks.com. The core is eleven services containerized with Docker: a custom portal, Forgejo Git instance, Authentik for single sign-on, Open WebUI for AI access, Karakeep for bookmarks, Kavita for reading, Grafana with Prometheus for monitoring, Uptime Kuma for uptime checks, BookStack for documentation, OSTicket for help desk, and Portainer for container management.*
 >
 > *It runs on my home machine with a Hetzner VPS as a permanent cloud replica — active-active load balanced through Cloudflare Tunnel so the site stays up even when I'm traveling and my home network is down.*
 >
 > *The hardest part was a production SSO bug where OAuth2 authorization codes were being invalidated by the active-active routing — I traced the OAuth2 flow, identified it as a split-database problem, and solved it by migrating both hosts to a shared Postgres instance accessible only over a private Tailscale network.*
 >
-> *I'm currently studying for the CCNA to formalize the networking knowledge this project required."*
+> *I'm currently studying for CompTIA A+ Core 2 (exam goal July 2026), with the CCNA next to formalize the networking knowledge this project required."*

 ---

@ -52,7 +52,7 @@ Be ready to go deep on any of these topics. Know the answers cold.

 **"How does the monitoring work?"**

-> *"Prometheus scrapes metrics from two node-exporter instances every 15 seconds — one on the home machine via Docker DNS and one on the Hetzner VPS via its public IP. Grafana visualizes both with the Node Exporter Full dashboard, and you can switch between hosts with an instance picker. Uptime Kuma runs external HTTP checks against all nine public subdomains and would alert me if any went down."*
+> *"Prometheus scrapes metrics from two node-exporter instances every 15 seconds — one on the home machine via Docker DNS and one on the Hetzner VPS via its public IP. Grafana visualizes both with the Node Exporter Full dashboard, and you can switch between hosts with an instance picker. Uptime Kuma runs external HTTP checks against all eleven public subdomains and alerts me if any go down."*

 ---

--- a/homelab-mastery/learning-path/README.md
+++ b/homelab-mastery/learning-path/README.md
@ -2,13 +2,13 @@

 ## Your Advantage

-You don't have a blank canvas. You have a live production system you built. Most people study networking in a textbook. You configured Cloudflare DNS, set up Tailscale, debugged a Docker networking ufw issue, and traced a distributed systems bug in OAuth2. That's hands-on experience that study alone can't replicate.
+You don't have a blank canvas. You have a live production system you built — eleven services running across two hosts with SSO, active-active failover, and shared databases. Most people study networking in a textbook. You configured Cloudflare DNS, set up Tailscale, debugged a Docker networking ufw issue, and traced a distributed systems bug in OAuth2. That's hands-on experience that study alone can't replicate.

 The goal now: attach the vocabulary, depth, and theory to things you've already done.

 ---

-## Phase 1 — Complete A+ Core 2 (Now)
+## Phase 1 — Complete A+ Core 2 (Exam goal: July 7, 2026)

 **Focus areas that directly map to your homelab:**

@ -66,16 +66,18 @@ The CCNA will make everything in your homelab make deeper sense. After CCNA, re-
 |-----|------------------------|
 | EC2 | Hetzner VPS (kscloud1) |
 | S3 | Static file storage |
-| VPC | Docker bridge network |
+| VPC | Docker bridge network (kitestacks) |
 | ALB + CloudFront | Cloudflare Tunnel + edge |
-| RDS | Authentik Postgres |
-| ElastiCache | Authentik Redis |
+| RDS | Shared Postgres on kscloud1 (Authentik + Forgejo) |
+| ElastiCache | Shared Redis on kscloud1 |
 | CloudWatch | Prometheus + Grafana |
 | Route 53 | Cloudflare DNS |
-| IAM | Authentik RBAC / groups |
+| IAM | Authentik RBAC / groups (homelab-admin) |
 | Secrets Manager | .env files (what you'd replace) |
 | ECS / Fargate | Docker Compose (what you use) |
 | VPC Peering | Tailscale overlay |
+| Confluence/SharePoint | BookStack |
+| ServiceNow | OSTicket |

 ---