docs: comprehensive homelab-mastery rewrite with full build guides

Complete documentation suite for KiteStacks covering all 11 services across 2-host active-active architecture. Includes beginner track (with AI, 8 files) and advanced track (without AI, 7 files) with time estimates, real troubleshooting cases, and command-by-command explanations. Updates certifications roadmap to reflect July 7, 2026 A+ Core 2 exam goal.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-19 01:08:43 -05:00
commit 1e8319ee75
parent e3cfa80d98
24 changed files with 5243 additions and 298 deletions
--- a/homelab-mastery/README.md
+++ b/homelab-mastery/README.md
@@ -1,48 +1,109 @@
-# Homelab Mastery — KiteStacks Learning Guide
+# KiteStacks Homelab — Master Guide
 **Owner:** kenpat  
-**Purpose:** Everything needed to understand, explain, rebuild, and build a career around the KiteStacks homelab project.
+**Domain:** kitestacks.com  
 **Status:** Live and running  
 **Last Updated:** 2026-06-19
 ---
-## Your Current Status
+## What Is KiteStacks?
-| Milestone | Status |
+KiteStacks is a self-hosted homelab — a real, production web platform running on two computers
-|-----------|--------|
+that serves eleven public websites to the internet, 24 hours a day, even when the home machine
-| CompTIA A+ Core 1 | ✅ Passed — highest score in class (22 people) |
+is off.
-| CompTIA A+ Core 2 | 🔄 In progress |
+
-| CCNA | 📅 Next |
+It is not a tutorial project. It is not a demo. It runs at a real domain, with real users,
-| Cloud / AI certs | 📅 After CCNA |
+real uptime monitoring, and real failover. Every service is protected by single sign-on (SSO),
 meaning one account unlocks everything. All traffic goes through Cloudflare's global network —
 no ports are open on the home router, and the home IP address is never exposed.
 ### The One-Paragraph Summary
 > *KiteStacks is a self-hosted homelab running eleven public-facing services behind Cloudflare
 > Tunnel with no open ports on the home router. All logins are handled by Authentik — a
 > self-hosted identity provider using OIDC/OAuth2, so one account unlocks every service.
 > A Hetzner cloud VPS (kscloud1) acts as a permanent cloud replica: if the home machine (monk)
 > goes offline, kscloud1 keeps everything running with zero downtime. Both hosts share a single
 > Postgres and Redis database over a private Tailscale VPN, so SSO logins always work regardless
 > of which server answers. Monitoring runs via Prometheus, Grafana, Uptime Kuma, and a desktop
 > Conky widget that shows live kscloud1 service health at a glance.*
 ---
-## What This Repo Is
+## The Two Computers
-You built a production homelab — a real multi-host, highly available web platform with SSO, monitoring, cloud failover, and AI services. Most people learning DevOps do tutorials with fake projects. You have a real one running at a real domain.
+| Name | What It Is | Role |
 |------|-----------|------|
 | **monk** | Home PC (ThinkPad T14s) | Development machine. Code and configs are built here, then pushed to kscloud1. |
 | **kscloud1** | Hetzner VPS in Germany | Always-live production server. Receives what monk pushes. Stays up even if monk is off. |
-This repo exists so you can:
+A third machine — the **Samurai desktop** — will eventually join as a second home connector,
-1. **Understand** what everything does at the conceptual level
+adding more redundancy when it is running.
-2. **Explain it** confidently to a hiring manager, recruiter, or LinkedIn connection
+
-3. **Rebuild it** from scratch on a new machine if you ever need to
+---
-4. **Map it** to real certifications and career paths
+
 ## The Eleven Public Services
 | Service | URL | What It Does |
 |---------|-----|-------------|
 | **Portal** | www.kitestacks.com | The homepage — links to everything, live system stats |
 | **Authentik** | auth.kitestacks.com | SSO login provider — one account for all services |
 | **Forgejo** | gitforge.kitestacks.com | Self-hosted Git — stores all code and documentation |
 | **Open WebUI** | ai.kitestacks.com | AI chat interface (ChatGPT-style, self-hosted) |
 | **Karakeep** | links.kitestacks.com | Bookmark and read-it-later manager |
 | **Kavita** | kavita.kitestacks.com | eBook and manga library |
 | **Grafana** | grafana.kitestacks.com | Monitoring dashboards — CPU, RAM, network |
 | **Uptime Kuma** | status.kitestacks.com | Service uptime status page |
 | **BookStack** | wiki.kitestacks.com | Self-hosted wiki and documentation platform |
 | **OSTicket** | tasks.kitestacks.com | Help desk and ticket tracking system |
 | **Portainer** | portainer.kitestacks.com | Docker container management dashboard |
 ---
 ## Navigation
-| Section | What's Inside |
+| Section | What Is Inside |
-|---------|--------------|
+|---------|---------------|
-| [certifications/](certifications/roadmap.md) | Full cert roadmap for cloud engineering, what each cert proves, study order |
+| [architecture/overview.md](architecture/overview.md) | How the whole system is wired together — diagrams, traffic flow |
-| [architecture/](architecture/overview.md) | How the entire system works, why it was built this way |
+| [architecture/services.md](architecture/services.md) | Every service: container name, port, volume, command reference |
-| [concepts/](concepts/) | Deep dives on every technology: Docker, networking, OAuth2, Tailscale, etc. |
+| [architecture/decisions.md](architecture/decisions.md) | Why each technology was chosen over the alternatives |
-| [build-guide/](build-guide/README.md) | Step-by-step rebuild from a blank machine, with explanations of every decision |
+| [build-guide/README.md](build-guide/README.md) | How to build this from scratch — choose beginner (AI) or advanced |
-| [interview-prep/](interview-prep/explain-the-project.md) | Exactly what to say to hiring managers, common questions + model answers |
+| [concepts/docker.md](concepts/docker.md) | What Docker actually is and how containers work |
-| [learning-path/](learning-path/README.md) | Structured study plan, free resources, what to learn in what order |
+| [concepts/networking.md](concepts/networking.md) | DNS, ports, TLS, Tailscale, Cloudflare Tunnel, firewalls |
 | [concepts/oauth2-oidc.md](concepts/oauth2-oidc.md) | How SSO works — OAuth2, OIDC, JWTs explained simply |
 | [concepts/linux.md](concepts/linux.md) | Linux commands, file ownership, sudo, SSH tunnels |
 | [certifications/roadmap.md](certifications/roadmap.md) | Cert path from A+ to CKA — what to study and in what order |
 | [interview-prep/explain-the-project.md](interview-prep/explain-the-project.md) | What to say to hiring managers — model answers |
 | [learning-path/README.md](learning-path/README.md) | Structured study plan, free resources, daily habits |
 ---
-## The One-Paragraph Project Summary
+## Where to Start
-> *KiteStacks is a self-hosted homelab running nine public-facing services behind Cloudflare Tunnel, with full SSO via Authentik (OIDC/OAuth2), active-active cloud failover on a Hetzner VPS, private networking over Tailscale, and real-time monitoring via Prometheus and Grafana. The platform serves a public domain (kitestacks.com) and stays online even when the primary home machine is off — all running on commodity hardware with no open ports on the home router.*
+**If you want to understand what you built:**
 → [architecture/overview.md](architecture/overview.md)
-That is what you built. Now learn to own every word of it.
+**If you want to rebuild it from scratch:**
 → [build-guide/README.md](build-guide/README.md) — pick your track
 **If you have an interview coming up:**
 → [interview-prep/explain-the-project.md](interview-prep/explain-the-project.md)
 **If you want to understand the tech behind it:**
 → Pick a topic in [concepts/](concepts/)
 **If you want to know what certifications to study next:**
 → [certifications/roadmap.md](certifications/roadmap.md)
 ---
 ## Certification Progress
 | Cert | Status |
 |------|--------|
 | CompTIA A+ Core 1 | ✅ Passed — highest score in class (22 people) |
 | CompTIA A+ Core 2 | 🔄 In progress — exam goal July 7, 2026 |
 | CCNA | 📅 Next after A+ Core 2 |
 | AWS Solutions Architect Associate | 📅 After CCNA |
 | CKA (Kubernetes) | 📅 After AWS certs |
--- a/homelab-mastery/architecture/decisions.md
+++ b/homelab-mastery/architecture/decisions.md
@@ -1,12 +1,16 @@
 # Architecture Decisions — The Why Behind Every Choice
-For every technology choice, there was a reason. Understanding the "why" is what separates someone who copied commands from someone who designed a system.
+For every technology choice, there was a reason. Understanding the "why" is what separates
 someone who copied commands from someone who designed a system.
 **Last Updated:** 2026-06-19
 ---
 ## Why Docker Instead of Running Services Directly?
-**Problem:** Running 15+ services directly on a Linux host creates dependency hell — different Python versions, conflicting library versions, services affecting each other.
+**Problem:** Running 15+ services directly on a Linux host creates dependency conflicts —
 different Python versions, conflicting library versions, services that break each other on updates.
 **Options considered:**
 - Bare metal: install each app directly on the OS
@@ -16,13 +20,15 @@ For every technology choice, there was a reason. Understanding the "why" is what
 **Decision:** Docker
 **Why:**
- Each container has its own filesystem, dependencies, and runtime — they can't conflict
+- Each container has its own filesystem and runtime — they can't conflict
- Starting/stopping/updating one service doesn't affect others
+- Starting, stopping, or updating one service doesn't affect others
- The `docker-compose.yml` file IS the documentation — it shows exactly what the service needs to run
+- The `docker-compose.yml` file IS the documentation — it shows exactly what the service needs
 - Portability: move the same compose file to a new machine and it works identically
- Isolation: if Karakeep gets compromised, it can't easily touch Forgejo's data
+- `restart: unless-stopped` means containers self-heal after a crash or host reboot
-**What you'd say to a hiring manager:** *"I containerized every service using Docker and Docker Compose so each has isolated dependencies and the entire deployment is reproducible from a single YAML file."*
+**What to say in an interview:**
 > *"I containerized every service using Docker Compose so each has isolated dependencies
 > and the entire deployment is reproducible from a single YAML file."*
 ---
@@ -30,170 +36,247 @@ For every technology choice, there was a reason. Understanding the "why" is what
 **Problem:** How do you make home services accessible from the internet?
-**Traditional approach:** Open port 80 and 443 on the home router, configure NAT, point DNS to home IP.
+**Traditional approach:** Open ports 80 and 443 on the home router, configure NAT,
 point DNS to your home IP address.
 **Problems with that:**
- Exposes your home IP address publicly (DDoS risk, can be found, ISP tracks it)
+- Your home IP is public (DDoS risk, can be scanned and targeted)
- Dynamic home IP means DNS breaks every time IP changes
+- Dynamic home IP means DNS breaks every time the ISP changes it
- Some ISPs block residential port 80/443
+- Some ISPs block residential ports 80 and 443
- Router configuration is error-prone and varies by hardware
+- Router configuration is fragile and varies by hardware
 **Decision:** Cloudflare Tunnel (cloudflared)
 **Why:**
- cloudflared makes an OUTBOUND connection to Cloudflare — no inbound ports needed
+- cloudflared makes an outbound connection to Cloudflare — no inbound ports needed at all
- Home IP never exposed
+- Home IP is never exposed to the public internet
- Works regardless of ISP restrictions
+- Works on any ISP, any network, any firewall
- Cloudflare handles TLS/HTTPS — you don't manage SSL certificates
+- Cloudflare handles TLS certificates automatically (no Let's Encrypt setup)
 - Free tier covers everything needed
- Bonus: built-in DDoS protection
+- Built-in DDoS protection at Cloudflare's edge
-**The trade-off:** You depend on Cloudflare. If Cloudflare has an outage, your site goes down even if your hardware is fine. This is acceptable — Cloudflare's uptime is better than most home internet connections.
+**The tradeoff:** You depend on Cloudflare. If Cloudflare has an outage, your site goes down
 even if your hardware is fine. Acceptable — Cloudflare's uptime exceeds most home ISPs.
 ---
-## Why Authentik for SSO Instead of Separate Logins Per App?
+## Why Authentik for SSO?
-**Problem:** 9 services means 9 different usernames and passwords to manage. Adding a user requires going into 9 admin panels. Removing access means 9 places to deactivate.
+**Problem:** Eleven services means eleven separate usernames and passwords. Adding a user
 means eleven admin panels. Removing access means eleven places to deactivate.
 **Options:**
- Separate logins per service (no SSO)
+- No SSO — separate logins per service
- Authelia (simpler, forward-auth proxy only)
+- Authelia — simpler, forward-auth proxy only
- Authentik (full OIDC provider, more complex)
+- Authentik — full OIDC provider, more complex to set up
- Keycloak (enterprise-grade, very heavy)
+- Keycloak — enterprise-grade, very heavy on RAM
 **Decision:** Authentik
 **Why:**
 - One account controls access to everything
- Apps that support native OIDC (Grafana, Kavita, Open WebUI, Karakeep) get real SSO — the user is authenticated inside the app
+- Apps that support native OIDC (Grafana, Kavita, Karakeep, Open WebUI, Portainer, BookStack,
- Can restrict which groups can access which applications (Portainer restricted to homelab-admin group)
+  Forgejo) get real SSO — user is authenticated inside the app with a JWT, not just at a proxy
- Self-hosted — user data stays on your infrastructure
+- Access policies per application (Portainer restricted to `homelab-admin` group only)
- Authentik supports both native OIDC (for apps that support it) and proxy provider (for apps that don't)
+- Self-hosted — user data never leaves your infrastructure
-**The trade-off:** Authentik is complex to set up and has a significant memory footprint. Authelia would be simpler. But Authelia only does forward-auth proxy — it can't give an app a real JWT. Authentik does both.
+**Why not Authelia:** Authelia is primarily a forward-auth proxy. It blocks access to an app until
 the user has authenticated, but the app itself receives no signed identity. Authentik sends a real JWT
 with user email and name — apps can create user accounts automatically on first login.
 ---
 ## Why a Shared Postgres Instead of Separate Authentik Databases?
-**Problem:** After setting up active-active failover, users kept getting `invalid_grant` errors when signing in through SSO.
+**Problem:** After deploying two Cloudflare Tunnel connectors, users got `invalid_grant`
 errors when signing in through SSO — roughly 50% of the time.
-**Root cause:** OAuth2 authorization codes are rows in a database. The flow is:
+**Root cause:** OAuth2 authorization codes are short-lived rows in a database.
 1. `/authorize` → code stored in Database A (monk's Authentik)
 2. `/token` → looks for code in Database B (kscloud1's Authentik)
 3. Code not found → `invalid_grant`
-Cloudflare Tunnel load-balances between monk and kscloud1 for every HTTP request. Steps 1 and 2 of the OAuth flow can hit different hosts.
+```
 Step 1: /authorize → creates code → stored in monk's Authentik DB
 Step 2: /token     → looks for code → hits kscloud1's Authentik DB → NOT FOUND
 ```
 Cloudflare load-balances every HTTP request independently. Steps 1 and 2 of the OAuth2
 flow can hit completely different hosts. The code exists in one database but not the other.
 **Options:**
- Sync databases continuously (complex, slow, conflict-prone)
+- Sync both databases continuously (complex, slow, conflict-prone)
 - Use sticky sessions (Cloudflare paid feature)
- Share one database (simple, reliable)
+- Share one database between both Authentik instances
-**Decision:** Shared Postgres on kscloud1, accessible only over Tailscale
+**Decision:** Single shared Postgres + Redis hosted on kscloud1, accessible only over Tailscale
 **Why:**
- Both monk and kscloud1 Authentik read/write the same database — authorization codes always found
+- Both connectors' Authentik instances read and write the same database
- Tailscale binding means the database is never exposed to the public internet (security)
+- Authorization codes are always found regardless of which host handles which request
- Simple: one line change in each `docker-compose.yml` to point to a different host
+- Database is bound to kscloud1's Tailscale IP — never reachable from the public internet
- Cost: free (already paying for kscloud1)
+- Simple configuration change: one environment variable pointing to the shared host
-**The trade-off:** If kscloud1 goes down and Tailscale connectivity breaks, monk's Authentik can't start. Rollback procedure: restore monk's compose to use a local Postgres.
+**The tradeoff:** If kscloud1 goes down or the Tailscale link breaks, monk's Authentik can't connect
 to the database and fails to start. Rollback: restore local Postgres in monk's compose file.
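
The `invalid_grant` failure and the shared-database fix described in this section can be sketched as a small simulation. This is a toy model, not Authentik's code: `AuthServer`, `run_flow`, `failure_rate`, the user name `alice`, and the plain dicts standing in for Postgres tables are all hypothetical, and uniform random routing is an assumption made for illustration.

```python
import random
import secrets


class AuthServer:
    """Toy stand-in for one Authentik instance backed by a code store."""

    def __init__(self, name, code_store):
        self.name = name
        self.code_store = code_store  # dict standing in for a Postgres table

    def authorize(self, user):
        # Step 1: issue a one-time authorization code and persist it.
        code = secrets.token_hex(8)
        self.code_store[code] = user
        return code

    def token(self, code):
        # Step 2: redeem the code; a code this store has never seen is rejected.
        user = self.code_store.pop(code, None)
        if user is None:
            return "invalid_grant"
        return f"token-for-{user}"


def run_flow(servers, rng):
    """Route each step of one login to an independently chosen server."""
    code = rng.choice(servers).authorize("alice")
    return rng.choice(servers).token(code)


def failure_rate(servers, attempts=1000, seed=0):
    """Fraction of simulated logins that end in invalid_grant."""
    rng = random.Random(seed)
    failures = sum(run_flow(servers, rng) == "invalid_grant" for _ in range(attempts))
    return failures / attempts


# Separate databases: each host only knows the codes it issued itself.
split = [AuthServer("monk", {}), AuthServer("kscloud1", {})]

# Shared database: both hosts read and write the same store.
shared_store = {}
shared = [AuthServer("monk", shared_store), AuthServer("kscloud1", shared_store)]

print(f"split databases: {failure_rate(split):.0%} of logins fail")
print(f"shared database: {failure_rate(shared):.0%} of logins fail")
```

With separate stores, roughly half of the simulated logins fail, which matches the symptom described above. With one shared store, the code is found no matter which host handles `/token`.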
 ---
 ## Why Tailscale Instead of WireGuard or OpenVPN?
-**Problem:** Need private networking between monk (home) and kscloud1 (Hetzner cloud) without exposing the Authentik database to the public internet.
+**Problem:** Need private networking between monk (home) and kscloud1 (Hetzner cloud).
 The shared Authentik database must not be exposed to the public internet.
 **Options:**
- WireGuard: manual key exchange, manual routing, technical to configure
+- WireGuard: manual key exchange, manual routing, hard to configure through NAT
- OpenVPN: even more complex, slower
+- OpenVPN: complex, slower, more overhead
 - Tailscale: managed WireGuard, automatic key exchange, works behind NAT
 **Decision:** Tailscale
 **Why:**
- Works instantly — install, authenticate, done
+- Works in minutes: install, authenticate, done
- Handles NAT traversal automatically (monk is behind home router NAT)
+- Handles NAT traversal automatically — monk is behind home router NAT
- Devices get stable 100.x.x.x IPs regardless of actual network location
+- Every device gets a stable `100.x.x.x` IP regardless of location
 - Free for up to 100 devices
- Uses WireGuard under the hood — same encryption, much easier configuration
+- WireGuard underneath — same encryption, much easier operation
-**The trade-off:** Tailscale is a managed service — you trust Tailscale's coordination servers. The actual data is encrypted peer-to-peer (Tailscale can't see it), but they control device authentication. Self-hosted alternative: Headscale.
+**The tradeoff:** You trust Tailscale's coordination servers to manage device authentication.
 Actual data is encrypted peer-to-peer (Tailscale never sees it), but they control who can
 join your network. Self-hosted alternative if needed: Headscale.
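
Since the shared database is meant to be reachable only over Tailscale, a pre-flight check on the bind address can catch a misconfiguration before it is deployed. The sketch below assumes that Tailscale hands out IPv4 addresses from the CGNAT block `100.64.0.0/10`, which is narrower than "any `100.x.x.x` address". The helper names and the example addresses are hypothetical.

```python
import ipaddress

# Tailscale assigns node addresses from the CGNAT block 100.64.0.0/10,
# which spans 100.64.0.0 through 100.127.255.255.
TAILSCALE_RANGE = ipaddress.ip_network("100.64.0.0/10")


def is_tailscale_ip(address: str) -> bool:
    """Return True if the address falls inside the Tailscale IPv4 range."""
    return ipaddress.ip_address(address) in TAILSCALE_RANGE


def safe_bind_address(address: str) -> str:
    """Refuse to bind a private service to anything outside the tailnet range."""
    if not is_tailscale_ip(address):
        raise ValueError(f"{address} is not a Tailscale address; refusing to bind")
    return address


# Hypothetical addresses, for illustration only.
for candidate in ("100.101.102.103", "100.1.2.3", "0.0.0.0"):
    print(candidate, is_tailscale_ip(candidate))
```

A check like this could run in a deploy script before Postgres or Redis is started, so that a bind address of `0.0.0.0` is rejected instead of silently exposing the port.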
 ---
-## Why Active-Active Instead of Active-Passive Failover?
+## Why Active-Active Failover Instead of Active-Passive?
-**The context:** The user travels. When away from home, monk might be inaccessible (home network down, ISP outage, power). kscloud1 should keep the site running.
+**The situation:** The user travels. When away from home, monk may be unreachable.
 kscloud1 must keep the site running.
-**Active-Passive:** kscloud1 only starts serving if monk is detected as down. Cloudflare would need health checks and failover rules.
+**Active-Passive:** kscloud1 only starts serving if Cloudflare detects monk as down.
 Requires health checks, failover rules, and a delay before traffic switches.
-**Active-Active:** Both monk and kscloud1 are always in the Cloudflare Tunnel rotation. Every request might hit either host.
+**Active-Active:** Both monk and kscloud1 are always in the Cloudflare Tunnel rotation.
 Every request may hit either host at any time.
 **Decision:** Active-Active
 **Why:**
- Simpler: no health checks to configure, no failover logic
+- No failover logic needed — both are always live
- Instant: if monk goes down, kscloud1 is already handling 50% of traffic
+- Instant: if monk goes down, kscloud1 is already handling traffic
- Free: Cloudflare Tunnel active-active is free; health-check-based failover requires paid plans
+- Free: Cloudflare Tunnel active-active is included; health-check-based failover is paid
-**The trade-off:** Stateful apps (Forgejo, OpenProject, Kavita) have separate databases on each host. A user might see different data depending on which host answers. This was explicitly accepted: the point is uptime, not data consistency across hosts.
+**The tradeoff:** Stateful apps with separate databases (Kavita, Karakeep) may show
 different data depending on which host answers. Explicitly accepted — the priority is
 uptime, not data consistency across hosts. Forgejo and Authentik share databases so
 they are consistent.
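
The active-active behavior can be illustrated with a short routing simulation. It is only a sketch: `route` and `simulate` are hypothetical helpers, and the uniform random choice between healthy connectors is an assumption, since this guide does not specify how Cloudflare actually selects a connector.

```python
import random
from collections import Counter


def route(connectors, rng):
    """Pick one healthy connector; return None if none are healthy."""
    healthy = [name for name, is_up in connectors.items() if is_up]
    return rng.choice(healthy) if healthy else None


def simulate(connectors, requests=1000, seed=0):
    """Count how many requests each connector would serve."""
    rng = random.Random(seed)
    return Counter(route(connectors, rng) for _ in range(requests))


# Both hosts up: traffic is spread across monk and kscloud1.
print(simulate({"monk": True, "kscloud1": True}))

# monk offline: kscloud1 absorbs every request with no failover step.
print(simulate({"monk": False, "kscloud1": True}))
```

The second run shows the point of the design: there is no switch-over event to wait for, because kscloud1 was already in the rotation before monk went away.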
 ---
-## Why nginx for the Portal Instead of a Pre-Built Dashboard?
+## Why a Custom Portal Instead of a Pre-Built Dashboard?
 **Options:**
- gethomepage (what was used before) — nice but limited customization
+- Homepage (gethomepage) — nice but limited customization
 - Heimdall — similar limitations
- Custom static site + nginx — full control
+- Custom static HTML/CSS/JS + nginx — full control, full ownership
-**Decision:** Custom static HTML/CSS/JS + nginx
+**Decision:** Custom static site
 **Why:**
- Complete visual control — the cyberpunk theme, the layout, every pixel
+- Complete visual control — the cyberpunk theme, layout, every card, every color
- Static files served by nginx are extremely fast and reliable
+- Static files + nginx are extremely fast and reliable (no Node.js, no build step)
- Can proxy the metrics API for real-time stats without CORS issues
+- nginx proxies the `/api/*` endpoints to the metrics API without CORS issues
- No framework dependencies — no Node.js, no build step, just files
+- No dependency on external frameworks that can change or break
-**The trade-off:** More work to build and maintain than a pre-built dashboard. But you now understand every line of it.
+**The tradeoff:** More work to build and maintain. But you understand every line of it,
 and you can explain exactly why every piece is there.
 ---
 ## Why Python + FastAPI for the Metrics API?
-**Problem:** The portal needs real-time system stats (CPU, RAM, network), weather, and Forgejo activity. These can't come from static HTML files.
+**Problem:** The portal needs live system stats (CPU, RAM, network), weather, and
 Forgejo git activity. Static HTML can't provide these.
-**Options:**
+**Decision:** Python FastAPI with `psutil`
 - Shell scripts + cron → write stats to a JSON file the frontend reads
 - Node.js + Express
 - Python + FastAPI
 **Decision:** Python FastAPI
 **Why:**
- Python's `psutil` library reads system metrics with one line of code
+- `psutil` reads host system metrics in one line of Python
- FastAPI is modern, fast, and automatically documents the API
+- FastAPI auto-generates API documentation and handles async requests well
 - Python is readable — easy to understand and modify
 - `async/await` means the API doesn't block while waiting for weather API responses
 - Python is readable — you can understand and modify the code
-**The special requirement:** The container needs `network_mode: host` and `pid: host`. Without these:
+**Special requirements:**
- `network_mode: host`: the container can see the host's network interfaces and report real network throughput (not container-level)
+- `network_mode: host` — container shares host network namespace so psutil sees real
- `pid: host`: psutil can read the host's `/proc` filesystem, showing actual system stats instead of container stats
+  network interfaces, not the container's virtual interface
 - `pid: host` — container can read the host's `/proc` filesystem for accurate process stats
 Without these flags, the API would report container-level stats instead of actual laptop stats.
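
To make the `network_mode: host` requirement concrete: on Linux, per-interface byte counters are exposed in `/proc/net/dev`, and that file is scoped to the network namespace of the process reading it. The stdlib-only sketch below parses that format. It illustrates the data source and is not the metrics API's actual code; `parse_net_dev`, `read_host_counters`, and the sample text are hypothetical.

```python
from pathlib import Path


def parse_net_dev(text: str) -> dict:
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev content."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are column headers
        if ":" not in line:
            continue
        name, values = line.split(":", 1)
        fields = values.split()
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def read_host_counters(path: str = "/proc/net/dev") -> dict:
    """Read live counters; only meaningful on a Linux host."""
    return parse_net_dev(Path(path).read_text())


# Made-up sample in the /proc/net/dev layout, for illustration only.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1200      10    0    0    0     0          0         0     1200      10    0    0    0     0       0          0
 wlan0:  500000     400    0    0    0     0          0         0   250000     300    0    0    0     0       0          0
"""

print(parse_net_dev(SAMPLE))
```

Inside a container with its own network namespace, the same read would list only the container's virtual interface, which is why the portal would show container traffic instead of the laptop's real throughput.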
 ---
-## Why the Forgejo Repo for Documentation?
+## Why Forgejo Instead of GitHub or GitLab?
-You could keep documentation in Notion, Google Docs, or a wiki.
+**Problem:** Need to store all homelab code, configs, and documentation in version control.
-**Why Forgejo:**
+**Options:**
- It's self-hosted — you own the data
+- GitHub: free, reliable, but your configs and docs are on someone else's server
- Git tracks every change with a timestamp and message
+- GitLab: self-hostable but heavy (4GB+ RAM for full install)
- The documentation lives alongside the configs it describes
+- Forgejo: lightweight GitHub-like self-hosted Git, fork of Gitea
 - Hiring managers can see the commit history and read your documentation directly
-**What this shows to a hiring manager:** You treat documentation like code — version-controlled, structured, maintained.
+**Decision:** Forgejo
 **Why:**
 - Self-hosted — configs and documentation stay on your infrastructure
 - Very lightweight — uses less than 100MB RAM
 - GitHub-compatible API — tools that work with GitHub also work with Forgejo
 - Full UI with code review, issues, CI/CD (Forgejo Actions)
 - Shows commit history and documentation to anyone you give access to
 **The tradeoff:** You maintain it yourself. If Forgejo goes down, git operations fail.
 Mitigated by kscloud1 running a replica and the shared Postgres.
 ---
 ## Why OSTicket for the Help Desk?
 **What it replaced:** OpenProject (project management tool on tasks.kitestacks.com)
 **Why OpenProject was removed:**
 - OpenProject CE (Community Edition) requires an Enterprise Edition license for SSO
 - The SSO button simply does not appear in CE — it is a hard paywall with no workaround
 - OpenProject is also resource-heavy for what it provides
 **Why OSTicket:**
 - Lightweight and runs well on the existing stack
 - Email integration works (SMTP via Gmail app password — confirmed working)
 - Handles the ticket/task tracking use case without the licensing barrier
 ---
 ## Why BookStack for the Wiki?
 **Problem:** Need a place for long-form documentation that's more structured than markdown files.
 **Decision:** BookStack
 **Why:**
 - Clean, organized UI: Shelves → Books → Chapters → Pages hierarchy
 - WYSIWYG editor — easy to write docs without markdown syntax
 - Authentik OIDC SSO works natively
 - API available — docs can be pushed programmatically from scripts or CI
 **Key gotcha:** Cache directory must be writable by the container user.
 `chown -R abc:users /config/www/framework/cache/` is required after first install.
 ---
 ## Why a Shared Postgres for Forgejo?
 **Problem:** With two connectors in active-active, Forgejo on monk and kscloud1 had
 separate SQLite databases. Repos created on one weren't visible on the other.
 **Fix:** Migrated both Forgejo instances to a single shared PostgreSQL database on kscloud1
 (same shared server as Authentik's Postgres). Both connectors now serve identical Forgejo data.
 **How it was done:**
 - `forgejo dump --database postgres` — exported clean SQL from monk's Forgejo
 - Dropped the pgloader schema (had wrong structure), reloaded the clean SQL
 - Both compose files point to `authentik-postgres:5432` database `forgejo`, user `forgejo`
 - kscloud1's Forgejo joined the `authentik_default` Docker network to reach authentik-postgres
--- a/homelab-mastery/architecture/overview.md
+++ b/homelab-mastery/architecture/overview.md
@ -1,138 +1,169 @@
 # KiteStacks Architecture — Full System Overview
 **Last Updated:** 2026-06-19
 ---
 ## The Big Picture
 ```
-                        INTERNET
+                          INTERNET
-                           │
+                             │
-                    ┌──────▼──────┐
+                      ┌──────▼──────┐
-                    │  Cloudflare  │  DNS + TLS termination
+                      │  Cloudflare  │  DNS + TLS termination
-                    │   (edge)     │  Zero Trust Tunnel
+                      │   (edge)     │  Tunnel routing
-                    └──────┬──────┘
+                      └──────┬──────┘
-                           │  HTTPS (443) only
+                             │  HTTPS only — home IP never exposed
-          ┌────────────────┼────────────────┐
+              ┌──────────────┴──────────────┐
-          │ connector 1    │ connector 2    │ connector 3
+              │ connector 1                 │ connector 2
-          │                │               │
+              │                             │
-   ┌──────▼──────┐         │        ┌──────▼──────┐
+       ┌──────▼──────┐               ┌──────▼──────┐
-   │    MONK     │         │        │   KSCLOUD1  │
+       │    MONK     │               │   KSCLOUD1  │
-   │ (home PC)   │         │        │ (Hetzner VPS│
+       │ (ThinkPad   │               │ (Hetzner VPS│
-   │             │  Active │        │  5.78.x.x)  │
+       │  T14s, home)│               │  Germany)   │
-   │ All 9       │  Active │        │             │
+       │             │               │             │
-   │ services    │         │        │ All 9       │
+       │ Development │               │ ALWAYS LIVE │
-   │             │         │        │ services    │
+       │ Pushes to → │               │ Receives ←  │
-   └──────┬──────┘         │        └──────┬──────┘
+       │ kscloud1    │               │ from monk   │
-          │                │               │
+       └──────┬──────┘               └──────┬──────┘
-          └────────────────┼───────────────┘
+              │                             │
-                     TAILSCALE VPN
+              └─────────── TAILSCALE ───────┘
-                    (100.x.x.x range)
+                         (100.x.x.x range)
-                           │
+                         Encrypted peer-to-peer
-                  ┌────────▼────────┐
+                                 │
-                  │  SHARED DB LAYER │
+                    ┌────────────▼────────────┐
-                  │  on kscloud1    │
+                    │    SHARED DATABASE LAYER │
-                  │  Postgres :5432  │
+                    │    hosted on kscloud1    │
-                  │  Redis    :6379  │
+                    │                         │
-                  │  (Tailscale     │
+                    │  PostgreSQL  :5432       │
-                  │   only, private)│
+                    │  Redis       :6379       │
-                  └─────────────────┘
+                    │                         │
                    │  Bound to Tailscale IP   │
                    │  only — not public       │
                    └─────────────────────────┘
 ```
 **The key idea:** Cloudflare holds two persistent outbound connections — one from monk,
 one from kscloud1. Every request to kitestacks.com arrives at Cloudflare, which routes
 it to whichever connector responds. If monk goes offline, kscloud1 handles everything.
 Your home IP is never involved.
 ---
 ## How Work Flows Between the Two Hosts
 ```
 monk (dev)  ──push──►  kscloud1 (prod, always live)
 ```
 - **monk** is where changes are made: editing config files, testing new services, writing code
 - **kscloud1** receives those changes and is always serving live traffic
 - If monk is off, kscloud1 continues serving the last pushed state — users see no downtime
 - A third machine (Samurai desktop) is planned as a future second home connector
 ---
 ## The Eleven Public Services
 | Service | Container | URL | What It Does |
 |---------|-----------|-----|-------------|
 | Portal | `homepage` | www.kitestacks.com | Custom homepage — links, live stats, cyberpunk theme |
 | Authentik | `authentik` | auth.kitestacks.com | SSO identity provider — handles all logins |
 | Forgejo | `forgejo` | gitforge.kitestacks.com | Self-hosted Git (like GitHub) |
 | Open WebUI | `kite-openwebui` | ai.kitestacks.com | AI chat interface |
 | Karakeep | `karakeep` | links.kitestacks.com | Bookmark and read-it-later manager |
 | Kavita | `kavita` | kavita.kitestacks.com | eBook and manga reader |
 | Grafana | `grafana` | grafana.kitestacks.com | Monitoring dashboards |
 | Uptime Kuma | `uptime-kuma` | status.kitestacks.com | Public status page and uptime monitoring |
 | BookStack | `bookstack` | wiki.kitestacks.com | Self-hosted wiki / docs platform |
 | OSTicket | `osticket-app` | tasks.kitestacks.com | Help desk ticketing system |
 | Portainer | `portainer` | portainer.kitestacks.com | Docker management dashboard |
 ## The Infrastructure Services (Internal Only)
 | Container | What It Does |
 |-----------|-------------|
 | `cloudflared` | Cloudflare Tunnel connector — outbound connection to Cloudflare edge |
 | `prometheus` | Metrics collector — scrapes node-exporter every 15 seconds |
 | `node-exporter` | Exposes host CPU/RAM/disk/network metrics for Prometheus |
 | `blackbox-exporter` | HTTP probe monitor — checks endpoints are returning 200 |
 | `kite-litellm` | LLM proxy — routes AI requests to OpenRouter (many free models) |
 | `kitestacks-metrics-api` | Python FastAPI — serves live stats and Forgejo activity to portal |
 | `ntfy` | Push notification server — sends alerts to phone |
 | `flux` | GitOps controller — watches Forgejo, deploys changes automatically |
 | `authentik-worker` | Background job processor for Authentik |
 | `authentik-ldap` | LDAP proxy layer for Authentik |
 ---
 ## How Traffic Flows — Step by Step
 ### Someone visits www.kitestacks.com
 ```
 1. Browser → DNS lookup "www.kitestacks.com"
 2. DNS returns Cloudflare's anycast IP (not your home IP)
 3. Browser → HTTPS request to Cloudflare edge
 4. Cloudflare reads Host header: "www.kitestacks.com"
 5. Cloudflare routes request through active tunnel connector
   (monk or kscloud1 — whichever responds first)
 6. cloudflared resolves "homepage" via Docker DNS
 7. Request hits nginx in the homepage container
 8. nginx serves static HTML/CSS/JS from ./public/
 9. Browser JavaScript calls /api/metrics and /api/activity
 10. nginx proxies those to kitestacks-metrics-api (Python, host network)
 11. metrics-api reads CPU/RAM via psutil (sees real host, not container)
 12. metrics-api calls Forgejo API for recent commits
 13. Browser renders complete page with live stats
 ```
 ### Someone clicks "Sign In with Authentik"
 ```
 1. App (e.g. Grafana) redirects browser to:
   https://auth.kitestacks.com/application/o/authorize/
   ?client_id=grafana&redirect_uri=...&response_type=code
 2. Cloudflare routes this to a cloudflared connector
 3. Authentik shows login page
 4. User enters username + password
 5. Authentik validates against shared Postgres (on kscloud1, over Tailscale)
 6. Authentik creates an authorization code (row in DB) and redirects:
   https://grafana.kitestacks.com/login/generic_oauth?code=abc123
 7. Grafana backend POSTs to auth.kitestacks.com/application/o/token/
   with code=abc123 and client_secret
 8. THIS REQUEST may hit a DIFFERENT connector than step 2 did
   → This is why the shared DB matters: the code must exist in one DB,
     not two separate ones that might be out of sync
 9. Authentik finds code=abc123 in shared Postgres, validates it
 10. Authentik returns JWT (access_token + id_token)
 11. Grafana reads user's email from JWT, creates/updates local user
 12. User is logged in — never re-enters password for other SSO apps
 ```
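Step 1 of the flow above is nothing more than a URL with query parameters. The following is a minimal sketch, not Grafana's actual code: `build_authorize_url` is a hypothetical helper, the `state` value is a placeholder, and the `scope` list is an assumption. Only the endpoint, `client_id`, and `response_type` come from the steps above.

```python
from urllib.parse import urlencode, urlparse, parse_qs

AUTHORIZE_ENDPOINT = "https://auth.kitestacks.com/application/o/authorize/"


def build_authorize_url(client_id: str, redirect_uri: str, state: str) -> str:
    """Build the step-1 redirect URL for the authorization-code flow."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",          # ask for an authorization code
        "scope": "openid email profile",  # assumed scopes, adjust per app
        "state": state,                   # placeholder anti-CSRF value
    }
    return f"{AUTHORIZE_ENDPOINT}?{urlencode(params)}"


url = build_authorize_url(
    "grafana",
    "https://grafana.kitestacks.com/login/generic_oauth",
    "example-state",
)
query = parse_qs(urlparse(url).query)
print(url)
print(query["response_type"])
```

`urlencode` percent-encodes the redirect URI, which is why the value looks different in the browser address bar than in the app's configuration.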
 ---
-## Every Service and What It Does
+## The Shared Database — Why It Exists
-### The Nine Public Services
+After deploying two connectors (monk + kscloud1), users got `invalid_grant` errors when
 signing in. The cause: each host had its own separate Authentik database. The OAuth2 flow
 makes two separate HTTP requests:
-| Service | Container Name | What It Does | Why It's Here |
+1. `/application/o/authorize/` → creates authorization code → stored in Database A
-|---------|---------------|--------------|---------------|
+2. `/application/o/token/` → looks up authorization code → hits Database B → **not found**
 | **Portal** | `homepage` | The public website (kitestacks.com) — custom nginx serving static HTML/CSS/JS with a cyberpunk theme | Front door to everything. Shows system stats, recent activity, links to all services |
 | **Authentik** | `authentik` | Identity provider — handles all logins via OIDC/OAuth2 SSO | Single place to manage all user accounts and access control |
 | **Forgejo** | `forgejo` | Self-hosted Git platform (like GitHub but yours) | Store all homelab code, config, and documentation |
 | **OpenProject** | `openproject` | Project management (like Jira) | Task tracking, project planning |
 | **Open WebUI** | `kite-openwebui` | ChatGPT-like AI chat interface | Access multiple AI models through one interface |
 | **Karakeep** | `karakeep` | Bookmark and read-it-later manager | Save links, articles, and content |
 | **Kavita** | `kavita` | eBook and manga reader | Personal digital library |
 | **Grafana** | `grafana` | Monitoring dashboards | Visualize CPU, RAM, network, uptime across both hosts |
 | **Uptime Kuma** | `uptime-kuma` | Status page and uptime monitoring | Monitor that all 9 services are up and alert if they go down |
-### The Infrastructure Services (Not Public-Facing)
+Cloudflare load-balances requests, so steps 1 and 2 can hit different hosts.
-| Service | What It Does |
+**Fix:** Both connectors point to a single shared Postgres+Redis hosted on kscloud1.
-|---------|-------------|
+It is bound only to kscloud1's Tailscale IP (`100.123.x.x`) — never the public IP.
-| `cloudflared` | Cloudflare Tunnel connector — creates encrypted outbound tunnel to Cloudflare edge |
+Only devices on the Tailscale network can connect.
 | `prometheus` | Metrics collection — scrapes system stats from both monk and kscloud1 every 15 seconds |
 | `node-exporter` | Exposes host system metrics (CPU, RAM, disk, network) for Prometheus to scrape |
 | `kite-litellm` | LLM proxy gateway — routes AI requests to OpenRouter (multiple free models) |
 | `portainer` | Docker management UI — visual interface to manage all containers |
 | `kitestacks-metrics-api` | Python FastAPI service — serves real-time system stats, weather, and Forgejo activity to the portal |
---
+**Forgejo** also uses this shared Postgres (separate database on the same server).
-
+Both monk's and kscloud1's Forgejo read from the same data, so git repos are consistent
-## How Traffic Flows
+regardless of which connector serves the request.
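The `invalid_grant` failure can be reproduced with a toy model. This sketch assumes a hypothetical `AuthServer` class that stands in for one Authentik instance, with a plain dictionary playing the role of its Postgres database. It is an illustration of the failure mode, not Authentik's implementation.

```python
import secrets


class AuthServer:
    """Toy stand-in for one Authentik instance behind one connector."""

    def __init__(self, code_store: dict):
        self.code_store = code_store  # maps authorization code -> client_id

    def authorize(self, client_id: str) -> str:
        """First request: issue an authorization code and store it."""
        code = secrets.token_urlsafe(8)
        self.code_store[code] = client_id
        return code

    def token(self, code: str, client_id: str) -> dict:
        """Second request: exchange the code (single use) for a token."""
        if self.code_store.pop(code, None) != client_id:
            return {"error": "invalid_grant"}
        return {"access_token": "example-token"}


# Separate databases: each host only knows the codes it issued itself.
monk, kscloud1 = AuthServer({}), AuthServer({})
code = monk.authorize("grafana")
broken = kscloud1.token(code, "grafana")

# Shared database: both hosts read and write the same store.
shared_store = {}
monk, kscloud1 = AuthServer(shared_store), AuthServer(shared_store)
code = monk.authorize("grafana")
fixed = kscloud1.token(code, "grafana")

print(broken)
print(fixed)
```

The only difference between the two runs is whether the two instances were handed the same store, which mirrors pointing both hosts at one Postgres.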
 ### When Someone Visits www.kitestacks.com
 ```
 1. Browser sends HTTPS request to www.kitestacks.com
 2. DNS resolves to Cloudflare's anycast IP (not your home IP)
 3. Cloudflare terminates TLS — your home router never sees HTTPS
 4. Cloudflare routes the request through the tunnel to whichever
   cloudflared connector responds first (monk or kscloud1)
 5. cloudflared resolves "homepage" via Docker DNS
 6. Request hits the nginx container serving the static portal
 7. Portal's JavaScript fetches /api/metrics and /api/activity
   from the kitestacks-metrics-api container via nginx proxy
 8. Page renders with live system stats and recent git activity
 ```
 ### When Someone Clicks "Sign In with Authentik"
 ```
 1. App (e.g., Grafana) redirects browser to auth.kitestacks.com/application/o/authorize/
 2. Authentik presents login page
 3. User enters credentials — Authentik validates against its database
   (stored on kscloud1's Postgres, shared over Tailscale)
 4. Authentik generates an authorization code and redirects back to Grafana
 5. Grafana's backend calls auth.kitestacks.com/application/o/token/
   to exchange the code for an access token
 6. Authentik validates the code (found in shared DB) and returns a JWT
 7. Grafana reads the user's email/name from the JWT and logs them in
 ```
 **The critical detail:** Steps 1 and 5 can hit different tunnel connectors (monk vs kscloud1). The authorization code from step 4 must exist in whichever database step 5 hits. That's why both connectors point to the SAME Postgres on kscloud1 — otherwise step 5 returns `invalid_grant` because the code isn't found.
 ---
 ## The Two Hosts in Detail
 ### Monk (Primary Home Machine)
 - **Role:** Primary production host
 - **Network:** Home LAN, no open ports on router (Cloudflare Tunnel handles all inbound)
 - **Services:** All 9 public services + all infrastructure services
 - **Data:** Each service has its own database/storage
 - **Authentik DB:** Points to kscloud1's Postgres over Tailscale (100.x.x.x)
 ### kscloud1 (Hetzner VPS)
 - **Role:** Permanent cloud replica — always on, even when monk is off (travel, power outage, etc.)
 - **Network:** Public IP, Cloudflare Tunnel connector 3
 - **Services:** Full replica of all 9 public services (separate databases except Authentik)
 - **Hosts:** The shared Authentik Postgres + Redis (bound to Tailscale interface only)
 - **Resources:** 3 vCPU, 3.7 GB RAM — tight but functional
 ### What's the Same Across Both
 - Same Cloudflare Tunnel token (different connector IDs assigned automatically)
 - Same Authentik database (shared via Tailscale)
 - Same Authentik secret key (required for JWT validation)
 - Same kavita.db (one-time sync — users and OIDC config)
 ### What's Different Across Both
 - Forgejo data (separate repos — accepted inconsistency)
 - OpenProject data (separate projects)
 - Karakeep bookmarks (separate)
 - Kavita book files (monk has them, kscloud1 doesn't — covers synced, books not)
 ---
@ -141,81 +172,109 @@
 Every container joins the `kitestacks` external Docker bridge network:
 ```bash
 # Create once on each host:
 docker network create kitestacks
 ```
-This is what makes Cloudflare Tunnel work. The cloudflared container is also on this network, so when Cloudflare tells cloudflared to route `http://grafana:3000`, Docker's internal DNS resolves `grafana` to the grafana container's IP on that network.
+All service containers and the cloudflared container join this network. Docker provides
 built-in DNS: when cloudflared needs to route to Grafana, it resolves the hostname `grafana`
 to that container's IP address on the bridge network.
-Without this shared network, cloudflared can't reach the service containers by name.
+```
 cloudflared → "grafana" → Docker DNS → 172.x.x.x:3000 → grafana container
 ```
 Without this shared network, cloudflared cannot reach services by name.
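A quick way to see the name resolution at work is to ask the resolver directly from inside any container on the `kitestacks` network. The sketch below assumes Python is available in that container; `resolve` is a hypothetical helper, and the names you pass in are whatever container names you want to check.

```python
import socket
import sys
from typing import Optional


def resolve(name: str) -> Optional[str]:
    """Return the IPv4 address the local resolver gives for name, or None."""
    try:
        return socket.gethostbyname(name)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


if __name__ == "__main__":
    # Pass container names as arguments, for example: grafana homepage
    for name in sys.argv[1:]:
        print(name, "->", resolve(name) or "not resolvable")
```

Run from a container attached to the network, a service name should come back with a bridge address. Run from a container outside it, the same name should report as not resolvable.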
 ---
-## Why No Open Ports on the Router
+## Why No Open Ports on the Home Router
-Traditional homelab: open port 80/443 on home router → NAT to home server → expose home IP.
+Traditional approach: open port 80 and 443 on the router → NAT to home server → home IP in DNS.
-Problems with that:
+Problems:
- Your home IP is public (DDoS risk, targeted attacks)
+- Home IP is exposed publicly (DDoS target, ISP tracks it)
- Router configuration is fragile
+- Dynamic home IP breaks DNS when it changes
- ISP can change your IP (dynamic IP)
+- Some ISPs block residential port 80/443
- Some ISPs block port 80/443
+- Router misconfiguration = exposed server
-Cloudflare Tunnel approach:
+**Cloudflare Tunnel approach:**
- cloudflared container makes an OUTBOUND connection to Cloudflare
+- cloudflared opens outbound encrypted connections (QUIC or HTTP/2, port 7844) to Cloudflare edge servers
- Cloudflare holds that connection open
+- Cloudflare holds that connection open permanently
- Inbound requests come through Cloudflare, over that existing outbound tunnel
+- All inbound traffic arrives over that existing outbound connection
- Your home IP is never exposed
+- The home router sees only outbound connections, so nothing needs to be forwarded
- Works on any network, any ISP, any firewall
+- Home IP is never in DNS, never exposed
-This is why you can run a public website from a home PC with zero router configuration.
+**Result:** A public website running on a home PC with zero router configuration and
 no exposed home IP address.
 ---
 ## Tailscale — The Private Backbone
-Tailscale creates a private overlay network (VPN mesh) across all your devices:
+Tailscale creates an encrypted overlay network across all your devices.
 Every device gets a stable `100.x.x.x` IP regardless of physical location.
 ```
-monk (100.x.x.x) ←—— encrypted ——→ kscloud1 (100.x.x.x)
+monk       100.85.x.x  ←── WireGuard ───► 100.123.x.x  kscloud1
-monk (100.x.x.x) ←—— encrypted ——→ pixel-6 (100.x.x.x)
+samurai    100.74.x.x  ←── WireGuard ───► 100.123.x.x  kscloud1
 phone      100.x.x.x   ←── WireGuard ───► 100.123.x.x  kscloud1
 ```
-Used in this project for:
+Used in this homelab for:
-1. **Shared Authentik DB:** kscloud1's Postgres binds to its Tailscale IP, not its public IP. Only devices on the tailnet can connect. Monk points to that address.
+
-2. **Forgejo activity feed:** On kscloud1, the metrics API fetches recent commits from monk's Forgejo via monk's Tailscale IP — so both portal instances show the same activity feed.
+1. **Shared Authentik DB:** kscloud1 Postgres and Redis are bound to `100.123.x.x` only.
-3. **SSH/Admin access:** You can SSH into any device on the tailnet from anywhere.
+   Monk's Authentik connects to that address. Traffic is encrypted peer-to-peer.
 2. **SSH admin access:** SSH to kscloud1 from anywhere using its Tailscale IP.
   Even behind a hotel firewall or mobile data — Tailscale routes around it.
 3. **Uptime monitoring:** The Conky desktop widget on monk reads Uptime Kuma status
   from kscloud1 directly via Tailscale (not through Cloudflare), so it shows the
   true kscloud1-side status.
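Tailscale assigns addresses from the carrier-grade NAT block `100.64.0.0/10`, which makes it easy to sanity-check that a configured database host is a tailnet address rather than a public one. A minimal sketch, assuming a hypothetical `is_tailscale_ip` helper; the sample addresses are illustrative placeholders, not the real addresses of monk or kscloud1.

```python
import ipaddress

# Address block Tailscale allocates device IPs from (CGNAT range)
TAILSCALE_RANGE = ipaddress.ip_network("100.64.0.0/10")


def is_tailscale_ip(addr: str) -> bool:
    """True if addr falls inside the Tailscale address block."""
    return ipaddress.ip_address(addr) in TAILSCALE_RANGE


for sample in ("100.64.0.1", "100.100.100.100", "100.128.0.1", "192.168.1.10"):
    print(sample, is_tailscale_ip(sample))
```

A check like this could run before starting Authentik on monk, to catch a database host that was accidentally set to a LAN or public address.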
 ---
 ## The Monitoring Stack
 ```
-node-exporter (monk)  →  prometheus (monk)  →  grafana (monk)
+                  ┌──────────────┐
-node-exporter (kscloud1) ↗       (scrapes 5.78.x.x:9100)
+monk's            │  node-exporter│ ← exposes CPU/RAM/disk/network
 node-exporter     │  port 9100    │
                  └──────┬───────┘
                         │ scrape every 15s
                  ┌──────▼───────┐
 kscloud1's  ───► │  prometheus   │ (also scrapes kscloud1:9100 via public IP)
 metrics           └──────┬───────┘
                         │
                  ┌──────▼───────┐
                  │   grafana    │ ← visualize both hosts, switch via instance picker
                  └──────────────┘
 Uptime Kuma → HTTP checks every 60s → all 11 public service URLs
 Conky widget → reads Uptime Kuma API on kscloud1 → shows live dot per service
 ```
 Prometheus scrapes metrics every 15 seconds from:
 - `node-exporter:9100` — monk's own node-exporter (via Docker DNS)
 - `5.78.x.x:9100` — kscloud1's node-exporter (via public IP, port exposed 0.0.0.0)
 Grafana visualizes both, letting you switch between hosts in the instance picker.
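What Prometheus fetches on each scrape is plain text served by node-exporter at `/metrics`. The sketch below parses a small hand-written sample of that text format. `parse_samples` is a hypothetical helper, the sample values are made up, and the parser assumes lines carry no trailing timestamp.

```python
SAMPLE_SCRAPE = "\n".join([
    "# HELP node_load1 1m load average.",
    "# TYPE node_load1 gauge",
    "node_load1 0.42",
    "node_memory_MemAvailable_bytes 8.0e+09",
    'node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5',
])


def parse_samples(text: str) -> dict:
    """Map each series (name plus labels) to its float value."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


samples = parse_samples(SAMPLE_SCRAPE)
for series, value in samples.items():
    print(series, value)
```

Prometheus stores one such value per series every 15 seconds, and Grafana queries those stored series rather than talking to node-exporter itself.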
 ---
 ## The Portal Architecture
-The portal is NOT gethomepage or any pre-built dashboard. It's a custom-built static site:
+The portal is a custom static site — not a pre-built dashboard:
 ```
-nginx (container: "homepage")
+nginx container ("homepage")
-  ├── /         → serves static HTML/CSS/JS from ./public/
+  ├── /           → static HTML/CSS/JS (cyberpunk theme, service cards)
-  └── /api/*    → proxy_pass to kitestacks-metrics-api:8000 (host)
+  └── /api/*      → proxy_pass → kitestacks-metrics-api on host
-kitestacks-metrics-api (network_mode: host, pid: host)
+kitestacks-metrics-api (Python FastAPI, network_mode: host, pid: host)
-  ├── GET /api/metrics   → psutil reads HOST's CPU/RAM/disk/network
+  ├── GET /api/metrics   → psutil reads HOST CPU/RAM/disk/network
-  ├── GET /api/weather   → wttr.in API → current weather by IP geolocation
+  ├── GET /api/weather   → wttr.in API → current conditions
-  ├── GET /api/activity  → Forgejo API → recent commits
+  ├── GET /api/activity  → Forgejo API → recent commits across all repos
  └── GET /api/health    → {"ok": true}
 ```
-The metrics API runs with `network_mode: host` and `pid: host` so it reads the HOST machine's process table and `/proc` filesystem — not the container's. Without this, it would report container stats, not laptop stats.
+`network_mode: host`: the container shares the host's network namespace.
 Without it, psutil would report the container's network interfaces and traffic counters, not the laptop's.
 `pid: host`: the container can see the host's process table via `/proc`.
 Without it, process counts and per-process stats would cover only the container.
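On Linux, psutil's network counters come from `/proc/net/dev`, and that file is scoped to the network namespace of the process reading it. The sketch below parses a hand-written sample in that file's layout to show what is being read. `parse_net_dev` is a hypothetical helper and the interface names and byte counts are invented.

```python
SAMPLE_NET_DEV = "\n".join([
    "Inter-|   Receive                                                |  Transmit",
    " face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed",
    "    lo:    1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0",
    "  eth0:  123456     200    0    0    0     0          0         0   654321     150    0    0    0     0       0          0",
])


def parse_net_dev(text: str) -> dict:
    """Return received and sent byte counts per interface."""
    counters = {}
    for line in text.splitlines()[2:]:  # first two lines are headers
        iface, _, rest = line.partition(":")
        fields = rest.split()
        if len(fields) < 16:
            continue  # not a data row
        counters[iface.strip()] = {
            "rx_bytes": int(fields[0]),  # first receive column
            "tx_bytes": int(fields[8]),  # first transmit column
        }
    return counters


net = parse_net_dev(SAMPLE_NET_DEV)
print(net)
```

Inside a container on a bridge network, the real file lists only that container's interfaces, which is why the metrics API needs the host's namespace to report the machine's traffic.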
--- a/homelab-mastery/architecture/services.md
+++ b/homelab-mastery/architecture/services.md
@ -0,0 +1,388 @@
 # KiteStacks — Complete Service Reference
 Every service that runs in KiteStacks: what it does, where it lives, how to manage it,
 and what commands to use. This is the day-to-day operations reference.
 **Last Updated:** 2026-06-19
 ---
 ## Quick Reference — All Containers on monk
 ```bash
 docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"
 ```
 | Container | Purpose | Public URL |
 |-----------|---------|-----------|
 | `homepage` | Portal / main website | www.kitestacks.com |
 | `authentik` | SSO identity provider | auth.kitestacks.com |
 | `authentik-worker` | Authentik background jobs | — |
 | `authentik-ldap` | LDAP interface for Authentik | — |
 | `authentik-ldap-proxy` | LDAP proxy | — |
 | `forgejo` | Git platform | gitforge.kitestacks.com |
 | `kite-openwebui` | AI chat | ai.kitestacks.com |
 | `kite-litellm` | LLM proxy gateway | — |
 | `karakeep` | Bookmarks | links.kitestacks.com |
 | `karakeep-chrome` | Headless browser for Karakeep | — |
 | `karakeep-meilisearch` | Search engine for Karakeep | — |
 | `kavita` | eBook reader | kavita.kitestacks.com |
 | `grafana` | Monitoring dashboards | grafana.kitestacks.com |
 | `uptime-kuma` | Status page | status.kitestacks.com |
 | `bookstack` | Wiki / docs | wiki.kitestacks.com |
 | `bookstack-db` | MariaDB for BookStack | — |
 | `osticket-app` | Help desk | tasks.kitestacks.com |
 | `osticket-db` | MySQL for OSTicket | — |
 | `portainer` | Docker management UI | portainer.kitestacks.com |
 | `cloudflared` | Tunnel connector | — |
 | `prometheus` | Metrics collector | — |
 | `node-exporter` | Host metrics exporter | — |
 | `blackbox-exporter` | HTTP probe monitor | — |
 | `kitestacks-metrics-api` | System stats API for portal | — |
 | `ntfy` | Push notifications | — |
 | `flux` | GitOps controller | — |
 ---
 ## Service Deep Dives
 ### homepage — Portal
 **What it is:** Custom-built static website served by nginx.
 **Directory:** `~/kitestacks-live/docker/kitestacks-portal/`
 **Public files:** `./public/index.html` — edit this to change what the portal shows
 **Config:** `./nginx.conf` — nginx routing rules
 ```bash
 # Restart portal
 cd ~/kitestacks-live/docker/kitestacks-portal
 docker compose restart homepage
 # Edit the portal
 nano public/index.html
 # View nginx logs
 docker logs homepage -f
 ```
 **Ports:** 3005:3000 (host:container). Cloudflare Tunnel uses container port 3000 directly.
 ---
 ### authentik — SSO Identity Provider
 **What it is:** Self-hosted OAuth2/OIDC identity provider. Handles all logins for every service.
 **Directory:** `~/kitestacks-live/docker/authentik/`
 **Database:** Shared PostgreSQL on kscloud1 at `100.123.x.x:5432`, database `authentik`
 **Redis:** Shared Redis on kscloud1 at `100.123.x.x:6379`
 ```bash
 cd ~/kitestacks-live/docker/authentik
 # Start all Authentik services
 docker compose up -d
 # Check health (wait for "healthy" before testing SSO)
 docker inspect --format '{{.State.Health.Status}}' authentik
 docker inspect --format '{{.State.Health.Status}}' authentik-worker
 # Open an interactive Django shell (admin tasks, user management)
 docker exec -it authentik ak shell
 # View logs
 docker logs authentik -f
 docker logs authentik-worker -f
 ```
 **SSO apps configured in Authentik:**
 - Grafana, Forgejo, Kavita, Karakeep, Open WebUI, Portainer, BookStack
 **Key Authentik admin panel:** https://auth.kitestacks.com/if/admin/
 **Important:** OAuth2 code TTL is set to 10 minutes (increased from default 1 minute)
 to allow monk's Authentik to finish starting up after a reconnect before codes expire.
 ---
 ### forgejo — Git Platform
 **What it is:** Self-hosted Git. Stores all homelab code, configs, and documentation.
 **Directory:** `~/kitestacks-live/docker/forgejo/`
 **Database:** Shared PostgreSQL on kscloud1, database `forgejo`, user `forgejo`
 **Data volume:** `./data/` (repositories, avatars, attachments)
 ```bash
 cd ~/kitestacks-live/docker/forgejo
 # Start
 docker compose up -d
 # Admin commands
 docker exec -u git forgejo forgejo admin user list
 docker exec -u git forgejo forgejo admin user create --username newuser --password pass --email e@mail.com --admin
 # View logs
 docker logs forgejo -f
 # API token for automation
 # Token: stored in .env — used by kitestacks-metrics-api for activity feed
 ```
 **API base URL:** `https://gitforge.kitestacks.com/api/v1/`
 **Web access (via Cloudflare):** https://gitforge.kitestacks.com
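An automation client talks to that API base URL with a token header. The sketch below only builds the request and does not send it. `build_repo_search_request` is a hypothetical helper, the token is a placeholder, and the choice of the `/repos/search` endpoint is an assumption, since the endpoint the metrics API actually calls is not shown here.

```python
import urllib.request

API_BASE = "https://gitforge.kitestacks.com/api/v1"


def build_repo_search_request(token: str, limit: int = 5) -> urllib.request.Request:
    """Prepare a GET for the most recently updated repositories."""
    url = f"{API_BASE}/repos/search?sort=updated&order=desc&limit={limit}"
    headers = {
        "Authorization": f"token {token}",  # Forgejo/Gitea token scheme
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers)


req = build_repo_search_request("example-token")
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req, timeout=10)
```

Keeping the token in `.env` and passing it in at call time means it never has to appear in the script itself.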
 ---
 ### kite-openwebui — AI Chat
 **What it is:** Self-hosted ChatGPT-like interface connected to LiteLLM proxy.
 **Directory:** `~/kitestacks-live/docker/kite-openwebui/`
 **Backend:** `kite-litellm` — routes to OpenRouter (many models, free tier available)
 ```bash
 cd ~/kitestacks-live/docker/kite-openwebui
 docker compose up -d
 docker logs kite-openwebui -f
 docker logs kite-litellm -f
 ```
 **SSO:** Authentik OIDC — "Sign in with Authentik" on login page.
 ---
 ### karakeep — Bookmarks
 **What it is:** Bookmark manager and read-it-later tool. Saves full page content.
 **Directory:** `~/kitestacks-live/docker/karakeep/`
 **Depends on:** `karakeep-chrome` (headless Chromium for page capture) + `karakeep-meilisearch` (search)
 ```bash
 cd ~/kitestacks-live/docker/karakeep
 docker compose up -d
 # SSO callback URL: https://links.kitestacks.com/api/auth/callback/custom
 # (NextAuth.js uses "custom" as the provider ID, not "authentik")
 ```
 **SSO:** Authentik OAuth2 — redirect URI must be `/api/auth/callback/custom` (not `/callback/authentik`)
 ---
 ### kavita — eBook Reader
 **What it is:** eBook, manga, and comic library.
 **Directory:** `~/kitestacks-live/docker/kavita/`
 **Book files:** `./library/books/` — add books here, then scan library in Kavita UI
 **Config/DB:** `./config/kavita.db` (SQLite)
 ```bash
 cd ~/kitestacks-live/docker/kavita
 docker compose up -d
 docker logs kavita -f
 # If you change OIDC settings, use the Kavita UI at kavita.kitestacks.com/settings
 # Do NOT edit kavita.db directly for OIDC config — Kavita overwrites it on restart
 # Use SSH port-forward to access kscloud1's Kavita directly if needed:
 # ssh -L 5099:localhost:5000 kenpat@kscloud1-tailscale-ip
 # Then visit http://localhost:5099
 ```
 **SSO:** Authentik OIDC — Authority URL must end with trailing slash:
 `https://auth.kitestacks.com/application/o/kavita/`
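One way to see why the trailing slash matters is standard relative-URL resolution, shown here with Python's `urljoin`. This is an illustration of the general rule, not Kavita's own code, and how Kavita builds its discovery URL internally is an assumption.

```python
from urllib.parse import urljoin

DISCOVERY_PATH = ".well-known/openid-configuration"

# With the trailing slash, the last path segment is kept.
with_slash = urljoin(
    "https://auth.kitestacks.com/application/o/kavita/", DISCOVERY_PATH
)

# Without it, "kavita" is treated as a file name and replaced.
without_slash = urljoin(
    "https://auth.kitestacks.com/application/o/kavita", DISCOVERY_PATH
)

print(with_slash)
print(without_slash)
```

The second result drops the application slug entirely, so the client would ask for discovery metadata at a path that does not belong to the Kavita provider.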
 ---
 ### grafana — Monitoring Dashboards
 **What it is:** Visualizes metrics collected by Prometheus.
 **Directory:** `~/kitestacks-live/docker/grafana/`
 **Provisioning:** `./provisioning/` — auto-loads datasource (Prometheus) and dashboard (Node Exporter Full)
 **Data:** Named Docker volume `grafana-data`
 ```bash
 cd ~/kitestacks-live/docker/grafana
 docker compose up -d
 docker logs grafana -f
 ```
 **Dashboards auto-loaded:**
 - Node Exporter Full (id 1860) — CPU, RAM, disk, network for both monk and kscloud1
 - Switch between hosts using the "instance" variable at top of dashboard
 **SSO:** Authentik OAuth2. Local admin login also works.
 ---
 ### uptime-kuma — Status Page
 **What it is:** Uptime monitoring with a public status page.
 **Directory:** `~/kitestacks-live/docker/uptime-kuma/`
 **Database:** Named Docker volume `uptime-kuma` (SQLite kuma.db)
 **Status page slug:** `homelab` → https://status.kitestacks.com/status/homelab
 ```bash
 cd ~/kitestacks-live/docker/uptime-kuma
 docker compose up -d
 docker logs uptime-kuma -f
 # To push kuma.db to kscloud1 after changes (monk → kscloud1):
 # See scripts/sync-kuma.sh (or follow the sqlite backup pattern)
 ```
 **Monitors configured:** All 11 public services + kscloud1 ping + Monk ping + Samurai ping.
 **Conky widget:** Reads kscloud1's Uptime Kuma directly via Tailscale IP at
 `http://100.123.x.x:3001/api/status-page/homelab`. This means the widget shows
 kscloud1's health, not monk's — which is what matters for production status.
 ---
 ### bookstack — Wiki
 **What it is:** Self-hosted documentation wiki with a clean UI.
 **Directory:** `~/kitestacks-live/docker/bookstack/`
 **Database:** MariaDB container `bookstack-db`
 **Config:** `.env` file (APP_URL, DB settings, OIDC config)
 ```bash
 cd ~/kitestacks-live/docker/bookstack
 docker compose up -d
 docker logs bookstack -f
 # BookStack API (used to push docs from Forgejo):
 # Token created via: DB injection + bcrypt hash for API key
 # Token ID/secret stored in .env
 ```
 **SSO:** Authentik OIDC. Key config:
 - `OIDC_ISSUER=https://auth.kitestacks.com/application/o/bookstack/`
 - `OIDC_ISSUER_DISCOVER=true`
 - Cache dir must be writable: `chown -R abc:users /config/www/framework/cache/`
 ---
 ### osticket-app — Help Desk
 **What it is:** OSTicket help desk and ticketing system.
 **Directory:** `~/kitestacks-live/docker/osticket/`
 **Database:** MySQL container `osticket-db`
 **URL:** tasks.kitestacks.com (took over from OpenProject)
 ```bash
 cd ~/kitestacks-live/docker/osticket
 docker compose up -d
 docker logs osticket-app -f
 ```
 **SMTP:** Configured for smtp.gmail.com:587 using kitestacks.helpdesk@gmail.com.
 App password stored in `ost_email` table (smtp_auth_creds=1 for all email entries).
 **Confirmed working:** Email delivery verified 2026-06-19.
 ---
 ### portainer — Docker Management
 **What it is:** Web UI for managing Docker containers on both monk and kscloud1.
 **Directory:** `~/kitestacks-live/docker/portainer/`
 **URL:** portainer.kitestacks.com
 ```bash
 cd ~/kitestacks-live/docker/portainer
 docker compose up -d
 ```
 **SSO:** Authentik OAuth2 (AuthenticationMethod=3). User kenpat7177@gmail.com pre-created as admin.
 **Security:** Authentik PolicyBinding restricts Portainer app to `homelab-admin` group only.
 ---
 ### cloudflared — Tunnel Connector
 **What it is:** Creates the outbound tunnel to Cloudflare. This is what makes all
 public services reachable without opening ports on the router.
 **Directory:** `~/kitestacks-live/docker/cloudflared/`
 **Token:** Read from `.env` file as `TUNNEL_TOKEN` (never hardcoded in docker-compose.yml)
 ```bash
 cd ~/kitestacks-live/docker/cloudflared
 docker compose up -d
 docker logs cloudflared -f
 # To rotate the token (runs on both monk and kscloud1):
 # ~/kitestacks-homelab/scripts/rollout-cloudflared-token.sh '<new-token>'
 ```
 **Tunnel ID:** 5e60ea8e-a543-49b6-bab5-325f39441e00
 **Account:** Cloudflare dashboard → Zero Trust → Networks → Tunnels
 ---
 ### prometheus + node-exporter — Metrics
 **What it is:** Prometheus collects time-series metrics. node-exporter exposes host stats.
 **Directory:** `~/kitestacks-live/docker/prometheus/`
 **Config:** `./prometheus.yml` — defines scrape targets
 ```bash
 cd ~/kitestacks-live/docker/prometheus
 docker compose up -d
 docker logs prometheus -f
 # Scrape targets configured:
 # - node-exporter:9100 (monk, via Docker DNS)
 # - 5.78.x.x:9100     (kscloud1, via public IP — node-exporter exposed on 0.0.0.0)
 ```
 ---
 ## Common Operations
 ### Restart a single service
 ```bash
 cd ~/kitestacks-live/docker/<service-name>
 docker compose restart <container-name>
 ```
 ### View live logs
 ```bash
 docker logs <container-name> -f
 # -f = follow (live tail). Ctrl+C to stop.
 ```
 ### Update a service to latest image
 ```bash
 cd ~/kitestacks-live/docker/<service-name>
 docker compose pull
 docker compose up -d
 ```
 ### Check all container health at once
 ```bash
 docker ps --format "table {{.Names}}\t{{.Status}}"
 ```
 ### Enter a container's shell
 ```bash
 docker exec -it <container-name> bash
 # or sh if bash isn't available:
 docker exec -it <container-name> sh
 ```
 ### Check disk and memory usage
 ```bash
 docker system df        # Docker disk usage
 free -h                 # RAM usage
 df -h                   # Disk usage
 ```
 ### Push a kuma.db update to kscloud1
 ```bash
 # 1. Make changes to monk's Uptime Kuma (add monitors, etc.)
 # 2. Backup monk's db:
 docker run --rm -v uptime-kuma:/src:ro -v /tmp:/out python:3-alpine \
  python3 -c "import sqlite3; s=sqlite3.connect('/src/kuma.db'); b=sqlite3.connect('/out/kuma.db.push'); s.backup(b); b.close(); s.close()"
 # 3. Transfer and restore on kscloud1:
 gzip -c /tmp/kuma.db.push | ssh -i ~/.ssh/id_ed25519_kscloud1 kenpat@100.123.x.x \
  "gunzip > /home/kenpat/kuma.db.push"
 # Then on kscloud1: stop uptime-kuma, restore via same sqlite.backup() pattern, restart
 ```
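The one-liner in step 2 is easier to read as a small script. The sketch below wraps the same `sqlite3` backup call in a hypothetical `backup_sqlite` function and exercises it on a throwaway database in a temporary directory. The `monitor` table and its single row are demo data, not a copy of the real kuma.db schema.

```python
import os
import sqlite3
import tempfile


def backup_sqlite(src_path: str, dst_path: str) -> None:
    """Copy a SQLite database with the online backup API."""
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        src.backup(dst)  # consistent snapshot, safe while src is in use
    finally:
        dst.close()
        src.close()


with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "kuma.db")
    dst_path = os.path.join(tmp, "kuma.db.push")

    # Build a tiny demo database to copy.
    conn = sqlite3.connect(src_path)
    conn.execute("CREATE TABLE monitor (name TEXT)")
    conn.execute("INSERT INTO monitor VALUES ('homepage')")
    conn.commit()
    conn.close()

    backup_sqlite(src_path, dst_path)

    # Read the copy back to confirm the data arrived.
    check = sqlite3.connect(dst_path)
    rows = check.execute("SELECT name FROM monitor").fetchall()
    check.close()

print(rows)
```

Using the backup API instead of copying the file avoids grabbing a half-written database while Uptime Kuma is still running.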
--- a/homelab-mastery/build-guide/README.md
+++ b/homelab-mastery/build-guide/README.md
@ -0,0 +1,85 @@
 # KiteStacks Build Guide — Choose Your Path
 This guide teaches you how to build the entire KiteStacks homelab from a blank machine.
 There are two tracks. Pick the one that fits where you are right now.
 ---
 ## Track A — With AI (Beginner)
 **Who this is for:** Someone with zero or very little tech experience.
 You do not need to know Linux, Docker, or networking. You just need to be able to
 follow instructions and copy commands.
 **How it works:** You use an AI assistant (Claude, ChatGPT, or similar) as your guide
 throughout the build. The AI explains what each command does in plain language before
 you run it. You never copy something without understanding what it does — the AI makes
 sure of that.
 **Time to complete:** 2–4 weeks of evenings and weekends (2–3 hours per session).
 **What you will have at the end:** A fully working homelab identical to KiteStacks.
 → **[Start the AI-Assisted Build](with-ai/01-what-you-need.md)**
 ---
 ## Track B — Without AI (Advanced)
 **Who this is for:** Someone who wants to understand everything deeply and build skills
 along the way — not just copy commands but know what every line does and why.
 **How it works:** You build the homelab from scratch, learning Bash scripting, Python,
 Docker internals, Linux administration, and networking as you go. Every command is
 explained in full. No shortcuts.
 **Time to complete:** 3–6 months of consistent part-time study and building
 (evenings and weekends). Full-time: 6–10 weeks.
 **What you will learn:** Linux, Bash scripting, Python, Docker, networking (DNS, ports,
 TLS, firewalls), OAuth2/OIDC, infrastructure design, and troubleshooting methodology.
 → **[Start the Advanced Build](without-ai/01-linux-foundations.md)**
 ---
 ## What Both Tracks Build
 By the end of either track you will have:
 - ✅ A public domain (e.g. kitestacks.com) serving real websites
 - ✅ Eleven self-hosted services running in Docker
 - ✅ Single sign-on — one account for everything
 - ✅ A cloud VPS as a permanent backup — site stays up when your home PC is off
 - ✅ Private networking between home and cloud via Tailscale VPN
 - ✅ Real-time monitoring with Grafana and Uptime Kuma
 - ✅ A desktop widget showing live service status
 ---
 ## Hardware and Accounts Needed (Both Tracks)
 ### Hardware
 - Any PC or laptop running Linux (or you can install Linux on it) — minimum 8GB RAM, 100GB disk
 - A domain name — buy from Cloudflare Registrar, Namecheap, or similar (~$10–15/year)
 - A credit card for the cloud VPS (~€4–5/month on Hetzner — less than a coffee)
 ### Accounts to Create
 - **Cloudflare** — free account at cloudflare.com
 - **Hetzner** — cloud VPS provider at hetzner.com (or any VPS: DigitalOcean, Vultr, Linode)
 - **Tailscale** — free at tailscale.com (up to 100 devices)
 - **OpenRouter** — free AI model access at openrouter.ai (for the AI chat service)
 ### What You Are Building On
 ```
 Home PC (monk)
  └── Ubuntu or similar Linux OS
  └── Docker + Docker Compose
  └── ~15 containers running
 Cloud VPS (kscloud1)
  └── Ubuntu Linux
  └── Docker + Docker Compose
  └── Same 15 containers running (replica)
  └── Shared PostgreSQL + Redis
 ```
--- a/homelab-mastery/build-guide/with-ai/01-what-you-need.md
+++ b/homelab-mastery/build-guide/with-ai/01-what-you-need.md
@ -0,0 +1,182 @@
 # Step 1 — What You Need Before You Start
 **Track:** With AI (Beginner)  
 **Time for this step:** 1–2 hours
 Welcome. You are about to build a real, working homelab that serves websites to the
 actual internet. It sounds complicated, but with an AI assistant helping you every step
 of the way, you can absolutely do this even if you have never used Linux before.
 ---
 ## How to Use This Guide
 Throughout this build, whenever you see a command like this:
 ```bash
 docker ps
 ```
 That is something you type into a terminal (a black window where you type commands).
 Before you type any command, **ask your AI assistant what it does**. Say:
 > "What does this command do: `docker ps`"
 The AI will explain it in plain language. Never run a command you do not understand.
 That is the rule throughout this entire build.
 ---
 ## What You Need
 ### 1. A Computer to Run Everything On
 You need a PC or laptop that will be your home server. This will be called **monk**
 throughout this guide (that is just a nickname — you can call it whatever you want).
 Minimum specs:
 - **RAM:** 8 GB (16 GB recommended — you will run about 15 programs at once)
 - **Storage:** 100 GB free space
 - **Operating system:** Linux (Ubuntu 22.04 or 24.04 recommended)
 If your computer currently runs Windows, you have two options:
 - Install Ubuntu alongside Windows (dual boot)
 - Replace Windows with Ubuntu entirely (easier, recommended)
 **Ask your AI:** "How do I install Ubuntu 24.04 on my computer?"
 ---
 ### 2. A Domain Name
 A domain name is your address on the internet — for example, `kitestacks.com`.
 You need to buy one. It costs about $10–15 per year.
 **Where to buy:** Cloudflare Registrar (registrar.cloudflare.com) is recommended
 because you will use Cloudflare for everything else and it keeps things in one place.
 **Tips for picking a domain:**
 - Keep it short and memorable
 - `.com` is most professional
 - Avoid hyphens and numbers
 **Ask your AI:** "How do I buy a domain name on Cloudflare Registrar?"
 ---
 ### 3. A Cloudflare Account
 Cloudflare is the service that sits between the internet and your home computer.
 It hides your home IP address, handles HTTPS certificates, filters out a lot of
 malicious traffic, and routes requests to your services. Best part: everything you
 need is on their free plan.
 Go to cloudflare.com and create a free account.
 If you bought your domain from Cloudflare Registrar, your account is already set up.
 If you bought it elsewhere, you will need to add the domain to Cloudflare by switching its nameservers. Step 2A walks you through that.
 ---
 ### 4. A Cloud VPS (Virtual Private Server)
 A VPS is a small virtual computer that you rent in a data center. It runs 24 hours a day
 even when your home computer is off. This is what keeps your websites online when
 you are travelling or when your home internet goes down.
 **Recommended provider:** Hetzner (hetzner.com) — excellent value, based in Germany.
 **Plan to choose:** CX22 — 2 vCPU, 4 GB RAM, 40 GB disk — approximately €4/month.
 Create a Hetzner account, then ask your AI: "How do I create a new CX22 VPS on Hetzner
 with Ubuntu 24.04?"
 This second computer will be called **kscloud1** throughout this guide.
 ---
 ### 5. A Tailscale Account
 Tailscale is a free service that creates a private, encrypted connection between your
 home computer and your cloud VPS. Think of it as a private tunnel that only your
 devices can use.
 Go to tailscale.com and create a free account.
 ---
 ### 6. An OpenRouter Account (for AI services)
 OpenRouter gives you access to dozens of AI models for free (with rate limits) or
 for very low cost. Your KiteStacks AI service will use this.
 Go to openrouter.ai and create a free account.
 ---
 ## Setting Up Your Home Computer (monk)
 Once Ubuntu is installed on your home computer, open a terminal. On Ubuntu,
 press `Ctrl + Alt + T` to open one.
 You will see something like:
 ```
 kenpatmonk@monk:~$
 ```
 That `$` means you are ready to type commands.
 **First, update your system. Ask your AI what this does, then run it:**
 ```bash
 sudo apt update && sudo apt upgrade -y
 ```
 **Then install some tools you will need:**
 ```bash
 sudo apt install -y curl git nano wget
 ```
 **Ask your AI:** "What does `sudo apt install` do and why do I need curl, git, nano, and wget?"
 ---
 ## Setting Up Your Cloud VPS (kscloud1)
 After creating your VPS on Hetzner, you will get an IP address (something like `5.78.233.28`).
 You connect to it using a tool called SSH.
 **Ask your AI:** "What is SSH and how do I connect to my VPS from Ubuntu?"
 The basic command looks like this:
 ```bash
 ssh root@YOUR_VPS_IP
 ```
 Replace `YOUR_VPS_IP` with the actual IP Hetzner gave you.
 Once connected, update the VPS just like you did on your home computer:
 ```bash
 apt update && apt upgrade -y
 ```
 ---
 ## Checkpoint
 Before moving to Step 2, make sure you have:
 - [ ] Ubuntu installed and running on your home computer
 - [ ] A domain name purchased (you will connect it to Cloudflare in Step 2)
 - [ ] A Cloudflare account (free)
 - [ ] A Hetzner VPS created with Ubuntu (noted your VPS IP address)
 - [ ] A Tailscale account (free)
 - [ ] An OpenRouter account (free)
 - [ ] You can open a terminal on your home computer
 - [ ] You can SSH into your VPS
 If any of these are not done, stop here and ask your AI for help completing them
 before moving on. Every future step assumes all of these are in place.
 ---
 **Next:** [Step 2 — DNS and Cloudflare Setup](02-dns-and-cloudflare.md)
--- a/homelab-mastery/build-guide/with-ai/02-dns-and-cloudflare.md
+++ b/homelab-mastery/build-guide/with-ai/02-dns-and-cloudflare.md
@ -0,0 +1,129 @@
 # Step 2 — DNS and Cloudflare Setup
 **Track:** With AI (Beginner)  
 **Time for this step:** 1–2 hours
 In this step you will set up Cloudflare so your domain points to Cloudflare's servers,
 and you will create the Cloudflare Tunnel that allows the internet to reach your home
 computer without exposing your home IP address.
 ---
 ## What Is Happening Here?
 When someone types `www.kitestacks.com` into a browser, their computer asks a system
 called DNS: "What is the IP address for www.kitestacks.com?"
 Normally, that answer would be your home IP address. But we do NOT want that: your
 home IP can change, can be targeted by attackers, and many ISPs block incoming connections.
 Instead, the DNS answer will be Cloudflare's IP address. Traffic goes to Cloudflare,
 Cloudflare sends it to your computer through a tunnel, and your home IP is never exposed.
 **Ask your AI:** "Can you explain in simple terms how Cloudflare Tunnel works?"
 ---
 ## Step 2A — Add Your Domain to Cloudflare
 If you bought your domain from Cloudflare Registrar, skip to Step 2B.
 If you bought it elsewhere (Namecheap, GoDaddy, etc.):
 1. Log in to Cloudflare at cloudflare.com
 2. Click "Add a site"
 3. Enter your domain name
 4. Choose the Free plan
 5. Cloudflare will give you two nameserver addresses (like `vera.ns.cloudflare.com`)
 6. Go to your domain registrar's website and replace the nameservers with Cloudflare's
 **Ask your AI:** "How do I change nameservers on [your registrar]?"
 It can take up to 24 hours for nameserver changes to propagate worldwide, but usually
 it happens within an hour.
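 One way to check whether the nameserver change has taken effect is to look up the
 domain's NS records yourself. The sketch below assumes the `dig` tool is available
 (if it is missing on Ubuntu, `sudo apt install -y dnsutils` should provide it). The
 helper name `uses_cloudflare_ns` is made up for this example, and `yourdomain.com`
 is a placeholder.

 ```bash
 # Hypothetical helper: reads NS records on stdin and succeeds only if
 # there is at least one record and every record ends in ns.cloudflare.com
 uses_cloudflare_ns() {
   local line seen=0
   while IFS= read -r line || [ -n "$line" ]; do
     [ -z "$line" ] && continue
     line="${line%.}"   # drop the trailing dot that dig prints
     case "$line" in
       *.ns.cloudflare.com) seen=1 ;;
       *) return 1 ;;
     esac
   done
   [ "$seen" -eq 1 ]
 }

 # Usage (needs internet access and dig):
 #   dig +short NS yourdomain.com | uses_cloudflare_ns && echo "Cloudflare nameservers are active"
 ```

 If the check fails shortly after you changed the nameservers, that usually just
 means the change has not propagated yet. Wait a while and try again.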
 ---
 ## Step 2B — Create Your Cloudflare Tunnel
 A Cloudflare Tunnel is the invisible connection between your home computer and Cloudflare.
 Your home computer reaches out to Cloudflare (outbound connection). Cloudflare holds that
 connection open. When someone visits your website, Cloudflare sends the request back through
 that existing connection. You never need to open a port on your home router.
 **To create a tunnel:**
 1. In your Cloudflare dashboard, go to: **Zero Trust → Networks → Tunnels**
 2. Click **"Create a tunnel"**
 3. Choose **"Cloudflared"** as the connector type
 4. Name your tunnel (e.g., `kitestacks-tunnel`)
 5. Cloudflare will show you a token — a long string of characters starting with `eyJ`
 6. **Save this token somewhere safe** — you will need it in Step 3
 ---
 ## Step 2C — Add Public Hostnames to the Tunnel
 A public hostname tells Cloudflare: "When someone visits this URL, send the traffic
 to this container on my home computer."
 You will set up hostnames for all eleven of your services. For each one:
 1. In the tunnel settings, click **"Public Hostnames"**
 2. Click **"Add a public hostname"**
 Add all of these (you will complete the services in later steps, but adding the
 hostnames now means they are ready):
 | Subdomain | Domain | Service (origin URL) | Public hostname |
 |-----------|--------|---------|-----|
 | www | yourdomain.com | http://homepage:3000 | www.yourdomain.com |
 | auth | yourdomain.com | http://authentik:9000 | auth.yourdomain.com |
 | gitforge | yourdomain.com | http://forgejo:3000 | gitforge.yourdomain.com |
 | ai | yourdomain.com | http://kite-openwebui:8080 | ai.yourdomain.com |
 | links | yourdomain.com | http://karakeep:3000 | links.yourdomain.com |
 | kavita | yourdomain.com | http://kavita:5000 | kavita.yourdomain.com |
 | grafana | yourdomain.com | http://grafana:3000 | grafana.yourdomain.com |
 | status | yourdomain.com | http://uptime-kuma:3001 | status.yourdomain.com |
 | wiki | yourdomain.com | http://bookstack:80 | wiki.yourdomain.com |
 | tasks | yourdomain.com | http://osticket-app:80 | tasks.yourdomain.com |
 | portainer | yourdomain.com | https://portainer:9443 | portainer.yourdomain.com |
 For the `portainer` entry, enable **"No TLS Verify"** (Portainer uses its own self-signed certificate internally).
 Replace `yourdomain.com` with your actual domain throughout.
 **Ask your AI:** "What does the 'service' field in a Cloudflare Tunnel hostname mean?
 Why do I use `http://homepage:3000` instead of an IP address?"
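 To keep track of the eleven public addresses, you can print them from the subdomain
 list in the table above. This is only a sketch: the helper name `list_public_urls`
 is made up for this example, and `yourdomain.com` is a placeholder for your own domain.

 ```bash
 # Subdomains taken from the table above
 SUBDOMAINS="www auth gitforge ai links kavita grafana status wiki tasks portainer"

 # Hypothetical helper: print one public URL per service for a given domain
 list_public_urls() {
   local domain="$1" sub
   for sub in $SUBDOMAINS; do
     printf 'https://%s.%s\n' "$sub" "$domain"
   done
 }

 list_public_urls yourdomain.com
 ```

 Later, once the services are running, the same list is handy for quick checks, for
 example by passing each URL to `curl -s -o /dev/null -w '%{http_code}\n'` to see
 which ones respond.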
 ---
 ## Step 2D — Create the Docker Network
 Everything in this homelab runs in Docker (covered in the next step), and all the
 containers need to be able to talk to each other and to the Cloudflare connector.
 They do this by being on the same Docker network.
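 On your **home computer**, run the command below. It needs Docker, so if the `docker` command is not installed yet, install Docker first (Step 3) and then come back to it: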
 ```bash
 docker network create kitestacks
 ```
 You will also do this on your **cloud VPS** in a later step.
 **Ask your AI:** "What is a Docker network and why do all containers need to be on the same one?"
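 Running `docker network create kitestacks` a second time prints an error because the
 network already exists. If you would rather have a command that is safe to repeat,
 here is a minimal sketch. The helper name `ensure_network` is made up for this
 example; `docker network inspect` and `docker network create` are standard Docker commands.

 ```bash
 # Hypothetical helper: create the network only if it does not exist yet
 ensure_network() {
   local name="$1"
   if docker network inspect "$name" >/dev/null 2>&1; then
     echo "network $name already exists"
   else
     docker network create "$name" >/dev/null && echo "network $name created"
   fi
 }

 # Run it only where the docker command is available
 if command -v docker >/dev/null 2>&1; then
   ensure_network kitestacks || echo "could not create the network (is Docker running?)"
 fi
 ```

 The same helper works on the cloud VPS, which needs its own `kitestacks` network too.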
 ---
 ## Checkpoint
 Before moving to Step 3, make sure:
 - [ ] Your domain is on Cloudflare (nameservers changed or bought from Cloudflare)
 - [ ] You created a Cloudflare Tunnel and saved the tunnel token
 - [ ] You added all 11 public hostnames to the tunnel
 - [ ] You ran `docker network create kitestacks` on your home computer (or will run it right after installing Docker in Step 3)
 ---
 **Next:** [Step 3 — Installing Docker](03-docker-setup.md)
--- a/homelab-mastery/build-guide/with-ai/03-docker-setup.md
+++ b/homelab-mastery/build-guide/with-ai/03-docker-setup.md
@ -0,0 +1,196 @@
 # Step 3 — Installing Docker
 **Track:** With AI (Beginner)  
 **Time for this step:** 30–60 minutes (on both your home computer and your VPS)
 Docker is the technology that runs all your services. Think of it like a machine that
 can run many small, isolated programs at the same time — each program thinks it is
 the only one on the computer, even though they are all sharing the same hardware.
 Each program is called a **container**. You will have about 15 containers running.
 ---
 ## What Is Docker? (Plain English)
 Imagine you want to run fifteen different apps on your computer. If you installed them
 all directly, they might conflict — one app needs Python version 3.9, another needs 3.11,
 and they fight over which one to use. Docker solves this by giving each app its own
 little bubble where it has exactly what it needs, completely separate from everything else.
 A **container** is one of those bubbles.
 A **Docker image** is the recipe for making a bubble.
 **Docker Compose** is a tool that lets you describe multiple containers in one file
 and start them all with one command.
 **Ask your AI:** "Can you explain Docker containers vs Docker images using a simple analogy?"
 ---
 ## Installing Docker on Your Home Computer (monk)
 Run these commands one at a time. Before each one, ask your AI what it does.
 ```bash
 # Install required packages
 sudo apt install -y ca-certificates curl
 # Add Docker's official GPG key (proves the software is authentic)
 sudo install -m 0755 -d /etc/apt/keyrings
 sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
 sudo chmod a+r /etc/apt/keyrings/docker.asc
 # Add Docker's package source
 echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] \
  https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
 # Update package list and install Docker
 sudo apt update
 sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
 ```
 Now let Docker start automatically when your computer boots:
 ```bash
 sudo systemctl enable docker
 sudo systemctl start docker
 ```
 Add yourself to the Docker group so you do not need `sudo` every time:
 ```bash
 sudo usermod -aG docker $USER
 ```
 **Log out and log back in** (or reboot) for this change to take effect.
 Test that Docker is installed:
 ```bash
 docker --version
 docker compose version
 ```
 You should see version numbers printed. If you see errors, ask your AI to help.
 ---
 ## Installing Docker on Your Cloud VPS (kscloud1)
 SSH into your VPS and run the exact same commands as above. The process is identical.
 ```bash
 ssh root@YOUR_VPS_IP
 ```
 Then run all the same installation commands. You are logged in as `root` on the VPS, so you can skip the `usermod` step.
 ---
 ## Your First Container — Cloudflared (Tunnel Connector)
 The first container you will run is `cloudflared` — this is what creates the tunnel
 between your computer and Cloudflare. Without this, nothing else can be reached from
 the internet.
 **On your home computer**, create a folder for it:
 ```bash
 mkdir -p ~/kitestacks-live/docker/cloudflared
 cd ~/kitestacks-live/docker/cloudflared
 ```
 Create a file called `.env` that holds your tunnel token:
 ```bash
 nano .env
 ```
 Inside the file, type:
 ```
 TUNNEL_TOKEN=paste-your-token-here
 ```
 Replace `paste-your-token-here` with the token you saved from Step 2.
 Press `Ctrl+X`, then `Y`, then `Enter` to save.
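 The tunnel token works like a password, so it is worth making sure only your own
 user can read the `.env` file. A minimal sketch, assuming you are still in the folder
 that holds `.env` (the helper name `lock_down` is made up for this example):

 ```bash
 # Hypothetical helper: make a secrets file readable and writable by its owner only
 lock_down() {
   chmod 600 "$1" && ls -l "$1"
 }

 # Run from the folder that holds .env; does nothing if the file is not there
 if [ -f .env ]; then
   lock_down .env
 fi
 ```

 The listing should start with `-rw-------`. Docker Compose can still read the file
 as long as you run it as the same user. When you later push your configs to Git,
 keep `.env` files out of the repository.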
 Now create the `docker-compose.yml` file:
 ```bash
 nano docker-compose.yml
 ```
 Paste this content:
 ```yaml
 services:
  cloudflared:
    image: cloudflare/cloudflared:latest
    container_name: cloudflared
    restart: unless-stopped
    command: tunnel --no-autoupdate run
    environment:
      - TUNNEL_TOKEN=${TUNNEL_TOKEN:?set TUNNEL_TOKEN in .env}
    networks:
      - default
      - kitestacks
 networks:
  kitestacks:
    external: true
 ```
 Save and close the file. Then start it:
 ```bash
 docker compose up -d
 ```
 Check that it is running:
 ```bash
 docker ps
 ```
 You should see `cloudflared` in the list with a status of `Up`.
 Check the logs to confirm it connected:
 ```bash
 docker logs cloudflared
 ```
 You should see something like "Connection established" or "Registered tunnel connection".
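 If the log is long, you can count the matching lines instead of reading all of it.
 This sketch searches for the "Registered tunnel connection" message mentioned above;
 the helper name `count_registered` is made up for this example. The exact wording of
 the log line may differ between cloudflared versions, so treat a count of zero as a
 reason to read the full log, not as proof of failure.

 ```bash
 # Hypothetical helper: count log lines on stdin that report a registered connection
 count_registered() {
   grep -c "Registered tunnel connection"
 }

 if command -v docker >/dev/null 2>&1; then
   # 2>&1 merges the container's error stream into the pipe, where log lines often end up
   docker logs cloudflared 2>&1 | count_registered
 fi
 ```

 A number above zero suggests the connector has reached Cloudflare.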
 **Ask your AI:** "What does `restart: unless-stopped` mean in a Docker Compose file?"
 ---
 ## Run Cloudflared on Your VPS Too
 SSH into your VPS and do the exact same thing. Use the **same tunnel token** — Cloudflare
 will register this as a second connector for the same tunnel. If your home computer goes
 offline, the VPS will keep serving traffic.
 ```bash
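 # Create the shared network first; the compose file expects it to exist
 docker network create kitestacks
 mkdir -p /opt/kitestacks/docker/cloudflared
 cd /opt/kitestacks/docker/cloudflared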
 ```
 Create the same `.env` and `docker-compose.yml` files, then:
 ```bash
 docker compose up -d
 docker logs cloudflared
 ```
 ---
 ## Checkpoint
 Before moving to Step 4:
 - [ ] Docker is installed on your home computer
 - [ ] Docker is installed on your VPS
 - [ ] `docker ps` shows `cloudflared` running on both machines
 - [ ] `docker logs cloudflared` shows successful connection on both
 Go to your Cloudflare Tunnel dashboard. Under your tunnel, you should now see
 **2 connectors** listed — one from your home computer and one from your VPS.
 If you only see one, wait a few minutes and refresh.
 ---
 **Next:** [Step 4 — Core Services](04-core-services.md)
--- a/homelab-mastery/build-guide/with-ai/04-core-services.md
+++ b/homelab-mastery/build-guide/with-ai/04-core-services.md
@ -0,0 +1,298 @@
 # Step 4 — Core Services: Portal, Forgejo, and Authentik
 **Track:** With AI (Beginner)  
 **Time for this step:** 3–5 hours
 These three services form the foundation of KiteStacks:
 - **Portal** — the homepage that links to everything
 - **Forgejo** — stores all your code and configurations in Git
 - **Authentik** — handles all logins for every service (SSO)
 Set these up first. Everything else depends on them.
 ---
 ## How Docker Compose Files Work
 Every service in this homelab has its own folder with a `docker-compose.yml` file.
 That file describes the service: what image to use, what environment variables to set,
 what folders to use for data, and what network to join.
 You will create these files using `nano` (a simple text editor in the terminal).
 **Ask your AI:** "Can you explain what each section of a docker-compose.yml file does:
 services, image, container_name, restart, environment, volumes, networks?"
 ---
 ## Service 1 — The Portal (Homepage)
 The portal is your home page at `www.yourdomain.com`. It shows links to all your
 services and displays live system stats.
 ```bash
 mkdir -p ~/kitestacks-live/docker/kitestacks-portal/public
 cd ~/kitestacks-live/docker/kitestacks-portal
 ```
 Create `docker-compose.yml`:
 ```yaml
 services:
  homepage:
    image: nginx:alpine
    container_name: homepage
    restart: unless-stopped
    volumes:
      - ./public:/usr/share/nginx/html:ro
      - ./nginx.conf:/etc/nginx/conf.d/default.conf:ro
    networks:
      - kitestacks
 networks:
  kitestacks:
    external: true
 ```
 Create a basic `nginx.conf`:
 ```nginx
 server {
    listen 3000;
    root /usr/share/nginx/html;
    index index.html;
    location / {
        try_files $uri $uri/ /index.html;
    }
 }
 ```
 Create a basic `public/index.html` to test:
 ```html
 <!DOCTYPE html>
 <html>
 <head><title>KiteStacks</title></head>
 <body>
  <h1>KiteStacks is live!</h1>
 </body>
 </html>
 ```
 Start it:
 ```bash
 docker compose up -d
 docker ps
 ```
 Visit `www.yourdomain.com` in a browser. You should see your page.
 If it works, you have confirmed the tunnel is routing correctly.
 **Ask your AI:** "I want to build a proper homepage for my homelab. It should have a
 dark cyberpunk theme with cards for each of my services. Can you help me write the HTML?"
 Work with your AI to build the portal you want. The KiteStacks portal source is in
 `~/kitestacks-homelab/apps/kitestacks-portal/` as reference.
 ---
 ## Service 2 — Forgejo (Git)
 Forgejo stores all your code. You will push your homelab configs to it so everything
 is version-controlled and you never lose your work.
 First, set up the shared PostgreSQL database (Forgejo will use this):
 ```bash
 mkdir -p ~/kitestacks-live/docker/postgres
 cd ~/kitestacks-live/docker/postgres
 ```
 Create `.env`:
 ```
 POSTGRES_USER=authentik
 POSTGRES_PASSWORD=choose-a-strong-password-here
 POSTGRES_DB=authentik
 ```
 Create `docker-compose.yml`:
 ```yaml
 services:
  authentik-postgres:
    image: postgres:16-alpine
    container_name: authentik-postgres
    restart: unless-stopped
    env_file: .env
    volumes:
      - ./data:/var/lib/postgresql/data
    networks:
      - kitestacks
 networks:
  kitestacks:
    external: true
 ```
 ```bash
 docker compose up -d
 ```
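 The Forgejo settings further down point at a database and a user that are both named
 `forgejo`, but the Postgres `.env` above only creates `authentik`. One way to add the
 missing pieces is sketched below. It assumes the container name `authentik-postgres`
 and the `authentik` admin user from the files above. The helper name `forgejo_db_sql`
 is made up for this example, and the password is a placeholder that must match the
 database password you put in Forgejo's `.env`.

 ```bash
 # Hypothetical helper: print the SQL that creates a forgejo user and database
 forgejo_db_sql() {
   local password="$1" q="'"
   # Double any single quotes so the password stays a valid SQL string
   local escaped="${password//$q/$q$q}"
   printf "CREATE ROLE forgejo WITH LOGIN PASSWORD '%s';\n" "$escaped"
   printf "CREATE DATABASE forgejo OWNER forgejo;\n"
 }

 # Send the SQL to Postgres only if Docker and the container are available
 if command -v docker >/dev/null 2>&1 &&
    docker ps --format '{{.Names}}' 2>/dev/null | grep -qx authentik-postgres; then
   forgejo_db_sql "choose-a-strong-password-here" |
     docker exec -i authentik-postgres psql -U authentik -d authentik
 fi
 ```

 Run it once. If you run it again, Postgres will report that the role and database
 already exist, which is harmless.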
 Now create the Forgejo service:
 ```bash
 mkdir -p ~/kitestacks-live/docker/forgejo
 cd ~/kitestacks-live/docker/forgejo
 ```
 Create `.env`:
 ```
 FORGEJO__database__DB_TYPE=postgres
 FORGEJO__database__HOST=authentik-postgres:5432
 FORGEJO__database__NAME=forgejo
 FORGEJO__database__USER=forgejo
 FORGEJO__database__PASSWD=choose-a-strong-password-here
 FORGEJO__server__DOMAIN=gitforge.yourdomain.com
 FORGEJO__server__SSH_DOMAIN=gitforge.yourdomain.com
 FORGEJO__server__ROOT_URL=https://gitforge.yourdomain.com
 ```
 Create `docker-compose.yml`:
 ```yaml
 services:
  forgejo:
    image: codeberg.org/forgejo/forgejo:latest
    container_name: forgejo
    restart: unless-stopped
    env_file: .env
    volumes:
      - ./data:/data
    networks:
      - kitestacks
 networks:
  kitestacks:
    external: true
 ```
 ```bash
 docker compose up -d
 docker logs forgejo -f
 ```
 Wait for it to finish starting (about 30 seconds), then visit `gitforge.yourdomain.com`.
 You will see a Forgejo setup page — follow the on-screen instructions to create your admin account.
 **Ask your AI:** "How do I create a repository on Forgejo and push my local files to it?"
 ---
 ## Service 3 — Authentik (Single Sign-On)
 Authentik is the most complex service to set up, but it is worth it — once done,
 you log in once and every other service recognizes you automatically.
 First, set up Redis (Authentik needs this for session management):
 ```bash
 mkdir -p ~/kitestacks-live/docker/redis
 cd ~/kitestacks-live/docker/redis
 ```
 Create `docker-compose.yml`:
 ```yaml
 services:
  authentik-redis:
    image: redis:alpine
    container_name: authentik-redis
    restart: unless-stopped
    networks:
      - kitestacks
 networks:
  kitestacks:
    external: true
 ```
 ```bash
 docker compose up -d
 ```
 Now create Authentik:
 ```bash
 mkdir -p ~/kitestacks-live/docker/authentik
 cd ~/kitestacks-live/docker/authentik
 ```
 Generate a secret key (run this and save the output):
 ```bash
 openssl rand -base64 60 | tr -d '\n'
 ```
 Create `.env` (replace the values):
 ```
 PG_PASS=same-postgres-password-from-above
 AUTHENTIK_SECRET_KEY=paste-the-generated-key-here
 AUTHENTIK_BOOTSTRAP_EMAIL=your@email.com
 AUTHENTIK_BOOTSTRAP_PASSWORD=choose-a-strong-admin-password
 AUTHENTIK_POSTGRESQL__HOST=authentik-postgres
 AUTHENTIK_POSTGRESQL__USER=authentik
 AUTHENTIK_POSTGRESQL__NAME=authentik
 AUTHENTIK_POSTGRESQL__PASSWORD=same-postgres-password-from-above
 AUTHENTIK_REDIS__HOST=authentik-redis
 ```
 Create `docker-compose.yml`:
 ```yaml
 services:
  authentik:
    image: ghcr.io/goauthentik/server:latest
    container_name: authentik
    restart: unless-stopped
    command: server
    env_file: .env
    networks:
      - kitestacks
  authentik-worker:
    image: ghcr.io/goauthentik/server:latest
    container_name: authentik-worker
    restart: unless-stopped
    command: worker
    env_file: .env
    networks:
      - kitestacks
 networks:
  kitestacks:
    external: true
 ```
 ```bash
 docker compose up -d
 ```
 Authentik takes about 2 minutes to start on first run (it sets up the database).
 Watch the logs:
 ```bash
 docker logs authentik -f
 ```
 When you see "Starting authentik server" it is ready.
 Visit `auth.yourdomain.com` and log in with the bootstrap email and password you set.
 **Ask your AI:** "I have Authentik running. How do I create an OAuth2 provider for Grafana
 so it can use SSO? Walk me through the steps in the Authentik admin panel."
 Use the same process (with your AI's help) to create OAuth2 providers for each service
 as you add them in the next steps.
 ---
 ## Checkpoint
 Before moving to Step 5:
 - [ ] Portal is live at `www.yourdomain.com`
 - [ ] Forgejo is live at `gitforge.yourdomain.com` with your admin account created
 - [ ] Authentik is live at `auth.yourdomain.com` and you can log in
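 - [ ] `docker ps` lists `homepage`, `forgejo`, `authentik`, and `authentik-worker`, plus the supporting `authentik-postgres` and `authentik-redis` containers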
 ---
 **Next:** [Step 5 — All Remaining Services](05-all-services.md)
--- a/homelab-mastery/build-guide/with-ai/05-all-services.md
+++ b/homelab-mastery/build-guide/with-ai/05-all-services.md
@ -0,0 +1,266 @@
 # Step 5 — All Remaining Services
 **Track:** With AI (Beginner)  
 **Time for this step:** 4–8 hours (take breaks — deploy one service at a time)
 In this step you will deploy the remaining eight services. For each one:
 1. Create the folder
 2. Create the `docker-compose.yml` file
 3. Run `docker compose up -d`
 4. Verify it is working
 5. Move on to the next one
 For each service, ask your AI to explain the docker-compose file before you run it.
 ---
 ## How to Use Your AI for Each Service
 For every service in this step, you can say to your AI:
 > "I am setting up [service name] in my KiteStacks homelab. It is a self-hosted [description].
 > Can you give me a docker-compose.yml for it that joins a network called 'kitestacks'?
 > I want to understand each part before I run it."
 Then ask follow-up questions about anything you do not understand.
 ---
 ## Service 4 — Open WebUI + LiteLLM (AI Chat)
 Open WebUI is your ChatGPT-style interface. LiteLLM sits behind it and routes your
 AI requests to OpenRouter (where you have free model access).
 ```bash
 mkdir -p ~/kitestacks-live/docker/kite-openwebui
 mkdir -p ~/kitestacks-live/docker/kite-litellm
 ```
 **Ask your AI:**
 > "I want to set up Open WebUI (ghcr.io/open-webui/open-webui) with LiteLLM as the
 > backend. LiteLLM should route to OpenRouter. Can you give me docker-compose files
 > for both? Container names: kite-openwebui and kite-litellm. Network: kitestacks."
 Work with your AI to get the right environment variables (you will need your OpenRouter
 API key from openrouter.ai).
 Start both:
 ```bash
 cd ~/kitestacks-live/docker/kite-litellm && docker compose up -d
 cd ~/kitestacks-live/docker/kite-openwebui && docker compose up -d
 ```
 Visit `ai.yourdomain.com` and create your admin account.
 ---
 ## Service 5 — Karakeep (Bookmarks)
 Karakeep saves bookmarks, articles, and links. It uses a headless Chrome browser
 to capture the full content of pages you save.
 ```bash
 mkdir -p ~/kitestacks-live/docker/karakeep
 ```
 **Ask your AI:**
 > "I want to set up Karakeep (ghcr.io/karakeep-app/karakeep) for bookmark management.
 > It needs a headless Chrome container (browserless/chrome) for page capture and
 > a Meilisearch container for search. Container names: karakeep, karakeep-chrome,
 > karakeep-meilisearch. All on the 'kitestacks' network. Give me one docker-compose.yml
 > for all three."
 ```bash
 cd ~/kitestacks-live/docker/karakeep && docker compose up -d
 ```
 Visit `links.yourdomain.com`.
 **Important:** When you set up SSO for Karakeep in Step 6, note that Karakeep uses
 NextAuth.js with the provider ID `custom` — so the OAuth2 redirect URL will be
 `https://links.yourdomain.com/api/auth/callback/custom` (not `/callback/authentik`).
 This is a common mistake. Make a note of it now.
 ---
 ## Service 6 — Kavita (eBook Reader)
 Kavita lets you read ebooks, manga, and comics from a library you maintain.
 ```bash
 mkdir -p ~/kitestacks-live/docker/kavita/library/books
 mkdir -p ~/kitestacks-live/docker/kavita/config
 ```
 **Ask your AI:**
 > "I want to set up Kavita (jvmilazz0/kavita) as an ebook reader. Container name: kavita.
 > The library should be mounted from ./library/books into the container. Config directory
 > at ./config. Network: kitestacks. Give me the docker-compose.yml."
 ```bash
 cd ~/kitestacks-live/docker/kavita && docker compose up -d
 ```
 Visit `kavita.yourdomain.com` and create your admin account. Add your books by placing
 ebook files in `~/kitestacks-live/docker/kavita/library/books/` and scanning the library
 in Kavita's settings.
 **Important for SSO:** Kavita's OIDC settings must be configured through the Kavita web UI,
 not by editing files directly. The Authority URL must end with a trailing slash:
 `https://auth.yourdomain.com/application/o/kavita/`
 ---
 ## Service 7 — Grafana (Monitoring Dashboards)
 Grafana shows you beautiful graphs of your server's CPU, RAM, network, and disk usage.
 ```bash
 mkdir -p ~/kitestacks-live/docker/grafana/provisioning/datasources
 mkdir -p ~/kitestacks-live/docker/grafana/provisioning/dashboards
 ```
 **Ask your AI:**
 > "I want to set up Grafana (grafana/grafana) with Prometheus as the data source.
 > I want the 'Node Exporter Full' dashboard (id 1860) to auto-load via provisioning.
 > Container name: grafana. Network: kitestacks. Give me the docker-compose.yml and
 > the provisioning YAML files for the datasource and dashboard."
 ```bash
 cd ~/kitestacks-live/docker/grafana && docker compose up -d
 ```
 Visit `grafana.yourdomain.com`.
 **Also set up Prometheus and node-exporter (Grafana needs these for data):**
 **Ask your AI:**
 > "I want to set up Prometheus to scrape metrics from node-exporter running on the same
 > host. Container names: prometheus and node-exporter. Network: kitestacks. Give me the
 > docker-compose.yml and prometheus.yml config file."
 ---
 ## Service 8 — Uptime Kuma (Status Page)
 Uptime Kuma monitors all your services and shows a public status page.
 ```bash
 mkdir -p ~/kitestacks-live/docker/uptime-kuma
 ```
 **Ask your AI:**
 > "Set up Uptime Kuma (louislam/uptime-kuma). Container name: uptime-kuma. Network: kitestacks.
 > Use a named volume called 'uptime-kuma' for data. Give me the docker-compose.yml."
 ```bash
 cd ~/kitestacks-live/docker/uptime-kuma && docker compose up -d
 ```
 Visit `status.yourdomain.com`, create your admin account, then add HTTP monitors for
 each of your eleven services. Set each monitor to check every 60 seconds.
 **Add a status page:**
 - In Uptime Kuma → Status Pages → New Status Page
 - Slug: `homelab`
 - Add all your monitors to it
 - Your public status page will be at `status.yourdomain.com/status/homelab`
 ---
 ## Service 9 — BookStack (Wiki)
 BookStack is a clean wiki for writing and organizing documentation.
 ```bash
 mkdir -p ~/kitestacks-live/docker/bookstack
 ```
 **Ask your AI:**
 > "Set up BookStack (lscr.io/linuxserver/bookstack) with its own MariaDB database.
 > Container names: bookstack and bookstack-db. APP_URL should be https://wiki.yourdomain.com.
 > Network: kitestacks. Give me the docker-compose.yml."
 ```bash
 cd ~/kitestacks-live/docker/bookstack && docker compose up -d
 ```
 BookStack takes about a minute to start on first run. Visit `wiki.yourdomain.com`.
 Default login: `admin@admin.com` / `password` — change this immediately.
 ---
 ## Service 10 — OSTicket (Help Desk)
 OSTicket is a help desk and ticketing system.
 ```bash
 mkdir -p ~/kitestacks-live/docker/osticket
 ```
 **Ask your AI:**
 > "Set up OSTicket using the docker image campbellsoftwaresolutions/osticket with its
 > own MySQL database. Container names: osticket-app and osticket-db. Network: kitestacks.
 > What environment variables do I need? Give me the docker-compose.yml."
 ```bash
 cd ~/kitestacks-live/docker/osticket && docker compose up -d
 ```
 Visit `tasks.yourdomain.com` to complete the web-based setup.
 ---
 ## Service 11 — Portainer (Docker Management)
 Portainer gives you a visual dashboard to manage all your containers.
 ```bash
 mkdir -p ~/kitestacks-live/docker/portainer
 ```
 **Ask your AI:**
 > "Set up Portainer CE (portainer/portainer-ce). Container name: portainer. Port 9443 (HTTPS).
 > Mount the Docker socket (/var/run/docker.sock) so it can manage containers.
 > Network: kitestacks. Give me the docker-compose.yml."
 ```bash
 cd ~/kitestacks-live/docker/portainer && docker compose up -d
 ```
 Visit `portainer.yourdomain.com`. Create your admin account.
 ---
 ## Checkpoint
 Run this to see all your containers:
 ```bash
 docker ps --format "table {{.Names}}\t{{.Status}}"
 ```
 You should see all of these running:
 - cloudflared
 - homepage
 - forgejo
 - authentik + authentik-worker
 - kite-openwebui + kite-litellm
 - karakeep + karakeep-chrome + karakeep-meilisearch
 - kavita
 - grafana + prometheus + node-exporter
 - uptime-kuma
 - bookstack + bookstack-db
 - osticket-app + osticket-db
 - portainer
 - authentik-postgres + authentik-redis
 If any are missing or show as unhealthy, check their logs:
 ```bash
 docker logs <container-name>
 ```
 Ask your AI to help diagnose any errors.
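 Comparing a long `docker ps` listing against the checklist by eye is error-prone.
 The sketch below prints only the names from the list above that are not currently
 running. The helper name `missing_containers` is made up for this example; the
 container names are the ones used throughout this guide.

 ```bash
 # Container names from the checklist above
 EXPECTED="cloudflared homepage forgejo authentik authentik-worker
 kite-openwebui kite-litellm karakeep karakeep-chrome karakeep-meilisearch
 kavita grafana prometheus node-exporter uptime-kuma bookstack bookstack-db
 osticket-app osticket-db portainer authentik-postgres authentik-redis"

 # Hypothetical helper: reads running container names on stdin and
 # prints every expected name that is not among them
 missing_containers() {
   local running name
   running="$(cat)"
   for name in $EXPECTED; do
     printf '%s\n' "$running" | grep -qxF -- "$name" || echo "$name"
   done
 }

 if command -v docker >/dev/null 2>&1; then
   docker ps --format '{{.Names}}' | missing_containers
 fi
 ```

 No output means every expected container is running. Any name it prints is a good
 candidate for `docker logs <container-name>`.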
 ---
 **Next:** [Step 6 — Single Sign-On (SSO)](06-sso.md)
--- a/homelab-mastery/build-guide/with-ai/06-sso.md
+++ b/homelab-mastery/build-guide/with-ai/06-sso.md
@ -0,0 +1,242 @@
 # Step 6 — Single Sign-On (SSO)
 **Track:** With AI (Beginner)  
 **Time for this step:** 3–5 hours
 SSO (Single Sign-On) means one login for everything. After this step, you will log in
 with your Authentik account once and every service will recognize you automatically.
 No more logging in to each service separately.
 ---
 ## How SSO Works (Plain English)
 Without SSO:
 ```
 You → Grafana login page → type username + password → logged in to Grafana
 You → Forgejo login page → type username + password → logged in to Forgejo
 (repeat for every service)
 ```
 With SSO:
 ```
 You → Grafana "Sign in with Authentik" button
    → Authentik asks for login (once, or already remembered)
    → Authentik tells Grafana "this is kenpat, let them in"
    → Logged in to Grafana
 You → Forgejo "Sign in with Authentik"
    → Already logged into Authentik → instantly logged in to Forgejo
 ```
 The technology behind this is called **OAuth2** and **OIDC** (OpenID Connect). For now, you do not
 need to know the details; just follow the steps. (The concepts file explains it
 in depth if you are curious: [concepts/oauth2-oidc.md](../../concepts/oauth2-oidc.md))
 ---
 ## The Process for Each Service
 For every service, you follow the same steps:
 **In Authentik:**
 1. Create an OAuth2 Provider for the service
 2. Create an Application that links to that Provider
 3. (Optional) Add a Policy to restrict who can access it
 **In the service:**
 4. Enter the Authentik credentials (client ID, client secret, URLs)
 Your AI will guide you through each one. Use this prompt template:
 > "I want to configure SSO for [service name] using Authentik as the OIDC provider.
 > The service is at https://[service].yourdomain.com. Walk me through:
 > 1. Creating an OAuth2 provider in Authentik's admin panel
 > 2. What redirect URI to use
 > 3. How to configure the service to use Authentik for login"
 ---
 ## SSO for Grafana
 **In Authentik admin panel (auth.yourdomain.com/if/admin/):**
 1. Go to **Applications → Providers → Create**
 2. Choose **OAuth2/OpenID Provider**
 3. Name: `Grafana`
 4. Client type: `Confidential`
 5. Redirect URIs: `https://grafana.yourdomain.com/login/generic_oauth`
 6. Scopes: openid, email, profile
 7. Save — note the **Client ID** and **Client Secret**
 8. Go to **Applications → Applications → Create**
 9. Name: `Grafana`, Slug: `grafana`
 10. Provider: select the Grafana provider you just created
 11. Save
 **In Grafana's `.env` or `docker-compose.yml` environment:**
 ```
 GF_AUTH_GENERIC_OAUTH_ENABLED=true
 GF_AUTH_GENERIC_OAUTH_NAME=Authentik
 GF_AUTH_GENERIC_OAUTH_CLIENT_ID=paste-client-id-here
 GF_AUTH_GENERIC_OAUTH_CLIENT_SECRET=paste-client-secret-here
 GF_AUTH_GENERIC_OAUTH_SCOPES=openid email profile
 GF_AUTH_GENERIC_OAUTH_AUTH_URL=https://auth.yourdomain.com/application/o/authorize/
 GF_AUTH_GENERIC_OAUTH_TOKEN_URL=https://auth.yourdomain.com/application/o/token/
 GF_AUTH_GENERIC_OAUTH_API_URL=https://auth.yourdomain.com/application/o/userinfo/
 GF_AUTH_GENERIC_OAUTH_ROLE_ATTRIBUTE_PATH=contains(groups, 'homelab-admin') && 'Admin' || 'Viewer'
 ```
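 Recreate the Grafana container so it picks up the new variables: `docker compose up -d grafana` (`docker compose restart` does not apply environment changes)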
 Visit `grafana.yourdomain.com` — you should see a "Sign in with Authentik" button.
 ---
 ## SSO for Forgejo
 **In Authentik:** Create an OAuth2 Provider with:
 - Redirect URI: `https://gitforge.yourdomain.com/user/oauth2/authentik/callback`
 **In Forgejo:**
 - Site Administration → Authentication Sources → Add Authentication Source
 - Type: OAuth2
 - Name: `authentik` (must match the `authentik` segment of the redirect URI above)
 - OAuth2 Provider: OpenID Connect
 - Client ID and Secret from Authentik
 - OpenID Connect Discovery URL: `https://auth.yourdomain.com/application/o/forgejo/.well-known/openid-configuration`
 **Ask your AI:** "Walk me through adding an OAuth2 authentication source in Forgejo's admin panel."
 ---
 ## SSO for Karakeep
 **Important:** Karakeep uses NextAuth.js internally. The redirect URI is NOT the usual
 `/callback/authentik` — it is `/api/auth/callback/custom`.
 **In Authentik:** Create OAuth2 Provider with:
 - Redirect URI: `https://links.yourdomain.com/api/auth/callback/custom`
 **In Karakeep's environment:**
 ```
 NEXTAUTH_URL=https://links.yourdomain.com
 NEXTAUTH_SECRET=generate-a-random-secret
 OAUTH_WELLKNOWN_URL=https://auth.yourdomain.com/application/o/karakeep/.well-known/openid-configuration
 OAUTH_CLIENT_ID=paste-client-id
 OAUTH_CLIENT_SECRET=paste-client-secret
 OAUTH_PROVIDER_NAME=Authentik
 OAUTH_ALLOW_DANGEROUS_EMAIL_ACCOUNT_LINKING=true
 ```
 ---
 ## SSO for Kavita
 **In Authentik:** Create OAuth2 Provider with:
 - Redirect URI: `https://kavita.yourdomain.com/api/auth/callback`
 **In Kavita:** Go to Settings → OIDC (must be done through the UI, not by editing files):
 - Authority: `https://auth.yourdomain.com/application/o/kavita/` ← trailing slash required
 - Client ID and Client Secret from Authentik
 - Enabled: on
 **Critical:** The trailing slash in the Authority URL is required. Without it, Kavita
 gives an "issuer does not match" error.
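 One way to see which issuer Authentik actually advertises is to read its discovery document and compare it with the Authority you typed. A rough sketch; `discovered_issuer` and `issuer_matches` are hypothetical helper names, and the `sed` pattern assumes the JSON keeps `"issuer"` and its value on one line:
 ```bash
 # Pull the "issuer" value out of an OIDC discovery document read on stdin.
 discovered_issuer() {
   sed -n 's/.*"issuer"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
 }
 # Succeed only when the configured Authority equals the issuer character for character.
 issuer_matches() {
   [ "$1" = "$2" ]
 }
 # Usage:
 #   AUTHORITY="https://auth.yourdomain.com/application/o/kavita/"
 #   ISSUER="$(curl -fsS "${AUTHORITY}.well-known/openid-configuration" | discovered_issuer)"
 #   issuer_matches "$AUTHORITY" "$ISSUER" && echo "issuer OK" || echo "issuer mismatch: $ISSUER"
 ```
 The comparison is a plain string match on purpose: a missing trailing slash is exactly the kind of difference it should catch.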
 ---
 ## SSO for Open WebUI
 **In Authentik:** Create OAuth2 Provider with:
 - Redirect URI: `https://ai.yourdomain.com/oauth/oidc/callback`
 **In Open WebUI's environment:**
 ```
 ENABLE_OAUTH_SIGNUP=true
 OAUTH_PROVIDER_NAME=Authentik
 OPENID_PROVIDER_URL=https://auth.yourdomain.com/application/o/openwebui/.well-known/openid-configuration
 OAUTH_CLIENT_ID=paste-client-id
 OAUTH_CLIENT_SECRET=paste-client-secret
 ```
 ---
 ## SSO for BookStack
 **In Authentik:** Create OAuth2 Provider with:
 - Redirect URI: `https://wiki.yourdomain.com/oidc/callback`
 - Issuer mode: **Per Provider** (important — set this in Authentik's provider settings)
 **In BookStack's `.env`:**
 ```
 AUTH_METHOD=oidc
 AUTH_AUTO_INITIATE=false
 OIDC_NAME=Authentik
 OIDC_DISPLAY_NAME_CLAIMS=name
 OIDC_CLIENT_ID=paste-client-id
 OIDC_CLIENT_SECRET=paste-client-secret
 OIDC_ISSUER=https://auth.yourdomain.com/application/o/bookstack/
 OIDC_ISSUER_DISCOVER=true
 ```
 After setting this up, the BookStack cache directory needs to be writable:
 ```bash
 docker exec bookstack chown -R abc:users /config/www/framework/cache/
 docker compose restart bookstack
 ```
 ---
 ## SSO for Portainer
 **In Authentik:** Create OAuth2 Provider with:
 - Redirect URI: `https://portainer.yourdomain.com`
 **In Portainer:** Settings → Authentication → OAuth:
 - Provider: Custom
 - Client ID and Secret from Authentik
 - Authorization URL: `https://auth.yourdomain.com/application/o/authorize/`
 - Token URL: `https://auth.yourdomain.com/application/o/token/`
 - Userinfo URL: `https://auth.yourdomain.com/application/o/userinfo/`
 - Redirect URL: `https://portainer.yourdomain.com`
 - Scopes: `openid email profile`
 **Security note:** In Authentik, add a Policy Binding to the Portainer application
 to restrict access to your admin group only. Without it, anyone with an Authentik
 account could reach the Docker management panel.
 ---
 ## Restricting Access by Group (Security)
 For sensitive services like Portainer, you want only administrators to access them:
 1. In Authentik, go to **Directory → Groups → Create**
 2. Name: `homelab-admin`
 3. Add yourself to this group
 4. Go to **Applications → Applications → [Portainer] → Policy Bindings**
 5. Add a binding: Group → `homelab-admin` → Allow
 Now only members of `homelab-admin` can use the Portainer application through SSO.
 ---
 ## Checkpoint
 Test SSO for each service:
 - [ ] Grafana — "Sign in with Authentik" works
 - [ ] Forgejo — OAuth2 login works
 - [ ] Karakeep — SSO login works
 - [ ] Kavita — "Sign in with Authentik" works
 - [ ] Open WebUI — SSO login works
 - [ ] BookStack — OIDC login works
 - [ ] Portainer — OAuth login works
 If any fail, check the error message and ask your AI: "I'm getting this error when
 signing in to [service] with Authentik: [paste the error]. What does it mean and
 how do I fix it?"
 ---
 **Next:** [Step 7 — Cloud Failover (kscloud1)](07-cloud-failover.md)
--- a/homelab-mastery/build-guide/with-ai/07-cloud-failover.md
+++ b/homelab-mastery/build-guide/with-ai/07-cloud-failover.md
@ -0,0 +1,202 @@
 # Step 7 — Cloud Failover (kscloud1)
 **Track:** With AI (Beginner)  
 **Time for this step:** 4–6 hours
 Right now, if your home computer goes off, your entire website goes offline. This step
 fixes that. You will turn your cloud VPS (kscloud1) into a full mirror of your homelab,
 so that when your home computer is off, kscloud1 keeps everything running.
 ---
 ## What You Are Building
 ```
 Home (monk)    ←—— always developing ——→ pushes to ——→   Cloud (kscloud1)
                                                          always live
                                                          never goes down
 Cloudflare routes traffic to whichever host responds.
 If monk is off, kscloud1 handles everything by itself.
 ```
 ---
 ## Step 7A — Set Up Tailscale on Both Machines
 Tailscale creates a private, encrypted connection between your home computer and your VPS.
 You need this so both machines can share a database securely.
 **On your home computer:**
 ```bash
 curl -fsSL https://tailscale.com/install.sh | sh
 sudo tailscale up
 ```
 Follow the link it gives you to authenticate in your browser.
 **On your VPS (via SSH):**
 ```bash
 curl -fsSL https://tailscale.com/install.sh | sh
 sudo tailscale up
 ```
 Authenticate again.
 After both are connected, check their Tailscale IPs:
 ```bash
 tailscale ip -4
 ```
 Write down both IPs — they look like `100.x.x.x`. You will use these in the next steps.
 **Ask your AI:** "I have Tailscale installed on two machines. How do I verify they can
 reach each other using their Tailscale IPs?"
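 Before using the addresses you wrote down, it can help to sanity-check them. Tailscale assigns IPv4 addresses from the 100.64.0.0/10 range, so the second number falls between 64 and 127. A small sketch; `is_tailscale_ip` is a hypothetical helper:
 ```bash
 # Succeed if the argument is an IPv4 address inside Tailscale's 100.64.0.0/10 range.
 is_tailscale_ip() {
   # Reject anything that is not digits and dots, or that has an empty field.
   case "$1" in
     ""|*[!0-9.]*|.*|*.|*..*) return 1 ;;
   esac
   local IFS=.
   set -- $1   # split the address into its four numbers
   [ "$#" -eq 4 ] && [ "$1" -eq 100 ] && [ "$2" -ge 64 ] && [ "$2" -le 127 ] && [ "$3" -le 255 ] && [ "$4" -le 255 ]
 }
 # Usage, from monk, with kscloud1's address:
 #   is_tailscale_ip "$KSCLOUD1_IP" || echo "that does not look like a Tailscale IP"
 #   tailscale ping "$KSCLOUD1_IP"
 ```
 `tailscale ping` then confirms that the two machines can actually reach each other over the private network.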
 ---
 ## Step 7B — Move the Shared Databases to kscloud1
 For SSO to work properly across both machines, both Authentik instances must share
 one database. Cloudflare can send any request to either host, so with separate databases
 a session created on one host is unknown to the other and logins fail roughly half the time.
 This means:
 - Move (or start fresh) Postgres and Redis on kscloud1
 - Configure both monk and kscloud1's Authentik to point to kscloud1's database over Tailscale
 **On kscloud1**, create the database containers. Use the same passwords you used on monk:
 ```bash
 mkdir -p /opt/kitestacks/docker/authentik
 cd /opt/kitestacks/docker/authentik
 ```
 Create `docker-compose.yml` with Postgres and Redis bound to the Tailscale IP:
 ```yaml
 services:
  authentik-postgres:
    image: postgres:16-alpine
    container_name: authentik-postgres
    restart: unless-stopped
    environment:
      POSTGRES_PASSWORD: your-db-password
      POSTGRES_USER: authentik
      POSTGRES_DB: authentik
    ports:
      - "100.123.x.x:5432:5432"   # bind to Tailscale IP only
    volumes:
      - ./postgres:/var/lib/postgresql/data
    networks:
      - kitestacks
  authentik-redis:
    image: redis:alpine
    container_name: authentik-redis
    restart: unless-stopped
    ports:
      - "100.123.x.x:6379:6379"   # bind to Tailscale IP only
    networks:
      - kitestacks
 networks:
  kitestacks:
    external: true
 ```
 Replace `100.123.x.x` with kscloud1's actual Tailscale IP.
 ```bash
 docker compose up -d
 ```
 **On monk**, update Authentik's environment to point to kscloud1's database:
 ```
 AUTHENTIK_POSTGRESQL__HOST=100.123.x.x   # kscloud1's Tailscale IP
 AUTHENTIK_REDIS__HOST=100.123.x.x
 ```
 Restart Authentik on monk:
 ```bash
 cd ~/kitestacks-live/docker/authentik
 docker compose down
 docker compose up -d
 ```
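 If you want to keep your existing users and providers, copy the data to kscloud1 before
 running the restart above; otherwise Authentik starts against an empty database.
 **Ask your AI:** "I need to migrate my Authentik database from one host to another.
 How do I dump the data from my current Postgres and restore it on the new host?"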
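 One possible shape for that migration, as a sketch rather than the exact procedure used for KiteStacks: stream a `pg_dump` from monk's container straight into `psql` on kscloud1. It assumes the container, user, and database names from the compose file above, that the new Postgres on kscloud1 is already running, and that Authentik's server and worker are stopped on monk while it runs. `migrate_authentik_db` and the SSH target are hypothetical.
 ```bash
 # Copy the Authentik database from the local Postgres container to the one on a remote host.
 # $1 is an SSH target such as user@your-vps-ip (placeholder, use your own login).
 migrate_authentik_db() {
   local remote="$1"
   docker exec authentik-postgres pg_dump -U authentik -d authentik --clean --if-exists |
     ssh "$remote" docker exec -i authentik-postgres psql -U authentik -d authentik
 }
 # Usage (run on monk, before pointing Authentik at the new host):
 #   migrate_authentik_db user@your-vps-ip
 ```
 Streaming over SSH avoids leaving a dump file with password hashes on disk. Keep the old `./postgres` folder on monk until you have confirmed that logins work against the new database.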
 ---
 ## Step 7C — Deploy All Services on kscloud1
 Now deploy the same services on kscloud1. SSH into your VPS and create the same
 folder structure and docker-compose files that you have on monk.
 ```bash
 mkdir -p /opt/kitestacks/docker
 ```
 For each service (forgejo, homepage, karakeep, kavita, grafana, etc.):
 1. Create the folder: `mkdir -p /opt/kitestacks/docker/<service>`
 2. Copy your docker-compose.yml from monk (with any path changes for `/opt/kitestacks/`)
 3. Copy your .env files
 4. Run `docker compose up -d`
 The fastest way is to have your AI help you:
 > "I have all my services running on my home computer at ~/kitestacks-live/docker/.
 > I want to replicate them on my VPS at /opt/kitestacks/docker/. Can you help me
 > go through each service and identify what needs to change for the VPS environment?"
 **Important differences on kscloud1:**
 - Authentik already points to the shared Postgres/Redis (same as monk now)
 - Forgejo should also use the shared Postgres (add a `forgejo` database to it)
 - Paths use `/opt/kitestacks/` instead of `~/kitestacks-live/`
 ---
 ## Step 7D — Verify Failover Works
 With both machines running and both cloudflared connectors active, test that failover works:
 1. In your Cloudflare Tunnel dashboard, you should see **2 connectors**
 2. Visit your website from your phone (not connected to home WiFi)
 3. Everything should work
 4. Now stop monk's cloudflared: `cd ~/kitestacks-live/docker/cloudflared && docker compose stop`
 5. Visit your website again from your phone
 6. Everything should still work (kscloud1 is serving it)
 7. Restart monk's cloudflared: `docker compose start cloudflared`
 If step 6 works, your cloud failover is complete.
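 The phone test can be complemented with a small probe run from any machine with internet access. A sketch, assuming `curl` is installed; `http_status` and `is_up` are hypothetical helper names and the hostnames are placeholders:
 ```bash
 # Print the HTTP status code for a URL ("000" when the request fails outright).
 http_status() {
   curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
 }
 # Treat 2xx and 3xx as "up"; a 3xx is normal when a service redirects to its login page.
 is_up() {
   case "$1" in
     2[0-9][0-9]|3[0-9][0-9]) return 0 ;;
     *) return 1 ;;
   esac
 }
 # Usage: run once with monk's cloudflared up, then again after stopping it.
 #   for host in www auth grafana; do
 #     code="$(http_status "https://${host}.yourdomain.com")"
 #     is_up "$code" && echo "${host}: up (${code})" || echo "${host}: DOWN (${code})"
 #   done
 ```
 If the second run still reports every host as up, kscloud1 is serving the traffic on its own.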
 ---
 ## Step 7E — Set Up Uptime Kuma on kscloud1
 Your Conky desktop widget reads Uptime Kuma from kscloud1 (not monk), so set it up there.
 Deploy uptime-kuma on kscloud1 the same way you did on monk. Then push your monitors
 from monk to kscloud1 by copying the database.
 **Ask your AI:** "How do I copy a SQLite database from one Docker container to another
 on a different machine, safely and without data corruption?"
 The trick is using Python's `sqlite3.Connection.backup()` method, which creates a consistent copy
 even while the database is in use.
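 A minimal sketch of that backup step, wrapped in a shell function so it fits alongside the other commands in this guide. It assumes `python3` is installed on the host and that you point it at the Uptime Kuma database file inside the mounted data directory; `backup_sqlite` and the paths in the usage lines are hypothetical.
 ```bash
 # Make a consistent copy of a SQLite database, even while another process is using it.
 # $1 = source database file, $2 = destination file
 backup_sqlite() {
   python3 -c 'import sqlite3, sys; src = sqlite3.connect(sys.argv[1]); dst = sqlite3.connect(sys.argv[2]); src.backup(dst); dst.close(); src.close()' "$1" "$2"
 }
 # Usage (paths are placeholders):
 #   backup_sqlite ./data/kuma.db /tmp/kuma-copy.db
 #   scp /tmp/kuma-copy.db user@your-vps-ip:/tmp/
 ```
 Copying the file with plain `cp` while Uptime Kuma is writing to it can produce a corrupt copy, which is why the backup API is used instead. Stop the Uptime Kuma container on kscloud1 before replacing its database file, then start it again.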
 ---
 ## Checkpoint
 - [ ] Tailscale is installed on both machines and they can reach each other
 - [ ] Shared Postgres and Redis are running on kscloud1's Tailscale IP
 - [ ] Both Authentik instances (monk and kscloud1) point to the shared database
 - [ ] All 11 services are running on kscloud1
 - [ ] Cloudflare Tunnel shows 2 connectors
 - [ ] Website works when monk's cloudflared is stopped
 ---
 **Next:** [Step 8 — Monitoring](08-monitoring.md)
--- a/homelab-mastery/build-guide/with-ai/08-monitoring.md
+++ b/homelab-mastery/build-guide/with-ai/08-monitoring.md
@ -0,0 +1,229 @@
 # Step 8 — Monitoring
 **Track:** With AI (Beginner)  
 **Time for this step:** 2–3 hours
 Monitoring means knowing when something is wrong before your users tell you.
 In this step you will set up three layers of monitoring:
 1. **Grafana** — beautiful dashboards showing CPU, RAM, disk, and network over time
 2. **Uptime Kuma** — checks every 60 seconds that each service responds correctly
 3. **Conky** — a desktop widget on your home computer showing live kscloud1 status
 ---
 ## Monitoring Layer 1 — Grafana + Prometheus
 You already deployed Grafana and Prometheus in Step 5. Now configure them properly.
 ### Edit the Prometheus Config
 Prometheus needs to know where to collect metrics from. Tell it about both machines:
 ```bash
 nano ~/kitestacks-live/docker/prometheus/prometheus.yml
 ```
 Add this content:
 ```yaml
 global:
  scrape_interval: 15s
 scrape_configs:
  - job_name: 'monk-node'
    static_configs:
      - targets: ['node-exporter:9100']
        labels:
          instance: 'monk'
  - job_name: 'kscloud1-node'
    static_configs:
      - targets: ['YOUR_VPS_IP:9100']
        labels:
          instance: 'kscloud1'
 ```
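 Replace `YOUR_VPS_IP` with your VPS's public IP address.
 **On kscloud1**, make sure node-exporter is reachable from monk. The binding below exposes port 9100 to the whole internet and node-exporter has no login by default, so restrict the port with your VPS firewall, or use kscloud1's Tailscale IP in both places instead: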
 ```yaml
 # In node-exporter's docker-compose.yml on kscloud1
 ports:
  - "0.0.0.0:9100:9100"
 ```
 Restart Prometheus:
 ```bash
 cd ~/kitestacks-live/docker/prometheus
 docker compose restart prometheus
 ```
 ### Configure Grafana Provisioning
 Tell Grafana to load Prometheus as a data source automatically and to pick up
 dashboard files from its provisioning folder:
 Create `~/kitestacks-live/docker/grafana/provisioning/datasources/prometheus.yml`:
 ```yaml
 apiVersion: 1
 datasources:
  - name: Prometheus
    type: prometheus
     uid: "000000001"
    url: http://prometheus:9090
    isDefault: true
 ```
 Create `~/kitestacks-live/docker/grafana/provisioning/dashboards/dashboards.yml`:
 ```yaml
 apiVersion: 1
 providers:
  - name: default
    folder: KiteStacks
    type: file
    options:
      path: /etc/grafana/provisioning/dashboards
 ```
 The Node Exporter Full dashboard (id 1860) can be imported from Grafana's dashboard library:
 1. Log in to grafana.yourdomain.com
 2. Left menu → Dashboards → Import
 3. Enter ID: `1860`
 4. Select your Prometheus datasource
 5. Import
 You should now see CPU, RAM, disk, and network graphs for both monk and kscloud1.
 Switch between them using the "instance" dropdown at the top of the dashboard.
 ---
 ## Monitoring Layer 2 — Uptime Kuma
 You set up Uptime Kuma in Step 5. Now add monitors for all your services.
 Log in to `status.yourdomain.com` and add an HTTP monitor for each service:
 | Monitor Name | URL | Check Interval |
 |-------------|-----|----------------|
 | Main Website | https://www.yourdomain.com | 60s |
 | Authentik | https://auth.yourdomain.com | 60s |
 | Forgejo | https://gitforge.yourdomain.com | 60s |
 | KiteAI | https://ai.yourdomain.com | 60s |
 | Karakeep | https://links.yourdomain.com | 60s |
 | Kavita | https://kavita.yourdomain.com | 60s |
 | Grafana | https://grafana.yourdomain.com | 60s |
 | BookStack | https://wiki.yourdomain.com | 60s |
 | OSTicket | https://tasks.yourdomain.com | 60s |
 | Portainer | https://portainer.yourdomain.com | 60s |
 | kscloud1 | (ping to kscloud1 IP) | 60s |
 | Monk | (ping to monk's Tailscale IP) | 60s |
 Then create a Status Page:
 1. Status Pages → New Status Page
 2. Title: "KiteStacks Status"
 3. Slug: `homelab`
 4. Add all monitors to it
 **Push Uptime Kuma to kscloud1:**
 The Conky widget on your desktop reads kscloud1's Uptime Kuma, not monk's. Push monk's
 database to kscloud1 after setting up monitors:
 **Ask your AI:** "How do I copy a Docker named volume's SQLite database from one machine
 to another using Python's sqlite3.Connection.backup() method?"
 ---
 ## Monitoring Layer 3 — Conky Desktop Widget
 Conky is a program that draws information on your desktop background in real time.
 Your KiteStacks widget shows whether each service on kscloud1 is up (green dot) or
 down (red dot), refreshed every 15 seconds.
 ### Install Conky
 ```bash
 sudo apt install conky-all
 ```
 ### Install the Widget Script
 The widget script reads Uptime Kuma's API and formats the output for Conky.
 The script lives at `apps/conky/kitestacks-uptime-widget.sh` in the homelab repo.
 Copy it to `~/.local/bin/` on your machine:
 ```bash
 mkdir -p ~/.local/bin
 cp ~/kitestacks-homelab/apps/conky/kitestacks-uptime-widget.sh ~/.local/bin/
 chmod +x ~/.local/bin/kitestacks-uptime-widget.sh
 ```
 Edit the script to use your kscloud1's Tailscale IP:
 ```bash
 nano ~/.local/bin/kitestacks-uptime-widget.sh
 ```
 Change the `KUMA_URL` line:
 ```bash
 KUMA_URL="http://100.123.x.x:3001"   # kscloud1's Tailscale IP
 ```
 ### Enable the Conky Config
 ```bash
 mkdir -p ~/.config/conky
 cp ~/kitestacks-homelab/apps/conky/kitestacks-uptime.conf ~/.config/conky/kitestacks-uptime.conf
 conky -c ~/.config/conky/kitestacks-uptime.conf -d
 ```
 The widget should appear in the top-right corner of your desktop, showing a dot for
 each service — green for up, red for down.
 **Ask your AI:** "How do I make Conky start automatically when I log in to my Ubuntu desktop?"
 ---
 ## Setting Up Alerts
 Uptime Kuma can send you a notification on your phone when a service goes down.
 **Option 1: ntfy (recommended — self-hosted)**
 This option assumes you have an ntfy server running as a container. Set up an ntfy notification in Uptime Kuma:
 - Notification Type: ntfy
 - URL: your ntfy server URL
 - Topic: choose a topic name (e.g., `homelab-alerts`)
 Install the ntfy app on your phone and subscribe to your topic.
 **Option 2: Email**
 Configure email notifications in Uptime Kuma using your email address.
 **Ask your AI:** "How do I configure Uptime Kuma to send notifications via ntfy?"
 ---
 ## Checkpoint
 - [ ] Prometheus is collecting metrics from both monk and kscloud1
 - [ ] Grafana shows Node Exporter Full dashboard with both hosts
 - [ ] Uptime Kuma has monitors for all 11 services
 - [ ] Uptime Kuma status page is live at status.yourdomain.com/status/homelab
 - [ ] Uptime Kuma database has been pushed to kscloud1
 - [ ] Conky widget is showing on your desktop with live service status
 - [ ] You receive a notification when you manually pause a service in Uptime Kuma
 ---
 ## Congratulations — Your Homelab Is Complete
 You have built a production homelab with:
 - 11 self-hosted services running in Docker
 - Single sign-on via Authentik
 - Cloud failover on a Hetzner VPS
 - Private networking over Tailscale
 - Real-time monitoring via Grafana and Uptime Kuma
 - A live desktop status widget
 Everything you built here maps directly to enterprise cloud engineering skills.
 Every concept has a certification that covers it in depth.
 **Your next step:** [certifications/roadmap.md](../../certifications/roadmap.md)
--- a/homelab-mastery/build-guide/without-ai/01-linux-foundations.md
+++ b/homelab-mastery/build-guide/without-ai/01-linux-foundations.md
@ -0,0 +1,321 @@
 # Without AI — Part 1: Linux Foundations
 **Track:** Advanced (No AI)  
 **Time for this section:** 1–2 weeks of evenings and weekends
 Before you touch Docker or any service, you need a solid foundation in Linux.
 Every command you run in this homelab is a Linux command. If you skip this,
 you will be copying without understanding — which means you cannot debug when
 things go wrong.
 ---
 ## Total Build Time Estimate (Without AI)
 Before you start, here is an honest breakdown of how long this entire homelab
 takes to build from scratch — assuming you are learning as you go, working
 2–3 hours on evenings and weekends:
 | Phase | What You Are Learning / Building | Estimated Time |
 |-------|----------------------------------|---------------|
 | 1 — Linux Foundations | Shell, filesystem, permissions, SSH | 1–2 weeks |
 | 2 — Bash Scripting | Variables, loops, conditionals, scripts | 1–2 weeks |
 | 3 — Python Basics | Data structures, sqlite3, HTTP requests | 1–2 weeks |
 | 4 — Docker Deep Dive | Images, volumes, networks, compose | 1–2 weeks |
 | 5 — Networking | DNS, ports, TLS, Tailscale, firewalls | 1–2 weeks |
 | 6 — Full Build | Deploying all 11 services + cloud failover | 4–8 weeks |
 | 7 — Troubleshooting | Debugging, production issues, fixes | Ongoing |
 | Documentation | Writing what you built and why | 1 week |
 **Total: approximately 3–6 months** working part-time (evenings + weekends).
 **Full-time (8 hours/day):** 6–10 weeks.
 The wide ranges reflect the honest reality: some people hit a DNS issue that takes
 3 hours to debug. Some services take a day to configure SSO for. Budget extra time.
 The troubleshooting you will do along the way is not wasted time — it is where most
 of the real learning happens.
 ---
 ## What Is Linux?
 Linux is an operating system, like Windows or macOS, but open source, free, and
 used to run most of the internet. Your home server, your cloud VPS, and the large
 majority of public web servers run Linux.
 **Why Linux and not Windows Server?**
 - Free — no licensing cost
 - More control — no hidden processes you can't see or stop
 - Docker runs natively on Linux (on Windows, Docker runs inside a hidden Linux VM)
 - The entire cloud engineering industry is Linux-first
 You will use **Ubuntu 24.04 LTS**, one of the most widely used Linux distributions for servers.
 ---
 ## The Terminal
 The terminal (also called the shell or command line) is where you work. There is no
 graphical interface for most server tasks. You type a command, press Enter, read the
 output, and type the next command.
 Open a terminal on Ubuntu: `Ctrl + Alt + T`
 You will see a prompt like:
 ```
 kenpat@monk:~$
 ```
 Breaking that down:
 - `kenpat` — your username
 - `monk` — the machine name (hostname)
 - `~` — your current directory (`~` means your home directory, `/home/kenpat`)
 - `$` — indicates you are a regular user (not root/admin)
 ---
 ## The Filesystem
 Linux organizes everything in a tree of directories (folders) starting at `/` (root).
 ```
 /
 ├── home/          ← user home directories
 │   └── kenpat/    ← your home directory (~)
 ├── etc/           ← system configuration files
 ├── var/           ← variable data (logs, databases)
 ├── usr/           ← installed programs
 ├── tmp/           ← temporary files (cleared on reboot)
 ├── opt/           ← optional software (we use this for kscloud1)
 └── proc/          ← virtual filesystem — represents running processes
 ```
 **Key commands:**
 ```bash
 pwd                    # Print Working Directory — where am I right now?
 ls                     # List files in current directory
 ls -la                 # List all files, including hidden ones, with permissions
 cd /home/kenpat        # Change Directory — move to a specific path
 cd ~                   # Go to your home directory
 cd ..                  # Go up one level
 mkdir mydir            # Make a new directory
 mkdir -p a/b/c         # Make directories including parents (-p = parents)
 rm file.txt            # Remove a file
 rm -rf mydir/          # Remove a directory and everything inside it (-r = recursive, -f = force)
 cp file.txt backup.txt # Copy a file
 mv file.txt newname.txt # Move or rename a file
 cat file.txt           # Print the contents of a file
 less file.txt          # View a file page by page (q to quit)
 nano file.txt          # Open a file in the nano text editor
 ```
 **Practice:** Run these commands. Navigate around the filesystem. Understand what you see.
 ```bash
 pwd                    # Where are you?
 ls /                   # What is in the root directory?
 ls /home               # What home directories exist?
 ls -la ~               # What files are in YOUR home directory? (hidden files too)
 cd /var/log            # Go to the log directory
 ls                     # What log files exist?
 cat /etc/hostname      # What is this machine's hostname?
 cd ~                   # Go back home
 ```
 ---
 ## File Permissions
 Every file in Linux has permissions that control who can read it, write to it, or
 execute it. This is crucial — misconfigured permissions are a common source of bugs.
 ```
 -rw-r--r-- 1 kenpat kenpat 1234 Jun 19 10:00 myfile.txt
 ```
 Breaking it down:
 - `-` — file type (`d` = directory, `-` = regular file, `l` = symlink)
 - `rw-` — owner permissions: read, write, no execute
 - `r--` — group permissions: read only
 - `r--` — everyone else: read only
 - `kenpat kenpat` — owner and group
 ```bash
 chmod 644 myfile.txt   # rw-r--r-- (owner read/write, others read)
 chmod 755 myscript.sh  # rwxr-xr-x (owner full, others read+execute)
 chmod +x myscript.sh   # Add execute permission for everyone
 chown kenpat:kenpat file.txt  # Change owner to kenpat, group to kenpat
 chown -R 1000:1000 /mydir/    # Change owner recursively for entire directory
 ```
 **Why this matters in Docker:** Docker containers run as specific user IDs.
 If a container expects to own a file (e.g., UID 1000) but the file is owned by
 root, the container cannot write to it. Many Docker setup issues come down to
 file permission mistakes.
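 The numeric modes used with `chmod` follow a simple rule: each digit is the sum of 4 (read), 2 (write), and 1 (execute), for owner, group, and everyone else in that order. The sketch below spells that out; `mode_to_rwx` is a hypothetical helper written for illustration, not a standard command:
 ```bash
 # Convert a three-digit octal mode such as 644 into its rwx form.
 mode_to_rwx() {
   case "$1" in
     [0-7][0-7][0-7]) ;;
     *) echo "expected three octal digits, for example 644" >&2; return 1 ;;
   esac
   local out="" digit
   # Split the mode into single digits: owner, group, everyone else.
   for digit in $(printf '%s' "$1" | sed 's/./& /g'); do
     # Test each permission bit in turn and append the letter or a dash.
     [ $(( digit & 4 )) -ne 0 ] && out="${out}r" || out="${out}-"
     [ $(( digit & 2 )) -ne 0 ] && out="${out}w" || out="${out}-"
     [ $(( digit & 1 )) -ne 0 ] && out="${out}x" || out="${out}-"
   done
   echo "$out"
 }
 # Usage:
 #   mode_to_rwx 644
 #   mode_to_rwx 755
 ```
 Compare the result with the permission column that `ls -la` prints for a file you have just run `chmod` on.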
 ---
 ## Users and sudo
 Linux separates regular users from the administrator (called `root`).
 Root can do anything — delete system files, stop critical services, change any setting.
 Regular users cannot.
 `sudo` lets a trusted user run a single command as root:
 ```bash
 sudo apt update           # Run apt update as root
 sudo systemctl restart docker   # Restart Docker as root
 sudo nano /etc/hosts      # Edit a system file as root
 ```
 **Non-interactive sudo** (used in scripts when there is no terminal to type a password). Use it sparingly: the password ends up in plain text in your script or shell history:
 ```bash
 echo mypassword | sudo -S apt update
 # -S reads password from stdin (standard input)
 ```
 **Become root entirely** (use carefully):
 ```bash
 sudo -i    # Opens a root shell. Prompt changes from $ to #
 exit       # Return to regular user
 ```
 ---
 ## SSH — Connecting to Remote Machines
 SSH (Secure Shell) lets you control a remote machine over an encrypted connection.
 ```bash
 ssh kenpat@192.168.1.100          # Connect to a local machine
 ssh root@5.78.x.x                 # Connect to your VPS as root
 ssh -i ~/.ssh/mykey kenpat@host   # Connect using a specific private key
 ssh -L 5099:localhost:5000 kenpat@host  # Local port forward
 ```
 ### SSH Keys (Better Than Passwords)
 Instead of typing a password every time, you generate a key pair:
 - **Private key** (`~/.ssh/id_ed25519`) — stays on your machine, never shared
 - **Public key** (`~/.ssh/id_ed25519.pub`) — put this on the server
 ```bash
 # Generate a new key pair
 ssh-keygen -t ed25519 -C "monk-to-kscloud1" -f ~/.ssh/id_ed25519_kscloud1
 # Copy your public key to the server
 ssh-copy-id -i ~/.ssh/id_ed25519_kscloud1.pub kenpat@your-vps-ip
 # Connect using the key
 ssh -i ~/.ssh/id_ed25519_kscloud1 kenpat@your-vps-ip
 ```
 ### SSH Local Port Forwarding
 Sometimes a service is running on a remote machine but not exposed publicly.
 You can forward a local port to a remote port through the SSH connection:
 ```bash
 ssh -L 5099:localhost:5000 kenpat@kscloud1-tailscale-ip
 ```
 This means: "On MY machine, port 5099 forwards to kscloud1's localhost:5000."
 Now visiting `http://localhost:5099` in your browser reaches kscloud1's port 5000.
 Used in this homelab to access kscloud1's Kavita directly (bypassing Cloudflare)
 when configuring OIDC settings.
 ---
 ## Package Management (apt)
 Ubuntu uses `apt` to install, update, and remove software:
 ```bash
 sudo apt update              # Refresh the list of available packages
 sudo apt upgrade -y          # Install all available updates
 sudo apt install -y curl git # Install specific packages
 sudo apt remove package      # Remove a package
 apt search keyword           # Search for a package by name (no sudo needed)
 dpkg -l | grep docker        # List installed packages matching "docker"
 ```
 ---
 ## Processes and Services
 ```bash
 ps aux                        # List all running processes
 ps aux | grep docker          # Find processes matching "docker"
 top                           # Live process monitor (q to quit)
 htop                          # Better live monitor (install with: sudo apt install htop)
 kill 1234                     # Ask process ID 1234 to exit (sends SIGTERM)
 kill -9 1234                  # Force kill with SIGKILL (cannot be caught or ignored)
 pkill conky                   # Kill all processes named "conky"
 systemctl status docker       # Check if Docker service is running
 sudo systemctl start docker   # Start it
 sudo systemctl stop docker    # Stop it
 sudo systemctl restart docker # Restart it
 sudo systemctl enable docker  # Make it start automatically on boot
 sudo systemctl disable docker # Prevent it from starting on boot
 ```
 ---
 ## Reading Logs
 When something breaks, you read the logs to find out why:
 ```bash
 journalctl -u docker          # System logs for the Docker service
 journalctl -f                 # Follow all system logs live
 cat /var/log/syslog           # System log file
 tail -f /var/log/syslog       # Follow (live tail) the system log
 sudo dmesg | tail -20         # Kernel messages, last 20 lines (needs root on current Ubuntu)
 ```
 ---
 ## Essential Tools
 ```bash
 curl -s https://example.com           # Make an HTTP GET request
 curl -s https://example.com | head    # Pipe output through head (first 10 lines)
 wget https://example.com/file.zip     # Download a file
 grep "error" /var/log/syslog          # Search a file for a pattern
 grep -r "TUNNEL_TOKEN" ~/kitestacks-live/  # Search recursively in a directory
 find ~ -name "*.env" 2>/dev/null      # Find all .env files in home dir
 find /opt -name "docker-compose.yml"  # Find all compose files
 wc -l file.txt                        # Count lines in a file
 cut -d= -f2 file.env                  # Cut: split by = and take field 2
 tr -d '\n'                            # Remove newlines from input
 # Shell operators (used between commands, not run on their own):
 #   cmd1 | cmd2                        Pipe: send output of one command to another
 #   cmd > file                         Redirect: write output to a file (overwrites)
 #   cmd >> file                        Redirect: append output to a file
 #   cmd 2>/dev/null                    Redirect error output to /dev/null (discard errors)
 ```
 ---
 ## Practice Exercises
 Do these before moving on:
 1. Navigate to `/var/log` and read the last 20 lines of `syslog`
 2. Create a directory structure: `~/practice/a/b/c/`
 3. Create a file in `c/` with your name in it using `echo "your name" > ~/practice/a/b/c/name.txt`
 4. Read it with `cat`
 5. Check its permissions with `ls -la`
 6. Change its permissions to read-only: `chmod 444 ~/practice/a/b/c/name.txt`
 7. Try to edit it — what happens?
 8. Find all `.conf` files in `/etc/` that contain the word "ubuntu"
 9. Generate an SSH key pair with `ssh-keygen`
 10. SSH into your VPS
 ---
 **Next:** [Part 2 — Bash Scripting](02-bash-scripting.md)
--- a/homelab-mastery/build-guide/without-ai/02-bash-scripting.md
+++ b/homelab-mastery/build-guide/without-ai/02-bash-scripting.md
@ -0,0 +1,333 @@
 # Without AI — Part 2: Bash Scripting
 **Track:** Advanced (No AI)  
 **Time for this section:** 1–2 weeks
 Bash is the language of the Linux shell. Almost every automation script in this
 homelab is a Bash script. You do not need to master it — you need to be able to
 read it, write simple scripts, and understand what a script does before you run it.
 ---
 ## What Is a Script?
 A script is a text file containing a sequence of shell commands. Instead of typing
 commands one by one, you put them in a file and run the file.
 ```bash
 #!/usr/bin/env bash
 # This is a comment
 echo "Hello from my script"
 ```
 The first line (`#!/usr/bin/env bash`) is called the **shebang**. It tells Linux
 which interpreter to use to run this file. Without it, Linux may use the wrong shell.
 To run a script:
 ```bash
 chmod +x myscript.sh    # Make it executable
 ./myscript.sh           # Run it
 ```
 Or without making it executable:
 ```bash
 bash myscript.sh
 ```
 ---
 ## Variables
 Variables store values you want to reuse:
 ```bash
 name="kenpat"
 port=3000
 greeting="Hello, $name"
 echo $name          # prints: kenpat
 echo $port          # prints: 3000
 echo $greeting      # prints: Hello, kenpat
 echo "${name}s"     # prints: kenpats (braces needed when appending)
 ```
 **Special variables:**
 ```bash
 $0        # The script's own filename
 $1 $2 $3  # Command-line arguments (first, second, third)
 $#        # Number of arguments passed
 $?        # Exit code of the last command (0 = success, non-zero = error)
 $$        # Current process ID (PID)
 $HOME     # Your home directory path
 $USER     # Your username
 ```
 **Environment variables:**
 ```bash
 export MY_VAR="value"    # Make available to child processes
 printenv                 # List all environment variables
 printenv MY_VAR          # Print one variable
 ```
 ---
 ## Conditionals (if/else)
 ```bash
 if [[ condition ]]; then
    # commands if true
 elif [[ other_condition ]]; then
    # commands if second condition is true
 else
    # commands if nothing was true
 fi
 ```
 **Common conditions:**
 ```bash
 [[ -f /path/to/file ]]     # True if file exists and is a regular file
 [[ -d /path/to/dir ]]      # True if directory exists
 [[ -s /path/to/file ]]     # True if file exists and is non-empty
 [[ -z "$var" ]]            # True if variable is empty
 [[ -n "$var" ]]            # True if variable is NOT empty
 [[ "$a" == "$b" ]]         # True if strings are equal
 [[ "$a" != "$b" ]]         # True if strings are NOT equal
 [[ $n -eq 5 ]]             # True if number equals 5
 [[ $n -gt 5 ]]             # True if number is greater than 5
 [[ $n -lt 5 ]]             # True if number is less than 5
 ```
 **Real example from the homelab:**
 ```bash
 if [[ $# -ne 1 ]]; then
    echo "Usage: $0 '<cloudflare_tunnel_token>'" >&2
    exit 2
 fi
 ```
 This checks that exactly one argument was provided (`$# -ne 1` means "number of args
 is not equal to 1"). If not, it prints usage instructions and exits with code 2 (error).
 The `>&2` sends the message to stderr (error output) instead of stdout (normal output).
 ---
 ## Loops
 **For loop — iterate over a list:**
 ```bash
 for item in one two three; do
    echo "Item: $item"
 done
 # Iterate over files
 for file in *.yml; do
    echo "Found compose file: $file"
 done
 # Iterate over a range of numbers
 for i in {1..10}; do
    echo "Number: $i"
 done
 ```
 **While loop — repeat while a condition is true:**
 ```bash
 count=0
 while [[ $count -lt 5 ]]; do
    echo "Count: $count"
    count=$(( count + 1 ))
 done
 # Wait until a container is healthy
 while [[ "$(docker inspect --format '{{.State.Health.Status}}' authentik)" != "healthy" ]]; do
    echo "Waiting for authentik..."
    sleep 5
 done
 echo "Authentik is healthy"
 ```
 ---
 ## Functions
 ```bash
 greet() {
    local name="$1"    # local = only exists inside this function
    echo "Hello, $name"
 }
 greet "kenpat"   # prints: Hello, kenpat
 greet "world"    # prints: Hello, world
 ```
 **Why local variables matter:** Without `local`, variables are global and can
 accidentally overwrite values from other parts of the script.
 ---
 ## Error Handling
 ```bash
 set -euo pipefail
 ```
 Put this near the top of every script you write. It sets three behaviors:
 - `-e` — exit immediately if any command fails (returns non-zero exit code)
 - `-u` — exit if you use an undefined variable
 - `-o pipefail` — if any command in a pipeline fails, the whole pipeline fails
 Without this, a script can silently continue after an error, potentially causing
 damage downstream (like deleting data after a failed backup).
 **Checking a command's result:**
 ```bash
 if curl -s https://example.com > /dev/null; then
    echo "Site is up"
 else
    echo "Site is down"
 fi
 ```
 **Exit codes:**
 ```bash
 exit 0    # Success
 exit 1    # Generic error
 exit 2    # Misuse (bad arguments)
 ```
 ---
 ## String Manipulation
 ```bash
 var="TUNNEL_TOKEN=abc123"
 # Split by delimiter, take field 2
 echo "$var" | cut -d= -f2        # prints: abc123
 # If the value itself contains = signs, -f2 cuts it short; -f2- keeps field 2 to the end
 echo "TUNNEL_TOKEN=abc=123" | cut -d= -f2-   # prints: abc=123
 # Remove newlines (including the trailing one)
 echo "hello" | tr -d '\n'
 # Convert to lowercase
 echo "HELLO" | tr '[:upper:]' '[:lower:]'
 # Replace text
 echo "hello world" | sed 's/world/there/'   # prints: hello there
 echo "aabbcc" | sed 's/b/B/g'               # prints: aaBBcc (g = all occurrences)
 # Extract with grep
 echo "addr: 192.168.1.1" | grep -oP '\d+\.\d+\.\d+\.\d+'  # prints: 192.168.1.1
 ```
 ---
 ## Here Documents (heredoc)
 A heredoc lets you write multi-line strings inline:
 ```bash
 cat <<'EOF'
 This is line one
 This is line two
 Variables like $HOME are NOT expanded (because of the quotes around EOF)
 EOF
 cat <<EOF
 This is line one
 HOME is: $HOME   (expanded because no quotes)
 EOF
 ```
 Used in this homelab to write multi-line content to files:
 ```bash
 cat > /tmp/fix.sql <<'EOF'
 BEGIN;
 UPDATE ServerSetting SET Value='{"enabled":true}' WHERE "Key"=40;
 COMMIT;
 EOF
 ```
 ---
 ## Real Scripts in This Homelab
 ### The Token Rotation Script
 `~/kitestacks-homelab/scripts/rollout-cloudflared-token.sh`:
 ```bash
 #!/usr/bin/env bash
 set -euo pipefail
 if [[ $# -ne 1 ]]; then
  echo "Usage: $0 '<cloudflare_tunnel_token>'" >&2
  exit 2
 fi
 token="$1"
 monk_dir="${MONK_CLOUDFLARED_DIR:-$HOME/kitestacks-live/docker/cloudflared}"
 kscloud1_host="${KSCLOUD1_HOST:?set KSCLOUD1_HOST, for example user@host}"
 kscloud1_key="${KSCLOUD1_KEY:-$HOME/.ssh/id_ed25519_kscloud1}"
 kscloud1_dir="${KSCLOUD1_CLOUDFLARED_DIR:-/opt/kitestacks/docker/cloudflared}"
 ```
 Walking through each line:
 - `set -euo pipefail` — fail fast and safely
 - `$# -ne 1` — check exactly one argument was given
 - `${MONK_CLOUDFLARED_DIR:-$HOME/...}` — use environment variable if set, otherwise use default
 - `${KSCLOUD1_HOST:?...}` — if `KSCLOUD1_HOST` is not set, exit with that error message
 This is a real production script. Read it in full at that path.
 ---
 ## Writing Your Own Scripts
 **Template for any script:**
 ```bash
 #!/usr/bin/env bash
 set -euo pipefail
 # --- Configuration (change these) ---
 MY_VAR="${MY_ENV_VAR:-default_value}"
 TARGET_DIR="${1:?Usage: $0 <directory>}"
 # --- Functions ---
 log() {
    echo "[$(date '+%H:%M:%S')] $*"
 }
 die() {
    echo "ERROR: $*" >&2
    exit 1
 }
 # --- Main ---
 log "Starting..."
 if [[ ! -d "$TARGET_DIR" ]]; then
    die "Directory does not exist: $TARGET_DIR"
 fi
 log "Done."
 ```
 ---
 ## Practice Exercises
 1. Write a script that checks if Docker is running and prints "Docker is up" or "Docker is down"
 2. Write a script that takes a service name as an argument and shows its logs:
   `./show-logs.sh forgejo`
 3. Write a script that loops through all directories in `~/kitestacks-live/docker/`
   and prints the service name and whether it has a `.env` file
 4. Write a script that checks if a URL returns 200 OK and prints "UP" or "DOWN":
   `./check-url.sh https://gitforge.kitestacks.com`
 5. Read and understand every line of `scripts/rollout-cloudflared-token.sh`
 ---
 **Next:** [Part 3 — Python Basics](03-python-basics.md)
--- a/homelab-mastery/build-guide/without-ai/03-python-basics.md
+++ b/homelab-mastery/build-guide/without-ai/03-python-basics.md
@ -0,0 +1,347 @@
 # Without AI — Part 3: Python Basics
 **Track:** Advanced (No AI)  
 **Time for this section:** 1–2 weeks
 Python is used in this homelab for:
 1. **Database operations** — copying SQLite databases safely between machines
 2. **HTTP requests** — hitting APIs to configure services
 3. **The metrics API** — the Python FastAPI service that feeds live stats to the portal
 4. **One-off automation** — scripts that are too complex for Bash
 You do not need to be a Python developer. You need to read Python code, understand
 what it does, modify it for your situation, and write simple scripts.
 ---
 ## Installing Python
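 Ubuntu 24.04 comes with Python 3 already installed, but `pip` and `venv` may need to be added:
 ```bash
 python3 --version    # Should show 3.12.x or similar
 pip3 --version       # Package manager for Python (if missing: sudo apt install python3-pip python3-venv)
 ```
 Ubuntu 24.04 marks the system Python as externally managed (PEP 668), so a bare
 `pip3 install` is refused. Install the packages used in this homelab into a virtual
 environment instead (the `~/.venvs/homelab` path is only an example):
 ```bash
 python3 -m venv ~/.venvs/homelab && source ~/.venvs/homelab/bin/activate
 pip install requests fastapi uvicorn psutil
 ```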
 ---
 ## Python Syntax Basics
 Python uses indentation (spaces) to define blocks of code instead of `{}` like many
 other languages. This is critical — wrong indentation causes errors.
 ```python
 # This is a comment
 name = "kenpat"              # string
 port = 3000                  # integer
 price = 4.99                 # float
 is_running = True            # boolean
 print(name)                  # prints: kenpat
 print(f"Port is {port}")     # f-string: prints: Port is 3000
 print(f"{name!r}")           # repr: prints: 'kenpat' (with quotes)
 ```
 ---
 ## Data Structures
 ```python
 # List (ordered, mutable)
 services = ["forgejo", "grafana", "authentik"]
 services.append("portainer")         # add to end
 services[0]                          # "forgejo" (zero-indexed)
 services[-1]                         # "portainer" (last item)
 len(services)                        # 4
 for service in services:
    print(service)
 # Dictionary (key-value pairs, like JSON)
 monitor = {
    "name": "Forgejo",
    "url": "https://gitforge.kitestacks.com",
    "id": 16,
    "active": True
 }
 monitor["name"]                      # "Forgejo"
 monitor.get("missing", "default")    # "default" (safe get with fallback)
 monitor.keys()                       # dict_keys(['name', 'url', 'id', 'active'])
 for key, value in monitor.items():
    print(f"{key}: {value}")
 # List of dicts (very common in API responses)
 monitors = [
    {"id": 16, "name": "Forgejo"},
    {"id": 17, "name": "Grafana"},
 ]
 for m in monitors:
    print(m["id"], m["name"])
 ```
 ---
 ## Functions and Conditionals
 ```python
 def check_service(name, url):
    """Check if a service URL is reachable."""
    if not url.startswith("https://"):
        return False
    print(f"Checking {name} at {url}")
    return True
 result = check_service("Grafana", "https://grafana.kitestacks.com")
 print(result)   # True
 ```
 **Conditionals:**
 ```python
 status = 200
 if status == 200:
    print("OK")
 elif status in (301, 302):
    print("Redirect")
 elif status >= 500:
    print("Server error")
 else:
    print(f"Unexpected status: {status}")
 ```
 ---
 ## Working with JSON
 Almost every API in this homelab sends and receives JSON (JavaScript Object Notation).
 Python's `json` module converts between JSON strings and Python dicts/lists:
 ```python
 import json
 # JSON string to Python dict
 data = json.loads('{"name": "Forgejo", "id": 16}')
 print(data["name"])   # Forgejo
 # Python dict to JSON string
 obj = {"monitors": [1, 2, 3]}
 json_str = json.dumps(obj, indent=2)
 print(json_str)
 # {
 #   "monitors": [1, 2, 3]
 # }
 # Read JSON from a file
 with open("/tmp/kuma.meta.json") as f:
    kuma_data = json.load(f)
 # Parse Uptime Kuma heartbeat data
 for monitor_id, heartbeats in kuma_data.get("heartbeatList", {}).items():
    if heartbeats:
        last = heartbeats[-1]
        status = "UP" if last["status"] == 1 else "DOWN"
        print(f"Monitor {monitor_id}: {status}")
 ```
 ---
 ## HTTP Requests with `requests`
 The `requests` library makes HTTP calls easy:
 ```python
 import requests
 # GET request
 response = requests.get("https://gitforge.kitestacks.com/api/v1/repos/search",
                        headers={"Authorization": "token your-api-token"},
                        timeout=5)
 print(response.status_code)   # 200
 data = response.json()        # Parse JSON response body
 print(data["data"][0]["name"])  # First repo name
 # POST request with JSON body
 response = requests.post(
    "https://auth.kitestacks.com/api/v3/core/tokens/",
    headers={"Authorization": "Bearer your-admin-token"},
    json={"identifier": "my-token", "user": "kenpat"},
    timeout=5
 )
 if response.ok:    # True for 2xx status codes
    print("Token created:", response.json()["key"])
 else:
    print(f"Failed: {response.status_code} {response.text}")
 ```
 ---
 ## SQLite — The Key Database Skill in This Homelab
 SQLite is a database that lives in a single file. Uptime Kuma, Kavita, and other services
 use SQLite. This homelab uses Python's `sqlite3` module to copy those databases safely between machines.
 ```python
 import sqlite3
 # Connect to a database file
 conn = sqlite3.connect("/path/to/kuma.db")
 # Run a query
 cursor = conn.execute("SELECT id, name, url FROM monitor ORDER BY id")
 rows = cursor.fetchall()     # Get all results
 for row in rows:
    print(row[0], row[1], row[2])
 # Insert data
 conn.execute(
    "INSERT INTO monitor (name, type, url, active) VALUES (?, ?, ?, ?)",
    ("BookStack", "http", "https://wiki.kitestacks.com", 1)
 )
 conn.commit()    # Save changes (without commit, nothing is written)
 # Use a transaction explicitly (safer for multiple changes)
 conn.execute("BEGIN")
 conn.execute("UPDATE monitor SET active=1 WHERE id=26")
 conn.execute("UPDATE monitor SET active=1 WHERE id=27")
 conn.execute("COMMIT")
 conn.close()
 ```
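The explicit `BEGIN`/`COMMIT` pattern can also be written with the connection as a context manager, which commits on success and rolls back if an exception escapes the block. This is a minimal sketch, assuming a simplified, hypothetical `monitor` table in an in-memory database (not Uptime Kuma's real schema); `activate_monitors` is an illustrative helper, not part of the homelab scripts.

```python
import sqlite3


def activate_monitors(conn, monitor_ids):
    """Set active=1 for every given ID, or change nothing if any ID is missing."""
    with conn:  # commits on success, rolls back if an exception escapes
        for monitor_id in monitor_ids:
            cur = conn.execute(
                "UPDATE monitor SET active=1 WHERE id=?", (monitor_id,)
            )
            if cur.rowcount == 0:
                raise LookupError(f"no monitor with id {monitor_id}")


def count_active(conn):
    return conn.execute("SELECT COUNT(*) FROM monitor WHERE active=1").fetchone()[0]


# Hypothetical demo data in a throwaway in-memory database
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE monitor (id INTEGER PRIMARY KEY, name TEXT, active INTEGER)")
conn.executemany(
    "INSERT INTO monitor (id, name, active) VALUES (?, ?, 0)",
    [(26, "BookStack"), (27, "Kavita")],
)
conn.commit()

try:
    activate_monitors(conn, [26, 999])   # 999 does not exist, so 26 is rolled back too
except LookupError as exc:
    print(f"Rolled back: {exc}")
after_failure = count_active(conn)

activate_monitors(conn, [26, 27])        # both IDs exist, so both updates are committed
after_success = count_active(conn)

print(after_failure, after_success)
conn.close()
```

The point of the design is that a half-applied change never reaches the database file: either every `UPDATE` in the block lands, or none does.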
 ### The `backup()` Method — Copying Databases Safely
 SQLite databases in WAL (write-ahead log) mode cannot be copied safely with a plain file copy
 while they are in use. The `Connection.backup()` method creates a consistent snapshot:
 ```python
 import sqlite3
 def safe_backup(source_path, dest_path):
    """Copy a SQLite database safely, even if it's in use."""
    src = sqlite3.connect(source_path)
    dst = sqlite3.connect(dest_path)
    src.backup(dst)       # Creates a consistent copy
    dst.close()
    src.close()
    print(f"Backed up {source_path} to {dest_path}")
 safe_backup("/src/kuma.db", "/out/kuma.db.backup")
 ```
 **Why a plain `cp` can fail:** SQLite in WAL mode keeps two extra files next to the database:
 `kuma.db-wal` (committed changes not yet checkpointed into the main file) and `kuma.db-shm`
 (a shared-memory index for the WAL). If you copy the main file without the WAL, or while a
 write is in progress, the copy can be missing recent data or be corrupted.
 `Connection.backup()` handles all of this correctly.
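To see this behavior without touching a real service, you can build a small WAL-mode database in a temporary directory and back it up while a writer connection is still open. This is a sketch under stated assumptions: the `heartbeat` table and the file names are stand-ins invented for the demo, not Uptime Kuma's actual schema.

```python
import os
import sqlite3
import tempfile


def safe_backup(source_path, dest_path):
    """Copy a SQLite database with Connection.backup()."""
    src = sqlite3.connect(source_path)
    dst = sqlite3.connect(dest_path)
    try:
        src.backup(dst)
    finally:
        dst.close()
        src.close()


with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "kuma.db")          # stand-in file, not the real database
    dest = os.path.join(tmp, "kuma.db.backup")

    writer = sqlite3.connect(source)
    writer.execute("PRAGMA journal_mode=WAL")
    writer.execute("CREATE TABLE heartbeat (id INTEGER PRIMARY KEY, status INTEGER)")
    writer.executemany("INSERT INTO heartbeat (status) VALUES (?)", [(1,), (1,), (0,)])
    writer.commit()

    # The writer is still open, so recent commits may live only in the -wal file
    wal_exists = os.path.exists(source + "-wal")

    safe_backup(source, dest)

    # Open the copy and confirm it is complete and internally consistent
    check = sqlite3.connect(dest)
    copied_rows = check.execute("SELECT COUNT(*) FROM heartbeat").fetchone()[0]
    integrity = check.execute("PRAGMA integrity_check").fetchone()[0]
    check.close()
    writer.close()

print("WAL file present during backup:", wal_exists)
print("Rows in backup:", copied_rows, "| integrity_check:", integrity)
```

Running `PRAGMA integrity_check` on the copy is a cheap habit worth keeping for real backups as well.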
 ---
 ## Writing a Simple FastAPI Service
 The kitestacks-metrics-api is a Python FastAPI service. Understanding it helps you
 modify or extend it:
 ```python
 from fastapi import FastAPI
 import psutil
 app = FastAPI()
@app.get("/api/health")
 def health():
    return {"ok": True}
@app.get("/api/metrics")
 def metrics():
    return {
        "cpu_percent": psutil.cpu_percent(interval=1),
        "ram_percent": psutil.virtual_memory().percent,
        "ram_used_gb": psutil.virtual_memory().used / 1e9,
        "disk_percent": psutil.disk_usage("/").percent,
    }
 ```
 Run it:
 ```bash
 uvicorn myapi:app --host 0.0.0.0 --port 8000
 ```
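 `psutil` reads CPU and memory values from `/proc`. Those files are not namespaced, so they
 show host-wide figures even inside a container. The process list and network counters are
 namespaced: `pid: host` exposes the host's processes, and `network_mode: host` exposes the
 host's network interfaces.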
 ---
 ## Environment Variables in Python
 ```python
 import os
 token = os.environ.get("FORGEJO_TOKEN")           # None if not set
 token = os.environ.get("FORGEJO_TOKEN", "")       # Empty string if not set
 token = os.environ["FORGEJO_TOKEN"]               # KeyError if not set (explicit)
 # Check and fail clearly
 token = os.environ.get("FORGEJO_TOKEN")
 if not token:
    raise ValueError("FORGEJO_TOKEN environment variable is required")
 ```
 ---
 ## File Operations
 ```python
 import os
 # Read a file
 with open("/tmp/kuma.json") as f:
    content = f.read()
 # Write a file
 with open("/tmp/output.sql", "w") as f:
    f.write("UPDATE ServerSetting SET Value='test' WHERE \"Key\"=40;\n")
 # Check if a file exists
 if os.path.exists("/data/kuma.db"):
    print("Database found")
 # Delete a file safely
 for fname in ["/data/kuma.db-shm", "/data/kuma.db-wal"]:
    if os.path.exists(fname):
        os.remove(fname)
        print(f"Removed {fname}")
 # List files in a directory
 for filename in os.listdir("/app/data"):
    print(filename)
 ```
 ---
 ## Practice Exercises
 1. Write a Python script that reads `monitors.json` from Uptime Kuma's API response
   and prints each monitor's name and status
 2. Write a script that connects to a SQLite database, lists all tables, and prints
   the first 5 rows of the `monitor` table
 3. Write a script that uses `requests` to check if all 11 KiteStacks URLs return
   a status code between 200 and 399, and prints a summary
 4. Read the kitestacks-metrics-api source code and understand what each endpoint does
 5. Modify the `safe_backup()` function to also delete `-shm` and `-wal` files from
   the destination before writing (prevents WAL conflicts after restore)
 ---
 **Next:** [Part 4 — Docker Deep Dive](04-docker-deep-dive.md)
--- a/homelab-mastery/build-guide/without-ai/04-docker-deep-dive.md
+++ b/homelab-mastery/build-guide/without-ai/04-docker-deep-dive.md
@ -0,0 +1,303 @@
 # Without AI — Part 4: Docker Deep Dive
 **Track:** Advanced (No AI)  
 **Time for this section:** 1–2 weeks
 Docker is the technology that runs every service in this homelab. Understanding it
 deeply — not just copying compose files — is what separates someone who can maintain
 and troubleshoot a homelab from someone who hopes nothing breaks.
 ---
 ## What Docker Actually Is
 Most explanations say "containers are like lightweight VMs." That is wrong and leads
 to confusion. Here is what a container actually is:
 **A container is a Linux process with isolation applied.**
 Two Linux kernel features provide that isolation:
 **Namespaces** — the container gets its own view of:
 - Filesystem (it sees `/` but it is a different tree than the host's `/`)
 - Network interfaces (its own `eth0`, its own IP on the Docker network)
 - Process list (it can only see its own processes, not the host's)
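 - User IDs (with user-namespace remapping, "root" inside maps to an unprivileged user on the host; Docker leaves remapping off by default, so container root is host UID 0)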
 **cgroups (control groups)** — limits how much of the host's resources the container can use:
 - CPU cores and usage limits
 - RAM limits
 - Disk I/O limits
 - Process count (PID) limits
 **Result:** No second kernel, no hardware emulation, no hypervisor. The nginx process
 in your `homepage` container is a regular Linux process on your machine — it just
 thinks it is alone.
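You can check the "just a process" claim yourself: on Linux, every process lists the namespaces it belongs to under `/proc/<pid>/ns`. The sketch below reads those links for the current process. `namespaces_of` is an illustrative helper name, and the function returns an empty dict on systems without `/proc`.

```python
import os


def namespaces_of(pid="self"):
    """Return {namespace_type: identifier} for a process, read from /proc (Linux only)."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    if not os.path.isdir(ns_dir):
        return result              # not Linux, or /proc is unavailable
    for entry in sorted(os.listdir(ns_dir)):
        try:
            # Each entry is a symlink such as "net:[4026531840]"
            result[entry] = os.readlink(os.path.join(ns_dir, entry))
        except OSError:
            continue               # no permission to read this entry
    return result


for ns_type, ident in namespaces_of().items():
    print(f"{ns_type:20} {ident}")
```

Run it once on the host and once inside a container whose image includes Python. Differing identifiers for `net`, `mnt`, or `pid` mean the two processes live in different namespaces, while both still run on the same kernel.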
 ---
 ## Images vs Containers
 ```
 Image                        Container
 ─────────────────────────    ─────────────────────────────────────────
 A recipe                     A running instance made from the recipe
 Read-only, immutable         Has a writable layer on top of the image
 Stored in layers             One writable layer per container
 Shared across containers     Separate per container
 Survives container deletion  Deleted with the container (unless volume)
 ```
 **Layers:** Docker images are built in layers. Each filesystem-changing instruction in a
 `Dockerfile` (`RUN`, `COPY`, `ADD`) creates a layer. When an image is updated, only the changed
 layer and the layers built on top of it are re-downloaded. This is why pulling an update
 is usually fast: most layers are already local.
 ```bash
 docker image ls                         # List local images
 docker image inspect nginx:alpine       # See image metadata and layers
 docker image history nginx:alpine       # See how the image was built, layer by layer
 docker image pull postgres:16-alpine    # Download an image explicitly
 docker image rm nginx:alpine            # Remove a local image
 ```
 ---
 ## Docker Networks — In Depth
 Docker provides several networking modes:
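 **bridge (default):** Container gets its own virtual network interface with a private IP
 (172.x.x.x range by default). Containers on the same bridge network can reach each other by IP.
 On a user-defined bridge network (such as `kitestacks`) they can also reach each other by name
 via Docker's built-in DNS; the default `bridge` network does not provide name resolution.
 Containers on different bridge networks are isolated.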
 **host:** Container shares the host's network namespace entirely. `--network host` means
 no isolation — the container sees all host network interfaces and binds directly to
 host ports. Used for kitestacks-metrics-api so psutil can see real network stats.
 **none:** No networking at all. Rarely used.
 ```bash
 # Create a named bridge network
 docker network create kitestacks
 # See all networks
 docker network ls
 # Inspect a network — see which containers are connected and their IPs
 docker network inspect kitestacks
 # Connect a running container to a network
 docker network connect kitestacks my-container
 # Disconnect
 docker network disconnect kitestacks my-container
 ```
 **The DNS trick:** When containers are attached to a user-defined bridge network, Docker runs
 an embedded DNS server at `127.0.0.11` inside each container. Container names resolve to their
 internal IPs. This is why `cloudflared` can connect to `http://grafana:3000`:
 Docker DNS resolves `grafana` to the grafana container's IP.
 ```bash
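 # Verify DNS works from inside the network. The official cloudflared image is distroless
 # (no shell, nslookup, or curl), so run the checks from a throwaway Alpine container.
 docker run --rm --network kitestacks alpine nslookup grafana
 docker run --rm --network kitestacks alpine wget -qO- http://grafana:3000/api/health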
 ```
 ---
 ## Volumes — Persisting Data
 Containers are ephemeral. When you delete a container, its writable layer is gone.
 To keep data, you use volumes.
 **Bind mount:** You choose the path on the host.
 ```yaml
 volumes:
  - ./data:/forgejo-data           # host path : container path
  - /home/kenpat/books:/books:ro   # :ro = read-only
 ```
 Data is at `./data` on the host. You can navigate there with `cd`. You can back it up.
 **Named volume:** Docker manages the path.
 ```yaml
 # Inside the service definition:
 volumes:
  - uptime-kuma:/app/data
 # At the top level of the compose file:
 volumes:
  uptime-kuma:              # define the named volume
 ```
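 Data is at `/var/lib/docker/volumes/uptime-kuma/_data/` on the host (Docker manages this).
 Compose prefixes the volume name with the project name unless the volume sets `name:`,
 so confirm the real name with `docker volume ls`.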
 ```bash
 docker volume ls                            # List named volumes
 docker volume inspect uptime-kuma           # See where it is stored
 docker volume rm uptime-kuma                # Delete a volume (and its data!)
 ```
 **Access a named volume from a one-off container:**
 ```bash
 docker run --rm -v uptime-kuma:/data alpine ls /data
 ```
 This is the pattern used throughout this homelab to read or modify volumes without
 stopping the running service (for reads) or after stopping it (for writes).
 ---
 ## Docker Compose — The Full Picture
 Docker Compose reads a YAML file and manages the lifecycle of multiple containers.
 ```yaml
 services:
  forgejo:
    image: codeberg.org/forgejo/forgejo:latest
    container_name: forgejo           # Fixed name (not random)
    restart: unless-stopped           # Restart on crash or host reboot
    env_file: .env                    # Load environment variables from file
    environment:
      FORGEJO__server__DOMAIN: gitforge.kitestacks.com   # Override one env var
    volumes:
      - ./data:/data                  # Bind mount: ./data on host → /data in container
    ports:
      - "127.0.0.1:2222:22"          # Bind host 127.0.0.1:2222 to container port 22 (SSH)
    networks:
      - kitestacks
    depends_on:
      - authentik-postgres             # Start this service before forgejo
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/api/healthz"]
      interval: 30s
      timeout: 10s
      retries: 3
 networks:
  kitestacks:
    external: true                    # Use existing network (don't create a new one)
 ```
 **Key fields explained:**
 `restart: unless-stopped`
 - `no` — never restart
 - `always` — always restart; a manually stopped container starts again when the Docker daemon restarts
 - `on-failure` — restart only if exit code is non-zero
 - `unless-stopped` — restart on crash or reboot, but not if you manually stopped it
 `env_file: .env`
 Reads `KEY=VALUE` pairs from a file. The `.env` file is in `.gitignore` so secrets
 never get committed to git. Always use this for passwords, tokens, and secrets.
 `depends_on`
 Starts services in dependency order. Does NOT wait for a service to be "ready" —
 just waits for the container to START. If you need to wait for a database to be ready,
 add a health check and use `condition: service_healthy`.
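The difference between "started" and "ready" is easy to see in a script that has to wait for a service itself. The sketch below polls a TCP port until it accepts connections or a timeout expires. `wait_for_port` is a hypothetical helper, and the demo uses a throwaway local listener instead of a real database. An open port is also a weaker signal than a proper healthcheck (PostgreSQL's `pg_isready`, for example), so treat this as an illustration of the idea rather than a replacement for `condition: service_healthy`.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)   # not accepting connections yet; try again
    return False


# Demo against a throwaway local listener instead of a real database
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))    # port 0 = let the OS pick a free port
listener.listen()
free_port = listener.getsockname()[1]

up = wait_for_port("127.0.0.1", free_port, timeout=2)     # listener is accepting
listener.close()
down = wait_for_port("127.0.0.1", free_port, timeout=1)   # nothing listening any more

print("while listening:", up)
print("after close:", down)
```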
 **Common commands:**
 ```bash
 docker compose up -d              # Start all services in background
 docker compose down               # Stop and remove containers (not volumes)
 docker compose down -v            # Stop, remove containers AND volumes (data loss!)
 docker compose restart forgejo    # Restart one service
 docker compose pull               # Pull latest images
 docker compose logs -f forgejo    # Follow logs for one service
 docker compose ps                 # Show service status
 docker compose exec forgejo bash  # Open shell in running service
 docker compose config             # Validate and show merged config
 ```
 ---
 ## Port Mappings — When to Use Them
 ```yaml
 ports:
  - "3005:3000"           # host_port:container_port
  - "127.0.0.1:3005:3000" # bind to localhost only (not accessible from outside host)
  - "0.0.0.0:9100:9100"   # bind on all interfaces (accessible from outside)
 ```
 **In this homelab, most services do NOT expose host ports** — they only communicate
 through the Docker network. Cloudflare Tunnel connects directly to the container via
 the Docker bridge network, so no host ports are needed for public services.
 Host ports in this homelab:
 - `node-exporter` on kscloud1 publishes a host port (so Prometheus on monk can scrape it via public IP)
 - `portainer` publishes 9443 (HTTPS)
 - `kitestacks-metrics-api` publishes no ports; it uses `network_mode: host` and binds directly on the host
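To make the `ports:` syntax concrete, here is a small parser for the two forms shown in this section. It is a simplified sketch: `parse_port_mapping` and `reachable_from_outside` are illustrative names, and the parser ignores things real Compose accepts, such as `/udp` suffixes, port ranges, and IPv6 addresses. Whether a port is actually reachable also depends on your firewall.

```python
from typing import NamedTuple


class PortMapping(NamedTuple):
    host_ip: str
    host_port: int
    container_port: int


def parse_port_mapping(spec):
    """Parse "host_port:container_port" or "host_ip:host_port:container_port" (IPv4 only)."""
    parts = spec.split(":")
    if len(parts) == 2:
        host_ip = "0.0.0.0"            # no IP given: Docker publishes on all interfaces
        host_port, container_port = parts
    elif len(parts) == 3:
        host_ip, host_port, container_port = parts
    else:
        raise ValueError(f"unsupported port mapping: {spec!r}")
    return PortMapping(host_ip, int(host_port), int(container_port))


def reachable_from_outside(mapping):
    """True unless the mapping is bound to a loopback address."""
    return not mapping.host_ip.startswith("127.")


for spec in ["3005:3000", "127.0.0.1:3005:3000", "0.0.0.0:9100:9100"]:
    mapping = parse_port_mapping(spec)
    print(f"{spec:22} -> {mapping}  external: {reachable_from_outside(mapping)}")
```

The takeaway is that a mapping without an IP behaves like the `0.0.0.0` form, which is why binding to `127.0.0.1` is the safer default for anything that only the host itself needs to reach.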
 ---
 ## Inspecting and Debugging
 ```bash
 # See everything about a container
 docker inspect forgejo
 # See just its IP address on each network
 docker inspect forgejo --format '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}'
 # See its environment variables (careful — this shows secrets!)
 docker inspect forgejo --format '{{range .Config.Env}}{{println .}}{{end}}'
 # See its mounts
 docker inspect forgejo --format '{{json .Mounts}}' | python3 -m json.tool
 # See resource usage
 docker stats                    # Live, all containers
 docker stats forgejo --no-stream # One snapshot for one container
 # See what the container's filesystem looks like
 docker exec forgejo ls /
 docker exec forgejo cat /etc/forgejo/app.ini
 docker exec forgejo find /data -name "*.db" 2>/dev/null
 ```
 ---
 ## Common Gotchas
 **Containers share the host's kernel:** If an image expects kernel features or syscalls that
 your host kernel is too old to provide, parts of it may not work. Rare but real.
 **Named volumes are invisible by default:** New developers spend hours wondering where
 data went after deleting a container. Named volumes survive `docker compose down`.
 They do NOT survive `docker compose down -v`.
 **Order vs readiness:** `depends_on` does not mean "wait until ready." A Postgres
 container starts in milliseconds, but PostgreSQL inside it takes 3–5 seconds to accept
 connections. Use healthchecks for real readiness checking.
 **Port conflicts:** Two containers cannot bind the same host port. If you get
 `Bind for 0.0.0.0:3000 failed: port is already allocated`, something else is already
 using that host port.
 **network_mode: host and named networks cannot coexist:**
 ```yaml
 network_mode: host    # This means the container has NO network isolation
 # You cannot also add networks: [...] — they conflict
 ```
 ---
 ## Practice Exercises
 1. Pull the `nginx:alpine` image and run it: `docker run -d -p 8080:80 nginx:alpine`
   Visit `http://localhost:8080`. Then exec into it and find the nginx config.
 2. Run two containers (`alpine`) on the same custom network and verify they can
   ping each other by container name
 3. Create a named volume and mount it in two different containers. Write a file from
   one container and read it from the other
 4. Write a `docker-compose.yml` with three services: one nginx, one redis, one alpine
   that waits for redis to be healthy before starting
 5. Use `docker inspect` to find the IP address of your `forgejo` container on the
   `kitestacks` network. Confirm it matches what Docker DNS resolves.
 ---
 **Next:** [Part 5 — Networking](05-networking.md)
--- a/homelab-mastery/build-guide/without-ai/05-networking.md
+++ b/homelab-mastery/build-guide/without-ai/05-networking.md
@ -0,0 +1,352 @@
 # Without AI — Part 5: Networking
 **Track:** Advanced (No AI)  
 **Time for this section:** 1–2 weeks
 Networking is the hardest part to learn and the most important. Every problem in this
 homelab ultimately involves a packet trying to get somewhere. If you understand how
 packets travel, you can debug anything.
 ---
 ## IP Addresses
 Every device on a network has an IP address — a number that identifies it.
 **IPv4:** Four octets (0–255) separated by dots: `192.168.1.205`
 **Address ranges:**
 | Range | Who Owns It | Used For |
 |-------|------------|---------|
 | `10.0.0.0/8` | Private | Corporate networks, VPNs |
 | `172.16.0.0/12` | Private | Docker bridge networks |
 | `192.168.0.0/16` | Private | Home networks (your router) |
 | `100.64.0.0/10` | Shared | Tailscale uses this range |
 | Everything else | Public | Routable on the internet |
 Private addresses are not routable on the internet. Your home router uses NAT
 (Network Address Translation) to let private-addressed devices reach the internet.
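Python's standard `ipaddress` module can check which of the ranges in the table an address falls into. The sketch below follows the table's simplification: anything outside the four listed ranges is reported as `public`, even though other reserved ranges (loopback `127.0.0.0/8`, for example) also exist. `classify` is an illustrative helper name.

```python
import ipaddress

# Ranges from the table above, paired with a short label
RANGES = [
    (ipaddress.ip_network("10.0.0.0/8"), "private"),
    (ipaddress.ip_network("172.16.0.0/12"), "private"),
    (ipaddress.ip_network("192.168.0.0/16"), "private"),
    (ipaddress.ip_network("100.64.0.0/10"), "shared"),
]


def classify(address):
    """Return "private", "shared", or "public" for an IPv4 address string."""
    ip = ipaddress.ip_address(address)
    for network, label in RANGES:
        if ip in network:
            return label
    return "public"


for addr in ["192.168.1.205", "172.18.0.2", "172.32.0.1", "100.64.0.1", "8.8.8.8"]:
    print(f"{addr:15} {classify(addr)}")
```

Note that `172.32.0.1` falls just outside `172.16.0.0/12`, so it is a public address despite looking similar to a Docker one.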
 ---
 ## Subnetting and CIDR Notation
 CIDR (Classless Inter-Domain Routing) notation describes a range of IP addresses:
 ```
 192.168.1.0/24
              │
              └── prefix length: how many bits are fixed
 ```
 An IPv4 address is 32 bits. A `/24` means the first 24 bits are fixed (the network),
 leaving 8 bits for hosts. `2^8 = 256` addresses, minus network (`.0`) and broadcast (`.255`)
 = 254 usable host addresses.
 | CIDR | Addresses | Usable | Example |
 |------|-----------|--------|---------|
 | `/32` | 1 | 1 | A single IP |
 | `/31` | 2 | 2 | Point-to-point link |
 | `/30` | 4 | 2 | Small link |
 | `/29` | 8 | 6 | Small subnet |
 | `/28` | 16 | 14 | |
 | `/27` | 32 | 30 | |
 | `/26` | 64 | 62 | |
 | `/25` | 128 | 126 | |
 | `/24` | 256 | 254 | Typical home/office LAN |
 | `/16` | 65,536 | 65,534 | Large network |
 | `/12` | 1,048,576 | 1,048,574 | Docker range: 172.16.0.0/12 |
 | `/8` | 16,777,216 | 16,777,214 | 10.x.x.x range |
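The arithmetic behind the table can be checked with the standard `ipaddress` module. This sketch computes the total addresses, usable hosts, and first and last host for a block; `describe` is an illustrative helper, and the `/31` and `/32` special cases follow the table's convention that every address in those blocks is usable.

```python
import ipaddress


def describe(cidr):
    """Return (total addresses, usable hosts, first host, last host) for an IPv4 block."""
    net = ipaddress.ip_network(cidr)
    total = net.num_addresses
    if net.prefixlen >= 31:
        # /31 (point-to-point) and /32 (single IP): every address is usable
        usable = total
        first, last = net.network_address, net.broadcast_address
    else:
        # Subtract the network address and the broadcast address
        usable = total - 2
        first, last = net.network_address + 1, net.broadcast_address - 1
    return total, usable, str(first), str(last)


for block in ["192.168.1.0/24", "172.17.0.0/16", "10.0.0.0/30", "192.168.1.205/32"]:
    total, usable, first, last = describe(block)
    print(f"{block:18} total={total:<6} usable={usable:<6} hosts {first} to {last}")
```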
 **Subnetting practice:** Calculating the host range of `172.17.0.0/16`:
 - Fixed: `172.17` (first 16 bits)
 - Variable: last 16 bits
 - Host range: `172.17.0.1` to `172.17.255.254`
 - This covers all of `172.17.x.x`
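The same host-range calculation can be scripted. The sketch below uses plain shell arithmetic; the function names (`ip_to_int`, `int_to_ip`, `cidr_host_range`) are invented for this example, and it assumes a prefix of `/30` or shorter, where the network and broadcast addresses are excluded.

```bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local IFS=.
  set -- $1                      # split the address on dots into $1..$4
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Convert a 32-bit integer back to dotted-quad form.
int_to_ip() {
  echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# Print the first and last usable host of a CIDR block (hypothetical helper).
cidr_host_range() {
  local addr=${1%/*} prefix=${1#*/}
  local size=$(( 1 << (32 - prefix) ))                      # total addresses in the block
  local network=$(( $(ip_to_int "$addr") & ~(size - 1) ))   # clear the host bits
  local broadcast=$(( network + size - 1 ))
  echo "$(int_to_ip $(( network + 1 ))) - $(int_to_ip $(( broadcast - 1 ))) ($(( size - 2 )) usable)"
}

cidr_host_range 172.17.0.0/16    # 172.17.0.1 - 172.17.255.254 (65534 usable)
cidr_host_range 192.168.1.0/24   # 192.168.1.1 - 192.168.1.254 (254 usable)
```

Working the numbers by hand first and then checking them against a script like this is a reasonable way to practice subnetting.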
 **Why `/12` covers Docker's default bridge networks:**
 `172.16.0.0/12` covers `172.16.0.0` through `172.31.255.255`.
 By default, Docker creates bridge networks in the `172.17.x.x`, `172.18.x.x`, etc. ranges.
 All of those are inside `172.16.0.0/12`, so one ufw rule covers them. (If Docker runs out of
 `172.x` subnets it falls back to `192.168.x.x` ranges, which this rule does not cover.)
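To check whether a given address falls inside a CIDR block, compare the network bits of both addresses. This is a minimal sketch; `ip_to_int` and `in_cidr` are hypothetical helper names, and the sample addresses are only illustrations.

```bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local IFS=.
  set -- $1
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed (exit 0) when address $1 lies inside CIDR block $2.
in_cidr() {
  local prefix=${2#*/}
  # Build a mask with the top <prefix> bits set, e.g. /12 -> 255.240.0.0
  local mask=$(( (4294967295 << (32 - prefix)) & 4294967295 ))
  [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "${2%/*}") & mask )) ]
}

for ip in 172.17.0.1 172.18.0.5 172.31.255.254 172.32.0.1 192.168.1.205; do
  if in_cidr "$ip" 172.16.0.0/12; then
    echo "$ip is inside 172.16.0.0/12"
  else
    echo "$ip is outside 172.16.0.0/12"
  fi
done
```

The first three addresses land inside the block and the last two land outside, which is why a single `/12` rule matches the `172.17`–`172.31` bridge subnets but nothing else.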
 ---
 ## Ports
 A port is a 16-bit number (0–65535) that identifies which service on a host should
 handle a connection.
 ```
 IP address = the building
 Port       = the apartment number
 ```
 **Common ports (0–1023 are "well-known"; 1024–49151 are "registered"):**
 | Port | Protocol | Service |
 |------|----------|---------|
 | 22 | TCP | SSH |
 | 25 | TCP | SMTP (email sending) |
 | 53 | UDP/TCP | DNS |
 | 80 | TCP | HTTP |
 | 443 | TCP | HTTPS |
 | 5432 | TCP | PostgreSQL |
 | 6379 | TCP | Redis |
 **Ephemeral ports:** The OS assigns these randomly for outbound connections. IANA reserves 49152–65535 for this purpose; Linux defaults to 32768–60999.
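The IANA port ranges can be expressed as a small lookup. This is only an illustrative sketch (`port_class` is a made-up name); it classifies by IANA range, not by what a particular OS actually uses for outbound connections.

```bash
# Classify a port number by IANA range.
port_class() {
  if [ "$1" -le 1023 ]; then
    echo "well-known"
  elif [ "$1" -le 49151 ]; then
    echo "registered"
  else
    echo "dynamic"
  fi
}

for p in 22 443 5432 6379 9100 49152; do
  printf '%-6s %s\n' "$p" "$(port_class "$p")"
done
```

Note that PostgreSQL (5432) and Redis (6379) come out as registered ports, not well-known ones.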
 **In Docker:**
 ```yaml
 ports:
  - "9100:9100"   # host:container — both the same number
 ```
 Container port 9100 is mapped to host port 9100.
 External systems connect to the host IP on port 9100.
 Internally, containers on the Docker network use the container port directly.
 ---
 ## DNS (Domain Name System)
 DNS is a distributed database that maps names to IP addresses.
 **The hierarchy:**
 ```
 . (root)
 └── com
    └── kitestacks
        ├── www      →  Cloudflare anycast IP
        ├── auth     →  Cloudflare anycast IP
        └── grafana  →  Cloudflare anycast IP
 ```
 **Resolution process for `grafana.kitestacks.com`:**
 1. Browser checks local cache — not found
 2. Browser asks OS resolver (usually `127.0.0.53`)
 3. OS asks the configured DNS server (your home router, or 8.8.8.8)
 4. Resolver asks root nameservers: "who handles `.com`?"
 5. Root says: "Ask Verisign's servers"
 6. Resolver asks Verisign: "who handles `kitestacks.com`?"
 7. Verisign says: "Ask Cloudflare's nameservers (`vera.ns.cloudflare.com`)"
 8. Resolver asks Cloudflare: "what is `grafana.kitestacks.com`?"
 9. Cloudflare returns: "Cloudflare's anycast IP: 104.x.x.x"
 10. Browser connects to 104.x.x.x on port 443
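The delegation chain in the steps above follows the labels of the name from right to left. The sketch below only does the string manipulation (no network access); `dns_zones` is a hypothetical helper name.

```bash
# Print each zone a resolver walks through, from the root down to the full name.
dns_zones() {
  local name=${1%.} zone="" label   # drop a trailing dot if one was given
  echo "."                          # the root zone
  while [ -n "$name" ]; do
    label=${name##*.}               # rightmost remaining label
    zone="$label.$zone"
    echo "$zone"
    case $name in
      *.*) name=${name%.*} ;;       # strip the label just consumed
      *)   name="" ;;
    esac
  done
}

dns_zones grafana.kitestacks.com
# .
# com.
# kitestacks.com.
# grafana.kitestacks.com.
```

To watch the real delegation happen, `dig +trace grafana.kitestacks.com` performs the walk against live nameservers.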
 **Internal Docker DNS:**
 Inside the `kitestacks` Docker network, Docker runs a DNS server at `127.0.0.11`.
 When cloudflared resolves `grafana`, Docker DNS returns the container's bridge IP.
 ```bash
 # Check what an external name resolves to
 dig grafana.kitestacks.com
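 # Check DNS from inside the Docker network. The official cloudflared image ships
 # without a shell or nslookup, so use a throwaway busybox container on the same network
 docker run --rm --network kitestacks busybox nslookup grafana
 docker run --rm --network kitestacks busybox cat /etc/resolv.conf   # Shows the DNS server: 127.0.0.11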
 ```
 ---
 ## HTTP and HTTPS
 **HTTP:** Plain text request/response protocol. Anyone who can see the traffic can read it.
 ```
 GET /api/health HTTP/1.1
 Host: grafana.kitestacks.com
 Accept: application/json
 HTTP/1.1 200 OK
 Content-Type: application/json
 {"ok": true}
 ```
 **HTTPS:** HTTP inside a TLS-encrypted tunnel. The connection is encrypted from the client to
 Cloudflare's edge, and again inside the tunnel from the edge to cloudflared. Only the last hop,
 from cloudflared to your containers on the Docker network, is plain HTTP. That is acceptable
 because this traffic never leaves the host.
 **TLS handshake (simplified):**
 1. Client says "hello, I support these cipher suites"
 2. Server sends its certificate (proves it is `kitestacks.com`)
 3. Client verifies certificate against trusted Certificate Authorities
 4. Both sides agree on encryption keys (Diffie-Hellman key exchange)
 5. Encrypted connection established
 6. HTTP requests flow inside this encrypted tunnel
 In this homelab, Cloudflare handles TLS entirely. Your containers never see TLS.
 ---
 ## Cloudflare Tunnel — Technical Details
 **What cloudflared actually does:**
 ```bash
 # Watch cloudflared connect
 docker logs cloudflared -f
 # You see: "Connection established" connIndex=0 location=ORD
 # ORD = Chicago data center (or nearest Cloudflare POP to you)
 ```
 cloudflared establishes persistent, multiplexed outbound connections to Cloudflare's
 edge network. When a request comes in:
 ```
 Internet user → Cloudflare edge → tunnel (multiplexed) → cloudflared
                                                               ↓
 cloudflared reads Ingress rules from Cloudflare API:
  grafana.kitestacks.com → http://grafana:3000
 cloudflared → Docker DNS → grafana container IP → sends request
 ```
 The tunnel connection uses QUIC (UDP-based) when possible and falls back to HTTP/2 over TCP.
 **Active-Active with two connectors:**
 Each connector registers separately. Cloudflare maintains a list of active connectors.
 Incoming requests are distributed across connectors by Cloudflare — no configuration
 needed on your end. If one connector drops, the others take all traffic within seconds.
 ---
 ## Tailscale — WireGuard Under the Hood
 Tailscale is a managed WireGuard VPN. Understanding WireGuard explains Tailscale.
 **WireGuard:**
 - Modern VPN protocol, designed in 2016
 - Uses UDP only, avoiding the overhead of tunneling TCP inside TCP (OpenVPN can run over either UDP or TCP)
 - Cryptography: Curve25519 key exchange, ChaCha20-Poly1305 encryption
 - Each peer has a public/private key pair (like SSH keys)
 - Configured via static peer lists with IP allowances
 **The NAT problem:** Home machines are behind NAT. Their public IP is the router's IP,
 not their own. Two NAT-ed machines cannot easily make direct connections.
 **Tailscale's solution — UDP hole punching:**
 1. Both machines connect to Tailscale's coordination server, which exchanges their public keys
   and candidate endpoints. Until a direct path exists, traffic flows through an encrypted DERP relay
 2. Tailscale orchestrates a "hole punch": both machines send packets to each other
   simultaneously, which opens NAT mappings on both routers
 3. Direct WireGuard connection established peer-to-peer
 4. The DERP relay drops out of the data path (it remains available as a fallback if hole punching fails)
 ```bash
 # Check Tailscale status
 tailscale status
 # See your device's Tailscale IP
 tailscale ip -4
 # Check connectivity to kscloud1
 tailscale ping 100.123.x.x
 # See if connection is direct or via relay
 tailscale status --json | python3 -m json.tool | grep -A5 "kscloud1"
 ```
 **Why Tailscale IPs are stable:** Each device's `100.x.x.x` IP is tied to its machine
 identity in Tailscale's database. It does not change when you move networks or reconnect.
 ---
 ## Firewalls (ufw)
 ufw (Uncomplicated Firewall) is a frontend for iptables/nftables.
 **kscloud1's firewall configuration:**
 ```bash
 # View current rules
 sudo ufw status verbose
 # Default policies
 sudo ufw default deny incoming    # Block all inbound by default
 sudo ufw default allow outgoing   # Allow all outbound
 # Allow specific services
 sudo ufw allow ssh                 # Allow SSH (port 22)
 sudo ufw allow from 172.16.0.0/12 to any port 8000 proto tcp  # Docker → metrics API
 # Why 172.16.0.0/12 and not just the specific Docker subnet?
 # Docker assigns each new bridge network its own subnet, by default from 172.17–172.31.
 # /12 covers all of those, so this rule keeps working when networks are recreated.
 ```
 **The ufw/Docker conflict:** Docker modifies iptables rules directly, bypassing ufw.
 This means Docker's port mappings (`-p 9100:9100`) are accessible regardless of ufw rules.
 Only services running in `network_mode: host` are controlled by ufw.
 kscloud1's metrics API uses `network_mode: host`, so it needs an explicit ufw allow rule
 for Docker containers to reach it.
 ---
 ## Reverse Proxies
 A reverse proxy receives requests on behalf of backend services:
 ```
 Client → Reverse Proxy → Backend A
                      → Backend B
                      → Backend C
 ```
 In this homelab:
 - **Cloudflare + cloudflared** — the primary reverse proxy routing by hostname
 - **nginx (homepage container)** — secondary proxy forwarding `/api/*` to metrics API
 nginx config that proxies API calls:
 ```nginx
 location /api/ {
    proxy_pass http://host.docker.internal:8000/;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
 }
 ```
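 `host.docker.internal` resolves to the host machine's IP from inside a Docker container.
 On Linux this name is not defined automatically: the container needs
 `extra_hosts: ["host.docker.internal:host-gateway"]` in its compose file.
 This lets the nginx container reach the metrics API running in `network_mode: host`.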
 ---
 ## Diagnosing Network Problems
 **"I can't reach the service from outside"**
 ```bash
 # Is cloudflared running and connected?
 docker logs cloudflared | tail -20
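 # Is the target container running and on the right network? (prints network names)
 docker inspect homepage --format '{{range $name, $net := .NetworkSettings.Networks}}{{println $name}}{{end}}'
 # Can a container on the same network reach it? (the cloudflared image has no curl,
 # so use a throwaway curl container)
 docker run --rm --network kitestacks curlimages/curl -s http://homepage:3000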
 ```
 **"Two containers can't talk to each other"**
 ```bash
 # Are they on the same network?
 docker network inspect kitestacks | grep -A5 "Containers"
 # DNS resolution working?
 docker exec service-a nslookup service-b
 # Is the target port open inside the container?
 docker exec service-b ss -tlnp
 ```
 **"The database won't accept connections"**
 ```bash
 # Is Postgres listening?
 docker exec authentik-postgres ss -tlnp | grep 5432
 # From another container, can we reach it?
 docker exec authentik nc -zv authentik-postgres 5432
 # Is the published port bound to the right host interface on kscloud1?
 docker port authentik-postgres 5432
 # Should show: 100.123.x.x:5432 (the Tailscale IP), not 0.0.0.0:5432
 ```
 ---
 **Next:** [Part 6 — Full Build](06-full-build.md)
--- a/homelab-mastery/build-guide/without-ai/06-full-build.md
+++ b/homelab-mastery/build-guide/without-ai/06-full-build.md
@ -0,0 +1,478 @@
 # Without AI — Part 6: Full Build
 **Track:** Advanced (No AI)  
 **Time for this section:** 4–8 weeks
 You now have the foundations: Linux, Bash, Python, Docker, and Networking.
 This section builds the entire KiteStacks homelab from scratch — command by command,
 with every command explained.
 ---
 ## Before You Start
 You need:
 - Ubuntu 24.04 installed on your home PC (monk) and your VPS (kscloud1)
 - A domain name with DNS managed by Cloudflare
 - SSH key access to kscloud1
 - Tailscale account and CLI installed on both machines
 - Cloudflare account with a tunnel created (token saved)
 ---
 ## Phase 1 — Prepare Both Machines
 Run on **both monk and kscloud1**:
 ```bash
 # Update the system
 sudo apt update && sudo apt upgrade -y
 # Install essential tools
 sudo apt install -y curl git nano wget python3 python3-pip ufw
 # Install Docker
 sudo apt install -y ca-certificates curl
 sudo install -m 0755 -d /etc/apt/keyrings
 sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg \
  -o /etc/apt/keyrings/docker.asc
 sudo chmod a+r /etc/apt/keyrings/docker.asc
 echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] \
  https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
 sudo apt update
 sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
 # Enable and start Docker
 sudo systemctl enable docker
 sudo systemctl start docker
 # Add your user to the docker group (avoids sudo for every docker command)
 sudo usermod -aG docker $USER
 # Log out and back in (or run `newgrp docker`) for this to take effect
 # Create the shared Docker network (run this once the new group membership is active)
 docker network create kitestacks
 ```
 On **kscloud1** specifically, set up the firewall:
 ```bash
 sudo ufw default deny incoming
 sudo ufw default allow outgoing
 sudo ufw allow ssh
 # Allow Docker bridge networks to reach host port 8000 (metrics API)
 sudo ufw allow from 172.16.0.0/12 to any port 8000 proto tcp
 sudo ufw --force enable
 sudo ufw status verbose
 ```
 Install Tailscale on both machines:
 ```bash
 curl -fsSL https://tailscale.com/install.sh | sh
 sudo tailscale up
 # Follow the URL to authenticate
 tailscale ip -4   # Note this IP — you will use it throughout the build
 ```
 ---
 ## Phase 2 — Cloudflared (Tunnel Connector)
 Run on **monk**:
 ```bash
 mkdir -p ~/kitestacks-live/docker/cloudflared
 cd ~/kitestacks-live/docker/cloudflared
 cat > .env <<'EOF'
 TUNNEL_TOKEN=your-tunnel-token-from-cloudflare
 EOF
 cat > docker-compose.yml <<'EOF'
 services:
  cloudflared:
    image: cloudflare/cloudflared:latest
    container_name: cloudflared
    restart: unless-stopped
    command: tunnel --no-autoupdate run
    environment:
      - TUNNEL_TOKEN=${TUNNEL_TOKEN:?set TUNNEL_TOKEN in .env}
    networks:
      - default
      - kitestacks
 networks:
  kitestacks:
    external: true
 EOF
 docker compose up -d
 docker logs cloudflared   # Confirm "Connection established"
 ```
 **Why `${TUNNEL_TOKEN:?set TUNNEL_TOKEN in .env}`:**
 The `:?` syntax means: if the variable is unset or empty, exit with the given error message.
 This prevents silently running cloudflared with no token (which would produce a confusing error).
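Compose borrows this syntax from the shell, so the behavior can be tried out directly. The sketch below is a shell illustration, not Compose itself; `check_token` is a made-up function name and the token values are placeholders.

```bash
# Report whether TUNNEL_TOKEN passes the ${VAR:?message} check.
check_token() {
  # A failed ${VAR:?} expansion makes a non-interactive shell exit.
  # Running it inside ( ... ) confines that exit to the subshell.
  if ( : "${TUNNEL_TOKEN:?set TUNNEL_TOKEN in .env}" ) 2>/dev/null; then
    echo "token present"
  else
    echo "token missing"
  fi
}

TUNNEL_TOKEN=""
check_token          # token missing (empty counts as missing because of the colon)

TUNNEL_TOKEN="example-value"
check_token          # token present
```

Without the colon (`${TUNNEL_TOKEN?message}`), an empty value would pass and only an unset variable would fail.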
 Repeat on **kscloud1** using the same token, same docker-compose.yml, at `/opt/kitestacks/docker/cloudflared/`.
 ---
 ## Phase 3 — Shared Database Layer (on kscloud1)
 The shared Postgres and Redis will run on kscloud1. Both monk's and kscloud1's Authentik
 will point to these. Forgejo will use the same Postgres (different database).
 On **kscloud1**:
 ```bash
 # Get kscloud1's Tailscale IP
 TAILSCALE_IP=$(tailscale ip -4)
 echo "Tailscale IP: $TAILSCALE_IP"
 mkdir -p /opt/kitestacks/docker/authentik
 cd /opt/kitestacks/docker/authentik
 # Generate a strong Postgres password
 PG_PASS=$(openssl rand -base64 32 | tr -d '/+=')
 echo "Postgres password: $PG_PASS"  # Save this
 cat > .env <<EOF
 PG_PASS=${PG_PASS}
 EOF
 cat > docker-compose.yml <<EOF
 services:
  authentik-postgres:
    image: postgres:16-alpine
    container_name: authentik-postgres
    restart: unless-stopped
    environment:
      POSTGRES_PASSWORD: \${PG_PASS}
      POSTGRES_USER: authentik
      POSTGRES_DB: authentik
    ports:
      - "${TAILSCALE_IP}:5432:5432"
    volumes:
      - ./postgres:/var/lib/postgresql/data
    networks:
      - kitestacks
      - authentik_default
  authentik-redis:
    image: redis:7-alpine
    container_name: authentik-redis
    restart: unless-stopped
    ports:
      - "${TAILSCALE_IP}:6379:6379"
    networks:
      - kitestacks
      - authentik_default
 networks:
  kitestacks:
    external: true
  authentik_default:
    name: authentik_default
 EOF
 docker compose up -d
 docker ps   # Confirm both containers are Up
 # Verify the published port is bound to the Tailscale IP only (NOT 0.0.0.0)
 docker port authentik-postgres 5432
 # Expected: 100.x.x.x:5432
 ```
 **Why the Tailscale IP binding matters:**
 `"${TAILSCALE_IP}:5432:5432"` tells Docker: bind host port 5432 only on the Tailscale
 interface. If you used `"5432:5432"` (or `"0.0.0.0:5432:5432"`), Postgres would be
 reachable from the public internet — a serious security risk. Only devices on your
 Tailscale network can reach `100.x.x.x:5432`.
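The difference between the two-part and three-part forms comes down to whether a host address is present. The sketch below just parses the string to make that visible; `describe_mapping` is a hypothetical helper, `100.101.102.103` is a stand-in for your real Tailscale IP, and IPv6 host addresses are not handled.

```bash
# Explain which host interface a Compose port mapping binds to.
describe_mapping() {
  local IFS=:
  set -- $1                       # split the mapping on colons
  case $# in
    2) echo "all interfaces:$1 -> container:$2" ;;
    3) echo "$1:$2 -> container:$3" ;;
    *) echo "unrecognized mapping" ;;
  esac
}

describe_mapping 5432:5432                   # all interfaces:5432 -> container:5432
describe_mapping 100.101.102.103:5432:5432   # 100.101.102.103:5432 -> container:5432
```

After the container is up, `docker port authentik-postgres` shows the binding Docker actually created.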
 Create the Forgejo database:
 ```bash
 docker exec -e PGPASSWORD="${PG_PASS}" authentik-postgres \
  psql -U authentik -c "CREATE USER forgejo WITH PASSWORD 'forgejo-password-here';"
 docker exec -e PGPASSWORD="${PG_PASS}" authentik-postgres \
  psql -U authentik -c "CREATE DATABASE forgejo OWNER forgejo;"
 ```
 ---
 ## Phase 4 — Authentik (SSO)
 On **monk** first:
 ```bash
 mkdir -p ~/kitestacks-live/docker/authentik
 cd ~/kitestacks-live/docker/authentik
 # Get kscloud1's Tailscale IP
 KSCLOUD1_TAILSCALE=100.123.x.x   # Replace with your actual value
 # Generate Authentik secret key (must be same on both hosts)
 SECRET_KEY=$(openssl rand -base64 60 | tr -d '\n')
 echo "Secret key: $SECRET_KEY"    # Save this — both hosts need the SAME key
 cat > .env <<EOF
 PG_PASS=your-postgres-password-from-phase-3
 AUTHENTIK_SECRET_KEY=${SECRET_KEY}
 AUTHENTIK_POSTGRESQL__HOST=${KSCLOUD1_TAILSCALE}
 AUTHENTIK_POSTGRESQL__USER=authentik
 AUTHENTIK_POSTGRESQL__NAME=authentik
 AUTHENTIK_POSTGRESQL__PASSWORD=your-postgres-password-from-phase-3
 AUTHENTIK_REDIS__HOST=${KSCLOUD1_TAILSCALE}
 AUTHENTIK_BOOTSTRAP_EMAIL=your@email.com
 AUTHENTIK_BOOTSTRAP_PASSWORD=choose-strong-password
 EOF
 cat > docker-compose.yml <<'EOF'
 services:
  authentik:
    image: ghcr.io/goauthentik/server:latest
    container_name: authentik
    restart: unless-stopped
    command: server
    env_file: .env
    networks:
      - kitestacks
  authentik-worker:
    image: ghcr.io/goauthentik/server:latest
    container_name: authentik-worker
    restart: unless-stopped
    command: worker
    env_file: .env
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    networks:
      - kitestacks
 networks:
  kitestacks:
    external: true
 EOF
 docker compose up -d
 # Wait for Authentik to be healthy (takes ~2 minutes on first boot)
 until [[ "$(docker inspect --format '{{.State.Health.Status}}' authentik)" == "healthy" ]]; do
  echo "Waiting for Authentik... $(docker inspect --format '{{.State.Health.Status}}' authentik)"
  sleep 10
 done
 echo "Authentik is healthy"
 ```
 **What happens on first boot:** Authentik runs database migrations (creates all tables),
 generates cryptographic keys, and starts the server. The worker process handles
 background jobs (email, background flows). Both need the same `.env` file.
 **Why `AUTHENTIK_REDIS__HOST` and not just `REDIS_HOST`:**
 Authentik uses a config format where `__` in environment variable names means "nested key".
 `AUTHENTIK_POSTGRESQL__HOST` maps to `postgresql.host` in the config tree (the `AUTHENTIK_` prefix is dropped).
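The naming convention can be sketched as a string transformation. This is an illustration of the convention only, not Authentik's own loader; `env_to_key` is a made-up name, and it shows the path below the `AUTHENTIK_` prefix.

```bash
# Turn an AUTHENTIK_* variable name into its dotted config path.
env_to_key() {
  printf '%s\n' "${1#AUTHENTIK_}" |   # drop the prefix
    sed 's/__/./g' |                  # double underscore = one nesting level
    tr '[:upper:]' '[:lower:]'        # config keys are lowercase
}

env_to_key AUTHENTIK_POSTGRESQL__HOST   # postgresql.host
env_to_key AUTHENTIK_REDIS__HOST        # redis.host
env_to_key AUTHENTIK_SECRET_KEY         # secret_key
```

A single underscore stays part of the key name; only the double underscore creates nesting.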
 On **kscloud1**, create the same Authentik setup pointing to the local Postgres:
 ```bash
 # On kscloud1, AUTHENTIK_POSTGRESQL__HOST should be authentik-postgres
 # (via the Docker network), not the Tailscale IP
 # kscloud1's Authentik is on the same Docker network as Postgres
 ```
 ---
 ## Phase 5 — Forgejo
 On **monk**:
 ```bash
 mkdir -p ~/kitestacks-live/docker/forgejo
 cd ~/kitestacks-live/docker/forgejo
 KSCLOUD1_TAILSCALE=100.123.x.x   # kscloud1's Tailscale IP
 cat > .env <<EOF
 FORGEJO__database__DB_TYPE=postgres
 FORGEJO__database__HOST=${KSCLOUD1_TAILSCALE}:5432
 FORGEJO__database__NAME=forgejo
 FORGEJO__database__USER=forgejo
 FORGEJO__database__PASSWD=forgejo-password-from-phase-3
 FORGEJO__server__DOMAIN=gitforge.yourdomain.com
 FORGEJO__server__ROOT_URL=https://gitforge.yourdomain.com
 FORGEJO__server__SSH_DOMAIN=gitforge.yourdomain.com
 EOF
 cat > docker-compose.yml <<'EOF'
 services:
  forgejo:
    image: codeberg.org/forgejo/forgejo:latest
    container_name: forgejo
    restart: unless-stopped
    env_file: .env
    volumes:
      - ./data:/data
    networks:
      - kitestacks
 networks:
  kitestacks:
    external: true
 EOF
 docker compose up -d
 docker logs forgejo -f   # Watch for errors
 ```
 Visit `gitforge.yourdomain.com`. Complete the initial setup, then create your admin account.
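 On **kscloud1**: Same configuration. Both Forgejo instances point to the same Postgres `forgejo` database, so users, settings, and repository metadata are identical on both. The Git repository contents themselves live on disk under each host's `./data` volume, not in Postgres, so that directory must also be kept in sync between hosts.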
 ---
 ## Phase 6 — All Remaining Services
 For each remaining service, the pattern is the same:
 1. `mkdir -p ~/kitestacks-live/docker/<service>`
 2. Create `.env` with secrets
 3. Create `docker-compose.yml`
 4. `docker compose up -d`
 5. Verify with `docker ps` and `docker logs <container>`
 Detailed compose files for each service are in `~/kitestacks-homelab/apps/<service>/`.
 Use those as your reference — read each file before running it.
 Key services and their main configuration points:
 **Karakeep:** Provider ID is `custom` (not `authentik`) — OAuth redirect URI is
 `https://links.yourdomain.com/api/auth/callback/custom`.
 **Kavita:** OIDC must be configured via web UI (Settings → OIDC), not by file editing.
 Authority URL requires trailing slash.
 **BookStack:** After first start, fix cache permissions:
 ```bash
 docker exec bookstack chown -R abc:users /config/www/framework/cache/
 docker compose restart bookstack
 ```
 **kitestacks-metrics-api:**
 ```yaml
 services:
  kitestacks-metrics-api:
    image: your-metrics-api-image   # Build from apps/kitestacks-metrics-api/
    container_name: kitestacks-metrics-api
    restart: unless-stopped
    network_mode: host    # Must be host — not kitestacks network
    pid: host             # Must be host — reads /proc for real stats
    environment:
      - FORGEJO_API_BASE=https://gitforge.yourdomain.com
      - FORGEJO_TOKEN=your-forgejo-api-token
 ```
 Note: `network_mode: host` and `networks:` cannot coexist. The metrics API is reachable
 at `host.docker.internal:8000` from other containers.
 ---
 ## Phase 7 — SSO Configuration
 For each service, in Authentik admin panel (`auth.yourdomain.com/if/admin/`):
 1. **Applications → Providers → Create → OAuth2/OpenID Provider**
   - Client type: Confidential
   - Redirect URIs: service-specific (see SSO guide)
   - Signing key: authentik Self-signed Certificate
   - Scopes: openid, email, profile
 2. **Applications → Applications → Create**
   - Provider: the one you just created
   - Launch URL: the service's public URL
 3. (For sensitive services) **Policy Binding** → restrict to `homelab-admin` group
 OAuth2 code TTL — increase to prevent `invalid_grant` during monk reconnect:
 ```bash
 # Connect to shared Postgres from kscloud1
 docker exec -it authentik-postgres psql -U authentik -d authentik
 -- Increase code lifetime for all providers to 10 minutes
 -- (Authentik stores this as a duration string such as 'minutes=10')
 UPDATE authentik_providers_oauth2_oauth2provider
 SET access_code_validity = 'minutes=10';
 -- Restart both Authentik instances after this
 \q
 ```
 ---
 ## Phase 8 — Push Everything to kscloud1
 With monk as the source, push configurations to kscloud1:
 ```bash
 # For each service, copy the docker-compose.yml and .env (with paths adjusted)
 # The standard pattern:
 for service in forgejo karakeep kavita grafana uptime-kuma bookstack osticket portainer; do
  ssh -i ~/.ssh/id_ed25519_kscloud1 kenpat@100.123.x.x \
    "mkdir -p /opt/kitestacks/docker/$service"
  scp -i ~/.ssh/id_ed25519_kscloud1 \
    ~/kitestacks-live/docker/$service/docker-compose.yml \
    ~/kitestacks-live/docker/$service/.env \
    kenpat@100.123.x.x:/opt/kitestacks/docker/$service/
 done
 ```
 Then on kscloud1, start each service:
 ```bash
 for service in forgejo karakeep kavita grafana uptime-kuma bookstack osticket portainer; do
  cd /opt/kitestacks/docker/$service
  docker compose up -d
 done
 ```
 Verify all 11 services return the expected status:
 ```bash
 for url in www auth gitforge ai links kavita grafana status wiki tasks portainer; do
  code=$(curl -s -o /dev/null -w "%{http_code}" "https://${url}.yourdomain.com" --max-time 5)
  echo "${url}.yourdomain.com: ${code}"
 done
 ```
 All should return 200 or 302 (redirect to login).
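If you want the check to flag problems instead of only printing codes, the status can be classified. This is a small sketch; `classify_status` is a hypothetical helper, and the codes in the loop are sample inputs, not real results.

```bash
# Map an HTTP status code from curl to a verdict.
classify_status() {
  case $1 in
    200|302) echo "ok" ;;
    000)     echo "no response" ;;   # curl reports 000 when no HTTP reply arrived
    *)       echo "unexpected" ;;
  esac
}

for code in 200 302 502 000; do
  echo "$code: $(classify_status "$code")"
done
```

In the verification loop, `echo "${url}.yourdomain.com: ${code} ($(classify_status "$code"))"` would print the verdict next to each hostname.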
 ---
 ## Committing Everything to Forgejo
 Once your homelab is working, commit all configurations:
 ```bash
 cd ~/kitestacks-live
 git init -b main   # name the first branch "main" so the push below matches
 git remote add origin https://gitforge.yourdomain.com/kenpat/kitestacks-live.git
 # Add a .gitignore BEFORE adding files: never commit secrets
 cat > .gitignore <<'EOF'
 **/.env
 **/data/
 **/postgres/
 **/config/
 **/*.db
 **/*.db-shm
 **/*.db-wal
 EOF
 git add .gitignore docker/*/docker-compose.yml
 git commit -m "initial: all service compose files"
 git push -u origin main
 ```
 Your `.env` files (which contain passwords and tokens) must NEVER be committed.
 The `.gitignore` above prevents this.
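A quick safety check before pushing is to scan the list of tracked files for anything named `.env`. The filter below is a minimal sketch; `secret_paths` is a made-up name and the sample paths are illustrative.

```bash
# Read file paths on stdin and print any that are named exactly ".env".
secret_paths() {
  grep -E '(^|/)\.env$' || true   # "|| true": no match is not an error here
}

printf '%s\n' \
  docker/forgejo/docker-compose.yml \
  docker/forgejo/.env \
  .gitignore | secret_paths
# docker/forgejo/.env
```

In the repository, `git ls-files | secret_paths` should print nothing. Any output means a secrets file is tracked and must be removed from the index before pushing.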
 ---
 **Next:** [Part 7 — Troubleshooting](07-troubleshooting.md)
--- a/homelab-mastery/build-guide/without-ai/07-troubleshooting.md
+++ b/homelab-mastery/build-guide/without-ai/07-troubleshooting.md
@ -0,0 +1,389 @@
 # Without AI — Part 7: Troubleshooting
 **Track:** Advanced (No AI)  
 **Time for this section:** Ongoing (this is a reference you return to)
 Troubleshooting is not a step you complete — it is a skill you build over time.
 This section teaches the methodology and documents the real issues encountered
 building KiteStacks, with full explanations of how each was diagnosed and fixed.
 ---
 ## The Troubleshooting Mindset
 Before running any command, form a hypothesis. Before Googling, read the error.
 **The diagnostic loop:**
 1. **Observe** — what exactly is failing? URL? Error message? Which service?
 2. **Hypothesize** — what could cause this? List 2–3 possibilities
 3. **Test** — run the simplest command to prove or disprove your hypothesis
 4. **Narrow** — eliminate possibilities until one remains
 5. **Fix** — apply the fix
 6. **Verify** — confirm the fix worked
 7. **Document** — write what broke and what fixed it
 The most common mistake: jumping to step 5 without completing steps 2–4.
 ---
 ## Diagnostic Commands to Know Cold
 ```bash
 # Container status
 docker ps                          # All running containers
 docker ps -a                       # All containers (including stopped)
 docker inspect <container>         # Full container config and state
 # Logs
 docker logs <container>            # All logs
 docker logs <container> --tail 50  # Last 50 lines
 docker logs <container> -f         # Follow live
 docker logs <container> --since 5m # Last 5 minutes
 # Network
 docker exec <container> curl -s http://other-container:port/health
 docker exec <container> nslookup other-container
 docker exec <container> ss -tlnp
 docker network inspect kitestacks
 # Disk and resources
 docker system df                   # Docker disk usage
 docker stats --no-stream           # One-shot resource usage
 df -h                              # Host disk usage
 free -h                            # Host RAM
 # DNS and HTTP from host
 curl -sv https://grafana.kitestacks.com  # -v = verbose (shows headers, TLS)
 dig grafana.kitestacks.com               # DNS lookup
 ```
 ---
 ## Real Issues Encountered Building KiteStacks
 ### Issue 1 — SSO: `invalid_grant` on OAuth Login (50% of the time)
 **Symptom:** Clicking "Sign in with Authentik" in Grafana, Kavita, etc. sometimes
 worked and sometimes showed `invalid_grant: The provided authorization grant is invalid`.
 Happened roughly 50% of the time. No correlation to time of day.
 **Observation:** The error appeared specifically after the authorization code redirect,
 during the token exchange step.
 **Hypothesis:**
 1. Authentik configuration wrong (but then it would fail 100% of the time)
 2. Network issue (but HTTP 400 means request reached Authentik)
 3. The code created in step 1 is not found in step 2
 **Testing:**
 ```bash
 # Check whether both Authentik instances use the same database.
 # Run on each host, against the Postgres that host's Authentik is configured to use:
 docker exec authentik-postgres psql -U authentik -d authentik -c "SELECT count(*) FROM authentik_providers_oauth2_authorizationcode;"
 # Monk's Authentik: count = 3
 # kscloud1's Authentik: count = 1
 # Different! Step 1 created the code in one DB, step 2 looked in the other.
 ```
 **Root cause:** Two Authentik instances, two separate Postgres databases. Cloudflare
 routes `/authorize` and `/application/o/token/` independently — they can hit different hosts.
 **Fix:** Migrate both Authentik instances to a single shared Postgres, hosted on kscloud1,
 bound to the Tailscale IP only.
 ```bash
 # 1. Dump monk's Authentik DB
 docker exec authentik-postgres pg_dump -U authentik authentik --clean --if-exists \
  > /tmp/authentik_dump.sql
 # 2. Restore to kscloud1's new shared Postgres
 scp /tmp/authentik_dump.sql kenpat@100.123.x.x:/tmp/
 ssh kenpat@100.123.x.x "docker exec -i authentik-postgres psql -U authentik -d authentik \
  < /tmp/authentik_dump.sql"
 # 3. Update monk's Authentik .env to point to kscloud1's Tailscale IP (edit the file so it contains):
 #    AUTHENTIK_POSTGRESQL__HOST=100.123.x.x
 #    AUTHENTIK_REDIS__HOST=100.123.x.x
 # 4. Retire monk's local Postgres and Redis
 docker stop authentik-postgres authentik-redis   # Stop, don't delete (keep data as backup)
 # 5. Restart monk's Authentik
 docker compose up -d
 ```
 **Verification:** Logged in from a browser with DevTools open, watching Network tab.
 `/authorize` returned 302 with a code. `/token` returned 200 with a JWT. Done.
 **Lesson:** Stateful services with active-active routing need shared state. Any session,
 token, or code stored in one instance's database is invisible to the other instance.
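The failure mode can be reproduced with a toy model in which each "database" is just a file. This is a simplified sketch of the idea, not how Authentik stores codes; `issue_code` and `redeem_code` are hypothetical names.

```bash
# Store an authorization code in the given store file.
issue_code() { echo "$2" >> "$1"; }

# Exchange a code for a token: succeeds only if the code is in the store that is read.
redeem_code() {
  if grep -qxF -- "$2" "$1" 2>/dev/null; then
    echo "200 token issued"
  else
    echo "400 invalid_grant"
  fi
}

tmp=$(mktemp -d)

# Before the fix: /authorize lands on monk, /token lands on kscloud1
issue_code  "$tmp/monk.db" code-abc
redeem_code "$tmp/kscloud1.db" code-abc    # 400 invalid_grant

# After the fix: both hosts read and write one shared store
issue_code  "$tmp/shared.db" code-xyz
redeem_code "$tmp/shared.db" code-xyz      # 200 token issued

rm -rf "$tmp"
```

With two hosts and independent routing, roughly half of all logins split across stores, which matches the 50% failure rate observed.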
 ---
 ### Issue 2 — Phantom Third Connector in Cloudflare Dashboard
 **Symptom:** Cloudflare Tunnel showed 3 active connectors when only 2 were expected
 (monk + kscloud1). Which was the third?
 **Investigation:**
 ```bash
 # Check running Docker containers for cloudflared
 docker ps | grep cloudflared
 # Shows: one cloudflared container — expected
 # Check for non-Docker cloudflared processes
 ps aux | grep cloudflared
 # Shows: TWO processes!
 # /usr/bin/cloudflared (system-installed, running as a systemd service)
 # /usr/local/bin/cloudflared (Docker container)
 ```
 **Root cause:** A cloudflared systemd service was installed separately from the Docker
 container. Both connected to the same tunnel with the same token, registering as separate connectors.
 ```bash
 # Verify the systemd service
 sudo systemctl status cloudflared
 # Fix: disable the systemd service
 sudo systemctl stop cloudflared
 sudo systemctl disable cloudflared
 # Verify only one connector process remains
 ps aux | grep cloudflared
 ```
 **Verification:** Cloudflare dashboard refreshed to show 2 connectors within 30 seconds.
 **Lesson:** A service installed via package manager AND in Docker is a recipe for duplicate
 processes. Check both `docker ps` and `ps aux` when troubleshooting unexpected behavior.
 ---
 ### Issue 3 — Karakeep SSO "Redirect URI Error"
 **Symptom:** After configuring Authentik OAuth2 for Karakeep, clicking "Sign in"
 showed "Redirect URI Error: The provided redirect_uri does not match any of the
 allowed redirect URIs" from Authentik.
 **Investigation:**
 ```bash
 # Check what redirect URI was used in the OAuth2 request
 # Read from Authentik's logs
 docker logs authentik --tail 100 | grep "redirect_uri"
 # Shows: redirect_uri=https://links.kitestacks.com/api/auth/callback/custom
 # (the provider in Authentik had been configured with .../callback/authentik)
 ```
 **Root cause:** Karakeep uses NextAuth.js internally with provider ID `custom`.
 NextAuth constructs callback URLs as `/api/auth/callback/<provider-id>`.
 The provider ID is `custom`, not `authentik`.
 So the callback is `/api/auth/callback/custom`, not `/api/auth/callback/authentik`.
 **Fix:**
 ```bash
 # Update Authentik's OAuth2 provider for Karakeep in the shared Postgres
 docker exec -it authentik-postgres psql -U authentik -d authentik
 BEGIN;
 UPDATE authentik_providers_oauth2_oauth2provider
 SET _redirect_uris = '["https://links.kitestacks.com/api/auth/callback/custom"]'
 WHERE name = 'Karakeep';
 COMMIT;
 -- Verify
 SELECT name, _redirect_uris FROM authentik_providers_oauth2_oauth2provider WHERE name = 'Karakeep';
 \q
 ```
 Restart Authentik on both hosts:
 ```bash
 docker compose restart authentik authentik-worker
 # Wait for healthy before testing
 ```
 **Lesson:** When you get a redirect URI mismatch, always check what URI the APP is
 actually sending — not what you think it should send. The app's logs or browser DevTools
 Network tab show the actual request.
 ---
 ### Issue 4 — Kavita OIDC Config Gets Wiped on Restart
 **Symptom:** Configured Kavita's OIDC settings by editing `kavita.db` directly
 (using sqlite3). Settings looked correct in the DB. After `docker compose restart kavita`,
 the OIDC config was reset to empty/disabled.
 **Investigation:**
 ```bash
 # Check the ServerSetting row before and after restart
 docker exec -it kavita sqlite3 /kavita/config/kavita.db \
  "SELECT Value, RowVersion FROM ServerSetting WHERE \"Key\"=40;"
 # Before restart: {"enabled":true,"authority":"...","clientId":"kavita",...}, RowVersion=8
 # After restart: {"enabled":false,"authority":"","clientId":"","clientSecret":"",...}, RowVersion=10
 # RowVersion incremented by 2 — Kavita wrote to the row twice during startup
 ```
 **Root cause:** Kavita validates and resets `ServerSetting` rows during startup from
 its own defaults. Any value that does not pass Kavita's internal validation (including
 OIDC config with the wrong format) gets reset to defaults. Direct SQL writes do not
 go through Kavita's validation pipeline, so they get overwritten.
 **Fix:** Use Kavita's own Settings UI via SSH port forwarding to bypass Cloudflare
 and reach kscloud1's Kavita directly:
 ```bash
 # Forward kscloud1's Kavita port to localhost
 ssh -L 5099:localhost:5000 -i ~/.ssh/id_ed25519_kscloud1 kenpat@100.123.x.x -N &
 # Now visit http://localhost:5099 in browser
 # Log in with your Kavita credentials
 # Settings → OIDC → configure there
 # Click Save → changes survive restart
 ```
 **Verification:** After saving in the UI, restarted Kavita and confirmed that the OIDC settings persisted and `RowVersion` no longer incremented on restart.
 **Lesson:** Do not write directly to application databases unless you know the app does not
 reinitialize those values on startup. Use the application's own APIs or UI.
 **Critical detail:** The Authority URL MUST have a trailing slash:
 `https://auth.kitestacks.com/application/o/kavita/`
 Without it: "issuer does not match" error, because Authentik's `openid-configuration`
 returns an `issuer` field that includes the trailing slash, and Kavita compares them exactly.
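 The trailing-slash requirement can be checked from a shell before touching Kavita. The
 sketch below assumes `curl` and `jq` are installed; `fetch_issuer` and `issuer_matches`
 are illustrative helper names, and the exact-string comparison mirrors the behavior
 described above rather than Kavita's actual code.
 ```bash
 # Fetch the issuer advertised in the OIDC discovery document (needs curl and jq).
 # $1 is the Authority URL, including its trailing slash.
 fetch_issuer() {
   curl -fsS "${1}.well-known/openid-configuration" | jq -r '.issuer'
 }

 # Exact string comparison: succeeds only if both values are identical.
 issuer_matches() {
   [ "$1" = "$2" ]
 }

 # Usage:
 #   AUTHORITY='https://auth.kitestacks.com/application/o/kavita/'
 #   issuer_matches "$AUTHORITY" "$(fetch_issuer "$AUTHORITY")" && echo "issuer matches"
 ```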
 ---
 ### Issue 5 — SSO Login Fails After monk Reconnects
 **Symptom:** When monk went offline and came back, SSO logins failed for 5–10 minutes
 with `invalid_grant`, then started working again.
 **Investigation:**
 Timeline reconstruction:
 - T+0: monk goes offline (power or network)
 - T+0: kscloud1 handles all traffic solo — SSO works fine, codes stored in shared DB
 - T+5min: monk comes back online, cloudflared reconnects
 - T+5min to T+8min: monk's Authentik is still starting (container startup takes ~3–4 min)
 - During this window: Cloudflare routes some `/authorize` to kscloud1, some `/token` to monk
 - Monk's Authentik hasn't finished starting — it responds with errors or invalid state
 **Root cause:** The OAuth2 authorization code has a 1-minute TTL (default). Monk's Authentik
 takes 3–5 minutes to fully start. During startup, Cloudflare is already routing traffic to
 monk's cloudflared (which is running), but monk's Authentik is not ready.
 Codes created on kscloud1 expire before monk's Authentik is healthy enough to serve them.
 **Fix:** Increase the OAuth2 code TTL from 1 minute to 10 minutes:
 ```bash
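 # Open a psql session in the shared Authentik database
 docker exec -it authentik-postgres psql -U authentik -d authentik
 -- At the psql prompt. No WHERE clause, so this applies to every OAuth2 provider.
 UPDATE authentik_providers_oauth2_oauth2provider
 SET access_code_validity = '00:10:00';
 \q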
 ```
 Restart both Authentik instances. Now codes have a 10-minute window — enough for monk
 to finish starting before the code expires.
 **Alternative/additional fix:** Add a health check to monk's cloudflared or Authentik
 that keeps cloudflared from accepting traffic until Authentik is healthy.
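 A minimal sketch of that gating idea, assuming bash and `curl`. `wait_until_healthy` is a
 hypothetical helper; the example probes Authentik's `/-/health/ready/` endpoint, and both
 the port (9000) and the `cloudflared` service name are assumptions that should be checked
 against the actual compose files.
 ```bash
 # Hypothetical helper: retry a probe command until it succeeds or attempts run out.
 # Usage: wait_until_healthy <max_attempts> <delay_seconds> <probe command...>
 wait_until_healthy() {
   local attempts="$1" delay="$2" i
   shift 2
   for ((i = 1; i <= attempts; i++)); do
     if "$@" >/dev/null 2>&1; then
       return 0   # probe succeeded
     fi
     sleep "$delay"
   done
   return 1       # never became healthy
 }

 # Example (assumed port and service name): start cloudflared only once Authentik is ready.
 #   wait_until_healthy 60 5 curl -fsS http://localhost:9000/-/health/ready/ \
 #     && docker compose up -d cloudflared
 ```
 With 60 attempts at 5-second intervals the loop waits up to five minutes, which matches the
 startup time observed above.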
 ---
 ### Issue 6 — kscloud1 SSH Key Auth Broken After Long Absence
 **Symptom:** After not connecting to kscloud1 for several weeks, `ssh kenpat@kscloud1`
 returned "Permission denied (publickey)".
 **Investigation:**
 ```bash
 ssh -v -i ~/.ssh/id_ed25519_kscloud1 kenpat@100.123.x.x
 # Verbose output showed: offered key was not accepted
 # No other errors — key was being offered but rejected
 ```
 **Root cause:** The `authorized_keys` file on kscloud1 had been reset or corrupted. The
 trigger was never confirmed; a VPS maintenance event or snapshot restore is a possible cause.
 **Fix:** Use Hetzner's console (web-based terminal that does not require SSH):
 1. Hetzner dashboard → Server → Console
 2. Log in as root (reset root password via Hetzner UI if needed)
 3. Restore the public key:
 ```bash
 # On kscloud1 via Hetzner console
 mkdir -p /home/kenpat/.ssh
 cat >> /home/kenpat/.ssh/authorized_keys << 'EOF'
 ssh-ed25519 AAAA... your-public-key-here
 EOF
 chmod 700 /home/kenpat/.ssh
 chmod 600 /home/kenpat/.ssh/authorized_keys
 chown -R kenpat:kenpat /home/kenpat/.ssh
 ```
 **Lesson:** Always keep your public key backed up. Cloud providers (Hetzner, AWS, DigitalOcean)
 all have web-based console access for exactly this situation. Never rely only on SSH for
 access to a remote server.
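 If the `.pub` file is lost but the private key survives, the public key can be re-derived
 with `ssh-keygen -y`. The sketch below wraps that call; `public_key_from_private` and the
 backup filename are illustrative, and a passphrase-protected key will prompt for its
 passphrase.
 ```bash
 # Print the public key that corresponds to a private key file.
 # ssh-keygen -y reads the private key and writes the public key to stdout.
 public_key_from_private() {
   ssh-keygen -y -f "$1"
 }

 # Usage (key path from the examples above; backup filename is an assumption):
 #   public_key_from_private ~/.ssh/id_ed25519_kscloud1 > ~/kscloud1-backup.pub
 ```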
 ---
 ### Issue 7 — ufw Blocking Docker Container to Host Port
 **Symptom:** The portal homepage on kscloud1 showed "0%" and "Offline" for the System Status
 widget. On monk it showed real values.
 **Investigation:**
 ```bash
 # Test the metrics API directly from inside the homepage container on kscloud1
 docker exec homepage-backup curl -s http://host.docker.internal:8000/api/metrics
 # No response after timeout
 # Test from host directly
 curl -s http://localhost:8000/api/metrics
 # Returns real metrics immediately
 # Check ufw rules
 sudo ufw status verbose
 # default deny incoming — no specific rule for port 8000
 ```
 **Root cause:** The `kitestacks-metrics-api` container runs with `network_mode: host`, so
 it listens on the host itself. When `homepage-backup` calls `host.docker.internal:8000`,
 the kernel sees a source IP in the Docker bridge range (`172.x.x.x`), and ufw's
 `default deny incoming` blocks it. Docker's iptables bypass (which lets published ports
 work despite ufw) does not apply here because this is container-to-host traffic to a
 host-network listener, not traffic to a container-published port.
 **Fix:**
 ```bash
 sudo ufw allow from 172.16.0.0/12 to any port 8000 proto tcp
 sudo ufw status verbose   # Verify rule added
 ```
 `172.16.0.0/12` spans 172.16.0.0 through 172.31.255.255, which covers Docker's default 172.x bridge subnets.
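 To see why a `/12` spans exactly that range, the sketch below converts addresses to 32-bit
 integers and compares their first 12 bits. It is a worked example in plain bash;
 `ip_to_int` and `in_cidr` are illustrative names, not part of ufw or Docker.
 ```bash
 # Convert a dotted-quad IPv4 address to a 32-bit integer.
 ip_to_int() {
   local a b c d
   IFS=. read -r a b c d <<< "$1"
   echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
 }

 # Return 0 if the address ($1) falls inside the CIDR block ($2).
 in_cidr() {
   local ip net bits mask
   ip=$(ip_to_int "$1")
   net=$(ip_to_int "${2%/*}")
   bits="${2#*/}"
   # Network mask with the top $bits bits set
   mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
   (( (ip & mask) == (net & mask) ))
 }

 in_cidr 172.17.0.2 172.16.0.0/12 && echo "172.17.0.2 is covered"
 in_cidr 172.31.255.255 172.16.0.0/12 && echo "172.31.255.255 is covered"
 in_cidr 172.32.0.1 172.16.0.0/12 || echo "172.32.0.1 is not covered"
 ```
 All three messages print: the mask keeps the first octet and the top four bits of the
 second, so second octets 16 through 31 match and 32 does not.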
 **Verification:**
 ```bash
 docker exec homepage-backup curl -s http://host.docker.internal:8000/api/metrics
 # Now returns: {"cpu_percent": 4.2, "ram_percent": 71.3, ...}
 ```
 ---
 ## General Troubleshooting Cheatsheet
 | Symptom | First Commands to Run |
 |---------|----------------------|
 | Container won't start | `docker logs <container>` |
 | Container starts then crashes | `docker logs <container> --tail 30` |
 | Can't reach service from browser | `docker exec cloudflared curl -s http://<service>:<port>` |
 | SSL/TLS error in browser | `curl -sv https://yourdomain.com` (check Cloudflare is resolving) |
 | SSO failing with invalid_grant | Check that both Authentik instances point to the same shared Postgres |
 | Database error | Check data directory permissions: `ls -la ./data/` |
 | Port already in use | `sudo ss -tlnp \| grep :<port>` |
 | Out of disk space | `df -h` and `docker system df` |
 | Out of RAM | `free -h` and `docker stats --no-stream` |
 | Can't ping between containers | `docker network inspect kitestacks` |
 | Forgejo 502 | `docker logs forgejo` — likely DB connection issue |
 | Authentik won't start | Check it can reach `$KSCLOUD1_TAILSCALE:5432` (Tailscale up?) |
--- a/homelab-mastery/certifications/roadmap.md
+++ b/homelab-mastery/certifications/roadmap.md
@ -132,12 +132,12 @@ Given where you are today:
 | Timeframe | Milestone |
 |-----------|-----------|
-| Next 1–2 months | CompTIA A+ Core 2 ✅ |
+| **July 7, 2026** | **CompTIA A+ Core 2** — exam goal (hard deadline July 12) |
-| Months 3–8 | CCNA |
+| Months 1–6 after A+ | CCNA |
-| Months 9–11 | AWS SAA-C03 |
+| Months 7–9 after A+ | AWS SAA-C03 |
-| Months 12–14 | AWS SysOps Associate |
+| Months 10–12 after A+ | AWS SysOps Associate |
-| Months 15–18 | CKA (or CompTIA Cloud+) |
+| Months 13–16 after A+ | CKA (or CompTIA Cloud+) |
-| Months 18+ | AI/ML certs |
+| Months 16+ after A+ | AI/ML certs |
 ---
--- a/homelab-mastery/concepts/oauth2-oidc.md
+++ b/homelab-mastery/concepts/oauth2-oidc.md
@ -6,16 +6,16 @@ This is the concept that most people get wrong. Understanding it cold will impre
 ## The Problem SSO Solves
-Without SSO: 9 services = 9 separate user databases. To add a friend:
+Without SSO: 11 services = 11 separate user databases. To add a friend:
 - Create account in Forgejo
 - Create account in Grafana
 - Create account in Open WebUI
 - Create account in Kavita
- ... 9 times
+- ... 11 times
-To remove their access: 9 places to deactivate.
+To remove their access: 11 places to deactivate.
-With SSO: 1 account in Authentik. Access to all 9 services. Deactivate once.
+With SSO: 1 account in Authentik. Access to all 11 services. Deactivate once.
 ---
@ -168,4 +168,4 @@ Authentik acts as a reverse proxy in front of the app. The user authenticates wi
 ## What to Say About SSO
-> *"I implemented single sign-on across all nine services using Authentik as the OIDC identity provider. Each service is registered as an OAuth2 client with a unique client ID and redirect URI. The OAuth2 authorization code flow means user credentials only ever go to Authentik — other services receive a signed JWT and never see the password. I hit a distributed systems issue in production where authorization codes were being invalidated by active-active load balancing across two hosts — I diagnosed it by tracing the OAuth2 flow and fixed it by sharing a single Postgres database between both Authentik instances over a private Tailscale network."*
+> *"I implemented single sign-on across all eleven services using Authentik as the OIDC identity provider. Each service is registered as an OAuth2 client with a unique client ID and redirect URI. The OAuth2 authorization code flow means user credentials only ever go to Authentik — other services receive a signed JWT and never see the password. I hit a distributed systems issue in production where authorization codes were being invalidated by active-active load balancing across two hosts — I diagnosed it by tracing the OAuth2 flow and fixed it by sharing a single Postgres database between both Authentik instances over a private Tailscale network."*
--- a/homelab-mastery/interview-prep/explain-the-project.md
+++ b/homelab-mastery/interview-prep/explain-the-project.md
@ -2,19 +2,19 @@
 ## The 30-Second Version (LinkedIn DM, recruiter screen)
-> *"I built a self-hosted homelab running a public website at kitestacks.com with nine services — including a Git platform, AI assistant, eBook library, monitoring stack, and SSO. It runs on my home PC with a Hetzner cloud VPS as a live failover, connected through Cloudflare Tunnel so no ports are exposed on my home network. Everything is containerized with Docker and documented in a private Forgejo repo."*
+> *"I built a self-hosted homelab running a public website at kitestacks.com with eleven services — including a Git platform, AI assistant, eBook library, bookmark manager, wiki, help desk, monitoring stack, and SSO. It runs on my home PC with a Hetzner cloud VPS as a live failover, connected through Cloudflare Tunnel so no ports are exposed on my home network. Everything is containerized with Docker and documented in a private Forgejo repo."*
 ---
 ## The 2-Minute Version (phone screen, LinkedIn intro)
-> *"I built KiteStacks — a multi-host self-hosted platform running at kitestacks.com. The core is nine services containerized with Docker: a Forgejo Git instance, Grafana monitoring, Authentik for single sign-on, Open WebUI for AI access, Kavita for reading, Karakeep for bookmarks, OpenProject for tasks, Uptime Kuma for monitoring, and a custom portal I built myself.*
+> *"I built KiteStacks — a multi-host self-hosted platform running at kitestacks.com. The core is eleven services containerized with Docker: a custom portal, Forgejo Git instance, Authentik for single sign-on, Open WebUI for AI access, Karakeep for bookmarks, Kavita for reading, Grafana with Prometheus for monitoring, Uptime Kuma for uptime checks, BookStack for documentation, OSTicket for help desk, and Portainer for container management.*
 >
 > *It runs on my home machine with a Hetzner VPS as a permanent cloud replica — active-active load balanced through Cloudflare Tunnel so the site stays up even when I'm traveling and my home network is down.*
 >
 > *The hardest part was a production SSO bug where OAuth2 authorization codes were being invalidated by the active-active routing — I traced the OAuth2 flow, identified it as a split-database problem, and solved it by migrating both hosts to a shared Postgres instance accessible only over a private Tailscale network.*
 >
-> *I'm currently studying for the CCNA to formalize the networking knowledge this project required."*
+> *I'm currently studying for CompTIA A+ Core 2 (exam goal July 2026), then CCNA to formalize the networking knowledge this project required."*
 ---
@ -52,7 +52,7 @@ Be ready to go deep on any of these topics. Know the answers cold.
 **"How does the monitoring work?"**
-> *"Prometheus scrapes metrics from two node-exporter instances every 15 seconds — one on the home machine via Docker DNS and one on the Hetzner VPS via its public IP. Grafana visualizes both with the Node Exporter Full dashboard, and you can switch between hosts with an instance picker. Uptime Kuma runs external HTTP checks against all nine public subdomains and would alert me if any went down."*
+> *"Prometheus scrapes metrics from two node-exporter instances every 15 seconds — one on the home machine via Docker DNS and one on the Hetzner VPS via its public IP. Grafana visualizes both with the Node Exporter Full dashboard, and you can switch between hosts with an instance picker. Uptime Kuma runs external HTTP checks against all eleven public subdomains and alerts me if any go down."*
 ---
--- a/homelab-mastery/learning-path/README.md
+++ b/homelab-mastery/learning-path/README.md
@ -2,13 +2,13 @@
 ## Your Advantage
-You don't have a blank canvas. You have a live production system you built. Most people study networking in a textbook. You configured Cloudflare DNS, set up Tailscale, debugged a Docker networking ufw issue, and traced a distributed systems bug in OAuth2. That's hands-on experience that study alone can't replicate.
+You don't have a blank canvas. You have a live production system you built — eleven services running across two hosts with SSO, active-active failover, and shared databases. Most people study networking in a textbook. You configured Cloudflare DNS, set up Tailscale, debugged a Docker networking ufw issue, and traced a distributed systems bug in OAuth2. That's hands-on experience that study alone can't replicate.
 The goal now: attach the vocabulary, depth, and theory to things you've already done.
 ---
-## Phase 1 — Complete A+ Core 2 (Now)
+## Phase 1 — Complete A+ Core 2 (Exam goal: July 7, 2026)
 **Focus areas that directly map to your homelab:**
@ -66,16 +66,18 @@ The CCNA will make everything in your homelab make deeper sense. After CCNA, re-
 |-----|------------------------|
 | EC2 | Hetzner VPS (kscloud1) |
 | S3 | Static file storage |
-| VPC | Docker bridge network |
+| VPC | Docker bridge network (kitestacks) |
 | ALB + CloudFront | Cloudflare Tunnel + edge |
-| RDS | Authentik Postgres |
+| RDS | Shared Postgres on kscloud1 (Authentik + Forgejo) |
-| ElastiCache | Authentik Redis |
+| ElastiCache | Shared Redis on kscloud1 |
 | CloudWatch | Prometheus + Grafana |
 | Route 53 | Cloudflare DNS |
-| IAM | Authentik RBAC / groups |
+| IAM | Authentik RBAC / groups (homelab-admin) |
 | Secrets Manager | .env files (what you'd replace) |
 | ECS / Fargate | Docker Compose (what you use) |
 | VPC Peering | Tailscale overlay |
 | Confluence/SharePoint | BookStack |
 | ServiceNow | OSTicket |
 ---