# KiteStacks Homelab — Complete Setup Runbook
**Last Updated:** 2026-06-18
**Status:** Production (monk primary, kscloud1 Hetzner cloud replica)
**Maintainer:** kenpat

---
## Architecture Overview
```
Internet
└── Cloudflare (DNS + Tunnel)
    │   Active-Active across 2 connectors
    ├── cloudflared on monk (primary home machine, Docker container)
    └── cloudflared on kscloud1 (Hetzner VPS, <KSCLOUD1_PUBLIC_IP>)

Tailscale overlay network (VPN mesh):
    monk      <MONK_TAILSCALE_IP>
    kscloud1  <KSCLOUD1_TAILSCALE_IP>  ← hosts shared Authentik Postgres + Redis
```
All **public subdomains** route through a single Cloudflare Tunnel that uses one shared token.
monk and kscloud1 both run a connector for it, so the sites stay up if either host goes offline.

| Subdomain | Service | Port |
|-----------|---------|------|
| auth.kitestacks.com | Authentik | 9000 |
| portainer.kitestacks.com | Portainer | 9443 |
| wiki.kitestacks.com | BookStack | 6875 (monk) / 6877 (kscloud1) |
| grafana.kitestacks.com | Grafana | 3000 |
| gitforge.kitestacks.com | Forgejo | 3006 |
| links.kitestacks.com | Karakeep | 3100 |
| status.kitestacks.com | Uptime Kuma | 3001 |
| tasks.kitestacks.com | OSTicket | 8080 |
| flux.kitestacks.com | FluxCD | — |
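
To confirm that every hostname in the table answers through the tunnel, a small probe loop can help. This is a minimal sketch rather than part of the existing tooling: `classify_status` and `probe_hosts` are hypothetical names, and counting any 2xx or 3xx response as "up" is an assumption (apps behind Authentik usually answer with a redirect to the login page).

```bash
# Hypothetical helper: map an HTTP status code to a coarse state.
classify_status() {
  case "$1" in
    2??|3??) echo "up" ;;          # success or redirect (e.g. to the Authentik login)
    000)     echo "unreachable" ;; # curl prints 000 when no HTTP response arrived
    *)       echo "error" ;;       # 4xx/5xx responses
  esac
}

# Probe each public hostname listed in the table above.
probe_hosts() {
  for sub in auth portainer wiki grafana gitforge links status tasks flux; do
    code=$(curl -s -o /dev/null --max-time 10 -w '%{http_code}' "https://${sub}.kitestacks.com/")
    printf '%-10s %s %s\n' "$sub" "$code" "$(classify_status "$code")"
  done
}

# Usage: probe_hosts
```
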
---
## Service Inventory
### Running on monk
authentik, authentik-worker, authentik-ldap, authentik-ldap-proxy,
bookstack, bookstack-db, cloudflared, flux, forgejo, grafana,
karakeep, karakeep-chrome, karakeep-meilisearch, kavita,
kite-litellm, kite-openwebui, kitestacks-metrics-api, kitestacks-portal,
node-exporter, ntfy, osticket, osticket-app, osticket-db,
portainer, prometheus, uptime-kuma, blackbox-exporter
### Running on kscloud1 (extras)
bookstack, bookstack-db-ks, kite-monitor, osticket-app-118,
osticket-db-118, www-backup, homepage-backup, cloudflared,
authentik-postgresql, authentik-redis
### Shared infrastructure on kscloud1
- PostgreSQL `:5432` — Authentik DB used by both hosts (Tailscale only)
- Redis `:6379` — Authentik session cache (Tailscale only)
---
## Cloudflare Tunnel
### How it works
Both monk and kscloud1 run `cloudflared` as Docker containers using the **same tunnel token**. Cloudflare load-balances across both connectors (active-active). The tunnel token is stored in:
- monk: `~/kitestacks-live/docker/cloudflared/.env` → `TUNNEL_TOKEN`
- kscloud1: `/opt/kitestacks/docker/cloudflared/.env` → `TUNNEL_TOKEN`
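
Since both connectors depend on an identical token, it can be useful to compare the two `.env` files without ever printing the secret. The sketch below is one way to do that; `token_fingerprint` is a hypothetical helper, and it assumes each file holds a plain `TUNNEL_TOKEN=value` line with no quoting.

```bash
# Hypothetical helper: print a SHA-256 fingerprint of the TUNNEL_TOKEN value in FILE.
# The token itself is never printed, only its hash.
token_fingerprint() {
  sed -n 's/^TUNNEL_TOKEN=//p' "$1" | sha256sum | cut -d' ' -f1
}

# Usage: run on each host and compare the two fingerprints.
#   token_fingerprint ~/kitestacks-live/docker/cloudflared/.env    # on monk
#   token_fingerprint /opt/kitestacks/docker/cloudflared/.env      # on kscloud1
```

If the fingerprints differ, one host is running with a stale or mistyped token.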
### Fix: Phantom 3rd Replica
If `cloudflared tunnel info` shows 3 connectors instead of 2, the native cloudflared systemd service on monk is running alongside the Docker container.
```bash
# Check systemd cloudflared on monk
systemctl status cloudflared
# Disable it — Docker container is the correct one
sudo systemctl disable --now cloudflared
```
### Adding a new hostname route
In Cloudflare Zero Trust → Networks → Tunnels → your tunnel → Edit → Public Hostname:
- Subdomain: `newservice`
- Domain: `kitestacks.com`
- Service: `http://container-name:port`

Both connectors share this one route, so the container must be reachable under the same name and port on monk and on kscloud1.

---
## Authentik SSO
### Architecture
Authentik uses a **shared database** hosted on kscloud1. monk's Authentik containers connect via Tailscale.
- monk containers: `authentik`, `authentik-worker`, `authentik-ldap`, `authentik-ldap-proxy`
- DB: PostgreSQL on kscloud1 at `<KSCLOUD1_TAILSCALE_IP>:5432`
- Redis: kscloud1 at `<KSCLOUD1_TAILSCALE_IP>:6379`
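
Because monk's Authentik containers depend on these two ports over Tailscale, a quick reachability check is a sensible first step when logins fail. The following is a minimal sketch: `port_open` is a hypothetical helper, it needs `bash` and coreutils `timeout` on the host, and the 3-second limit is an arbitrary choice.

```bash
# Hypothetical helper: succeed if a TCP connection to HOST:PORT opens within 3 seconds.
# Uses bash's /dev/tcp pseudo-device, so no extra tools such as nc are required.
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Usage on monk (substitute the real Tailscale IP for the placeholder):
#   port_open "<KSCLOUD1_TAILSCALE_IP>" 5432 && echo "postgres reachable"
#   port_open "<KSCLOUD1_TAILSCALE_IP>" 6379 && echo "redis reachable"
```

A failure here points at Tailscale or the kscloud1 containers rather than at Authentik itself.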
### Adding OIDC SSO for a new app
1. In Authentik admin (`https://auth.kitestacks.com/if/admin/`):
   - **Providers** → Create → OAuth2/OpenID Provider
   - Name the provider after the app (e.g. `bookstack`)
   - Set `issuer_mode` based on the app's requirements (see Debug doc)
   - Note the Client ID and Client Secret
2. **Application** → Create → link to the provider
3. **Policy Binding** → bind the `default-authentication-flow` to the application
4. Configure the app with:
   - `OIDC_ISSUER` = discovery base URL
   - `OIDC_CLIENT_ID` / `OIDC_CLIENT_SECRET`
   - Callback URL = `https://yourapp.kitestacks.com/auth/callback`
### Checking OIDC discovery URL
```bash
# Per-provider (issuer_mode=per_provider)
curl -s https://auth.kitestacks.com/application/o/<slug>/.well-known/openid-configuration | python3 -m json.tool
# Global (issuer_mode=global)
# Note: global issuer URL does NOT serve a JSON discovery doc at /.well-known/
# Use per-provider mode for apps that auto-discover endpoints (BookStack, etc.)
```
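
To go one step beyond eyeballing the JSON, the advertised `issuer` can be compared with the per-provider URL the app expects. This is a rough sketch under stated assumptions: both function names are hypothetical, and the `sed` expression is a shortcut rather than a real JSON parser (it assumes the issuer value is a plain URL with no escaped characters).

```bash
# Hypothetical helper: read an OpenID discovery document on stdin, print its "issuer".
issuer_from_discovery() {
  sed -n 's/.*"issuer"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Hypothetical helper: check that provider <slug> advertises the per-provider issuer.
check_issuer() {
  expected="https://auth.kitestacks.com/application/o/$1/"
  actual=$(curl -s "${expected}.well-known/openid-configuration" | issuer_from_discovery)
  if [ "$actual" = "$expected" ]; then
    echo "ok: $actual"
  else
    echo "mismatch: got '${actual}', expected '${expected}'"
    return 1
  fi
}

# Usage: check_issuer bookstack
```
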
### Changing provider issuer_mode via SQL
```bash
docker run --rm --network host \
-e PGPASSWORD="<REDACTED>" \
postgres:16 psql -h <KSCLOUD1_TAILSCALE_IP> -U authentik authentik -c \
"UPDATE authentik_providers_oauth2_oauth2provider SET issuer_mode='per_provider' WHERE provider_ptr_id=<ID>;"
```
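
Since this statement edits the Authentik database directly, it may be worth validating the inputs before they reach `psql`. The sketch below builds the same `UPDATE` but rejects anything except a numeric ID and the two modes named in this runbook; `issuer_mode_sql` is a hypothetical helper, not an Authentik tool.

```bash
# Hypothetical helper: print the UPDATE statement for provider ID $1 and mode $2.
# Returns non-zero (and prints nothing on stdout) when either input looks wrong.
issuer_mode_sql() {
  case "$1" in
    ''|*[!0-9]*) echo "provider id must be numeric" >&2; return 1 ;;
  esac
  case "$2" in
    global|per_provider) ;;
    *) echo "issuer_mode must be 'global' or 'per_provider'" >&2; return 1 ;;
  esac
  echo "UPDATE authentik_providers_oauth2_oauth2provider SET issuer_mode='$2' WHERE provider_ptr_id=$1;"
}

# Usage: pass the result to the psql command above in place of the literal string.
#   SQL=$(issuer_mode_sql 7 per_provider) && echo "$SQL"
```
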
---
## Portainer
### OAuth setup (Authentik)
Portainer CE uses `AuthenticationMethod=3` (OAuth), which is configured in its BoltDB database.
Key settings:
- `OAuthLoginURI`: `https://auth.kitestacks.com/application/o/authorize/`
- `OAuthTokenURI`: `https://auth.kitestacks.com/application/o/token/`
- `OAuthUserURI`: `https://auth.kitestacks.com/application/o/userinfo/`
- `OAuthClientID`: `portainer`
- `OAuthRedirectURI`: `https://portainer.kitestacks.com`
- `OAuthAutoCreateUsers`: `true`
- `OAuthDefaultTeamID`: `0`
### Pre-creating an admin user before first OAuth login
OAuth auto-created users default to Role:2 (regular user) and can't see environments.
Pre-create them as Role:1 (admin) via the API before they log in:
```bash
# Get auth token
TOKEN=$(curl -sk -X POST https://portainer.kitestacks.com/api/auth \
-H "Content-Type: application/json" \
-d '{"username":"admin","password":"<REDACTED>"}' | python3 -c "import sys,json; print(json.load(sys.stdin)['jwt'])")
# Create user as admin (Role:1), no password needed for OAuth users
curl -sk -X POST "https://portainer.kitestacks.com/api/users" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"username":"user@example.com","role":1}'
```
### Reset admin password (if locked out)
```bash
# Stop Portainer
docker stop portainer
# Reset password (shows new temp password)
docker run --rm -v portainer_data:/data portainer/helper-reset-password
# Restart
docker start portainer
```
---
## BookStack
### Setup (both monk and kscloud1)
Location:
- monk: `~/kitestacks-live/docker/bookstack/docker-compose.yml`
- kscloud1: `/opt/kitestacks/docker/bookstack/docker-compose.yml`
Key environment variables:
```yaml
- APP_URL=https://wiki.kitestacks.com
- DB_HOST=bookstack-db
- AUTH_METHOD=oidc
- OIDC_ISSUER=https://auth.kitestacks.com/application/o/bookstack/
- OIDC_ISSUER_DISCOVER=true
- OIDC_CLIENT_ID=bookstack
- OIDC_CLIENT_SECRET=<REDACTED>
- OIDC_USER_ATTRIBUTE=email
- APP_KEY=<REDACTED>
```
### Generate APP_KEY
```bash
docker run --rm --entrypoint /bin/bash lscr.io/linuxserver/bookstack:latest appkey
```
### OIDC Configuration
BookStack uses `OIDC_ISSUER_DISCOVER=true` to auto-discover all endpoints from Authentik.
The `OIDC_ISSUER` must match the per-app discovery URL base (not the global Authentik URL).
The Authentik bookstack provider must have `issuer_mode='per_provider'` so its discovery
document returns the correct per-app issuer URL. See Debug doc for full troubleshooting.
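
A quick way to catch the most common mistake (pointing `OIDC_ISSUER` at the global Authentik URL, or dropping the trailing slash) is to read the value back from the compose file and test its shape. The sketch below assumes the `- KEY=value` environment style shown earlier; both function names are hypothetical.

```bash
# Hypothetical helper: print the OIDC_ISSUER value from a compose file.
compose_issuer() {
  sed -n 's/^[[:space:]]*-[[:space:]]*OIDC_ISSUER=//p' "$1"
}

# Hypothetical helper: succeed only for the per-app form .../application/o/<slug>/
is_per_app_issuer() {
  case "$1" in
    https://auth.kitestacks.com/application/o/?*/) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage on monk:
#   is_per_app_issuer "$(compose_issuer ~/kitestacks-live/docker/bookstack/docker-compose.yml)" \
#     && echo "issuer looks per-app"
```

This only checks the BookStack side; the Authentik provider still needs `issuer_mode='per_provider'` for the two to agree.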
### Fix cache permissions after artisan runs
Running `php artisan` as root creates root-owned cache dirs that block the app:
```bash
docker exec bookstack chown -R abc:users /config/www/framework/cache/
```
### Clear Laravel config/cache
```bash
docker exec bookstack php /app/www/artisan config:clear
docker exec bookstack php /app/www/artisan cache:clear
```
---
## kscloud1 Access
### SSH
```bash
ssh -i ~/.ssh/id_ed25519_kscloud1 root@<KSCLOUD1_TAILSCALE_IP>
```
### If SSH key is lost / not working
1. Open Hetzner Cloud console: `console.hetzner.cloud` → your server → Console tab
2. Log in as `root` (Linux user password)
3. Serve the key from monk over Tailscale:
```bash
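# On monk: start a temporary HTTP server that exposes only the public key
# (serving ~/ directly would also expose ~/.ssh and its private keys)
mkdir -p ~/keyshare && cp ~/.ssh/id_ed25519_kscloud1.pub ~/keyshare/key.txt
python3 -m http.server 7777 --bind <MONK_TAILSCALE_IP> --directory ~/keyshare
# Press Ctrl+C to stop the server once kscloud1 has fetched the key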
```
4. In Hetzner console, type:
```bash
curl http://<MONK_TAILSCALE_IP>:7777/key.txt > /root/.ssh/authorized_keys
```
5. Enable root SSH (if needed):
```bash
sed -i 's/^#*PermitRootLogin.*/PermitRootLogin prohibit-password/' /etc/ssh/sshd_config
systemctl restart ssh
```
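
The `sed` command in step 5 silently does nothing when the file contains no `PermitRootLogin` line at all. A slightly more defensive variant is sketched below; `set_permit_root_login` is a hypothetical helper, and it assumes GNU `sed` (for `-i`), which is what a typical Linux VPS ships.

```bash
# Hypothetical helper: rewrite the PermitRootLogin line in FILE, or fail loudly
# when there is no such line (commented or active) to rewrite.
set_permit_root_login() {
  if ! grep -Eq '^#*PermitRootLogin' "$1"; then
    echo "no PermitRootLogin line in $1; add one manually" >&2
    return 1
  fi
  sed -i 's/^#*PermitRootLogin.*/PermitRootLogin prohibit-password/' "$1"
}

# Usage on kscloud1: `sshd -t` validates the config before the restart,
# so a syntax error cannot take SSH down.
#   set_permit_root_login /etc/ssh/sshd_config && sshd -t && systemctl restart ssh
```
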
---
## OSTicket SMTP
**Config:** smtp.gmail.com:587, STARTTLS
**From:** `kitestacks.helpdesk@gmail.com` (app password stored in DB)
To test email delivery: Admin Panel → Diagnostics → Send Test Email

---
## Forgejo
Forgejo runs on monk at `localhost:3006`; Git over SSH uses port 2222.
### Generate API token for automation
```bash
docker exec -u git forgejo forgejo admin user generate-access-token \
--username kenpat --token-name "my-token" --raw \
--scopes "read:user,write:user,read:repository,write:repository"
```
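
Once a token exists, it can be used against Forgejo's REST API under `/api/v1`. The wrapper below is a minimal sketch: `forgejo_api` and the `FORGEJO_TOKEN` variable are hypothetical names, and plain HTTP on `localhost:3006` is assumed from the address given above.

```bash
# Hypothetical helper: call the Forgejo API on monk using the token in $FORGEJO_TOKEN.
forgejo_api() {
  curl -s -H "Authorization: token ${FORGEJO_TOKEN}" "http://localhost:3006/api/v1/$1"
}

# Usage: export the token printed by the command above, then query the API.
#   export FORGEJO_TOKEN=<token>
#   forgejo_api user          # account that owns the token (needs read:user)
#   forgejo_api user/repos    # repositories of that account (needs read:repository)
```
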
---
## Common Docker Operations
```bash
# View logs for a service
docker logs <container> --tail 50 -f
# Restart a service
cd ~/kitestacks-live/docker/<service> && docker compose restart
# Full stack restart (run from the service's compose directory)
docker compose down && docker compose up -d
# Update a container image (same directory)
docker compose pull && docker compose up -d
# Check all running containers
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"
```
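
Several of the commands above only work from inside a service's compose directory. A small wrapper can remove that step; the following is a sketch, where `compose_in` is a hypothetical helper and `KITESTACKS_DOCKER_DIR` is an assumed override variable (for example `/opt/kitestacks/docker` on kscloud1) that defaults to monk's path.

```bash
# Hypothetical helper: run a docker compose subcommand inside one service's directory.
#   $1 = service directory name, remaining arguments = compose subcommand
compose_in() {
  dir="${KITESTACKS_DOCKER_DIR:-$HOME/kitestacks-live/docker}/$1"
  if [ ! -d "$dir" ]; then
    echo "no such service directory: $dir" >&2
    return 1
  fi
  shift
  (cd "$dir" && docker compose "$@")
}

# Usage:
#   compose_in bookstack restart
#   compose_in grafana pull && compose_in grafana up -d
```
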