kitestacks-homelab/homelab-mastery/build-guide/with-ai/07-cloud-failover.md
kenpat 1e8319ee75 docs: comprehensive homelab-mastery rewrite with full build guides
Complete documentation suite for KiteStacks covering all 11 services across
2-host active-active architecture. Includes beginner track (with AI, 8 files)
and advanced track (without AI, 7 files) with time estimates, real troubleshooting
cases, and command-by-command explanations. Updates certifications roadmap to
reflect July 7 2026 A+ Core 2 exam goal.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-19 01:08:43 -05:00

# Step 7 — Cloud Failover (kscloud1)
**Track:** With AI (Beginner)
**Time for this step:** 4 to 6 hours
Right now, if your home computer goes off, your entire website goes offline. This step
fixes that. You will turn your cloud VPS (kscloud1) into a full mirror of your homelab,
so that when your home computer is off, kscloud1 keeps everything running.
---
## What You Are Building
```
Home (monk) ←—— always developing ——→ pushes to ——→ Cloud (kscloud1)
                                                    always live
                                                    never goes down
Cloudflare routes traffic to whichever host responds.
If monk is off, kscloud1 handles everything by itself.
```
---
## Step 7A — Set Up Tailscale on Both Machines
Tailscale creates a private, encrypted connection between your home computer and your VPS.
You need this so both machines can share a database securely.
**On your home computer:**
```bash
curl -fsSL https://tailscale.com/install.sh | sh
sudo tailscale up
```
Follow the link it gives you to authenticate in your browser.
**On your VPS (via SSH):**
```bash
curl -fsSL https://tailscale.com/install.sh | sh
sudo tailscale up
```
Authenticate again.
After both are connected, check their Tailscale IPs:
```bash
tailscale ip -4
```
Write down both IPs — they look like `100.x.x.x`. You will use these in the next steps.
**Ask your AI:** "I have Tailscale installed on two machines. How do I verify they can
reach each other using their Tailscale IPs?"
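One way to run that check by hand is sketched below. It assumes the `tailscale` CLI from the install step is on your PATH. `is_tailscale_ip` and `check_peer` are hypothetical helper names, not part of Tailscale. The first helper only confirms the address is in the `100.64.0.0/10` range that Tailscale assigns from, and the second runs `tailscale ping` against it.

```bash
# Succeeds if $1 is an IPv4 address in Tailscale's 100.64.0.0/10 range.
is_tailscale_ip() {
  # Reject empty input, non-digit characters, and stray or doubled dots.
  case "$1" in
    ''|*[!0-9.]*|.*|*.|*..*) return 1 ;;
  esac
  # Split on dots inside a subshell so the caller's IFS is left alone.
  set -- $(IFS=.; echo $1)
  [ "$#" -eq 4 ] || return 1
  for octet in "$@"; do
    [ "$octet" -le 255 ] || return 1
  done
  # 100.64.0.0/10 covers 100.64.x.x through 100.127.x.x.
  [ "$1" -eq 100 ] && [ "$2" -ge 64 ] && [ "$2" -le 127 ]
}

# Pings the other machine through Tailscale after a quick sanity check.
check_peer() {
  if ! is_tailscale_ip "$1"; then
    echo "$1 does not look like a Tailscale IPv4 address" >&2
    return 1
  fi
  tailscale ping "$1"
}
```

Run `check_peer` on monk with kscloud1's IP, then on kscloud1 with monk's IP. A reply in both directions means the two machines can reach each other.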
---
## Step 7B — Move the Shared Databases to kscloud1
For SSO to work properly across both machines, both Authentik instances must share
one database. With separate databases, a login created on one host does not exist on
the other, so logins will fail roughly half the time.
This means:
- Move (or start fresh) Postgres and Redis on kscloud1
- Configure both monk and kscloud1's Authentik to point to kscloud1's database over Tailscale
**On kscloud1**, create the database containers. Use the same passwords you used on monk:
```bash
mkdir -p /opt/kitestacks/docker/authentik
cd /opt/kitestacks/docker/authentik
```
Create `docker-compose.yml` with Postgres and Redis bound to the Tailscale IP:
```yaml
services:
  authentik-postgres:
    image: postgres:16-alpine
    container_name: authentik-postgres
    restart: unless-stopped
    environment:
      POSTGRES_PASSWORD: your-db-password
      POSTGRES_USER: authentik
      POSTGRES_DB: authentik
    ports:
      - "100.123.x.x:5432:5432" # bind to Tailscale IP only
    volumes:
      - ./postgres:/var/lib/postgresql/data
    networks:
      - kitestacks

  authentik-redis:
    image: redis:alpine
    container_name: authentik-redis
    restart: unless-stopped
    ports:
      - "100.123.x.x:6379:6379" # bind to Tailscale IP only
    networks:
      - kitestacks

networks:
  kitestacks:
    external: true
```
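Replace `100.123.x.x` with kscloud1's actual Tailscale IP. The compose file declares
`kitestacks` as an external network, so that network must already exist on kscloud1.
If it does not, create it with `docker network create kitestacks` before starting the
containers.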
```bash
docker compose up -d
```
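Before pointing anything at the new containers, it can help to confirm that both are accepting connections. The sketch below assumes the container names, user, and database from the compose file above. `check_shared_db` is a hypothetical helper name. It runs the checks inside the containers, so it confirms the services are up, not that they are reachable over Tailscale.

```bash
# Succeeds only if Postgres accepts connections and Redis answers PONG.
check_shared_db() {
  # pg_isready ships with the postgres image and exits 0 when the server is ready.
  docker exec authentik-postgres pg_isready -U authentik -d authentik || return 1
  # redis-cli ping prints PONG when the server is healthy.
  [ "$(docker exec authentik-redis redis-cli ping)" = "PONG" ]
}
```

Run `check_shared_db && echo "shared database is ready"` on kscloud1 after `docker compose up -d` finishes.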
**On monk**, update Authentik's environment to point to kscloud1's database:
```
# kscloud1's Tailscale IP (same address for both settings)
AUTHENTIK_POSTGRESQL__HOST=100.123.x.x
AUTHENTIK_REDIS__HOST=100.123.x.x
```
Restart Authentik on monk:
```bash
cd ~/kitestacks-live/docker/authentik
docker compose down
docker compose up -d
```
**Ask your AI:** "I need to migrate my Authentik database from one host to another.
How do I dump the data from my current Postgres and restore it on the new host?"
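A minimal sketch of that dump and restore follows. It assumes monk's Postgres container is also named `authentik-postgres` and uses the `authentik` user and database from the compose file above; adjust the names if yours differ. `dump_authentik_db` and `restore_authentik_db` are hypothetical helper names. Stop Authentik on monk before dumping so nothing writes to the database mid-copy, and restore into the fresh, empty database on kscloud1 before restarting Authentik against the new host.

```bash
# On monk: write a plain SQL dump of the authentik database to the file in $1.
dump_authentik_db() {
  docker exec authentik-postgres pg_dump -U authentik -d authentik > "$1"
}

# On kscloud1: load the SQL dump in $1 into the new, empty authentik database.
restore_authentik_db() {
  [ -f "$1" ] || { echo "dump file $1 not found" >&2; return 1; }
  docker exec -i authentik-postgres psql -U authentik -d authentik < "$1"
}
```

Call `dump_authentik_db authentik.sql` on monk, copy the file to kscloud1 (for example with `scp` over the Tailscale IP), then call `restore_authentik_db authentik.sql` there.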
---
## Step 7C — Deploy All Services on kscloud1
Now deploy the same services on kscloud1. SSH into your VPS and create the same
folder structure and docker-compose files that you have on monk.
```bash
mkdir -p /opt/kitestacks/docker
```
For each service (forgejo, homepage, karakeep, kavita, grafana, etc.):
1. Create the folder: `mkdir -p /opt/kitestacks/docker/<service>`
2. Copy your docker-compose.yml from monk (with any path changes for `/opt/kitestacks/`)
3. Copy your .env files
4. Run `docker compose up -d`
The fastest way is to have your AI help you:
> "I have all my services running on my home computer at ~/kitestacks-live/docker/.
> I want to replicate them on my VPS at /opt/kitestacks/docker/. Can you help me
> go through each service and identify what needs to change for the VPS environment?"
**Important differences on kscloud1:**
- Authentik already points to the shared Postgres/Redis (same as monk now)
- Forgejo should also use the shared Postgres (add a `forgejo` database to it)
- Paths use `/opt/kitestacks/` instead of `~/kitestacks-live/`
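The path change in the last bullet can be sketched with `sed`. This assumes your compose files contain absolute paths under your home directory; files that only use relative paths such as `./data` need no change. `rewrite_compose_paths` is a hypothetical helper name, and the base path you pass in is treated as a `sed` pattern, so it should not contain `|` characters.

```bash
# Print a copy of a compose file with monk's base path swapped for kscloud1's.
# $1: compose file on monk
# $2: monk's base path, e.g. /home/you/kitestacks-live
rewrite_compose_paths() {
  sed "s|$2|/opt/kitestacks|g" "$1"
}
```

For example, `rewrite_compose_paths ~/kitestacks-live/docker/kavita/docker-compose.yml "$HOME/kitestacks-live"` prints the adjusted file so you can review it before copying it to kscloud1. The original file on monk is left untouched.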
---
## Step 7D — Verify Failover Works
With both machines running and both cloudflared connectors active, test that failover works:
1. In your Cloudflare Tunnel dashboard, you should see **2 connectors**
2. Visit your website from your phone (not connected to home WiFi)
3. Everything should work
4. Now stop monk's cloudflared: `cd ~/kitestacks-live/docker/cloudflared && docker compose stop`
5. Visit your website again from your phone
6. Everything should still work (kscloud1 is serving it)
7. Restart monk's cloudflared from the same directory: `cd ~/kitestacks-live/docker/cloudflared && docker compose start`
If step 6 works, your cloud failover is complete.
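If you would rather watch the site from a terminal while you stop and start cloudflared, a small `curl` loop is one option. This is a sketch: `site_is_up` and `watch_site` are hypothetical helper names, and the URL you pass in is your own domain. Any 2xx or 3xx response counts as up, since a login redirect still means a host answered.

```bash
# Succeeds if the URL in $1 answers with a 2xx or 3xx status within 10 seconds.
site_is_up() {
  status=$(curl -s -o /dev/null -m 10 -w '%{http_code}' "$1") || return 1
  case "$status" in
    2??|3??) return 0 ;;
    *) return 1 ;;
  esac
}

# Check the URL in $1 every 5 seconds, $2 times, printing one line per check.
watch_site() {
  i=0
  while [ "$i" -lt "$2" ]; do
    if site_is_up "$1"; then
      echo "$(date +%T) up"
    else
      echo "$(date +%T) DOWN"
    fi
    i=$((i + 1))
    sleep 5
  done
}
```

Start `watch_site https://your-domain.example 60` in one terminal, then stop monk's cloudflared in another. The output should keep reporting `up` while kscloud1 serves the traffic.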
---
## Step 7E — Set Up Uptime Kuma on kscloud1
Your Conky desktop widget reads Uptime Kuma from kscloud1 (not monk), so deploy
uptime-kuma on kscloud1 the same way you did on monk. Then carry your monitors over
by copying the database from monk to kscloud1.
**Ask your AI:** "How do I copy a SQLite database from one Docker container to another
on a different machine, safely and without data corruption?"
The trick is using the `backup()` method of Python's `sqlite3.Connection` object. It
creates a consistent copy even while the database is in use.
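A minimal sketch of that backup, wrapped in a shell function, is shown here. It assumes `python3` is installed on monk and that the Uptime Kuma data directory is bind-mounted on the host so the database file is reachable from outside the container. `backup_sqlite` is a hypothetical helper name, and the file paths are placeholders for your own.

```bash
# Copy the SQLite database at $1 to $2 using sqlite3's online backup API.
backup_sqlite() {
  # Refuse to run on a missing source, since connect() would create an empty file.
  [ -f "$1" ] || { echo "source database $1 not found" >&2; return 1; }
  python3 - "$1" "$2" <<'PY'
import sqlite3
import sys

src = sqlite3.connect(sys.argv[1])
dst = sqlite3.connect(sys.argv[2])
with dst:
    # Connection.backup() copies a consistent snapshot even if src is in use.
    src.backup(dst)
dst.close()
src.close()
PY
}
```

Run something like `backup_sqlite ./data/kuma.db ./kuma-copy.db` on monk, copy `kuma-copy.db` to kscloud1, and place it in the new Uptime Kuma data directory while that container is stopped.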
---
## Checkpoint
- [ ] Tailscale is installed on both machines and they can reach each other
- [ ] Shared Postgres and Redis are running on kscloud1's Tailscale IP
- [ ] Both Authentik instances (monk and kscloud1) point to the shared database
- [ ] All 11 services are running on kscloud1
- [ ] Cloudflare Tunnel shows 2 connectors
- [ ] Website works when monk's cloudflared is stopped
---
**Next:** [Step 8 — Monitoring](08-monitoring.md)