kitestacks-homelab/homelab-mastery/build-guide/with-ai/08-monitoring.md
kenpat 1e8319ee75 docs: comprehensive homelab-mastery rewrite with full build guides
Complete documentation suite for KiteStacks covering all 11 services across
2-host active-active architecture. Includes beginner track (with AI, 8 files)
and advanced track (without AI, 7 files) with time estimates, real troubleshooting
cases, and command-by-command explanations. Updates certifications roadmap to
reflect July 7 2026 A+ Core 2 exam goal.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-19 01:08:43 -05:00


# Step 8 — Monitoring
**Track:** With AI (Beginner)
**Time for this step:** 2-3 hours
Monitoring means knowing when something is wrong before your users tell you.
In this step you will set up three layers of monitoring:
1. **Grafana** — beautiful dashboards showing CPU, RAM, disk, and network over time
2. **Uptime Kuma** — checks every 60 seconds that each service responds correctly
3. **Conky** — a desktop widget on your home computer showing live kscloud1 status
---
## Monitoring Layer 1 — Grafana + Prometheus
You already deployed Grafana and Prometheus in Step 5. Now configure them properly.
### Edit the Prometheus Config
Prometheus needs to know where to collect metrics from. Tell it about both machines:
```bash
nano ~/kitestacks-live/docker/prometheus/prometheus.yml
```
Add this content:
```yaml
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'monk-node'
    static_configs:
      - targets: ['node-exporter:9100']
        labels:
          instance: 'monk'
  - job_name: 'kscloud1-node'
    static_configs:
      - targets: ['YOUR_VPS_IP:9100']
        labels:
          instance: 'kscloud1'
```
Replace `YOUR_VPS_IP` with your VPS's public IP address.
**On kscloud1**, make sure node-exporter is configured to be reachable publicly:
```yaml
# In node-exporter's docker-compose.yml on kscloud1
ports:
  - "0.0.0.0:9100:9100"
```
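Before restarting Prometheus, it can help to confirm from monk that node-exporter on kscloud1 actually answers. The sketch below is one way to do that. `metric_value` is a hypothetical helper name, and it assumes node-exporter serves the standard Prometheus text format at `/metrics`.

```bash
# Hypothetical helper: print the value of one un-labelled metric from
# Prometheus text-format output read on stdin.
metric_value() {
  awk -v name="$1" '$1 == name { print $2; exit }'
}

# Example usage (run on monk; YOUR_VPS_IP is the same placeholder as above):
#   curl -fsS http://YOUR_VPS_IP:9100/metrics | metric_value node_load1
```

If the command prints a number, the exporter is reachable. If curl times out, check the VPS firewall before looking at Prometheus.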
Restart Prometheus:
```bash
cd ~/kitestacks-live/docker/prometheus
docker compose restart prometheus
```
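Once Prometheus is back up, its HTTP API can report whether both scrape targets are healthy. This is a rough sketch under two assumptions: that port 9090 is published on monk (your Step 5 setup may only expose it inside the Docker network), and that the API returns compact JSON. `unhealthy_targets` is a hypothetical helper, and a JSON tool such as `jq` would be more robust than this text match.

```bash
# Hypothetical helper: count targets whose health is anything other than "up"
# in the JSON from Prometheus's /api/v1/targets endpoint (read on stdin).
unhealthy_targets() {
  grep -o '"health":"[a-z]*"' | grep -vc '"health":"up"' || true
}

# Example usage; prints 0 when every target is being scraped successfully:
#   curl -fsS http://localhost:9090/api/v1/targets | unhealthy_targets
```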
### Configure Grafana Provisioning
Tell Grafana to load Prometheus as a data source automatically, and to look for
dashboard files in a provisioning folder:
Create `~/kitestacks-live/docker/grafana/provisioning/datasources/prometheus.yml`:
```yaml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    uid: "000000001"  # quoted so YAML keeps it as a string
    url: http://prometheus:9090
    isDefault: true
```
Create `~/kitestacks-live/docker/grafana/provisioning/dashboards/dashboards.yml`:
```yaml
apiVersion: 1
providers:
  - name: default
    folder: KiteStacks
    type: file
    options:
      path: /etc/grafana/provisioning/dashboards
```
The Node Exporter Full dashboard (id 1860) can be imported from Grafana's dashboard library:
1. Log in to grafana.yourdomain.com
2. Left menu → Dashboards → Import
3. Enter ID: `1860`
4. Select your Prometheus datasource
5. Import
You should now see CPU, RAM, disk, and network graphs for both monk and kscloud1.
Switch between them using the "instance" dropdown at the top of the dashboard.
---
## Monitoring Layer 2 — Uptime Kuma
You set up Uptime Kuma in Step 5. Now add monitors for all your services.
Log in to `status.yourdomain.com` and add an HTTP monitor for each service:
| Monitor Name | URL | Check Interval |
|-------------|-----|----------------|
| Main Website | https://www.yourdomain.com | 60s |
| Authentik | https://auth.yourdomain.com | 60s |
| Forgejo | https://gitforge.yourdomain.com | 60s |
| KiteAI | https://ai.yourdomain.com | 60s |
| Karakeep | https://links.yourdomain.com | 60s |
| Kavita | https://kavita.yourdomain.com | 60s |
| Grafana | https://grafana.yourdomain.com | 60s |
| BookStack | https://wiki.yourdomain.com | 60s |
| OSTicket | https://tasks.yourdomain.com | 60s |
| Portainer | https://portainer.yourdomain.com | 60s |
| kscloud1 | (ping to kscloud1 IP) | 60s |
| Monk | (ping to monk's Tailscale IP) | 60s |
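Before adding each monitor, a quick check from the command line shows what status code Uptime Kuma is likely to see. This is a sketch, not how Uptime Kuma itself is implemented. `is_up` and `check_url` are hypothetical helpers, and treating 200-299 as "up" assumes you leave each monitor's accepted status codes at the default.

```bash
# Hypothetical helper: succeed when an HTTP status code is in the 200-299 range.
is_up() {
  case "$1" in
    2[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: fetch a URL (following redirects) and report UP or DOWN.
check_url() {
  local code
  code="$(curl -sL -o /dev/null -m 10 -w '%{http_code}' "$1")"
  if is_up "$code"; then
    echo "UP   $code $1"
  else
    echo "DOWN $code $1"
  fi
}

# Example usage:
#   check_url https://grafana.yourdomain.com
```

A service that reports DOWN here will most likely show as down in Uptime Kuma too, so fix it first.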
Then create a Status Page:
1. Status Pages → New Status Page
2. Title: "KiteStacks Status"
3. Slug: `homelab`
4. Add all monitors to it
**Push Uptime Kuma to kscloud1:**
The Conky widget on your desktop reads kscloud1's Uptime Kuma, not monk's. Push monk's
database to kscloud1 after setting up monitors:
**Ask your AI:** "How do I copy a Docker named volume's SQLite database from one machine
to another using Python's `sqlite3.Connection.backup()` method?"
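For reference while you work through that question, the backup step itself can be sketched as below. It assumes `python3` is installed on the host and that you have already located the database file inside the volume. `backup_sqlite` is a hypothetical helper, and the file names in the usage comment are placeholders rather than paths taken from this guide.

```bash
# Hypothetical helper: copy a SQLite database safely while it may be in use,
# using Connection.backup() from Python's standard sqlite3 module.
# $1 = source .db file, $2 = destination .db file
backup_sqlite() {
  python3 - "$1" "$2" <<'PY'
import sqlite3
import sys

src = sqlite3.connect(sys.argv[1])
dst = sqlite3.connect(sys.argv[2])
try:
    src.backup(dst)  # copies every page into the destination database
finally:
    dst.close()
    src.close()
PY
}

# Example usage (placeholder paths; adjust to where your volume lives):
#   backup_sqlite ./kuma.db ./kuma-copy.db
#   scp ./kuma-copy.db kscloud1:/tmp/kuma.db
```

Copying through `backup()` rather than `cp` avoids picking up a half-written file if Uptime Kuma writes to the database during the copy.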
---
## Monitoring Layer 3 — Conky Desktop Widget
Conky is a program that draws information on your desktop background in real time.
Your KiteStacks widget shows whether each service on kscloud1 is up (green dot) or
down (red dot), refreshed every 15 seconds.
### Install Conky
```bash
sudo apt install conky-all
```
### Install the Widget Script
The widget script reads Uptime Kuma's API and formats the output for Conky.
The script lives at `apps/conky/kitestacks-uptime-widget.sh` in the homelab repo and is installed to `~/.local/bin/`.
Copy it to your machine:
```bash
mkdir -p ~/.local/bin
cp ~/kitestacks-homelab/apps/conky/kitestacks-uptime-widget.sh ~/.local/bin/
chmod +x ~/.local/bin/kitestacks-uptime-widget.sh
```
Edit the script to use your kscloud1's Tailscale IP:
```bash
nano ~/.local/bin/kitestacks-uptime-widget.sh
```
Change the `KUMA_URL` line:
```bash
KUMA_URL="http://100.123.x.x:3001" # kscloud1's Tailscale IP
```
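For orientation, the core of a widget script like this can be quite small. The sketch below is not the script from the repo. `conky_dot` is a hypothetical function showing one way to turn an Uptime Kuma heartbeat status (1 means up, 0 means down) into a coloured Conky line. It assumes the Conky config runs the script with `execpi`, so that the `${color ...}` markup is interpreted.

```bash
# Hypothetical sketch: print one Conky-formatted line for a service.
# $1 = service name, $2 = Uptime Kuma heartbeat status (1 = up, anything else = not up)
conky_dot() {
  local colour="red"
  if [ "$2" = "1" ]; then
    colour="green"
  fi
  printf '${color %s}●${color} %s\n' "$colour" "$1"
}

# Example usage:
#   conky_dot Grafana 1
#   conky_dot Forgejo 0
```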
### Enable the Conky Config
```bash
# Create the config directory first; cp fails if it does not exist yet
mkdir -p ~/.config/conky
cp ~/kitestacks-homelab/apps/conky/kitestacks-uptime.conf ~/.config/conky/kitestacks-uptime.conf
conky -c ~/.config/conky/kitestacks-uptime.conf -d
```
The widget should appear in the top-right corner of your desktop, showing a dot for
each service — green for up, red for down.
**Ask your AI:** "How do I make Conky start automatically when I log in to my Ubuntu desktop?"
---
## Setting Up Alerts
Uptime Kuma can send you a notification on your phone when a service goes down.
**Option 1: ntfy (recommended — self-hosted)**
You have ntfy running as a container. Set up an ntfy notification in Uptime Kuma:
- Notification Type: ntfy
- URL: your ntfy server URL
- Topic: choose a topic name (e.g., `homelab-alerts`)
Install the ntfy app on your phone and subscribe to your topic.
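To confirm that your phone is subscribed correctly before relying on Uptime Kuma, you can publish a test message yourself. This is a minimal sketch: `ntfy_topic_url` and `ntfy_send` are hypothetical helpers, the server URL is a placeholder, and it assumes your ntfy server accepts unauthenticated publishes (add credentials if yours does not).

```bash
# Hypothetical helper: build the publish URL for a topic, tolerating a
# trailing slash on the server URL.
ntfy_topic_url() {
  printf '%s/%s\n' "${1%/}" "$2"
}

# Hypothetical helper: publish a plain-text message to a topic.
# $1 = ntfy server URL, $2 = topic, $3 = message
ntfy_send() {
  curl -fsS -H "Title: KiteStacks test" -d "$3" "$(ntfy_topic_url "$1" "$2")"
}

# Example usage (placeholder server URL):
#   ntfy_send https://ntfy.yourdomain.com homelab-alerts "Test alert from monk"
```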
**Option 2: Email**
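Add an Email (SMTP) notification in Uptime Kuma. You will need your mail provider's SMTP host, port, username, and password, plus the address that should receive the alerts.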
**Ask your AI:** "How do I configure Uptime Kuma to send notifications via ntfy?"
---
## Checkpoint
- [ ] Prometheus is collecting metrics from both monk and kscloud1
- [ ] Grafana shows Node Exporter Full dashboard with both hosts
- [ ] Uptime Kuma has monitors for all 11 services
- [ ] Uptime Kuma status page is live at status.yourdomain.com/status/homelab
- [ ] Uptime Kuma database has been pushed to kscloud1
- [ ] Conky widget is showing on your desktop with live service status
- [ ] You receive a notification when you stop a service's container (or press Test in the Uptime Kuma notification settings)
---
## Congratulations — Your Homelab Is Complete
You have built a production homelab with:
- 11 self-hosted services running in Docker
- Single sign-on via Authentik
- Cloud failover on a Hetzner VPS
- Private networking over Tailscale
- Real-time monitoring via Grafana and Uptime Kuma
- A live desktop status widget
Everything you built here maps directly to enterprise cloud engineering skills.
Every concept has a certification that covers it in depth.
**Your next step:** [certifications/roadmap.md](../../certifications/roadmap.md)