# Step 8: Monitoring

**Track:** With AI (Beginner)

**Time for this step:** 2–3 hours

Monitoring means knowing when something is wrong before your users tell you. In this step you will set up three layers of monitoring:

1. **Grafana**: beautiful dashboards showing CPU, RAM, disk, and network over time
2. **Uptime Kuma**: checks every 60 seconds that each service responds correctly
3. **Conky**: a desktop widget on your home computer showing live kscloud1 status

---

## Monitoring Layer 1: Grafana + Prometheus

You already deployed Grafana and Prometheus in Step 5. Now configure them properly.

### Edit the Prometheus Config

Prometheus needs to know where to collect metrics from. Tell it about both machines:

```bash
nano ~/kitestacks-live/docker/prometheus/prometheus.yml
```

Add this content:

```yaml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'monk-node'
    static_configs:
      - targets: ['node-exporter:9100']
        labels:
          instance: 'monk'

  - job_name: 'kscloud1-node'
    static_configs:
      - targets: ['YOUR_VPS_IP:9100']
        labels:
          instance: 'kscloud1'
```

Replace `YOUR_VPS_IP` with your VPS's public IP address.
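
Each entry under `targets` is an HTTP endpoint that serves plain-text metrics at `/metrics`, and Prometheus fetches it once per `scrape_interval`. The sketch below is one rough way to check that an address answers like node-exporter. The helper names (`parse_metric_names`, `looks_like_node_exporter`, `fetch_metrics`) are illustrative and not part of Prometheus or the homelab repo. It assumes node-exporter's metric names start with `node_`.

```python
import urllib.request


def parse_metric_names(exposition_text):
    """Return the set of metric names found in Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample looks like `name{labels} value` or `name value`.
        name = line.split("{", 1)[0].split(" ", 1)[0]
        names.add(name)
    return names


def looks_like_node_exporter(exposition_text):
    """True if at least one metric name carries the node_ prefix."""
    return any(n.startswith("node_") for n in parse_metric_names(exposition_text))


def fetch_metrics(target, timeout=5.0):
    """Download the /metrics page from a host:port target (plain HTTP)."""
    with urllib.request.urlopen(f"http://{target}/metrics", timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Example (needs network access, so it is left commented out):
# print(looks_like_node_exporter(fetch_metrics("YOUR_VPS_IP:9100")))
```

Note that `node-exporter:9100` only resolves inside the Docker network, so from a shell on the host you would pass an address that is reachable from there.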
**On kscloud1**, make sure node-exporter's port is published on the public interface so that Prometheus on monk can reach it:

```yaml
# In node-exporter's docker-compose.yml on kscloud1
ports:
  - "0.0.0.0:9100:9100"
```

Restart Prometheus so it picks up the new config:

```bash
cd ~/kitestacks-live/docker/prometheus
docker compose restart prometheus
```

### Configure Grafana Provisioning

Tell Grafana to load Prometheus as a data source automatically and to pick up any dashboard JSON files placed in its provisioning folder.

Create `~/kitestacks-live/docker/grafana/provisioning/datasources/prometheus.yml`:

```yaml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    uid: "000000001"  # quoted so YAML reads it as a string, not a number
    url: http://prometheus:9090
    isDefault: true
```

Create `~/kitestacks-live/docker/grafana/provisioning/dashboards/dashboards.yml`:

```yaml
apiVersion: 1
providers:
  - name: default
    folder: KiteStacks
    type: file
    options:
      path: /etc/grafana/provisioning/dashboards
```

Import the Node Exporter Full dashboard (ID 1860) from Grafana's dashboard library:

1. Log in to grafana.yourdomain.com
2. Left menu → Dashboards → Import
3. Enter ID: `1860`
4. Select your Prometheus data source
5. Click Import

You should now see CPU, RAM, disk, and network graphs for both monk and kscloud1. Switch between them using the "instance" dropdown at the top of the dashboard.

---

## Monitoring Layer 2: Uptime Kuma

You set up Uptime Kuma in Step 5. Now add monitors for all your services.
Log in to `status.yourdomain.com` and add a monitor for each row below. Use an HTTP monitor for the services and a ping monitor for the two hosts in the last two rows:

| Monitor Name | URL | Check Interval |
|--------------|-----|----------------|
| Main Website | https://www.yourdomain.com | 60s |
| Authentik | https://auth.yourdomain.com | 60s |
| Forgejo | https://gitforge.yourdomain.com | 60s |
| KiteAI | https://ai.yourdomain.com | 60s |
| Karakeep | https://links.yourdomain.com | 60s |
| Kavita | https://kavita.yourdomain.com | 60s |
| Grafana | https://grafana.yourdomain.com | 60s |
| BookStack | https://wiki.yourdomain.com | 60s |
| OSTicket | https://tasks.yourdomain.com | 60s |
| Portainer | https://portainer.yourdomain.com | 60s |
| kscloud1 | (ping to kscloud1 IP) | 60s |
| Monk | (ping to monk's Tailscale IP) | 60s |

Then create a Status Page:

1. Status Pages → New Status Page
2. Title: "KiteStacks Status"
3. Slug: `homelab`
4. Add all monitors to it

**Push Uptime Kuma to kscloud1:** The Conky widget on your desktop reads kscloud1's Uptime Kuma, not monk's. After setting up the monitors, push monk's database to kscloud1.

**Ask your AI:** "How do I copy a Docker named volume's SQLite database from one machine to another using Python's `sqlite3.Connection.backup()` method?"

---

## Monitoring Layer 3: Conky Desktop Widget

Conky is a program that draws information on your desktop background in real time. Your KiteStacks widget shows whether each service on kscloud1 is up (green dot) or down (red dot), refreshed every 15 seconds.

### Install Conky

```bash
sudo apt install conky-all
```

### Install the Widget Script

The widget script reads Uptime Kuma's API and formats the output for Conky. It lives at `apps/conky/kitestacks-uptime-widget.sh` in the homelab repo.
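
The real widget is a shell script and its internals are not shown here. The Python sketch below only illustrates the general idea of "read the status, print colored dots"; it is not the repo's implementation. It assumes the data comes from Uptime Kuma's public status-page endpoints for the `homelab` slug, that a heartbeat `status` of `1` means up, and that each heartbeat list is ordered oldest to newest. The grey "no data yet" dot is an addition of this sketch.

```python
import json
import urllib.request

UP = 1  # assumed Uptime Kuma heartbeat status value for "up"


def latest_status(monitors, heartbeats):
    """Map monitor name -> True (up), False (down), or None (no heartbeat yet)."""
    result = {}
    for monitor in monitors:
        beats = heartbeats.get(str(monitor["id"]), [])
        # Assumption: the newest heartbeat is the last element of the list.
        result[monitor["name"]] = (beats[-1]["status"] == UP) if beats else None
    return result


def to_conky_lines(statuses):
    """Format one line per service using Conky's ${color ...} markup."""
    lines = []
    for name, is_up in statuses.items():
        color = "grey" if is_up is None else ("green" if is_up else "red")
        lines.append("${color " + color + "}●${color} " + name)
    return lines


def fetch_json(url, timeout=5.0):
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def render(base_url, slug):
    """Fetch status-page data and return the text Conky would display."""
    page = fetch_json(f"{base_url}/api/status-page/{slug}")
    beats = fetch_json(f"{base_url}/api/status-page/heartbeat/{slug}")
    monitors = [m for group in page["publicGroupList"] for m in group["monitorList"]]
    return "\n".join(to_conky_lines(latest_status(monitors, beats["heartbeatList"])))


# Example (needs a reachable Uptime Kuma, so it is left commented out):
# print(render("http://100.123.x.x:3001", "homelab"))
```

For Conky to interpret the `${color ...}` markup in a script's output, the config has to run the script through a parsing exec variable such as `execpi` rather than plain `exec`.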
Copy the script to your machine:

```bash
mkdir -p ~/.local/bin
cp ~/kitestacks-homelab/apps/conky/kitestacks-uptime-widget.sh ~/.local/bin/
chmod +x ~/.local/bin/kitestacks-uptime-widget.sh
```

Edit the script to use kscloud1's Tailscale IP:

```bash
nano ~/.local/bin/kitestacks-uptime-widget.sh
```

Change the `KUMA_URL` line:

```bash
KUMA_URL="http://100.123.x.x:3001"  # kscloud1's Tailscale IP
```

### Enable the Conky Config

```bash
# Create the config directory first; cp fails if it does not exist
mkdir -p ~/.config/conky
cp ~/kitestacks-homelab/apps/conky/kitestacks-uptime.conf ~/.config/conky/kitestacks-uptime.conf
# -c selects the config file, -d runs Conky in the background
conky -c ~/.config/conky/kitestacks-uptime.conf -d
```

The widget should appear in the top-right corner of your desktop, showing a dot for each service: green for up, red for down.

**Ask your AI:** "How do I make Conky start automatically when I log in to my Ubuntu desktop?"

---

## Setting Up Alerts

Uptime Kuma can send a notification to your phone when a service goes down.

**Option 1: ntfy (recommended, self-hosted)**

You have ntfy running as a container. Set up an ntfy notification in Uptime Kuma:

- Notification Type: ntfy
- URL: your ntfy server URL
- Topic: choose a topic name (e.g., `homelab-alerts`)

Install the ntfy app on your phone and subscribe to your topic.

**Option 2: Email**

Add an Email (SMTP) notification in Uptime Kuma. You will need your mail provider's SMTP host, port, and login, plus the address that should receive the alerts.

**Ask your AI:** "How do I configure Uptime Kuma to send notifications via ntfy?"
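
Before relying on Uptime Kuma for alerts, it can help to confirm that your phone actually receives messages on the topic. ntfy accepts a plain HTTP POST to `<server>/<topic>` with the message as the request body. The sketch below builds such a request. The server URL `https://ntfy.yourdomain.com` is a placeholder, the function names are illustrative, and it assumes the topic allows publishing without authentication.

```python
import urllib.request


def build_ntfy_request(server_url, topic, message, title="Homelab test", priority="default"):
    """Build (but do not send) a POST request that publishes a message to an ntfy topic."""
    url = server_url.rstrip("/") + "/" + topic
    return urllib.request.Request(
        url,
        data=message.encode("utf-8"),  # the body is the notification text
        headers={"Title": title, "Priority": priority},
        method="POST",
    )


def send(request, timeout=5.0):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status


# Example (needs a reachable ntfy server, so it is left commented out):
# req = build_ntfy_request("https://ntfy.yourdomain.com", "homelab-alerts", "Test alert")
# print(send(req))
```

If the message shows up on your phone, any remaining delivery problem is on the Uptime Kuma side of the setup rather than in ntfy or the app subscription.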

---

## Checkpoint

- [ ] Prometheus is collecting metrics from both monk and kscloud1
- [ ] Grafana shows the Node Exporter Full dashboard with both hosts
- [ ] Uptime Kuma has all 12 monitors from the table (10 services plus 2 host pings)
- [ ] Uptime Kuma status page is live at status.yourdomain.com/status/homelab
- [ ] Uptime Kuma database has been pushed to kscloud1
- [ ] Conky widget is showing on your desktop with live service status
- [ ] You receive a notification when you stop one of the monitored containers (pausing a monitor in Uptime Kuma only stops the checks and does not send an alert)

---

## Congratulations: Your Homelab Is Complete

You have built a production homelab with:

- 11 self-hosted services running in Docker
- Single sign-on via Authentik
- Cloud failover on a Hetzner VPS
- Private networking over Tailscale
- Real-time monitoring via Grafana and Uptime Kuma
- A live desktop status widget

Everything you built here maps directly to enterprise cloud engineering skills. Every concept has a certification that covers it in depth.

**Your next step:** [certifications/roadmap.md](../../certifications/roadmap.md)