Network monitoring #1: Server temperatures, MQTT and Bash

It’s always a good idea to know the state of the machines on your network. Plus, I’m a geek, and messing around with data appeals to me.

One metric I especially like to keep an eye on is CPU temperature. Why? Why not?

I’ve recently got into InfluxDB and Grafana, and that’s the way all future projects will be going. I’ll be getting into those in future posts. Before that, though, I implemented a somewhat more piecemeal approach which pretty much works. And I’m keeping it, because it provides a backup and sanity check.

This is what I like to call my ‘server room’. The dark box on the top shelf is the Odroid H2+. This is a slightly earlier incarnation than I have currently, with the bottom Raspberry Pi in the rack being a 3B+ model. It has since been replaced with a 4B.

What’s in a network?

Here’s a quick overview of my network – or, at least, the salient parts of it. There are four Raspberry Pi 4Bs in a rack – named Ada, Lulu, Peach and Polar.

Ada runs a dnsmasq caching DNS server and a Samba share.

Lulu runs InfluxDB 1.8 (with Chronograf and Telegraf), Grafana and Bookstack (actually a mirror of Polar below), all in Docker containers.

Peach runs Pi-Hole natively and ejabberd in a Docker container. It also runs an FTP server.

Polar (so called because it’s fitted with an Ice Tower cooler) runs Bookstack and Homer in Docker containers. It also runs a Python script each hour to test and report my Internet speed.

These machines are all a little under-used at the moment, but I have plans. They all run Raspberry Pi OS Lite.

There’s also Dawnclock, a Raspberry Pi-based alarm clock.

And I have an Odroid H2+ called Pushkin (named after our cat) as my main server, running Ubuntu. This has 64GB of eMMC from which the OS runs, a 240GB NVMe for the main apps (including Docker), a 240GB SSD for a Samba share and a 1TB HDD which is first-line backup for the other drives. (I also do nightly off-site backups to AWS S3. If you want to know how I do that, ask in the comments.) Pushkin runs a bunch of things:

  • The aforementioned SMB share.
  • MQTT broker (Mosquitto).
  • MySQL server.
  • Intranet server (PHP, Apache).
  • Nextcloud server.
  • Portainer.
  • Zabbix server.
  • A REST API server written by me in Go.
  • MQTT-logger.py – a Python-based MQTT message monitoring program written by me. This actually plays a part in what’s coming. I won’t be sharing full code because, well, it’s ugly.

Finally, my daily driver machine is an iMac called Zola (named after our former dog) which, because it’s always left on, I treat kind of like a server.

Feeling hot, hot, hot

My first attempt to monitor CPU temps involved a Bash script on each machine, run by a cron job every five minutes. The script squawks out the temperature reading via MQTT. So the first step was to install mosquitto-clients on each machine. On the Linux boxen (the RPis and the Odroid) I simply installed with apt:

sudo apt install mosquitto-clients
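The cron side could be as simple as the sketch below. The script path is a made-up example rather than the real location, and the install line is left commented out so nothing changes until you've checked the entry.

```shell
# Hypothetical install location for the temperature script; adjust to suit.
SCRIPT="/usr/local/bin/cpu_temp_mqtt.sh"

# Crontab entry: "*/5" in the minute field means every five minutes.
CRON_LINE="*/5 * * * * ${SCRIPT}"

# Print the entry so it can be reviewed first.
echo "$CRON_LINE"

# To install it for the current user, keeping any existing entries:
# ( crontab -l 2>/dev/null; echo "$CRON_LINE" ) | crontab -
```

Running crontab -e and pasting the printed line in by hand works just as well.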

And here’s the Bash script:

#!/usr/bin/env bash
MQTT_SVR="10.0.30.60"
TOPIC="server/status"
MESSAGE="${HOSTNAME}_cpu"

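# The kernel reports the temperature in millidegrees Celsius
CPU=$(</sys/class/thermal/thermal_zone0/temp)
CPU=$((CPU/1000))
MESSAGE="${MESSAGE}_${CPU}"

# Quote the variables so odd characters can't split the arguments
mosquitto_pub -h "$MQTT_SVR" -t "$TOPIC" -m "$MESSAGE" -q 1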

exit 0

MQTT_SVR is the IP address of the server running Mosquitto, the MQTT broker, on our network (i.e. Pushkin). That same machine, as I mentioned above, is running a Python program that logs MQTT messages. It recognises certain ones and writes the data to text files that are in the document root of the intranet server. The intranet server, in turn, uses PHP to display the contents of these files on the home page.
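The real logger is a Python program that isn't shown here, but the general idea of picking out the temperature messages and writing them to files can be sketched in shell. Everything in this sketch is an assumption for illustration: the function name, the one-file-per-host layout and the output directory in the usage comment are all hypothetical.

```shell
# Read "<host>_cpu_<temp>" messages on stdin, one per line, and write each
# temperature to <out_dir>/<host>.txt. Anything else is ignored.
log_status_messages() {
    out_dir=$1
    while IFS= read -r msg; do
        case $msg in
            *_cpu_*) ;;                # looks like a temperature report
            *) continue ;;             # some other message: skip it
        esac
        host=${msg%_cpu_*}             # text before the last "_cpu_"
        temp=${msg##*_cpu_}            # text after the last "_cpu_"
        case $host in
            ''|*[!A-Za-z0-9.-]*) continue ;;   # empty or unsafe as a file name
        esac
        case $temp in
            ''|*[!0-9]*) continue ;;   # not a whole number: skip it
        esac
        printf '%s\n' "$temp" > "$out_dir/$host.txt"
    done
}

# Possible usage (the output directory is a placeholder):
# mosquitto_sub -h 10.0.30.60 -t server/status -q 1 | log_status_messages /path/to/docroot/temps
```

Each new reading overwrites the previous one, so a page that displays these files always shows the latest value, and the file's modification time shows when the machine last reported.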

The relevant part of that page is shown on the right. The machines in grey are ones that haven’t reported a temperature today – most likely because they’re turned off.

You can see the names of the machines on which it’s running. As well as the servers, Argon and M2 are Raspberry Pi 4Bs in Argon cases, and Homer and Troy are Odyssey machines. They all run the script fine.

Zola, the iMac, is missing from the list because it can’t run that script.
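The script depends on the Linux sysfs thermal interface, which macOS doesn't provide. A more defensive way of reading the temperature might look like the sketch below. The function name and its optional path argument are hypothetical, added only so the function is easy to try out on a machine without the real file.

```shell
# Print the CPU temperature in whole degrees Celsius, or return non-zero
# if the thermal file can't be read (as on a machine without sysfs).
read_cpu_temp() {
    zone_file=${1:-/sys/class/thermal/thermal_zone0/temp}
    [ -r "$zone_file" ] || return 1
    read -r millideg < "$zone_file" || return 1
    case $millideg in
        ''|*[!0-9]*) return 1 ;;   # not a plain number: give up
    esac
    echo $((millideg / 1000))      # millidegrees to whole degrees
}
```

A caller could then publish only when the function succeeds, for example with: if temp=$(read_cpu_temp); then ... fi, and simply do nothing on a machine like Zola.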

This system seems (and is) complicated, but it mostly works. For some reason, though, the MQTT monitoring program does occasionally miss a message. I’m not sure why. Even playing with QoS settings doesn’t seem to help.

That’s why I decided to up my game and do something a little more direct. And this is where InfluxDB and Grafana come in.

We’ll get into that in Part 2.

 

 
