Mounting LVM partitions from the terminal on Linux

Hello there! Recently I found myself with the interesting task of mounting an LVM partition by hand. It wasn't completely straightforward and there was a bunch of guesswork involved, so I thought I'd document the process here.

For those who aren't aware, LVM stands for Logical Volume Manager, and it's present on Linux systems to make managing partitions easier. It can:

  • Move and resize partitions while they are still mounted
  • Span multiple disks

...but to my knowledge it doesn't have any redundancy (use Btrfs) or encryption (use LUKS) built in. It is commonly used to manage the partitions on your Linux desktop, as then you don't need to reboot into a live Linux environment to fiddle with your partitions as much.

LVM works on a layered system. There are 3 layers to it:

  1. Physical Volumes: Normal physical partitions (or whole disks) that LVM uses as its underlying storage.
  2. Volume Groups: Pools of storage made up of 1 or more physical volumes.
  3. Logical Volumes: The LVM-managed partitions themselves, which are allocated from a volume group.

In summary, logical volumes are allocated from a volume group, which in turn spans 1 or more physical volumes.
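
As an aside, if you just want a quick one-line-per-item summary of each of these layers, the pvs, vgs, and lvs commands (assuming they're available, which they should be anywhere LVM is installed) are handy:

sudo pvs  # List physical volumes
sudo vgs  # List volume groups
sudo lvs  # List logical volumes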

With this in mind, first list the available physical volumes and their associated volume groups, and identify which is the one you want to mount:

sudo vgdisplay

Notice the VG Size in the output. Comparing it with the output of lsblk -o NAME,RO,SIZE,RM,TYPE,MOUNTPOINT,LABEL,VENDOR,MODEL can be helpful to identify which one is which.

I encountered a situation where I had 2 volume groups with the same name - one from the host system I was working on, and another from the target disk I was trying to mount. In my situation each disk had its own volume group assigned to it, so I needed to rename one of the volume groups.

To do this, take the value of the VG UUID field of the volume group you want to rename from the output of sudo vgdisplay above, and then rename it like this:

sudo vgrename SOME_ID NEW_NAME

...for example, I did this:

sudo vgrename 5o1LoG-jFdv-v1Xm-m0Ca-vYmt-D5Wf-9AAFLm examplename

With that done, we can now locate the logical volume we want to mount. Do this by listing the logical volumes in the volume group you're interested in:

sudo lvdisplay vg_name

Note down the name of the logical volume you want to mount. Now we just need to figure out where it is actually located in /dev so that we can mount it. Despite the LV Path field appearing to show us this, it's not actually correct - at least on my system.

Instead, list the contents of /dev/mapper:

ls /dev/mapper

You should see the name of the logical volume that you want to mount in the form volumegroup-logicalvolumename. Once found, you should be able to mount it like so:

sudo mount /dev/mapper/volumegroup-logicalvolumename path/to/directory

...replacing path/to/directory with the path to the (empty) directory you want to mount it to.

If you can't find it, then it is probably because you plugged the drive in question in after you booted up. In this case, it's probable that the volume group is not active. You can check whether this is the case like so:

sudo lvscan

If it isn't active, then you can activate it like this:

sudo lvchange -a y vg_name

...replacing vg_name with the name of the volume group you want to activate. Once done, you can then mount the logical volume as I mentioned above.

Once you are done, unmounting it is a case of reversing these steps. First, unmount the partition:

sudo umount path/to/mount_point

Then, disable the volume group again:

sudo lvchange -a n vg_name

Finally, flush any cached writes to disk, just in case:

sync

Now, you can unplug the device from your machine.

That wraps up this quick tutorial. If you spot any mistakes in this, please do leave a comment below and I'll correct it.

Configuring an endlessh honeypot with rsyslog email notifications

Security is all about defence in depth, so I'm always looking for ways to better secure my home network. For example, I have cluster management traffic running over a Wireguard mesh VPN. Now, I'm turning my attention to the rest of my network.

To this end, while I have a guest network with wireless isolation enabled, I do not currently have a way to detect unauthorised devices connecting to my home WiFi network, or fake WiFi networks with the same name, etc. Detecting this is my next focus. While I've seen nzyme recently and it looks fantastic, it also looks more complicated to setup.

While I look into the documentation for nzyme, inspired by this reddit post I decided to setup a honeypot on my home network.

The goal of a honeypot is to detect threats moving around in a network. In my case, I want to detect if someone has connected to my network who shouldn't have done. Honeypots achieve this by pretending to be a popular service, but in reality they are there to collect information about potential threats.

To set one up, I found endlessh, which pretends to be an SSH server - but instead slowly sends an endless banner to the client, keeping the connection open as long as possible. It can also log connection attempts to syslog, which allows us to detect connections and send an alert.

Implementing this comes in 2 steps. First, we setup endlessh and configure it to log connection attempts. Then, we reconfigure rsyslog to send email alerts.

Setting up endlessh

I'm working on one of the Raspberry Pis running Raspberry Pi OS in my network, but this should work with other machines too.

If you're following along to implement this yourself, make sure you've moved SSH to another port number before you continue, as we'll be configuring endlessh to listen on port 22 - the default port for SSH. This is the port I imagine an automated network scanner would attempt to connect to by default if it were looking for SSH servers to crack.
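
If you haven't moved SSH to a different port yet, a minimal sketch of doing so on a standard OpenSSH setup looks something like this (2222 here is just an arbitrary example port - pick your own, and keep your existing session open until you've confirmed you can connect on the new one):

# In /etc/ssh/sshd_config, change (or add) the Port directive:
Port 2222

# Then restart the SSH daemon (the service is called sshd on some distributions):
sudo systemctl restart ssh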

Conveniently, endlessh has a package in the default Debian repositories:

sudo apt install endlessh

...adjust this for your own package manager if you aren't on an apt-based system.

endlessh has a configuration file at /etc/endlessh/config by default. Open it up for editing, and make it look something like this:

# The port on which to listen for new SSH connections.
Port 22

# Set the detail level for the log.
#   0 = Quiet
#   1 = Standard, useful log messages
#   2 = Very noisy debugging information
LogLevel 1

Before we can start the endlessh service, we need to reconfigure it to allow it to listen on port 22, as this is a privileged port number. Doing this requires 2 steps. First, allow the binary to listen on privileged ports:

sudo setcap CAP_NET_BIND_SERVICE=+eip "$(which "endlessh")";

Then, if you are running systemd (most distributions do by default), execute the following command:

sudo systemctl edit endlessh.service

This will allow you to append some additional directives to the service definition for endlessh, without editing the original apt-managed systemd service file. Add the following, and then save and quit:

[Service]
AmbientCapabilities=CAP_NET_BIND_SERVICE
PrivateUsers=false

Finally, we can restart the endlessh service:

sudo systemctl restart endlessh
sudo systemctl enable --now endlessh

That completes the setup of endlessh!

Configuring rsyslog to send email alerts

The second part of this process is to send automatic alerts whenever anyone connects to our endlessh service. Since endlessh forwards logs to syslog by default, reconfiguring rsyslog to send the alerts seems like the logical choice. In my case, I'm going to send email alerts - but other ways of sending alerts do exist - I just haven't looked into them yet.

To do this, you'll need either a working email server (I followed the Ars Technica taking email back series, but whatever you do it's not for the faint of heart! Command line experience is definitely required - if you're looking for a nice first project, try a web server instead), or an email account you can use. Note that I do not recommend using your own personal email account, as you'll have to store the password in plain text!

In my case, I have my own email server, and I have forwarded port 25 down an SSH tunnel so that I can use it to send emails (in the future I want to configure a proper smart host that listens on port 25 and forwards emails by authenticating against my server properly, but that's for another time as I have yet to find a relay-only MTA that also listens on port 25).

In a previous post, I implemented centralised logging - so I'm going to be reconfiguring my main centralised rsyslog instance.

To do this, open up /etc/rsyslog.d/10-endlessh.conf for editing, and paste in something like this:

template (name="mailSubjectEndlessh" type="string" string="[HONEYPOT] endlessh connection on %hostname%")

if ( ($programname == 'endlessh') and (($msg contains "ACCEPT") or ($msg contains "CLOSE")) ) then {
    action(type="ommail" server="localhost" port="20205"
        mailfrom="alerts@example.com"
        mailto=["admin@example.com"]
        subject.template="mailSubjectEndlessh"
        action.execonlyonceeveryinterval="3600"
    )
}

...where:

  • [HONEYPOT] endlessh connection on %hostname% is the subject name, and %hostname% is substituted for the actual hostname the honeypot is running on
  • alerts@example.com is the address that you want to send the alert FROM
  • admin@example.com is the address that you want to send the alert TO
  • 3600 is the minimum interval between emails, in seconds. Log lines are not collected up - only 1 log line is emailed at a time, and any others logged in-between are ignored until the interval expires, at which point the next log line that comes through triggers another email and the cycle repeats. If anyone knows how to change that behaviour, please leave a comment below.

Note that the template line is outside the if statement. This is important - I got a syntax error if I put it inside the if statement.

The if statement specifically looks for log messages with a tag of endlessh that contain either the substring ACCEPT or CLOSE. Only if those conditions are true will it send an email.
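
If you want to test the rule without waiting for a real connection, you can inject a fake log line with the right tag using logger - the message content here is entirely made up, it just needs to contain the substring ACCEPT or CLOSE:

logger -t endlessh "ACCEPT host=::ffff:203.0.113.1 port=55555"

Run this on a host whose logs reach the rsyslog instance holding the rule, and an email should arrive shortly afterwards.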

I have yet to learn how to configure rsyslog to authenticate while sending emails. I suspect the easiest way of achieving this is to setup a local SMTP relay-only MTA (Mail Transfer Agent) that rsyslog can connect to and send emails through, with the relay then authenticating against the real server and sending the email on rsyslog's behalf. I have yet to find such an MTA other than Postfix - which, while great, can be hugely complicated to setup. The other alternatives I've tried all implement sendmail, and while that's useful, they do not listen on port 25 (or any other port for that matter) as far as I can tell.

Anyway, the other file you need to edit is /etc/rsyslog.conf. Open it up for editing, and put this near the top:

module(load="ommail")

...this loads the mail output plugin that sends the emails.

Now that we've reconfigured rsyslog, we need to restart it:

sudo systemctl restart rsyslog

rsyslog is picky about its config file syntax, so make sure to check its status for error messages:

sudo systemctl status rsyslog

You can also use lnav to analyse your logs and find any error messages there too.
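
For example, to open the main syslog file (the exact path may differ on your distribution):

sudo lnav /var/log/syslog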

Conclusion

We've setup endlessh as a honeypot, and then reconfigured rsyslog to send email alerts. Test the system like so on your local machine:

ssh -vvv -p 22 someuser@yourserver

...and watch your inbox for the email alert that will follow shortly!

While this system isn't particularly useful on its own, it's a small part of a larger strategy for securing my network. It's also been a testing ground for me to configure rsyslog to send email alerts - something I may want to configure my centralised rsyslog logging system to do for other things in the future.

If you've found this post useful or you have some suggestions, please leave a comment below!

Sources and further reading

Centralising logs with rsyslog

I manage quite a number of servers at this point, and something that's been on my mind for a while now is centralising all the log files generated by them. By this, specifically I mean that I want to automatically gather all logs generated by all the systems I manage into a single place in real time.

While there are enterprise-grade log management setups such as the ELK stack (Elasticsearch, Logstash, and Kibana), as far as I'm aware they are all quite heavy. Given that my infrastructure is Raspberry Pi based (seriously, they use hardly any electricity at all compared to a regular desktop PC), such a setup would likely need multiple Pis to run.

With this in mind, I'm opting for a different kind of log management system, based on rsyslog (which is installed by default in most Linux distros) and lnav (which I've blogged about before: lnav basics tutorial). This runs much lighter, requiring only a fraction of a Raspberry Pi to operate - which is good, since the Raspberry Pi I've dedicated to monitoring the rest of the infrastructure currently also handles:

  1. Continuous Integration: Laminar (this will eventually be a Docker container on my Hashicorp Nomad cluster)
  2. Collectd (Collectd is really easy to setup and runs so light, I love it)

I'm sure you might be asking yourself what the purpose of this is. My reasoning is fourfold:

  1. Having all the logs in one place makes them easier to analyse all at once, without having to SSH into many different servers
  2. If a box goes down, then I can read the logs from it before I start attempting to fix it, giving me a heads up as to what the problem is (this works in conjunction with my collectd monitoring system)
  3. On the Raspberry Pis I manage, this prolongs the life of the microSD cards by reducing the number of writes thereto
  4. I gain a little bit of security, in that if a box is compromised, then unless the attacker also gains access to my logging server, they can't erase their tracks as easily as they might otherwise have done

With all this in mind, I thought that it's about time I actually did something about this. I've found that while the solution is actually really quite simple, it's not particularly easy to find, so I thought I'd post about it here.

In my setup, I'm going to be using a Raspberry Pi 4 with 4GB RAM I've dubbed eldarion as the server upon which I centralise my logs - it's the successor to an earlier Raspberry Pi 3B+ I called elessar, which died some years prior. It has a 120GB SATA SSD attached in a case that used to house a WD PiDrive (they don't sell those anymore :-/) that I had lying around, which I've formatted with Btrfs.

Before we begin, let's outline the setup we're aiming for with a diagram to avoid confusion:

(Above: A diagram of the rsyslog setup we're aiming for. See the explanation below.)

eldarion will host the rsyslog server (which is essentially just a reconfiguration of the existing rsyslog server it is most likely already running), while other servers connect using the syslog protocol via a TCP connection, which is encrypted with TLS, using the GnuTLS engine (the default built into rsyslog). TLS here is important, since logs are naturally rather sensitive as I'm sure you can imagine.

To follow along here, you will need a valid Let's Encrypt certificate. It just so happens that I have a web server hosting my collectd graph panel interface, so I'm using that.

Of course, rsyslog can be configured in arbitrarily complex ways (such as having clients send logs to servers that they themselves forward to yet other servers), but at least for now I'm keeping it (relatively) simple.

Preparing the server

To start this process, we want to ensure the logs for the local system are stored in the right place. In my case, I have my SSD mounted to /mnt/eldarion-data2, so I want to put my logs in /mnt/eldarion-data2/syslog/localhost. There are 2 ways of accomplishing this:

  1. Reconfigure rsyslog to save logs elsewhere
  2. Be lazy, and bind mount the target location to /var/log

Since I'm feeling lazy today, I'm going to go with option 2 here. It also helps if a badly written program decides it's a brilliant idea to write logs directly to /var/log itself instead of going through syslog.

If you're using DietPi, before you continue, do sudo dietpi-software and remove the existing logging system.

A bind mount is like a hard link for a directory, in that it makes a directory appear in multiple places at once. It acts as a separate "filesystem" though, I assume to allow for avoiding infinite loops. Bind mounts are also the technology behind volumes in Docker's backend, containerd.
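
As a quick illustration of the concept, here's a throwaway example using a hypothetical pair of directories:

mkdir -p /tmp/bind-demo/source /tmp/bind-demo/mirror
sudo mount --bind /tmp/bind-demo/source /tmp/bind-demo/mirror
touch /tmp/bind-demo/source/hello
ls /tmp/bind-demo/mirror    # 'hello' shows up here too
sudo umount /tmp/bind-demo/mirror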

Open /etc/fstab for editing, and add something like this on a new line:

/mnt/eldarion-data2/syslog/localhost    /var/log    none    auto,defaults,bind  0   0

...where /mnt/eldarion-data2/syslog/localhost is the location we want the data to be stored, and /var/log is the location we want to bind mount it to. Save and close /etc/fstab, and then mount the bind mount like so. Make sure /var/log is empty before mounting!

sudo mount /var/log
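
To double-check that the bind mount has taken effect, findmnt should now list /var/log as being backed by the target directory:

findmnt /var/log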

Next, we need to install some dependencies:

sudo apt install rsyslog rsyslog-gnutls

For some strange reason, TLS support is in a separate package on Debian-based systems. You'll need to investigate package names and translate this command for your distribution, of course.

Configuring the server

Now we have that taken care of, we can actually configure our server. Open /etc/rsyslog.conf for editing, and at the top put this:

# The $Thing syntax is apparently 'legacy', but I can't find how else we're supposed to do this
$DefaultNetstreamDriver gtls
$DefaultNetstreamDriverCAFile   /etc/letsencrypt/live/mooncarrot.space/chain.pem
$DefaultNetstreamDriverCertFile /etc/letsencrypt/live/mooncarrot.space/cert.pem
$DefaultNetstreamDriverKeyFile  /etc/letsencrypt/live/mooncarrot.space/privkey.pem

# StreamDriver.Mode=1 means TLS-only mode
module(load="imtcp" MaxSessions="500" StreamDriver.Mode="1" StreamDriver.AuthMode="anon")
input(type="imtcp" port="514")

$template remote-incoming-logs,"/mnt/eldarion-data2/syslog/hosts/%HOSTNAME%/%PROGRAMNAME%.log"
*.* ?remote-incoming-logs

You'll need to edit these bits to match your own setup:

  • /etc/letsencrypt/live/mooncarrot.space/: Path to the live directory there that contains the symlinks to the certs your Let's Encrypt client obtained for you
  • /mnt/eldarion-data2/syslog/hosts: The path to the directory we want to store the logs in

Save and close this, and then restart your server like so:

sudo systemctl restart rsyslog.service

Then, check to see if there were any errors:

sudo systemctl status rsyslog.service

Lastly, I recommend assigning a DNS subdomain to the server hosting the logs, such as logs.mooncarrot.space in my case. A single server can have multiple domain names of course, and this just makes it convenient if we ever move the rsyslog server elsewhere - as we won't have to go around and edit like a dozen config files (which would be very annoying and tedious).
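
Once the DNS record is in place, it's worth checking that it resolves from one of the client machines before pointing rsyslog at it (substituting your own subdomain, of course):

dig +short logs.mooncarrot.space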

Configuring a client

Now that we have our rsyslog server setup, it should be relatively straightforward to configure a client box to send logs there. This is a 3 step process:

  1. Configure the existing /var/log to be an in-memory tmpfs to avoid any potential writes to disk
  2. Add a cron script to wipe /var/log every hour to avoid it getting full by accident
  3. Reconfigure (and install, if necessary) rsyslog to send logs to our shiny new server rather than save them to disk

If you haven't already configured /var/log to be an in-memory tmpfs, it is relatively simple. If you're unsure whether it is or not, do df -h.
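
For a more targeted check, the Type column here should read tmpfs if /var/log is already an in-memory filesystem:

df -hT /var/log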

First, open /etc/fstab for editing, and add the following line somewhere:

tmpfs /var/log tmpfs size=50M,noatime,lazytime,nodev,nosuid,noexec,mode=1777

Then, save + close it, and mount /var/log. Again, make sure /var/log is empty before mounting! Weird things happen if you don't.

sudo mount /var/log

Secondly, save the following to /etc/cron.hourly/clear-logs:

#!/usr/bin/env bash
rm -rf /var/log/*

Then, mark it executable:

sudo chmod +x /etc/cron.hourly/clear-logs

Lastly, we can reconfigure rsyslog. The specifics of how you do this varies depending on what you want to achieve, but for a host where I want to send all the logs to the rsyslog server and avoid saving them to the local in-memory tmpfs at all, I have a config file like this:

#################
#### MODULES ####
#################

module(load="imuxsock") # provides support for local system logging
module(load="imklog")   # provides kernel logging support
#module(load="immark")  # provides --MARK-- message capability

###########################
#### GLOBAL DIRECTIVES ####
###########################

$IncludeConfig /etc/rsyslog.d/*.conf

# Where to place spool and state files
$WorkDirectory /var/spool/rsyslog

###############
#### RULES ####
###############
$DefaultNetstreamDriverCAFile   /etc/ssl/isrg-root-x1-cross-signed.pem
$DefaultNetstreamDriver         gtls
$ActionSendStreamDriverMode     1       # Require TLS
$ActionSendStreamDriverAuthMode anon
*.* @@(o)logs.mooncarrot.space:514  # Forward everything to our rsyslog server

#
# Emergencies are sent to everybody logged in.
#
*.emerg             :omusrmsg:*

This needs to be saved to the rsyslog config file, which is located at /etc/rsyslog.conf. In this case, I replace the entire config file with the above, but you can pick and choose (e.g. on some hosts I want to save to the local disk as well as send to the rsyslog server).

In the above you'll need to change the logs.mooncarrot.space bit - this should be the (sub)domain that you pointed at your rsyslog server earlier. The number after the colon (514) is the port number. The *.* tells it to send everything to the remote rsyslog server.

Before we're done here, we need to provide the rsyslog client with the CA certificate of the server (because, apparently, it isn't capable of ferreting around in /etc/ssl/certs like everyone else is). Since I'm using Let's Encrypt here, I downloaded their root certificate like this and it seemed to do the job:

sudo curl -sSL https://letsencrypt.org/certs/isrg-root-x1-cross-signed.pem -o /etc/ssl/isrg-root-x1-cross-signed.pem

Of course, one could generate their own CA and do mutual authentication for added security, but that's complicated, lots of effort, and probably unnecessary for my purposes as far as I can tell. I'll leave a link in the sources and further reading on how to do this if you're interested.

If you have a different setup, it's the $DefaultNetstreamDriverCAFile in the above you need to change to point at your actual CA certificate.

With that all configured, we can now restart the rsyslog client:

sudo systemctl restart rsyslog.service

...and, of course, check to see if there were any errors:

sudo systemctl status rsyslog.service

Finally, we also need to configure logrotate to rotate all these new log files. First, install logrotate if the logrotate command doesn't exist:

sudo apt install logrotate

Then, place the following in the file /etc/logrotate.d/centralisedlogging:

/mnt/eldarion-data2/syslog/hosts/*/*.log {
    rotate 12
    weekly
    missingok
    notifempty
    compress
    delaycompress
}

Of course, you'll want to replace /mnt/eldarion-data2/syslog/hosts/ with the directory you're storing the logs from the remote server in, and also customise the log rotation. For example, the 12 there is the number of old log files to keep, and weekly can be swapped for daily or even monthly if you like.

Conclusion

This has been a very quick whistle-stop tour of setting up an rsyslog server to centralise your logs. We've setup our rsyslog server to use a TLS encrypted connection to receive logs, which 1 or more clients can send logs to. We've also configured /var/log on both the server and the client to avoid awkward issues.

Moving forwards, I recommend reading my lnav basics tutorial blog post, which should be rather helpful in analysing the resulting log files.

lnav was not helpful however when I asked it to look at all the log files separately with sudo lnav */*.log, deciding to treat them as "generic logs" rather than "syslog logs", meaning that it didn't colour them properly, and also didn't allow for proper filtering. To this end, it may be beneficial to store all the logs in 1 file rather than in separate files. I'll keep an eye on this, and update this post if I figure out how to convince lnav to treat them properly.

Another slight snag with my approach here is that for some reason all the logs from elsewhere also end up in the generic /var/log/syslog file (hence how I found a 'workaround' for the above issue), resulting in duplicated logs. I have yet to find a solution to this issue, but I'm also not sure whether I want to keep the logs in 1 big file or in many smaller files yet.

These issues aside, I'm pretty satisfied with the results. Together with my existing collectd-based monitoring system (which I'll blog about how I've set that up if there's any interest - collectd is really easy to use), this is another step towards greater transparency into the infrastructure I manage.

In the future, I want to investigate generating notification alerts for issues in my infrastructure. These could come either from collectd, or from rsyslog, and I envision them going to a variety of places:

  1. Email (a daily digest perhaps?)
  2. XMPP (I've bridged to it from shell scripts before)

Given that my infrastructure is just something I run at home and I don't mind so much if it's down for a few hours, my focus here is not on notifying myself as soon as possible, but on notifying myself in a way that doesn't disturb me, so I can look into it in my own time.

If you found this tutorial / guide useful, please do comment below! It's really cool and motivating to see that the stuff I post on here helps others out.

Sources and further reading

How to pin an apt repository for preferential package installation

As described in my last post, pinning apt repositories is now necessary if you want to install Firefox from an apt repository (e.g. if you want to install Firefox Beta). This is not an especially difficult process, but it is significantly confusing, so I thought I'd write a post about it.

Pinning an apt repository means that even if there's a newer version of a package elsewhere, the 'older' version will still be installed from the apt repository you pin.

Be very careful with this technique. You can easily cause major issues with your system if you pin the wrong repository!

Firstly, you want to head to /etc/apt/sources.list.d/ and find the .list file for the repository you want to pin. Take note of the URL inside that file, and then run this command:

apt-cache policy

No root is necessary here, as it's still a read-only command. Depending on how many apt repositories you have installed on your system, there may be a significant amount of output. Find the lines in this output that correspond to the apt repository you want to preferentially install from. For this example, I'm going to pin the excellent nautilus-typeahead apt repository, so the bit I'm looking for looks like this:

999 http://ppa.launchpad.net/lubomir-brindza/nautilus-typeahead/ubuntu jammy/main amd64 Packages
    release v=22.04,o=LP-PPA-lubomir-brindza-nautilus-typeahead,a=jammy,n=jammy,l=nautilus-typeahead,c=main,b=amd64
    origin ppa.launchpad.net

From here, take a note of the o= bit. In my case, it's o=LP-PPA-lubomir-brindza-nautilus-typeahead. Then, create a new file in /etc/apt/preferences.d with the following content:

Package: *
Pin: release o=LP-PPA-lubomir-brindza-nautilus-typeahead
Pin-Priority: 1001

See that o=.... bit there? Replace it with the one for the repository you want to pin. The number there is the new priority of the repository. The numbers at the beginning of each line in the output of the apt-cache policy command are the priorities of your existing apt repositories, so this should give you an idea as to what number you need to use here - a higher number means a higher priority regardless of the version number of the packages contained therein.

Then, simply sudo apt update and sudo apt dist-upgrade, and apt should pick up the "upgrades" from your newly pinned repository! In some situations you may need to remove and reinstall the offending package if you encounter issues.
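
To double-check that the pin has taken effect, you can ask apt which version of a package it now considers the installation candidate. I'm using nautilus here purely as an example (it's what the repository I pinned provides) - substitute whichever package you care about:

apt-cache policy nautilus

The Candidate line should show the version from your newly pinned repository.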

Sources and further reading

Ubuntu 22.04 upgrade report

(Above: A slice of the official Ubuntu 22.04 Jammy Jellyfish wallpaper)

Hey there! Since Ubuntu 22.04 Jammy Jellyfish has recently been released, I've upgraded multiple machines to it, and I have enough to talk about that I thought it would be a good idea to write them up into a proper blog post for the benefit of others.

For reference, I've upgraded my main laptop on 20th May 2022 (10 days ago as of writing this post), and I've also upgraded one of the desktops I use at University. I have yet to upgrade starbeamrainbowlabs.com - the server this blog post is hosted on - as I'm waiting for 22.04.1 for that (it would be very awkward indeed if the upgrade failed or there was some other issue I'm not yet aware of).

The official release notes for Ubuntu 22.04 can be found here: https://discourse.ubuntu.com/t/jammy-jellyfish-release-notes/24668

There's also an official blog post that's ranked much higher in search engines, but it's not really very informative for me as I don't use the GNOME desktop - you're better off reading the real release notes above.

Thankfully, I have not encountered as many issues (so far!) with this update as I have with previous updates. While this update doesn't seem to change all that much aside from a few upgrades here and there, by far the biggest annoyance is shipping Firefox as a snap package by default.

Not only are they shipping it as a snap package, but they have bumped the epoch number, which means that the packages in the official firefox apt repository (beta users like me, use this one instead) are ignored in favour of the new snap package! I get that shipping a snap simplifies build systems for large projects like Firefox, but I have a number of issues with snapd:

  • Extra disk space usage: every snap package has its own version of its dependencies
  • Permissions: as far as I'm aware (please comment below if this is now fixed), there are permissions issues if you try to load a file from some places on disk when you're running an app installed via snapd, as it runs in a sandbox (this is also true of apps installed with flatpak). This makes using most applications completely impractical
  • Ease of updates: A minor annoyance, but with apps installed via snap I have 2 different package managers to worry about
  • Observability: Another minor concern, but with every package having its own local dependencies, it's more difficult to observe and understand what's going on, and to fix any potential issues

This aside, apt does allow for pinning apt repositories to work around this issue. I'll be posting a blog post on how this works more generally hopefully soon, but for now, you want to put this in a file at /etc/apt/preferences.d/firefox (after installing one of the above 2 apt repositories if you haven't done so already):

Package: *
Pin: release o=LP-PPA-mozillateam
Pin-Priority: 1001

...then run this sequence of commands:

sudo apt update
sudo apt purge firefox # This will *not* delete your user data - that's stored in your local user profile 
sudo apt install firefox

The above works for both the stable and beta versions. Optionally: sudo apt purge snapd.

I also found this necessary for the wonderful nautilus-typeahead apt repository.

This was the most major issue I encountered. Other than this, I ran into a number of little things that are worth noting before you decide to upgrade. Firstly, for those who dual (or triple or even more!) boot, the version of the grub bootloader shipped with Ubuntu does not detect other bootable partitions!

Warning: os-prober will not be executed to detect other bootable partitions.
Systems on them will not be added to the GRUB boot configuration.
Check GRUB_DISABLE_OS_PROBER documentation entry.

...so if you do run more than a single OS on your system, make sure you correct this after upgrading.
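
One way to correct it is to explicitly re-enable os-prober and regenerate the grub configuration - a minimal sketch, assuming a standard Ubuntu grub setup:

# In /etc/default/grub, add (or uncomment) the following line:
GRUB_DISABLE_OS_PROBER=false

# Then regenerate the grub configuration:
sudo update-grub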

Another thing is that, as usual, Ubuntu disables all third party apt repositories on upgrade. I strongly recommend paying very close attention to the list of packages that do-release-upgrade decides it needs to remove: if you installed e.g. Inkscape or Krita via an apt repository to get the latest versions thereof, you'll need to reinstall them after re-enabling your apt repositories. Personally, I say "no" to the reboot at the end of the upgrade process and fix my apt repositories first, before then running:

sudo apt update
sudo apt dist-upgrade
sudo apt autoremove
sudo apt autoclean
# See also https://gitlab.com/sbrl/bin/-/blob/master/update-system

...and only then rebooting.

While GitHub's Atom seems to be more and more inactive these days as people move over to Visual Studio Code, I still find myself using it regularly as my primary code editor. Unfortunately, I encountered this bug, so I needed to edit /usr/share/applications/atom.desktop to add --no-sandbox to the execution line when starting Atom. The Exec= line in that file now reads:

Exec=env ATOM_DISABLE_SHELLING_OUT_FOR_ENVIRONMENT=false /usr/bin/atom --no-sandbox %F

This issue only occurred on 1 of the 2 systems I've upgraded though, so I'm not sure of the root cause. Other random issues I encountered:

  • GDM has a truly awful shade of grey in the background now. This repository gives a way to fix this problem. Try to avoid an image that's too light in colour, as the white text of the lock screen becomes rather difficult to see.
  • Speaking of backgrounds, the upgrade reset my desktop background on both machines I upgraded. Make sure you have a copy of it stored away somewhere, as you'll need it. lightdm (the login screen I use on my main laptop in place of gdm) seems to be fine though.
  • tumbler - a d-bus thumbnailing service - was also automatically removed. This does not appear to have caused me any problems so far (though image previews now make transparent pixels appear white, which is really annoying and I haven't yet looked into a fix on that one), so I need to look into this one further.
  • If you're a regular user of Memtest86+, it may disappear from your grub bootloader menu if you use EFI boot now for some strange reason.
  • The colour scheme in the address bar of Nautilus (the file manager) seems a bit messed up for me, but this may have more to do with the desktop theme I'm using.

If I encounter any other issues while upgrading my servers in the future, I'll make another post here about it if it's a significant issue, or comment on/edit this post if it's a minor thing.

If you encounter any other issues upgrading that aren't mentioned here, please do leave a comment below with the issue you encountered and the solution / workaround you implemented to fix it.

Using whiptail for text-based user interfaces

One of my ongoing projects is to implement a Bash-based raspberry pi provisioning system for hosts in my raspberry pi cluster. This is particularly important given that Debian 11 bullseye was released a number of months ago, and while it is technically possible to upgrade a host in-place from Debian 10 buster to Debian 11 bullseye, this is a lot of work that I'd rather avoid.

In implementing a Bash-based provisioning system, I'll have a system that allows me to rapidly provision a brand-new DietPi (or potentially other OSes in the future, but that's out-of-scope of version 1) automatically. Once the provisioning process is complete, I need only reboot it and potentially set a static IP address on my router and I'll then have a fully functional cluster host that requires no additional intervention (except to update it regularly of course).

The difficulty here is that I don't yet have enough hosts in my cluster to have a clear server / worker division, since my Hashicorp Nomad and Consul clusters both have 3 server nodes for redundancy rather than 1. It is for this reason that I need the provisioning system to be able to ask me what configuration I want the new host to have.

To do this, I rediscovered the whiptail command, which is installed by default on pretty much every system I've encountered so far, and allows you to develop surprisingly flexible text-based user interfaces with relatively little effort, so I wanted to share it here.

Unfortunately, while it's very cool and also relatively easy to use, it also has a lot of options and can result in command invocations like this:

whiptail --title "Some title" --inputbox "Enter a hostname:" 10 40 "default_value" 3>&1 1>&2 2>&3;

...and it only gets more complicated from here. In particular, the 3>&1 1>&2 2>&3 bit there is a fancy way of flipping the standard output and standard error around.
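
To illustrate what that redirection is actually doing: whiptail draws its interface on the standard output (which needs to reach the terminal) and writes the user's answer to the standard error, so to capture the answer with $(...) the two have to be swapped via a temporary third file descriptor. A minimal example:

# 3>&1 saves the captured stdout on fd 3, 1>&2 sends the interface to the terminal
# via stderr, and 2>&3 sends the answer to the saved stdout so $(...) can capture it
answer="$(whiptail --inputbox "Enter a value:" 10 40 "some default" 3>&1 1>&2 2>&3)";
echo "You entered: ${answer}";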

I thought to myself that surely there must be a way that I can simplify this down to make it easier to use, so I implemented a number of wrapper functions:

ask_yesno() {
    local question="$1";

    # step_current / step_max are globals assumed to be set elsewhere by the provisioning script
    # Note: whiptail --yesno takes the text followed by the dialog height and width
    whiptail --title "Step ${step_current} / ${step_max}" --yesno "${question}" 8 40;
    return "$?"; # Not actually needed, but best to be explicit
}

This first one asks a simple yes/no question. Use it like this:

if ask_yesno "Some question here"; then
    echo "Yep!";
else
    echo "Nope :-/";
fi

Next up, to ask the user for a string of text:

# Asks the user for a string of text.
# $1    The window title.
# $2    The question to ask.
# $3    The default text value.
# Returns the answer as a string on the standard output.
ask_text() {
    local title="$1";
    local question="$2";
    local default_text="$3";
    whiptail --title "${title}" --inputbox "${question}" 10 40 "${default_text}" 3>&1 1>&2 2>&3;
    return "$?"; # Not actually needed, but best to be explicit
}

# Asks the user for a password.
# $1    The window title.
# $2    The question to ask.
# $3    The default text value.
# Returns the answer as a string on the standard output.
ask_password() {
    local title="$1";
    local question="$2";
    local default_text="$3";
    whiptail --title "${title}" --passwordbox "${question}" 10 40 "${default_text}" 3>&1 1>&2 2>&3;
    return "$?"; # Not actually needed, but best to be explicit
}

These both work in the same way - it's just that ask_password displays asterisks instead of the actual characters the user is typing, to hide what they're entering. Use them like this:

new_hostname="$(ask_text "Provisioning step 1 / 4" "Enter a hostname:" "${HOSTNAME}")";
sekret="$(ask_password "Provisioning step 2 / 4" "Enter a sekret:")";

The default value there is of course optional, since in Bash if a variable does not hold a value it is simply considered to be empty.

Finally, I needed a mechanism to ask the user to choose at most 1 value from a predefined list:

# Asks the user to choose at most 1 item from a list of items.
# $1        The window title.
# $2..$n    The items that the user must choose between.
# Returns the chosen item as a string on the standard output.
ask_multichoice() {
    local title="$1"; shift;
    local args=();
    while [[ "$#" -gt 0 ]]; do
        args+=("$1");
        args+=("$1");
        shift;
    done
    whiptail --nocancel --notags --menu "$title" 15 40 5 "${args[@]}" 3>&1 1>&2 2>&3;
    return "$?"; # Not actually needed, but best to be explicit
}

This one is a bit special, as it stores the items in an array before passing them to whiptail. This works because of word splitting, which is when the shell will substitute a variable with its contents before splitting the arguments up. Here's how you'd use it:

choice="$(ask_multichoice "How should I install Consul?" "Don't install" "Client mode" "Server mode")";

As an aside, the underlying mechanics as to why this works is best explained by example. Consider the following:

oops="a value with spaces";

node src/index.mjs --text $oops;

Here, we store the value we want to pass to the --text argument in a variable. Unfortunately, we didn't quote $oops when we passed it to our fictional Node.js script, so the shell actually interprets that Node.js call like this:

node src/index.mjs --text a value with spaces;

That's not right at all! Without the quotes around a value with spaces there, process.argv will actually look like this:

[
    '/usr/local/lib/node/bin/node',
    '/tmp/test/src/index.mjs',
    '--text',
    'a',
    'value',
    'with',
    'spaces'
]

The a value with spaces there has been considered by the Node.js subprocess as 4 different values!

Now, if we include the quotes there instead like so:

oops="a value with spaces";

node src/index.mjs --text "$oops";

...the shell will correctly expand it to look like this:

node src/index.mjs --text "a value with spaces";

... which then looks like this to our Node.js subprocess:

[
    '/usr/local/lib/node/bin/node',
    '/tmp/test/src/index.mjs',
    '--text',
    'a value with spaces'
]

Much better! This is important to understand, as when we start talking about arrays in Bash things start to work a little differently. Consider this example:

items=("an apple" "a banana" "an orange")

/tmp/test.mjs --text "${items[@]}"

Can you guess what process.argv will look like? The result might surprise you:

[
    '/usr/local/lib/node/bin/node',
    '/tmp/test.mjs',
    '--text',
    'an apple',
    'a banana',
    'an orange'
]

Each element of the Bash array has been turned into a separate item - even when we quoted it and the items themselves contain spaces! What's going on here?

In this case, we used [@] when addressing our items Bash array, which causes Bash to expand it like this:

/tmp/test.mjs --text "an apple" "a banana" "an orange"

...so it quotes each item in the array separately. If we forgot the quotes instead like this:

/tmp/test.mjs --text ${items[@]}

...we would get this in process.argv:

[
    '/usr/local/lib/node/bin/node',
    '/tmp/test.mjs',
    '--text',
    'an',
    'apple',
    'a',
    'banana',
    'an',
    'orange'
]

Here, Bash still expands each element separately, but does not quote each item. Because each item isn't quoted, when the command is actually executed, it splits everything a second time!

As a side note, if you want all the items in a Bash array in a single quoted item, you need to use an asterisk * instead of an at-sign @ like so:

/tmp/test.mjs --text "${items[*]}";

....which would yield the following process.argv:

[
    '/usr/local/lib/node/bin/node',
    '/tmp/test.mjs',
    '--text',
    'an apple a banana an orange'
]

With that, we have a set of functions that make whiptail much easier to use. Once it's finished, I'll write a post on my Bash-based cluster host provisioning script and explain my design philosophy behind it and how it works.

Switching from XFCE4 to KDE Plasma

While I use Unity (7.5) and Ubuntu on my main laptop, on my travel laptop I instead use Artix Linux. Recently, I've been experiencing an issue where, when I log in at the lock screen after resuming the device from sleep, I get a black screen.

Rather than digging around endlessly attempting to fix the issue (I didn't even know where to start), I've been meaning to try out KDE Plasma, which is 1 of a number of popular desktop environments available. To this end, I switched from XFCE (version 4) to KDE Plasma (5.24 as of the time of typing). This ultimately did end up fixing my issue (my travel laptop would win a prize for the most unusual software setup, as it originated as a Manjaro OpenRC machine).

Now that I've completed that switch (I'm typing this now in Atom running in the KDE Plasma desktop environment!), I thought I'd write up a quick post about the two desktop environments and my first impressions of KDE as compared to XFCE.

(Above: My KDE desktop environment, complete with a desktop background taken from CrossCode. The taskbar is at the top because this is how I had it configured in XFCE.)

The best way I suppose to describe the difference between XFCE and KDE is jumping from your garden pond into the local canal. While XFCE is fairly customisable, KDE is much more so - especially when it comes to desktop effects and the look and feel. I really appreciate the ability to customise the desktop effects to tune them to match what I've previously been used to in Unity (though I still use Unity on both my main laptop and my Lab PC at University) and XFCE.

One such example of this is the workspaces feature. You can customise the number of workspaces and also have them in a grid (just like Unity), which the GNOME desktop that comes with Ubuntu by default doesn't allow for. You can even tune the slide animation between desktops which I found helpful as the default animation was too slow for me.

It also has an enormous library of applications that complement the KDE desktop environment, with everything from your staples such as the terminal, an image viewer, and a file manager to more niche and specialised use-cases like a graph calculator and a colour contrast checker. While these can of course be installed in other desktop environments, it's cool to see such an expansive suite of programs for every conceivable use-case right there.

Related to this, there also appears to be a substantial number of widgets that you can add to your desktop. Like XFCE, KDE has a concept of panels which can hold 1 or more widgets in a line. This is helpful for monitoring system resources for example. While these are for the most part just as customisable as the main desktop environment, I wish that their dependencies were more clearly defined. On more than 1 occasion I found I was missing a dependency for some widget that wasn't mentioned in the documentation. For example, upowerd is required for the battery indicator to work (it wasn't running due to a bug caused by a package name change from the great migration of Manjaro back in 2017), and the plasma-nm pacman package is required for the network / WiFi indicator to work, but isn't specified as a dependency when you install the plasma-desktop package. Clearly some work is needed in this area (though, to be fair, as I mentioned earlier I have a very strange setup indeed).

I'm continuing to find 1000 little issues with it that I'm fixing 1 by 1 - just while writing this post I found that dolphin doesn't support jumping to the address bar if you start typing a forward slash / (or maybe it was another related issue? I can't remember), which is really annoying as I do this all the time - but this is a normal experience when switching desktop environments (or, indeed, machines) - at least for me.

On the whole though, KDE feels like a more modern take on XFCE. With fancier graphics and desktop effects and what appears to be a larger community (measuring such things can be subjective though), I'm glad that I made the switch from XFCE to KDE - even if it was just to fix a bug at first (I would never have considered switching otherwise). As a desktop environment, I think it's comfortable enough that I'll be using KDE on a permanent basis on my travel laptop from now on.

systemquery, part 2: replay attack

Hey there! As promised I'll have my writeup about AAAI-22, but in the meantime I wanted to make a quick post about a replay attack I found in my systemquery encryption protocol, and how I fixed it. I commented quickly about this on the last post in this series, but I thought that it warranted a full blog post.

In this post, I'm going to explain the replay attack I discovered, how replay attacks work in general, and how I fixed it. It should be noted though that at this time my project systemquery is not being used in production (it's still under development), so there is no real-world impact from this particular bug. However, it can still serve as a useful reminder as to why implementing your own crypto / encryption protocols is a really bad idea.

As I explained in the first blog post in this series, the systemquery protocol is based on JSON messages. These messages are not just sent in the clear though (much though that would simplify things!), as I want to ensure they are encrypted with authenticated encryption. To this end, I have devised a 3 layer protocol:

Objects are stringified to JSON, before being encrypted (with a cryptographically secure random IV that's different for every message) and then finally packaged into what I call a framed transport - in essence a 4 byte unsigned integer which represents the length in bytes of the block of data that immediately follows.

The encryption algorithm itself is provided by tweetnacl's secretbox() function, which provides authenticated encryption. It's also been independently audited and has 16 million weekly downloads, so it should be a good choice here.

While this protocol I've devised looks secure at first glance, all is not as it seems. As I alluded to at the beginning of this post, it's vulnerable to a replay attack. This attack is perhaps best explained with the aid of a diagram:

Let's imagine that Alice has an open connection to Bob, and is sending some messages. To simplify things, we will only consider 1 direction - but remember that in reality such a connection is bidirectional.

Now let's assume that there's an attacker with the ability to listen to our connection and insert bogus messages into our message stream. Since the messages are encrypted, our attacker can't read their contents - but they can copy and store messages and then insert them into the message stream at a later date.

When Bob receives a message, they will decrypt it and then parse the JSON message contained within. Should Bob receive a bogus copy of a message that Alice sent earlier, Bob will still be able to decrypt it as a normal message, and won't be able to tell it apart from a genuine message! Should our attacker figure out what a message's function is, they could do all kinds of unpleasant things.

Not to worry though, as there are multiple solutions to this problem:

  1. Include a timestamp in the message, which is then checked later
  2. Add a sequence counter to keep track of the ordering of messages

In my case, I've decided to go with the latter option here, as given that I'm using TCP I can guarantee that the order I receive messages in is the order in which I sent them. Let's take a look at what happens if we implement such a sequence counter:

When sending a message, Alice adds a sequence counter field that increments by 1 for each message sent. At the other end, Bob increments their sequence counter by 1 every time they receive a message. In this way, Bob can detect if our attacker attempts a replay attack, because the sequence number on the message they copied will be out of order.

To ensure there aren't any leaks here, should the sequence counter overflow (unlikely), we need to also re-exchange the session key that's used to encrypt messages. In doing so, we can avoid a situation where the sequence number has rolled over but the session key is the same, which would give an attacker an opportunity to replay a message.

With that, we can prevent replay attacks. The other thing worth mentioning here is that the sequence numbering needs to be done in both directions - so Alice and Bob will have both a read sequence number and a write sequence number which are incremented independently of one another whenever they receive and send a message respectively.

Conclusion

In this post, we've gone on a little bit of a tangent to explore replay attacks and how to mitigate them. In the next post in this series, I'd like to talk about the peer-to-peer swarming algorithm I've devised - both the parts thereof I've implemented, and those that I have yet to implement.

Sources and further reading

systemquery, part 1: encryption protocols

Unfortunately, my autoplant project is taking longer than I anticipated to setup and debug. In the meantime, I'm going to talk about systemquery - another (not so) little project I've been working on in my spare time.

As I've acquired more servers of various kinds (mostly consisting of Raspberry Pis), I've found myself with an increasing need to get a high-level overview of the status of all the servers I manage. At the moment, this need is satisfied by my monitoring system's web-based dashboard, Collectd Graph Panel (sadly now abandonware, but still very useful). The monitoring system itself is collectd - while I haven't blogged about my setup directly, I have posted about it here and here.

This is great and valuable, but if I want to ask questions like "are all apt updates installed?", or "what's the status of this service on all hosts?", or "which hosts haven't I upgraded to Debian bullseye yet?", or "is this mount still working?", I currently have to SSH into every host to find the information I'm looking for.

To solve this problem, I discovered the tool osquery. Osquery is a tool for extracting information from a network of hosts with SQL-like queries. This is just what I'm looking for, but unfortunately it does not support the armv7l architecture - which most of my cluster currently runs on - thereby making it rather useless to me.

Additionally, from looking at the docs it seems to be extremely complicated to setup. Finally, it does not seem to have a web interface. While not essential, that's a nice-to-have.

To this end, I decided to implement my own system inspired by osquery, and I'm calling it systemquery. I have the following goals:

  1. Allow querying all the hosts in the swarm at once
  2. Make it dead-easy to install and use (just like Pepperminty Wiki)
  3. Make it peer-to-peer and decentralised
  4. Make it tolerate random failures of nodes participating in the systemquery swarm
  5. Make it secure, such that any given node must first know a password before it is allowed to join the swarm, and all network traffic is encrypted

As a stretch goal, I'd also like to implement a mesh message routing system too, so that it's easy to connect multiple hosts in different networks and monitor them all at once.

Another stretch goal I want to work towards is implementing a nice web interface that provides an overview of all the hosts in a given swarm.

Encryption Protocols

With all this in mind, the first place to start is to pick a language and platform (Javascript + Node.js) and devise a peer-to-peer protocol by which all the hosts in a given swarm can communicate. My vision here is to encrypt everything using a join secret. Such a secret would lend itself rather well to a symmetrical encryption scheme, as it could act as a pre-shared key.

A number of issues stood in the way of actually implementing this though. At first, I thought it best to use Node.js' built-in TLS-PSK (stands for Transport Layer Security - Pre-Shared Key) implementation. Unlike regular TLS which uses asymmetric cryptography (which works best in client-server situations), TLS-PSK uses a pre-shared key and symmetrical cryptography.

Unfortunately, although Node.js advertises support for TLS-PSK, it isn't actually implemented or is otherwise buggy. This not only leaves me with the issue of designing an encryption protocol, but also:

  1. The problem of transferring binary data
  2. The problem of perfect forward secrecy
  3. The problem of actually encrypting the data

Problem #1 here turned out to be relatively simple. I ended up abstracting away a raw TCP socket into a FramedTransport class, which implements a simple protocol that sends and receives messages in the form <length_in_bytes><data....>, where <length_in_bytes> is a 32-bit unsigned integer.
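
As an illustration of that format, here's a minimal sketch of the sending side - this isn't the actual FramedTransport class, which also has to buffer partial frames on the receiving end, and I'm assuming a big-endian length header here, which isn't specified above:

function write_frame(socket, data) {
    // 4-byte (32-bit unsigned) length header, followed by the data itself
    const header = Buffer.alloc(4);
    header.writeUInt32BE(data.length, 0); // data is assumed to be a Buffer
    socket.write(Buffer.concat([ header, data ]));
}

// Reading works in reverse: accumulate incoming bytes until at least 4 are
// available, read the length, then wait until that many more bytes have
// arrived before emitting a complete message.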

With that sorted and the nasty buffer manipulation safely abstracted away, I could turn my attention to problems 2 and 3. Let's start with problem 3 here. There's a saying when programming things relating to cryptography: never roll your own. Existing implementations have usually been far more rigorously checked for security flaws than anything I could write myself.

In the spirit of this, I sought out an existing implementation of a symmetric encryption algorithm, and found tweetnacl. Security audited, it provides what looks to be a secure symmetric encryption API, which is the perfect foundation upon which to build my encryption protocol. My hope is that by simply exchanging messages encrypted with a secure existing algorithm, I can reduce the risk of a security flaw.
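
For reference, the relevant part of tweetnacl is its secretbox API. The following is a sketch of how it might be used rather than my actual code - in practice the key would be derived from the join secret or a session key, not generated randomly like this:

const nacl = require("tweetnacl");

const key = nacl.randomBytes(nacl.secretbox.keyLength); // 32 bytes

function encrypt(message_uint8) {
    const nonce = nacl.randomBytes(nacl.secretbox.nonceLength);
    const ciphertext = nacl.secretbox(message_uint8, nonce, key);
    return { nonce, ciphertext }; // the nonce isn't secret and travels with the message
}

function decrypt({ nonce, ciphertext }) {
    const plaintext = nacl.secretbox.open(ciphertext, nonce, key);
    if(plaintext === null) throw new Error("Decryption or authentication failed");
    return plaintext;
}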

This is a good start, but there's still the problem of forward secrecy to tackle. To explain: perfect forward secrecy means that even if an attacker has been listening to your conversation and later learns your encryption key (in this case the join secret), they are still unable to decrypt your data.

This is achieved by using session keys and a key exchange algorithm. Instead of encrypting the data with the join secret directly, we use it only to encrypt the initial key-exchange process, which allows the 2 communicating parties to agree on a session key that is then used to encrypt all data from then on. By re-running the key-exchange process and generating new session keys at regular intervals, forward secrecy can be achieved: even if the attacker learns a session key, it does not help them to obtain any other session keys, because even knowledge of the key exchange algorithm's messages is not enough to derive the resulting session key.

Actually implementing this in practice is another question entirely however. I did some research though and located a pre-existing implementation of JPAKE on npm: jpake.

With this in hand, the problem of forward secrecy was solved for now. The jpake package provides a simple API by which a key exchange can be done, so then it was just a case of plugging it into the existing system.
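
Conceptually, the result looks something like the following. The function names here are hypothetical placeholders to show the flow - they are not the jpake package's actual API:

// Periodically re-run the key exchange and swap in the new session key, so
// that learning one session key doesn't help an attacker with any other.
setInterval(async () => {
    const session_key = await do_key_exchange(peer, join_secret); // hypothetical helper
    peer.set_encryption_key(session_key); // subsequent messages use the new key
}, 1000 * 60 * 60); // e.g. rotate every hour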

Where next?

After implementing an encryption protocol as above (please do comment below if you have any suggestions), the next order of business was to implement a peer-to-peer swarm system where agents connect to the network and share peers with one another. I have the basics of this implemented already: I just need to test it a bit more to verify it works as I intend.

It would also be nice to refactor this system into a standalone library for others to use, as it's taken quite a bit of effort to implement. I'll be holding off on doing this until the system is more stable however, as refactoring it now would just slow down development.

On top of this system, the plan is to implement a protocol by which any peer can query any other peer for system information, and then create a command-line interface for easily querying it.

To make querying flexible, I plan on utilising some form of in-memory database that is populated by querying other hosts for the tables mentioned in the user's query. SQLite3 is the obvious choice here, but I'm reluctant to choose it as it requires compilation upon installation - and given that I have experienced issues with this in the past, I feel this has the potential to limit compatibility with some system configurations. I'm going to investigate some other in-memory database libraries for Javascript - giving preference to those which are both light and devoid of complex installation requirements (pure JS is best if I can manage it, I think). If you know of a pure Javascript in-memory database that has a query syntax, do let me know in the comments below!

As for querying system information directly, that's an easy one. I've previously found systeminformation - which seems to have an API to fetch pretty much anything you'd ever want to know about the host system!
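
As a sketch of the sort of thing I mean (based on systeminformation's promise-based API - this isn't actual systemquery code):

const si = require("systeminformation");

async function host_overview() {
    const [ os, mem, fs ] = await Promise.all([
        si.osInfo(),  // distro, release, kernel, architecture, ...
        si.mem(),     // total / free / used memory
        si.fsSize()   // per-filesystem usage - handy for "is this mount still working?"
    ]);
    return { os, mem, fs };
}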

Cluster, Part 12: TLS for Breakfast | Configuring Fabio for HTTPS

Hey there, and happy new year 2022! It's been a little while, but I'm back now with another blog post in my cluster series. In this shorter post, I'm going to show you how I've configured my Fabio load balancer to serve HTTPS.

Before we get started though, I can recommend visiting the series list to check out all the previous parts in this series, as a number of them give useful context for this post.

In the last post, I showed you how to set up certbot / Let's Encrypt in a Docker container. Building on this, we can now reconfigure Fabio (which we set up in part 9) to take in the TLS certificates we are now generating. I'll be assuming that the certificates are stored on the NFS share you've got set up (see part 8) for this post. In the future I'd love to use Hashicorp Vault for storing these certificates, but so far I've found it to be far too complicated to set up, so I'll be using the filesystem instead.

Configuring Fabio to use HTTPS is actually really quite simple. Open /etc/fabio/fabio.properties for editing, and at the beginning insert a line like this:

proxy.cs = cs=some_name_here;type=file;cert=/absolute/path/to/fullchain.pem;key=/absolute/path/to/privkey.pem

cs stands for certificate store, and this tells Fabio about where your certificates are located. some_name_here is a name you'd like to assign to your certificate store - this is used to reference it elsewhere in the configuration file. /absolute/path/to/fullchain.pem and /absolute/path/to/privkey.pem are the absolute paths to the fullchain.pem and privkey.pem files from Let's Encrypt. These can be found in the live directory in the Let's Encrypt configuration directory, in the subdirectory for the domain in question.
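
For example, if the domain were example.com and the Let's Encrypt configuration directory lived at /mnt/shared/letsencrypt on the NFS share (both values here are hypothetical), the finished line might look like this:

proxy.cs = cs=some_name_here;type=file;cert=/mnt/shared/letsencrypt/live/example.com/fullchain.pem;key=/mnt/shared/letsencrypt/live/example.com/privkey.pem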

Now that Fabio knows about your new certificates, find the line that starts with proxy.addr. In the last tutorial, we configured this to have a value of :80;proto=http. proxy.addr can take a comma-separated list of ports to listen on, so append the following to the existing value:

:443;proto=https;cs=some_name_here;tlsmin=tls12

This tells Fabio to listen on TCP port 443 for HTTPS requests, and also tells it which certificate store to use for encryption. We also set the minimum TLS version supported to TLS 1.2 - but you should set this value to 1 version behind the current latest version (check this page for that). For those who want extra security, you can also add the tlsciphers="CIPHER,LIST" argument too (see the official documentation for more information - cross-referencing it with ssl-config.mozilla.org is a good idea).
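
Putting the two together, the full proxy.addr line should end up looking something like this:

proxy.addr = :80;proto=http,:443;proto=https;cs=some_name_here;tlsmin=tls12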

Now that we have this configured, this should be all you need to enable HTTPS! That was easy, right?

We still have a little more work to do though to make HTTPS the default and to redirect all HTTP requests to HTTPS. We can do this by adding a route to the Consul key-value store under the path fabio/config. You can do this either through the web interface - by creating a new key under fabio/config, pasting the following in, and saving it:

route add route_name_here example.com:80 https://example.com$path opts "redirect=308"

Alternatively, through the command line:

consul kv put fabio/config/route_name_here 'route add route_name_here example.com:80 https://example.com$path opts "redirect=308"'
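
To double-check that the route made it into the key-value store, you can read it back like so:

consul kv get fabio/config/route_name_here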

No need to restart Fabio - it should pick up routes automatically. I have found however that I occasionally need to restart it if it doesn't pick up changed routes as quickly as I'd like.

With this, we now have automatic HTTPS setup and configured! Coming up in this series:

  • Using Caddy as an entrypoint for port forwarding on my router (status: implemented; there's an awesome plugin for single sign-on, and it's amazing in other ways too) - this replaces the role HAProxy was going to play that I mentioned in part 11
  • Password protecting Docker, Nomad, and Consul (status: on the todo list)
  • Semi-automatic docker image rebuilding with Laminar CI (status: implemented)
