
The plan to caption and index images

Something that has been on my mind for a while is the photos that I take. At last count on my NAS I have 8564 pictures taken since I first got a phone to take them with, and many more belonging to other family members.

I have blogged before about a script I've written that automatically processes photographs and files them by year and month. It fixes the date taken, sets the thumbnail for rapid preview loading, automatically rotates them to be the right way up, losslessly optimises them, and more.

The one thing it can't do, though, is help me locate a specific photo I'm after. Given my recent work with AI, I have come up with a plan to do something about this, and I want to blog about it here.

By captioning the images with an AI, I plan to index the captions (and other image metadata) and have a web interface in the form of a search engine. In this blog post, I'm going to outline the AI I intend to use, and the architecture of the image search engine I have already made a start on implementing.

AI for image captioning

The core AI to do image captioning will be somewhat based on work I've done for my PhD. The first order of business was finding a dataset to train on, and I stumbled across Microsoft's Common Objects in Context dataset. The next and more interesting part was to devise a model architecture that translates an image into text.

In AI, translating one thing (or state space) into another is generally done with an encoder-decoder architecture. In my case, that's an encoder for the image - to translate it into an embedded feature space - and a decoder to turn that embedded feature space into text.
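To illustrate that encoder-decoder shape, here's a minimal Keras sketch with made-up layer sizes - not the actual ConvNeXt + transformer model discussed below, just the general structure:

import tensorflow as tf

# Made-up sizes for illustration only - not the real model's hyperparameters
vocab_size, max_len, embed_dim = 27_000, 25, 256

# Encoder: stands in for ConvNeXt - anything that maps an image to a feature vector
image_in = tf.keras.Input(shape=(224, 224, 3))
x = tf.keras.layers.Conv2D(64, 3, strides=2, activation="relu")(image_in)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
image_features = tf.keras.layers.Dense(embed_dim)(x)
image_features = tf.keras.layers.Reshape((1, embed_dim))(image_features)

# Decoder: embeds the caption-so-far and attends over the image features
caption_in = tf.keras.Input(shape=(max_len,))
y = tf.keras.layers.Embedding(vocab_size, embed_dim)(caption_in)
y = tf.keras.layers.MultiHeadAttention(num_heads=4, key_dim=64)(query=y, value=image_features)
logits = tf.keras.layers.Dense(vocab_size)(y)  # one prediction per output word position

model = tf.keras.Model([image_in, caption_in], logits)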

There are many options for these - especially for encoding images - which I'll look at first. While doing my PhD, I've come across many different encoders for images, which I'd roughly categorise into 2 main categories:

Since the transformer model was invented, transformer-based encoders have been widely considered the best option. Swin Transformers adapt this groundbreaking design - originally devised for text - to images, and from what I can tell do so better than the earlier Vision Transformer architecture.

On the other side, a number of encoders were invented before transformers were a thing - the most famous of which was ResNet (I think I have the right paper), which was basically just a bunch of CNN layers stacked on top of one another with a few extra bits like normalisation and skip connections.

Recently though, a new CNN-based architecture has emerged that draws inspiration from the strong points of transformers - it's called ConvNeXt. Based on the numbers in the paper, it even beats the Swin Transformer model mentioned earlier. Best of all, it's much simpler in design, which makes it relatively easy to implement. It is this model architecture I will be using.

For the text, things are straightforward in one sense - the model architecture I'll be using is a transformer (of course - I even implemented one myself from scratch!) - but the trouble is representation: particularly the representation of the image caption we want the model to predict.

There are many approaches to this problem, but the one I'm going to try first is a word-based solution using one-hot encoding. There are about 27K different unique words in the dataset, so I've assigned each one a unique number in a dictionary file. Then, I can turn this:

[ "a", "cat", "sat", "on", "a", "mat" ]

....into this:

[ 0, 1, 2, 3, 0, 4 ]

...then, the model would predict something like this:

[
    [ 1, 0, 0, 0, 0, 0, ... ],
    [ 0, 1, 0, 0, 0, 0, ... ],
    [ 0, 0, 1, 0, 0, 0, ... ],
    [ 0, 0, 0, 1, 0, 0, ... ],
    [ 1, 0, 0, 0, 0, 0, ... ],
    [ 0, 0, 0, 0, 1, 0, ... ]
]

...where each sub-array is a word.

This will, as you might suspect, use a lot of memory - especially with 27K words in the dictionary. By my calculations, with a batch size of 64 and a maximum caption length of 25, each output prediction tensor will use a whopping 172.8 MB of memory as float32, or 86.4 MB as float16 (more on memory usage later).
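To make the encoding and the memory arithmetic concrete, here's a quick Python sketch - the 5-word dictionary is of course made up, standing in for the real ~27K-word one:

import numpy as np

# Hypothetical 5-word dictionary standing in for the real ~27K-word one
word_to_id = {"a": 0, "cat": 1, "sat": 2, "on": 3, "mat": 4}
vocab_size = len(word_to_id)

caption = ["a", "cat", "sat", "on", "a", "mat"]
ids = [word_to_id[word] for word in caption]         # [0, 1, 2, 3, 0, 4]
one_hot = np.eye(vocab_size, dtype=np.float32)[ids]  # one row per word

# The memory estimate from above, for the full-size output tensor
batch_size, max_len, real_vocab_size = 64, 25, 27_000
print(batch_size * max_len * real_vocab_size * 4 / 1e6, "MB as float32")  # 172.8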

I'm considering a variety of techniques to combat this if it becomes an issue. For example, reducing the dictionary size by discarding infrequently used words.

Another option would be to have the model predict GloVe vectors as an output and then compare the output to the GloVe dictionary to tell which one to pick. This would come with its own set of problems however, like the large number of calculations needed to compare each output to every word in the dictionary.

My final thought was that I could maybe predict individual characters instead of full words. There would be more items in the sequence predicted, but each character would only have up to 255 choices (probably more like 36-ish), potentially saving memory.

I have already implemented this AI - I just need to debug and train it now. To summarise, here's a diagram:

The last problem with the AI is memory usage. I plan on eventually running the AI on a Raspberry Pi, so much tuning will be required to reduce memory usage and latency as much as I can. In particular, I'll be trying out quantising my model and writing the persistent daemon to use Tensorflow Lite to reduce memory usage. Models train using the float32 data type - which uses 32 bits per value - but quantising the model after training to use float16 (16 bits / value) or even uint8 (8 bits / value) would significantly reduce memory usage.
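As a sketch of what that post-training quantisation step might look like with the TensorFlow Lite converter (the paths here are placeholders, and uint8 quantisation would additionally need a representative dataset):

import tensorflow as tf

# Assumes the trained Keras model was saved with model.save("caption_model");
# both paths here are placeholders.
converter = tf.lite.TFLiteConverter.from_saved_model("caption_model")

# Post-training float16 quantisation: weights are stored as 16-bit floats
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]

tflite_model = converter.convert()
with open("caption_model_fp16.tflite", "wb") as handle:
    handle.write(tflite_model)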

Search engine and indexing

The second part of this is the search engine. The idea here is to index all my photos ahead of time, and then have a web interface I can use to search and filter them. The architecture I plan on using to achieve this is rather complicated, and best explained with a diagram:

The backend I have chosen for the index is called meilisearch. It's written in Rust, and provides advanced as-you-type search functionality. I chose it for 2 reasons:

  1. While I'd love to implement my own, meilisearch is an open source project whose developers have put more hours into making it good than I ever would be able to
  2. Being a separate daemon means I can schedule it on my cluster as a separate task, which potentially might end up on a different machine

With this in mind, the search engine has 2 key parts to it: the crawler / indexer, and the HTTP server that serves the web interface. The web interface will talk to meilisearch to perform searches (not directly; requests will be proxied and transformed).

The crawler will periodically scan the disk for new, updated, and deleted files, and pass them on to the indexer queue. The indexer will do 4 things:

  1. Caption the image, by talking to a persistent Python child process via Inter Process Communication (IPC) - captions will be written as EXIF data to images
  2. Thumbnail images and store them in a cache (perhaps some kinda key-value store, as lots of little files on disk would be a disaster for disk space)
  3. Extract EXIF (and other) metadata
  4. Finally, push the metadata to meilisearch for indexing

Tasks 2 and 3 can be done in parallel, but the others will need to be done serially - though multiple images can of course be processed concurrently. I anticipate much asynchronous code here, which I'm rather looking forward to finishing writing :D
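To make step 4 above concrete, here's a rough sketch of talking to meilisearch's HTTP API from Python - the index name and document fields are made up, and authentication is ignored:

import requests

MEILI = "http://localhost:7700"  # wherever the meilisearch daemon ends up scheduled
INDEX = "photos"                 # hypothetical index name

# Step 4: push one image's caption + metadata to meilisearch for indexing
document = {
    "id": "2021-08-17-img-1234",
    "caption": "a cat sat on a mat",
    "date_taken": "2021-08-17T02:00:01Z",
    "camera_model": "Some Phone",
}
requests.post(f"{MEILI}/indexes/{INDEX}/documents", json=[document])

# The web interface then proxies searches through to something like this
results = requests.post(f"{MEILI}/indexes/{INDEX}/search", json={"q": "cat"}).json()
print(results["hits"])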

I already have a good start on the foundation of the search engine here. Once I've implemented enough that it's functional, I'll open source everything.

To finish this post, I have a mockup screenshot of what the main search page might look like:

Obviously the images are all placeholders for now (append ?help to this URL to see the help page) and I don't yet have a name for it (suggestions in the comments are most welcome!), but the rough idea is there.

systemquery, part 1: encryption protocols

Unfortunately, my autoplant project is taking longer than I anticipated to set up and debug. In the meantime, I'm going to talk about systemquery - another (not so) little project I've been working on in my spare time.

As I've acquired more servers of various kinds (mostly consisting of Raspberry Pis), I've found myself with an increasing need to get a high-level overview of the status of all the servers I manage. At the moment, this need is satisfied by the web-based dashboard of my monitoring system, collectd (while I haven't blogged about my setup directly, I have posted about it here and here), called Collectd Graph Panel (sadly now abandonware, but still very useful):

This is great and valuable, but if I want to ask questions like "are all apt updates installed?", or "what's the status of this service on all hosts?", or "which hosts haven't I upgraded to Debian bullseye yet?", or "is this mount still working?", I currently have to SSH into every host to find the information I'm looking for.

To solve this problem, I discovered the tool osquery. Osquery is a tool to extract information from a network of hosts with SQL-like queries. This is just what I'm looking for, but unfortunately it does not support the armv7l architecture - which most of my cluster currently runs on - thereby making it rather useless to me.

Additionally, from looking at the docs it seems to be extremely complicated to set up. Finally, it does not seem to have a web interface. While not essential, that's a nice-to-have.

To this end, I decided to implement my own system inspired by osquery, and I'm calling it systemquery. I have the following goals:

  1. Allow querying all the hosts in the swarm at once
  2. Make it dead-easy to install and use (just like Pepperminty Wiki)
  3. Make it peer-to-peer and decentralised
  4. Make it tolerate random failures of nodes participating in the systemquery swarm
  5. Make it secure, such that any given node must first know a password before it is allowed to join the swarm, and all network traffic is encrypted

As a stretch goal, I'd also like to implement a mesh message routing system too, so that it's easy to connect multiple hosts in different networks and monitor them all at once.

Another stretch goal I want to work towards is implementing a nice web interface that provides an overview of all the hosts in a given swarm.

Encryption Protocols

With all this in mind, the first place to start is to pick a language and platform (Javascript + Node.js) and devise a peer-to-peer protocol by which all the hosts in a given swarm can communicate. My vision here is to encrypt everything using a join secret. Such a secret would lend itself rather well to a symmetrical encryption scheme, as it could act as a pre-shared key.

A number of issues stood in the way of actually implementing this though. At first, I thought it best to use Node.js' built-in TLS-PSK (stands for Transport Layer Security - Pre-Shared Key) implementation. Unlike regular TLS which uses asymmetric cryptography (which works best in client-server situations), TLS-PSK uses a pre-shared key and symmetrical cryptography.

Unfortunately, although Node.js advertises support for TLS-PSK, it isn't actually implemented or is otherwise buggy. This not only leaves me with the issue of designing an encryption protocol, but also:

  1. The problem of transferring binary data
  2. The problem of perfect forward secrecy
  3. The problem of actually encrypting the data

Problem #1 here turned out to be relatively simple. I ended up abstracting away a raw TCP socket into a FramedTransport class, which implements a simple protocol that sends and receives messages in the form <length_in_bytes><data....>, where <length_in_bytes> is a 32 bit unsigned integer.
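The actual FramedTransport class is written in Javascript, but the framing format itself is simple enough to sketch in a few lines of Python (the big-endian byte order here is my assumption - pick one and stick to it on both ends):

import struct

def frame(message: bytes) -> bytes:
    # <length_in_bytes><data....> - the length as a 32-bit unsigned integer
    return struct.pack(">I", len(message)) + message

def read_frame(stream) -> bytes:
    # stream is anything with a read(n) method, e.g. a socket wrapped with makefile("rb")
    (length,) = struct.unpack(">I", stream.read(4))
    return stream.read(length)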

With that sorted and the nasty buffer manipulation safely abstracted away, I could turn my attention to problems 2 and 3. Let's start with problem 3 here. There's a saying when programming things relating to cryptography: never roll your own. Existing implementations are often much more rigorously checked for security flaws than anything I could write myself.

In the spirit of this, I sought out an existing implementation of a symmetric encryption algorithm, and found tweetnacl. It has been security audited and provides what looks to be a secure symmetric encryption API, which is the perfect foundation upon which to build my encryption protocol. My hope is that by simply exchanging messages encrypted with a secure existing algorithm, I can reduce the risk of a security flaw.
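The project uses tweetnacl from Javascript; purely to illustrate the same secretbox primitive, here's what it looks like via PyNaCl (the key-derivation step and its parameters are an illustrative choice, not part of systemquery):

import nacl.pwhash
import nacl.secret
import nacl.utils

# Derive a 32-byte key from the join secret (KDF choice and salt handling are illustrative)
join_secret = b"the swarm's join secret"
salt = nacl.utils.random(nacl.pwhash.argon2id.SALTBYTES)
key = nacl.pwhash.argon2id.kdf(nacl.secret.SecretBox.KEY_SIZE, join_secret, salt)

box = nacl.secret.SecretBox(key)
ciphertext = box.encrypt(b"hello, swarm")  # a random nonce is generated and prepended
assert box.decrypt(ciphertext) == b"hello, swarm"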

This is a good start, but there's still the problem of forward secrecy to tackle. To explain, perfect forward secrecy means that should an attacker record your conversation and later learn your encryption key (in this case the join secret), they are still unable to decrypt your data.

This is achieved by using session keys and a key exchange algorithm. Instead of encrypting the data with the join secret directly, we use it only to encrypt the initial key-exchange process, which allows the 2 communicating parties to agree on a session key that is used to encrypt all data from then on. By re-running the key-exchange process and generating new session keys at regular intervals, forward secrecy can be achieved: even if the attacker learns a session key, it does not help them to obtain any other session keys, because even knowledge of the key-exchange messages is not enough to derive the resulting session key.
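systemquery uses JPAKE for the key exchange (more on that below), but the session-key idea itself can be sketched with a Diffie-Hellman-style X25519 exchange - shown here with PyNaCl purely as an illustration, not as the actual protocol:

from nacl.public import Box, PrivateKey

# Each peer generates a fresh, throwaway keypair for the session
ours = PrivateKey.generate()
theirs = PrivateKey.generate()  # in reality this public key arrives over the (PSK-encrypted) wire

# Both sides derive the same shared secret and use it for session traffic
our_session = Box(ours, theirs.public_key)
their_session = Box(theirs, ours.public_key)

ciphertext = our_session.encrypt(b"some session data")
assert their_session.decrypt(ciphertext) == b"some session data"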

Actually implementing this in practice is another question entirely, however. I did some research and located a pre-existing implementation of JPAKE on npm: jpake.

With this in hand, the problem of forward secrecy was solved for now. The jpake package provides a simple API by which a key exchange can be done, so then it was just a case of plugging it into the existing system.

Where next?

After implementing an encryption protocol as above (please do comment below if you have any suggestions), the next order of business was to implement a peer-to-peer swarm system where agents connect to the network and share peers with one another. I have the basics of this implemented already: I just need to test it a bit more to verify it works as I intend.

It would also be nice to refactor this system into a standalone library for others to use, as it's taken quite a bit of effort to implement. I'll be holding off on doing this until the system is more stable, however, as refactoring it now would just slow down development while everything is still in flux.

On top of this system, the plan is to implement a protocol by which any peer can query any other peer for system information, and then create a command-line interface for easily querying it.

To make querying flexible, I plan on utilising some form of in-memory database that is populated with queries to other hosts based on the tables mentioned in the user's query. SQLite3 is the obvious choice here, but I'm reluctant to choose it as it requires compilation upon installation - and given that I have experienced issues with this in the past, I feel it has the potential to limit compatibility with some system configurations. I'm going to investigate some other in-memory database libraries for Javascript - giving preference to those which are both light and devoid of complex installation requirements (pure JS is best if I can manage it, I think). If you know of a pure Javascript in-memory database that has a query syntax, do let me know in the comments below!

As for querying system information directly, that's an easy one. I've previously found systeminformation - which seems to have an API to fetch pretty much anything you'd ever want to know about the host system!

Add your blog to hullblogs.com

For many years, csblogs.com has aggregated posts from the blogs of current and past students of the University of Hull. Unfortunately, recently the site has started to generate random error messages. To fix this, Freeside (of which I'm the head, as it turns out) have decided to implement a replacement.

It looks rather cool if I do say so myself, so I thought I'd share it here. I'll talk first about how you can add your blog to it and why it's a good idea to start one yourself (and how you can do so for free!). Then, I'll talk a little bit at the end about how I built hullblogs.com and how it's put together.

@closebracket had the idea to call it hullblogs.com, and then I implemented the code for it:

(Above: A screenshot of hullblogs.com)

You can visit it here: hullblogs.com.

If you're a current or past student at the University of Hull, then we want to hear from you! If you've got a blog (and even if you haven't yet!), then you can add your blog by going to hullblogs.com, and then clicking "add your blog".

If you don't yet have a blog, then we have you covered. Our guide has a number of really easy ways to get started and host a blog for free! There are multiple ways to host a blog without paying a penny (and no hidden charges after some time either).

But I don't have a blog!

If you're not convinced about setting up a blog and putting it on the Internet, there are a number of key reasons you should:

  • It's not about other people reading it. By setting up a blog, you don't need to have 100s of views. What matters is that you write for you - not for anyone else.
  • If only 1 person reads your blog, then it's worth it. Your personal blog and/or website makes a great addition to your CV. It's a wonderful way of documenting the things you're doing and learning while doing your degree and beyond - and a great way for potential employers to find more detail on all this information in 1 place.
  • Write for yourself. Personally, I find my blog a great place to post about difficult problems that I've encountered and solved. Then when I encounter a similar problem again, I can re-read my own blog post about it. Because I'm the one who wrote the post, I can anticipate what I may find confusing and explain that in more detail to help out future me.
  • Improve your technical writing skills. Technical writing - the process of writing about complex technical subjects (whether in Computer Science or beyond) - is an invaluable skill. Writing documentation, communicating complex ideas, and helping others are all situations which call for technical writing - and it's a great thing to put on your CV too. Just doing a thing isn't enough - being able to write about it and document it is really important when working on commercial (or even academic) projects.

If you're not computer science oriented, then Wordpress, Blogger, or maybe Squarespace (though I'm not sure whether Squarespace has a free tier) would be a great place to start your blogging journey. You can host a blog for free too!

If you are computer science oriented (I'd guess most of the people reading this probably are given the kinds of posts I write for my blog here), then I strongly recommend Eleventy. It's a static site generator, and there are a whole bunch of ready-made templates for you to use to get started quickly.

In fact, hullblogs.com itself is built with Eleventy, using feedme to parse RSS/Atom feeds.

Every night, the site is rebuilt. During this process, it downloads the RSS/Atom feeds from all the registered blogs and orders the posts chronologically. If an image is associated with a post, this is also downloaded - the site also looks for the first image in a post's content if one isn't explicitly attached to the post. Once ordered, the posts are paginated (split into multiple pages) and the rest of the site is rendered.
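hullblogs.com does this with Eleventy and feedme in Javascript; just to illustrate the aggregation step, a rough Python sketch using feedparser (the feed URL is a placeholder) might look like this:

import feedparser

# Placeholder list - the real one comes from the registered blogs
feed_urls = ["https://example.com/blog/atom.xml"]

posts = []
for url in feed_urls:
    feed = feedparser.parse(url)
    for entry in feed.entries:
        posts.append({
            "blog": feed.feed.get("title", url),
            "title": entry.get("title", "Untitled"),
            "link": entry.get("link"),
            "published": entry.get("published_parsed"),
        })

# Skip entries without a parseable date, then order newest first ready for pagination
posts = [post for post in posts if post["published"]]
posts.sort(key=lambda post: post["published"], reverse=True)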

Being a static site, once the (re)build is complete hosting the site is simple and secure. Currently, Freeside hosts it using Nginx to statically serve it.

If your feed is malformed/contains errors or your blog is offline don't worry! hullblogs.com will transparently ignore your feed when it rebuilds. Then, once your blog is back up and running, it will add it back into the main hullblogs.com site the next night when it rebuilds again.

If you're interested in digging into the source code of the site, I've open sourced it on GitHub under the Apache 2.0 Licence: https://github.com/FreesideHull/hullblogs.com

If you've read this far, thanks so much for reading! We're after a better favicon logo for hullblogs.com. If you've got an idea, please do let us know by opening an issue.

Unethically disclosed vulnerabilities in Pepperminty Wiki: My perspective

Recently, I've made a new release of my PHP-based wiki engine Pepperminty Wiki - v0.23. This would not normally be notable, but as it turns out there were a number of security issues (the severity of which varies) that needed fixing. I fixed them of course, but the manner in which they were disclosed to me was less than ethical.

In this post, I want to explain what happened from my perspective, and why I'm rather frustrated with the way the reporter handled things.

It all started with issue #222 that was opened by @hmaverickadams. In that issue, they say that they have discovered a number of vulnerabilities:

Hi,

I am a penetration tester and discovered a couple of vulnerabilities within your application. I will be applying for CVE status on the findings, but would like to work with you on the issues if possible. I could not locate an email, so please feel free to shoot me your contact info if possible.

Thank you!

So far, so good! Seems responsible, right? It did to me too. For reference, CVE there refers to the Common Vulnerabilities and Exposures, a website that tracks vulnerabilities in software from across the globe.

I would have left it at that, but I decided to check out the GitHub projects that @hmaverickadams (henceforth "the reporter") had. To my surprise, I found these public GitHub repositories:

These appeared to be created just 1 day after issue #222 was opened against Pepperminty Wiki. I was on holiday at the time (3 weeks; and I haven't been checking my GitHub notifications as often as I perhaps should), so it took me 22 days to get to it. Generally speaking, I would consider a minimum of 90 days with no response to be appropriate before publishing a vulnerability publicly like that - this is the core of the matter, but more on this later. Here are links to these vulnerabilities on the CVE website:

You may also ask yourself "what were the vulnerabilities in question in the first place?" - glad you asked! Let's take a look.

CVE-2021-38600

Described officially as "a stored Cross Site Scripting (XSS) vulnerability", this essentially means that you can convince Pepperminty Wiki to store some arbitrary HTML (which may contain a malicious script, for example) and later serve it to some poor unsuspecting visitors.

In this particular vulnerability, the reporter found that if you fill out the initial setup web form - the one that appears the first time you load Pepperminty Wiki - with a wiki name that contains arbitrary HTML, Pepperminty Wiki will blindly serve this to users.

It sounds like a big issue, but once you realise that filling out the first-run web form requires the site secret - which is generated randomly and stored in peppermint.json, a file that has a check to ensure it can't be loaded through the web server - it becomes clear that this isn't actually a big deal. In fact, Pepperminty Wiki has a number of settings that by design allow one to serve arbitrary HTML:

  • editing_message - a message that appears below the page editing form and before the submit button
  • admindisplaychar - inserts text (or arbitrary HTML) before the name of an administrator
  • footer_message - a message (that may contain arbitrary HTML) that is displayed at the bottom of every page

All of these can be modified either by a moderator in the site settings page, or through peppermint.json directly.

...so personally I don't class this as a vulnerability. Regardless, I've fixed this by running the wiki name through htmlentities() - though in doing so I speculate that some special characters (e.g. quotes) will no longer display properly because of how I fixed CVE-2021-38601 (see below) - I'll continue working on this.
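For reference, the fix itself is PHP's htmlentities(); as a general illustration of the escaping principle (not Pepperminty Wiki's actual code), the same idea looks like this in Python:

import html

wiki_name = '<script>alert(1)</script> My Wiki'
safe_name = html.escape(wiki_name, quote=True)
# '&lt;script&gt;alert(1)&lt;/script&gt; My Wiki' - rendered as literal text, not executed
print(safe_name)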

CVE-2021-38601

This vulnerability is described as "a reflected Cross Site Scripting (XSS) vulnerability". This is similar to CVE-2021-38600, but instead of storing a value the attack makes use of various GET parameters. There were (since I've now fixed it) a number of GET parameters that caused this issue, including action (which sets the subcommand/action that should be taken - e.g. view, edit, etc.) and page (the current page on the wiki you're looking at).

Unlike CVE-2021-38600, this is a more serious vulnerability. Someone could generate a malicious link to a Pepperminty Wiki instance that redirects you to an attacker-controlled website (i.e. by using location.href = "blah").

Fixing it though required me to do a comprehensive review of every single line of Pepperminty Wiki's codebase, which took me multiple hours of intense programming and was really rather unpleasant. The description by the reporter in the repo was quite unhelpful:

A reflected Cross Site Scripting (XSS) vulnerability exists on multiple parameters in version 0.23-dev of the Pepperminty-Wiki application that allows for arbitrary execution of JavaScript commands.

Affected page: http://localhost/index.php

Sample payload: https://localhost/index.php?action=<script>alert(1)</script>

CVE: https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-38601

I discovered in my comprehensive review that action and page were not the only parameters affected - I fixed every instance I could find.

Reaching out

To try and understand their side of the story and gain additional insight into the vulnerabilities they discovered, I attempted to reach out to them. First, I tried opening an issue on the above GitHub repositories they created.

Instead of replying to the issues though, the reporter deleted the issues I opened, and set it so that nobody could open issues on the repositories anymore! As of the time of typing I still do not have a response from the reporter about this.

Not to be deterred, I found a pair of twitter accounts they controlled and tweeted at them:

As you can probably guess, I haven't had a response yet (though I will of course update this blog post if I do). To make absolutely sure that I got through, I also filled out the contact form on their website - also to no avail so far.

With all this in mind, I get the impression the reporter does not want to talk to me - especially since they deleted the issues I opened against their repositories instead of replying to them. This is frustrating, because I was put in a really awkward position of having to deal with a zero day vulnerability as fast as I could after they publicly disclosed the vulnerabilities (worse still, I could tell that those repositories had some significant traffic since they have been starred by 7 + 4 people as of the time of typing).

Can't find an email address?

After this (and in between comprehensively reviewing Pepperminty Wiki's codebase), I also re-read the initial issue. When re-reading it, a particular sentence also struck me as odd:

I could not locate an email, so please feel free to shoot me your contact info if possible.

This is very strange, since I have the following paragraph in Pepperminty Wiki's README:

If you've found a security issue, please don't open an issue. Instead, get in touch privately - e.g. via Keybase or by email (security [at sign] starbeamrainbowlabs [replace me with a dot] com), and I'll try to respond ASAP.

I also have my website and email address on my GitHub profile, and my website lists:

  • My email address
  • My Keybase details
  • My Twitter account
  • My Stack Exchange account
  • My reddit account
  • My GPG/PGP key id

I don't have my Discord account on there, but I can chat over that too after first using one of the above.

With this in mind, I found it to be very strange that the reporter was unable to find a means of contact to use to responsibly disclose the vulnerabilities.

CVE confusion

Now that I've fixed the vulnerabilities, I'm somewhat confused about how I update the pair of CVEs. This website gives the following instructions:

  1. Identify the CNA that published the CVE Record by searching for the CVE Record on the CVE List.
  2. Locate the responsible CNA in the “Assigning CNA” field of the CVE Record.
  3. Contact the CNA using their preferred contact method to request the update.

In my case, the assigning CNA is stated as "N/A" - I assume it's the unresponsive reporter above. I'm confused here, then, about how I'm supposed to update the CVEs, since I can't contact the original reporter.

If anyone can suggest a way in which I can update these CVEs to reflect what has happened and the fact that I've fixed them, I'd love to know - I would hate to leave those CVEs outdated as they may misinform someone. Contact details on my website homepage. You can also leave a comment on this blog post.

Conclusion

I'm upset not because the reporter found a vulnerability - it's great they even took the time to look at my little small-time project in the first place! I'm upset because they failed to properly disclose the vulnerabilities by privately contacting me first. Had they done so, they would have discovered that CVE-2021-38600 is not really a vulnerability at all.

I'm also upset because despite the effort I've gone to in order to reach out, instead of opening a civil and polite discussion about the whole issue I've instead been met with doors slammed in my face (i.e. issues deleted instead of being replied to).

I wanted to document my experiences here also to educate others about the ethical disclosure of vulnerabilities / security issues. Ethics, justice, and honesty are really important to me - and I'd like to try and avoid any accidental disclosures if at all possible.

If you find a vulnerability in someone's code (be it open or closed source), I would advise you to:

  1. Go in search of a security vulnerability disclosure policy (or, for open-source projects, search the README / find the contact details for the maintainers)
  2. Contact the authors of the code (or company, if commercial) to organise responsible disclosure
  3. When a patch has been written and tested, co-ordinate the release of the patch and the disclosure of the vulnerability

Please do not release the vulnerability publicly without first contacting the author (I suggest waiting 60 days and trying multiple methods of communication). This causes maintainers of projects (who in the case of open source are mostly volunteers who pour their time into projects without asking for anything in return) a lot of stress and anxiety, as I've discovered during this incident.

Given that I haven't experienced anything like this before and that I'm only human I'm sure that my response to this incident could use some work - but the manner in which these vulnerabilities were disclosed could use a lot of work too.

NAS Backups, Part 2: Btrfs send / receive

Hey there! In the first post of this series, I talked about my plan for a backup NAS to complement my main NAS. In this part, I'm going to show the pair of scripts I've developed to take care of backing up btrfs snapshots.

The first script is called snapshot-send.sh, and it:

  1. Calculates which snapshot it is that requires sending
  2. Uses SSH to remote into the backup NAS
  3. Pipes the output of btrfs send to snapshot-receive.sh on the backup NAS that is called with sudo

Note that while sudo is used for calling snapshot-receive.sh, the account used to SSH into the backup NAS doesn't have completely unrestricted sudo access. Instead, a sudo rule is used to restrict it to allow only specific commands to be called (without a password, as this is intended to be a completely automated and unattended system).

The second script is called snapshot-receive.sh, and it receives the output of btrfs send and pipes it to btrfs receive. It also has some extra logic to delete old snapshots and stuff like that.

Both of these are designed to be command line programs in their own right with a simple CLI, and useful error / help messages to assist in understanding it when I come back to it to fix an issue or extend it after many months.

snapshot-send.sh

As described above, snapshot-send.sh sends btrfs snapshots to a remote host via SSH and the snapshot-receive.sh script.

Before we continue and look at it in detail, it is important to note that snapshot-send.sh depends on btrfs-snapshot-rotation. If you haven't already done so, you should set that up first before setting up my scripts here.

If you have btrfs-snapshot-rotation setup correctly, you should have something like this in your crontab:

# Btrfs automatic snapshots
0 * * * *       cronic /root/btrfs-snapshot-rotation/btrfs-snapshot /mnt/some_btrfs_filesystem/main /mnt/some_btrfs_filesystem/main/.snapshots hourly 8
0 2 * * *       cronic /root/btrfs-snapshot-rotation/btrfs-snapshot /mnt/some_btrfs_filesystem/main /mnt/some_btrfs_filesystem/main/.snapshots daily 4
0 2 * * 7       cronic /root/btrfs-snapshot-rotation/btrfs-snapshot /mnt/some_btrfs_filesystem/main /mnt/some_btrfs_filesystem/main/.snapshots weekly 4

I use cronic there to reduce unnecessary emails. I also have a subvolume there for the snapshots:

sudo btrfs subvolume create /mnt/some_btrfs_filesystem/main/.snapshots

Because Btrfs does not take a snapshot of any child subvolumes when it takes a snapshot, I can use this to keep all my snapshots organised and associated with the subvolume they are snapshots of.

If done right, ls /mnt/some_btrfs_filesystem/main/.snapshots should result in something like this:

2021-07-25T02:00:01+00:00-@weekly  2021-08-17T07:00:01+00:00-@hourly
2021-08-01T02:00:01+00:00-@weekly  2021-08-17T08:00:01+00:00-@hourly
2021-08-08T02:00:01+00:00-@weekly  2021-08-17T09:00:01+00:00-@hourly
2021-08-14T02:00:01+00:00-@daily   2021-08-17T10:00:01+00:00-@hourly
2021-08-15T02:00:01+00:00-@daily   2021-08-17T11:00:01+00:00-@hourly
2021-08-15T02:00:01+00:00-@weekly  2021-08-17T12:00:01+00:00-@hourly
2021-08-16T02:00:01+00:00-@daily   2021-08-17T13:00:01+00:00-@hourly
2021-08-17T02:00:01+00:00-@daily   last_sent_@hourly.txt
2021-08-17T06:00:01+00:00-@hourly

Ignore the last_sent_@hourly.txt file there for now - it's created by snapshot-send.sh so that it can remember the name of the snapshot it last sent. We'll talk about it later.

With that out of the way, let's start going through snapshot-send.sh! First up is the CLI and associated error handling:

#!/usr/bin/env bash
set -e;

dir_source="${1}";
tag_source="${2}";
tag_dest="${3}";
loc_ssh_key="${4}";
remote_host="${5}";

if [[ -z "${remote_host}" ]]; then
    echo "This script sends btrfs snapshots to a remote host via SSH.
The script snapshot-receive must be present on the remote host in the PATH for this to work.
It pairs well with btrfs-snapshot-rotation: https://github.com/mmehnert/btrfs-snapshot-rotation
Usage:
    snapshot-send.sh <snapshot_dir> <source_tag_name> <dest_tag_name> <ssh_key> <user@host>

Where:
    <snapshot_dir> is the path to the directory containing the snapshots
    <source_tag_name> is the tag name to look for (see btrfs-snapshot-rotation).
    <dest_tag_name> is the tag name to use when sending to the remote. This must be unique across all snapshot rotations sent.
    <ssh_key> is the path to the ssh private key
    <user@host> is the user@host to connect to via SSH" >&2;
    exit 0;
fi

# $EUID = effective uid
if [[ "${EUID}" -ne 0 ]]; then
    echo "Error: This script must be run as root (currently running as effective uid ${EUID})" >&2;
    exit 5;
fi

if [[ ! -e "${loc_ssh_key}" ]]; then
    echo "Error: When looking for the ssh key, no file was found at '${loc_ssh_key}' (have you checked the spelling and file permissions?)." >&2;
    exit 1;
fi
if [[ ! -d "${dir_source}" ]]; then
    echo "Error: No source directory located at '${dir_source}' (have you checked the spelling and permissions?)" >&2;
    exit 2;
fi

###############################################################################

Pretty simple stuff. snapshot-send.sh is called like so:

snapshot-send.sh /absolute/path/to/snapshot_dir SOURCE_TAG DEST_TAG_NAME path/to/ssh_key backups@backup_nas

A few things to unpack here.

  • /absolute/path/to/snapshot_dir is the path to the directory (i.e. btrfs subvolume) containing the snapshots we want to read, as described above.
  • SOURCE_TAG: Given the directory (subvolume) name of a snapshot (e.g. 2021-08-17T02:00:01+00:00-@daily), the source tag is the bit at the end after the at sign @ - e.g. daily.
  • DEST_TAG_NAME: The tag name to give the snapshot on the backup NAS. Useful, because you might have multiple subvolumes you snapshot with btrfs-snapshot-rotation and they all might have snapshots with the daily tag.
  • path/to/ssh_key: The path to the (unencrypted!) SSH key to use to SSH into the remote backup NAS.
  • backups@backup_nas: The user and hostname of the backup NAS to SSH into.

This is a good time to sort out the remote user we're going to SSH into (we'll sort out snapshot-receive.sh and the sudo rules in the next section below).

Assuming that you already have a Btrfs filesystem setup and automounting on boot on the remote NAS, do this:


sudo useradd --system --home /absolute/path/to/btrfs-filesystem/backups backups
sudo groupadd backup-senders
sudo usermod -a -G backup-senders backups
cd /absolute/path/to/btrfs-filesystem/backups
sudo mkdir .ssh
sudo touch .ssh/authorized_keys
sudo chown -R backups:backups .ssh
sudo chmod -R u=rwX,g=rX,o-rwx .ssh

Then, on the main NAS, generate the SSH key:


mkdir -p /root/backups && cd /root/backups
ssh-keygen -t ed25519 -C backups@main-nas -f /root/backups/ssh_key_backup_nas_ed25519

Then, copy the generated SSH public key to the authorized_keys file on the backup NAS (located at /absolute/path/to/btrfs-filesystem/backups/.ssh/authorized_keys).

Now that's sorted, let's continue with snapshot-send.sh. Next up are a few miscellaneous functions:


# The filepath to the last sent text file that contains the name of the snapshot that was last sent to the remote.
# If this file doesn't exist, then we send a full snapshot to start with.
# We need to keep track of this because we need this information to know which
# snapshot we need to parent the latest snapshot from to send snapshots incrementally.
filepath_last_sent="${dir_source}/last_sent_@${tag_source}.txt";

## Logs a message to stdout.
# $*    The message to log.
log_msg() {
    echo "[ $(date +%Y-%m-%dT%H:%M:%S) ] remote/${HOSTNAME}: >>> ${*}";
}

## Lists all the currently available snapshots for the current source tag.
list_snapshots() {
    find "${dir_source}" -maxdepth 1 ! -path "${dir_source}" -name "*@${tag_source}" -type d;
}

## Returns an exit code of 0 if we've sent a snapshot, or 1 if we haven't.
have_sent() {
    if [[ ! -f "${filepath_last_sent}" ]]; then
        return 1;
    else
        return 0;
    fi
}

## Fetches the directory name of the last snapshot sent to the remote with the given tag name.
last_sent() {
    if [[ -f "${filepath_last_sent}" ]]; then
        cat "${filepath_last_sent}";
    fi
}

# Runs snapshot-receive on the remote host.
do_ssh() {
    ssh -o "ServerAliveInterval=900" -i "${loc_ssh_key}" "${remote_host}" sudo snapshot-receive "${tag_dest}";
}

Particularly of note is the filepath_last_sent variable - this is set to the path to that text file I mentioned earlier.

Other than that it's all pretty well commented, so let's continue on. Next, we need to determine the name of the latest snapshot:

latest_snapshot="$(list_snapshots | sort | tail -n1)";
latest_snapshot_name="$(basename "${latest_snapshot}")";

With this information in hand we can compare it to the last snapshot name we sent. We store this in the text file mentioned above - the path to which is stored in the filepath_last_sent variable.

if [[ "$(dirname "${latest_snapshot_dirname}")" == "$(cat "${filepath_last_sent}")" ]]; then
    if [[ -z "${FORCE_SEND}" ]]; then
        echo "We've sent the latest snapshot '${latest_snapshot_dirname}' already and the FORCE_SEND environment variable is empty or not specified, skipping";
        exit 0;
    else
        echo "We've sent it already, but sending it again since the FORCE_SEND environment variable is specified";
    fi
fi

If the latest snapshot has the same name as the one we last sent, we exit out - unless the FORCE_SEND environment variable is specified (to allow for an easy way to fix stuff if it goes wrong on the other end).

Now, we can actually send the snapshot to the remote:


if ! have_sent; then
    log_msg "Sending initial snapshot $(dirname "${latest_snapshot}")";
    btrfs send "${latest_snapshot}" | do_ssh;
else
    parent_snapshot="${dir_source}/$(last_sent)";
    if [[ ! -d "${parent_snapshot}" ]]; then
        echo "Error: Failed to locate parent snapshot at '${parent_snapshot}'" >&2;
        exit 3;
    fi

    log_msg "Sending incremental snapshot $(dirname "${latest_snapshot}") parent $(last_sent)";
    btrfs send -p "${parent_snapshot}" "${latest_snapshot}" | do_ssh;
fi

have_sent simply determines if we have previously sent a snapshot before. We know this by checking the filepath_last_sent text file.

If we haven't, then we send a full snapshot rather than an incremental one. If we're sending an incremental one, then we find the parent snapshot (i.e. the one we last sent). If we can't find it, we generate an error (it's because of this that you need to store at least 2 snapshots at a time with btrfs-snapshot-rotation).

After sending a snapshot, we need to update the filepath_last_sent text file:

log_msg "Updating state information";
basename "${latest_snapshot}" >"${filepath_last_sent}";
log_msg "Snapshot sent successfully";

....and that concludes snapshot-send.sh! Once you've finished reading this blog post and testing your setup, put your snapshot-send.sh calls in a script in /etc/cron.daily or something.

snapshot-receive.sh

Next up is the receiving end of the system. The CLI for this script is much simpler, on account of sudo rules only allowing exact and specific commands (no wildcards or regex of any kind). I put snapshot-receive.sh in /usr/local/sbin and called it snapshot-receive.

Let's get started:

#!/usr/bin/env bash

# This script wraps btrfs receive so that it can be called by non-root users.
# It should be saved to '/usr/local/sbin/snapshot-receive' (without quotes, of course).
# The following entry needs to be put in the sudoers file:
# 
# %backup-senders   ALL=(ALL) NOPASSWD: /usr/local/sbin/snapshot-receive TAG_NAME
# 
# ....replacing TAG_NAME with the name of tag you want to allow. You'll need 1 line in your sudoers file per tag you want to allow.
# Edit your sudoers file like this:
# sudo visudo

# The ABSOLUTE path to the target directory to receive to.
target_dir="CHANGE_ME";

# The maximum number of backups to keep.
max_backups="7";

# Allow only alphanumeric characters in the tag
tag="$(echo "${1}" | tr -cd '[:alnum:]-_')";

snapshot-receive.sh only takes a single argument, and that's the tag it should use for the snapshot being received:


sudo snapshot-receive DEST_TAG_NAME

The target directory it should save snapshots to is stored as a variable at the top of the file (the target_dir there). You should change this based on your specific setup. It goes without saying, but the target directory needs to be a directory on a btrfs filesystem (preferably raid1, though as I've said before btrfs raid1 is a misnomer). We also ensure that the tag contains only safe characters for security.

max_backups is the maximum number of snapshots to keep. Any older snapshots will be deleted.

Next, some error handling:

###############################################################################

# $EUID = effective uid
if [[ "${EUID}" -ne 0 ]]; then
    echo "Error: This script must be run as root (currently running as effective uid ${EUID})" >&2;
    exit 5;
fi

if [[ -z "${tag}" ]]; then
    echo "Error: No tag specified. It should be specified as the 1st and only argument, and may only contain alphanumeric characters." >&2;
    echo "Example:" >&2;
    echo "    snapshot-receive TAG_NAME_HERE" >&2;
    exit 4;
fi

Nothing too exciting. Continuing on, a pair of useful helper functions:


###############################################################################

## Logs a message to stdout.
# $*    The message to log.
log_msg() {
    echo "[ $(date +%Y-%m-%dT%H:%M:%S) ] remote/${HOSTNAME}: >>> ${*}";
}

list_backups() {
    find "${target_dir}/${tag}" -maxdepth 1 ! -path "${target_dir}/${tag}" -type d;
}

list_backups lists the snapshots with the given tag, and log_msg logs messages to stdout (not stderr unless there's an error, because otherwise cronic will dutifully send you an email every time the scripts execute). Next up, more error handling:

###############################################################################

if [[ "${target_dir}" == "CHANGE_ME" ]]; then
    echo "Error: target_dir was not changed from the default value." >&2;
    exit 1;
fi

if [[ ! -d "${target_dir}" ]]; then
    echo "Error: No directory was found at '${target_dir}'." >&2;
    exit 2;
fi

if [[ ! -d "${target_dir}/${tag}" ]]; then
    log_msg "Creating new directory at ${target_dir}/${tag}";
    mkdir "${target_dir}/${tag}";
fi

We check:

  • That the target directory was changed from the default CHANGE_ME value
  • That the target directory exists

We also create a subdirectory for the given tag if it doesn't exist already.

With the preamble completed, we can actually receive the snapshot:

log_msg "Launching btrfs in chroot mode";

time nice ionice -c Idle btrfs receive --chroot "${target_dir}/${tag}";

We use nice and ionice to reduce the priority of the receive to the lowest possible level. If you're using a Raspberry Pi (I have a Raspberry Pi 4 with 4GB RAM) like I am, this is important for stability (Pis tend to fall over otherwise). Don't worry if you experience some system crashes on your Pi when transferring the first snapshot - I've found that incremental snapshots don't cause the same issue.

We also use the chroot option there for increased security.

Now that the snapshot is transferred, we can delete old snapshots if we have too many:

backups_count="$(echo -e "$(list_backups)" | wc -l)";

log_msg "Btrfs finished, we now have ${backups_count} backups:";
list_backups;

while [[ "${backups_count}" -gt "${max_backups}" ]]; do
    oldest_backup="$(list_backups | sort | head -n1)";
    log_msg "Maximum number backups is ${max_backups}, requesting removal of backup for $(dirname "${oldest_backup}")";

    btrfs subvolume delete "${oldest_backup}";

    backups_count="$(echo -e "$(list_backups)" | wc -l)";
done

log_msg "Done, any removed backups will be deleted in the background";

Sorted! The only thing left to do here is to set up those sudo rules. Let's do that now. Execute sudoedit /etc/sudoers, and enter the following:

%backup-senders ALL=(ALL) NOPASSWD: /usr/local/sbin/snapshot-receive TAG_NAME

Replace TAG_NAME with the DEST_TAG_NAME you're using. You'll need 1 entry in /etc/sudoers for each DEST_TAG_NAME you're using.

We assign the rights to the backup-senders group we created earlier, of which the user we are going to SSH in with is a member. This makes the system more flexible should we want to extend it later.

Warning: A mistake in /etc/sudoers can leave you unable to use sudo! Make sure you have a root shell open in the background and that you test sudo again after making changes to ensure you haven't made a mistake.

That completes the setup of snapshot-receive.sh.

Conclusion

With snapshot-send.sh and snapshot-receive.sh, we now have a system for transferring snapshots from 1 host to another via SSH. If combined with full disk encryption (e.g. with LUKS), this provides a secure backup system with a number of desirable qualities:

  • The main NAS can't access the backups on the backup NAS (in case of ransomware)
  • Backups are encrypted during transfer (via SSH)
  • Backups are encrypted at rest (LUKS)

To further secure the backup NAS, one could:

  • Disable SSH password login
  • Automatically start / shutdown the backup NAS (though with full disk encryption when it boots up it would require manual intervention)

At the bottom of this post I've included the full scripts for you to copy and paste.

As it turns out, there will be 1 more post in this series, which will cover generating multiple streams of backups (e.g. weekly, monthly) from a single stream of e.g. daily backups on my backup NAS.

Full scripts

snapshot-send.sh

#!/usr/bin/env bash
set -e;

dir_source="${1}";
tag_source="${2}";
tag_dest="${3}";
loc_ssh_key="${4}";
remote_host="${5}";

if [[ -z "${remote_host}" ]]; then
    echo "This script sends btrfs snapshots to a remote host via SSH.
The script snapshot-receive must be present on the remote host in the PATH for this to work.
It pairs well with btrfs-snapshot-rotation: https://github.com/mmehnert/btrfs-snapshot-rotation
Usage:
    snapshot-send.sh <snapshot_dir> <source_tag_name> <dest_tag_name> <ssh_key> <user@host>

Where:
    <snapshot_dir> is the path to the directory containing the snapshots
    <source_tag_name> is the tag name to look for (see btrfs-snapshot-rotation).
    <dest_tag_name> is the tag name to use when sending to the remote. This must be unique across all snapshot rotations sent.
    <ssh_key> is the path to the ssh private key
    <user@host> is the user@host to connect to via SSH" >&2;
    exit 0;
fi

# $EUID = effective uid
if [[ "${EUID}" -ne 0 ]]; then
    echo "Error: This script must be run as root (currently running as effective uid ${EUID})" >&2;
    exit 5;
fi

if [[ ! -e "${loc_ssh_key}" ]]; then
    echo "Error: When looking for the ssh key, no file was found at '${loc_ssh_key}' (have you checked the spelling and file permissions?)." >&2;
    exit 1;
fi
if [[ ! -d "${dir_source}" ]]; then
    echo "Error: No source directory located at '${dir_source}' (have you checked the spelling and permissions?)" >&2;
    exit 2;
fi

###############################################################################

# The filepath to the last sent text file that contains the name of the snapshot that was last sent to the remote.
# If this file doesn't exist, then we send a full snapshot to start with.
# We need to keep track of this because we need this information to know which
# snapshot we need to parent the latest snapshot from to send snapshots incrementally.
filepath_last_sent="${dir_source}/last_sent_@${tag_source}.txt";

## Logs a message to stdout.
# $*    The message to log.
log_msg() {
    echo "[ $(date +%Y-%m-%dT%H:%M:%S) ] remote/${HOSTNAME}: >>> ${*}";
}

## Lists all the currently available snapshots for the current source tag.
list_snapshots() {
    find "${dir_source}" -maxdepth 1 ! -path "${dir_source}" -name "*@${tag_source}" -type d;
}

## Returns an exit code of 0 if we've sent a snapshot, or 1 if we haven't.
have_sent() {
    if [[ ! -f "${filepath_last_sent}" ]]; then
        return 1;
    else
        return 0;
    fi
}

## Fetches the directory name of the last snapshot sent to the remote with the given tag name.
last_sent() {
    if [[ -f "${filepath_last_sent}" ]]; then
        cat "${filepath_last_sent}";
    fi
}

do_ssh() {
    ssh -o "ServerAliveInterval=900" -i "${loc_ssh_key}" "${remote_host}" sudo snapshot-receive "${tag_dest}";
}

latest_snapshot="$(list_snapshots | sort | tail -n1)";
latest_snapshot_name="$(basename "${latest_snapshot}")";

if [[ "$(dirname "${latest_snapshot_dirname}")" == "$(cat "${filepath_last_sent}")" ]]; then
    if [[ -z "${FORCE_SEND}" ]]; then
        echo "We've sent the latest snapshot '${latest_snapshot_dirname}' already and the FORCE_SEND environment variable is empty or not specified, skipping";
        exit 0;
    else
        echo "We've sent it already, but sending it again since the FORCE_SEND environment variable is specified";
    fi
fi

if ! have_sent; then
    log_msg "Sending initial snapshot $(dirname "${latest_snapshot}")";
    btrfs send "${latest_snapshot}" | do_ssh;
else
    parent_snapshot="${dir_source}/$(last_sent)";
    if [[ ! -d "${parent_snapshot}" ]]; then
        echo "Error: Failed to locate parent snapshot at '${parent_snapshot}'" >&2;
        exit 3;
    fi

    log_msg "Sending incremental snapshot $(dirname "${latest_snapshot}") parent $(last_sent)";
    btrfs send -p "${parent_snapshot}" "${latest_snapshot}" | do_ssh;
fi


log_msg "Updating state information";
basename "${latest_snapshot}" >"${filepath_last_sent}";
log_msg "Snapshot sent successfully";

snapshot-receive.sh

#!/usr/bin/env bash

# This script wraps btrfs receive so that it can be called by non-root users.
# It should be saved to '/usr/local/sbin/snapshot-receive' (without quotes, of course).
# The following entry needs to be put in the sudoers file:
# 
# %backup-senders   ALL=(ALL) NOPASSWD: /usr/local/sbin/snapshot-receive TAG_NAME
# 
# ....replacing TAG_NAME with the name of tag you want to allow. You'll need 1 line in your sudoers file per tag you want to allow.
# Edit your sudoers file like this:
# sudo visudo

# The ABSOLUTE path to the target directory to receive to.
target_dir="CHANGE_ME";

# The maximum number of backups to keep.
max_backups="7";

# Allow only alphanumeric characters in the tag
tag="$(echo "${1}" | tr -cd '[:alnum:]-_')";

###############################################################################

# $EUID = effective uid
if [[ "${EUID}" -ne 0 ]]; then
    echo "Error: This script must be run as root (currently running as effective uid ${EUID})" >&2;
    exit 5;
fi

if [[ -z "${tag}" ]]; then
    echo "Error: No tag specified. It should be specified as the 1st and only argument, and may only contain alphanumeric characters." >&2;
    echo "Example:" >&2;
    echo "    snapshot-receive TAG_NAME_HERE" >&2;
    exit 4;
fi

###############################################################################

## Logs a message to stdout.
# $*    The message to log.
log_msg() {
    echo "[ $(date +%Y-%m-%dT%H:%M:%S) ] remote/${HOSTNAME}: >>> ${*}";
}

list_backups() {
    find "${target_dir}/${tag}" -maxdepth 1 ! -path "${target_dir}/${tag}" -type d;
}

###############################################################################

if [[ "${target_dir}" == "CHANGE_ME" ]]; then
    echo "Error: target_dir was not changed from the default value." >&2;
    exit 1;
fi

if [[ ! -d "${target_dir}" ]]; then
    echo "Error: No directory was found at '${target_dir}'." >&2;
    exit 2;
fi

if [[ ! -d "${target_dir}/${tag}" ]]; then
    log_msg "Creating new directory at ${target_dir}/${tag}";
    mkdir "${target_dir}/${tag}";
fi

log_msg "Launching btrfs in chroot mode";

time nice ionice -c Idle btrfs receive --chroot "${target_dir}/${tag}";

backups_count="$(echo -e "$(list_backups)" | wc -l)";

log_msg "Btrfs finished, we now have ${backups_count} backups:";
list_backups;

while [[ "${backups_count}" -gt "${max_backups}" ]]; do
    oldest_backup="$(list_backups | sort | head -n1)";
    log_msg "Maximum number backups is ${max_backups}, requesting removal of backup for $(dirname "${oldest_backup}")";

    btrfs subvolume delete "${oldest_backup}";

    backups_count="$(echo -e "$(list_backups)" | wc -l)";
done

log_msg "Done, any removed backups will be deleted in the background";

MIDI to Music Box score converter

Keeping with the theme of things I've forgotten to blog about (every time I think I've mentioned everything, I remember something else), in this post I'm going to talk about yet another project of mine that's been happily sitting around (and I've nearly blogged about but then forgot on at least 3 separate occasions): A MIDI file to music box conversion script. A while ago, I bought one of these customisable music boxes:

(Above: My music box (centre), and the associated hole punching device it came with (top left).)

The basic principle is that you punch holes in a piece of card, and then you feed it into the music box to make it play the notes you've punched into the card. Different variants exist with different numbers of notes - mine has a staggering 30 different notes it can play!

It has a few problems though:

  1. The note names on the card are wrong
  2. Figuring out which note is which is a problem
  3. Once you've punched a hole, it's permanent
  4. Punching the holes is fiddly

Solving problem #4 might take a bit more work (a future project out-of-scope of this post), but the other 3 are definitely solvable.

MIDI is a protocol (and a file format) for storing notes from a variety of different instruments (this is just the tip of the iceberg, but it's all that's needed for this post), and it just so happens that there's a C♯ library on NuGet called DryWetMIDI that can read MIDI files in and allow one to iterate over the notes inside.
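
To give a rough idea of what "iterating over the notes" looks like in practice, here's a minimal sketch - using Python and the mido library rather than the C♯ DryWetMIDI library the real program uses, and with a placeholder filename:

import mido

midi = mido.MidiFile("tune.mid")  # placeholder filename

elapsed = 0  # seconds since the start of the file
for message in midi:  # iterating a MidiFile yields messages with delta times in seconds
    elapsed += message.time
    if message.type == "note_on" and message.velocity > 0:
        print(f"{elapsed:8.3f}s  note {message.note}")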

By using this and a homebrew SVG writer class (another blog post topic for another time once I've tidied it up), I implemented a command-line program I've called MusicBoxConverter (the name needs work - comment below if you can think of a better one) that converts MIDI files to an SVG file, which can then be printed (carefully - full instructions on the project's page - see below).

Before I continue, it's perhaps best to give an example of my (command line) program's output:

Example output

(Above: A lovely little tune from the TV classic The Clangers, converted for a 30 note music box by my MusicBoxConverter program. I do not own the song!)

The output from the program can (once printed to the correct size) then be cut out and pinned onto a piece of card for hole punching. I find paper clips work for this, but in future I might build myself an Arduino-powered device to punch holes for me.

The SVG generated has a bunch of useful features that make reading it easy:

  • Each note is marked by a blue circle with a red plus sign to help with accuracy (the hole puncher that came with mine has lines on it that make a plus shape over the viewing hole)
  • Each hole is numbered to aid with getting it the right way around
  • Big arrows in the background ensure you don't get it backwards
  • The orange bar indicates the beginning
  • The blue circle should always be at the bottom and the red circle at the top, so you don't get it upside down
  • The green lines should be lined up with the lines on the card

While it's still a bit tedious to punch the holes, it's much easier to adapt a score in Musescore, export it to MIDI, and push it through MusicBoxConverter than to do lots of trial-and-error punching directly unaided.

I won't explain how to use the program in this blog post in any detail, since it might change. However, the project's README explains in detail how to install and use it:

https://git.starbeamrainbowlabs.com/sbrl/MusicBoxConverter

Something worth a particular mention here is the printing process for the generated SVGs. It's somewhat complicated, but this is unfortunately necessary in order to avoid rescaling the SVG during the conversion process - otherwise it doesn't come out the right size to be compatible with the music box.

A number of improvements stand out to me that I could make to this project. While it only supports 30 note music boxes at the moment, it could easily be extended to support other types of music box (I just don't have them to hand).

Implementing a GUI is another possible improvement but would take a lot of work, and might be best served as a different project that uses my MusicBoxConverter CLI under the hood.

Finally, as I mentioned above, in a future project (once I have a 3D printer) I want to investigate building an automated hole punching machine to make punching the holes much easier.

If you make something cool with this program, please comment below! I'd love to hear from you (and I may feature you in the README of the project too).

Sources and further reading

  • Musescore - free and open source music notation program with a MIDI export function
  • MusicBoxConverter (name suggestions wanted in the comments below)

NAS Backups, Part 1: Overview

After building my nice NAS, the next thing on my sysadmin todo list was to ensure it is backed up. In this miniseries (probably 2 posts), I'm going to talk about the backup NAS that I've built. As of the time of typing, it is successfully backing up my Btrfs subvolumes.

In this first post, I'm going to give an overview of the hardware I'm using and the backup system I've put together. In future posts, I'll go into detail as to how the system works, and which areas I still need to work on moving forwards.

Personally, I find that the 3-2-1 backup strategy is a good rule of thumb:

  • 3 copies of the data
  • 2 different media
  • 1 off-site

What this means is that you should have 3 copies of your data, with 2 local copies and 1 remote copy in a different geographical location. To achieve this, I have first solved the problem of the local backup copy, since it's a lot easier and less complicated than the remote one. Although I've talked about backups before (see also), in this case my solution is slightly different - partly due to the amount of data involved, and partly due to additional security measures I'm trying to get into place.

Hardware

For hardware, I'm using a Raspberry Pi 4 with 4GB RAM (the same as the rest of my cluster), along with 2 x 4 TB USB 3 external hard drives. This is a fairly low-cost and low-performance solution. I don't particularly care how long it takes the backup to complete, and it's relatively cheap to replace if it fails without being unreliable (Raspberry Pis are tough little things).

Here's a diagram of how I've wired it up:

(Can't see the above? Try a direct link to the SVG. Created with drawio.)

I use USB Y-cables to power the hard drives directly from the USB power supply, as the Pi is unlikely to be able to supply enough power for multiple external hard drives on its own.

Important Note: As I've discovered with a different unrelated host on my network, if you do this you can back-power the Pi through the USB Y cable, and potentially corrupt the microSD card by doing so. Make sure you switch off the entire USB power supply at once, rather than unplug just the Pi's power cable!

For a power supply, I'm using an Anker 10 port device (though I bought it through Amazon, since I wasn't aware that Anker had their own website) - the same one that powers my Pi cluster.

Strategy

To do the backup itself, I'm taking advantage of the fact that I store my data in Btrfs subvolumes, using the btrfs send / btrfs receive commands to send my subvolumes to the remote backup host over SSH. This has a number of benefits:

  1. The host doing the backing up has no access to the resulting backups (so if it gets infected it can't also infect the backups)
  2. The backups are read-only Btrfs snapshots (so if the backup NAS gets infected my backups can't be altered without first creating a read-write snapshot)
  3. Backups are done incrementally to save time, but a full backup is done automatically on the first run (or if the local metadata is missing)

While my previous backup solution using Restic (for the server that sent you this web page) ticks point #3 on the list above, it doesn't have points #1 and #2.

Restic does encrypt backups at rest though, which the system I'm setting up doesn't do unless you use LUKS to encrypt the underlying disks that Btrfs stores its data on. More on that in the future, as I have tentative plans to deal with my off-site backup problem using a similar technique to the one I've used here, which also encrypts data at rest when a backup isn't taking place.

In the next post, I'll be diving into the implementation details for the backup system I've created and explaining it in more detail - including sharing the pair of scripts that I've developed that do the heavy lifting.

WorldEditAdditions: The story of the //convolve command

Sometimes, one finds a fascinating new use for an algorithm in a completely unrelated field to its original application. Such is the case with the algorithm behind the //convolve command in WorldEditAdditions, which I recently blogged about. At the time, I had tried out the //smooth command provided by we_env, which smooths out rough edges in the terrain inside the defined WorldEdit region, but I was less than satisfied with its flexibility, configurability, and performance.

To this end, I set about devising a solution. My breakthrough came when I realised that terrain smoothing is much easier when you consider the terrain as a 2D heightmap, which is itself suspiciously like a greyscale image. Image editors such as GIMP already have support for smoothing with their various blur filters, the most interesting of which for my use case was the gaussian blur filter. Applying a gaussian blur to the terrain heightmap has the effect of smoothing the terrain - and thus //convolve was born.

If you're looking for a detailed explanation on how to use this command, you're probably in the wrong place - this post is more of a developer commentary on how it is implemented. For an explanation on how to use the command, please refer to the chat command reference.

The gaussian blur effect is applied by performing an operation called a convolution over the input 2D array. This operation is not specific to image editors (it can be found in AI too), and it calculates the new value for each pixel by taking a weighted sum of its existing value and all of its neighbours. It's perhaps best explained with a diagram:

All the weights in the kernel (sometimes called a filter) are floating-point numbers and should add up to 1. Since we look at each pixel's neighbours, we can't operate on the pixels around the edge of the image. Different implementations have different approaches to solving this problem:

  1. Drop them in the output image (i.e. the output image is slightly smaller than the input) - a CNN in AI does this by default in most frameworks
  2. Ignore the pixel and pass it straight through
  3. Take only a portion of the kernel and remap the weights so that we can use that portion instead
  4. Wrap around to the other side of the image

For my implementation of the //convolve command, the heightmap in my case is essentially infinite in all directions (although I obviously request a small subset thereof), so I chose option 2 here. I calculate the boundary given the size of the kernel and request an area with an extra border around it so I can operate on all the pixels inside the defined region.

Let's look at the maths more specifically here:

Given an input image, for each pixel we multiply the kernel element-wise by that pixel and its neighbours, and then sum all the values in the resulting matrix to get the new pixel value.
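
To make that concrete, here's a rough sketch of the weighted-sum operation in Python (not the Lua that WorldEditAdditions actually uses), passing edge pixels straight through unchanged as in option 2 above:

def convolve(heightmap, kernel):
    """Apply a square, odd-sized kernel to a 2D heightmap; edge cells pass through unchanged."""
    height, width = len(heightmap), len(heightmap[0])
    ksize = len(kernel)
    border = ksize // 2
    result = [row[:] for row in heightmap]  # start from a copy so edge cells keep their values

    for y in range(border, height - border):
        for x in range(border, width - border):
            total = 0.0
            for ky in range(ksize):
                for kx in range(ksize):
                    total += kernel[ky][kx] * heightmap[y + ky - border][x + kx - border]
            result[y][x] = total
    return result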

The specific weightings are of course configurable, and are represented by a rectangular (often square) 2D array. By manipulating these weightings, one can perform many different kinds of operation - blurring, sharpening, edge detection, and more.

Currently, WorldEditAdditions supports 3 different kernels in the //convolve command:

  • Box blur - frequently has artefacts
  • Gaussian Blur (additional parameter: sigma, which controls the 'blurriness') - the default, and also most complicated to implement
  • Pascal - A unique variant I derived from Pascal's Triangle. Works out slightly sharper than Gaussian Blur, without the artefacts of box blur.

After implementing the core convolution engine, writing functions to generate kernels of defined size for the above kernels was a relatively simple matter. If you know of another kernel that you think would be useful to have in WorldEditAdditions, I'm definitely open to implementing it.
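
As an illustration of what generating such kernels can look like, here's a short Python sketch (again, not the actual Lua implementation - and the Pascal construction shown here, an outer product of a row of binomial coefficients, is my interpretation rather than necessarily the exact approach WorldEditAdditions takes):

import math

def gaussian_kernel(size, sigma):
    """Build a size x size Gaussian kernel whose weights sum to 1."""
    centre = size // 2
    kernel = [[math.exp(-((x - centre) ** 2 + (y - centre) ** 2) / (2 * sigma ** 2))
               for x in range(size)] for y in range(size)]
    total = sum(sum(row) for row in kernel)
    return [[value / total for value in row] for row in kernel]

def pascal_kernel(size):
    """Build a kernel from a row of Pascal's Triangle (an outer product of binomial coefficients)."""
    row = [math.comb(size - 1, k) for k in range(size)]
    total = sum(row) ** 2
    return [[a * b / total for b in row] for a in row]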

For the curious, the core convolutional filter can be found in convolve.lua. It takes 4 arguments:

  1. The heightmap - a flat ZERO indexed array of values. Think of a flat array containing image data like the HTML5 Canvas getImageData() function.
  2. The size of the heightmap in the form { x = width, z = height }
  3. The kernel (internally called a matrix), in the same form as the heightmap
  4. The size of the kernel in the same form as the size of the heightmap.
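
To make the flat array layout concrete, reading a single height value looks something like the sketch below - though the exact axis ordering is an assumption on my part, so check convolve.lua for the real layout:

def height_at(heightmap, size, x, z):
    # size is a table like { x = width, z = height }; the z-major ordering here is assumed
    return heightmap[z * size["x"] + x]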

Reflecting on the implementation, I'm pretty happy with it within the restrictions of the environment within which it runs. As you might have suspected based on my explanation above, convolution-based operations are highly parallelisable (there's a reason they are used in AI), so there's a ton of scope for using multiple threads, SIMD, and more - but unfortunately the Lua (5.1! It's so frustrating since 5.4 is out now) environment in Minetest doesn't allow for threading or other such optimisations to be made.

The other thing I might do something about soon is the syntax of the command. Currently it looks like this:

//convolve <kernel> [<width>[,<height>]] [<sigma>]

Given the way I've used this command in-the-field, I've found myself specifying the kernel width and the sigma value more often than I have found myself changing the kernel. With this observation in hand, it would be nice to figure out a better syntax here that reduces the amount of typing for everyday use-cases.

In a future post about WorldEditAdditions, I want to talk about a number of topics, such as the snowballs algorithm in the //erode command. I also may post about the new //noise2d command I've been working on, but that might take a while (I'm still working on implementing a decent noise function for that, since at the time of typing the Perlin noise implementation I've ported is less than perfect).

WorldEditAdditions: More WorldEdit commands for Minetest

Personally, I see enormous creative potential in games such as Minetest. For those who aren't in the know, Minetest is a voxel sandbox game rather like the popular Minecraft. Being open-source though, it has a solid Lua Modding API that allows players with a bit of Lua knowledge to script their own mods with little difficulty (Lua doesn't come with batteries included, but that's a topic for another day). Given the ease of making mods, Minetest puts much more emphasis on installing and using mods - in fact, most content is obtained this way.

Personally I find creative building in Minetest a relaxing activity, and one of the mods I have found most useful for this is WorldEdit. Those familiar with Minecraft may already be aware of WorldEdit for Minecraft - Minetest has its own equivalent too. WorldEdit in both games provides an array of commands one can type into the chat window to perform various functions to manipulate the world - for example filling an area with blocks, creating shapes such as spheres and pyramids, or replacing 1 type of node with another.

Unfortunately though, WorldEdit for Minetest (henceforth simply WorldEdit) doesn't have quite the same feature set that WorldEdit for Minecraft does - so I decided to do something about it. Initially, I contributed a pull request to add node weighting support to the //mix command, but I quickly realised that the scale of the plans I had in mind wasn't exactly compatible with WorldEdit's codebase (as of the time of typing, WorldEdit for Minetest's codebase consists of a relatively small number of very long files).

(Above: The WorldEditAdditions logo, kindly created by @VorTechnix with Blender 2.9)

To this end, I decided to create my own project in which I could build out the codebase in my own way, without waiting for lots of pull requests to be merged (which would probably put strain on the maintainers of WorldEdit).

The result of this is a new mod for Minetest which I've called WorldEditAdditions. Currently, it has over 35 additional commands (though by the time you read this that number has almost certainly grown!) and 2 handheld tools that extend the core feature set provided by WorldEdit, adding commands to do things like:

These are just a few of my favourites - I've implemented many more beyond those in this list. VorTechnix has also contributed a raft of improvements, from enhancements to //torus to an entire suite of selection commands (//srect, //scol, //scube, and more). Collaborating on an open-source project with someone on another continent has been an amazing experience (open-source is so cool like that).

A fundamental decision I made at the beginning when working on WorldEditAdditions was to split my new codebase into lots of loosely connected files, each with a single purpose. If a file gets over 150 lines, then it's a candidate for being split up - for example, every command has a backend (which sometimes itself spans multiple files, as in the case of //convolve and //erode) that actually manipulates the Minetest world, and a front-end that handles the chat command parsing (this way we also get an API for free! I'm still working on documenting that properly though - if anyone knows of a good documentation generator for Lua, please comment below). By doing this, I enable the codebase to scale in a way that the original WorldEdit codebase does not.

I've had a ton of fun so far implementing different commands and subsequently using them (it's so satisfying to see a new command finally working!), and they have already proved very useful in creative building. Evidently others think so too, as we've already had over 4800 downloads on ContentDB:

(Above: The ContentDB badge listing the live number of downloads.)

Given the enormous amount of work I and others have put into WorldEditAdditions and the level of polish it is now achieving (on par with my other big project Pepperminty Wiki, which I've blogged about before), recently I also built a website for it:

The WorldEditAdditions website

You can visit it here: https://worldeditadditions.mooncarrot.space/

I built the website with Eleventy, as I did with the Pepperminty Wiki website (blog post). My experience with Eleventy this time around was much more positive than last time - more on this in a future blog post. For the observant, I did lift the fancy button code on the new WorldEditAdditions website from the Pepperminty Wiki website :D

The website has a number of functions:

  • Explaining what WorldEditAdditions is
  • Instructing people on how to install it
  • Showing people how to use it
  • Providing a central reference of commands

I think I covered all of these bases fairly well, but only time will tell how actual users find it (please comment below! It gives me a huge amount of motivation to continue working on stuff like this).

Several components of the WorldEditAdditions codebase deserve their own blog posts that have not yet got one - especially //erode and //convolve, so do look out for those at some point in the future.

Moving forwards with WorldEditAdditions, I want to bring it up to feature parity with WorldEdit for Minecraft. I also have a number of cool and unique commands in mind that I'm working on, such as //noise (apply an arbitrary 2D noise function to the height of the terrain), //mathapply (execute another command, but only apply changes to the nodes whose coordinates result in a number greater than 0.5 when pushed through a given arbitrary mathematical expression - I've already started implementing a mathematical expression parser in Lua with a recursive-descent parser, but have run into difficulties with operator precedence), and exporting Minetest regions to .obj files that can then be imported into Blender (a collaboration with @VorTechnix).

If WorldEditAdditions has caught your interest, you can get started by visiting the WorldEditAdditions website: https://worldeditadditions.mooncarrot.space/.

If you find it useful, please consider starring it on GitHub and leaving a review on ContentDB! If you run into difficulties, please open an issue and I'll help you out.

simple-dash fork: now with directory support!

A while back (I still have all sorts of projects I've forgotten to blog about - with many more to come), I forked an excellent project called simple-dash, which is a web dashboard. You can configure it to display 1 or more links, and it presents them nice and cleanly in the middle of the page.

I don't make forks lightly, but in this case I liked the project a lot - but I wanted to add enough features that I felt I might be taking it in a different direction than the original project. The original project also hasn't been touched in 2+ years, and the author hasn't had any contributions on GitHub in that time either - so I think it's fair to say it's unlikely that any pull request I open would be looked at (if the original author is reading this, I'm happy to open one!).

Anyway, before I continue too far, here's a screenshot of my improvements in action:

A screenshot of my improvements - explained in more detail below.

I use simple-dash in multiple places to provide a dashboard of links to the various services that I run so I both don't lose them and, in some cases, other people in my family can easily access said services.

I added a number of features here. The first is invisible, but I completely re-implemented the layout to use the CSS Grid (see also: a, b). If you've played with CSS before but aren't yet aware of the CSS grid, I can thoroughly recommend you take a moment to investigate - it will blow you away and solve all your layout problems at the same time! In short, it's like a 2D version of the flexbox.

Since the original has full mobile support, I continued that trend in the rewrite with some CSS media queries that change the number of items per row based on the width of your screen.

The other invisible change is that I changed the language the configuration file is written in to TOML, which is a much more friendly language to write configuration files in.

Anyway, in terms of more visible changes, I also added the ability to set a background image, as well as the default random triangles background. Icons also got the same treatment - gaining the ability to display an image instead of a Font Awesome icon (I haven't actually used Font Awesome before, so this was an interesting experience - even if it was already setup in this project).

Last but certainly not least, I added the ability to group pages into folders. Here's a screenshot of what the contents of the folder in the top left looks like when opened:

simple-dash with a folder open

You can't see it here, but it's even animated! Link to a demo at the end of this post.

There were a number of different challenges to overcome to get this working right - it was not trivial at all. There are 2 components to it: the CSS to style it, and the Javascript to fiddle the class list on the folder itself to add / remove the active class (so that I could distinguish between open and closed folders in the CSS), and also to prevent the click event from propagating through to the links when the folder is closed.

Thinking about it, it may be possible to use a clever pointer-events: none rule to avoid the Javascript.

The CSS does the heavy lifting here though. For inactive folders, I use a CSS grid with overflow: hidden to display the first 4 icons as a preview. When the folder becomes active, position: fixed breaks it out of the layout of the rest of the page (sadly, leaving a placeholder behind would require an additional HTML element), and the content reflows to use the same CSS as the main grid of tiles.

Through some CSS grid wizardry (you can do anything with CSS grid, it's amazing) and a container element, I can even fade out the rest of the page while the folder is open.

Clicking on the items in a folder when the folder is open takes you to their destination as usual, while clicking anywhere else closes the folder again.

I've got a demo running over here if you'd like to play around with it:

sbrl's simple-dash fork demo

The background is set to a random image from Unsplash. It loads fine for me, but sometimes it takes a moment.

If this looks like something you'd like to use for yourself, my fork is open-source! Check it out here:

sbrl/simple-dash on GitHub

You can find instructions on how to set it up for yourself in the README. You'll need npm to install dependencies - this should come bundled with Node.js. You can also find a lovingly-commented example configuration file here:

config.sample.toml

If you have any difficulties setting it up, want to request a feature, or even (gasp!) report a bug, please open an issue. While I do monitor the comments here on this blog, GitHub issues are a much better place to track bugs and feature requests.
