Starbeamrainbowlabs

Stardust
Blog

Backing up with tar, curl, and SFTP with key-based authentication

I have multiple backup strategies, from restic (which was preceded by duplicity) to btrfs snapshots that I sync over ssh. You can never have too many backups though (especially for your most valuable data that can't be easily replaced), so in this post I want to share another of the mechanisms I employ.

Backup systems have to suit the situation at hand, and in this case I have a personal git server which I backup daily to Backblaze B2. In order to be really absolutely sure that I don't lose it though, I also back it up to my home NAS (see also the series that I wrote on it). As you might have guessed km the title of this post, it takes backups using tar. I have recently upgraded it to transfer these backups over SFTP (SSH File Transfer Protocol).

Given that the sftp command exists, one might wonder why I use curl instead. Unfortunately, sftp as far as I can tell does not support uploading a file passed in though stdin - which is very useful when you have limited disk space on the source host! But using curl, we can pipe the output of tar directly to curl without touching the disk.

Documentation is sadly rather sparse on using curl to upload via SFTP, so it took some digging to figure out how to do it using SSH keys. SSH keys are considerably more secure than using a password (and a growing number of my systems are setup to disallow password authentication altogether), so I'll be using SSH key based authentication in this post.

To start, you'll need to generate a new SSH keypair. I like to use ed25519:

ssh-keygen -t ed25519

When prompted, choose where you want to save it to (preferably with a descriptive name), and then do not put a password on it. This is important, because at least in my case want this to operate completely autonomously without any user input.

Then, copy the public SSH key to your remote server (I strongly recommend using an account that is locked to be SFTP-only and no shell access - this tutorial seems to be good at explaining the steps involved in doing this), and then on the device doing the backing up do a test to both make sure it works and add the remote server to the known_hosts file:

sudo -u backupuser bash
ssh -i path/to/keyfile -T remoteuser@remotehost

Now we've got our SSH / SFTP setup done, we can do the backup itself:

ionice -c Idle nice -n20 tar --create --exclude-tag .BACKUP_IGNORE --gzip --file path/to/dir_to_backup | curl -sS --user "remoteuser:" --key "path/to/sshkey_ed25519" --pubkey "path/to/sshkey_ed25519.pub" -T - "sftp://example.com/path/on/remote/upload_filename.tar.gz"

Let's break this down a bit:

Personally, I'm using this technique with an SSH tunnel, so my variant of the above command looks a bit like this (extra bits around the edges stripped away for clarity):

git_backup_user="sftpbackups";
git_backup_location="sftp://localhost:20204/git-backups";
git_backup_key="path/to/sshkey_ed25519";
upload_filename="git-$(date +"%Y-%m-%d").tar.gz";

nice -n20 tar --create --exclude-tag .BACKUP_IGNORE --gzip --file - git/{data,gitea,repos}/ www/blog | curl -sS --user "${git_backup_user}:" --key "${git_backup_key}" --pubkey "${git_backup_key}.pub" -T - "${git_backup_location}/${upload_filename}"

That's it for this post. If you've got any questions or comments, please post them below.

Tag Cloud

3d 3d printing account algorithms android announcement architecture archives arduino artificial intelligence artix assembly async audio automation backups bash batch blender blog bookmarklet booting bug hunting c sharp c++ challenge chrome os cluster code codepen coding conundrums coding conundrums evolved command line compilers compiling compression conference conferences containerisation css dailyprogrammer data analysis debugging defining ai demystification distributed computing dns docker documentation downtime electronics email embedded systems encryption es6 features ethics event experiment external first impressions freeside future game github github gist gitlab graphics guide hardware hardware meetup holiday holidays html html5 html5 canvas infrastructure interfaces internet interoperability io.js jabber jam javascript js bin labs latex learning library linux lora low level lua maintenance manjaro minetest network networking nibriboard node.js open source operating systems optimisation outreach own your code pepperminty wiki performance phd photos php pixelbot portable privacy problem solving programming problems project projects prolog protocol protocols pseudo 3d python reddit redis reference release releases rendering research resource review rust searching secrets security series list server software sorting source code control statistics storage svg systemquery talks technical terminal textures thoughts three thing game three.js tool tutorial twitter ubuntu university update updates upgrade version control virtual reality virtualisation visual web website windows windows 10 worldeditadditions xmpp xslt

Archive

Art by Mythdael