Cluster, Part 8: The Shoulders of Giants | NFS, Nomad, Docker Registry
Welcome back! It's been a bit of a while, but now I'm back with the next part of my cluster series. As a refresher, here's a list of all the parts in the series so far:
- Cluster, Part 1: Answers only lead to more questions
- Cluster, Part 2: Grand Designs
- Cluster, Part 3: Laying groundwork with Unbound as a DNS server
- Cluster, Part 4: Weaving Wormholes | Peer-to-Peer VPN with WireGuard
- Cluster, Part 5: Staying current | Automating apt updates and using apt-cacher-ng
- Cluster, Part 6: Superglue Service Discovery | Setting up Consul
- Cluster, Part 7: Wrangling... boxes? | Expanding the Hashicorp stack with Docker and Nomad
In this one, we're going to look at running our first job on our Nomad cluster! If you haven't read the previous posts in this series, you'll probably want to go back and read them now, as this post builds on the infrastructure we've set up and the groundwork we've laid so far.
Before we get to that though, we need to sort out shared storage - as we don't know which node in the cluster tasks will be running on. In my case, I'll be setting up NFS. This is hardly the only solution to the issue though - see the sources at the end of this post for some alternatives.
If you're going to choose NFS like me though, you should be warned that it's neither encrypted nor authenticated. You should ensure that NFS is only run on a trusted network. If you don't have a trusted network, use the WireGuard Mesh VPN trick from part 4 of this series.
NFS: Server
Setting up a server is relatively easy. Simply install the relevant package:
sudo apt install nfs-kernel-server
....edit /etc/exports to look something like this:
/mnt/somedrive/subdirectory 10.1.2.0/24(rw,async,no_subtree_check)
/mnt/somedrive/subdirectory is the directory you'd like clients to be able to access, and 10.1.2.0/24 is the IP range that should be allowed to talk to your NFS server.
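After editing /etc/exports, the NFS server needs to be told to re-read it. A quick sketch of that step, using the exportfs tool that comes with nfs-kernel-server:
# Re-export everything listed in /etc/exports
sudo exportfs -ra
# Show what's actually being exported, and with which options
sudo exportfs -v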
Next, open up the relevant ports in your firewall (I use UFW):
sudo ufw allow nfs
....and you're done! Pretty easy, right? Don't worry, it'll get harder later on :P
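If you want to sanity-check the export before setting up any clients, showmount (part of the nfs-common package we'll install below) can query the server's export list. Assuming the relevant ports are open, something like this from another machine should list the exported directory:
# Replace 10.1.2.10 with the IP of your NFS server
showmount -e 10.1.2.10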
NFS: Client
The client, in theory, is relatively straightforward too. This must be done on all nodes in the cluster - except the node that's acting as the NFS server (although having the NFS server as a regular node in the cluster is probably a bad idea). First, install the relevant package:
sudo apt install nfs-common
Then, update /etc/fstab and add the following line:
10.1.2.10:/mnt/somedrive/subdirectory /mnt/shared nfs auto,nofail,noatime,intr,tcp,bg,_netdev 0 0
Again, 10.1.2.10 is the IP of the NFS server, and /mnt/somedrive/subdirectory must match the directory exported by the server. Finally, /mnt/shared is the location on the client that we're going to mount the NFS share to. Speaking of, we should create that directory:
sudo mkdir /mnt/shared
I have yet to properly tune the options there on both the client and the server. If I find that I need to change anything, I'll come back and edit this post - and mention the change in a future post too.
From here, you should be able to mount the NFS share like so:
sudo mount /mnt/shared
You should see the files from the NFS server located in /mnt/shared. You should also check that it auto-mounts on boot (that's what the auto and _netdev options are supposed to do).
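If you'd rather not reboot just to test the new fstab entry, something like this should confirm that it parses and mounts correctly:
# Mount everything in /etc/fstab that isn't already mounted
sudo mount -a
# Check that the share has ended up where we expect
findmnt /mnt/shared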
If you experience issues on boot (like me), you might see something like this buried in /var/log/syslog:
mount[586]: mount.nfs: Network is unreachable
....then we can quickly hack around this by creating a script in the directory /etc/network/if-up.d. Something like this should fix the issue:
#!/usr/bin/env bash
mount /mnt/shared
Save this to /etc/network/if-up.d/cluster-shared-nfs for example, not forgetting to mark it as executable:
sudo chmod +x /etc/network/if-up.d/cluster-shared-nfs
Alternatively, there's autofs, which can do this more intelligently if you prefer.
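I haven't used autofs here myself, but a minimal sketch might look something like this (the filenames /etc/auto.master.d/cluster.autofs and /etc/auto.nfs are just examples):
# /etc/auto.master.d/cluster.autofs - register a direct map
/- /etc/auto.nfs
# /etc/auto.nfs - mount the share on first access
/mnt/shared -fstype=nfs,rw 10.1.2.10:/mnt/somedrive/subdirectory
After a sudo apt install autofs and a sudo systemctl restart autofs, the share should then be mounted on demand rather than at boot.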
First Nomad Job: Docker Registry
Now that we've got shared storage online, it's time for the big moment. We're finally going to start our very first job on our Nomad cluster!
It's going to be a Docker registry, and in my very specific case I'm going to be marking it as insecure (gasp!) because it's only going to be accessible from the WireGuard VPN - which I figure provides the encryption and authentication for us to get started reasonably simply without jumping through too many hoops. In the future, I'll probably revisit this in a later post to tighten things up.
Tasks on a Nomad cluster take the form of a Nomad job file. These can be written in JSON or HCL (Hashicorp Configuration Language). I'll be using HCL here, because it's easier to read and we're not after machine legibility at this stage.
Nomad job files work a little bit like Nginx config files, in that they have nested sequences of blocks in a hierarchical structure. They loosely follow the following pattern:
job > group > task
The job is the top-level block that contains everything else. tasks are the items that actually run on the cluster - e.g. a Docker container. groups are a way to logically group tasks in a job, and are not required as far as I can tell (but we'll use one here anyway just for illustrative purposes). Let's start with the job spec:
job "registry" {
datacenters = ["dc1"]
# The Docker registry *is* pretty important....
priority = 80
# If this task was a regular task, we'd use a constraint here instead & set the weight to -100
affinity {
attribute = "${node.class}"
value = "controller"
weight = 100
}
# .....
}
This defines a new job called registry, and it should be pretty straightforward. We don't need to worry about the datacenters definition there, because we've only got the 1 (so far?). We set a priority of 80, and get the job to prefer running on nodes with the controller class (though I observe that this hasn't actually made much of a difference to Nomad's scheduling algorithm at all).
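If you want to check which class each node is actually advertising (it's set with the node_class option in the client block of the Nomad config), nomad node status will tell you:
# Lists every node in the cluster, including its class
nomad node status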
Let's move on to the real meat of the job file: the task definition!
group "main" {
task "registry" {
driver = "docker"
config {
image = "registry:2"
labels { group = "registry" }
volumes = [
"/mnt/shared/registry:/var/lib/registry"
]
port_map {
registry = 5000
}
}
resources {
network {
port "registry" {
static = 5000
}
}
}
# .......
}
}
There's quite a bit to unpack here. The task itself uses the Docker driver, which tells Nomad to run a Docker container.
In the config block, we define the Docker driver-specific settings. The Docker image we're going to run is registry:2, where registry is the image name and 2 is the tag. This will be automatically pulled from the Docker Hub. Future tasks will pull Docker images from our very own private Docker registry, which we're in the process of setting up :D
We also mount a directory into the Docker container to allow it to persist the images that we push to it. This is done through a volume, which is the Docker word for bind-mounting a specific directory on the host system into a given location inside the guest container. For me, I'm (currently) going to store the Docker registry data at /mnt/shared/registry - you should update this if you want to store it elsewhere. Remember that this needs to be a location on your shared storage, as we don't know in advance which node in the cluster the Docker registry is going to run on.
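It's worth creating that directory on the shared storage ahead of time. Docker will usually create a missing bind-mount source itself, but doing it explicitly avoids any surprises:
# Run this on any node that has the NFS share mounted
sudo mkdir -p /mnt/shared/registry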
The port_map allows us to tell Nomad the port(s) that our service inside the Docker container listens on, and to attach a logical name to them. We can then expose them in the resources block. In this specific case, I'm forcing Nomad to statically allocate port 5000 on the host system to point to port 5000 inside the container, for reasons that will become apparent later. This is done with the static keyword there. If we didn't do this, Nomad would allocate a random port number (which is normally what we'd want, because then we can run lots of copies of the same thing at the same time on the same host).
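For comparison, a dynamically allocated port would look something like this - a sketch only, since we're sticking with the static port here:
resources {
    network {
        # No static keyword: Nomad picks a free port on the host for us
        port "registry" { }
    }
}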
The last block we need to add to complete the job spec file is the service block. With a service block, Nomad will inform Consul that a new service is running, which in turn allows us to query it via DNS.
service {
name = "${TASK}"
tags = [ "infrastructure" ]
address_mode = "host"
port = "registry"
check {
type = "tcp"
port = "registry"
interval = "10s"
timeout = "3s"
}
}
The service name here is pulled from the name of the task. We tell Consul about the port number by specifying the logical name we assigned to it earlier.
Finally, we add a health check, to allow Consul to keep an eye on the health of our Docker registry for us. This will appear as a green tick if all is well in the web interface, which we'll be getting to in a future post. The health check in question simply ensures that the Docker registry is listening via TCP on the port it should be.
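If you want a slightly more thorough check, the Docker registry also exposes an HTTP API under /v2/, so an http check along these lines should work too (I'm sticking with the simpler tcp check here):
check {
    type     = "http"
    path     = "/v2/"
    port     = "registry"
    interval = "10s"
    timeout  = "3s"
}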
Here's the completed job file:
job "registry" {
datacenters = ["dc1"]
# The Docker registry *is* pretty important....
priority = 80
# If this task was a regular task, we'd use a constraint here instead & set the weight to -100
affinity {
attribute = "${node.class}"
value = "controller"
weight = 100
}
group "main" {
task "registry" {
driver = "docker"
config {
image = "registry:2"
labels { group = "registry" }
volumes = [
"/mnt/shared/registry:/var/lib/registry"
]
port_map {
registry = 5000
}
}
resources {
network {
port "registry" {
static = 5000
}
}
}
service {
name = "${TASK}"
tags = [ "infrastructure" ]
address_mode = "host"
port = "registry"
check {
type = "tcp"
port = "registry"
interval = "10s"
timeout = "3s"
}
}
}
// task "registry-web" {
// driver = "docker"
//
// config {
// // We're going to have to build our own - the Docker image on the Docker Hub is amd64 only :-/
// // See https://github.com/Joxit/docker-registry-ui
// image = ""
// }
// }
}
}
Save this to a file, and then run it on the cluster like so:
nomad job run path/to/job/file.nomad
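If you'd like to see what Nomad intends to do before it actually does it, there's also a dry-run mode:
# Shows the scheduling decisions Nomad would make, without applying them
nomad job plan path/to/job/file.nomad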
I'm as yet unsure as to whether Nomad needs the job file to persist on disk, so it's probably best to keep your job files in a permanent place on disk to avoid issues.
Give Nomad a moment to start the job, and then you can check on its status like so:
nomad job status
This will print a summary of the status of all jobs on the cluster. To get detailed information about our new job, do this:
nomad job status registry
It should show that 1 task is running, like this:
ID = registry
Name = registry
Submit Date = 2020-04-26T01:23:37+01:00
Type = service
Priority = 80
Datacenters = dc1
Namespace = default
Status = running
Periodic = false
Parameterized = false
Summary
Task Group Queued Starting Running Failed Complete Lost
main 0 0 1 5 6 1
Latest Deployment
ID = ZZZZZZZZ
Status = successful
Description = Deployment completed successfully
Deployed
Task Group Desired Placed Healthy Unhealthy Progress Deadline
main 1 1 1 0 2020-06-17T22:03:58+01:00
Allocations
ID Node ID Task Group Version Desired Status Created Modified
XXXXXXXX YYYYYYYY main 4 run running 6d2h ago 2d23h ago
Ignore the Failed, Complete, and Lost columns there in my output - I ran into some snags while learning the system and setting mine up :P
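If your job doesn't come up cleanly, the allocation ID in that last table (XXXXXXXX above) is the thing to poke at. Something like this will get you the details and the container's logs:
# Detailed status for a single allocation - swap in the real ID
nomad alloc status XXXXXXXX
# stdout / stderr from the registry task inside that allocation
nomad alloc logs XXXXXXXX registry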
You should also be able to resolve the IP of your Docker registry via DNS:
dig +short registry.service.mooncarrot.space
mooncarrot.space is the root domain I've bought for my cluster. I highly recommend you do the same if you haven't already. Consul exposes all services under the service subdomain, so in the future you should be able to resolve the IP of all your services in the same way: service_name.service.DOMAIN_ROOT.
Take care to ensure that it's showing the right IP address here. In my case, it should be the IP address of the wgoverlay network interface. If it's showing the wrong IP address, you may need to carefully check the configuration of both Nomad and Consul. Specifically, start by checking the network_interface setting in the client block of your Nomad worker nodes from part 7 of this series.
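Once the DNS resolution looks right, a quick end-to-end test is to push a small image to the new registry. Because it's insecure (plain HTTP), each Docker daemon that talks to it first needs it listed under insecure-registries in /etc/docker/daemon.json (followed by a restart of the Docker daemon) - the hostname here is of course specific to my cluster:
# /etc/docker/daemon.json should contain something like:
# { "insecure-registries": [ "registry.service.mooncarrot.space:5000" ] }
docker pull hello-world
docker tag hello-world registry.service.mooncarrot.space:5000/hello-world
docker push registry.service.mooncarrot.space:5000/hello-world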
Conclusion
We're getting there, slowly but surely. Today we've setup shared storage with NFS, and started our first Nomad job. In doing so, we've started to kick the tyres of everything we've installed so far:
- wesher, our WireGuard Mesh VPN
- Unbound, our DNS server
- Consul, our service discovery superglue
- Nomad, our task scheduler
Truly, we are standing on the shoulders of giants: a whole host of open-source software that thousands of people from across the globe have collaborated together to produce which makes this all possible.
Moving forwards, we're going to be putting that Docker registry to good use. More immediately, we're going to be setting up Fabio (whose documentation is only marginally better than Traefik's, but just good enough that I could figure out how to use it....) in order to take a peek at those cool web interfaces for Nomad and Consul that I keep talking about.
We're also going to be looking at setting up Vault for secret (and certificate, if all goes well) management.
Until then, happy cluster configuration! If you're confused about anything so far, please leave a comment below. If you've got a suggestion to make it even better, please comment also! I'd love to know.
Sources and further reading
- Alternatives to NFS
- /etc/network/if-up.d/ NFS automount trick
- autofs
- Docker Registry
- Nomad