Own your code, part 5: git post-receive hook
In the last post, I took a deep dive into the master git-repo job that powers my entire build system. In the next few posts, I'm going to take a look at the bits around the edges that interact with this Laminar job - starting with the git post-receive hook in this post.
When you push commits to a git repository, the remote server does a bunch of work to integrate your changes into the remote master copy of the repository. At various points in the process, git allows you to run scripts to augment your repository, and potentially alter the way git ultimately processes the push. You can send content back to the pushing user too - which is how you get those messages on the command-line occasionally when you push to a GitHub repository.
In our case, we want to queue a new Laminar CI job when new commits are pushed to a private Gitea server (like mine, for instance). Doing this isn't particularly difficult, but we do need to collect a bunch of information about the environment we're running in so that we can correctly inform the git-repo task where it needs to pull the repository from, who pushed the commits, and which commits need testing.
In addition, we want to write a single universal git post-receive hook script that will work everywhere - regardless of the server the repository is hosted on. Of course, on GitHub you can't run a script directly, but if I ever come into contact with another git server that supports hooks, I want to minimise the amount of extra work I've got to do to hook it up.
Let's jump into the script:
#!/usr/bin/env bash
if [ "${GIT_HOST}" == "" ]; then
    GIT_HOST="git.starbeamrainbowlabs.com";
fi
Fairly standard stuff. Here we set a shebang and give the GIT_HOST variable a default value if it's not set already. This is mainly just a placeholder for the future, as explained above.
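As an aside, the same default can also be expressed with Bash's assign-default parameter expansion. This is only a minimal sketch of that alternative form, not what the hook itself uses, and git.example.com is a made-up value purely for illustration:

```shell
#!/usr/bin/env bash

# Case 1: GIT_HOST is unset, so the default is assigned.
# The leading ':' is the no-op builtin; it just gives the expansion somewhere to run.
unset GIT_HOST
: "${GIT_HOST:=git.starbeamrainbowlabs.com}"
host_when_unset="${GIT_HOST}"

# Case 2: GIT_HOST was already set (made-up value), so it is left alone.
GIT_HOST="git.example.com"
: "${GIT_HOST:=git.starbeamrainbowlabs.com}"
host_when_set="${GIT_HOST}"

echo "unset -> ${host_when_unset}"
echo "set   -> ${host_when_set}"
```

Both forms treat an empty value the same as an unset one, so which to use is mostly a matter of taste.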
Next, we determine the git repository's url, because I'm not sure that Gitea (my git server, for which this script is intended) actually tells you directly in a git post-receive hook. The post-receive hook script does actually support HTTPS, but this support isn't currently used and I'm unsure how the git-repo Laminar CI job would handle a HTTPS url:
# The url of the repository in question. SSH is recommended, as then you can use a deploy key.
# SSH:
GIT_REPO_URL="git@${GIT_HOST}:${GITEA_REPO_USER_NAME}/${GITEA_REPO_NAME}.git";
# HTTPS:
# GIT_REPO_URL="https://git.starbeamrainbowlabs.com/${GITEA_REPO_USER_NAME}/${GITEA_REPO_NAME}.git";
With the repository url determined, next on the list is the identity of the pusher. At this stage it's a simple matter of grabbing the value of one variable and putting it in another, as we're only supporting Gitea at the moment, but in the future we may have some logic here to intelligently determine this value.
GIT_AUTHOR="${GITEA_PUSHER_NAME}";
With the basics taken care of, we can start getting to the more interesting bits. Before we do that though, we should define a few common settings:
###### Internal Settings ######
version="0.2";
# The job name to queue.
job_name="git-repo";
###############################
job_name refers to the name of the Laminar CI job that we should queue to process new commits. version is a value that we can increment should we iterate on this script in the future, so that we can then tell which repositories have the new version of the post-receive hook and which ones don't.
Next, we need to calculate the virtual name of the repository. This is used by the git-repo job to generate a 'hologram' copy of itself that acts differently, as explained in the previous post. This is done through a series of Bash transformations on the repository URL:
# 1. Make lowercase
repo_name_auto="${GIT_REPO_URL,,}";
# 2. Trim git@ & .git from url
repo_name_auto="${repo_name_auto/git@}";
repo_name_auto="${repo_name_auto/.git}";
# 3. Replace unknown characters to make it 'safe'
repo_name_auto="$(echo -n "${repo_name_auto}" | tr -c '[:lower:]' '-')";
The result is quite like 'slugification'. For example, this URL:
git@git.starbeamrainbowlabs.com:sbrl/Linux-101.git
...will get turned into this:
git-starbeamrainbowlabs-com-sbrl-linux----
I actually forgot to allow digits in step #3, but it's a bit awkward to change it at this point :P Maybe at some later time when I'm feeling bored I'll update it and fiddle with Laminar's data structures on disk to move all the affected repositories over to the new naming scheme.
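For reference, here's a rough sketch of what a digit-preserving version of step #3 might look like. The slugify_repo_url function name is hypothetical (the hook above doesn't define it), and switching to this would change the generated names for existing repositories, which is exactly the awkward bit:

```shell
#!/usr/bin/env bash

# Hypothetical helper: the same three steps as above, but digits survive step 3.
slugify_repo_url() {
    local slug="${1,,}"    # 1. Make lowercase
    slug="${slug/git@}"    # 2. Trim git@ ...
    slug="${slug/.git}"    #    ... and .git from the url
    # 3. Replace anything that isn't a lowercase letter or a digit
    printf '%s' "${slug}" | tr -c '[:lower:][:digit:]' '-'
}

slugify_repo_url "git@git.starbeamrainbowlabs.com:sbrl/Linux-101.git"; echo
# → git-starbeamrainbowlabs-com-sbrl-linux-101
```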
Now that we've got everything in place, we can start to process the commits that the user has pushed. The documentation on how this is done in a post-receive hook is a bit sparse, so it took some experimenting before I had it right. Turns out that the information we need is provided on the standard input, so a while-read loop is needed to process it:
while read next_line
do
    # .....
done
Each line on the standard input contains 3 space-separated values:
- The old commit reference (i.e. the commit the reference pointed to before the push)
- The new commit reference (i.e. the commit it points to after the push)
- The name of the reference (usually the branch that the commit being pushed is on)
Commits on multiple branches can be pushed at once, so the name of the branch each commit is being pushed to is kind of important.
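To make the three values a bit more concrete, here's a small sketch that interprets them. The describe_push_line function and the sample hashes are made up for illustration. It relies on two bits of git behaviour: branch references are named refs/heads/<branch>, and a deleted reference is reported with an all-zero new value. The hook in this post doesn't special-case deletions, so treat that check as an optional extra rather than part of the script:

```shell
#!/usr/bin/env bash

# Hypothetical helper: describe one "oldref newref refname" line from stdin.
describe_push_line() {
    local oldref="$1" newref="$2" refname="$3"
    # An all-zero new value means the ref was deleted, so there is nothing to build.
    if [[ "${newref}" =~ ^0+$ ]]; then
        echo "skip ${refname} (deleted)"
        return 0
    fi
    # Branch refs look like refs/heads/<branch>; strip the prefix for display.
    echo "build ${refname#refs/heads/} at ${newref}"
}

# Two made-up lines: a push to one branch, and the deletion of another.
zeros="0000000000000000000000000000000000000000"
describe_push_line "1111111111111111111111111111111111111111" "2222222222222222222222222222222222222222" "refs/heads/main"
describe_push_line "3333333333333333333333333333333333333333" "${zeros}" "refs/heads/old-feature"
```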
Anyway, I pull these into variables like so:
oldref="$(echo "${next_line}" | cut -d' ' -f1)";
newref="$(echo "${next_line}" | cut -d' ' -f2)";
refname="$(echo "${next_line}" | cut -d' ' -f3)";
I think there's some clever Bash trick I've used elsewhere that allows you to pull them all in at once in a single line, but I believe I implemented this before I discovered that trick.
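A likely candidate for that trick is that the read builtin accepts several variable names at once and splits the line across them. Here's a minimal sketch with made-up input lines standing in for what git sends on the standard input:

```shell
#!/usr/bin/env bash

# Two made-up input lines in the "oldref newref refname" format.
sample_input="aaa111 bbb222 refs/heads/main
ccc333 ddd444 refs/heads/dev"

# read splits each line on whitespace and assigns the fields in order;
# -r stops backslashes from being treated as escape characters.
parsed=()
while read -r oldref newref refname; do
    parsed+=("${refname}: ${oldref} -> ${newref}")
done <<< "${sample_input}"

printf '%s\n' "${parsed[@]}"
```

This avoids spawning three subshells and three cut processes per line, though for a hook that handles a handful of lines per push the difference is cosmetic.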
With that all in place, we can now (finally) queue the Laminar CI job. This is quite a monster, as it needs to pass a considerable number of variables to the git-repo job itself:
LAMINAR_HOST="127.0.0.1:3100" LAMINAR_REASON="Push from ${GIT_AUTHOR} to ${GIT_REPO_URL}" laminarc queue "${job_name}" GIT_AUTHOR="${GIT_AUTHOR}" GIT_REPO_URL="${GIT_REPO_URL}" GIT_COMMIT_REF="${newref}" GIT_REF_NAME="${refname}" GIT_REPO_NAME="${repo_name_auto}";
Laminar CI's management socket listens on the abstract unix socket laminar (IIRC). Since you can't yet forward abstract sockets over SSH with OpenSSH, I opt to use a TCP socket instead. To this end, the LAMINAR_HOST prefix there is needed to tell laminarc where to find the management socket that it can use to talk to the Laminar daemon, laminard - since Gitea and Laminar CI run on different servers.
The LAMINAR_REASON there is the message that is displayed in the Laminar CI web interface. Said interface is read-only (by design), but very useful for inspecting what's going on. Messages like this add context as to why a given job was triggered.
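If that one-liner is hard to scan, one option is to collect the job parameters in a Bash array first. The sketch below is just that idea under a few assumptions: laminarc is replaced with a stub function that prints its arguments (so it runs without a Laminar server), and the variable values are examples rather than anything a real push produced:

```shell
#!/usr/bin/env bash

# Stand-in for the real laminarc so this sketch runs anywhere; it just prints its arguments.
laminarc() { printf '%s\n' "$@"; }

# Example values; in the hook these come from the code earlier in the script.
job_name="git-repo"
GIT_AUTHOR="sbrl"
GIT_REPO_URL="git@git.starbeamrainbowlabs.com:sbrl/Linux-101.git"
repo_name_auto="git-starbeamrainbowlabs-com-sbrl-linux----"
newref="bbb222"
refname="refs/heads/main"

# One parameter per line is easier to read (and to spot duplicates in).
job_params=(
    "GIT_AUTHOR=${GIT_AUTHOR}"
    "GIT_REPO_URL=${GIT_REPO_URL}"
    "GIT_COMMIT_REF=${newref}"
    "GIT_REF_NAME=${refname}"
    "GIT_REPO_NAME=${repo_name_auto}"
)

# The environment prefixes apply only to this one command.
queued_args="$(
    LAMINAR_HOST="127.0.0.1:3100" \
    LAMINAR_REASON="Push from ${GIT_AUTHOR} to ${GIT_REPO_URL}" \
    laminarc queue "${job_name}" "${job_params[@]}"
)"
echo "${queued_args}"
```

With the stub removed, the same invocation would hand the parameters to the real laminarc exactly as the one-liner does.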
Lastly, we should send a message to the pushing user, to let them know that a job has been queued. This can be done with a simple echo, as the standard output is sent back to the client:
echo "[Laminar git hook ${version}] Queued Laminar CI build (${job_name} -> ${repo_name_auto}).";
Note that we display the version number of the post-receive hook here. This is how I tell whether I need to go into the Gitea settings to update the hook or not.
With that, the post-receive hook script is complete. It takes a bunch of information lying around, transforms it into a common universal format, and then passes the information on to my continuous integration system - which is then responsible for building the code itself.
Here's the completed script:
#!/usr/bin/env bash
##############################
########## Settings ##########
##############################
# Useful environment variables (gitea):
# GITEA_REPO_NAME Repository name
# GITEA_REPO_USER_NAME Repo owner username
# GITEA_PUSHER_NAME The username that pushed the commits
# GIT_HOST Domain name the repo is hosted on. Default: git.starbeamrainbowlabs.com
if [ "${GIT_HOST}" == "" ]; then
    GIT_HOST="git.starbeamrainbowlabs.com";
fi
# The url of the repository in question. SSH is recommended, as then you can use a deploy key.
# SSH:
GIT_REPO_URL="git@${GIT_HOST}:${GITEA_REPO_USER_NAME}/${GITEA_REPO_NAME}.git";
# HTTPS:
# GIT_REPO_URL="https://git.starbeamrainbowlabs.com/${GITEA_REPO_USER_NAME}/${GITEA_REPO_NAME}.git";
# The user that pushed the commits
GIT_AUTHOR="${GITEA_PUSHER_NAME}";
##############################
###### Internal Settings ######
version="0.2";
# The job name to queue.
job_name="git-repo";
###############################
# 1. Make lowercase
repo_name_auto="${GIT_REPO_URL,,}";
# 2. Trim git@ & .git from url
repo_name_auto="${repo_name_auto/git@}";
repo_name_auto="${repo_name_auto/.git}";
# 3. Replace unknown characters to make it 'safe'
repo_name_auto="$(echo -n "${repo_name_auto}" | tr -c '[:lower:]' '-')";
while read next_line
do
    oldref="$(echo "${next_line}" | cut -d' ' -f1)";
    newref="$(echo "${next_line}" | cut -d' ' -f2)";
    refname="$(echo "${next_line}" | cut -d' ' -f3)";
    # echo "********";
    # echo "oldref: ${oldref}";
    # echo "newref: ${newref}";
    # echo "refname: ${refname}";
    # echo "********";
    LAMINAR_HOST="127.0.0.1:3100" LAMINAR_REASON="Push from ${GIT_AUTHOR} to ${GIT_REPO_URL}" laminarc queue "${job_name}" GIT_AUTHOR="${GIT_AUTHOR}" GIT_REPO_URL="${GIT_REPO_URL}" GIT_COMMIT_REF="${newref}" GIT_REF_NAME="${refname}" GIT_REPO_NAME="${repo_name_auto}";
    # GIT_REF_NAME and GIT_AUTHOR are used for the LAMINAR_REASON when the git-repo task recursively calls itself
    # GIT_REPO_NAME is used to auto-name hologram copies of the git-repo.run task when recursing
    echo "[Laminar git hook ${version}] Queued Laminar CI build (${job_name} -> ${repo_name_auto}).";
done
#cat -;
# YAY what we're after is on the first line of stdin! :D
# The format appears to be documented here: https://git-scm.com/book/en/v2/Customizing-Git-Git-Hooks#_server_side_hooks
# Line format:
# oldref newref refname
# There may be multiple lines that all need handling.
In the next post, I want to finally introduce my very own home-brew build engine: lantern. I've used it in over half a dozen different projects by now, so it's high time I talked about it a bit more formally.
Found this interesting? Spotted a mistake? Got a suggestion to improve it? Comment below!