ENOSUCHBLOG

Programming, philosophy, pedaling.


This blog is now automatically deployed

Oct 4, 2022     Tags: meta    


This is the first “meta” post I’ve done since 2018, when I updated the blog to support IPv6.

Unlike those changes, these ones aren’t really visible to you (dear reader). But I figured I’d announce them anyways, as both a curiosity and a potential reference for other Jekyll users[1].

Status quo ante nūntius

This is a Jekyll blog; it’s been one since its inception (nearly a decade now).

It used to be hosted on GitHub Pages, but has been self-hosted since 2018. My main reason for that was HTTPS support[2], although control of the entire deployment has afforded me other liberties.

In particular, the blog has grown a couple of extensions (as custom Jekyll plugins):

  centered containers for
       pre-formatted
           text
  

Tables from raw CSV data:

foo bar baz
69 420 1312

…and so forth.

These are all relatively small things, but they add up in terms of productivity and a pleasant blogging experience.

They also add up in terms of dependencies: I couldn’t go back to GitHub Pages even if I wanted to, and a complete rebuild of the blog[3] requires all kinds of third-party utilities (ImageMagick and LaTeX, among them).

As a result, writing for my blog on new machines is a bit of a pain. Thus, the plan: to move blog deployment (the actual final HTML generation and copying to my server) to a CI provider.

Nitty gritty

The current deployment workflow, from my desktop, is a shell script that (more or less) does the following:

# generate the site's HTML
jekyll build

# sync with the remote site
rsync -avz --delete --progress _site/ blog:/var/www/html/blog.yossarian.net/

The goal was to take that and stuff it into GitHub Actions[4]. That meant:

  1. Getting the right Ruby and Jekyll environment up and running in CI;
  2. Figuring out SSH authentication and rsync, including access restrictions.

Build environment

It turns out this was really easy: the ruby/setup-ruby action gave me just about everything I needed, including automatic bundle invocation and caching.

With four lines of YAML, I had all the (Ruby) dependencies I needed:

- uses: ruby/setup-ruby@v1
  with:
    ruby-version: 3.1
    bundler-cache: true

Deployment via SSH with rsync

This part was a little more involved: I wanted to reuse my existing deployment strategy (rsync over SSH), but without sharing an overly broad SSH key to my server.

To begin, I created a brand new ed25519 keypair:

ssh-keygen -t ed25519 -f blog-deploy

On the server side, I used two settings in my .ssh/authorized_keys to restrict the key’s use:

  1. The restrict setting, which disables all of SSH’s normal forwarding techniques as well as PTY allocation and a few sources of command execution, like ~/.ssh/rc (a file I didn’t know about)!

  2. The command= setting, which specifies the singular command that gets run after successful key authentication.

The latter setting composes well with rrsync, or “restricted rsync”. rrsync does exactly what it sounds like it does: it restricts the underlying command (supplied by the SSH daemon via SSH_ORIGINAL_COMMAND) to an rsync invocation with a small handful of restricted flags.

Put together, this results in an .ssh/authorized_keys line that looks a bit like this:

command="/usr/bin/rrsync -wo /var/www/html/blog.yossarian.net/",restrict ssh-ed25519 LONG-KEY-HERE william@blog

The end result: if all functions correctly, someone who manages to steal my CI’s SSH key will be unable to do anything besides execute rsync and, even then, be unable to do anything other than write to the directory my blog is in. In other words, they could deface the blog, but that’s about it.
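To make the command= mechanism a bit more concrete, here is a toy forced command in shell. It is not rrsync, only a sketch of the same idea: sshd puts whatever the client asked to run into SSH_ORIGINAL_COMMAND, and the forced command decides what to do with it. The function name check_forced_command and the sample rsync flag string are made up for illustration.

```shell
#!/usr/bin/env bash
# Toy stand-in for a forced command; NOT rrsync itself.
# sshd sets SSH_ORIGINAL_COMMAND to the command the client requested.

check_forced_command() {
  local requested="${SSH_ORIGINAL_COMMAND:-}"

  case "${requested}" in
    "rsync --server "*)
      # Looks like the server half of an rsync transfer: let it through.
      echo "allow: ${requested}"
      ;;
    *)
      # Anything else (including an interactive login) is refused.
      echo "deny: ${requested:-<interactive login>}"
      return 1
      ;;
  esac
}

# Simulate two clients without involving sshd at all.
# The flag string below is illustrative, not copied from a real session.
SSH_ORIGINAL_COMMAND="rsync --server -vlogDtprze.iLsfxC --delete . ." check_forced_command
SSH_ORIGINAL_COMMAND="cat /etc/passwd" check_forced_command || true
```

The real rrsync goes much further than this prefix check (it vets individual rsync options and confines paths), which is why I use it rather than rolling my own.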

To actually use my new blog-deploy key in GitHub Actions, I tried to start with an ssh-agent setup, but ran into all kinds of problems with passing the right environment variables around. I eventually gave up and did things the silly way:

- name: configure SSH
  env:
    SSH_KEY: $
  run: |
    mkdir -p ~/.ssh/
    echo "${SSH_KEY}" > ~/.ssh/blog-deploy.key
    chmod 600 ~/.ssh/blog-deploy.key

    cat >>~/.ssh/config <<END
    Host blog
        HostName HOST
        User william
        IdentityFile ~/.ssh/blog-deploy.key
        StrictHostKeyChecking no
    END

(I disabled StrictHostKeyChecking because, for these limited purposes, a MITM attack is not a serious concern of mine. An attacker, at worst, would only be able to see the files that will immediately become public anyways.)

From there my deployment script only needed one tweak:

# If we're running in GitHub Actions, rsync is restricted to the right directory.
if [[ -n "${CI}" ]]; then
  rsync -avz --delete --progress _site/ blog:.
else
  rsync -avz --delete --progress _site/ blog:/var/www/html/blog.yossarian.net/
fi

In English: rrsync changes into the directory given on its command line (the path after -wo, which itself means “write-only”) and concatenates any requested path onto that working directory. The original invocation therefore fails in CI, because no such nested directory exists. The “fix” is simply to sync to the remote’s current directory (.) in those cases.
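The failure mode is easier to see with a small model of the path handling. This is a simplification based only on the behavior described above, not rrsync's actual code, and other rrsync versions may treat absolute paths differently. effective_path is a hypothetical helper; restricted_root mirrors the directory from my authorized_keys line.

```shell
#!/usr/bin/env bash
# Simplified model of the path concatenation described above.
# This is NOT rrsync's real implementation.
restricted_root="/var/www/html/blog.yossarian.net"

# effective_path (hypothetical): join the restricted root with whatever
# destination path the rsync client asked for.
effective_path() {
  local requested="${1}"

  case "${requested}" in
    .) echo "${restricted_root}" ;;
    *) echo "${restricted_root}/${requested#/}" ;;
  esac
}

# What CI requests: the restricted root itself.
effective_path "."

# What my desktop invocation requests: the full path, which ends up
# nested underneath the restricted root.
effective_path "/var/www/html/blog.yossarian.net/"
```

The first call prints the blog directory itself; the second prints /var/www/html/blog.yossarian.net/var/www/html/blog.yossarian.net/, which doesn't exist on the server.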

Wrapping it up

Put together, this all results in a relatively tidy GitHub Actions workflow:

on:
  workflow_dispatch:
    inputs:
      dry-run:
        description: "Perform a dry run"
        required: true
        default: "false"

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - uses: ruby/setup-ruby@v1
        with:
          ruby-version: 3.1
          bundler-cache: true

      - name: configure SSH
        env:
          SSH_KEY: $
        run: |
          mkdir -p ~/.ssh/
          echo "${SSH_KEY}" > ~/.ssh/blog-deploy.key
          chmod 600 ~/.ssh/blog-deploy.key

          cat >>~/.ssh/config <<END
          Host blog
              HostName HOST
              User william
              IdentityFile ~/.ssh/blog-deploy.key
              StrictHostKeyChecking no
          END

      - name: deploy
        run: |
          if [[ "$" = "true" ]]; then
            echo "[+] dry run; not deploying"
          else
            echo "[+] deploying"
            ./deploy.sh
          fi

and the deploy.sh:

#!/usr/bin/env bash

set -eo pipefail

installed() {
  cmd=$(command -v "${1}")

  [[ -n "${cmd}" ]] && [[ -f "${cmd}" ]]
  return ${?}
}

bundle check || bundle install

bundle exec jekyll build

[[ -d _site/ ]] || { echo "No _site dir?"; exit 1; }

# If we're running in GitHub Actions, rsync is restricted to the right directory.
if [[ -n "${CI}" ]]; then
  rsync -avz --delete --progress _site/ blog:.
else
  rsync -avz --delete --progress _site/ blog:/var/www/html/blog.yossarian.net/
fi

With all this, I can trigger blog deployments at any time, including dry runs.
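A workflow_dispatch workflow like this one can also be started from a terminal. Here is a minimal sketch, assuming the GitHub CLI (gh) is installed and authenticated, and that the workflow lives in a file called deploy.yml (a hypothetical name). trigger_deploy is likewise a made-up helper, and the GH variable exists only so the command can be swapped out for testing.

```shell
#!/usr/bin/env bash
# Assumptions: `gh` is installed and authenticated for the blog's repository,
# and the workflow file is named deploy.yml (hypothetical).
GH="${GH:-gh}"

# trigger_deploy (hypothetical): start the workflow, passing the
# "dry-run" input defined under workflow_dispatch.
trigger_deploy() {
  local dry_run="${1:-false}"
  "${GH}" workflow run deploy.yml -f "dry-run=${dry_run}"
}
```

Running `trigger_deploy true` would request a dry run, and `trigger_deploy` with no argument would request a real deployment.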

With any luck, this will be a strong foundation for many years of blogging ahead.


  1. If there are any others left. 

  2. I think GitHub Pages supports HTTPS for custom subdomains now, but I haven’t checked. The last time I tried, the “standard” way to achieve that was to use a CDN company’s free tier to provide HTTPS. 

  3. Which I almost never need to do, thanks to asset caching. 

  4. Since GitHub is where the blog’s source is currently stored, and GitHub Actions is the CI platform I know best. 

