Sep 22, 2023 Tags: programming, rant, workflow
I love GitHub Actions: I’ve been a daily user of it since 2019 for both professional and hobbyist projects, and have found it invaluable to both my overall productivity and peace of mind. I’m just old enough to have used Travis CI et al. professionally before moving to GitHub Actions, and I do not look back with joy[1].
By and large, GitHub Actions continues to delight me and grow new features that I appreciate: reusable workflows, OpenID Connect, job summaries, integrations into GitHub Mobile, and so forth.
At the same time, GitHub Actions is a regular source of profound frustration and time loss[2] in my development processes. This post lists some of those frustrations, and how I think GitHub could selfishly[3] improve on them (or even fix them outright)[4].
Here’s a pretty typical session of me trying to set up a release workflow on GitHub Actions:
In this particular case, it took me 4 separate commits (and 4 failed releases) to debug the various small errors I made: not using ${{ ... }}[5] where I needed to, forgetting a needs: relationship, &c.
Here’s another (this time of a PR-creating workflow), from a few weeks later:
I am not the world’s most incredible programmer; like many (most?), I program intuitively and follow the error messages until they stop happening.
GitHub Actions is not responsible for catching every possible error I could make, and ensuring that every workflow I write will run successfully on the first try.
At the same time, the current debugging cycle in GitHub Actions is ridiculous: even the smallest change on the most trivial workflow is a 30+ second process of tabbing out of my development environment (context switch #1), digging through my browser for the right tab (context switch #2), clicking through the infernal nest of actions summaries, statuses, &c. (context switch #3), and impatiently refreshing a buffered console log to figure out which error I need to fix next (context switch #4). Rinse and repeat.
Give us an interactive debugging shell, or (at least) let us re-run workflows with small changes without having to go through a git add; git commit; git push cycle[6].
Give us a repository setting to reject commits with obviously invalid workflows (things like syntax that can’t possibly work, or references to jobs/steps that don’t exist). It’s infuriating when I git push a workflow that silently fails because of invalid YAML, especially when I then merge that workflow’s branch under the mistaken impression that the workflow is passing, rather than not running at all.
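To make the second request concrete, here is a hypothetical workflow (the job names and make targets are invented for illustration) containing the kind of mistake that could be caught statically: the needs: key refers to a job that doesn’t exist anywhere in the file.

```yaml
# Hypothetical release workflow; job names and commands are assumptions.
on:
  push:
    tags: ["v*"]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make dist

  release:
    # Typo: there is no job called "biuld" in this workflow, so this
    # reference can never resolve, no matter what happens at runtime.
    needs: biuld
    runs-on: ubuntu-latest
    steps:
      - run: make release
```

Nothing about this error depends on runtime state, which is why rejecting it at push time seems plausible.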
Speaking from experience: it’s shockingly easy to wreck yourself with GitHub Actions. Way easier than it should be.
Here is just a small handful of the ways in which I have personally written potentially vulnerable workflows over the past few years:
Using the ${{ ... }} expansion syntax in a shell or other context where a (potentially malicious) user controls the expansion’s contents. The following, for example, would allow a user to inject code that could then exfiltrate $MY_IMPORTANT_SECRET:

- name: do something serious
  run: |
    something-serious "${{ inputs.frob }}"
  env:
    MY_IMPORTANT_SECRET: ${{ secrets.MY_IMPORTANT_SECRET }}

Some among you will observe that a ✨good✨ programmer would simply know not to do this, and that a bad programmer would eventually learn their (painful) lesson. This might be an acceptable position for a niche piece of software to hold; it is not an acceptable position for the CI/CD platform that, to a first approximation, hosts the entire open source ecosystem.
Using pull_request_target. As far as I can tell, it’s practically impossible to use this event safely in a non-trivial workflow[7]. This event appears to exist for an extremely narrow intended use case, i.e. labeling or commenting on PRs that come from forks. I don’t understand why GitHub Actions chooses to expose such a (relatively) simple operation through as massive of a foot-gun as pull_request_target.
Over-scoping my workflow and job-level permissions. The default access set for Actions’ ordinary GITHUB_TOKEN is very permissive: the only thing it doesn’t provide access to is the workflow’s OpenID Connect token.
This consistently bites me in two different ways:
I consistently over-scope my tokens because I don’t know exactly how much access my workflow will need.
This is further complicated by the messy ways in which GitHub’s permission model gets shoehorned into a single permissions dimension of read/write/none: why does id-token: write grant me the ability to read the workflow’s OpenID Connect token? Why do some GET operations on security advisories require write, while others only require read?
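For the first and third items above, a minimal sketch of a more defensive version of the same step might look like the following. It assumes the same hypothetical something-serious command and frob input; the job name and the FROB variable are invented for illustration.

```yaml
# Start from no token permissions at all, then opt in per job.
permissions: {}

jobs:
  serious:
    runs-on: ubuntu-latest
    permissions:
      contents: read  # assumption: this job only needs to read the repository
    steps:
      - name: do something serious
        # The input is passed through an environment variable rather than
        # being expanded into the script text, so the shell sees it as data.
        run: |
          something-serious "$FROB"
        env:
          FROB: ${{ inputs.frob }}
          MY_IMPORTANT_SECRET: ${{ secrets.MY_IMPORTANT_SECRET }}
```

This doesn’t fix the underlying problem (nothing stops me from writing the vulnerable form, and something-serious can still mishandle its argument), but it removes the direct shell injection and narrows what the token can be used for.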
There are also a few things that I haven’t done[8], but are scary enough that I think they’re worth mentioning.
For example, can you see what’s wrong with this workflow step?

steps:
  - uses: actions/checkout@c7d749a2d57b4b375d1ebcd17cfbfb60c676f18e

Despite all appearances, SHA ref c7d749a2d57b4b375d1ebcd17cfbfb60c676f18e is not a commit on the actions/checkout repository! It’s actually a commit on a fork in actions/checkout’s network which, thanks to GitHub’s use of alternates, appears to belong to the parent repository.
Chainguard has an excellent post on this[9], but to summarize:
/repos/{user}/{repo}/commits/{ref} returns a JSON response that only references {user}/{repo}, even if {ref} is only on a fork.
GitHub’s response to this (so far) has been to add a little bit of additional language to their documentation, rather than to forbid misleading SHA references outright.
Give us push-time rejection of obviously insecure workflows. In other words: let us toggle[10] a “paranoid workflow security” mode that, when enabled, causes git push to fail with an explanation of what I’m doing wrong. Essentially the same thing as the debugging request above, but for security!
Give us runtime checks on our workflows, analogous to runtime instrumentation like AddressSanitizer in the world of compiled languages. There are so many things that could be turned into hard failures for security wins without breaking 99.9% of legitimate users, like failing any attempt to use actions/checkout on a pull_request_target with a ref that isn’t from the targeted repository.
Maybe just deprecate and remove pull_request_target entirely. GitHub’s own Security Lab has been aware of how dangerous this event is for years; maybe it’s time to get rid of it entirely.
Allow us to set a more restrictive default token scope on our personal repositories, similar to how organizations and enterprises can restrict their default GITHUB_TOKEN scopes across all repositories at once.
By default, reject any SHA-pinned action for which the SHA only appears on a fork and not the referenced repository. It’s hard to imagine a legitimate reason to ever need to do this!
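As a sketch of what such a runtime check could reject, here is the commonly described dangerous combination: a pull_request_target workflow that checks out and then runs the pull request’s own code. The job name and the make test command are assumptions for illustration.

```yaml
# DANGEROUS: shown only to illustrate the pattern; do not copy.
on: pull_request_target

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          # Checks out the (possibly fork-controlled) PR head instead of
          # the target repository's own branch.
          ref: ${{ github.event.pull_request.head.sha }}
      # Runs attacker-controllable code in a context that can have access
      # to the target repository's secrets and a writable token.
      - run: make test
```

This is exactly the checkout that I’d like to see fail loudly.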
When writing a custom GitHub Action, you can specify the action’s inputs using a mapping under the inputs: key. For example, the following defines a frobulation-level input with a description (used for tooltips in many IDEs) and a default value:

inputs:
  frobulation-level:
    description: "the level to frobulate at"
    default: "1"

Notably, this syntax does not allow for type enforcement; the following does not work:

inputs:
  frobulation-level:
    description: "the level to frobulate to"
    default: 1
    # NOTE: this SHOULD cause a workflow failure if the input
    # isn't a valid number, but doesn't
    type: number

This absence is strange, but what makes it bizarre is that GitHub is inconsistent about where types can appear in actions and workflows:
workflow_call supports type with boolean, number, or string
workflow_dispatch supports type with boolean, choice, number, or string
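For reference, a minimal sketch of a workflow fragment with the typed inputs that do work today, reusing the frobulation-level input from above (the dry-run input and the option values are invented for illustration):

```yaml
on:
  workflow_dispatch:
    inputs:
      frobulation-level:
        description: "the level to frobulate at"
        type: choice
        options: ["1", "2", "3"]
        default: "1"
      dry-run:
        description: "skip the actual frobulation"
        type: boolean
        default: false
  workflow_call:
    inputs:
      frobulation-level:
        description: "the level to frobulate at"
        type: number
        default: 1
```

The same logical input ends up declared twice with two different types, because choice is only available on one of the two triggers.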
Unfortunately, this is only the first level: even inputs that do support typing don’t support compound data structures, like lists or objects. For example, neither of the following works:

- uses: example/example
  with:
    # INVALID: can't use arrays as inputs
    paths: [foo, bar, baz]
    # INVALID: can't use objects as inputs
    headers:
      foo: bar
      baz: quux

…which means that action writers end up requiring users to do silly things like these:

- uses: example/example
  with:
    # SILLY: action does ad-hoc CSV-ish parsing
    paths: foo,bar,baz
    # SILLY: action forcefully flattens a natural hierarchy
    header-foo: bar
    header-baz: quux

This is bad for maintainability, and bad for security: maintainability because actions must carefully manage a single flat namespace of inputs (with no types!), and security because both action writer and workflow writer are forced into ad-hoc, unspecified languages for complex inputs.
Let action and workflow writers use type: everywhere, and let us use choice everywhere, not just in workflow_dispatch!
Give us stricter type-checking. Where action and workflow types can be inferred statically, detect errors and reject incorrectly typed workflow changes at push time, rather than waiting for the workflow to inevitably fail.
Give us type: object and type: array types. These won’t be perfect to start with (thanks to potentially heterogeneous interior types), but they’ll be a significant improvement over the status quo. Implementation-wise, forward these as JSON-serialized strings or something similar[11] where appropriate (such as in auto-created INPUT_{WHATEVER} environment variables).
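In the meantime, something close to the JSON approach can be approximated by hand in a reusable workflow. A minimal sketch, assuming a hypothetical paths input that carries a JSON-encoded array as a string (the job name and FROB_PATH variable are also invented):

```yaml
on:
  workflow_call:
    inputs:
      paths:
        description: "JSON-encoded array of paths (hypothetical input)"
        type: string
        default: '["foo", "bar", "baz"]'

jobs:
  frobulate:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        # fromJSON turns the string back into a real array.
        path: ${{ fromJSON(inputs.paths) }}
    steps:
      - run: echo "frobulating $FROB_PATH"
        env:
          FROB_PATH: ${{ matrix.path }}
```

The caller still has to hand-write JSON inside a YAML string, and nothing validates it before the run starts, which is the gap a real type: array would close.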
The third-party ecosystem on GitHub Actions is great: there are a lot of high-quality, easy-to-use actions being maintained by open source contributors. I maintain a handful of them!
Beneath the surface of these excellent third-party actions is a substrate of official, GitHub-maintained actions. These actions primarily address three classes of fundamental CI/CD activities:
git operations: actions/checkout
actions/{upload,download}-artifact, actions/cache, actions/stale
actions/setup-python, actions/setup-node
These classes are somewhat distinct from “higher-level” workflows (like the kind I write): because of their centrality and universal demand, they benefit from singular, high-quality, officially maintained implementations.
And so, the question: why are there so few of them?
Here is just a smattering of the official actions that don’t exist:
gh pr merge already exists. It just isn’t exposed as an action; users are (presumably) expected to piece it together themselves.
Even worse, there are actions that did exist but were deprecated (generally for unclear reasons[12]):
actions/create-release: unmaintained as of March 2021. Users encouraged to switch to various community maintained workflows, most notably[13] softprops/action-gh-release.
actions/upload-release-asset: marked as unmaintained at the same time as actions/create-release.
actions/setup-ruby: unmaintained as of February 2021. Users encouraged to switch to ruby/setup-ruby.
I’m sympathetic to the individual maintainers here and, in each case, the transition to a “recommended” third-party action was relatively painless.
Still, the overall impression given here is unmistakable: that GitHub does not see official actions for its own platform features (or key ecosystem users, like Ruby) as priorities, and would rather have the community develop and choose unofficial favorites. This is not unreasonable on a strategic level (it induces third-party development in their ecosystem), but has a deleterious effect on trust in the platform. I’d like to be able to write workflows and know that they’ll run (with minimal changes) 5 years from now, and not worry that GitHub has abandoned core pieces underneath me!
Apart from imparting a general feeling of shabbiness, this compounds with GitHub Actions’ poor security story (per above): not providing official high-quality actions for their own API surfaces means that users will continue to make exploitable security mistakes in their workflows. Nobody wins[14].
Give us more official actions. As a very rough rule of thumb: if a thing directly ties different pieces of GitHub infrastructure together and currently needs to be done manually (with REST API calls, gh invocations, or whatever else), it probably deserves a full official action!
Give us more pseudo-official actions. Work with the biggest third-party actions[15] to form a community-actions (or whatever) org, with the expectation that actions homed under that org have been reviewed (at some point) by GitHub, are forced to adhere to best practices for repository security, receive semantically versioned updates, &c &c.
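To illustrate the “done manually” case: a minimal sketch of merging a pull request from a workflow by calling gh directly. The trigger, job name, and permission set are assumptions for illustration, and whether auto-merge is available depends on repository settings.

```yaml
on: pull_request

jobs:
  automerge:
    runs-on: ubuntu-latest
    permissions:
      contents: write
      pull-requests: write
    steps:
      - name: merge the PR once its checks pass
        run: gh pr merge --auto --squash "$PR_URL"
        env:
          PR_URL: ${{ github.event.pull_request.html_url }}
          # gh reads its credentials from GH_TOKEN
          GH_TOKEN: ${{ github.token }}
```

An official action would presumably wrap the same call, while taking responsibility for the permissions and edge cases that I currently have to work out myself.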
This is a long and meandering post, and many parts are in conflict: security and stability (in the form of more official actions that break less often), for example, are in eternal conflict with each other.
I’m just one user, and I don’t expect my interests or frustrations to be overriding ones. Still, I hope that the problems (and potential fixes) above aren’t unique to me, and that there are engineers at GitHub who (again, selfishly!) share these concerns and would like to see them fixed.
[1] In large part because, at GitHub’s size, I worry much less about private equity enshittifying it.
[2] Just enough for it to really hurt, against the backdrop of GitHub Actions’ overall productivity benefits.
[3] In the sense that these things would be in GitHub’s own self-interest, making GHA even more appealing to developers, further cementing its dominance in the CI/CD space, &c. They should do these things for their own sake!
[4] After finishing this post, I discovered that GitHub has a public roadmap for Actions features. Maybe some of my grievances are already known and listed here; it’s a big roadmap!
[5] Completely unrelated to this post: writing ${{ ... }} is remarkably painful in a Liquid-rendered Jekyll blog.
[6] Yes, I know this fundamentally breaks the GitHub Actions data model; I didn’t say it would be easy!
[7] In the sense that “using pull_request_target safely” means being confident that you never accidentally run anything from the pull request that just triggered your workflow.
[8] And, I think, haven’t been done to me.
[9] Which I stole the actions/checkout example from, since I was too lazy to make my own.
[10] Even better, make it the default, and require people to click through a “destructive action” modal similar to the ones for other dangerous user or repository setting changes.
[11] JSON is a semi-obvious choice here, since GitHub Actions already has a fromJSON(...) function and maps cleanly from YAML.
[12] The primary stated reason is time, leading to the revelation that these critical actions were side projects. That isn’t these engineers’ fault; they seem to have been making the best out of a bad situation! But it’s incredible to see GitHub, organizationally, squander so much value and community goodwill here.
[13] In my opinion. It seems to have the most users and most activity, although it’s bonkers that I’m evaluating something as critical as this based on those kinds of weak proxy signals.
[14] Except for the pentesting industrial complex.
[15] Off the top of my head: actions like ruby/setup-ruby, shivammathur/setup-php, and peaceiris/actions-gh-pages (among others) have hundreds of thousands of active users, and form a critical part of the Actions ecosystem. They should be treated as such!