Feb 28, 2021 Tags: oss, programming, rant
This post is at least a year old.
This post contains my own opinions, not the opinions of my employer or any open source groups I belong or contribute to.
It’s also been rewritten 2½ times, and (I think) reads confusingly in places. But I promised myself that I’d get it out of the door instead of continuing to sit on it, so here we go.
There’s been a decent amount of debate (read: drama) in the open source community about platform support recently, originating primarily from pyca/cryptography’s decision to use Rust for some ASN.1 parsing routines[1].
To summarize the situation: building the latest pyca/cryptography release from scratch now requires a Rust toolchain. The only current[2] Rust toolchain is built on LLVM, which supports a (relatively) limited set of architectures. Rust further whittles this set down into support tiers, with some targets not receiving automated testing (tier 2) or official builds (tier 3).
By contrast, upstream[3] GCC supports a somewhat larger set of architectures. But C[4], cancer that it is, finds its way onto every architecture with or without GCC’s (or LLVM’s) help, and thereby bootstraps everything else.
Program packagers and distributors (frequently separate from project maintainers themselves) are very used to C’s universal presence. They’re so used to it that they’ve built generic mechanisms for putting entire distributions onto new architectures with only a single assumption: the presence of a serviceable C compiler.
This is the heart of the conflict: Rust (and many other modern, safe languages) uses LLVM for its relative simplicity[5], but LLVM supports neither native compilation on nor cross-compilation to many less popular (read: niche) architectures. Packagers are increasingly finding that one of their oldest assumptions can be easily violated, and they’re not happy about that.
But here’s the problem: it’s a bad assumption. The fact that it’s the default represents an unmitigated security, reliability, and reproducibility disaster.
Imagine, for a moment, that you’re a maintainer of a popular project.
Everything has gone right for you: you have happy users, an active development base, and maybe even corporate sponsors. You’ve also got a CI/CD pipeline that produces canonical releases of your project on tested architectures; you treat any issues with uses of those releases as a bug in the project itself, since you’ve taken responsibility for packaging it.
Because your project is popular, others also distribute it: Linux distributions, third-party package managers, and corporations seeking to deploy their own controlled builds. These others have slightly different needs and setups and, to varying degrees, will build, patch, and ship your project differently than you do.
You don’t know about any of the above until the bug reports start rolling in: users will report bugs that have already been fixed, bugs that you explicitly document as caused by unsupported configurations, bugs that don’t make any sense whatsoever.
You struggle to debug your users’ reports, since you don’t have access to the niche hardware, environments, or corporate systems that they’re running on. You slowly burn out under an unending deluge of already fixed bugs that never seem to make it to your users. Your user base is unhappy, and you start to wonder why you’re putting all this effort into project maintenance in the first place. Open source was supposed to be fun!
What’s the point of this spiel? It’s precisely what happened to pyca/cryptography: nobody asked them whether it was a good idea to try to run their code on HPPA, much less System/390[6]; some packagers just went ahead and did it, and are frustrated that it no longer works. People just assumed that it would, because there is still a norm that everything flows from C, and that any host with a halfway-functional C compiler should have the entire open source ecosystem at its disposal.
Security-sensitive software[8][9], particularly software written in unsafe languages, is never secure in its own right.
The security of a program is a function of its own design and testing, as well as the design, testing, and basic correctness of its underlying platform: everything from the userspace, to the kernel, to the compilers themselves. The latter is an unsolved problem in the very best of cases: bugs are regularly found in even the most mature compilers (Clang, GCC) and their most mature backends (x86, ARM). Tiny changes to or differences in build systems can have profound effects at the binary level, like accidentally removing security mitigations. Seemingly innocuous patches can make otherwise safe code exploitable in the context of other vulnerabilities.
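As a small illustration of how a build-level difference changes a binary, here is a minimal C sketch. The names `table_get` and `checks_enabled` are made up for this example. The standard `assert` macro expands to nothing when `NDEBUG` is defined, so a packager who adds `-DNDEBUG` to the compiler flags ships a binary without the bounds check, even though the source is byte-for-byte identical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lookup helper: the bounds check below exists only in builds
 * compiled WITHOUT -DNDEBUG. With NDEBUG defined, assert() expands to
 * nothing and an out-of-range index reads past the end of the table. */
int table_get(const int *table, size_t len, size_t i) {
    (void)len; /* avoids an unused-parameter warning when assert() vanishes */
    assert(i < len);
    return table[i];
}

/* Reports whether this particular build still contains assert-based checks. */
int checks_enabled(void) {
#ifdef NDEBUG
    return 0;
#else
    return 1;
#endif
}
```

Nothing in the resulting binary is required to record which variant was built, which is part of why downstream builds are hard to reason about from upstream.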
The problem gets worse as we move towards niche architectures and targets that are used primarily by small hobbyist communities. Consider m68k (one of the other architectures affected by pyca/cryptography’s move to Rust): even GCC was considering removing support due to lack of maintenance, until hobbyists stepped in. That isn’t to say that any particular niche target is full of bugs[10]; only to say that it’s a greater likelihood for niche targets in general. Nobody is regularly testing the mountain of userspace code that implicitly forms an operating contract with arbitrary programs on these platforms.
Project maintainers don’t want to chase down compiler bugs on ISAs or systems that they never intended to support in the first place, and aren’t receiving any active support feedback about. They especially don’t want to have vulnerabilities associated with their projects because of buggy toolchains or tooling inertia when working on security improvements.
As someone who likes C: this is all C’s fault. Really.
Beyond language-level unsafety (plenty of people have covered that already), C is organizationally unsafe:
There’s no standard way to write tests for C.
Functional and/or unit tests alone would go a long way in assuring baseline correctness on weird architectures or platforms, but the cognitive overhead of testing C and getting those tests running ensures that well-tested builds of C programs will continue to be the exception, rather than the rule.
There’s no standard way to build C programs.
Make is fine, but it’s not standard. Disturbingly large swathes of critical open source infrastructure are compiled using a hodgepodge of Make, autogenerated rules from autotools, and the maintainer’s boutique shell scripts. One consequence of this is that C builds tend to be flexible to a fault: prospective packagers can inject all sorts of behavior-modifying flags that may not be attested directly in the compiled binary or other build products. The result: it’s almost impossible to prove that two separate builds on different machines are the same, which means more maintainer pain.
There’s no standard way to distribute C programs.
Yes, I know that package managers exist. Yes, I know how to statically link. Yes, I know how to vendor libraries and distribute self-contained program “bundles”. None of these are or amount to a complete standard, and each introduces additional logistical or security problems.
There’s no such thing as truly cross-platform C.
The C abstract machine, despite looking a lot like a PDP-11, leaks the underlying memory and ordering semantics of the architecture being targeted. The result is that even seasoned C programmers regularly rely on architecture-specific assumptions when writing ostensibly cross-platform code: assumptions about the atomicity of reads and writes, operation ordering, coherence and visibility in self-modifying code, the safety and performance of unaligned accesses, and so forth. Each of these, apart from being a potential source of unsafety, is impossible to detect statically in the general case: they are, after all, perfectly correct (and frequently intended!) on the programmer’s host architecture.
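The last point can be sketched in a few lines of C. Both functions below are hypothetical illustrations (the names are not from any real project) of reading a 32-bit little-endian value out of a byte buffer. The first bakes in two host assumptions, that unaligned loads are acceptable and that the host is little-endian; the second makes neither.

```c
#include <stdint.h>

/* Fragile: casting the byte pointer assumes that the host tolerates
 * unaligned 32-bit loads and is little-endian. It tends to "work" on
 * x86-64, but it breaks C's alignment and aliasing rules and can trap or
 * return byte-swapped values on other architectures. */
uint32_t read_u32_fragile(const unsigned char *p) {
    return *(const uint32_t *)p;
}

/* Portable: assemble the little-endian value one byte at a time, so that
 * neither alignment nor host byte order matters. */
uint32_t read_u32_le(const unsigned char *p) {
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Given the bytes `01 02 03 04`, `read_u32_le` yields `0x04030201` regardless of the host. The fragile version only does so when the host happens to match its author's machine, and no compiler is obliged to warn about it.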
By contemporary programming language standards, these are conspicuous gaps in functionality: we’ve long since learned to bake testing, building, distribution, and sound abstract machine semantics into the standard tooling for languages (and language design itself). But their absence is doubly pernicious: it ensures that C remains both a perpetually unsafe development ecosystem and an appealing target when bootstrapping a new platform.
The project maintainer isn’t the only person hurting in the status quo.
Everything stated above also leads to a bum job for the lowly package maintainer[11]. They’re (probably) also an unpaid open source hobbyist, and they’re operating with constraints that the upstream isn’t likely to immediately understand:
glibc or x86-64 CPUs without modern extensions
They also have to deal with users who are unsympathetic to those reports, and who:
All of this leads to package maintainer burnout[12], and an (increasingly) adversarial relationship between projects and their downstream distributors. Neither of those bodes well for projects, the health of critical packaging ecosystems, or (most importantly of all) the users themselves.
I am just barely conceited enough to think that my potential solutions are worth broadcasting to the world. Here they are.
Build systems are a mess; I’ve talked about their complexity in a professional setting.
A long-term solution to the problem of support for platforms not originally considered by project authors is going to be two-pronged:
Builds need to be observable and reviewable: project maintainers should be able to get the exact invocations and dependencies that a build was conducted with and perform automatic triaging of build information. This will require environment and ecosystem-wide changes: object and packaging formats will need to be updated; standards for metadata and sharing information from an arbitrary distributor to a project will need to be devised. Reasonable privacy concerns about the scope of information and its availability will need to be addressed.
Reporting needs to be better directed: individual (minimally technical!) end users should be able to figure out what exactly is failing and who to phone when it falls over. That means rigorously tracking the patches that distributors apply (see build observability above) and creating mechanisms that deliver information to the people who need it. Those same mechanisms need to allow for interaction: there’s nothing worse than a flood of automated bug reports with insufficient context[13].
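To give a rough feel for build observability, here is a minimal C sketch of a binary that can describe how it was built. It is an illustration under stated assumptions, not a proposal for the standards described above: `build_report` and the `BUILD_*` macros are hypothetical names, and the predefined macros they inspect (`__VERSION__`, `__OPTIMIZE__`, `__SSP__` and friends) are GCC/Clang conventions rather than anything portable.

```c
#include <stdio.h>

/* Each BUILD_* macro captures one signal that GCC and Clang expose through
 * predefined macros; other compilers fall through to the defaults. */
#ifdef __VERSION__
#define BUILD_COMPILER __VERSION__
#else
#define BUILD_COMPILER "unknown"
#endif

#ifdef __OPTIMIZE__
#define BUILD_OPTIMIZED 1
#else
#define BUILD_OPTIMIZED 0
#endif

#ifdef NDEBUG
#define BUILD_NDEBUG 1
#else
#define BUILD_NDEBUG 0
#endif

#ifdef _FORTIFY_SOURCE
#define BUILD_FORTIFY _FORTIFY_SOURCE
#else
#define BUILD_FORTIFY 0
#endif

#if defined(__SSP__) || defined(__SSP_STRONG__) || defined(__SSP_ALL__)
#define BUILD_STACK_PROTECTOR 1
#else
#define BUILD_STACK_PROTECTOR 0
#endif

/* Hypothetical helper: writes a one-line summary of how this binary was
 * built into buf and returns snprintf's result. */
int build_report(char *buf, size_t len) {
    return snprintf(buf, len,
                    "compiler=%s optimized=%d ndebug=%d fortify=%d ssp=%d",
                    BUILD_COMPILER, BUILD_OPTIMIZED, BUILD_NDEBUG,
                    (int)BUILD_FORTIFY, BUILD_STACK_PROTECTOR);
}
```

A bug report that included this one line would already tell a maintainer whether a distributor had dropped a mitigation or swapped the compiler. It says nothing about applied patches or dependency versions, which is where the ecosystem-wide changes come in.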
Rust certainly isn’t the first ecosystem to provide different support tiers, but it does a great job of it:
Tiers are explicitly enumerated and documented. If you’re in a particular tier bucket, you know exactly what you’re getting, what’s guaranteed about it, and what you’ll need to do on your own.
Official builds provide transitive guarantees: they can carry patches to the compiler and other components without needing the entire system to be patched. Carrying patches still isn’t great, but it currently isn’t avoidable.
Tiers are baked into the tooling itself: you can’t use rustup on DEC Alpha and (incorrectly) expect to pull down a mature, tested toolchain. You can’t because it would be a lie. This is in contrast to the C paradigm, where an un(der)-tested compiler will happily be under-checked by a big blob of autotools shell, producing a build of indeterminate correctness.
Expectations are managed. This point is really just a culmination of the first three: with explicit tiers, there’s no more implicit guarantee that a minimally functional build toolchain entails fully functional and supported software. Users can be pointed to a single page that tells them that they’re doing something that nobody has tried to (or currently wants to) support, and that lays out their options: help out, fund the project, nag their employer, &c.
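Nothing stops an individual C project from borrowing the idea. The sketch below is a hypothetical tier policy for an imaginary project: the architecture lists are illustrative and are not Rust's actual tier assignments, and `PROJECT_TIER`, `PROJECT_ALLOW_UNSUPPORTED`, and `support_statement` are invented names. It relies on the architecture macros that GCC and Clang predefine.

```c
/* Hypothetical tier policy for an imaginary project; the architecture lists
 * are illustrative and are NOT Rust's actual tier assignments. */
#if defined(__x86_64__) || defined(__aarch64__)
#define PROJECT_TIER 1 /* built and tested in CI */
#elif defined(__i386__) || defined(__arm__) || defined(__riscv)
#define PROJECT_TIER 2 /* built, but not tested */
#else
#define PROJECT_TIER 3 /* nobody has tried it */
#endif

/* Refuse to build silently on tier 3: the packager must opt in explicitly. */
#if PROJECT_TIER == 3 && !defined(PROJECT_ALLOW_UNSUPPORTED)
#error "Unsupported target: define PROJECT_ALLOW_UNSUPPORTED to build without support."
#endif

/* Returns a human-readable statement of what this build promises. */
const char *support_statement(void) {
    switch (PROJECT_TIER) {
    case 1:  return "tier 1: built and tested by the project";
    case 2:  return "tier 2: builds, but is not tested by the project";
    default: return "tier 3: unsupported, built at the packager's own risk";
    }
}
```

The point of the `#error` is the same as rustup's refusal: a build on an untested target has to be a deliberate choice by the packager, not a side effect of a C compiler happening to exist.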
I put this one last because it’s flippant, but it’s maybe the most important one: outside of hobbyists playing with weird architectures for fun (and accepting the overwhelming likelihood that most projects won’t immediately work for them), open source groups should not be unconditionally supporting the ecosystem for a large corporation’s hardware and/or platforms.
Companies should be paying for this directly: if pyca/cryptography actually broke on HPPA or IA-64, then HP or Intel or whoever should be forking over money to get it fixed or using their own horde of engineers to fix it themselves. No free work for platforms that only corporations are using[14]. No, this doesn’t violate the open-source ethos[15]; nothing about OSS says that you have to bend over backwards to support a corporate platform that you didn’t care about in the first place.
[1] For the unfamiliar: ASN.1 is a big, messy IDL and serialization format that has historically been a major source of easily exploitable bugs in cryptographic software. Cryptographic protocols regularly parse untrusted ASN.1; rewriting any amount of ASN.1 handling in a safe language (such as Rust) confers significant security benefits.
[2] There’s a work-in-progress GCC frontend for Rust, but it can’t compile meaningful programs yet (as of writing). There’s also cranelift, which may break Rust’s dependency on LLVM in the future, but doesn’t support nearly as many targets yet.
[3] There are too many vendor- and platform-specific versions of GCC in the wild to count.
[4] Mentally replace “C” with “C and/or C++” in various parts of this post.
[5] It cannot be overstated just how important LLVM has been to the last decade or so of language research and development, and just how easy it’s made that work. But that’s a topic for an entirely separate post.
[6] That’s the original S/390, mind you, not the 64-bit “s390x” (also known as z/Architecture). Think about your own C projects for a minute: are you willing to bet that they perform correctly on a 31-bit architecture that even Linux doesn’t support anymore?
[7] With apologies to Ken Thompson.
[8] Newsflash: all software is security sensitive.
[9] I’m also conflating security and reliability here, which is potentially contentious. Maybe a future post.
[10] Although m68k does seem to have its fair share. In a twist of irony: the GCC maintainers can’t repro some of the reports, since Debian may have patched the compiler!
[11] Including yours truly.
[12] Look at the maintainer turnover rate and/or unmaintained package ratio for your manager of choice. Either stat is probably higher than you’d expect.
[13] And no, there’s no way to guess at the right amount of context ahead-of-time. A coredump doesn’t always cut it, and it probably wouldn’t be very cool of us to image the whole user’s machine.
In my extremely humble opinion.