PGP signatures on PyPI: worse than useless

TL;DR: A large number of PGP signatures on PyPI can’t be correlated to any well-known PGP key and, of the signatures that can be correlated, many are generated from weak keys or malformed certificates. The results suggest widespread misuse of GPG and other PGP implementations by Python packagers, with said misuse being encouraged by the PGP ecosystem’s poor defaults, opaque and user-hostile interfaces, and outright dangerous recommendations.

Preword

I’ve been sitting on this post for a few months, in part because of travel and in part because its (intended) scope was beginning to reflect PGP’s own fractal complexity.

The version that I’m publishing now has been significantly pared down to remove extended digressions on how bad PGP’s packet format is, all the different ways in which a signature or certificate packet can be broken, incorrectly bound, &c.

I’ve removed those things because I think the results, as presented, are sufficient evidence for the actual claims I’d like to make, namely:

Background

PyPI has supported PGP signatures in some form or another for a very long time².

To this day, PGP is still (minimally) supported: package uploaders can still sign their package distributions and upload the resulting .asc to PyPI for inclusion in the index. The official uploading utility even supports invoking gpg directly via the --sign and --sign-with arguments!

To a novice Python programmer looking to publish their first package to PyPI, this might give the following impressions:

The third is harder to immediately refute: PyPI still hosts signatures, after all. Absent any other information, it’s entirely possible that companies and end users are quietly and diligently verifying whatever signatures are present, using trust sets, tracking revoked and expired keys, and so forth.

Methodology

Relatively early in the process I decided not to collect every single signature on PyPI, for two main reasons:

Given these considerations, I decided to limit my analysis to only signatures uploaded to PyPI on or after 2020-03-27. I chose that date somewhat arbitrarily³ while also satisfying a few constraints:

Actually retrieving the signatures was a multi-step process. To start, I used PyPI’s BigQuery dataset to give me some basic metadata on every distribution file with an associated signature:

This produced 52900 distributions uploaded since 2020-03-27 for which PyPI also had a signature (subtract 1 for the CSV header):

From here, I needed to retrieve each release distribution’s detached signature, i.e. the adjacent .asc URL in PyPI’s object storage.

I initially did this with the “conveyor” service, which turns PEP 491 names into URLs like so:

However, this was pretty lossy: for whatever reason⁴ my URLs were slightly off about 20% of the time, resulting in lots of missed signatures. I eventually realized that the BigQuery dataset also includes the Blake2 digest for each distribution, meaning that I could use the actual package URLs instead:
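A minimal sketch of that digest-based URL construction, assuming the content-addressed layout that PyPI’s file host uses (the first two hex characters of the Blake2b-256 digest, then the next two, then the remainder, then the filename). The function names are illustrative and not taken from the original scripts:

```python
# Sketch only: helper names are hypothetical, not from the post's own code.
FILES_HOST = "https://files.pythonhosted.org"


def distribution_url(blake2_256_digest: str, filename: str) -> str:
    """Build the content-addressed URL for a distribution file."""
    digest = blake2_256_digest.lower()
    # A Blake2b-256 digest is 32 bytes, i.e. 64 hex characters.
    if len(digest) != 64 or any(c not in "0123456789abcdef" for c in digest):
        raise ValueError(f"not a Blake2b-256 hex digest: {blake2_256_digest!r}")
    return f"{FILES_HOST}/packages/{digest[:2]}/{digest[2:4]}/{digest[4:]}/{filename}"


def signature_url(blake2_256_digest: str, filename: str) -> str:
    """The detached signature sits next to the file, with .asc appended."""
    return distribution_url(blake2_256_digest, filename) + ".asc"
```

Because the URL is derived from the digest and the exact uploaded filename, no guessing about name normalization or distribution type is needed.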

From here, I wanted to figure out (roughly) how many unique keys produced these ~50k signatures. I decided to use PGPy⁵ for that; excerpted from dists-by-keyid.py:

This left me with a big map of PGP key IDs⁶ to a list of distributions signed by them, including 26 distributions whose signatures PGPy couldn’t parse:

This is a tiny failure rate (26 distributions out of 52900, or roughly 0.05%), but it sets the tone for the rest of the post.

Package name	Distribution count
agraph-python	2
excerpt-html	4
lektor-index-pages	6
lektor-expression-type	2
lektor-git-timestamp	2
lektor-datetime-helpers	3
lektor-limit-dependencies	2
lektorlib	2
lektor-polymorphic-type	3

Apart from these 26 failures, the remaining 52874 signatures were produced from 1067 “unique”⁷ PGP keys.
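The grouping step described above can be sketched roughly as follows. This is not the excerpt from dists-by-keyid.py; it is a small reconstruction under the assumption that each detached signature has been saved to a local file. It relies on PGPy’s `PGPSignature.from_file` and the signature’s `signer` property; `group_by_key_id` and `pgpy_key_id` are hypothetical names:

```python
from collections import defaultdict
from pathlib import Path
from typing import Callable, Iterable


def pgpy_key_id(sig_path: Path) -> str:
    """Return the issuer key ID recorded in a detached signature, via PGPy."""
    import pgpy  # third-party; imported lazily so the grouping logic runs without it

    return pgpy.PGPSignature.from_file(str(sig_path)).signer


def group_by_key_id(
    sig_paths: Iterable[Path],
    extract: Callable[[Path], str] = pgpy_key_id,
) -> tuple[dict[str, list[Path]], list[Path]]:
    """Map key IDs to the signatures they produced; collect parse failures."""
    by_key: dict[str, list[Path]] = defaultdict(list)
    unparseable: list[Path] = []
    for path in sig_paths:
        try:
            key_id = extract(path)
        except Exception:
            # Parsers fail in many different ways on odd signatures;
            # record the file instead of aborting the whole run.
            unparseable.append(path)
            continue
        by_key[key_id].append(path)
    return dict(by_key), unparseable
```

Keeping the failures in a separate list, instead of silently dropping them, is what makes a count like the 26 unparseable signatures visible at all.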

Results

At this point, I had 1067 unique key IDs, each of which needed to be retrieved from a keyserver.

My expectation was that this wouldn’t be a significant challenge, despite the widely publicized implosion of the SKS keyserver network back in 2019: there are still a few major keyservers running, and package authors pushing to PyPI should have the presence of mind to upload their keys. Right?

Wrong. Of the 1067 key IDs collected through signatures on PyPI, a full 308 (or roughly 29%) had no publicly discoverable key on the major remaining keyservers. In other words: nearly 1/3rd of all keys used to sign PyPI uploads since 2020 aren’t discoverable by the PGP ecosystem’s own tooling. They might exist, hidden on personal domains and documentation pages, but, for all intents and purposes, these 29% of keys are useless⁸.
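For illustration, here is one way a key ID lookup against a keyserver could look, using the conventional HKP `pks/lookup` endpoint. The post doesn’t name the keyservers it queried, so the default host below is an assumption, as are the function names; treating HTTP 404 as “not discoverable” is likewise a simplification:

```python
from typing import Optional
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: any HKP-speaking keyserver would do; this host is only an example.
DEFAULT_KEYSERVER = "https://keyserver.ubuntu.com"


def hkp_lookup_url(key_id: str, keyserver: str = DEFAULT_KEYSERVER) -> str:
    """Build an HKP 'get' request URL for a key ID or fingerprint."""
    key_id = key_id.strip()
    if key_id.lower().startswith("0x"):
        key_id = key_id[2:]
    key_id = key_id.upper()
    # 8/16 hex digits are short/long key IDs; 40 is a v4 fingerprint.
    if len(key_id) not in (8, 16, 40) or any(c not in "0123456789ABCDEF" for c in key_id):
        raise ValueError(f"not a key ID or fingerprint: {key_id!r}")
    query = urlencode({"op": "get", "options": "mr", "search": f"0x{key_id}"})
    return f"{keyserver}/pks/lookup?{query}"


def fetch_key(key_id: str, keyserver: str = DEFAULT_KEYSERVER) -> Optional[bytes]:
    """Return the armored key material, or None if the server reports no match."""
    try:
        with urlopen(hkp_lookup_url(key_id, keyserver), timeout=30) as resp:
            return resp.read()
    except HTTPError as exc:
        if exc.code == 404:
            return None  # counted as "undiscoverable" in this sketch
        raise
```

A key returned this way still has to be checked against the requested key ID, and its presence on a keyserver says nothing about whether it should be trusted.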

So, our first graphic of the post: discoverable keys versus undiscoverable ones:

That left 759 discovered keys to actually audit. To keep things simple⁹, I limited my analysis to just the following considerations:

If that seems like a limited analysis, it’s because it is: there are too many ways to produce a weirdly shaped PGP certificate and/or key packet sequence, and the existing tooling (things like pgpdump and gpg --with-colons) wasn’t up to the task.

Takeaways

Key type	Count
RSA-4096	497
RSA-2048	127
RSA-3072	45
DSA-1024	40
EdDSA	35
DSA-3072	7
DSA-2048	4
NIST P-521	1
RSA-4064	1
RSA-4032	1

Key type	Count
RSA-4096	471
RSA-2048	151
RSA-3072	47
EdDSA	43
DSA-1024	31
DSA-3072	7
DSA-2048	5
NIST P-521	1
brainpoolP512r1	1
RSA-4032	1

To summarize: of just the PGP signatures uploaded to PyPI in the last three years:

By all rights, these numbers represent the best possible case for PGP signatures on PyPI. Expanding the audit to 2015 or even earlier would likely reveal far worse practices.

In one sense, none of this is a problem: the breadth and depth of issues here suggests that nobody (thankfully!) is actually relying on these signatures, and the continued presence of new signatures on PyPI is primarily a vestige of forgotten automation and outdated tutorials.

On the other hand, these results present a strong case against attempting to “rehabilitate” PGP signatures for PyPI, or any other packaging ecosystem: all evidence points to end users (i.e., signers) being unable¹⁹ to distinguish between the “good” and “bad” parts of PGP, much less use them at all (e.g. keyservers).

As with previous posts, I’ve tried to make my steps and data reproducible, and have checked them all into this repo. I welcome any discoveries of mistakes I’ve made, as well as any attempts to improve the overall detail or fidelity of the results!

1. In a domain-specific sense: nobody should have to be an expert in compilers to enable basic security mitigations, and nobody should have to be an expert in cryptographic protocol design to generate a good signature.
2. It’s hard to tell exactly how long, but it’s potentially as old as PyPI itself: 23-year-old design threads mention PGP as an early consideration.
3. It’s exactly three years before the day I began this post.
4. I was too lazy to debug this, but it was probably because I was assuming that all distribution URLs were wheel-like, when many were source distributions. Update: Ee has informed me that this was probably because of a lack of normalization: conveyor doesn’t normalize package or version names on either end.
5. As the snippet suggests, this was probably a mistake: PGPy is very lightly maintained and appears to win the jackpot in terms of simultaneously being incompatible with old PGP signatures and lagging behind the rest of the PGP ecosystem.
6. As in, the 32-bit/8 hexdigit key IDs that everyone is used to. You know, the ones that are trivially collidable and have been for years.
7. PGP has both keys and “subkeys,” and the relationships between them are pointlessly malleable. Given that, the number is really 1067 unique key IDs; it’s impossible to say how many unique containing certificates or representations of each key have been made over the years.
8. I’m also giving the PGP ecosystem a break here, by acting as if a key’s presence on a keyserver somehow makes it trustworthy. This isn’t true: you still need to have a reason to trust the key, which schemes like the web of trust and strong set were meant (and failed) to provide.
9. Things were originally not simple: I started out by writing a full PGP certificate and key linter,
10. A PGP certificate that doesn’t contain a binding signature is effectively not a certificate, since it contains no positive evidence that someone actually possesses the private half of the key.
11. Really PGP “certificates” or “sequences of packets resembling PGP certificates,” but nobody uses these terms consistently in the PGP ecosystem.
12. The eagle-eyed might notice that the total key count here is off by one: 758 instead of 759. That’s because there’s one key ID, CD6F6C3E0A50F73B, that doesn’t even match the key returned by the keyserver! I have no clue how this happened, and I can’t be bothered to figure it out.
13. “Effective” means the signing key, which can either be the primary key or a subkey. I audited both (when different), under the operating theory that it’s bad to have a strong subkey bound to a weak primary key (cf. a strong TLS certificate issued by a weak CA).
14. Meaning RFC 4880 compliant, not the miscellaneous other optional RFCs that various implementations may or may not choose to support.
15. In terms of cryptographic safety margins, not representation size. Representation-wise, both RSA-3072 and RSA-4096 are ridiculously large and unwieldy compared to EC keys with similar or stronger margins.
16. Which itself is discouraged: NIST’s own recommendation is to prefer a minimum of 128 bits of security, which would correspond (roughly) to RSA-3072.
17. And, if your use of PGP involves an incompatible subset, you might as well just do things right and drop PGP entirely.
18. And I didn’t bother checking.
19. Which, again, is not their fault: the system itself bears complete responsibility.

ENOSUCHBLOG

Programming, philosophy, pedaling.