May 21, 2023 Tags: cryptography, devblog, programming, python, rant
TL;DR: A large number of PGP signatures on PyPI can’t be correlated to any well-known PGP key and, of the signatures that can be correlated, many are generated from weak keys or malformed certificates. The results suggest widespread misuse of GPG and other PGP implementations by Python packagers, with said misuse being encouraged by the PGP ecosystem’s poor defaults, opaque and user-hostile interfaces, and outright dangerous recommendations.
I’ve been sitting on this post for a few months, in part because of travel and in part because its (intended) scope was beginning to reflect PGP’s own fractal complexity.
The version that I’m publishing now has been significantly pared down to remove extended digressions on how bad PGP’s packet format is, all the different ways in which a signature or certificate packet can be broken, incorrectly bound, &c.
I’ve removed those things because I think the results, as presented, are sufficient evidence for the actual claims I’d like to make, namely:
That existing PGP signatures on PyPI serve no security purpose, and that all evidence points to nobody ever attempting to verify them;
Even advanced technical communities, as a whole, largely fail to reduce PGP’s complexity and unnecessary agility into a reasonable and tractable subset.
And, just in case it needs to be said:
This post isn’t intended to disparage PyPI: PyPI has done everything right, including purposely removing frontend support for PGP years ago.
This post isn’t intended to disparage individual packagers and maintainers still uploading signatures to PyPI. I suspect that much of the ongoing signature uploading is a result of long-forgotten automation and, even when it isn’t: developers cannot be blamed for their misuse of obtuse tools. Security tools, especially cryptographic ones, are only as good as their least-informed1 and most distracted user.
PyPI has supported PGP signatures in some form or another for a very long time2.
To this date, PGP is still (minimally) supported: package uploaders can still sign for their package distributions and upload the resulting .asc to PyPI for inclusion in the index. The official uploading utility even supports invoking gpg directly via the
To a novice Python programmer looking to publish their first package to PyPI, this might give the following impressions:
The first two are just wrong:
PGP is an insecure and outdated ecosystem that hasn’t reflected cryptographic best practices in decades.
PyPI’s support is vestigial in nature: signatures are not shown as part of the web interface, and are only obliquely referenced in the PEP 503 and JSON APIs.
The third is harder to immediately refute: PyPI still hosts signatures, after all. Absent any other information, it’s entirely possible that companies and end users are quietly and diligently verifying whatever signatures are present, using trust sets, tracking revoked and expired keys, and so forth.
Thus, my goal with this blog post:
Relatively early in the process I decided not to collect every single signature on PyPI, for two main reasons:
Relevance: PyPI hosts many old package distributions, including distributions for Python 2.7 (and earlier!). Given that Python 2 has been EOL for over three years at this point, it didn’t feel relevant (or efficient) to retrieve large quantities of signatures for distributions that nobody is likely to ever try to install.
Fairness: both PGP and Python have a lot of history, much of which predates modern understandings around cryptographic best practices. Given that, it didn’t feel fair to analyze extremely old signatures, especially if doing so would bias the statistics away from newer users who are doing more responsible things.
Given these considerations, I decided to limit my analysis to only signatures uploaded to PyPI on or after 2020-03-27. I chose that date somewhat arbitrarily3 while also satisfying a few constraints:
It’s well after the 2018 deployment of the new PyPI, which didn’t emphasize support for PGP signatures (while still retaining it). In other words: signatures uploaded in 2020 or later were either done by automation (implying some degree of sophistication) or were likely a conscious decision by a packager to continue signing with PGP.
It’s very recent, and best practices around digital signatures have not changed substantially since 2020. In other words: a best-practices signature (and key) made in 2020 should look very similar to a best-practices signature (and key) made in 2023, and someone signing in 2020 would have no good excuses for not making reasonable choices.
Actually retrieving the signatures was a multi-step process. To start, I used PyPI’s BigQuery dataset to give me some basic metadata on every distribution file with an associated signature:
    SELECT name, version, filename, python_version, blake2_256_digest
    FROM `bigquery-public-data.pypi.distribution_metadata`
    WHERE has_signature
    AND upload_time > TIMESTAMP("2020-03-27 00:00:00")
This produced 52900 distributions uploaded since 2020-03-27 for which PyPI also had a signature (subtract 1 for the CSV header):
    $ wc -l inputs/dists-with-signatures.csv
    52901 inputs/dists-with-signatures.csv

    $ head -2 inputs/dists-with-signatures.csv
    name,version,filename,python_version,blake2_256_digest
    pantsbuild.pants.testutil,1.30.0,pantsbuild.pants.testutil-1.30.0-py36.py37.py38-none-any.whl,py36.py37.py38,7ecbe47906ddbe8a2f1ee2505c2edb7f9313348d4925855e429be1d316660a00
From here, I needed to retrieve each release distribution’s detached signature, i.e. its .asc URL in PyPI’s object storage.
I initially did this with the “conveyor” service, which turns PEP 491 names into URLs like so:
However, this was pretty lossy: for whatever reason4 my URLs were slightly off about 20% of the time, resulting in lots of missed signatures. I eventually realized that the BigQuery dataset also includes the Blake2 digest for each distribution, meaning that I could use the actual package URLs instead:
…and this was perfectly reliable.
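As a rough sketch of that digest-based approach: the snippet below assumes that PyPI's file host lays out distributions under a path derived from the hex-encoded Blake2b-256 digest (first two characters, next two, then the remainder), and that a detached signature lives at the distribution's URL plus `.asc`. The `signature_url` helper and the `FILES_HOST` constant are names I've made up for illustration; they aren't taken from the post's scripts.

```python
FILES_HOST = "https://files.pythonhosted.org"  # assumed host for PyPI's file storage


def signature_url(filename: str, blake2_256_digest: str) -> str:
    """Build the (assumed) URL of a distribution's detached .asc signature."""
    digest = blake2_256_digest.lower()
    if len(digest) != 64 or any(c not in "0123456789abcdef" for c in digest):
        raise ValueError(f"not a hex-encoded 256-bit digest: {blake2_256_digest!r}")
    # Assumed path layout: /packages/<2 hex chars>/<2 hex chars>/<remaining 60>/<filename>
    dist_url = f"{FILES_HOST}/packages/{digest[:2]}/{digest[2:4]}/{digest[4:]}/{filename}"
    return dist_url + ".asc"


# Sample row from the CSV shown earlier.
url = signature_url(
    "pantsbuild.pants.testutil-1.30.0-py36.py37.py38-none-any.whl",
    "7ecbe47906ddbe8a2f1ee2505c2edb7f9313348d4925855e429be1d316660a00",
)
print(url)
```

Because the digest comes straight from the index's own metadata, no guessing about name or version normalization is needed, which is presumably why this route avoided the misses seen with name-based URLs.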
From here, I wanted to figure out (roughly) how many unique keys produced these ~50k signatures.
I decided to use PGPy5 for that; excerpted from
    sig = pgpy.PGPSignature.from_blob(sig_resp.content)
    try:
        # https://github.com/SecurityInnovation/PGPy/issues/433
        sig
        sig.signer
    except AttributeError:
        print("barf: couldn't get signer, probably ancient", file=sys.stderr)
        _KEY_ID_MAP["<invalid signer>"].append(rec)
        continue

    _KEY_ID_MAP[sig.signer].append(rec)
This left me with a big map of PGP key IDs6 to a list of distributions signed by them, including 26 distributions whose signatures PGPy couldn’t parse:
|Package name||Distribution count|
This is a tiny failure (26 distributions out of 52900, or roughly 0.05%), but it sets the tone for the rest of the post.
Apart from these 26 failures, the remaining 52874 signatures were produced from 1067 “unique”7 PGP keys.
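The bookkeeping behind these numbers is simple enough to sketch in isolation. What follows is a minimal, self-contained illustration rather than the post's actual script: the `group_by_signer` helper, the `INVALID_SIGNER` constant name, and the sample records and key IDs are all hypothetical, and a `None` key ID stands in for a signature whose signer couldn't be extracted.

```python
from collections import defaultdict

INVALID_SIGNER = "<invalid signer>"  # sentinel bucket for unparseable signatures


def group_by_signer(records):
    """Map each signer key ID to the list of distributions it signed.

    `records` is an iterable of (filename, key_id) pairs, where key_id is
    None when the signer couldn't be extracted from the signature.
    """
    by_key = defaultdict(list)
    for filename, key_id in records:
        bucket = key_id if key_id is not None else INVALID_SIGNER
        by_key[bucket].append(filename)
    return by_key


# Made-up sample data, purely for illustration.
sample = [
    ("foo-1.0.tar.gz", "AAAAAAAAAAAAAAAA"),
    ("foo-1.1.tar.gz", "AAAAAAAAAAAAAAAA"),
    ("bar-2.0-py3-none-any.whl", "BBBBBBBBBBBBBBBB"),
    ("ancient-0.1.tar.gz", None),
]

by_key = group_by_signer(sample)
unique_key_ids = [k for k in by_key if k != INVALID_SIGNER]
print(f"{len(unique_key_ids)} unique key IDs, "
      f"{len(by_key[INVALID_SIGNER])} unparseable signature(s)")
```

Keeping the failures in their own bucket, instead of dropping them, is what makes it possible to report them separately from the count of unique key IDs.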
At this point, I had 1067 unique key IDs, each of which needed to be retrieved from a keyserver.
My expectation was that this wouldn’t be a significant challenge, despite the widely publicized implosion of the SKS keyserver network back in 2019: there are still a few major keyservers running, and package authors pushing to PyPI should have the presence of mind to upload their keys. Right?
Pictured: your author immediately before trying to retrieve PGP keys in 2023.
Wrong. Of the 1067 key IDs collected through signatures on PyPI, a full 308 (or roughly 29%) had no publicly discoverable key on the major remaining keyservers. In other words: nearly a third of the keys behind signatures added to PyPI since 2020 aren’t discoverable by the PGP ecosystem’s own tooling. They might exist, hidden on personal domains and documentation pages, but, for all intents and purposes, these 29% of keys are useless8.
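For anyone who wants to check the arithmetic, the figures in this paragraph work out as follows (a trivial sketch; the variable names are mine):

```python
total_key_ids = 1067   # unique key IDs seen in signatures
undiscoverable = 308   # key IDs with no key on the major keyservers

discoverable = total_key_ids - undiscoverable
missing_pct = 100 * undiscoverable / total_key_ids

print(f"discoverable key IDs: {discoverable}")
print(f"undiscoverable share: {missing_pct:.1f}%")
```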
So, our first graphic of the post: discoverable keys versus undiscoverable ones:
Pictured: a very normal and healthy signing ecosystem.
That left 759 discovered keys to actually audit. To keep things simple9, I limited my analysis to just the following considerations:
Does the key’s certificate have a binding signature10?
What algorithm does the key use?
If that seems like a limited analysis, it’s because it is: there are too many ways to produce a weirdly shaped PGP certificate and/or key packet sequence, and the existing tooling (things like gpg --with-colons) wasn’t up to the task.
Instead, I wrote a little tool (pgpkeydump) to give me machine-readable dumps of PGP keys11, and then wrapped it in a bulk auditing script that does some basic statistics on the results.
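The post doesn't show the shape of those dumps, so the following is only a guess at what such a bulk audit could look like. It assumes each key has been dumped to a JSON object with hypothetical `algorithm`, `bits`, and `has_binding_signature` fields, and tallies them with `collections.Counter`. None of these field names come from pgpkeydump itself.

```python
import json
from collections import Counter


def audit(dumps):
    """Tally algorithm/size labels and count keys lacking a binding signature.

    `dumps` is an iterable of JSON strings, one per key, using the
    hypothetical schema described above.
    """
    algorithms = Counter()
    unbound = 0
    for raw in dumps:
        key = json.loads(raw)
        algo, bits = key["algorithm"], key.get("bits")
        # e.g. "RSA-4096"; keys without a recorded size use the bare name.
        label = algo if bits is None else f"{algo}-{bits}"
        algorithms[label] += 1
        if not key.get("has_binding_signature", False):
            unbound += 1
    return algorithms, unbound


# Made-up inputs, purely for illustration.
sample_dumps = [
    '{"algorithm": "RSA", "bits": 4096, "has_binding_signature": true}',
    '{"algorithm": "RSA", "bits": 2048, "has_binding_signature": false}',
    '{"algorithm": "DSA", "bits": 1024, "has_binding_signature": false}',
    '{"algorithm": "EdDSA", "bits": null, "has_binding_signature": true}',
]

algorithms, unbound = audit(sample_dumps)
for label, count in algorithms.most_common():
    print(f"{label}: {count}")
print(f"keys without a binding signature: {unbound}")
```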
To summarize the results:
Then, on the algorithm and parameter sides12:
Or again, as pretty charts:
First, the “good” parts:
Then, the meh:
A sizeable minority (20% of effective keys, and 17% of primary keys) are RSA-2048. NIST considers RSA-2048 to be equivalent to roughly 112 bits of security16, and does not recommend its use on data that’s expected to have a security life of 15 years…starting in 2015. That means that PyPI-hosted signatures against RSA-2048 keys have roughly 7 years of “shelf life” in them. Version turnover in packaging ecosystems has accelerated over the last decade; let’s hope that applies here too!
Some enterprising people are on the “bleeding edge”: they’re using EdDSA and a few different ECDSA curves. It’s hard to say whether this is good or bad: it’s good in the sense that these are almost certainly better than anything offered by strictly RFC 4880 PGP implementations, but pointless in the sense that support for verifying these signatures is limited17 to just a few clients. It’s also probably pointlessly slow (for P-521 and brainpoolP512r1 in particular).
And finally, the insane:
Roughly 5% of all keys used to sign for packages on PyPI are DSA. The majority of those are DSA-1024, which is roughly equivalent in strength to RSA-1024. DSA of any size is already very bad, and DSA-1024 is well outside of any acceptable safety margin for signatures in 2023, much less 2020 or even 2010.
RSA-4064 and RSA-4032. I have no idea why anyone would do this18. Maybe some misguided attempt to calculate a precise security margin, or a misreading of someone else’s recommendations?
One of the RSA-2048 keys has a public exponent of 41, rather than 65537 (which every other RSA key in the dataset uses). Again, I have no idea why anyone would do this: it’s pointlessly slower and opens up padding concerns that e = 65537 is resilient against.
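The rough triage above can be expressed as a small function. This is a sketch under my own assumptions rather than the audit script itself: the `classify` name and the bucket labels are made up, and the "good" bucket (RSA at 3072 or 4096 bits with e = 65537) is inferred from the footnotes rather than stated outright in this section.

```python
from typing import Optional

STANDARD_RSA_SIZES = {2048, 3072, 4096}
USUAL_RSA_EXPONENT = 65537


def classify(algorithm: str, bits: Optional[int] = None,
             exponent: int = USUAL_RSA_EXPONENT) -> str:
    """Sort a key into a 'good', 'meh', or 'insane' bucket."""
    if algorithm == "DSA":
        return "insane"  # DSA of any size
    if algorithm == "RSA":
        if bits not in STANDARD_RSA_SIZES or exponent != USUAL_RSA_EXPONENT:
            return "insane"  # odd sizes (RSA-4064, RSA-4032) or e = 41
        return "meh" if bits == 2048 else "good"
    # EdDSA and the various ECDSA curves: better cryptography, but only a
    # few clients can verify the resulting signatures.
    return "meh"


examples = [("RSA", 4096), ("RSA", 2048), ("RSA", 2048, 41),
            ("RSA", 4064), ("DSA", 1024), ("EdDSA",)]
for args in examples:
    print(args, "->", classify(*args))
```

Treating every non-RSA, non-DSA algorithm as "meh" is a simplification: it lumps Ed25519 together with the slower P-521 and brainpoolP512r1 curves, which a real linter would probably want to distinguish.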
To summarize: of just the PGP signatures uploaded to PyPI in the last three years:
Of the remaining discoverable keys:
Nearly half (49%) have no active binding signature when retrieved from a keyserver, giving them indefinite (at the absolute best) identity properties.
A sizeable minority (20%) are using weak RSA keys with less than a decade before NIST considers them insecure.
A smaller but still appreciable minority (5%) are using DSA-1024 keys, which have been considered insecure for well over a decade.
By all rights, these numbers represent the best possible case for PGP signatures on PyPI. Expanding the audit to 2015 or even earlier would likely reveal far worse practices.
In one sense, none of this is a problem: the breadth and depth of issues here suggests that nobody (thankfully!) is actually relying on these signatures, and the continued presence of new signatures on PyPI is primarily a vestige of forgotten automation and outdated tutorials.
On the other hand, these results present a strong case against attempting to “rehabilitate” PGP signatures for PyPI, or any other packaging ecosystem: all evidence points to end users (i.e., signers) being unable19 to distinguish between the “good” and “bad” parts of PGP, much less use them at all (e.g. keyservers).
So, for final conclusions:
As with previous posts, I’ve tried to make my steps and data reproducible, and have checked them all into this repo. I welcome any discoveries of mistakes I’ve made, as well as any attempts to improve the overall detail or fidelity of the results!
In a domain-specific sense: nobody should have to be an expert in compilers to enable basic security mitigations, and nobody should have to be an expert in cryptographic protocol design to generate a good signature. ↩
It’s exactly three years before the day I began this post. ↩
I was too lazy to debug this, but it was probably because I was assuming that all distribution URLs were wheel-like, when many were source distributions. Update: Ee has informed me that this was probably because of a lack of normalization: conveyor doesn’t normalize package or version names on either end. ↩
As the snippet suggests, this was probably a mistake: PGPy is very lightly maintained and appears to win the jackpot in terms of simultaneously being incompatible with old PGP signatures and lagging behind the rest of the PGP ecosystem. ↩
PGP has both keys and “subkeys,” and the relationships between them are pointlessly malleable. Given that, the number is really 1067 unique key IDs; it’s impossible to say how many unique containing certificates or representations of each key have been made over the years. ↩
I’m also giving the PGP ecosystem a break here, by acting as if a key’s presence on a keyserver somehow makes it trustworthy. This isn’t true: you still need to have a reason to trust the key, which schemes like the web of trust and strong set were meant (and failed) to provide. ↩
Things were originally not simple: I started out by writing a full PGP certificate and key linter, ↩
A PGP certificate that doesn’t contain a binding signature is effectively not a certificate, since it contains no positive evidence that someone actually possesses the private half of the key. ↩
Really PGP “certificates” or “sequences of packets resembling PGP certificates,” but nobody uses these terms consistently in the PGP ecosystem. ↩
The eagle-eyed might notice that the total key count here is off by one: 758 instead of 759. That’s because there’s one key ID, CD6F6C3E0A50F73B, that doesn’t even match the key returned by the keyserver! I have no clue how this happened, and I can’t be bothered to figure it out. ↩
“Effective” means the signing key, which can either be the primary key or a subkey. I audited both (when different), under the operating theory that it’s bad to have a strong subkey bound to a weak primary key (cf. a strong TLS certificate issued by a weak CA). ↩
In terms of cryptographic safety margins, not representation size. Representation-wise, both RSA-3072 and RSA-4096 are ridiculously large and unwieldy compared to EC keys with similar or stronger margins. ↩
Which itself is discouraged: NIST’s own recommendation is to prefer a minimum of 128 bits of security, which would correspond (roughly) to RSA-3072. ↩
And, if your use of PGP involves an incompatible subset, you might as well just do things right and drop PGP entirely. ↩
And I didn’t bother checking. ↩
Which, again, is not their fault: the system itself bears complete responsibility. ↩