ENOSUCHBLOG

Programming, philosophy, pedaling.


Introducing usb-ids.rs

Jan 21, 2021

Tags: programming, devblog, rust

This is just a brief announcement for a Rust library that I wrote last month: usb-ids.rs.

It’s available on crates.io; read on if you’re interested in the motivation and implementation.

Background

Universal Serial Bus, or USB, is a family of standards that governs both physical form factors and communication protocols for the vast majority of peripheral devices on modern computers.

In addition to their actual peripheral functionality, USB devices expose a wide variety of both generic and device-specific metadata to their connected host. This metadata is used to advertise the peripheral’s capabilities (e.g., resolutions for a webcam) as well as uniquely[1] identify the peripheral to both the machine (vendor and product IDs) and to humans (vendor and product strings).

The latter is particularly important: humans don’t remember pairs of 16-bit integers very well, so they need friendly strings like "Logitech BRIO" to identify devices for management.

Let’s see what it looks like to ask each USB device for its human-readable identifiers, using libusb by way of Rust’s rusb:

use rusb::{self, UsbContext};

fn main() -> Result<(), rusb::Error> {
  let context = rusb::Context::new()?;

  for device in context.devices()?.iter() {
    let descriptor = device.device_descriptor()?;
    let handle = device.open()?;

    let vendor = handle.read_manufacturer_string_ascii(&descriptor)?;
    let product = handle.read_product_string_ascii(&descriptor)?;

    println!("vendor: {}, product: {}", vendor, product);
  }

  Ok(())
}

And as a normal user:

$ ./target/debug/enumusb
Error: Access

Oh no! What happened?

As it turns out, letting random unprivileged users directly connect to arbitrary USB devices is a no-no on most Linux distributions[2]: users are expected to interact with most peripherals through a specialized kernel driver or subsystem, like uvcvideo for USB webcams or ALSA for audio.

To get direct access to USB devices, we need to enter the world of udev rules. We can either write a targeted rule:

SUBSYSTEM=="usb", ATTRS{idVendor}=="acab", ATTRS{idProduct}=="acab", MODE="0666"

…or a general rule, exposing the entire USB subsystem to all unprivileged users:

SUBSYSTEM=="usb", MODE="0666"

But neither of these is a great solution for a userspace program: adding udev rules is itself a privileged and finicky operation, and opening all USB devices for direct access makes it easy for users to accidentally soft-lock their devices or expose them to malicious programs.

Device strings without privilege

If you’ve ever used lsusb, you’ll know that it provides a listing of active USB devices, including their vendor and product strings:

$ lsusb
Bus 004 Device 003: ID 2109:0812 VIA Labs, Inc. VL812 Hub
Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 003 Device 003: ID 2109:2812 VIA Labs, Inc. VL812 Hub
Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

But wait, how does that work? lsusb is just a plain old unprivileged program. It links to libusb, but there’s nothing particularly special about that.

Here’s one of the places it grabs the vendor string:

static void dump_device(
  libusb_device *dev,
  struct libusb_device_descriptor *descriptor
)
{
  char vendor[128], product[128];
  char cls[128], subcls[128], proto[128];
  char mfg[128] = {0}, prod[128] = {0}, serial[128] = {0};
  char sysfs_name[PATH_MAX];

  get_vendor_string(vendor, sizeof(vendor), descriptor->idVendor);

  // ... snip ...
}

(Permalink).

where get_vendor_string boils down to this:

static const char *hwdb_get(const char *modalias, const char *key)
{
  struct udev_list_entry *entry;

  udev_list_entry_foreach(entry, udev_hwdb_get_properties_list_entry(hwdb, modalias, 0))
    if (strcmp(udev_list_entry_get_name(entry), key) == 0)
      return udev_list_entry_get_value(entry);

  return NULL;
}

const char *names_vendor(uint16_t vendorid)
{
  char modalias[64];

  sprintf(modalias, "usb:v%04X*", vendorid);
  return hwdb_get(modalias, "ID_VENDOR_FROM_DATABASE");
}

int get_vendor_string(char *buf, size_t size, uint16_t vid)
{
  const char *cp;

  if (size < 1)
          return 0;
  *buf = 0;
  if (!(cp = names_vendor(vid)))
          return 0;
  return snprintf(buf, size, "%s", cp);
}

So it’s a udev API. Specifically, it’s the hwdb API, which exists as an in-memory database that gets loaded from (in classic systemd fashion) an opaque binary blob (/usr/lib/udev/hwdb.bin on my system).
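For comparison, here is the lookup pattern from names_vendor rebuilt in Rust. This is only a sketch of the string formatting: vendor_modalias is a hypothetical helper name, and the hwdb query itself is not reproduced.

```rust
/// Builds the hwdb lookup pattern for a USB vendor ID, mirroring the
/// `sprintf(modalias, "usb:v%04X*", vendorid)` call in `names_vendor`.
/// (`vendor_modalias` is a hypothetical name used for illustration.)
fn vendor_modalias(vendor_id: u16) -> String {
    // {:04X} zero-pads to four uppercase hex digits, like C's %04X.
    format!("usb:v{:04X}*", vendor_id)
}
```

Calling vendor_modalias(0x045e) yields the string usb:v045E*, which is the pattern that gets handed to the hwdb lookup.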

A little bit more digging, and we have our actual answer: the USB parts of the hardware database are generated from the USB ID Repository. The USB ID Repository contains a tab-structured database, with entries like this:

045e  Microsoft Corp.
  0007  SideWinder Game Pad
  0008  SideWinder Precision Pro
  0009  IntelliMouse
  000b  Natural Keyboard Elite

In other words: the USB ID Repository is an independent, community-maintained source of ground truth about USB vendors and their products. lsusb et al. engage in a bit of sleight of hand by using it behind the scenes, which spares them from opening each device to read its metadata.
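The format is simple enough to parse by hand. Here is a minimal sketch, not the parser that usb-ids.rs actually uses. It assumes that vendor lines are unindented, device lines start with a single tab, IDs are four hex digits separated from the name by two spaces, and comment lines start with #. VendorEntry, split_entry and parse_usb_ids are hypothetical names.

```rust
/// One vendor and its devices; a hypothetical type for this sketch.
#[derive(Debug, PartialEq)]
struct VendorEntry {
    id: u16,
    name: String,
    devices: Vec<(u16, String)>,
}

/// Splits an entry such as "045e  Microsoft Corp." into its ID and name.
/// Assumes a four-character hex ID, two spaces, then the name.
fn split_entry(line: &str) -> Option<(u16, String)> {
    let (id, name) = line.split_once("  ")?;
    if id.len() != 4 {
        return None;
    }
    let id = u16::from_str_radix(id, 16).ok()?;
    Some((id, name.trim().to_string()))
}

/// Parses only the vendor and device entries of a usb.ids-style listing.
fn parse_usb_ids(input: &str) -> Vec<VendorEntry> {
    let mut vendors: Vec<VendorEntry> = Vec::new();
    // True while the most recent unindented line was a valid vendor.
    let mut in_vendor = false;

    for line in input.lines() {
        // Skip blank lines and comments.
        if line.trim().is_empty() || line.starts_with('#') {
            continue;
        }

        if let Some(rest) = line.strip_prefix('\t') {
            // Ignore deeper nesting, and indented lines that belong to
            // some other kind of section.
            if !in_vendor || rest.starts_with('\t') {
                continue;
            }
            if let (Some(vendor), Some(device)) = (vendors.last_mut(), split_entry(rest)) {
                vendor.devices.push(device);
            }
        } else {
            match split_entry(line) {
                Some((id, name)) => {
                    vendors.push(VendorEntry { id, name, devices: Vec::new() });
                    in_vendor = true;
                }
                // Not a vendor line: stop attaching devices until the next vendor.
                None => in_vendor = false,
            }
        }
    }

    vendors
}
```

Anything that is not a vendor or device entry is skipped rather than reported as an error, which keeps the sketch short but would hide malformed input.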

usb-ids.rs

Finally, the intended content of this post.

usb-ids.rs is exactly what it sounds like: a library that wraps the USB ID repository. Specifically: usb-ids.rs is a generated Rust library that exposes the USB ID repository as static data. Said generation currently happens with a very hacky parser[3] in the crate’s build.rs script.

Another nice thing: it’s built completely independently from udev, meaning that it can be used on non-Linux hosts as a source of “canonical” vendor and product names.
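To make "static data" concrete, here is a rough sketch of what a build script could emit. This is an illustration only, not the crate's actual generated code: the Vendor and Device structs, the VENDORS table, and both lookup functions are hypothetical, and the table is assumed to be sorted by vendor ID.

```rust
/// Hypothetical shapes for generated entries; the real crate's types differ.
struct Device {
    id: u16,
    name: &'static str,
}

struct Vendor {
    id: u16,
    name: &'static str,
    devices: &'static [Device],
}

/// A tiny stand-in for the generated table, sorted by vendor ID.
static VENDORS: &[Vendor] = &[
    Vendor {
        id: 0x045e,
        name: "Microsoft Corp.",
        devices: &[
            Device { id: 0x0007, name: "SideWinder Game Pad" },
            Device { id: 0x0009, name: "IntelliMouse" },
        ],
    },
    Vendor {
        id: 0x1d6b,
        name: "Linux Foundation",
        devices: &[
            Device { id: 0x0002, name: "2.0 root hub" },
            Device { id: 0x0003, name: "3.0 root hub" },
        ],
    },
];

/// Looks up a vendor by ID with a binary search over the sorted table.
fn vendor_from_id(id: u16) -> Option<&'static Vendor> {
    VENDORS
        .binary_search_by_key(&id, |vendor| vendor.id)
        .ok()
        .map(|index| &VENDORS[index])
}

/// Resolves a vendor:product ID pair to the product's name.
fn device_name(vendor_id: u16, product_id: u16) -> Option<&'static str> {
    vendor_from_id(vendor_id)?
        .devices
        .iter()
        .find(|device| device.id == product_id)
        .map(|device| device.name)
}
```

Because everything lives in static tables, a lookup needs no file I/O, no allocation, and no access to the device itself.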

The API is extraordinarily simple:

use usb_ids::{FromId, Vendor, Vendors};

// Iterate over every vendor and device.
for vendor in Vendors::iter() {
    for device in vendor.devices() {
        println!("vendor: {}, device: {}", vendor.name(), device.name());
    }
}

// Get a specific vendor by ID.
let vendor = Vendor::from_id(0x045e).unwrap();

Full documentation is available on docs.rs.

Next steps

As of today, usb-ids.rs is ready to use. I don’t expect the currently implemented APIs to change.

Future releases will focus on exposing additional entities in the USB ID repository. In particular, I plan to add support for enumerating device classes and subclasses, as well as various HID entities.


  1. Uniquely among different devices. Two of the same USB device plugged into the same host can be disambiguated either by additional peripheral-level metadata (such as a unique name or GUID) or by the underlying host’s bus and device number allocations. 

  2. I’m not sure what other OSes do, but it wouldn’t surprise me to learn that they similarly restrict direct USB access. In any case, it’s not a reliable cross-platform technique. 

  3. I’ll replace it with quote at some point.