May 20, 2020 Tags: programming, rant, rust
Five years ago, I wrote a post about the things I hated most about my then (and still) favorite scripting language: Ruby.
Today, I’m going to do the same about my current favorite compiled language: Rust.
Like the original Ruby post, these complaints are personal and reflect my current best understanding of the language. Just like Ruby, they’re written from an overarching position of love for Rust.
Without further ado:
My complaints about strings in Rust fall along two general axes:
Off the top of my head, I can think of 5 different ways [1] to represent strings, views of strings, or signatures that accept string-y things:
- &str for borrowed strings
- String for owned strings
- &OsStr for borrowed strings in the OS’s representation
- OsString for owned strings in the OS’s representation
- AsRef<str> for signatures where a cheap &str reference can be made
(I’m aware that the last one isn’t really a string type, but it appears regularly in idiomatic string-handling code.)
When I was a Rust newbie, the distinctions between these types were deeply confusing, and made it more difficult to understand references (Why is a &String different from a &str? Why can’t I create a str directly? Where the hell am I getting &&str from?).
Multiple string types and relevant traits beget multiple conversion functions:
- &str to String: String::from(), to_string(), to_owned(), into(), not counting formatting routes or round-tripping with a Vec or [u8]
- String to &str: as_str(), as_ref(), Deref<Target=str>, &x[..]
- OsStr and CStr
Most of these routes are equivalent in performance, and the Rust community seems divided on which ones are “right”. I’ve ended up in the habit of using different ones depending on the context (e.g. into() to indicate that I’m turning a &str into a String so I can return it, to_owned() to indicate that I’m taking ownership to use the string later on).
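To make the overlap concrete, here is a minimal sketch that runs each of the conversion routes listed above and checks that they all produce the same value. The function names owned_copies and borrowed_views are made up for this illustration and are not part of any library.

```rust
use std::ops::Deref;

/// &str -> String: four spellings, each allocating a new owned String.
fn owned_copies(s: &str) -> [String; 4] {
    let a = String::from(s);
    let b = s.to_string();
    let c = s.to_owned();
    let d: String = s.into();
    [a, b, c, d]
}

/// String -> &str: four spellings, each borrowing without allocating.
/// (Takes &String on purpose, to mirror the conversions in the list.)
fn borrowed_views(s: &String) -> [&str; 4] {
    let a: &str = s.as_str();
    let b: &str = s.as_ref();
    let c: &str = s.deref();
    let d: &str = &s[..];
    [a, b, c, d]
}

fn main() {
    let owned = owned_copies("hello");
    assert!(owned.iter().all(|s| s == "hello"));

    let views = borrowed_views(&owned[0]);
    assert!(views.iter().all(|v| *v == "hello"));

    println!("all eight conversions agree");
}
```

The type annotations on as_ref() and into() are doing real work here: both are generic over their target type, so without them the compiler may not be able to pick one.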
The Rust standard library has some gaps that make aspects of userspace programming painful:
- No current way to get the user’s home directory. std::env::home_dir() is explicitly marked as deprecated, and the documentation encourages users to rely on the dirs crate (which is currently archived on GitHub) [2].
- No standard way to expand ~. std::fs::canonicalize supports . and .., but not ~. Yes, I know this is a duplicate of the above.
- No way to invoke a command through a system shell. Yes, I know that system(3) is bad. Yes, I agree that it shouldn’t be the default interface for executing other processes, and should even be quarantined to prevent unintentional use. None of that changes the fact that it’s occasionally useful [3] and can be implemented more reliably in the standard library than by end developers throwing sh -c around.
- No standard way to glob. It looks like the glob crate is the semi-official way to do this.
These are admittedly minor gaps, and are all addressed by high-quality crates. But they add friction to the development process, friction that’s especially noticeable given how frictionless Rust otherwise tends to be.
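For a sense of what the hand-rolled versions of the tilde and shell gaps tend to look like, here is a rough sketch. Both expand_tilde and run_in_shell are hypothetical helpers written for this example, not standard library functions. The sketch assumes a POSIX sh on PATH, takes the home directory as a parameter rather than solving the lookup problem, and ignores ~user forms entirely.

```rust
use std::io;
use std::path::{Path, PathBuf};
use std::process::Command;

/// Hypothetical helper: expands a leading `~` or `~/` using a caller-supplied
/// home directory. `~user` forms are deliberately left untouched.
fn expand_tilde(path: &str, home: &Path) -> PathBuf {
    if path == "~" {
        home.to_path_buf()
    } else if let Some(rest) = path.strip_prefix("~/") {
        home.join(rest)
    } else {
        PathBuf::from(path)
    }
}

/// Hypothetical helper: runs `script` via `sh -c` and returns its stdout.
/// Assumes a POSIX `sh` is on PATH; this is the ad hoc route, not a std API.
fn run_in_shell(script: &str) -> io::Result<String> {
    let output = Command::new("sh").arg("-c").arg(script).output()?;
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}

fn main() {
    // Reading HOME directly is only a rough stand-in for a real home lookup.
    let home = std::env::var_os("HOME")
        .map(PathBuf::from)
        .unwrap_or_default();
    println!("{}", expand_tilde("~/projects", &home).display());

    match run_in_shell("echo hello from sh") {
        Ok(out) => print!("{}", out),
        Err(err) => eprintln!("could not run sh: {}", err),
    }
}
```

Neither helper is hard to write, which is rather the point: every project ends up with a slightly different version, each with its own edge cases.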
I love trait-based composition. What I don’t love:
- Being told that I’m missing use std::io::Read or use std::io::Write because I’m calling one of their methods that’s been impl’d by something I already have in scope. I understand why Rust does it this way, but it feels weird, especially in the context of unused imports otherwise being compiler warnings.
- The syntax for implementing traits for traits. impl<T> Trait for T where T: OtherTrait isn’t too bad, but it doesn’t read nearly as naturally as impl Trait for OtherTrait would.
- Sometimes rustc needs me to add where Self: Sized to my static (i.e., non-self) trait functions. I still don’t understand why this is sometimes required and sometimes isn’t; I’m sure there’s a decent reason.
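A small sketch of all three points in one place. The Describe trait and the read_all function are invented for this example. On the last point, one common trigger for the where Self: Sized requirement is using the trait as a trait object (dyn Trait): a function with no self receiver cannot be dispatched through a vtable, so it has to be excluded from the object explicitly. That may not be the only situation where it comes up.

```rust
use std::fmt::Display;
use std::io::Read; // without this import, `read_to_string` below does not resolve

// Hypothetical trait, named only for illustration.
trait Describe {
    fn describe(&self) -> String;

    // No `self` receiver, so this cannot be called through `dyn Describe`;
    // `where Self: Sized` opts it out of the trait object.
    fn kind() -> &'static str
    where
        Self: Sized,
    {
        "describable"
    }
}

// Blanket impl: every type that implements Display also gets Describe.
impl<T> Describe for T
where
    T: Display,
{
    fn describe(&self) -> String {
        format!("<{}>", self)
    }
}

// `&[u8]` implements Read, but the method is only callable with the trait in scope.
fn read_all(mut input: &[u8]) -> String {
    let mut text = String::new();
    input
        .read_to_string(&mut text)
        .expect("input should be valid UTF-8");
    text
}

fn main() {
    println!("{}", 42_i32.describe());
    println!("{}", <i32 as Describe>::kind());

    // Still usable as a trait object because `kind` is excluded from it.
    let boxed: Box<dyn Describe> = Box::new(3.5_f64);
    println!("{}", boxed.describe());

    println!("{}", read_all(b"hello"));
}
```

Removing the where Self: Sized bound from kind should make the Box<dyn Describe> line fail to compile, while the rest of the program is unaffected.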
Given a fixed array x = [T; N] and an index variable i of unsigned type U such that U::MAX < N, indexing via x[i] will always be safe. Despite this, rustc expects the programmer to explicitly widen i to usize:
fn main() {
    let lookup_table: [u8; 256] = [0_u8; 256];
    let index = 5_u8;
    println!("{}", lookup_table[index]);
}
fails with:
error[E0277]: the type `[u8]` cannot be indexed by `u8`
 --> src/main.rs:4:20
  |
4 |     println!("{}", lookup_table[index]);
  |                    ^^^^^^^^^^^^^^^^^^^ slice indices are of type `usize` or ranges of `usize`
  |
  = help: the trait `std::slice::SliceIndex<[u8]>` is not implemented for `u8`
  = note: required because of the requirements on the impl of `std::ops::Index<u8>` for `[u8]`
Understandable, but requires that the programmer either use as usize everywhere they plan on indexing (verbose, and masks the intent behind the index being a u8) or that they make index itself into a usize (also masks the intent, and makes it easier to do arithmetic that’ll eventually be out-of-bounds).
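One possible workaround, sketched under the assumption that the table can be wrapped in a newtype: implement Index<u8> once and keep the index typed as u8 at every call site. ByteTable is a made-up name for this example, and the sketch makes no claim about whether the compiler removes the bounds check.

```rust
use std::ops::Index;

// Hypothetical newtype, for illustration only.
struct ByteTable([u8; 256]);

impl Index<u8> for ByteTable {
    type Output = u8;

    fn index(&self, index: u8) -> &u8 {
        // usize::from is a lossless widening, unlike a bare `as` cast,
        // which would also silently accept narrowing conversions.
        &self.0[usize::from(index)]
    }
}

fn main() {
    let mut raw = [0_u8; 256];
    raw[5] = 42;

    let table = ByteTable(raw);
    let index = 5_u8;

    // The index stays a u8 all the way through.
    println!("{}", table[index]);

    // Without the wrapper, usize::from is still a safer spelling than `as usize`.
    println!("{}", raw[usize::from(index)]);
}
```

This moves the widening into one place, at the cost of a wrapper type for every table that needs it.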
cargo install doesn’t, sometimes
I don’t know whether this one’s a bona fide bug or not, but I’m tossing it in since it’s bitten me a few times.
cargo install apparently doesn’t know how to discover suffixed package versions. For example, if I publish myfakepackage as version 0.0.1-alpha.0, cargo install will report:
$ cargo install myfakepackage
error: could not find `myfakepackage` in registry `https://github.com/rust-lang/crates.io-index`
You have to explicitly pass --version:
$ cargo install myfakepackage --version 0.0.1-alpha.0
I had some other things that I wanted to kvetch about (aliases for core types not supporting traits, the package ecosystem being a little too JS/npm-y in style), but I figure that doing so runs the risk of being too negative on a language that I am overwhelmingly happy with.
I still like Ruby five years later, and I’m feeling optimistic about Rust.
[1] Not counting CString and &CStr, since those are primarily used in FFI contexts and are understandably different.
[2] I understand that it’s actually remarkably difficult to reliably get the user’s home directory on POSIX platforms. That doesn’t change the fact that the standard library should attempt to.
[3] Case in point: CLIs frequently expose hook-points and callbacks where being able to write in shell syntax is useful.