ENOSUCHBLOG

Programming, philosophy, pedaling.


Introducing bindef, a DSL for defining binary files

Aug 25, 2018     Tags: devblog, programming, ruby    


After a few weekends of fiddling, I’m releasing bindef.

Background

I ran across t2b a few months ago, and really liked the idea of a small macro language for defining binary files. I work with binary samples on a daily basis (both professionally and in my spare time), and manipulating them can quickly become a hassle.

However, I didn’t want a new macro language: emitting binary is something that languages like Python and Ruby already do very well, albeit verbosely. Instead, I wanted something that operates within an interpreted language, supplementing it with a few simple commands.
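To illustrate the "verbosely" part, here is a minimal sketch of emitting a few typed values in plain Ruby with `Array#pack`. The values and the choice of little-endian byte order are assumptions made for the example, not anything bindef prescribes:

```ruby
# Plain Ruby: build the bytes by hand with Array#pack directives.
bytes = String.new(encoding: Encoding::BINARY)
bytes << [10].pack("C")      # unsigned 8-bit integer
bytes << [0o777].pack("S<")  # unsigned 16-bit integer, little-endian
bytes << [25677].pack("q<")  # signed 64-bit integer, little-endian
bytes << "hello\x00".b       # NUL-terminated string, as raw bytes

# Hex dump of the result, for inspection.
puts bytes.unpack1("H*")
```

This works, but every value needs a one-element array, a pack directive to remember, and an explicit endianness suffix.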

Syntax

At first glance, bindef’s syntax is similar to t2b’s. The following input, stolen from the t2b README, generates the same output on both:

u8 10
u8 0xa

u16 0o777
u8 0b00001110

i64 25677

Differences begin to emerge in string handling, and in the flexibility of number parsing:

# `bindef` requires quotes on all strings
str "hello"

# `bindef` is a Ruby DSL, so we can use Ruby's flexible number parsing
# high nibble, low nibble
u8 0b1111_0000

bindef also supports endian and encoding hints via the pragma expression:

# encodes any subsequent strings as UTF-32
pragma encoding: "utf-32"
str "hello"
str "there"

# encodes any subsequent strings as UTF-8 (the default)
pragma encoding: "utf-8"
str "utf-8!"

# pragmas can also take a block, to provide a limited scope:
pragma endian: :big do
  # we only emit in big-endian inside this block.
  # note that Ruby namespaces and constants work correctly,
  # and that `bindef` uses `f32` and `f64` for floats instead
  # of `f` and `d`.
  i64 0xFF00FF00
  f64 Math::PI
  f32 Float::INFINITY
end
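For readers curious how a block-scoped pragma can work in a Ruby DSL, here is a toy sketch. It is not bindef's actual implementation: the `ToyBindef` class, its defaults, and its internals are hypothetical, and only the command names (`u8`, `u16`, `str`, `pragma`) are borrowed from the examples above.

```ruby
# Toy sketch only: NOT bindef's real implementation.
class ToyBindef
  attr_reader :output

  def initialize
    @output = String.new(encoding: Encoding::BINARY)
    # Assumed defaults for this sketch.
    @pragmas = { endian: :little, encoding: "utf-8" }
  end

  # Without a block, the new settings stay in effect.
  # With a block, the previous settings are restored afterwards.
  def pragma(**settings)
    previous = @pragmas
    @pragmas = @pragmas.merge(settings)
    return unless block_given?

    begin
      yield
    ensure
      @pragmas = previous
    end
  end

  def u8(value)
    emit [value].pack("C")
  end

  def u16(value)
    emit [value].pack(@pragmas[:endian] == :big ? "S>" : "S<")
  end

  def str(string)
    emit string.encode(@pragmas[:encoding])
  end

  private

  # Append raw bytes to the output buffer.
  def emit(bytes)
    @output << bytes.b
  end
end

toy = ToyBindef.new
toy.instance_eval do
  u16 0x0102            # little-endian (the assumed default)
  pragma endian: :big do
    u16 0x0102          # big-endian, only inside this block
  end
  u16 0x0102            # little-endian again
  str "hi"
end
```

Evaluating the input with `instance_eval` is what lets commands appear as bare words, and the `ensure` clause is what keeps the scoped pragma from leaking out of its block even if the block raises.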

It’s also all Ruby underneath, so variables, interpolation, and methods are perfectly safe to use:

# Emits a `string` with a NUL terminator
# NOTE: This is actually provided as an extra command, see "usage" below.
def strz(string)
  str string
  str "\x00"
end

strz "this is my C-string"

foo = "foobar"

strz "#{foo}\n"

For a complete set of core commands and pragma settings, check out the SYNTAX file in the repository.

Installation and usage

For now, bindef is available through RubyGems:

$ gem install bindef
$ bd -h
$ bd < input > output

It doesn’t have any dependencies, though, so you can try it directly from the repository:

$ ruby -Ilib ./bin/bd -h
$ ruby -Ilib ./bin/bd < input > output

Check out the example directory for some basic real-world examples.

bindef also has a notion of “extras”, which are additional commands implemented in the core library but not exposed by default. These can be used within bindef inputs via the -e or --extra flag:

# Load extra commands for string and TLV output
$ bd -e string,tlv < input > output

The extras are sparsely documented, but should be pretty easy to read in source form. You can find them under lib/bindef/extras in the repository, and some of the examples use them.

Thanks

To Tobe Osakwe for t2b.

To Winny for the name bindef — my initial name was much worse.

