Jul 7, 2016 Tags: gsoc, programming, ruby, ruby-macho
For a quick background, see the first post in this series.
As of the last two weeks, in rough chronological order:
PPC parsing is fixed in Homebrew!
Ruby-macho has been live on test-bot since brew/378,
but the glue code between the parser and Homebrew failed to catch the case
where hardlinks to binaries were treated as separate files and sent to
relocation functions more than once. install_name_tool and otool failed
silently on this behavior, while ruby-macho complained loudly:
Error: No such dylib name: @@HOMEBREW_PREFIX@@/opt/readline/lib/libreadline.6.dylib
/usr/local/Library/Homebrew/vendor/macho/macho/macho_file.rb:240:in `change_install_name'
/usr/local/Library/Homebrew/vendor/macho/macho/tools.rb:33:in `change_install_name'
/usr/local/Library/Homebrew/os/mac/ruby_keg.rb:13:in `change_install_name'
/usr/local/Library/Homebrew/keg_relocate.rb:45:in `block (3 levels) in relocate_install_names'
...
This was remedied in brew/400 with a quick change to the exclusion logic
used in Keg#mach_o_files.
Many thanks to Tim,
Mike, and
ilovezfs for their comments on the issue and
PR (+ unit tests), and to my mentor Martin
for discovering the bug!
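To illustrate the kind of exclusion involved, here is a minimal sketch of filtering out hardlinks before relocation. This is not the actual Keg#mach_o_files code from brew/400; `unique_files` is a hypothetical helper, and it assumes that two paths are the same file when they share a device and inode number.

```ruby
require "set"

# Hypothetical helper (not part of Homebrew): keep only the first path seen
# for each underlying file, so hardlinked binaries are relocated once.
def unique_files(paths)
  seen = Set.new
  paths.select do |path|
    stat = File.stat(path)
    # Set#add? returns nil when the [device, inode] pair was already seen,
    # which makes select drop the duplicate path.
    seen.add?([stat.dev, stat.ino])
  end
end
```

Passing a list that contains a binary and a hardlink to it should return only the first of the two paths, leaving unrelated files untouched.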
(Correct) RPath support is finally here!
Partial support for runtime paths has been in ruby-macho for quite a while, but remained stubbornly untested. To make matters worse, there was actually a subtle bug in the null-padding calculations for load command strings that made RPath modification on 64-bit binaries fail catastrophically.
The bug has been fixed as of ruby-macho/35, which both corrects the load command string modification code and supplements the unit tests. On that topic…
Multibyte load command strings should now be handled correctly!
This was more of an oversight than a deferred feature, but
MachOFile#set_lc_str_in_cmd used the String#size method when calculating
the necessary padding. This hadn’t caused any problems before, but almost
certainly would have resulted in some very nasty bugs with strings
containing multibyte characters (e.g., UTF-8 characters outside the ASCII
range), as String#size would no longer have equaled String#bytesize.
Thanks again to Martin for catching this early!
This was fixed in ruby-macho/37, and was supplemented by a new
MachOFile#low_fileoff convenience function that gives the offset to the
first section.
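The difference between the two methods is easy to see in a small sketch. The helper below is hypothetical (`padded_lc_str` is not a ruby-macho method); it assumes a load command string is null-terminated and then null-padded to a multiple of an alignment width passed in by the caller (8 is used here as an example for 64-bit binaries).

```ruby
# Hypothetical helper, not ruby-macho's implementation: null-terminate a
# string and pad it with null bytes up to a multiple of `alignment`.
def padded_lc_str(str, alignment = 8)
  bytes = str.b # binary copy, so every length below is counted in bytes
  # Reserve one byte for the terminator, then round up to the alignment.
  padded_size = (bytes.bytesize + 1 + alignment - 1) / alignment * alignment
  bytes + "\x00".b * (padded_size - bytes.bytesize)
end

name = "caf\u00e9.dylib"
name.size     # counts characters
name.bytesize # counts bytes; larger than size because of the two-byte "é"
```

Computing the padding from String#size here would come up one byte short for every two-byte character, which is the class of bug described above.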
Synthesis and serialization of load commands is on the horizon!
Up until now, there was only one (correct) way to create a LoadCommand
(or subclass thereof) in ruby-macho - via a MachOFile or FatFile
instance. This had certain benefits, including being able to make
assumptions (about endianness, padding width, and so forth) because the
“parent” Mach-O data was always available for referencing. While this made
initialization of load commands conceptually simpler, it also necessitated
hacky functions like MachOFile#set_lc_str_in_cmd whose job was to modify
data present only in the load command while inside the control of the
MachOFile parent. This approach also completely ignored an inevitable
reality of Mach-O modification - the need to add new load commands, not
just modify existing ones.
ruby-macho/38 isn’t the complete answer to this deficiency, but it’s the first step towards resolving it. In particular, once finished and merged, it will allow new load commands (of select types) to be created without access to parent data:
rpath = LoadCommand.create(:LC_RPATH, "/home/william/lib")
And, more importantly, serialized back to binary strings:
rpath.serialize(:little) # => "\x1c\x00\x00\x80..."
…which will eventually allow us to do something like this:
m = MachO::MachOFile.new("foo.bin")
# create the object first
m.add_command(rpath)
# or, let ruby-macho do the dirty work for you
m.add_rpath("/home/william/lib")
This is all preliminary (see ruby-macho/40 for the meat of the rpath work),
so don’t take the examples above as concrete documentation - the API will
almost certainly change as we evaluate its responsibilities and
dependencies before merging. However, when finished, this will allow load
command additions and modifications to be realized much more simply (and
without the need for 57-line monstrosities like set_lc_str_in_cmd).
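For a rough idea of what serializing an LC_RPATH command without parent data involves, here is a standalone sketch. It is not the code in ruby-macho/38: `serialize_rpath` is a hypothetical function, and the endianness and alignment that a parent MachOFile would normally supply are assumed to be passed in explicitly. The layout follows the Mach-O rpath_command structure: three 32-bit fields (command, command size, string offset) followed by the null-terminated, null-padded path.

```ruby
LC_RPATH = 0x8000001c  # load command constant for runtime search paths
RPATH_HEADER_SIZE = 12 # cmd + cmdsize + path offset, 4 bytes each

# Hypothetical sketch, not ruby-macho's API.
def serialize_rpath(path, endianness = :little, alignment = 8)
  format = endianness == :little ? "V3" : "N3"
  raw = path.b + "\x00".b # byte string plus its null terminator
  # The total command size must be a multiple of the alignment.
  cmdsize = (RPATH_HEADER_SIZE + raw.bytesize + alignment - 1) / alignment * alignment
  header = [LC_RPATH, cmdsize, RPATH_HEADER_SIZE].pack(format)
  # The path begins right after the header; pad the remainder with nulls.
  header + raw.ljust(cmdsize - RPATH_HEADER_SIZE, "\x00")
end

serialize_rpath("/home/william/lib", :little) # begins with "\x1c\x00\x00\x80"
```

Because everything the function needs arrives as an argument, the result can be produced and tested without opening a Mach-O file at all, which is the main appeal of decoupling load commands from their parent.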
I’m also going to try something new in this post by listing a few (not all!) of the things I plan to do over the next two weeks:
Get ruby-macho/38 and
ruby-macho/40 into master
and cap them off with a release (and Homebrew re-vendor), bringing the long
march towards complete rpath support to an end.
Replace set_lc_str_in_cmd with load command creation and serialization
calls for both dylibs and rpaths. This will eliminate the longest (and arguably
ugliest) method in ruby-macho.
Create some more powerful command-line tools backed by ruby-macho. Of the
four that are currently provided in the bin/ directory, only info.rb is
really useful now that the test suite is much more complete. Overall, it would
be nice to have a few full utilities installed with the gem - perhaps even
clones of install_name_tool and otool.
Fix some deficiencies and mistakes in the test suite. I’ve been using pass
instead of skip to skip tests, which is absolutely the wrong thing to do.
Similarly, a lot of the tests haven’t kept pace with the public API
and can almost certainly be made shorter and/or faster.
Shoring up the documentation. According to yard, about 85% of the public API is
covered by documentation. I’d like to get this to 100%, which might be tedious
(lots of constants to be labeled) but shouldn’t be difficult.
Thanks for reading!
- William