ENOSUCHBLOG

Programming, philosophy, pedaling.


Cleaner Function Wrapping in C

May 5, 2017     Tags: c, programming    


Using LD_PRELOAD to override and wrap functions is a well-known trick with a lot of use cases.

In its most basic form, function overriding just involves matching the prototype of the target function (in this case read(2)):

#include <sys/types.h>
#include <errno.h>

ssize_t read(int fd, void *buf, size_t count)
{
	return errno = EINVAL, -1; /* this is going to make someone unhappy */
}

And then using LD_PRELOAD with the target program:

$ gcc -shared -fPIC fake_read.c -o libfakeread.so
$ LD_PRELOAD=./libfakeread.so cat /etc/hostname

The next step is actually wrapping the target function, which we can do with dlsym(3) and the RTLD_NEXT pseudo-handle:

#include <sys/types.h>
#include <dlfcn.h>

ssize_t read(int fd, void *buf, size_t count)
{
	ssize_t (*__real_read)(int fd, void *buf, size_t count);
	__real_read = dlsym(RTLD_NEXT, "read");

	/* do whatever you want here */

	return __real_read(fd, buf, count);
}

RTLD_NEXT isn’t defined by POSIX, so you’ll need to build with _GNU_SOURCE (and with -ldl, to link against libdl, which provides dlsym(3)).

$ gcc -D_GNU_SOURCE -shared -fPIC fake_read.c -o libfakeread.so -ldl
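One thing the wrapper above glosses over is that dlsym(3) returns NULL when no later object defines the symbol, and calling through a NULL pointer crashes the target program. Here is a minimal sketch of a checked lookup. The read_fn typedef and the lookup_real_read helper are names I made up for illustration, and failing with ENOSYS is an arbitrary choice, not something the technique requires. I've also dropped the leading double underscore from the pointer name, since such identifiers are reserved in C.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* needed for RTLD_NEXT unless passed as -D_GNU_SOURCE */
#endif
#include <dlfcn.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical typedef: the signature of read(2) as a function pointer. */
typedef ssize_t (*read_fn)(int fd, void *buf, size_t count);

/* Hypothetical helper: find the next definition of read(2), or NULL. */
static read_fn lookup_real_read(void)
{
	/* dlsym(3) returns NULL if no later object in the search order has it. */
	return (read_fn)dlsym(RTLD_NEXT, "read");
}

ssize_t read(int fd, void *buf, size_t count)
{
	read_fn real_read = lookup_real_read();

	if (real_read == NULL) {
		/* Arbitrary choice for this sketch: fail the call rather than crash. */
		errno = ENOSYS;
		return -1;
	}

	return real_read(fd, buf, count);
}
```

This still performs the lookup on every call, so it only addresses the missing check, not the cost.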

However, calling dlsym(3) once per read(2) invocation isn’t very pretty. It might also perform pretty badly, depending on the frequency of invocation, the size of the dependency tree, and so forth.

To avoid this, we can take advantage of the constructor function attribute provided by GCC and Clang to mark a function for execution during shared library load:

#include <stdio.h>
#include <sys/types.h>
#include <dlfcn.h>

static ssize_t (*__real_read)(int fd, void *buf, size_t count);

__attribute__((constructor)) static void init_wraps(void)
{
	__real_read = dlsym(RTLD_NEXT, "read");
}

ssize_t read(int fd, void *buf, size_t count)
{
	ssize_t nread = __real_read(fd, buf, count);

	/* %zd is the conversion for ssize_t; nread is -1 on error */
	printf("wrapped read: %zd bytes read\n", nread);

	return nread;
}

(There’s also the destructor attribute that gets run at program teardown, but we don’t need it here. These attributes are not part of the C standard, but that doesn’t particularly bother me so long as both GCC and Clang support them.)
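For completeness, here is a rough sketch of what pairing the two attributes could look like: the constructor resolves the real read(2) and the destructor reports a call count at teardown. The fini_wraps function and the read_calls counter are hypothetical names of mine, the counter is not thread-safe, and the sketch assumes nothing calls read(2) before the constructor has run.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* needed for RTLD_NEXT unless passed as -D_GNU_SOURCE */
#endif
#include <dlfcn.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

static ssize_t (*real_read)(int fd, void *buf, size_t count);
static unsigned long read_calls; /* hypothetical counter; not thread-safe */

/* Runs when the library is loaded, before the target's main(). */
__attribute__((constructor)) static void init_wraps(void)
{
	real_read = (ssize_t (*)(int, void *, size_t))dlsym(RTLD_NEXT, "read");
}

/* Runs at teardown; stderr is unbuffered, so the message is written directly. */
__attribute__((destructor)) static void fini_wraps(void)
{
	fprintf(stderr, "wrapped read: %lu calls in total\n", read_calls);
}

ssize_t read(int fd, void *buf, size_t count)
{
	read_calls++;
	return real_read(fd, buf, count);
}
```

It builds and preloads the same way as the constructor-only version.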

Now we only call dlsym(3) once per wrapped function and only at the very beginning of the target program’s execution, with no changes in result:

$ gcc -D_GNU_SOURCE -shared -fPIC fake_read.c -o libfakeread.so -ldl
$ LD_PRELOAD=./libfakeread.so cat /etc/hostname
wrapped read: 8 bytes read
mercury
wrapped read: 0 bytes read

You can find a practical application of this in WGOtW, a socket tool that I’ve been writing as a learning experiment (read: an experiment in re-learning and writing practical C code).

Update: You can avoid the dlsym(3) repetition in standard C too, by doing a null check:

#include <stdio.h>
#include <sys/types.h>
#include <dlfcn.h>

static ssize_t (*__real_read)(int fd, void *buf, size_t count) = NULL;

ssize_t read(int fd, void *buf, size_t count)
{
	if (__real_read == NULL) {
		__real_read = dlsym(RTLD_NEXT, "read");
	}

	ssize_t nread = __real_read(fd, buf, count);

	/* %zd is the conversion for ssize_t; nread is -1 on error */
	printf("wrapped read: %zd bytes read\n", nread);

	return nread;
}

I prefer the constructor version because it separates the wrapping phase from the actual invocation, but this one is just as good in terms of performance (and is more standard). Many thanks to resync for suggesting this.

Second Update: As /u/quicknir has pointed out, this null-checking version has a race condition in it.
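One possible way to keep the lazy lookup while avoiding that race is pthread_once(3), which runs an initializer exactly once even when several threads reach it at the same time. This is a sketch of that idea, not what /u/quicknir proposed; real_read_once and resolve_real_read are hypothetical names, and it trades standard C for POSIX threads.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* needed for RTLD_NEXT unless passed as -D_GNU_SOURCE */
#endif
#include <dlfcn.h>
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

static ssize_t (*real_read)(int fd, void *buf, size_t count);
static pthread_once_t real_read_once = PTHREAD_ONCE_INIT;

/* Hypothetical initializer: resolves the next definition of read(2). */
static void resolve_real_read(void)
{
	real_read = (ssize_t (*)(int, void *, size_t))dlsym(RTLD_NEXT, "read");
}

ssize_t read(int fd, void *buf, size_t count)
{
	/* The first caller runs resolve_real_read(); concurrent callers wait
	 * until it has finished, so real_read is set before anyone uses it. */
	pthread_once(&real_read_once, resolve_real_read);

	return real_read(fd, buf, count);
}
```

Depending on the libc, this may need -pthread added to the build line:

$ gcc -D_GNU_SOURCE -shared -fPIC fake_read.c -o libfakeread.so -ldl -pthread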