Creating a C API for a Rust library
Josh Matthews, Dysfunctional Programming, 2015-10-17
http://www.joshmatthews.net/blog/?p=208

Yoric has been doing great work porting Firefox’s client backend to Rust for use in Servo (see telemetry.rs), so I decided to create a C API to allow using it from other contexts. You can observe the general direction of my problem-solving by looking at the resulting commits in the PR, but I drew a lot of inspiration from html5ever’s C API.

There are three main problems that require solving for any new C API to a Rust library:

  • writing low-level bindings to call whatever methods are necessary on high-level Rust types
  • writing equivalent C header files for the low-level bindings
  • adding automated tests to ensure that all of the pieces work

Low-level bindings

Having never used telemetry.rs before, I wrote my bindings by looking at the reference commit for integrating the library into Servo, as well as examples/main.rs. This worked out well for me for this afternoon hack, since I didn’t need to spend any time reading through the internals of the library to figure out what was important to expose. This did bite me later when I realized that the implementation for Servo is significantly more complicated than the example code, which caused me to redesign several aspects of the API to require explicitly passing around global state (as can be seen in this commit).

My workflow here was to categorize the library usage that I saw in Servo, which yielded three main areas that I needed to expose through C: initialization/shutdown, definition and recording of histogram types, and serialization. In each case I sketched out some functions that made sense (e.g. count_histogram.record(1) became unsafe extern "C" fn telemetry_count_record(count: *mut count_t, value: libc::c_uint)) and wrote placeholder structures to verify that everything compiled.
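As a rough illustration of that translation step, here is a minimal sketch of how a method call can turn into a C-callable function. The Count type and its total field are hypothetical stand-ins rather than the real telemetry.rs histogram, and std::os::raw::c_uint is used in place of the libc crate so the snippet needs no dependencies.

```rust
use std::os::raw::c_uint;

/// Hypothetical stand-in for the library's count histogram (not the real type).
pub struct Count {
    pub total: u64,
}

impl Count {
    pub fn record(&mut self, value: u32) {
        self.total += u64::from(value);
    }
}

/// Thin wrapper handed out to C. C never sees the fields, only a pointer.
#[allow(non_camel_case_types)]
pub struct count_t {
    pub inner: Count,
}

/// C-callable equivalent of `count_histogram.record(value)`.
/// The caller must pass a valid, non-null pointer to a live `count_t`.
#[no_mangle]
pub unsafe extern "C" fn telemetry_count_record(count: *mut count_t, value: c_uint) {
    (*count).inner.record(value);
}

/// Exercises the binding from Rust, the same way a C caller would use it.
pub fn demo_count() -> u64 {
    let mut count = count_t { inner: Count { total: 0 } };
    unsafe {
        // `&mut count_t` coerces to `*mut count_t` at the call site.
        telemetry_count_record(&mut count, 1);
        telemetry_count_record(&mut count, 2);
    }
    count.inner.total
}
```

Keeping the wrapper this thin means the unsafe surface is limited to the pointer dereference; all of the real behaviour stays in ordinary safe Rust.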

Next I implemented type constructors and destructors, and decided not to expose any Rust structure members to C. This allowed me to use types like Vec<T> in my implementation of the global telemetry state, which both improved the resulting API ergonomically (many fewer _free functions are required) and allowed me to write more concise and correct code. This decision also allowed me to define a destructor on a type exposed to C; this would usually be forbidden due to changing the low-level representation of the type in ways visible to C if the structure members were exposed. Generally these API methods took the form of the following:

#[no_mangle]
pub unsafe extern "C" fn telemetry_new_something(telemetry: *mut telemetry_t, ...) -> *mut something_t {
    let something = Box::new(something::new(...));
    Box::into_raw(something)
}

#[no_mangle]
pub unsafe extern "C" fn telemetry_free_something(something: *mut something_t) {
    let something = Box::from_raw(something);
    drop(something);
}

The use of Box in this code places the enclosed value on the heap, rather than the stack, which allows us to return to the caller without the value being deallocated. However, because C deals in raw pointers rather than Rust’s Box type, we are forced to convert (ie. reinterpret_cast) the box into a pointer that it can understand. This also means that the memory pointed to will not be deallocated until the Rust code explicitly asks for it, which is accomplished by converting the pointer back into an owned Box upon request.
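To make the ownership hand-off concrete, here is a small self-contained sketch of the same constructor/destructor pair. The names demo_new_flag and demo_free_flag, the flag_t contents, and the DROPPED counter are all invented for this illustration; the counter exists only to show that the Rust destructor runs when the pointer is converted back into a Box, and not before.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts destructor runs. Present purely for demonstration.
static DROPPED: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical opaque type; C only ever holds a pointer to it.
#[allow(non_camel_case_types)]
pub struct flag_t {
    pub set: bool,
}

impl Drop for flag_t {
    fn drop(&mut self) {
        DROPPED.fetch_add(1, Ordering::SeqCst);
    }
}

/// Moves the value to the heap and gives up ownership to the caller.
#[no_mangle]
pub extern "C" fn demo_new_flag() -> *mut flag_t {
    Box::into_raw(Box::new(flag_t { set: false }))
}

/// Takes ownership back. The Box is dropped here, which runs `Drop`
/// and frees the allocation. Tolerating null is an added assumption.
#[no_mangle]
pub unsafe extern "C" fn demo_free_flag(flag: *mut flag_t) {
    if flag.is_null() {
        return;
    }
    drop(Box::from_raw(flag));
}

/// Returns the drop count before and after freeing one flag.
pub fn demo_round_trip() -> (usize, usize) {
    let flag = demo_new_flag();
    // Nothing has been dropped for this flag yet: the raw pointer owns it.
    let before = DROPPED.load(Ordering::SeqCst);
    unsafe { demo_free_flag(flag) };
    let after = DROPPED.load(Ordering::SeqCst);
    (before, after)
}
```

Every pointer returned by the constructor has to reach the matching free function exactly once; freeing it twice, or never, is the usual C-style bug that Rust can no longer catch on the other side of the boundary.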

Once I had meaningful types, I filled out the API methods that were used for operating on them. The types used in the public API are very thin wrappers around the full-featured Rust types, so this step was mostly boilerplate like this:

#[no_mangle]
pub unsafe extern "C" fn telemetry_record_flag(flag: *mut flag_t) {
    (*flag).inner.record();
}

The code for serialization was the most interesting part, since it required some choices. The Rust API requires passing a Sender<JSON> and allows the developer to retrieve the serialized results at any point in the future through the JSON API. In the interests of minimizing the amount of work required on a Saturday afternoon, I chose to expose a synchronous API that waits on the result from the receiver and returns the pretty-printed result in a string, rather than attempting to model any kind of Sender/Receiver or JSON APIs. Even this ended up causing some trouble, since telemetry.rs supports stable, beta and nightly versions of Rust right now. Rust 1.5 contains some nice ffi::CString APIs for passing around C string representations of Rust strings, but these are named differently in Rust 1.4 and don’t exist in Rust 1.3. To solve this problem, I ended up defining an opaque type named serialization_t which wraps a CString value, along with a telemetry_borrow_string API function to extract a C string from it. The resulting API works across all versions of Rust, even if it feels a bit clunky.
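The opaque-string approach can be sketched roughly as follows. Only the names serialization_t and telemetry_borrow_string come from the description above; the signature shown for telemetry_borrow_string, the wrap_serialized helper, and demo_free_serialization are assumptions made for this example and may not match the real code. The sketch relies only on CString::new and CString::as_ptr, which avoids the version-specific ownership-transfer APIs.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Opaque wrapper that owns the C representation of the serialized output.
#[allow(non_camel_case_types)]
pub struct serialization_t {
    inner: CString,
}

/// Hypothetical helper: wraps already-serialized text for handing to C.
/// Assumes the text contains no interior NUL bytes.
pub fn wrap_serialized(json: &str) -> *mut serialization_t {
    let inner = CString::new(json).expect("serialized text contains a NUL byte");
    Box::into_raw(Box::new(serialization_t { inner }))
}

/// Borrows the string. The pointer is valid only until the
/// `serialization_t` it came from is freed. (Assumed signature.)
#[no_mangle]
pub unsafe extern "C" fn telemetry_borrow_string(s: *mut serialization_t) -> *const c_char {
    (*s).inner.as_ptr()
}

/// Hypothetical destructor for the wrapper; frees the CString with it.
#[no_mangle]
pub unsafe extern "C" fn demo_free_serialization(s: *mut serialization_t) {
    drop(Box::from_raw(s));
}

/// Reads the borrowed string back the way a C caller would, then frees it.
pub fn demo_borrow() -> String {
    let s = wrap_serialized("{\"flag\": 1}");
    unsafe {
        // Copy the text out before freeing, since the pointer is only borrowed.
        let text = CStr::from_ptr(telemetry_borrow_string(s))
            .to_string_lossy()
            .into_owned();
        demo_free_serialization(s);
        text
    }
}
```

The cost of this design is one extra call and one extra type on the C side, which is the clunkiness mentioned above; the benefit is that the C caller never has to know how the string was allocated.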

C header files

The next step was writing a header file that matched the public types and functions exposed by my low-level bindings (like an inverse bindgen). This was a straightforward application of writing out function prototypes that match, since all of the types I expose are opaque structs (ie. struct telemetry_t;).

The most interesting part of this step was writing a C program that linked against my Rust library and included my new header file. I ported a simple test to C from a Rust one I had added earlier to the low-level bindings, then wrote a straightforward Makefile to get the linking right:

CC ?= gcc
CFLAGS += -I../../capi/include/
LDFLAGS += -L../../capi/target/debug/ -ltelemetry_capi
OBJ := telemetry.o

%.o: %.c
	$(CC) -c -o $@ $< $(CFLAGS)

telemetry: $(OBJ)
	$(CC) -o $@ $^ $(LDFLAGS)

This worked! Running the resulting binary yielded the same output as the test that used the C API from Rust, which seemed like a successful result.

Automated testing

Following html5ever's example, my prior work defined a separate crate for the C API (libtelemetry_capi), which meant that the default cargo test would not exercise it. Until this point I had been running Cargo from the capi subdirectory, and running the makefile from the examples/capi subdirectory. Continuing to steal from html5ever's prior art, I created a script that would run the full suite of Cargo tests for the non-C API parts, then run cargo test --manifest-path capi/Cargo.toml, followed by make -C examples/capi, and made Travis use that as its script to run all tests.

These changes led me to discover a problem with my Makefile: changes to the Rust code for the C API would not cause the example to be rebuilt, so I didn't actually have effective local continuous integration (the changes still would have been picked up on Travis). Accordingly, I added a DEPS variable to the makefile that looked like this:

DEPS := ../../capi/include/telemetry.h ../../capi/target/debug/libtelemetry_capi.a Makefile

With DEPS listed as a prerequisite of the build rules, the example is rebuilt whenever the C header, the underlying static library, or the Makefile itself changes. The result is that whenever I'm modifying telemetry.rs, I can now make changes, run ./travis-build.sh and feel confident that I haven't inadvertently broken the C API.
