Learning the Haskell FFI with C2HS

September 22, 2013

I’ve recently started helping with the maintenance of C2HS, a tool for generating Haskell foreign function interface (FFI) bindings from C header files. I started using C2HS because of a Haskell library I was writing to read Unidata NetCDF files. I didn’t fancy writing all of the bindings to the hundreds of functions in the C NetCDF library by hand and C2HS seemed like a good way to get started with the Haskell FFI.

The Haskell FFI is well documented, both in the Language Report and in the Haddock documentation for the Foreign modules that define various helper types and marshalling functions (Foreign and Foreign.C are good starting points). Even with the documentation, though, the FFI is pretty complicated. There are lots of choices for marshalling and for allocating memory for communication between Haskell code and C code: there are Ptrs and ForeignPtrs, there are various functions for allocating memory with different lifetimes, and there are lots of types floating around. It’s all quite confusing when you’re getting started.
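To make the point about memory lifetimes a little more concrete, here is a minimal sketch of three of the allocation styles available in the standard Foreign modules. The function names (scoped, manual, managed) are made up for illustration, and no C code is called; each function just writes a value into freshly allocated memory and reads it back.

```haskell
import Control.Exception (bracket)
import Foreign.C.Types (CInt)
import Foreign.ForeignPtr (mallocForeignPtr, withForeignPtr)
import Foreign.Marshal.Alloc (alloca, free, malloc)
import Foreign.Storable (peek, poke)

-- alloca: the memory is only valid inside the callback and is
-- released automatically when the callback returns.
scoped :: IO CInt
scoped = alloca $ \p -> poke p 1 >> peek p

-- malloc/free: the lifetime is managed by hand; bracket makes sure
-- that free runs even if the body throws an exception.
manual :: IO CInt
manual = bracket malloc free $ \p -> poke p 2 >> peek p

-- mallocForeignPtr: the memory is owned by the garbage collector and
-- is released some time after the ForeignPtr becomes unreachable.
managed :: IO CInt
managed = do
  fp <- mallocForeignPtr
  withForeignPtr fp $ \p -> poke p 3 >> peek p
```

Loaded into GHCi, the three actions should return 1, 2 and 3 respectively. Which style is appropriate depends on who owns the memory and how long the C side holds on to the pointer.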

This is where I found starting with C2HS really useful. It’s very easy to write the C2HS specification for a C function, run the C2HS tool over your code and look at the Haskell marshalling code it produces. For instance, for the C function

where n is an input parameter and status is used as an output parameter, the C2HS specification looks like:

The annotations around the second parameter to foo are used to indicate that some memory needs to be allocated for the status pointer parameter and that the value of this pointer needs to be accessed and returned monadically (the C2HS documentation is pretty good about explaining these things). After running C2HS on this definition, you get some Haskell code that looks like this (after cleaning up a little and renaming some things):

As well as the foreign import of the foo function from its C header file, this has a foo function that deals with all of the type conversion and the argument and result marshalling between Haskell and C. Conversion between the Haskell Int and Double types and the CInt and CDouble types used to communicate with C functions is done using fromIntegral and realToFrac. Memory is allocated for the integer pointer status parameter and the result is extracted from it afterwards, with the sequence of allocation, C function call and result extraction ordered monadically.
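The same marshalling pattern can be written by hand against a standard C function. The sketch below is not C2HS output and is not the foo function discussed above: it uses frexp from math.h as a stand-in, purely because it has a similar shape (a numeric input, an integer pointer used as an output and a numeric return value). The names c_frexp and frexpH are illustrative.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types (CDouble, CInt)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peek)

-- Raw import of: double frexp(double x, int *exp);
foreign import ccall unsafe "math.h frexp"
  c_frexp :: CDouble -> Ptr CInt -> IO CDouble

-- Hand-written wrapper: convert the input, allocate space for the
-- output parameter, call the C function, then read the output back.
frexpH :: Double -> IO (Double, Int)
frexpH x =
  alloca $ \expPtr -> do
    mantissa <- c_frexp (realToFrac x) expPtr
    expo <- peek expPtr
    return (realToFrac mantissa, fromIntegral expo)
```

In GHCi, frexpH 8.0 should give (0.5,4), since 8 = 0.5 * 2^4. The ordering matters: the peek has to happen after the C call and inside the alloca callback, while the pointer is still valid.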

As a more complicated example, consider this function from the NetCDF C library:

A suitable C2HS specification for this function is:

This function has a C string parameter as well as an integer array passed as an input parameter (which we’re going to represent as a Haskell list) and an integer pointer used as an output parameter. The withIntArray function is a marshalling helper. From this specification, C2HS produces the following Haskell code:

This gives a pretty good idea of how to deal with marshalling of input and output arguments of different types, including C strings (it uses the withCString function from Foreign.C.String) and arrays (the withIntArray function is just a specialisation of withArray from Foreign.Marshal.Array).
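A rough, self-contained sketch of how these helpers nest is shown below. To keep it runnable without the NetCDF library, the C function is replaced by a hypothetical stand-in written in Haskell (standIn) that works on raw pointers in the way a C function would. The definition of withIntArray is an assumption based only on the description above, and defineThing is an invented name rather than anything C2HS generates.

```haskell
import Foreign.C.String (CString, peekCString, withCString)
import Foreign.C.Types (CInt)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Marshal.Array (peekArray, withArray)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peek, poke)

-- Assumed definition: a specialisation of withArray that converts a
-- list of Haskell Ints to a temporary C array of CInts.
withIntArray :: [Int] -> (Ptr CInt -> IO b) -> IO b
withIntArray = withArray . map fromIntegral

-- Hypothetical stand-in for an imported C function. It takes a C
-- string, an array length, an input array and an output pointer,
-- writes (string length + sum of array) to the output and returns 0.
standIn :: CString -> CInt -> Ptr CInt -> Ptr CInt -> IO CInt
standIn name n dims out = do
  s <- peekCString name
  ds <- peekArray (fromIntegral n) dims
  poke out (fromIntegral (length s) + sum ds)
  return 0

-- Wrapper: each "with" helper scopes one temporary C value, and the
-- output parameter is read back once the call has returned.
defineThing :: String -> [Int] -> IO (Int, Int)
defineThing name dims =
  withCString name $ \cname ->
    withIntArray dims $ \cdims ->
      alloca $ \outPtr -> do
        status <- standIn cname (fromIntegral (length dims)) cdims outPtr
        out <- peek outPtr
        return (fromIntegral status, fromIntegral out)
```

With these definitions, defineThing "temp" [1, 2, 3] should return (0,10). In real bindings standIn would be replaced by a foreign import with the same pointer-based argument types.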

Once you’ve looked at a few examples like this (and more complex ones), it becomes much easier to figure out how to abstract these patterns for more involved cases. For most of the functions in my NetCDF library, I don’t use C2HS specifications directly, but use parameterised functions I wrote based on experiments done with C2HS. In the beginning, I would definitely have had a hard time writing this sort of parameterised FFI function without the examples provided by C2HS. From that perspective, C2HS is really a pretty neat tool for learning how to use the Haskell FFI (you could probably do much the same kind of thing with hsc2hs or similar tools).
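As an illustration of what abstracting one of these patterns might look like, here is a small sketch of a helper that captures the "allocate an output parameter, call, read it back" sequence once and reuses it for two standard C functions. This is not the code from my NetCDF library: withOut, modfH and frexpH are hypothetical names, and modf and frexp from math.h are used only because they are available everywhere.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types (CDouble, CInt)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (Ptr)
import Foreign.Storable (Storable, peek)

-- double modf(double x, double *iptr);
foreign import ccall unsafe "math.h modf"
  c_modf :: CDouble -> Ptr CDouble -> IO CDouble

-- double frexp(double x, int *exp);
foreign import ccall unsafe "math.h frexp"
  c_frexp :: CDouble -> Ptr CInt -> IO CDouble

-- Generic helper: run a call that takes a single output pointer and
-- return both the call's own result and the value it wrote.
withOut :: Storable a => (Ptr a -> IO r) -> IO (r, a)
withOut call = alloca $ \p -> do
  r <- call p
  v <- peek p
  return (r, v)

-- Two wrappers built from the same helper, differing only in the
-- conversions applied to the raw C values.
modfH :: Double -> IO (Double, Double)
modfH x = do
  (frac, whole) <- withOut (c_modf (realToFrac x))
  return (realToFrac frac, realToFrac whole)

frexpH :: Double -> IO (Double, Int)
frexpH x = do
  (m, e) <- withOut (c_frexp (realToFrac x))
  return (realToFrac m, fromIntegral e)
```

Here modfH 3.25 should give (0.25,3.0) and frexpH 8.0 should give (0.5,4). The helper is parameterised over the type of the output value through the Storable constraint, so the same code serves for int and double output parameters.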