Creating Cross-Language APIs

This week’s blog is a guest post from Layl Bongers, who tried our pre-alpha and wanted to share some experiences. Thank you so much Layl!

One of the biggest reasons I’m excited about The Machinery is its powerful plugin system. Most engines have some sort of plugin system that allows you to write code on top of the engine. Often this is done through scripting languages like C# or Lua. Sometimes engines allow you to write in the language the engine is written in, which is usually C++. Rarely you get a C interface that you can call into from other languages.

The Machinery takes this a step further by designing the entire engine around a plugin-friendly architecture. On top of that, everything publicly accessible restricts itself to C, a dream for cross-language support!

So, when I got access to The Machinery pre-alpha, the first thing I did was see if I could make a Hello World plugin in two of my favorite languages, Rust and C#. Since the entire API is accessible through C interfaces, it should be easy, right? Well, as it turns out, it’s not as straightforward as that.

As I explored implementing language bindings, I started writing down some issues I encountered. The people working on The Machinery asked me to write a blog post on this, and I’m happy to do so! I hope this helps people create APIs that enable game developers to work more with their languages of choice.

Why C?

When making a cross-language interface, you pretty much have only one choice: creating a public C interface to your code. The C ABI (Application Binary Interface) has become the de facto way to communicate between languages. This is because, on any given platform, the C ABI is stable and well defined.

For C, how to call functions, how data is laid out in memory, what goes where on the stack, and what goes in which registers were all defined and standardized for each platform decades ago. This is not the case for most other languages, like C++, where these things can differ from compiler to compiler. Even different versions of the same compiler can be incompatible.

For this reason, C has become the common ground for talking between languages. All current major operating systems expose a C API to the applications running on them. Most languages have some method of working directly with a C API, even languages far removed from the low level like C#. This means that if you want broad language support, you want your engine or library to expose a C API.
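As a small illustration of what "working with the C ABI" looks like from another language, here is a minimal Rust sketch. The `Vec3` type and `vec3_dot` function are made up for this example and are not part of any real engine API. The point is only that `#[repr(C)]` requests C's struct layout rules and `extern "C"` requests the platform's C calling convention.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical plain-data type. `#[repr(C)]` asks Rust to lay the struct out
/// the way a C compiler would: fields in declaration order, C padding rules.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct Vec3 {
    pub x: f32,
    pub y: f32,
    pub z: f32,
}

/// `extern "C"` selects the platform's C calling convention for this function.
pub extern "C" fn vec3_dot(a: Vec3, b: Vec3) -> f32 {
    a.x * b.x + a.y * b.y + a.z * b.z
}

/// The function pointer type a C header could declare for `vec3_dot`.
pub type DotFn = extern "C" fn(Vec3, Vec3) -> f32;

fn main() {
    // Three 4-byte floats with no padding in between.
    assert_eq!(size_of::<Vec3>(), 12);
    assert_eq!(align_of::<Vec3>(), 4);

    // Call through a C-ABI function pointer, as a plugin host would.
    let dot: DotFn = vec3_dot;
    let a = Vec3 { x: 1.0, y: 2.0, z: 3.0 };
    println!("dot = {}", dot(a, a));
}
```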

So, what’s the issue?

Unfortunately, just exposing a C API is not enough to provide a decent or even remotely usable API for most languages. There are a few big issues you’ll quickly run into.

  • Direct C bindings are almost never enough to create a nice-to-use API in other languages. Rust doesn’t allow you to call FFI functions outside an unsafe block, and C# requires an unsafe context as soon as raw pointers are involved. If you want to support these languages, you will need native wrappers around the raw C bindings.

  • C code is often a lot looser with memory lifetime requirements. A simple function that registers a callback needs that callback to remain valid for an indeterminate amount of time. This conflicts with garbage-collected languages, which try to hide memory management by doing it automatically.

  • Certain ways of working and some features in C may not work at all in other languages. In C you can create global structures that stay valid for as long as your DLL stays loaded; where this exists in other languages, it’s heavily discouraged. Untagged unions are basically non-existent outside C and C++, and heavily discouraged where they do exist. Variadic functions aren’t supported at all by some foreign function interfaces.

  • Writing bindings is tedious, really really tedious. Especially if you have a large API. Especially if you want broad language support. These bindings can also easily and silently go out of date.
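To make the first point concrete, here is a rough sketch of what a native wrapper around a raw binding might look like in Rust. `RawLoggerApi`, `engine_print`, and `Logger` are hypothetical names invented for this example. The "engine side" is written in Rust here only so the sketch is self-contained; in a real plugin it would come from the host.

```rust
use std::ffi::{c_char, CStr, CString, NulError};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical raw binding: a struct of C function pointers.
#[repr(C)]
pub struct RawLoggerApi {
    pub print: unsafe extern "C" fn(text: *const c_char),
}

/// Byte counter, present only so the example has something observable.
pub static BYTES_PRINTED: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for the engine's implementation (an assumption for this sketch).
pub unsafe extern "C" fn engine_print(text: *const c_char) {
    // SAFETY: the caller guarantees `text` is a valid NUL-terminated string
    // for the duration of this call.
    let text = unsafe { CStr::from_ptr(text) };
    BYTES_PRINTED.fetch_add(text.to_bytes().len(), Ordering::SeqCst);
    println!("{}", text.to_string_lossy());
}

/// Safe wrapper: users of this type never touch raw pointers.
pub struct Logger {
    raw: RawLoggerApi,
}

impl Logger {
    /// # Safety
    /// `raw.print` must be callable with any valid NUL-terminated string and
    /// must not keep the pointer after it returns.
    pub unsafe fn new(raw: RawLoggerApi) -> Self {
        Logger { raw }
    }

    /// Converts a Rust string to a C string and forwards it to the raw API.
    /// Fails if `text` contains an interior NUL byte.
    pub fn print(&self, text: &str) -> Result<(), NulError> {
        let c_text = CString::new(text)?;
        // SAFETY: `c_text` stays alive until the call returns, and the
        // contract of `new` says the callee does not keep the pointer.
        unsafe { (self.raw.print)(c_text.as_ptr()) };
        Ok(())
    }
}

fn main() {
    // SAFETY: `engine_print` only reads the string during the call.
    let logger = unsafe { Logger::new(RawLoggerApi { print: engine_print }) };
    logger.print("Hello from a safe wrapper").expect("no interior NUL");
    println!("bytes printed: {}", BYTES_PRINTED.load(Ordering::SeqCst));
}
```

All the `unsafe` is confined to the wrapper, which is exactly the code that is tedious to write by hand for a large API.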

Additionally, The Machinery’s extreme modularity exposes another issue. If someone writes a plugin for The Machinery, that plugin will also need API bindings for other languages before anyone using those languages can use it.

Solving some problems

These issues are solvable, but you’ll have to create some rules for your API. I recommend you create a document with strict guidelines around your public APIs.

Limit the public API to a subset of C that’s friendly to other languages. Specifically, stick to C89, as it is more broadly supported. Research which features of C89 actually work with the Foreign Function Interface of the languages you’re interested in supporting. Avoid constructs not friendly to other languages, such as untagged unions. As a rule of thumb, stick to plain data structures where you can.
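One possible way to keep union-like data inside such a subset is to always pair it with an explicit integer tag, so a binding can tell which member is live. The sketch below shows the Rust side of that idea. `TaggedValue`, the `VALUE_KIND_*` constants, and `decode` are names made up for illustration, and a plain `u32` tag is an assumption chosen so that unexpected values from foreign code can be rejected instead of trusted.

```rust
/// Hypothetical tag values, as a C header might `#define` them.
pub const VALUE_KIND_INT: u32 = 0;
pub const VALUE_KIND_FLOAT: u32 = 1;

/// The raw payload. On its own this is an untagged union.
#[repr(C)]
#[derive(Clone, Copy)]
pub union ValueData {
    pub as_int: i64,
    pub as_float: f64,
}

/// Tag plus payload: plain data that crosses the C boundary as one struct.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct TaggedValue {
    pub kind: u32,
    pub data: ValueData,
}

/// The friendly, language-native view of the same data.
#[derive(Debug, PartialEq)]
pub enum Value {
    Int(i64),
    Float(f64),
}

/// Turns the raw struct into a native enum, rejecting unknown tags.
pub fn decode(raw: &TaggedValue) -> Option<Value> {
    match raw.kind {
        // SAFETY: both members are 8 bytes wide and every bit pattern is a
        // valid i64 or f64, so reading either member is defined.
        VALUE_KIND_INT => Some(Value::Int(unsafe { raw.data.as_int })),
        VALUE_KIND_FLOAT => Some(Value::Float(unsafe { raw.data.as_float })),
        _ => None,
    }
}

fn main() {
    let raw = TaggedValue {
        kind: VALUE_KIND_FLOAT,
        data: ValueData { as_float: 0.5 },
    };
    println!("{:?}", decode(&raw));
}
```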

Create strict rules around the lifetime requirements of pointers. Where you can, only allow pointers to be used during the function call itself, and copy out any data that’s needed later. This may not be possible for everything: function pointer callbacks, for example, may need to live a lot longer, and you can’t copy out a function’s implementation. Limit these cases to where they’re strictly necessary, and document the exceptions. This greatly reduces the surface area for mistakes.
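The "copy out what you need" rule could look roughly like this on the implementing side. `Registry` and `registry_add_name` are hypothetical names for this sketch, not The Machinery's actual registry API. The function copies the string during the call, so the caller is free to release its buffer immediately afterwards.

```rust
use std::ffi::{c_char, c_int, CStr, CString};

/// Hypothetical engine-side state that owns its own copies of all names.
pub struct Registry {
    pub names: Vec<String>,
}

/// Adds a name to the registry. Returns 1 on success and 0 on a null argument.
///
/// # Safety
/// `registry` must point to a live `Registry`, and `name` must be a valid
/// NUL-terminated string for the duration of this call only.
pub unsafe extern "C" fn registry_add_name(
    registry: *mut Registry,
    name: *const c_char,
) -> c_int {
    if registry.is_null() || name.is_null() {
        return 0;
    }
    // Copy the bytes into an owned String; the pointer is not kept.
    let owned = unsafe { CStr::from_ptr(name) }.to_string_lossy().into_owned();
    unsafe { (*registry).names.push(owned) };
    1
}

fn main() {
    let mut registry = Registry { names: Vec::new() };
    {
        let temp = CString::new("renderer").expect("no interior NUL");
        // SAFETY: both pointers are valid for the duration of the call.
        unsafe { registry_add_name(&mut registry, temp.as_ptr()) };
    } // `temp` is freed here; the registry still holds its own copy.
    println!("{:?}", registry.names);
}
```

A garbage-collected caller benefits the most from this rule: it only has to pin or marshal the string for the length of one call.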

As an extension of the previous rule of thumb, avoid pointers in data structures. They are prone to error and often unnecessary. Prefer one big data structure over many smaller ones linked by pointers. Of course you can’t always avoid pointers; they’re necessary for arrays and function pointers, for example. Where you do need them, have them follow the same lifetime rules as pointers passed to function calls.
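For the array case, a common shape is a pointer plus a length, read only while the call is in progress. The following is a sketch under that assumption; `FloatSlice`, `sum_floats`, and `sum` are illustrative names and not an existing API.

```rust
/// Hypothetical C-compatible view of an array: pointer plus element count.
#[repr(C)]
pub struct FloatSlice {
    pub data: *const f32,
    pub len: usize,
}

/// Sums the elements. The pointer is only read during the call.
///
/// # Safety
/// If `slice.len` is non-zero, `slice.data` must point to `slice.len`
/// readable `f32` values that stay valid until this function returns.
pub unsafe extern "C" fn sum_floats(slice: FloatSlice) -> f32 {
    if slice.data.is_null() || slice.len == 0 {
        return 0.0;
    }
    // SAFETY: guaranteed by the caller, see the function's safety contract.
    let values = unsafe { std::slice::from_raw_parts(slice.data, slice.len) };
    values.iter().sum()
}

/// Safe wrapper: the borrow of `values` outlives the call, so the contract of
/// `sum_floats` holds by construction.
pub fn sum(values: &[f32]) -> f32 {
    let raw = FloatSlice {
        data: values.as_ptr(),
        len: values.len(),
    };
    unsafe { sum_floats(raw) }
}

fn main() {
    println!("{}", sum(&[1.0, 2.0, 3.5]));
}
```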

Lots of other libraries follow these rules already either implicitly or explicitly. These few changes will already make it much easier to create broad language support for your project, but I have one more suggestion to take it to the next level…

Define your API as a Specification

Unsurprisingly, other projects have run into the same problem: supporting many languages, creating friendly bindings for each, and keeping them up to date for a large, non-trivial API. Quite a few of these projects have decided to define their API in a machine-readable format, from which they can generate headers for both implementations and bindings libraries.

A great example of this is Vulkan. The entirety of Vulkan’s API is available in XML format which they call their “API Registry”.

Vulkan defines an API Registry for the core API and extensions, formally defining command prototypes, structures, enumerants, and many other aspects of the API and extension mechanisms. The Vulkan Registry is used for many more purposes than most other Khronos API registries, and is the basis for generating the header files; AsciiDoc include files used in the Specification, and reference pages for interface definitions, parameter and member validity language, and synchronization language; and more.

This means that to support the entirety of the Vulkan API, including extensions, and all future updates, an implementer only has to write a code generator that generates code based on the API Registry. Having this additional information also means the generator can create wrappers on top of the raw bindings, providing a more language-friendly API. You can see how this works in action in libraries such as ash for Rust and VulkanSharp for C#.

Basing your plugin system around your specification solves a lot of issues.

  • You can enforce the subset of the C API that you want to use by simply making it impossible to express anything else.

  • You can create higher level abstractions beyond just what C has, and automatically implement friendly language-specific wrappers for those abstractions. For example, you can create an interface construct, which when generated to C# creates a wrapper class with member functions for the function pointers.

  • Updating your public API becomes much cheaper. When your interfaces change, you can automatically regenerate all bindings with the new changes, for free.

  • Plugins too can provide a public API spec using the same system. This means anyone who writes a plugin immediately benefits from the broad cross-language support of the existing bindings generators.
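To give a feel for how small the core of such a generator can be, here is a minimal sketch that turns an in-memory interface description into a C struct of function pointers. Everything in it is an assumption made for illustration: the `Interface`, `Function`, and `Param` types, the tiny type table in `c_type`, the `_api` suffix, and the choice to make every function return `void`. A real generator would read the spec from a file and emit wrappers for other languages from the same data.

```rust
/// Hypothetical in-memory form of a spec entry.
pub struct Param {
    pub name: &'static str,
    pub ty: &'static str,
}

pub struct Function {
    pub name: &'static str,
    pub params: Vec<Param>,
}

pub struct Interface {
    pub name: &'static str,
    pub functions: Vec<Function>,
}

/// Maps a spec type to C. Types outside this table cannot be expressed,
/// which is how the spec enforces a restricted subset.
fn c_type(ty: &str) -> Option<&'static str> {
    match ty {
        "string" => Some("const char *"),
        "i32" => Some("int "),
        "f32" => Some("float "),
        _ => None,
    }
}

/// Emits a C struct of function pointers for one interface.
pub fn generate_c_header(iface: &Interface) -> Result<String, String> {
    let mut out = format!("struct {}_api {{\n", iface.name);
    for function in &iface.functions {
        let mut params = Vec::new();
        for param in &function.params {
            let ty = c_type(param.ty)
                .ok_or_else(|| format!("unknown type `{}`", param.ty))?;
            params.push(format!("{}{}", ty, param.name));
        }
        // C89 spells an empty parameter list as `void`.
        let params = if params.is_empty() {
            "void".to_string()
        } else {
            params.join(", ")
        };
        out.push_str(&format!("    void (*{})({});\n", function.name, params));
    }
    out.push_str("};\n");
    Ok(out)
}

fn main() {
    let logger = Interface {
        name: "logger",
        functions: vec![Function {
            name: "print",
            params: vec![Param { name: "text", ty: "string" }],
        }],
    };
    match generate_c_header(&logger) {
        Ok(header) => print!("{}", header),
        Err(error) => eprintln!("error: {}", error),
    }
}
```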

A side-note on IDLs

Specifications in XML or JSON may be unnecessarily verbose and unfriendly to write. You may want to consider creating an IDL (Interface Definition Language) for your API specification. An IDL essentially works like a high-level Domain Specific Language for defining your API in. It can cut down the amount of work you need to do to define your specification, but you trade this in for the additional cost of writing a parser.

For example, to define an interface, instead of writing:

{
    "interfaces": {
        "logger": {
            "functions": {
                "print": {
                    "parameters": [
                        {
                            "name": "text",
                            "type": "string"
                        }
                    ]
                }
            }
        }
    }
}

You could have this instead:

interface logger {
    fn print(text: string);
}

Admittedly this is an extreme example.
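The parser for a syntax this small does not have to be a big investment. As a rough sketch, the hand-written recursive-descent parser below accepts exactly the `interface` form shown above and nothing more. The grammar details (comma-separated `name: type` parameters, no return types) and all type and function names are assumptions made for this example, not the IDL I actually ended up with.

```rust
#[derive(Debug, PartialEq)]
pub struct Param { pub name: String, pub ty: String }
#[derive(Debug, PartialEq)]
pub struct Function { pub name: String, pub params: Vec<Param> }
#[derive(Debug, PartialEq)]
pub struct Interface { pub name: String, pub functions: Vec<Function> }

/// Splits the source into identifiers and single-character punctuation.
fn tokenize(src: &str) -> Vec<String> {
    let mut tokens = Vec::new();
    let mut word = String::new();
    for ch in src.chars() {
        if ch.is_alphanumeric() || ch == '_' {
            word.push(ch);
            continue;
        }
        if !word.is_empty() {
            tokens.push(std::mem::take(&mut word));
        }
        if !ch.is_whitespace() {
            tokens.push(ch.to_string());
        }
    }
    if !word.is_empty() {
        tokens.push(word);
    }
    tokens
}

struct Parser { tokens: Vec<String>, pos: usize }

impl Parser {
    fn peek_is(&self, want: &str) -> bool {
        self.tokens.get(self.pos).map(String::as_str) == Some(want)
    }
    fn advance(&mut self) -> Result<String, String> {
        let tok = self.tokens.get(self.pos).cloned().ok_or("unexpected end of input")?;
        self.pos += 1;
        Ok(tok)
    }
    fn expect(&mut self, want: &str) -> Result<(), String> {
        let tok = self.advance()?;
        if tok == want { Ok(()) } else { Err(format!("expected `{}`, found `{}`", want, tok)) }
    }
    fn ident(&mut self) -> Result<String, String> {
        let tok = self.advance()?;
        let is_ident = tok.chars().all(|c| c.is_alphanumeric() || c == '_');
        if is_ident { Ok(tok) } else { Err(format!("expected identifier, found `{}`", tok)) }
    }
    /// function := "fn" ident "(" [param {"," param}] ")" ";"
    fn function(&mut self) -> Result<Function, String> {
        self.expect("fn")?;
        let name = self.ident()?;
        self.expect("(")?;
        let mut params = Vec::new();
        while !self.peek_is(")") {
            if !params.is_empty() {
                self.expect(",")?;
            }
            let name = self.ident()?;
            self.expect(":")?;
            let ty = self.ident()?;
            params.push(Param { name, ty });
        }
        self.expect(")")?;
        self.expect(";")?;
        Ok(Function { name, params })
    }
    /// interface := "interface" ident "{" {function} "}"
    fn interface(&mut self) -> Result<Interface, String> {
        self.expect("interface")?;
        let name = self.ident()?;
        self.expect("{")?;
        let mut functions = Vec::new();
        while !self.peek_is("}") {
            functions.push(self.function()?);
        }
        self.expect("}")?;
        Ok(Interface { name, functions })
    }
}

/// Parses a source text containing exactly one interface definition.
pub fn parse_interface(src: &str) -> Result<Interface, String> {
    let mut parser = Parser { tokens: tokenize(src), pos: 0 };
    let iface = parser.interface()?;
    if parser.pos != parser.tokens.len() {
        return Err("unexpected tokens after interface".to_string());
    }
    Ok(iface)
}

fn main() {
    let src = "interface logger {\n    fn print(text: string);\n}";
    println!("{:?}", parse_interface(src));
}
```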

In summary

To hook up my Rust and C# plugins to The Machinery, I decided to go with an API specification. I created a specification of the registry and logging APIs in YAML, and then wrote a parser and generator tools for it. I eventually decided that a YAML specification was too time-intensive to write and not very human-readable, and switched to a custom IDL instead. The bindings I generated already had functions wrapping the raw function pointers, so it was easy to add automatic conversions for types such as strings as well. All in all, it took me just a few days to get a Hello World plugin working in both Rust and C# on these generated API bindings:

Console output in The Machinery’s editor from Hello World plugins in Rust and C#.

Ideally, a specification like this would already be included with The Machinery. This would save a lot of time for people trying to make bindings for their favorite languages, and would avoid user error and bindings getting out of date. In this situation, however, it was easy enough to create a spec that could generate the bindings I needed, with only some minor hiccups.

Hopefully this gave you some more insight into the challenges of, and solutions to, writing APIs with broad language support. The Machinery is an extreme case; not many engines have a public API this large that is also available as a native C interface. This has me personally excited for the future: I feel like it will open up a lot more possibilities for experimentation and new paradigms in game development. This extreme no-compromises approach is what got me interested in The Machinery in the first place, and I hope it will continue to push the boundaries of what’s possible.