Michael Maclean

MFFI: A new foreign function interface for PHP

In PHP, if you want to make use of a library written in C, you generally have to go and write a new PHP extension. This isn’t that hard if you’ve done it before, but it does require a bit of work and some getting used to. There’s also a fair bit of boilerplate code that is required to create functions, classes, and methods, and to handle parameter parsing. The documentation for this side of things isn’t that great right now, though there are efforts to change this. One advantage is that there are quite a number of people who have done it before, and you can use tools like PHP’s OpenGrok to find examples and inspiration.

It would be nice to make this sort of thing simpler, perhaps without having to write any C code at all. As I mentioned in my previous post, there are other tools in the works for accelerating code written in PHP. One is Anthony Ferrara’s Recki-CT, which takes PHP code and accelerates it. Another is the Zephir project, which takes code written in a PHP-like language and creates extensions from it for you.

Both of these require a bit of time to set up and operate, though. They’re also not really aimed at using existing libraries – their goal is to accelerate self-contained high-level code rather than to provide access to lower-level libraries.

Some languages have a feature that allows the user to load C shared libraries and call functions from them in the same way you’d call any other function in that language. These are known as foreign function interfaces (FFIs). Python has a number of these: ctypes is one example, and CFFI is another. Similarly, Ruby has the FFI library, and many other languages have something equivalent. There has been one for PHP for a while, called ext/ffi, but it hasn’t had a release in a number of years.

MFFI

Writing a new one sounded like an interesting project, so I would like to present MFFI to the world.

It’s an attempt at making a new, intuitive FFI for PHP. Rather than attempting to parse C headers, as ext/ffi and Python’s CFFI do, it uses PHP to declare the types. This is easiest to show with an example:

<?php
$library = new MFFI\Library();
$puts = $library->bind('puts', [ MFFI\Type::TYPE_STRING ], MFFI\Type::TYPE_INT);
$puts('Hello world');

Constructing MFFI\Library with no arguments, as the code above does, binds to the PHP process itself, meaning that any function available within the process can be called. The code then creates a new MFFI\Func object to represent libc’s puts function. The first argument to MFFI\Library::bind() is the name of the function, the second is an array representing the types of its arguments (in this case just one, a string), and the third is the return type (an int in this case). Finally, it calls the function with a standard PHP string. Running this code does what you’d expect:

michael@morbo:mffi% php7 test.php
Hello world
michael@morbo:mffi%
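
For comparison, here is a rough C sketch of what "binding to the process itself" can look like at the dynamic loader level. It is not MFFI's actual implementation, just the plain POSIX dlopen/dlsym route; the helper names bind_puts_like and say_hello are made up for illustration.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* The signature we *claim* the symbol has. Nothing verifies this for us,
   which is exactly the risk an FFI carries. */
typedef int (*puts_fn)(const char *);

/* Hypothetical helper: look up `name` among the symbols already loaded
   into the current process and treat it as a puts-like function.
   Returns NULL if the symbol cannot be found. */
puts_fn bind_puts_like(const char *name)
{
    puts_fn fn = NULL;

    /* A NULL filename asks for a handle to the running program itself,
       including the shared libraries it has loaded (libc among them). */
    void *self = dlopen(NULL, RTLD_LAZY);
    if (self == NULL)
        return NULL;

    /* POSIX idiom for storing a dlsym() result in a function pointer.
       The handle is deliberately left open: the returned pointer must
       stay valid for as long as the caller uses it. */
    *(void **)(&fn) = dlsym(self, name);
    return fn;
}

/* Look up puts by name at run time and call it through the pointer. */
int say_hello(void)
{
    puts_fn fn = bind_puts_like("puts");
    assert(fn != NULL);
    return fn("Hello world");
}
```

On older glibc versions this needs to be linked with -ldl. The typedef plays the role of the argument and return types passed to bind() in the PHP code.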

It’s a fairly new extension, so currently it doesn’t have that many features. One thing it can do at the moment is handle custom C structs. You can see this in the code below.

<?php
use MFFI\Type;
use MFFI\Struct;
use MFFI\Library;

class TimeStruct extends Struct {
    static function definition() {
        return [
            'tm_sec' => Type::TYPE_INT,     /* seconds (0 - 60) */
            'tm_min' => Type::TYPE_INT,     /* minutes (0 - 59) */
            'tm_hour' => Type::TYPE_INT,    /* hours (0 - 23) */
            'tm_mday' => Type::TYPE_INT,    /* day of month (1 - 31) */
            'tm_mon' => Type::TYPE_INT,     /* month of year (0 - 11) */
            'tm_year' => Type::TYPE_INT,    /* year - 1900 */
            'tm_wday' => Type::TYPE_INT,    /* day of week (Sunday = 0) */
            'tm_yday' => Type::TYPE_INT,    /* day of year (0 - 365) */
            'tm_isdst' => Type::TYPE_INT,   /* is summer time in effect? */
            'tm_zone' => Type::TYPE_STRING,  /* abbreviation of timezone name */
        ];
    }
}

$tm = new TimeStruct();
$tm->tm_sec = 0;
$tm->tm_min = 30;
$tm->tm_hour = 15;
$tm->tm_mday = 5;
$tm->tm_mon = 3;
$tm->tm_year = 115;
$tm->tm_zone = "BST";

$lib = new Library();
$asctime = $lib->bind('asctime', [ TimeStruct::class ], Type::TYPE_STRING);
var_dump($asctime($tm));

We create a class called TimeStruct that extends MFFI\Struct. It has no methods other than a static one called definition(). This is called the first time the class is instantiated, and returns an array mapping the member names to their types. We then create an instance and set its members to a date and time. Finally, we bind to libc’s asctime function, which converts the time struct into an ASCII string. The output looks like this:

michael@morbo:mffi% php7 test2.php
string(25) "Sun Apr  5 15:30:00 2015
"
michael@morbo:mffi%

The trailing newline comes from asctime itself, which always appends one, but it looks pretty good on the whole.
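
For reference, the same call written directly in C might look like the sketch below. The helper names format_example_time and strip_newline are invented for this example. It assumes the unset members are zero, which is what makes tm_wday come out as Sunday: asctime prints tm_wday as given rather than working the weekday out from the date.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Fill a struct tm with the same values as the PHP example and format it
   with asctime(), copying the result into the caller's buffer. */
void format_example_time(char *out, size_t outlen)
{
    struct tm tm;

    assert(out != NULL && outlen > 0);
    memset(&tm, 0, sizeof tm);  /* unset members (tm_sec, tm_wday, ...) are 0 */
    tm.tm_min  = 30;
    tm.tm_hour = 15;
    tm.tm_mday = 5;
    tm.tm_mon  = 3;    /* April: months count from 0 */
    tm.tm_year = 115;  /* years since 1900, so 2015 */

    /* asctime() returns a pointer to a static buffer, so copy it out. */
    snprintf(out, outlen, "%s", asctime(&tm));
}

/* Drop the single trailing newline that asctime() appends. */
void strip_newline(char *s)
{
    size_t n = strlen(s);
    if (n > 0 && s[n - 1] == '\n')
        s[n - 1] = '\0';
}
```

Calling format_example_time() with a 32-byte buffer yields the 25-character string shown above, newline included; strip_newline() then removes the final character.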

Binding to your own libraries

You can also write your own libraries and use those. Here’s a bit of C code that provides a function that reverses strings. It’s trivial, and PHP already provides this, but it’s just an example. The strrev function in the code accepts a string (as a char *), reverses it in place, and returns the same pointer (again a char *).

/* Taken from http://www8.cs.umu.se/~isak/snippets/strrev.c */
#include <string.h>

char *strrev(char *str)
{
    char *p1, *p2;

    if (! str || ! *str)
        return str;
    for (p1 = str, p2 = str + strlen(str) - 1; p2 > p1; ++p1, --p2)
    {
        *p1 ^= *p2;
        *p2 ^= *p1;
        *p1 ^= *p2;
    }
    return str;
}

Save this as rev.c, and compile using a command like gcc -o rev.so -g -fPIC -shared rev.c. You can then use the following PHP code to access the function:

<?php

$library = new MFFI\Library('rev.so');
$strrev = $library->bind('strrev', [ MFFI\Type::TYPE_STRING ], MFFI\Type::TYPE_STRING);
echo $strrev('Hello world');

The result seems to make sense:

michael@morbo:mffi% php7 test3.php
dlrow olleH
michael@morbo:mffi%
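
One design point worth noting is that strrev reverses the buffer it is given. I have not checked whether the string handed to it is a copy or the original PHP string's memory, so as a hedge, here is a sketch of a variant that leaves its input alone and returns a freshly allocated copy instead. The name rev_copy is made up for this example, and the caller owns the returned memory, which raises its own question of who frees it when the call comes through an FFI.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical alternative to strrev: returns a newly allocated reversed
   copy of `str` and never writes to the input. The caller must free()
   the result. Returns NULL for a NULL input or if allocation fails. */
char *rev_copy(const char *str)
{
    size_t len, i;
    char *out;

    if (str == NULL)
        return NULL;

    len = strlen(str);
    out = malloc(len + 1);
    if (out == NULL)
        return NULL;

    /* Walk the input backwards while filling the output forwards. */
    for (i = 0; i < len; ++i)
        out[i] = str[len - 1 - i];
    out[len] = '\0';
    return out;
}
```

Because the input is const, this version can also be called safely on a string literal from C, which the in-place version cannot.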

A word of warning

This is probably a very, very bad idea. The potential for security problems is monumental. If you get a definition even slightly wrong, you’ll either crash PHP straight away, which is probably the best result, or corrupt some memory somewhere and get unseen and weird effects. I would never advise anyone to use it except in a one-shot script or something running in a very controlled, non-public-facing environment. If you use this, and it breaks, you get to keep all the bits!
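
One way to catch a wrong struct definition before it bites is to ask the C compiler what the real layout is. The sketch below compares a hand-written mirror of the TimeStruct definition (nine ints followed by a string pointer) against the platform's own struct tm. The names tm_mirror, mirror_size, real_size, layouts_agree, and print_layout_report are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <time.h>

/* Hand-written mirror of the PHP definition() array:
   nine ints followed by a string pointer. */
struct tm_mirror {
    int tm_sec, tm_min, tm_hour, tm_mday, tm_mon,
        tm_year, tm_wday, tm_yday, tm_isdst;
    const char *tm_zone;
};

size_t mirror_size(void) { return sizeof(struct tm_mirror); }
size_t real_size(void)   { return sizeof(struct tm); }

/* Returns 1 when the total size and the offset of the last standard
   member agree. This is a necessary check, not a sufficient one. */
int layouts_agree(void)
{
    return sizeof(struct tm_mirror) == sizeof(struct tm)
        && offsetof(struct tm_mirror, tm_isdst) == offsetof(struct tm, tm_isdst);
}

/* Print both sizes and the verdict for the platform this runs on. */
void print_layout_report(void)
{
    assert(real_size() > 0);
    printf("mirror: %zu bytes, struct tm: %zu bytes, %s\n",
           mirror_size(), real_size(),
           layouts_agree() ? "layouts agree" : "layouts differ");
}
```

The C standard only guarantees the nine int members, so the result depends on the platform. glibc, for example, has an extra tm_gmtoff member, and there the two sizes will not match. A call like asctime only reads the int members, so a mismatch further along the struct can go unnoticed until some other function touches it.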

Plans

This extension is only for PHP 7. It doesn’t work on PHP 5, and I don’t intend to change that. There’s a fairly big TODO list of features I want to add. I want to make it possible to define structs that contain other structs, which is a work in progress, and then handle large arrays of scalar types. This would make it possible to solve the Mandelbrot problem from the previous post fairly efficiently using a small C library that did the maths, without knowing anything about how to write PHP extensions.

I have not done any benchmarking yet. The translation through libffi will have a cost, but I have not yet worked out what that will be.