Using NumPy Arrays on the Frontend

Introduction

It’s true: it’s possible to use NumPy arrays directly on the frontend by using a combination of two technologies: ArrayBuffer and Typed Arrays.

This comes with a lot of caveats: for instance, the arrays are of a fixed size, and (with this approach) are limited to being one-dimensional. You also need to be aware of the endianness of both the backend and the browser’s JS VM, and you may need to do slightly more legwork surrounding data types.

The benefits are compelling, however:

This is something I’ve been playing with at work, where I’ve been working on a tool that can analyze big data interactively (such as time-series data from satellite downlinks).

An Extremely Brief Tour of Typed Arrays

In essence, typed arrays implement a sort of “view” for an underlying ArrayBuffer. ArrayBuffers themselves do not provide very much functionality at all, they are essentially just byte vectors. With a typed array, you can use the contents of an ArrayBuffer as an array.

// Create a buffer holding the underlying data (all zeroes)
const buf = new ArrayBuffer(1024) // bits

// Create a "view" of that data as 1024 8-bit unsigned ints
const u8_view = Uint8Array(buf);

// Create another view, but this time as 256 32-bit unsigned ints
const u32_view = Uint32Array(buf);

// because these two point at the same array, we can do some weird stuff

u8_view[1] = 1 // set the second byte of the buffer to 1 (0b00000001)
console.log(u32_view[0]) // this is now 256

An Even Briefer Tour of NumPy Internals

Under the hood, a numpy array is just a block of memory + an indexing scheme + a data type:

typedef struct PyArrayObject {
        PyObject_HEAD

        /* Block of memory */
        char *data;

        /* Data type descriptor */
        PyArray_Descr *descr;

        /* Indexing scheme */
        int nd;
        npy_intp *dimensions;
        npy_intp *strides;

        /* Other stuff */
        PyObject *base;
        int flags;
        PyObject *weakreflist;
} PyArrayObject;

(from Advanced Numpy)

We’re mainly interested in the data type and the data itself, and since we’re only interested in one-dimensional arrays, we can ignore most of the array object above.

Getting the data back out looks like:

>>> import numpy as np
>>> a = np.array([1,2,3], dtype=np.uint8)
>>> a.data
<memory at 0x7f946950fd08>
>>> bytes(a.data)
b'\x01\x02\x03'

This is the underlying memory region associated with this array. Note that regions can overlap, but this shouldn’t be an issue for our purposes.

An Incredibly Short Note about Endianness

With this approach, you need to keep track of the endianness of your data. You may find that you’ll need to swap the endianness of your dataset once it arrives at the browser.

You can determine the byte order of your server directly with the sys module:

>>> import sys
>>> sys.byteorder
'little'

On the frontend, determining endianness requires checking directly:

const endianness = () => {
    const buf = new ArrayBuffer(2)
    const u8  = new Uint8Array(buf) 
    const u16 = new Uint16Array(buf)

    u8[0] = 1
    if (u16[1] === 1) {
        return 'little'
    } else {
        return 'big'
    }
}

Swapping the byte order manually is kind of a pain, you can maybe use a DataView to swap bytes around or implement something inside of a web worker (because ArrayBuffers are transferable, this should be fast). Or you could even handle it on the backend.

Putting it all together

In this example, we’ll assume that the array is of 32-bit floats, and we’ll just assume that the endianness on the backend matches the frontend (not guaranteed!).

With the aiohttp web server:

from aiohttp import web

from my_analysis import get_dataset

async def serve_array(request):
    # load numpy array
    arr = get_dataset()

    # send off the bytes
    return web.Response(bytes=bytes(arr.data))


# start the webserver
app = web.Application()
app.add_routes([web.get('/get/array', handle)])

if __name__ == '__main__':
    web.run_app(app)

And then on the frontend:

fetch('https://api.example.com/get/array')
    .then(r => r.arrayBuffer())
    .then(buf => {
        const arr = new Float32Array(buf)

        console.log('got array from numpy!', arr)
    })