ExCubecl
ExCubecl is a GPU compute runtime for Elixir, powered by CubeCL via Rust NIFs.
It provides GPU buffer management, kernel execution, async command submission, and pipeline orchestration — designed for AI inference, media processing, and realtime GPU effects on mobile and desktop.
Architecture
┌─────────────────────────────────────────────┐
│ Elixir / BEAM                               │
│ ExCubecl.buffer(...)                        │
│ ExCubecl.run_kernel(:blur, ...)             │
│ ExCubecl.pipeline() |> pipeline_run()       │
├─────────────────────────────────────────────┤
│ ExCubecl.NIF (Elixir)                       │
│ - NIF function stubs                        │
├─────────────────────────────────────────────┤
│ Rust NIF (lib.rs)                           │
│ - GPU device management                     │
│ - Buffer pool / Texture pool                │
│ - Kernel cache                              │
│ - Async command queue                       │
│ - Stream scheduler                          │
├─────────────────────────────────────────────┤
│ CubeCL Runtime                              │
│ - GPU kernel compilation                    │
│ - Buffer management                         │
│ - Dispatch execution                        │
│ - Synchronization                           │
├─────────────────────────────────────────────┤
│ C FFI (ex_cubecl.h)                         │
│ - Mobile platform interface                 │
│ - iOS / Android interop                     │
└─────────────────────────────────────────────┘
Installation
Add ex_cubecl to your list of dependencies in mix.exs:
def deps do
  [
    {:ex_cubecl, "~> 0.2.0"}
  ]
end
Quick Start
# Check device
ExCubecl.device_info()
# => %{name: "ExCubecl CPU (Rust NIF)", gpu: false, version: "0.2.0"}
# Create GPU buffers
a = ExCubecl.buffer([1.0, 2.0, 3.0], {3}, :f32)
b = ExCubecl.buffer([4.0, 5.0, 6.0], {3}, :f32)
# Inspect
ExCubecl.shape(a) # {3}
ExCubecl.dtype(a) # :f32
ExCubecl.size(a) # 12 (bytes)
# Read data back
data = ExCubecl.read(a)
# Run a kernel
output = ExCubecl.buffer([0.0, 0.0, 0.0], {3}, :f32)
ExCubecl.run_kernel(:elementwise_add, [a, b], output, %{})
# Async execution
cmd_id = ExCubecl.submit(%{op: :run_kernel, kernel: :relu, inputs: [a], output: output, params: %{}})
ExCubecl.poll(cmd_id) # :pending | :completed | {:error, reason}
ExCubecl.wait(cmd_id) # blocks until done
# Pipeline orchestration
pipeline = ExCubecl.pipeline()
pipeline
|> ExCubecl.pipeline_add(%{op: :run_kernel, kernel: :blur, inputs: [a], output: b, params: %{}})
|> ExCubecl.pipeline_add(%{op: :run_kernel, kernel: :relu, inputs: [b], output: output, params: %{}})
ExCubecl.pipeline_run(pipeline)
# Cleanup
ExCubecl.free(a)
ExCubecl.free(b)
ExCubecl.free(output)
ExCubecl.free_pipeline(pipeline)
Supported Types
| Type | Description |
|---|---|
| :f32 | 32-bit float |
| :f64 | 64-bit float |
| :s32 | 32-bit signed integer |
| :s64 | 64-bit signed integer |
| :u32 | 32-bit unsigned integer |
| :u8 | 8-bit unsigned integer |
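
The Quick Start reports `ExCubecl.size(a)` as 12 bytes for a `{3}` buffer of `:f32`, which is the element count multiplied by the element width. The sketch below reproduces that arithmetic for every type in the table. `DtypeSize` and `bytes/2` are hypothetical names used for illustration only; they are not part of ExCubecl, and the calculation assumes densely packed buffers with no padding.

```elixir
defmodule DtypeSize do
  # Bytes per element for each dtype in the Supported Types table.
  @bytes_per_element %{f32: 4, f64: 8, s32: 4, s64: 8, u32: 4, u8: 1}

  # Byte size of a densely packed buffer with the given shape tuple.
  # Raises KeyError for a dtype that is not in the table.
  def bytes(shape, dtype) when is_tuple(shape) do
    elements =
      shape
      |> Tuple.to_list()
      |> Enum.reduce(1, fn dim, acc -> dim * acc end)

    elements * Map.fetch!(@bytes_per_element, dtype)
  end
end

IO.inspect(DtypeSize.bytes({3}, :f32))    # prints 12
IO.inspect(DtypeSize.bytes({2, 3}, :u8))  # prints 6
```

A helper like this can be handy for sizing output buffers before allocating them.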
Mobile Integration (iOS / Android)
ExCubecl includes a C FFI layer for mobile platform integration.
iOS (Objective-C / Swift)
#include "ex_cubecl.h"
float data[] = {1.0f, 2.0f, 3.0f};
size_t shape[] = {3};
ex_cubecl_buffer_handle_t buf = ex_cubecl_buffer_new(
(const uint8_t*)data, shape, 1, EX_CUBECL_DTYPE_F32
);
float out[3];
ex_cubecl_buffer_read(buf, (uint8_t*)out, sizeof(out));
ex_cubecl_buffer_free(buf);
Android (JNI)
#include "ex_cubecl.h"
#include <jni.h>
#include <stdlib.h>
JNIEXPORT jlong JNICALL
Java_com_example_excubecl_ExCubeclBuffer_create(
    JNIEnv *env, jobject thiz, jbyteArray data, jlongArray shape, jint dtype) {
  jbyte *data_ptr = (*env)->GetByteArrayElements(env, data, NULL);
  jlong *shape_ptr = (*env)->GetLongArrayElements(env, shape, NULL);
  jsize ndim = (*env)->GetArrayLength(env, shape);
  // jlong is always 64-bit, but size_t is 32-bit on 32-bit ABIs,
  // so copy each dimension instead of casting the pointer.
  size_t *dims = malloc(sizeof(size_t) * (size_t)ndim);
  for (jsize i = 0; i < ndim; i++) {
    dims[i] = (size_t)shape_ptr[i];
  }
  ex_cubecl_buffer_handle_t handle = ex_cubecl_buffer_new(
      (const uint8_t*)data_ptr, dims, ndim, dtype
  );
  free(dims);
  // JNI_ABORT: both arrays were only read, so skip the copy-back.
  (*env)->ReleaseByteArrayElements(env, data, data_ptr, JNI_ABORT);
  (*env)->ReleaseLongArrayElements(env, shape, shape_ptr, JNI_ABORT);
  return (jlong)handle;
}
See native/ex_cubecl_nif/include/ex_cubecl.h for the full API reference.
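
Both mobile examples hand `ex_cubecl_buffer_new` a raw byte pointer, and the iOS example obtains it by casting a C `float[]`. That suggests the expected layout is packed, native-endian element data, though this is an inference from the examples rather than something the header is quoted as stating. The sketch below shows how the same bytes could be produced and decoded on the Elixir side with bitstring syntax. `F32Pack` is a hypothetical helper module, not part of ExCubecl.

```elixir
defmodule F32Pack do
  # Pack a list of numbers into a binary of native-endian 32-bit floats,
  # the same in-memory layout as a C `float[]`.
  def pack(values) do
    for v <- values, into: <<>>, do: <<v::float-32-native>>
  end

  # Inverse of pack/1: decode a binary back into a list of floats.
  def unpack(bin) do
    for <<v::float-32-native <- bin>>, do: v
  end
end

bin = F32Pack.pack([1.0, 2.0, 3.0])
IO.inspect(byte_size(bin))       # prints 12
IO.inspect(F32Pack.unpack(bin))  # prints [1.0, 2.0, 3.0]
```

The 12-byte result matches `sizeof(out)` for the three-element `float` array in the iOS example.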
Use Cases
GPU Image Processing
camera frame → GPU texture → CubeCL kernel → screen render
Blur, sharpen, denoise, beauty filters, LUT filters, all without CPU copies.
AI Inference
tensor → CubeCL kernels → prediction
Segmentation, face landmarks, pose detection, and embeddings for realtime camera AI.
Video Processing
video texture → GPU kernels → encoder
Compositing, transitions, overlays, subtitles, color grading.
Livestream Effects
camera → AI segmentation → background replacement → stream encoder
Virtual background, AR effects, realtime filters, all GPU-native.
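
Each use case above is a linear chain in which one stage's output buffer becomes the next stage's input. As a rough sketch, such a chain can be expanded into the operation maps that the Quick Start passes to `ExCubecl.pipeline_add/2`. `ChainSketch`, the kernel names `:segment` and `:replace_background`, and the atoms standing in for buffer handles are all hypothetical placeholders; only the map shape is taken from the Quick Start.

```elixir
defmodule ChainSketch do
  # Given kernels [k1, k2, ...] and buffers [b0, b1, b2, ...], build one
  # operation map per kernel, where stage i reads b(i) and writes b(i + 1).
  def ops(kernels, buffers) when length(buffers) == length(kernels) + 1 do
    # Sliding pairs of neighbouring buffers: [[b0, b1], [b1, b2], ...]
    pairs = Enum.chunk_every(buffers, 2, 1, :discard)

    kernels
    |> Enum.zip(pairs)
    |> Enum.map(fn {kernel, [input, output]} ->
      %{op: :run_kernel, kernel: kernel, inputs: [input], output: output, params: %{}}
    end)
  end
end

# Placeholder atoms stand in for real buffer handles.
ops = ChainSketch.ops([:segment, :replace_background], [:camera, :mask, :frame])
IO.inspect(length(ops))  # prints 2
```

In real use the buffer list would hold handles returned by `ExCubecl.buffer/3`, and each resulting map could be added to a pipeline in order.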
Evolution Path
| Phase | Focus | Status |
|---|---|---|
| 1 | GPU compute runtime | ✅ Current |
| 2 | Media runtime (video/camera) | 🔜 Planned |
| 3 | AI runtime (inference) | 🔜 Planned |
| 4 | Nx integration (Axon/training) | 🔜 Planned |
License
Apache 2.0 — See LICENSE for details.