Ksc
An Elixir implementation of the Kaitai Struct compiler and runtime. Ksc:
- Compiles .ksy format descriptions into Elixir modules.
- Parses binary data into structured maps with those modules.
- Writes back — serializes a parsed (and possibly modified) map into its binary form.
Installation
Add ksc to your dependencies in mix.exs:
def deps do
[
{:ksc, "~> 0.1.0"}
]
end
Quick Start
Given a Kaitai Struct format definition (hello_world.ksy):
meta:
id: hello_world
seq:
- id: one
type: u1
Compile it to an Elixir source file:
mix ksc.compile hello_world.ksy --output lib/formats
This writes lib/formats/hello_world.ex containing a Ksc.Compiled.HelloWorld module. You can also point it at a directory to compile all .ksy files at once:
mix ksc.compile my_formats/ --output lib/formats
Use --namespace to set a custom module prefix (default: Ksc.Compiled):
mix ksc.compile my_formats/ --output lib/formats --namespace MyApp.Formats
Then use the generated module to parse binary data:
result = Ksc.Compiled.HelloWorld.from_file("data.bin")
result.one
#=> 80
result = Ksc.Compiled.HelloWorld.from_binary(<<42>>)
result.one
#=> 42
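To make the parse step concrete, here is a minimal hand-written sketch of what reading a single u1 field amounts to in Elixir. HelloWorldSketch is a hypothetical module written for illustration; it is not the code Ksc generates, only an approximation of the behaviour shown above.

```elixir
defmodule HelloWorldSketch do
  # Hypothetical stand-in for a generated module; not Ksc output.
  # A u1 field is one unsigned byte taken from the front of the input.
  def from_binary(<<one::unsigned-integer-size(8), _rest::binary>>) do
    %{one: one}
  end
end

IO.inspect(HelloWorldSketch.from_binary(<<42>>).one)
# prints 42
IO.inspect(HelloWorldSketch.from_binary("P").one)
# prints 80, the byte value of "P"
```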
Example: Parsing with Enums
# enum_0.ksy
meta:
id: enum_0
endian: le
seq:
- id: pet_1
type: u4
enum: animal
- id: pet_2
type: u4
enum: animal
enums:
animal:
4: dog
7: cat
12: chicken
{:ok, mod} = Ksc.compile_and_load("enum_0.ksy")
result = mod.from_binary(<<7, 0, 0, 0, 12, 0, 0, 0>>)
result.pet_1 #=> :cat
result.pet_2 #=> :chicken
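The result above can be reproduced by hand, which may help when checking a format against raw bytes. The sketch below assumes nothing about Ksc internals: Enum0Sketch is a hypothetical module that reads two little-endian u4 values and looks them up in the animal table. Passing unknown values through as plain integers is an assumption of this sketch, not documented Ksc behaviour.

```elixir
defmodule Enum0Sketch do
  # Hypothetical hand-written decoder mirroring enum_0.ksy; not Ksc output.
  @animal %{4 => :dog, 7 => :cat, 12 => :chicken}

  # Two little-endian unsigned 32-bit integers, each mapped through the enum.
  def from_binary(<<pet_1::little-unsigned-32, pet_2::little-unsigned-32, _rest::binary>>) do
    %{pet_1: to_animal(pet_1), pet_2: to_animal(pet_2)}
  end

  # Assumption: values missing from the enum are returned as integers.
  defp to_animal(n), do: Map.get(@animal, n, n)
end

IO.inspect(Enum0Sketch.from_binary(<<7, 0, 0, 0, 12, 0, 0, 0>>))
# prints %{pet_1: :cat, pet_2: :chicken}
```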
Write-back
Ksc can also serialize a parsed map back into binary. Pass writer: true at
compile time to generate to_binary/1 and to_file/2 alongside the readers:
mix ksc.compile hello_world.ksy --output lib/formats --writer
or programmatically:
{:ok, mod} = Ksc.compile_and_load("hello_world.ksy", writer: true)
data = mod.from_binary(File.read!("in.bin"))
data = %{data | one: 2}  # hello_world.ksy defines a single field, one
File.write!("out.bin", mod.to_binary(data))
Length / count fields
When a size: or repeat-expr: reads from another seq field (a "controller"),
the writer overwrites that controller from the actual payload before emitting
bytes — so you can freely grow or shrink a controlled field without touching
the length field:
seq:
- id: name_len
type: u2
- id: name
size: name_len
m = mod.from_binary(<<5, 0, "hello">>)
mod.to_binary(%{m | name: "goodbye"}) #=> <<7, 0, "goodbye">>
# ^^ writer auto-updated name_len
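The idea behind the automatic update can be sketched in plain Elixir. NameSketch below is a hypothetical hand-written codec for the two-field format above, assuming little-endian byte order (as the bytes <<5, 0>> suggest). It is not Ksc's generated code; it only shows the writer deriving name_len from the payload instead of trusting the stored value.

```elixir
defmodule NameSketch do
  # Hypothetical codec for: name_len (u2, assumed little-endian), then name.

  def from_binary(<<name_len::little-unsigned-16, name::binary-size(name_len), _rest::binary>>) do
    %{name_len: name_len, name: name}
  end

  # The stored name_len is ignored; the length is recomputed from the payload.
  def to_binary(%{name: name}) do
    <<byte_size(name)::little-unsigned-16, name::binary>>
  end
end

m = NameSketch.from_binary(<<5, 0, "hello">>)
IO.inspect(NameSketch.to_binary(%{m | name: "goodbye"}))
# prints <<7, 0, 103, 111, 111, 100, 98, 121, 101>>, i.e. <<7, 0, "goodbye">>
```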
Supported controller expressions: a bare field reference (size: foo) or a
single arithmetic op with an integer literal (size: foo + 8, size: 100 - foo,
size: foo * 2, size: foo / 4). The multiplication and division forms raise
:non_invertible_controller if the actual length doesn't divide cleanly.
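Inverting a controller expression means solving it for the field, given the actual payload length. The sketch below is one way to picture that. ControllerSketch and its tagged tuples are hypothetical names, and it returns an error tuple where Ksc is described as raising. The division form is left out because this README does not spell out how it is inverted.

```elixir
defmodule ControllerSketch do
  # Hypothetical helper: given the shape of the size: expression and the
  # actual payload length, compute the controller value to write.

  # size: foo
  def invert(:ref, len), do: {:ok, len}
  # size: foo + k
  def invert({:add, k}, len), do: {:ok, len - k}
  # size: k - foo
  def invert({:sub_from, k}, len), do: {:ok, k - len}
  # size: foo * k, only solvable when len is a multiple of k
  def invert({:mul, k}, len) when rem(len, k) == 0, do: {:ok, div(len, k)}
  def invert({:mul, _k}, _len), do: {:error, :non_invertible_controller}
end

IO.inspect(ControllerSketch.invert({:add, 8}, 20))
# prints {:ok, 12}
IO.inspect(ControllerSketch.invert({:sub_from, 100}, 30))
# prints {:ok, 70}
IO.inspect(ControllerSketch.invert({:mul, 2}, 7))
# prints {:error, :non_invertible_controller}
```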
For non-simple expressions (size: header.x * 2, size: 16), the writer keeps
strict semantics: it pads with the pad-right byte (or zero if none is set) when
the payload is shorter than the declared size, and raises :size_overflow when
it is longer.
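The strict rule can be sketched as a small function. FixedSizeSketch.fit/3 is a hypothetical helper, not part of Ksc, and it returns an error tuple rather than raising. The pad_byte argument stands in for the format's pad-right value and defaults to zero.

```elixir
defmodule FixedSizeSketch do
  # Hypothetical helper: fit a payload into a fixed declared size.
  def fit(payload, size, pad_byte \\ 0) when is_binary(payload) do
    case size - byte_size(payload) do
      missing when missing >= 0 ->
        # Shorter or equal: append `missing` copies of the pad byte.
        {:ok, payload <> :binary.copy(<<pad_byte>>, missing)}

      _ ->
        # Longer than declared: refuse rather than truncate.
        {:error, :size_overflow}
    end
  end
end

IO.inspect(FixedSizeSketch.fit("ab", 4))
# prints {:ok, <<97, 98, 0, 0>>}
IO.inspect(FixedSizeSketch.fit("abcdef", 4))
# prints {:error, :size_overflow}
```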
v1 limitations
- Encodings supported on write: UTF-8, ASCII, UTF-16LE, UTF-16BE. Writing SJIS or IBM437 strings raises.
- Instances are not written. Value instances (computed from other fields) are recomputed on the next read. Positional instances are lost on write-back.
- process: zlib writes are semantically correct but not byte-identical (re-compression).
- Custom process: modules must implement encode/2 for write-back.
- Switch types with no _ case: rely on parser-stashed raw bytes in the map.
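The zlib limitation is easy to reproduce with Erlang's :zlib module alone. In the sketch below, ZlibSketch is a hypothetical module: store/1 stands in for bytes produced by another tool (here, compression level :none), and write_back/1 decompresses and re-compresses at the default level. How Ksc itself invokes zlib is not shown in this README, so treat the levels as assumptions.

```elixir
defmodule ZlibSketch do
  # Produce a zlib stream at level :none, standing in for the original file.
  def store(payload) do
    z = :zlib.open()
    :ok = :zlib.deflateInit(z, :none)
    out = :zlib.deflate(z, payload, :finish)
    :ok = :zlib.deflateEnd(z)
    :ok = :zlib.close(z)
    IO.iodata_to_binary(out)
  end

  # Decompress, then re-compress with the default level.
  def write_back(compressed) do
    compressed |> :zlib.uncompress() |> :zlib.compress()
  end
end

payload = String.duplicate("kaitai ", 50)
original = ZlibSketch.store(payload)
rewritten = ZlibSketch.write_back(original)

IO.inspect(original == rewritten)
# prints false: the compressed bytes differ
IO.inspect(:zlib.uncompress(rewritten) == payload)
# prints true: the decompressed content is the same
```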
Running Tests
Ksc uses the official Kaitai Struct test suite for validation.
mix deps.get
mix test
Additional write-back test suites (opt-in via tag):
# Broad round-trip test: parse → to_binary → from_binary → assert equal
mix test --only writer_roundtrip
# Broad mutation test: parse → mutate every field → to_binary → from_binary → assert equal
mix test --only writer_mutation
# Reproduce a specific mutation seed
MUTATION_SEED=42 mix test --only writer_mutation
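The two properties described in the comments above can be sketched against a toy codec. RoundTripSketch is a hypothetical one-byte codec standing in for a generated module. How the real suite consumes MUTATION_SEED is not shown here, so seeding :rand from it is an assumption of this sketch.

```elixir
defmodule RoundTripSketch do
  # Hypothetical one-field codec standing in for a generated module.
  def from_binary(<<one::unsigned-8>>), do: %{one: one}
  def to_binary(%{one: one}), do: <<one::unsigned-8>>

  # parse -> to_binary -> from_binary -> compare
  def round_trips?(bin) do
    parsed = from_binary(bin)
    reparsed = parsed |> to_binary() |> from_binary()
    reparsed == parsed
  end

  # parse -> mutate -> to_binary -> from_binary -> compare
  def mutation_round_trips?(bin, seed) do
    # Seeding makes the chosen mutation reproducible.
    :rand.seed(:exsss, seed)
    mutated = %{from_binary(bin) | one: :rand.uniform(256) - 1}
    reparsed = mutated |> to_binary() |> from_binary()
    reparsed == mutated
  end
end

# Assumption: the seed is read from the environment, defaulting to 42.
seed = String.to_integer(System.get_env("MUTATION_SEED", "42"))
IO.inspect(RoundTripSketch.round_trips?(<<80>>))
# prints true
IO.inspect(RoundTripSketch.mutation_round_trips?(<<80>>, seed))
# prints true
```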