
Serialization

A guide to serializing data into a compact binary format using Sia's chainable add* methods.

Overview

Serialization in Sia means writing typed values into a byte buffer using add* methods. Every add* method follows the same pattern:

  1. Write the value at the current offset
  2. Advance the offset by the number of bytes written
  3. Return this (the Sia instance) for method chaining

This design produces compact binary output with no per-field metadata: no field names, no delimiters, no schema information. The only structural bytes are the length prefixes on variable-length values. The reader must know the exact order and types of the values to deserialize them correctly.
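The three steps above can be sketched with a tiny stand-in writer. `MiniWriter` is a hypothetical class written for this illustration, with a fixed 16-byte buffer; it is not the Sia class, only the same write, advance, return pattern.

```typescript
// Hypothetical stand-in for illustration; not the Sia class itself.
class MiniWriter {
  private readonly buffer = new Uint8Array(16);
  private readonly view = new DataView(this.buffer.buffer);
  private offset = 0;

  addUInt8(n: number): this {
    this.view.setUint8(this.offset, n); // 1. write at the current offset
    this.offset += 1; // 2. advance by the number of bytes written
    return this; // 3. return the writer for chaining
  }

  addUInt16(n: number): this {
    this.view.setUint16(this.offset, n, true); // true = little-endian
    this.offset += 2;
    return this;
  }

  // Copy out only the bytes written so far.
  toUint8Array(): Uint8Array {
    return this.buffer.slice(0, this.offset);
  }
}

const overviewBytes = new MiniWriter().addUInt8(7).addUInt16(513).toUint8Array();
// overviewBytes holds [7, 1, 2]: three bytes, no field names, no delimiters
```

Because nothing in those three bytes says "one UInt8, then one UInt16", the reader has to apply that knowledge itself.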

Primitive Types

Integers

Sia supports both unsigned and signed integers in 8, 16, 32, and 64-bit widths. All multi-byte integers use little-endian byte order.

8-bit

import { Sia } from "@timeleap/sia";

const sia = new Sia();

// Unsigned: 0 to 255 (1 byte)
sia.addUInt8(255);

// Signed: -128 to 127 (1 byte)
sia.addInt8(-120);

16-bit

import { Sia } from "@timeleap/sia";

const sia = new Sia();

// Unsigned: 0 to 65,535 (2 bytes)
sia.addUInt16(65535);

// Signed: -32,768 to 32,767 (2 bytes)
sia.addInt16(-32768);

32-bit

import { Sia } from "@timeleap/sia";

const sia = new Sia();

// Unsigned: 0 to 4,294,967,295 (4 bytes)
sia.addUInt32(4294967295);

// Signed: -2,147,483,648 to 2,147,483,647 (4 bytes)
sia.addInt32(-2147483648);

64-bit

import { Sia } from "@timeleap/sia";

const sia = new Sia();

// Unsigned: 0 to 2^53 - 1 (8 bytes)
sia.addUInt64(Number.MAX_SAFE_INTEGER);

// Signed: -(2^53 - 1) to 2^53 - 1 (8 bytes)
sia.addInt64(Number.MIN_SAFE_INTEGER);

64-bit integers use JavaScript's number type, which is a 64-bit IEEE 754 float, so integers are represented exactly only up to Number.MAX_SAFE_INTEGER (2^53 - 1). For values beyond this range, use addBigInt instead.
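The precision limit can be seen directly in plain JavaScript numbers, without involving Sia at all. The guard `fitsUInt64AsNumber` below is a hypothetical helper for this illustration, not part of the library.

```typescript
const maxSafe = Number.MAX_SAFE_INTEGER; // 9007199254740991, i.e. 2^53 - 1

// Above 2^53 - 1, neighbouring integers collapse onto the same float.
const collides = maxSafe + 1 === maxSafe + 2; // true

// Hypothetical guard one might run before passing a number to addUInt64.
function fitsUInt64AsNumber(n: number): boolean {
  return Number.isSafeInteger(n) && n >= 0;
}

const safeOk = fitsUInt64AsNumber(maxSafe); // true
const unsafeRejected = fitsUInt64AsNumber(maxSafe + 1); // false
```

A value that fails such a check is a candidate for addBigInt instead.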

Integer Method Reference

| Method | Range | Bytes | Endianness |
| --- | --- | --- | --- |
| addUInt8 | 0 to 255 | 1 | N/A |
| addInt8 | -128 to 127 | 1 | N/A |
| addUInt16 | 0 to 65,535 | 2 | Little-endian |
| addInt16 | -32,768 to 32,767 | 2 | Little-endian |
| addUInt32 | 0 to 4,294,967,295 | 4 | Little-endian |
| addInt32 | -2,147,483,648 to 2,147,483,647 | 4 | Little-endian |
| addUInt64 | 0 to 2^53 - 1 | 8 | Little-endian |
| addInt64 | -(2^53 - 1) to 2^53 - 1 | 8 | Little-endian |

How It Works

Here's the implementation of addUInt32 to illustrate the pattern:

addUInt32(n: number): Sia {
  // Write 4 bytes at the current offset (little-endian)
  this.dataView.setUint32(this.offset, n, true);
  // Advance the offset
  this.offset += 4;
  // Return this for chaining
  return this;
}

All integer methods use DataView for multi-byte operations, which handles byte order correctly across platforms.
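To make the byte order concrete, here is a small sketch using only the standard DataView API, with the same `setUint32(offset, value, true)` call shape as the method above. The variable names are illustrative.

```typescript
const leBuffer = new ArrayBuffer(4);
const leView = new DataView(leBuffer);

// Third argument true selects little-endian, as in the addUInt32 body above.
leView.setUint32(0, 0x12345678, true);

const leBytes = new Uint8Array(leBuffer);
// leBytes is [0x78, 0x56, 0x34, 0x12]: least significant byte first

// Reading with the same flag recovers the value; the opposite flag does not.
const sameOrder = leView.getUint32(0, true); // 0x12345678
const wrongOrder = leView.getUint32(0, false); // 0x78563412
```

The explicit flag is what makes the output independent of the host CPU's native byte order.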

Booleans

Booleans are stored as a single byte: 1 for true, 0 for false.

const sia = new Sia();

sia.addBool(true); // writes 0x01
sia.addBool(false); // writes 0x00

BigInt

For arbitrary-precision integers that exceed Number.MAX_SAFE_INTEGER, use addBigInt:

const sia = new Sia();

const bigValue = 123456789012345678901234567890n;
sia.addBigInt(bigValue);

Internally, addBigInt converts the BigInt to a hex string, packs it into a byte array, and writes it with an 8-bit length prefix using addByteArray8:

addBigInt(n: bigint): Sia {
  let hex = n.toString(16);
  if (hex.length % 2 === 1) {
    hex = "0" + hex;
  }

  const length = hex.length / 2;
  const bytes = new Uint8Array(length);

  for (let i = 0; i < length; i++) {
    bytes[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
  }

  if (length > 255) {
    throw new Error("BigInt too large for this simple implementation");
  }

  return this.addByteArray8(bytes);
}

BigInt serialization uses an 8-bit length prefix, so the byte representation can be at most 255 bytes long. Serializing a value that needs more than 255 bytes throws an error.
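A rough way to check the 255-byte limit ahead of time is to count the bytes the hex packing would produce. `bigIntByteLength` and `fitsInPrefix` are hypothetical helpers for this sketch, and it assumes non-negative values only; how addBigInt treats negative input is not covered here.

```typescript
// Hypothetical helper: byte count produced by packing the hex form two digits per byte.
// Assumes n is non-negative.
function bigIntByteLength(n: bigint): number {
  if (n < BigInt(0)) {
    throw new RangeError("this sketch only handles non-negative values");
  }
  const hex = n.toString(16);
  return Math.ceil(hex.length / 2); // an odd digit count is padded to a full byte
}

const fitsInPrefix = (n: bigint): boolean => bigIntByteLength(n) <= 255;

const sampleValue = BigInt("123456789012345678901234567890");
const sampleLength = bigIntByteLength(sampleValue); // 13

// 2^2040 needs 256 bytes, one more than an 8-bit length prefix can describe.
const tooLarge = BigInt(1) << BigInt(2040);
const largestOk = tooLarge - BigInt(1); // 255 bytes of 0xff
```

Here `fitsInPrefix(largestOk)` is true and `fitsInPrefix(tooLarge)` is false, which matches the boundary described above.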

Strings

Sia provides multiple string encoding methods optimized for different use cases.

UTF-8 Strings

The standard string methods encode text as UTF-8 bytes with a length prefix. The number suffix indicates how many bits are used for the length prefix:

String8

const sia = new Sia();

// 8-bit length prefix: up to 255 bytes
sia.addString8("Hello, world!");

String16

const sia = new Sia();

// 16-bit length prefix: up to 65,535 bytes
sia.addString16("A longer string for descriptions or message bodies");

String32

const sia = new Sia();

// 32-bit length prefix: up to ~4 GB
sia.addString32("Very large content...");

String64

const sia = new Sia();

// 64-bit length prefix: up to 2^53 - 1 bytes
sia.addString64("Extremely large content...");

How It Works

Each string method encodes the string to UTF-8 using TextEncoder, then delegates to the corresponding addByteArray* method:

addString8(str: string): Sia {
  const encodedString = this.encoder.encode(str);
  return this.addByteArray8(encodedString);
}

The binary layout is: [length prefix][UTF-8 bytes]
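The layout can be reproduced by hand to see that the prefix counts bytes rather than characters. `layoutString8` is a hypothetical stand-in written for this illustration, not Sia's own code.

```typescript
// Hypothetical stand-in for the [length prefix][UTF-8 bytes] layout.
function layoutString8(text: string): Uint8Array {
  const utf8 = new TextEncoder().encode(text);
  if (utf8.length > 255) {
    throw new RangeError("UTF-8 form exceeds what an 8-bit prefix can describe");
  }
  const out = new Uint8Array(1 + utf8.length);
  out[0] = utf8.length; // the prefix counts bytes, not characters
  out.set(utf8, 1); // the UTF-8 bytes follow immediately
  return out;
}

const greetingBytes = layoutString8("Grüße");
// "Grüße" has 5 characters but 7 UTF-8 bytes,
// so greetingBytes[0] is 7 and greetingBytes.length is 8
```

This is why the limits in this section are stated in bytes: a 200-character string with many non-ASCII characters may not fit under an 8-bit prefix.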

ASCII Strings (Optimized)

For strings that contain only ASCII characters (codes 0 to 127), the addAscii* methods provide faster encoding by bypassing TextEncoder entirely:

const sia = new Sia();

sia.addAscii8("hello"); // 8-bit length prefix, max 255 chars
sia.addAscii8("HTTP/1.1");
sia.addAsciiN("USD"); // no length prefix (fixed-length field)
sia.addAscii16("longer-ascii-key"); // 16-bit length prefix

Like the UTF-8 string methods, the ASCII methods come in variants by length-prefix size: addAscii8, addAscii16, addAscii32, and addAscii64. There is also addAsciiN, which writes the string with no length prefix at all, so the reader must know the length in advance.

ASCII methods do not validate that the input is actually ASCII. Non-ASCII characters will produce incorrect output silently. Only use these methods when you are certain the string contains only ASCII characters.
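Since the check is left to the caller, one option is a small guard in application code. `isAscii` and `pickStringMethod` below are hypothetical helpers for this sketch, not part of Sia.

```typescript
// True when every UTF-16 code unit is in the ASCII range 0 to 127.
function isAscii(text: string): boolean {
  for (let i = 0; i < text.length; i++) {
    if (text.charCodeAt(i) > 127) return false;
  }
  return true;
}

// Hypothetical wrapper: decide which method is safe to call for a given string.
function pickStringMethod(text: string): "addAscii8" | "addString8" {
  return isAscii(text) ? "addAscii8" : "addString8";
}

const methodForStatus = pickStringMethod("HTTP/1.1"); // "addAscii8"
const methodForGerman = pickStringMethod("Grüße"); // "addString8"
```

The scan costs time of its own, so it suits untrusted input; for constants known to be ASCII it can be skipped.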

UTFZ Strings (Compressed)

The UTFZ encoding compresses UTF-8 strings using the utfz-lib library. It's particularly effective for short multilingual strings:

const sia = new Sia();

sia.addUtfz("Hello, UTFZ!");
sia.addUtfz("Grüße");

UTFZ uses an 8-bit length prefix for the compressed byte count, so the compressed output must fit within 255 bytes.

Choosing the Right String Method

| Method | Encoding | Max Length | Best For |
| --- | --- | --- | --- |
| addString8 | UTF-8 | 255 bytes | Most strings (names, labels) |
| addString16 | UTF-8 | 65,535 bytes | Longer text (descriptions, bodies) |
| addString32 | UTF-8 | ~4 GB | Very large text content |
| addString64 | UTF-8 | 2^53 - 1 bytes | Extremely large content |
| addAsciiN | ASCII | Fixed | Fixed-width ASCII fields |
| addAscii8 | ASCII | 255 chars | Known-ASCII, performance-critical |
| addAscii16 | ASCII | 65,535 chars | Longer ASCII strings |
| addAscii32 | ASCII | ~4 billion chars | Large ASCII payloads |
| addUtfz | UTFZ | 255 bytes | Short multilingual strings |

Byte Arrays

Raw byte data follows the same length-prefix pattern as strings:

With Length Prefix

const sia = new Sia();
const payload = new Uint8Array([0x01, 0x02, 0x03]);

sia.addByteArray8(payload); // 8-bit length prefix (max 255 bytes)
sia.addByteArray16(payload); // 16-bit length prefix (max 65,535 bytes)
sia.addByteArray32(payload); // 32-bit length prefix (max ~4 GB)
sia.addByteArray64(payload); // 64-bit length prefix (max 2^53 - 1)

Without Length Prefix

const sia = new Sia();
const fixedData = new Uint8Array([0xff, 0xfe, 0xfd]);

// Write raw bytes with no length prefix
// The reader must know the exact length
sia.addByteArrayN(fixedData);

Each addByteArray* method writes the length prefix first, then the raw bytes:

addByteArray8(bytes: Uint8Array): Sia {
  return this.addUInt8(bytes.length).addByteArrayN(bytes);
}
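The practical difference between the prefixed and unprefixed forms shows up on the reading side. The sketch below uses two hypothetical helpers, `lengthPrefixed8` and `readLengthPrefixed8`, built on plain Uint8Array operations; Sia's own read methods are not shown in this section and are not assumed here.

```typescript
// Hypothetical writer: 1-byte length, then the raw bytes.
function lengthPrefixed8(payload: Uint8Array): Uint8Array {
  if (payload.length > 255) {
    throw new RangeError("payload too long for an 8-bit length prefix");
  }
  const out = new Uint8Array(1 + payload.length);
  out[0] = payload.length;
  out.set(payload, 1);
  return out;
}

// Hypothetical reader: the prefix says how many bytes belong to this field.
function readLengthPrefixed8(data: Uint8Array, offset: number): Uint8Array {
  const length = data[offset];
  return data.slice(offset + 1, offset + 1 + length);
}

const framed = lengthPrefixed8(new Uint8Array([0x01, 0x02, 0x03]));
// framed is [3, 1, 2, 3]

const recovered = readLengthPrefixed8(framed, 0);
// recovered is [1, 2, 3], found without knowing the length in advance
```

With the unprefixed form the first byte would already be data, so the reader needs the length from somewhere else, such as a fixed field width.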

Arrays

Sia provides generic array methods that accept custom serializer functions. This lets you encode arrays of any type.

Basic Array Serialization

const sia = new Sia();

const scores = [100, 200, 300, 400, 500];

// Write array with 8-bit length prefix (max 255 items)
sia.addArray8(scores, (s, score) => s.addUInt16(score));

Complex Object Arrays

Define a serializer function for structured data:

interface Player {
  name: string;
  score: number;
  alive: boolean;
}

function writePlayer(sia: Sia, player: Player): void {
  sia.addString8(player.name).addUInt32(player.score).addBool(player.alive);
}

const sia = new Sia();
const players: Player[] = [
  { name: "Alice", score: 1500, alive: true },
  { name: "Bob", score: 900, alive: false },
];

sia.addArray8(players, writePlayer);

Array Method Variants

| Method | Max Items | Length Prefix |
| --- | --- | --- |
| addArray8 | 255 | 1 byte |
| addArray16 | 65,535 | 2 bytes |
| addArray32 | ~4 billion | 4 bytes |
| addArray64 | 2^53 - 1 | 8 bytes |

How It Works

addArray8<T>(arr: T[], fn: (s: Sia, item: T) => void): Sia {
  this.addUInt8(arr.length);
  arr.forEach((item) => fn(this, item));
  return this;
}

The method writes the array length as a prefix, then iterates over each item, calling your serializer function. The Sia instance is passed as the first argument, so your function can chain further writes.
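As a worked layout, here is what an 8-bit-prefixed array of UInt16 values looks like on the wire, rebuilt with DataView. `layoutUInt16Array8` is a hypothetical helper for this illustration; note that the prefix is the item count, not the byte count.

```typescript
// Hypothetical stand-in: item count in 1 byte, then each item as a little-endian UInt16.
function layoutUInt16Array8(values: number[]): Uint8Array {
  if (values.length > 255) {
    throw new RangeError("too many items for an 8-bit count");
  }
  const out = new Uint8Array(1 + values.length * 2);
  const view = new DataView(out.buffer);
  view.setUint8(0, values.length); // item count
  values.forEach((value, i) => {
    view.setUint16(1 + i * 2, value, true); // item i, 2 bytes little-endian
  });
  return out;
}

const scoreBytes = layoutUInt16Array8([100, 200, 300]);
// scoreBytes is [3, 100, 0, 200, 0, 44, 1]
// 300 is 0x012C, stored as 0x2C (44) followed by 0x01
```

Three items occupy seven bytes: one for the count and two per item.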

Embedding Data

embedSia: Embed Another Sia Instance

Embed the serialized content of one Sia instance into another:

const header = new Sia();
header.addUInt8(1).addUInt16(42);

const body = new Sia();
body.addString8("payload data");

const packet = new Sia();
packet.embedSia(header).embedSia(body);

This copies the bytes from offset 0 to the current offset of the source Sia into the target.

embedBytes: Embed Raw Bytes

Embed a raw Uint8Array directly:

const sia = new Sia();
const raw = new Uint8Array([0xff, 0xfe, 0xfd]);
sia.embedBytes(raw);

Both embedSia and embedBytes write the bytes without any length prefix. The reader must know the exact layout to deserialize correctly.
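In effect, embedding is plain concatenation. The sketch below imitates that with a hypothetical `concatBytes` helper; the byte values are what the layouts described earlier would give for addUInt8(1).addUInt16(42) and addString8("hi"), written out by hand as an assumption rather than produced by Sia.

```typescript
// Hypothetical helper: join byte arrays back to back, with nothing in between.
function concatBytes(parts: Uint8Array[]): Uint8Array {
  let total = 0;
  for (const part of parts) total += part.length;
  const out = new Uint8Array(total);
  let offset = 0;
  for (const part of parts) {
    out.set(part, offset);
    offset += part.length;
  }
  return out;
}

const headerBytes = new Uint8Array([1, 42, 0]); // UInt8 1, then UInt16 42 little-endian
const bodyBytes = new Uint8Array([2, 104, 105]); // length 2, then "hi"

const packetBytes = concatBytes([headerBytes, bodyBytes]);
// packetBytes is [1, 42, 0, 2, 104, 105]; nothing marks where the header ends
```

If the reader needs to skip or delimit an embedded section, writing it with one of the addByteArray* methods instead gives it a length prefix.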

Method Chaining

All add* methods return this, enabling fluent method chaining:

import { Sia } from "@timeleap/sia";

const sia = new Sia();

const bytes = sia
  .addUInt8(1) // version
  .addString8("Alice") // name
  .addUInt32(1000) // score
  .addBool(true) // active
  .addArray8(
    // inventory
    ["sword", "shield"],
    (s, item) => s.addString8(item),
  )
  .toUint8Array();

This is equivalent to calling each method separately but produces more concise, readable code.

Practical Example: Game State

Here's a complete example serializing a game state with nested objects:

import { Sia } from "@timeleap/sia";

interface GameState {
  version: number;
  timestamp: number;
  players: Player[];
  mapName: string;
  gameOver: boolean;
}

interface Player {
  id: number;
  name: string;
  x: number;
  y: number;
  health: number;
  inventory: string[];
}

function writePlayer(sia: Sia, player: Player): void {
  sia
    .addUInt16(player.id)
    .addString8(player.name)
    .addInt32(player.x)
    .addInt32(player.y)
    .addUInt8(player.health)
    .addArray8(player.inventory, (s, item) => s.addString8(item));
}

function serializeGameState(state: GameState): Uint8Array {
  const sia = new Sia();

  sia
    .addUInt8(state.version)
    .addUInt64(state.timestamp)
    .addArray16(state.players, writePlayer)
    .addString8(state.mapName)
    .addBool(state.gameOver);

  return sia.toUint8Array();
}

// Usage
const state: GameState = {
  version: 1,
  timestamp: Date.now(),
  players: [
    {
      id: 1,
      name: "Alice",
      x: 100,
      y: -50,
      health: 95,
      inventory: ["sword", "potion"],
    },
    {
      id: 2,
      name: "Bob",
      x: -200,
      y: 300,
      health: 60,
      inventory: ["bow", "arrow", "shield"],
    },
  ],
  mapName: "dungeon_01",
  gameOver: false,
};

const bytes = serializeGameState(state);
// Compact binary output: tens of bytes, versus hundreds for the equivalent JSON
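The size claim can be checked with some arithmetic over the field widths used in serializeGameState. The helpers below (`string8Size`, `playerSize`, `gameStateSize`) and the trimmed `SizedPlayer` type are hypothetical, written only for this estimate; they assume the layouts described earlier on this page and do not call Sia.

```typescript
// Only the variable-length fields matter for the estimate.
interface SizedPlayer {
  name: string;
  inventory: string[];
}

const utf8Length = (text: string): number => new TextEncoder().encode(text).length;

// addString8: 1-byte prefix plus the UTF-8 bytes
const string8Size = (text: string): number => 1 + utf8Length(text);

function playerSize(player: SizedPlayer): number {
  // id (2) + name + x (4) + y (4) + health (1) + inventory count (1)
  let size = 2 + string8Size(player.name) + 4 + 4 + 1 + 1;
  for (const item of player.inventory) size += string8Size(item);
  return size;
}

function gameStateSize(players: SizedPlayer[], mapName: string): number {
  let size = 1 + 8 + 2; // version, timestamp, player count (addArray16)
  for (const player of players) size += playerSize(player);
  return size + string8Size(mapName) + 1; // mapName, gameOver
}

const estimatedSize = gameStateSize(
  [
    { name: "Alice", inventory: ["sword", "potion"] },
    { name: "Bob", inventory: ["bow", "arrow", "shield"] },
  ],
  "dungeon_01",
);
// estimatedSize is 87
```

Alice accounts for 31 bytes, Bob for 33, the map name for 11, and the fixed fields for 12.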

Common Patterns

Best Practices

Choose the Smallest Type

Use the smallest integer width that fits your data: addUInt8 for values under 256, addUInt16 for values under 65,536, and so on. This minimizes payload size.

Match Read and Write Order

Every add* call must have a corresponding read* call in exactly the same order. Sia has no field markers: order is the schema.
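A mismatch in order does not raise an error; it just yields wrong values. The sketch below shows this with plain DataView reads standing in for the read side, since Sia's read* methods are not covered in this section.

```typescript
const orderBuffer = new ArrayBuffer(3);
const orderView = new DataView(orderBuffer);

// Writer: a UInt8 followed by a little-endian UInt16, giving bytes [1, 42, 0].
orderView.setUint8(0, 1);
orderView.setUint16(1, 42, true);

// Reading in the same order recovers both values.
const versionRead = orderView.getUint8(0); // 1
const countRead = orderView.getUint16(1, true); // 42

// Reading a UInt16 first misinterprets the same bytes, silently.
const misread = orderView.getUint16(0, true); // 10753 (0x2A01)
```

Keeping the writer and reader for a type side by side in the same module is one way to make such drift easier to spot.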

Use ASCII for Known Strings

When strings are guaranteed to be ASCII (method names, status codes, identifiers), addAscii8 is faster than addString8 because it skips TextEncoder.

Prefer Smaller Array Prefixes

Use addArray8 for collections with fewer than 256 items. The 1-byte length prefix saves space compared to addArray32's 4-byte prefix.