Data Types Guide

Guide to all data types supported by Sia, including integers, strings, byte arrays, and complex types.

Overview

Sia supports integers, strings, byte arrays, booleans, BigInts, and typed arrays. Each type has symmetric add* and read* methods, and the number suffix indicates the bit width used for the value itself (integers) or the length prefix (strings, byte arrays, arrays).

This guide covers every supported type, when to use it, and how to combine them for real-world serialization.

Integer Types

All multi-byte integers are stored in little-endian byte order.
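
To make the byte order concrete, the sketch below writes a 32-bit value in little-endian order and inspects the resulting bytes. It uses the standard DataView API rather than Sia itself, and the helper name toLittleEndian32 is hypothetical.

```typescript
// Illustration only: standard DataView, not Sia's internals.
// toLittleEndian32 is a hypothetical helper name.
function toLittleEndian32(value: number): Uint8Array {
  const bytes = new Uint8Array(4);
  // The third argument `true` selects little-endian byte order.
  new DataView(bytes.buffer).setUint32(0, value, true);
  return bytes;
}

const leBytes = toLittleEndian32(0x12345678);
// The least significant byte comes first: 78 56 34 12
const leHex = Array.from(leBytes, (b) => b.toString(16).padStart(2, "0")).join(" ");
```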

Unsigned Integers

Unsigned integers store non-negative whole numbers. The suffix indicates the storage size in bits.

import { Sia } from "@timeleap/sia";

const sia = new Sia();

sia.addUInt8(255); // 1 byte  — range: 0 to 255
sia.addUInt16(65535); // 2 bytes — range: 0 to 65,535
sia.addUInt32(4294967295); // 4 bytes — range: 0 to 4,294,967,295
sia.addUInt64(Date.now()); // 8 bytes — range: 0 to 2^53 - 1*

sia.seek(0);

const a = sia.readUInt8(); // 255
const b = sia.readUInt16(); // 65535
const c = sia.readUInt32(); // 4294967295
const d = sia.readUInt64(); // timestamp

The ranges marked with * are limited because 64-bit integers use JavaScript number (a 64-bit float), which is exact only up to Number.MAX_SAFE_INTEGER (2^53 - 1). For larger values, use addBigInt.
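
The precision limit is a property of JavaScript numbers and can be shown without Sia. The guard function assertSafeUInt64 below is a hypothetical helper, one possible check to run before passing untrusted input to addUInt64.

```typescript
// Illustration of the 2^53 - 1 limit using plain JavaScript numbers.
const maxSafe = Number.MAX_SAFE_INTEGER; // 9007199254740991

// Beyond the limit, distinct integers collapse to the same float.
const collides = maxSafe + 1 === maxSafe + 2; // true

// Hypothetical guard: rejects values that cannot be stored exactly.
function assertSafeUInt64(value: number): number {
  if (!Number.isSafeInteger(value) || value < 0) {
    throw new RangeError(`${value} cannot be stored exactly as a 64-bit number`);
  }
  return value;
}
```

Values that fail this check are candidates for addBigInt instead.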

Signed Integers

Signed integers store values that can be negative. They use two's complement encoding.

const sia = new Sia();

sia.addInt8(-128); // 1 byte  — range: -128 to 127
sia.addInt16(-32768); // 2 bytes — range: -32,768 to 32,767
sia.addInt32(-2147483648); // 4 bytes — range: -2^31 to 2^31 - 1
sia.addInt64(-1000000000); // 8 bytes — range: -(2^53 - 1) to 2^53 - 1*

sia.seek(0);

const a = sia.readInt8(); // -128
const b = sia.readInt16(); // -32768
const c = sia.readInt32(); // -2147483648
const d = sia.readInt64(); // -1000000000
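
Two's complement means the same bytes decode to different numbers depending on whether they are read as signed or unsigned. The sketch below shows this with the standard DataView API (not Sia), which is why a value written with addInt* should be read back with the matching readInt*.

```typescript
// Illustration only: standard DataView, not Sia's internals.
const tcView = new DataView(new ArrayBuffer(2));

tcView.setInt8(0, -128);
const rawMin = tcView.getUint8(0); // 128 (0x80): the bit pattern for -128

tcView.setInt8(0, -1);
const rawMinusOne = tcView.getUint8(0); // 255 (0xff)

// Reading a stored value with the wrong signedness silently changes it.
tcView.setInt16(0, -32768, true);
const misreadSigned = tcView.getUint16(0, true); // 32768
```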

When to Use Each Integer Size

Type	Bytes	Range	Use Case
`UInt8` / `Int8`	1	0–255 / -128–127	Flags, enum values, small counters
`UInt16` / `Int16`	2	0–65,535 / -32,768–32,767	Ports, medium counters, character codes
`UInt32` / `Int32`	4	0–4.29B / -2.15B–2.15B	IDs, timestamps (seconds), counts
`UInt64` / `Int64`	8	0–(2^53 - 1) / -(2^53 - 1)–(2^53 - 1)	Timestamps (ms), large counters, offsets
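
The table can be turned into a small decision helper. The function smallestUInt below is hypothetical and only returns a method name. It assumes you know the largest value a field can ever hold.

```typescript
type UIntMethod = "addUInt8" | "addUInt16" | "addUInt32" | "addUInt64";

// Hypothetical helper: narrowest unsigned method for a given maximum value.
function smallestUInt(maxValue: number): UIntMethod {
  if (!Number.isSafeInteger(maxValue) || maxValue < 0) {
    throw new RangeError("maxValue must be a non-negative safe integer");
  }
  if (maxValue <= 0xff) return "addUInt8";
  if (maxValue <= 0xffff) return "addUInt16";
  if (maxValue <= 0xffffffff) return "addUInt32";
  return "addUInt64";
}
```

For example, an age capped at 150 maps to addUInt8, while a millisecond timestamp maps to addUInt64.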

BigInt Type

For arbitrary-precision integers that exceed Number.MAX_SAFE_INTEGER, use addBigInt and readBigInt.

const sia = new Sia();

const largeValue = 123456789012345678901234567890n;
sia.addBigInt(largeValue);

sia.seek(0);
const result = sia.readBigInt(); // 123456789012345678901234567890n

BigInt values are converted to their hex representation, packed into bytes, and stored as a byte array with an 8-bit length prefix.

Because the length prefix is 8 bits, BigInt serialization is limited to values whose byte representation fits in 255 bytes. Larger values throw an error. For most use cases (cryptographic hashes, blockchain values, large counters), this is more than sufficient.
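
The hex-to-bytes step can be sketched as follows. This is an illustration under assumptions, not Sia's code: the helpers bigIntToBytes and bytesToBigInt are hypothetical, only non-negative values are handled, and bytes are taken in the order they appear in the hex string, which may differ from Sia's actual layout.

```typescript
// Sketch under assumptions: non-negative values only, bytes in hex-string order.
function bigIntToBytes(value: bigint): Uint8Array {
  if (value < BigInt(0)) throw new RangeError("negative values are not handled in this sketch");
  let hex = value.toString(16);
  if (hex.length % 2 !== 0) hex = "0" + hex; // pad to whole bytes
  const bytes = new Uint8Array(hex.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
  }
  // An 8-bit length prefix can describe at most 255 bytes.
  if (bytes.length > 255) throw new RangeError("does not fit an 8-bit length prefix");
  return bytes;
}

function bytesToBigInt(bytes: Uint8Array): bigint {
  const hex = Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
  return BigInt("0x" + (hex || "0"));
}
```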

Boolean Type

Booleans are stored as a single byte: 1 for true, 0 for false.

const sia = new Sia();

sia.addBool(true);
sia.addBool(false);

sia.seek(0);
sia.readBool(); // true
sia.readBool(); // false

Booleans are commonly used as flags in serialized structures:

// Serialize a record with optional fields
sia.addBool(hasEmail);
if (hasEmail) {
  sia.addString8(email);
}
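
The reader has to mirror this pattern: read the flag first, then read the field only if the flag is set. The hand-rolled sketch below shows both sides for a flag byte followed by an 8-bit-prefixed UTF-8 string. It mirrors the layout described in this guide but does not use Sia, and the function names are hypothetical.

```typescript
// Hand-rolled illustration of a presence flag plus an optional string.
function encodeOptionalString(value: string | null): Uint8Array {
  if (value === null) return new Uint8Array([0]); // flag only
  const utf8 = new TextEncoder().encode(value);
  if (utf8.length > 255) throw new RangeError("string exceeds 255 bytes");
  const out = new Uint8Array(2 + utf8.length);
  out[0] = 1; // presence flag
  out[1] = utf8.length; // 8-bit length prefix
  out.set(utf8, 2);
  return out;
}

function decodeOptionalString(data: Uint8Array): string | null {
  if (data[0] === 0) return null; // field absent, nothing more to read
  const length = data[1];
  return new TextDecoder().decode(data.subarray(2, 2 + length));
}
```

An absent field costs a single byte, which is the main appeal of this pattern.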

String Types

Sia offers multiple string encoding methods, each suited to different data profiles.

ASCII

The fastest encoding option. Supports only ASCII characters (code points 0–127). The string is prefixed with an 8-bit length.

const sia = new Sia();

sia.addAscii8("hello");
sia.seek(0);
sia.readAscii8(); // "hello"

Max length: 255 characters.
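
Before choosing the ASCII path for data you do not control, it can help to verify both constraints. The function fitsAscii8 below is a hypothetical guard, not part of Sia.

```typescript
// Hypothetical guard: true when a string is safe for an 8-bit-prefixed ASCII field.
function fitsAscii8(text: string): boolean {
  if (text.length > 255) return false; // longer than the prefix can describe
  for (let i = 0; i < text.length; i++) {
    if (text.charCodeAt(i) > 127) return false; // outside the ASCII range
  }
  return true;
}
```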

UTFZ (Compressed UTF-8)

A compressed UTF-8 encoding via the utfz-lib library. Produces smaller output than standard UTF-8 for many strings, especially short multilingual text. The encoded data is prefixed with an 8-bit length.

const sia = new Sia();

sia.addUtfz("Hallo Welt");
sia.seek(0);
sia.readUtfz(); // "Hallo Welt"

Max encoded length: 255 bytes.

String8 / String16 / String32 / String64

Standard UTF-8 encoding with different length prefix sizes. The suffix indicates the bit width of the length prefix, which determines the maximum string size.

const sia = new Sia();

sia.addString8("short"); // 8-bit prefix → up to 255 bytes
sia.addString16("medium length"); // 16-bit prefix → up to 65,535 bytes
sia.addString32("long text..."); // 32-bit prefix → up to ~4 GB
sia.addString64("huge..."); // 64-bit prefix → up to 2^53 bytes

sia.seek(0);
sia.readString8(); // "short"
sia.readString16(); // "medium length"
sia.readString32(); // "long text..."
sia.readString64(); // "huge..."

When to Use Each String Type

Method	Prefix	Max Size	Best For
`addAsciiN`	None	Fixed	Fixed-width codes (currency, country)
`addAscii8`	8-bit	255 chars	Identifiers, keys, enum-like values (ASCII only)
`addAscii16`	16-bit	65,535 chars	Longer ASCII strings
`addUtfz`	8-bit	255 bytes	Short multilingual strings, labels
`addString8`	8-bit	255 bytes	Typical short strings (names, titles)
`addString16`	16-bit	64 KB	Descriptions, message bodies
`addString32`	32-bit	~4 GB	Large text content, documents
`addString64`	64-bit	~2^53 bytes	Extremely large text (rare)

Prefer addAscii8 for strings you know are ASCII-only (identifiers, enum values, HTTP headers). It skips TextEncoder and does a direct byte copy. Use addAsciiN for fixed-width fields whose length is fixed by your format, so both writer and reader know it in advance. Use addString8 as the general-purpose default.
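
The String* limits are measured in encoded bytes, not characters, so a string with non-ASCII characters can outgrow a prefix sooner than its length suggests. The helper pickStringMethod below is hypothetical and only returns a method name based on the UTF-8 byte length.

```typescript
// Hypothetical helper: picks a method name from the UTF-8 byte length.
function pickStringMethod(text: string): string {
  const byteLength = new TextEncoder().encode(text).length;
  if (byteLength <= 0xff) return "addString8";
  if (byteLength <= 0xffff) return "addString16";
  if (byteLength <= 0xffffffff) return "addString32";
  return "addString64";
}

// One character, three bytes in UTF-8.
const euroBytes = new TextEncoder().encode("\u20ac").length; // 3
```

A string of 86 euro signs is only 86 characters long but encodes to 258 bytes, so it no longer fits an 8-bit prefix.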

Byte Array Types

Byte arrays store raw binary data (Uint8Array) with a length prefix. The suffix indicates the length prefix size.

const sia = new Sia();
const payload = new Uint8Array([0x01, 0x02, 0x03, 0x04]);

sia.addByteArray8(payload); // 8-bit length prefix
sia.addByteArray16(payload); // 16-bit length prefix
sia.addByteArray32(payload); // 32-bit length prefix
sia.addByteArray64(payload); // 64-bit length prefix

sia.seek(0);
sia.readByteArray8(); // Uint8Array [1, 2, 3, 4]
sia.readByteArray16(); // Uint8Array [1, 2, 3, 4]
sia.readByteArray32(); // Uint8Array [1, 2, 3, 4]
sia.readByteArray64(); // Uint8Array [1, 2, 3, 4]

Raw Byte Arrays (No Length Prefix)

Use addByteArrayN when the length is known at read time (e.g., fixed-size fields like hashes or keys):

const hash = new Uint8Array(32); // SHA-256 hash is always 32 bytes
sia.addByteArrayN(hash);

// Reader knows the length
sia.seek(0);
const readHash = sia.readByteArrayN(32);

Zero-Copy Reads with `asReference`

By default, readByteArray* copies the bytes out of the buffer. Pass true to get a zero-copy subarray reference instead:

// Copy (safe, independent of source buffer)
const copy = sia.readByteArray32();

// Reference (zero-copy, shares memory with source)
const ref = sia.readByteArray32(true);

References share memory with the Sia buffer. If the buffer is reused or overwritten, the reference will see the changes. Only use references for data you will consume immediately.
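
The difference between a copy and a reference is the same as the difference between slice and subarray on a standard Uint8Array. The sketch below uses only those built-in methods, so it shows the general behaviour rather than Sia's internals.

```typescript
// Standard Uint8Array behaviour: slice copies, subarray shares memory.
const source = new Uint8Array([1, 2, 3, 4]);

const copied = source.slice(0, 2); // new backing memory
const shared = source.subarray(0, 2); // view over the same memory

source[0] = 99; // simulate the buffer being reused

// copied[0] is still 1, while shared[0] is now 99.
```

If a reference needs to outlive the buffer, copying it with slice at that point is one way to detach it.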

Array Types

Sia serializes arrays as an element count followed by the elements, which you encode and decode through callbacks. The suffix indicates the bit width of the element count prefix.

Simple Arrays

const sia = new Sia();
const numbers = [10, 20, 30, 40, 50];

// Serialize: provide an array and a callback to write each element
sia.addArray8(numbers, (s, n) => s.addUInt8(n));

sia.seek(0);

// Deserialize: provide a callback to read each element
const result = sia.readArray8((s) => s.readUInt8());
// [10, 20, 30, 40, 50]
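
To picture what the count prefix does, here is a hand-rolled sketch of an 8-bit count followed by one byte per element. It does not use Sia, and the function names are hypothetical.

```typescript
// Hand-rolled illustration of an 8-bit element count followed by the elements.
function encodeUInt8Array(values: number[]): Uint8Array {
  if (values.length > 255) throw new RangeError("too many elements for an 8-bit count");
  const out = new Uint8Array(1 + values.length);
  out[0] = values.length; // element count, not byte length
  out.set(values, 1);
  return out;
}

function decodeUInt8Array(data: Uint8Array): number[] {
  const count = data[0];
  return Array.from(data.subarray(1, 1 + count));
}
```

With this layout, five one-byte elements take six bytes in total.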

Complex Object Arrays

Array callbacks work with any data type, including nested objects:

interface User {
  name: string;
  age: number;
  active: boolean;
}

const users: User[] = [
  { name: "Alice", age: 30, active: true },
  { name: "Bob", age: 25, active: false },
];

const sia = new Sia();

// Serialize
sia.addArray8(users, (s, user) => {
  s.addString8(user.name).addUInt8(user.age).addBool(user.active);
});

// Deserialize
sia.seek(0);
const decoded = sia.readArray8((s) => ({
  name: s.readString8(),
  age: s.readUInt8(),
  active: s.readBool(),
}));
// [{ name: "Alice", age: 30, active: true }, { name: "Bob", age: 25, active: false }]

Array Size Variants

Method	Count Prefix	Max Elements
`addArray8` / `readArray8`	8-bit	255
`addArray16` / `readArray16`	16-bit	65,535
`addArray32` / `readArray32`	32-bit	~4.29 billion
`addArray64` / `readArray64`	64-bit	~2^53

Choosing the Right Type

Pick the smallest type that fits your data. Every extra byte in a length prefix or value is overhead multiplied by every record you serialize.

Size vs. Overhead Trade-Off

Data	Naive Choice	Optimal Choice	Savings per Record
Age (0–150)	`UInt32` (4 bytes)	`UInt8` (1 byte)	3 bytes
Country code	`String32` (4 + N bytes)	`addAscii8` (1 + N bytes)	3 bytes
Short name	`String32` (4 + N bytes)	`String8` (1 + N bytes)	3 bytes
Enum (0–10)	`UInt16` (2 bytes)	`UInt8` (1 byte)	1 byte
Timestamp (ms)	`String32` (JSON-style)	`UInt64` (8 bytes)	~5 bytes
Small list (< 256)	`Array32` (4-byte prefix)	`Array8` (1-byte prefix)	3 bytes

These savings add up quickly. In a message containing 1,000 user records, saving 10 bytes per record saves 10 KB per message.
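
As a worked example of that arithmetic, the snippet below sums four of the per-record savings from the table and scales them to 1,000 records. The choice of which rows to combine is an assumption made for illustration.

```typescript
// Per-record savings taken from the table above (in bytes).
const savingsPerRecord = {
  age: 3, // UInt32 -> UInt8
  countryCode: 3, // String32 -> addAscii8
  shortName: 3, // String32 -> String8
  enumValue: 1, // UInt16 -> UInt8
};

const perRecord = Object.values(savingsPerRecord).reduce((sum, n) => sum + n, 0); // 10
const perMessage = perRecord * 1000; // 10,000 bytes for 1,000 records
```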

Best Practices

Use the smallest integer type that fits your range. Don't use UInt32 for a value that never exceeds 255.
Use addAscii8 for ASCII-only strings. It is faster than UTF-8 encoding and produces the same output for ASCII content.
Use addByteArrayN for fixed-size data. When the length is known at read time (hashes, keys, UUIDs), skip the length prefix.
Pass true to readByteArray* for temporary data. Zero-copy reads avoid allocations in hot paths.
Use addArray8 for small collections. Most arrays in practice have fewer than 256 elements. Save 1–7 bytes on the count prefix.
Store timestamps as UInt64. Millisecond timestamps fit comfortably in 8 bytes and are much smaller than ISO string representations.
Use booleans as presence flags for optional fields. Prefix optional data with addBool so the reader knows whether to expect the field.
Match add* and read* calls exactly. Every serialization must have a matching deserialization in the same order with the same types. There is no schema to catch mismatches at compile time.
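
The last point is easy to underestimate because a mismatch does not fail loudly. The sketch below uses the standard DataView API (not Sia) to show a writer storing a 16-bit value while the reader consumes it as two 8-bit values.

```typescript
// Illustration with DataView: writer and reader disagree on the types.
const mmView = new DataView(new ArrayBuffer(3));
mmView.setUint16(0, 1000, true); // writer: UInt16 (little-endian)
mmView.setUint8(2, 7); // writer: UInt8

const firstRead = mmView.getUint8(0); // reader gets 232, not 1000
const secondRead = mmView.getUint8(1); // reader gets 3, not 7
```

Both reads succeed and return plausible numbers, so every later field is shifted without any error being raised.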

Real-World Example: User Profile Serialization

Here is a complete example combining multiple data types to serialize and deserialize a user profile:

import { Sia } from "@timeleap/sia";

interface UserProfile {
  id: number;
  username: string;
  email: string;
  age: number;
  isVerified: boolean;
  bio: string | null;
  permissions: number[];
  avatar: Uint8Array | null;
  createdAt: number;
}

function serializeProfile(profile: UserProfile): Uint8Array {
  const sia = new Sia();

  sia
    .addUInt32(profile.id) // 4 bytes: numeric ID
    .addAscii8(profile.username) // 1 + N bytes: ASCII username
    .addString8(profile.email) // 1 + N bytes: UTF-8 email
    .addUInt8(profile.age) // 1 byte: age 0–255
    .addBool(profile.isVerified) // 1 byte: boolean flag
    .addBool(profile.bio !== null); // 1 byte: presence flag

  if (profile.bio !== null) {
    sia.addString16(profile.bio); // 2 + N bytes: bio text (may be long)
  }

  sia.addArray8(profile.permissions, (s, perm) => s.addUInt8(perm));

  sia.addBool(profile.avatar !== null);
  if (profile.avatar !== null) {
    sia.addByteArray16(profile.avatar); // 2 + N bytes: avatar data
  }

  sia.addUInt64(profile.createdAt); // 8 bytes: timestamp in ms

  return sia.toUint8Array();
}

function deserializeProfile(data: Uint8Array): UserProfile {
  const sia = new Sia(data);

  const id = sia.readUInt32();
  const username = sia.readAscii8();
  const email = sia.readString8();
  const age = sia.readUInt8();
  const isVerified = sia.readBool();

  const hasBio = sia.readBool();
  const bio = hasBio ? sia.readString16() : null;

  const permissions = sia.readArray8((s) => s.readUInt8());

  const hasAvatar = sia.readBool();
  const avatar = hasAvatar ? sia.readByteArray16() : null;

  const createdAt = sia.readUInt64();

  return {
    id,
    username,
    email,
    age,
    isVerified,
    bio,
    permissions,
    avatar,
    createdAt,
  };
}

This layout spends 19 bytes on fixed-size fields, presence flags, and length prefixes (23 when both bio and avatar are present), plus the variable-length data itself, compared to hundreds of bytes for the equivalent JSON representation.
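
The size of a serialized profile can be estimated from the field comments in serializeProfile. The helper expectedProfileSize and the ProfileSizes type below are hypothetical and assume the byte counts stated in those comments.

```typescript
// Hypothetical helper: byte count implied by the comments in serializeProfile.
interface ProfileSizes {
  usernameBytes: number;
  emailBytes: number;
  bioBytes: number | null; // null when bio is absent
  permissionCount: number;
  avatarBytes: number | null; // null when avatar is absent
}

function expectedProfileSize(p: ProfileSizes): number {
  let size = 4 + 1 + 1 + 8; // id + age + isVerified + createdAt
  size += 1 + p.usernameBytes; // Ascii8 prefix + data
  size += 1 + p.emailBytes; // String8 prefix + data
  size += 1 + (p.bioBytes === null ? 0 : 2 + p.bioBytes); // flag + String16
  size += 1 + p.permissionCount; // Array8 count + one UInt8 each
  size += 1 + (p.avatarBytes === null ? 0 : 2 + p.avatarBytes); // flag + ByteArray16
  return size;
}
```

For a 5-byte username, a 17-byte email, three permissions, and no bio or avatar, this gives 44 bytes.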
