Data Types Guide

Guide to all data types supported by Sia, including integers, strings, byte arrays, and complex types.

Overview

Sia supports integers, booleans, BigInts, strings, byte arrays, and arrays of arbitrary elements. Each type has symmetric add* and read* methods. Where a method name ends in a number, that number is the bit width of the value itself (integers), of the length prefix (strings, byte arrays), or of the element count prefix (arrays).

This guide covers every supported type, when to use it, and how to combine them for real-world serialization.

Integer Types

All multi-byte integers are stored in little-endian byte order.
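To make the byte order concrete, the sketch below uses the standard DataView API rather than Sia itself. The helper names toLittleEndianBytes and fromLittleEndianBytes are hypothetical; the code only illustrates what little-endian storage means for a 32-bit value.

```typescript
// Hypothetical helper: encode a 32-bit unsigned integer in little-endian order
// with the standard DataView API (an illustration, not Sia's internals).
function toLittleEndianBytes(value: number): number[] {
  const buffer = new ArrayBuffer(4);
  const view = new DataView(buffer);
  view.setUint32(0, value, true); // third argument true = little-endian
  return Array.from(new Uint8Array(buffer));
}

// Hypothetical helper: decode the same four bytes back into a number.
function fromLittleEndianBytes(bytes: number[]): number {
  const view = new DataView(new Uint8Array(bytes).buffer);
  return view.getUint32(0, true);
}

// The least significant byte (0x78) is stored first.
const leBytes = toLittleEndianBytes(0x12345678); // [0x78, 0x56, 0x34, 0x12]
const leValue = fromLittleEndianBytes(leBytes); // 0x12345678
```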

Unsigned Integers

Unsigned integers store non-negative whole numbers. The suffix indicates the storage size in bits.

import { Sia } from "@timeleap/sia";

const sia = new Sia();

sia.addUInt8(255); // 1 byte  — range: 0 to 255
sia.addUInt16(65535); // 2 bytes — range: 0 to 65,535
sia.addUInt32(4294967295); // 4 bytes — range: 0 to 4,294,967,295
sia.addUInt64(Date.now()); // 8 bytes — range: 0 to 2^53 - 1*

sia.seek(0);

const a = sia.readUInt8(); // 255
const b = sia.readUInt16(); // 65535
const c = sia.readUInt32(); // 4294967295
const d = sia.readUInt64(); // timestamp

* 64-bit integers use JavaScript number (a 64-bit float), so values are accurate only up to Number.MAX_SAFE_INTEGER (2^53 - 1). For larger values, use addBigInt.
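The precision limit is easy to observe with plain JavaScript numbers. The following sketch uses only standard Number APIs; isExactAsNumber is a hypothetical guard you might run before choosing between the 64-bit methods and addBigInt.

```typescript
// Number.MAX_SAFE_INTEGER is 2^53 - 1. Above it, neighbouring integers
// can no longer be told apart in a 64-bit float.
const maxSafe: number = Number.MAX_SAFE_INTEGER; // 9007199254740991
const precisionLost: boolean = maxSafe + 1 === maxSafe + 2; // true

// Hypothetical guard: is this value an integer that a number holds exactly?
function isExactAsNumber(value: number): boolean {
  return Number.isSafeInteger(value);
}

const timestampOk = isExactAsNumber(Date.now()); // true for present-day timestamps
const tooLarge = isExactAsNumber(Math.pow(2, 53)); // false: use a bigint instead
```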

Signed Integers

Signed integers store values that can be negative. They use two's complement encoding.

const sia = new Sia();

sia.addInt8(-128); // 1 byte  — range: -128 to 127
sia.addInt16(-32768); // 2 bytes — range: -32,768 to 32,767
sia.addInt32(-2147483648); // 4 bytes — range: -2^31 to 2^31 - 1
sia.addInt64(-1000000000); // 8 bytes — range: -(2^53 - 1) to 2^53 - 1*

sia.seek(0);

const a = sia.readInt8(); // -128
const b = sia.readInt16(); // -32768
const c = sia.readInt32(); // -2147483648
const d = sia.readInt64(); // -1000000000
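As an illustration of two's complement, the sketch below shows which raw byte an 8-bit signed value occupies. It uses the standard DataView API, not Sia; int8ToByte and byteToInt8 are hypothetical helper names.

```typescript
// Hypothetical helper: return the raw byte that an 8-bit signed value occupies.
function int8ToByte(value: number): number {
  const view = new DataView(new ArrayBuffer(1));
  view.setInt8(0, value);
  return view.getUint8(0); // reinterpret the same bits as unsigned
}

const minusOneByte = int8ToByte(-1); // 0xff
const minimumByte = int8ToByte(-128); // 0x80
const maximumByte = int8ToByte(127); // 0x7f

// Hypothetical helper: interpret an unsigned byte as signed.
// When the top bit is set, the value is the byte minus 256.
function byteToInt8(byte: number): number {
  return byte >= 0x80 ? byte - 0x100 : byte;
}
```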

When to Use Each Integer Size

| Type | Bytes | Range | Use Case |
| --- | --- | --- | --- |
| UInt8 / Int8 | 1 | 0 to 255 / -128 to 127 | Flags, enum values, small counters |
| UInt16 / Int16 | 2 | 0 to 65,535 / -32,768 to 32,767 | Ports, medium counters, character codes |
| UInt32 / Int32 | 4 | 0 to ~4.29 billion / ~-2.15 billion to ~2.15 billion | IDs, timestamps (seconds), counts |
| UInt64 / Int64 | 8 | 0 to 2^53 - 1 / -(2^53 - 1) to 2^53 - 1 | Timestamps (ms), large counters, offsets |
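The unsigned ranges above can be turned into a small selection rule. This is a sketch only: smallestUIntBits is a hypothetical helper, not part of Sia, and it assumes you know the largest value a field can ever hold.

```typescript
type UIntBits = 8 | 16 | 32 | 64;

// Hypothetical helper: pick the narrowest unsigned width that can hold maxValue.
function smallestUIntBits(maxValue: number): UIntBits {
  if (!Number.isSafeInteger(maxValue) || maxValue < 0) {
    throw new RangeError("expected a non-negative safe integer");
  }
  if (maxValue <= 0xff) return 8; // up to 255
  if (maxValue <= 0xffff) return 16; // up to 65,535
  if (maxValue <= 0xffffffff) return 32; // up to 4,294,967,295
  return 64; // up to 2^53 - 1 when stored as a number
}

const ageBits = smallestUIntBits(150); // 8
const portBits = smallestUIntBits(65535); // 16
const msTimestampBits = smallestUIntBits(1700000000000); // 64
```

A millisecond timestamp lands in the 64-bit bucket because it already exceeds the 32-bit maximum, which matches the use cases listed in the table.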

BigInt Type

For arbitrary-precision integers that exceed Number.MAX_SAFE_INTEGER, use addBigInt and readBigInt.

const sia = new Sia();

const largeValue = 123456789012345678901234567890n;
sia.addBigInt(largeValue);

sia.seek(0);
const result = sia.readBigInt(); // 123456789012345678901234567890n

BigInt values are stored as a byte array with an 8-bit length prefix (the hex representation is converted to bytes).

Because the length prefix is 8 bits, serialization is limited to values whose byte representation fits in 255 bytes; larger values throw an error. For most use cases (cryptographic hashes, blockchain values, large counters), this is more than sufficient.
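The following sketch shows one way a "hex to bytes with an 8-bit length prefix" layout can work. It is an assumption-laden illustration, not Sia's actual implementation: the function names encodeBigInt8 and decodeBigInt8 are hypothetical, the byte order (most significant byte first) is assumed, and negative values are deliberately not handled.

```typescript
// Hypothetical encoder: [length byte][value bytes, most significant first].
function encodeBigInt8(value: bigint): Uint8Array {
  if (value < 0n) {
    throw new RangeError("this sketch handles non-negative values only");
  }
  let hex = value.toString(16);
  if (hex.length % 2 !== 0) hex = "0" + hex; // pad to whole bytes
  const byteLength = hex.length / 2;
  if (byteLength > 255) {
    throw new RangeError("value does not fit in 255 bytes");
  }
  const out = new Uint8Array(1 + byteLength);
  out[0] = byteLength; // 8-bit length prefix
  for (let i = 0; i < byteLength; i++) {
    out[1 + i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
  }
  return out;
}

// Hypothetical decoder for the layout produced above.
function decodeBigInt8(bytes: Uint8Array): bigint {
  const byteLength = bytes[0];
  let hex = "";
  for (let i = 0; i < byteLength; i++) {
    hex += bytes[1 + i].toString(16).padStart(2, "0");
  }
  return hex === "" ? 0n : BigInt("0x" + hex);
}

const bigValue = 123456789012345678901234567890n;
const bigEncoded = encodeBigInt8(bigValue);
const bigDecoded = decodeBigInt8(bigEncoded); // equal to bigValue
```

In this sketch the 255-byte ceiling falls directly out of the single length byte: the largest encodable value is 2^2040 - 1.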

Boolean Type

Booleans are stored as a single byte: 1 for true, 0 for false.

const sia = new Sia();

sia.addBool(true);
sia.addBool(false);

sia.seek(0);
sia.readBool(); // true
sia.readBool(); // false

Booleans are commonly used as flags in serialized structures:

// Serialize a record with optional fields
sia.addBool(hasEmail);
if (hasEmail) {
  sia.addString8(email);
}
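The presence-flag pattern needs a matching reader that checks the flag before reading the field. The sketch below shows both sides on a raw byte layout (one flag byte, then an 8-bit length and UTF-8 bytes). It uses standard TextEncoder and TextDecoder rather than Sia, and encodeOptionalString8 and decodeOptionalString8 are hypothetical names.

```typescript
// Hypothetical writer: flag byte, then (if present) length byte + UTF-8 bytes.
function encodeOptionalString8(value: string | null): Uint8Array {
  if (value === null) return new Uint8Array([0]); // flag only: field absent
  const utf8 = new TextEncoder().encode(value);
  if (utf8.length > 255) {
    throw new RangeError("an 8-bit length prefix holds at most 255 bytes");
  }
  const out = new Uint8Array(2 + utf8.length);
  out[0] = 1; // presence flag
  out[1] = utf8.length; // 8-bit length prefix
  out.set(utf8, 2);
  return out;
}

// Hypothetical reader: consult the flag first, then read only if present.
function decodeOptionalString8(bytes: Uint8Array): string | null {
  if (bytes[0] === 0) return null;
  const byteLength = bytes[1];
  return new TextDecoder().decode(bytes.subarray(2, 2 + byteLength));
}

const withEmail = encodeOptionalString8("a@b.c"); // 7 bytes: flag + length + 5
const withoutEmail = encodeOptionalString8(null); // 1 byte: flag only
```

An absent field costs a single byte, which is the main appeal of the pattern for sparse records.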

String Types

Sia offers multiple string encoding methods, each suited to different data profiles.

ASCII

The fastest encoding option. Supports only ASCII characters (code points 0–127). The string is prefixed with an 8-bit length.

const sia = new Sia();

sia.addAscii8("hello");
sia.seek(0);
sia.readAscii8(); // "hello"

Max length: 255 characters.

UTFZ (Compressed UTF-8)

A compressed UTF-8 encoding via the utfz-lib library. Produces smaller output than standard UTF-8 for many strings, especially short multilingual text. The encoded data is prefixed with an 8-bit length.

const sia = new Sia();

sia.addUtfz("Hallo Welt");
sia.seek(0);
sia.readUtfz(); // "Hallo Welt"

Max encoded length: 255 bytes.

String8 / String16 / String32 / String64

Standard UTF-8 encoding with different length prefix sizes. The suffix indicates the bit width of the length prefix, which determines the maximum string size.

const sia = new Sia();

sia.addString8("short"); // 8-bit prefix → up to 255 bytes
sia.addString16("medium length"); // 16-bit prefix → up to 65,535 bytes
sia.addString32("long text..."); // 32-bit prefix → up to ~4 GB
sia.addString64("huge..."); // 64-bit prefix → up to 2^53 bytes

sia.seek(0);
sia.readString8(); // "short"
sia.readString16(); // "medium length"
sia.readString32(); // "long text..."
sia.readString64(); // "huge..."
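Because the limits are in bytes, not characters, a string with fewer than 256 characters can still overflow an 8-bit prefix once it is UTF-8 encoded. The sketch below measures the encoded size with the standard TextEncoder; utf8ByteLength and fitsString8 are hypothetical helpers.

```typescript
// Hypothetical helper: size of a string once encoded as UTF-8.
function utf8ByteLength(text: string): number {
  return new TextEncoder().encode(text).length;
}

// Hypothetical helper: does the encoded string fit behind an 8-bit length prefix?
function fitsString8(text: string): boolean {
  return utf8ByteLength(text) <= 255;
}

const asciiLength = utf8ByteLength("hello"); // 5 characters, 5 bytes
const accentedLength = utf8ByteLength("h\u00e9llo"); // 5 characters, 6 bytes
const cjkLength = utf8ByteLength("\u65e5\u672c\u8a9e"); // 3 characters, 9 bytes

// 128 two-byte characters encode to 256 bytes, one too many for an 8-bit prefix.
const fitsAccented = fitsString8("\u00e9".repeat(128)); // false
```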

When to Use Each String Type

| Method | Prefix | Max Size | Best For |
| --- | --- | --- | --- |
| addAsciiN | None | Fixed | Fixed-width codes (currency, country) |
| addAscii8 | 8-bit | 255 chars | Identifiers, keys, enum-like values (ASCII only) |
| addAscii16 | 16-bit | 65,535 chars | Longer ASCII strings |
| addUtfz | 8-bit | 255 bytes | Short multilingual strings, labels |
| addString8 | 8-bit | 255 bytes | Typical short strings (names, titles) |
| addString16 | 16-bit | 65,535 bytes | Descriptions, message bodies |
| addString32 | 32-bit | ~4 GB | Large text content, documents |
| addString64 | 64-bit | ~2^53 bytes | Extremely large text (rare) |

Prefer addAscii8 for strings you know are ASCII-only (identifiers, enum values, HTTP headers). It skips TextEncoder and does a direct byte copy. Use addAsciiN for fixed-width fields where the length is known in advance. Use addString8 as the general-purpose default.
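If you cannot guarantee that a value is ASCII-only, a cheap check before encoding avoids corrupting non-ASCII input. The sketch below is one possible guard; isAscii and pickShortStringMethod are hypothetical helpers, and the returned strings are just the method names from the table above. It assumes short strings that fit an 8-bit prefix.

```typescript
// Hypothetical helper: true when every UTF-16 code unit is in the ASCII range.
function isAscii(text: string): boolean {
  for (let i = 0; i < text.length; i++) {
    if (text.charCodeAt(i) > 0x7f) return false;
  }
  return true;
}

// Hypothetical helper: choose between the two short-string methods by content.
function pickShortStringMethod(text: string): "addAscii8" | "addString8" {
  return isAscii(text) ? "addAscii8" : "addString8";
}

const headerMethod = pickShortStringMethod("content-type"); // "addAscii8"
const nameMethod = pickShortStringMethod("caf\u00e9"); // "addString8"
```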

Byte Array Types

Byte arrays store raw binary data (Uint8Array) with a length prefix. The suffix indicates the length prefix size.

const sia = new Sia();
const payload = new Uint8Array([0x01, 0x02, 0x03, 0x04]);

sia.addByteArray8(payload); // 8-bit length prefix
sia.addByteArray16(payload); // 16-bit length prefix
sia.addByteArray32(payload); // 32-bit length prefix
sia.addByteArray64(payload); // 64-bit length prefix

sia.seek(0);
sia.readByteArray8(); // Uint8Array [1, 2, 3, 4]
sia.readByteArray16(); // Uint8Array [1, 2, 3, 4]
sia.readByteArray32(); // Uint8Array [1, 2, 3, 4]
sia.readByteArray64(); // Uint8Array [1, 2, 3, 4]

Raw Byte Arrays (No Length Prefix)

Use addByteArrayN when the length is known at read time (e.g., fixed-size fields like hashes or keys):

const hash = new Uint8Array(32); // SHA-256 hash is always 32 bytes
sia.addByteArrayN(hash);

// Reader knows the length
sia.seek(0);
const readHash = sia.readByteArrayN(32);

Zero-Copy Reads with asReference

By default, readByteArray* copies the bytes out of the buffer. Pass true to get a zero-copy subarray reference instead:

// Copy (safe, independent of source buffer)
const copy = sia.readByteArray32();

// Reference (zero-copy, shares memory with source)
const ref = sia.readByteArray32(true);

References share memory with the Sia buffer. If the buffer is reused or overwritten, the reference will see the changes. Only use references for data you will consume immediately.
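The difference between a copy and a reference can be reproduced with plain Uint8Array methods. In this sketch, slice stands in for a copying read and subarray for a zero-copy read; that Sia maps onto exactly these calls is an assumption, and demoCopyVsReference is a hypothetical name.

```typescript
// Hypothetical demo: mutate the source after taking a copy and a view of it.
function demoCopyVsReference(): { copy: number[]; ref: number[] } {
  const source = new Uint8Array([1, 2, 3, 4]);
  const copy = source.slice(1, 3); // new allocation, independent of source
  const ref = source.subarray(1, 3); // view onto the same memory as source
  source[1] = 99; // simulate the buffer being overwritten
  return { copy: Array.from(copy), ref: Array.from(ref) };
}

const copyVsRef = demoCopyVsReference();
// copyVsRef.copy is [2, 3]: unaffected by the later write
// copyVsRef.ref is [99, 3]: the view observes the overwrite
```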

Array Types

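Sia serializes arrays as an element count followed by the elements themselves. You supply a callback that writes (or reads) a single element, so an array can hold any type. The suffix indicates the bit width of the element count prefix.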

Simple Arrays

const sia = new Sia();
const numbers = [10, 20, 30, 40, 50];

// Serialize: provide an array and a callback to write each element
sia.addArray8(numbers, (s, n) => s.addUInt8(n));

sia.seek(0);

// Deserialize: provide a callback to read each element
const result = sia.readArray8((s) => s.readUInt8());
// [10, 20, 30, 40, 50]
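To show how a count prefix and a per-element callback fit together, here is a minimal stand-in built on a plain number array. ByteWriter, ByteReader, writeArray8, and readArray8Sketch are all hypothetical and are not Sia's internals; the sketch only mirrors the shape of the API used above.

```typescript
// Hypothetical minimal writer and reader, standing in for a real buffer.
interface ByteWriter { bytes: number[]; }
interface ByteReader { bytes: number[]; offset: number; }

// Write the element count as one byte, then let the callback write each element.
function writeArray8<T>(
  w: ByteWriter,
  items: T[],
  writeItem: (w: ByteWriter, item: T) => void,
): void {
  if (items.length > 255) {
    throw new RangeError("an 8-bit count holds at most 255 elements");
  }
  w.bytes.push(items.length);
  for (const item of items) writeItem(w, item);
}

// Read the count byte, then call the callback that many times.
function readArray8Sketch<T>(r: ByteReader, readItem: (r: ByteReader) => T): T[] {
  const count = r.bytes[r.offset++];
  const items: T[] = [];
  for (let i = 0; i < count; i++) items.push(readItem(r));
  return items;
}

const arrayWriter: ByteWriter = { bytes: [] };
writeArray8(arrayWriter, [10, 20, 30], (w, n) => { w.bytes.push(n & 0xff); });
// arrayWriter.bytes is [3, 10, 20, 30]: one count byte, then the elements

const arrayReader: ByteReader = { bytes: arrayWriter.bytes, offset: 0 };
const decodedNumbers = readArray8Sketch(arrayReader, (r) => r.bytes[r.offset++]);
```

The reader never needs a terminator or schema: the count byte tells it how many times to invoke the element callback.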

Complex Object Arrays

Array callbacks work with any data type, including nested objects:

interface User {
  name: string;
  age: number;
  active: boolean;
}

const users: User[] = [
  { name: "Alice", age: 30, active: true },
  { name: "Bob", age: 25, active: false },
];

const sia = new Sia();

// Serialize
sia.addArray8(users, (s, user) => {
  s.addString8(user.name).addUInt8(user.age).addBool(user.active);
});

// Deserialize
sia.seek(0);
const decoded = sia.readArray8((s) => ({
  name: s.readString8(),
  age: s.readUInt8(),
  active: s.readBool(),
}));
// [{ name: "Alice", age: 30, active: true }, { name: "Bob", age: 25, active: false }]

Array Size Variants

| Method | Count Prefix | Max Elements |
| --- | --- | --- |
| addArray8 / readArray8 | 8-bit | 255 |
| addArray16 / readArray16 | 16-bit | 65,535 |
| addArray32 / readArray32 | 32-bit | ~4.29 billion |
| addArray64 / readArray64 | 64-bit | ~2^53 |

Choosing the Right Type

Pick the smallest type that fits your data. Every extra byte in a length prefix or value is overhead that is repeated in every record you serialize.

Size vs. Overhead Trade-Off

| Data | Naive Choice | Optimal Choice | Savings per Record |
| --- | --- | --- | --- |
| Age (0–150) | UInt32 (4 bytes) | UInt8 (1 byte) | 3 bytes |
| Country code | String32 (4 + N bytes) | Ascii8 (1 + N bytes) | 3 bytes |
| Short name | String32 (4 + N bytes) | String8 (1 + N bytes) | 3 bytes |
| Enum (0–10) | UInt16 (2 bytes) | UInt8 (1 byte) | 1 byte |
| Timestamp (ms) | String32 (JSON-style) | UInt64 (8 bytes) | ~5 bytes |
| Small list (< 256) | Array32 (4-byte prefix) | Array8 (1-byte prefix) | 3 bytes |

These savings add up quickly. In a message containing 1,000 user records, saving 10 bytes per record saves 10 KB per message.

Best Practices

  1. Use the smallest integer type that fits your range. Don't use UInt32 for a value that never exceeds 255.
  2. Use addAscii8 for ASCII-only strings. It is faster than UTF-8 encoding and produces the same output for ASCII content.
  3. Use addByteArrayN for fixed-size data. When the length is known at read time (hashes, keys, UUIDs), skip the length prefix.
  4. Pass true to readByteArray* for temporary data. Zero-copy reads avoid allocations in hot paths.
  5. Use addArray8 for small collections. Most arrays in practice have fewer than 256 elements. Save 1–7 bytes on the count prefix.
  6. Store timestamps as UInt64. Millisecond timestamps fit comfortably in 8 bytes and are much smaller than ISO string representations.
  7. Use booleans as presence flags for optional fields. Prefix optional data with addBool so the reader knows whether to expect the field.
  8. Match add* and read* calls exactly. Every serialization must have a matching deserialization in the same order with the same types. There is no schema to catch mismatches at compile time.
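The last point is worth seeing in action: a mismatched read does not fail, it silently returns different values. This sketch uses the standard DataView API instead of Sia, and demoMismatchedRead is a hypothetical name.

```typescript
// Hypothetical demo: write one 16-bit value, then read it back two ways.
function demoMismatchedRead(): { matching: number; mismatched: number[] } {
  const view = new DataView(new ArrayBuffer(2));
  view.setUint16(0, 1000, true); // writer: one little-endian UInt16 (0xe8, 0x03)

  // Matching read: same type, same position.
  const matching = view.getUint16(0, true); // 1000

  // Mismatched read: two UInt8 values where one UInt16 was written.
  const mismatched = [view.getUint8(0), view.getUint8(1)]; // [232, 3]

  return { matching, mismatched };
}

const mismatchDemo = demoMismatchedRead();
```

Neither read throws, which is why the order and types on the two sides have to be kept in sync by hand or by tests.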

Real-World Example: User Profile Serialization

Here is a complete example combining multiple data types to serialize and deserialize a user profile:

import { Sia } from "@timeleap/sia";

interface UserProfile {
  id: number;
  username: string;
  email: string;
  age: number;
  isVerified: boolean;
  bio: string | null;
  permissions: number[];
  avatar: Uint8Array | null;
  createdAt: number;
}

function serializeProfile(profile: UserProfile): Uint8Array {
  const sia = new Sia();

  sia
    .addUInt32(profile.id) // 4 bytes: numeric ID
    .addAscii8(profile.username) // 1 + N bytes: ASCII username
    .addString8(profile.email) // 1 + N bytes: UTF-8 email
    .addUInt8(profile.age) // 1 byte: age 0–255
    .addBool(profile.isVerified) // 1 byte: boolean flag
    .addBool(profile.bio !== null); // 1 byte: presence flag

  if (profile.bio !== null) {
    sia.addString16(profile.bio); // 2 + N bytes: bio text (may be long)
  }

  sia.addArray8(profile.permissions, (s, perm) => s.addUInt8(perm));

  sia.addBool(profile.avatar !== null);
  if (profile.avatar !== null) {
    sia.addByteArray16(profile.avatar); // 2 + N bytes: avatar data
  }

  sia.addUInt64(profile.createdAt); // 8 bytes: timestamp in ms

  return sia.toUint8Array();
}

function deserializeProfile(data: Uint8Array): UserProfile {
  const sia = new Sia(data);

  const id = sia.readUInt32();
  const username = sia.readAscii8();
  const email = sia.readString8();
  const age = sia.readUInt8();
  const isVerified = sia.readBool();

  const hasBio = sia.readBool();
  const bio = hasBio ? sia.readString16() : null;

  const permissions = sia.readArray8((s) => s.readUInt8());

  const hasAvatar = sia.readBool();
  const avatar = hasAvatar ? sia.readByteArray16() : null;

  const createdAt = sia.readUInt64();

  return {
    id,
    username,
    email,
    age,
    isVerified,
    bio,
    permissions,
    avatar,
    createdAt,
  };
}

This layout costs 19 to 23 bytes for the fixed-size fields, presence flags, and length prefixes (depending on whether bio and avatar are present), plus the bytes of the variable-length fields themselves. The equivalent JSON representation can run to hundreds of bytes.
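The size of a serialized profile can be estimated by adding up the widths used in serializeProfile. The sketch below does that arithmetic without calling Sia, so the figure is an estimate derived from the layout above rather than a measurement; ProfileSketch and estimateEncodedSize are hypothetical names, and the sample values are made up for illustration.

```typescript
// Hypothetical mirror of the UserProfile shape used above.
interface ProfileSketch {
  id: number;
  username: string;
  email: string;
  age: number;
  isVerified: boolean;
  bio: string | null;
  permissions: number[];
  avatar: Uint8Array | null;
  createdAt: number;
}

// Hypothetical helper: sum the byte widths of each field in layout order.
function estimateEncodedSize(p: ProfileSketch): number {
  const utf8 = (s: string): number => new TextEncoder().encode(s).length;
  let size = 4; // id: UInt32
  size += 1 + utf8(p.username); // Ascii8: 1-byte prefix + content
  size += 1 + utf8(p.email); // String8: 1-byte prefix + content
  size += 1 + 1; // age (UInt8) + isVerified (Bool)
  size += 1 + (p.bio !== null ? 2 + utf8(p.bio) : 0); // flag + optional String16
  size += 1 + p.permissions.length; // Array8 count + one UInt8 per element
  size += 1 + (p.avatar !== null ? 2 + p.avatar.length : 0); // flag + optional ByteArray16
  size += 8; // createdAt: UInt64
  return size;
}

// Made-up sample data.
const sampleProfile: ProfileSketch = {
  id: 42,
  username: "alice",
  email: "alice@example.com",
  age: 30,
  isVerified: true,
  bio: null,
  permissions: [1, 2, 3],
  avatar: null,
  createdAt: 1700000000000,
};

const binarySize = estimateEncodedSize(sampleProfile); // 44
const jsonSize = new TextEncoder().encode(JSON.stringify(sampleProfile)).length;
```

For this sample, 19 of the 44 bytes are fixed fields, flags, and prefixes; the remaining 25 are the username, email, and permission values.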