Data Types Guide
Overview
Sia supports integers, strings, byte arrays, booleans, BigInts, and typed arrays. Each type has symmetric add* and read* methods, and the number suffix indicates the bit width used for the value itself (integers) or the length prefix (strings, byte arrays, arrays).
This guide covers every supported type, when to use it, and how to combine them for real-world serialization.
Integer Types
All multi-byte integers are stored in little-endian byte order.
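To make the byte order concrete, here is a minimal sketch that lays out a 32-bit value in little-endian order. It uses the standard DataView API rather than Sia itself, and the helper name `uint32LittleEndian` is made up for this illustration.

```typescript
// Illustration only: standard DataView, not Sia's internals.
function uint32LittleEndian(value: number): number[] {
  const view = new DataView(new ArrayBuffer(4));
  view.setUint32(0, value, true); // `true` selects little-endian byte order
  return Array.from(new Uint8Array(view.buffer));
}

const leBytes = uint32LittleEndian(0x12345678);
console.log(leBytes.map((b) => b.toString(16))); // [ '78', '56', '34', '12' ]
```

The least significant byte (0x78) comes first, which is what little-endian means for a multi-byte integer.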
Unsigned Integers
Unsigned integers store non-negative whole numbers. The suffix indicates the storage size in bits.
import { Sia } from "@timeleap/sia";
const sia = new Sia();
sia.addUInt8(255); // 1 byte — range: 0 to 255
sia.addUInt16(65535); // 2 bytes — range: 0 to 65,535
sia.addUInt32(4294967295); // 4 bytes — range: 0 to 4,294,967,295
sia.addUInt64(Date.now()); // 8 bytes — range: 0 to 2^53 - 1*
sia.seek(0);
const a = sia.readUInt8(); // 255
const b = sia.readUInt16(); // 65535
const c = sia.readUInt32(); // 4294967295
const d = sia.readUInt64(); // timestamp
* 64-bit integer values are passed as a JavaScript number (a 64-bit float), so values are accurate only up to Number.MAX_SAFE_INTEGER (2^53 - 1). For larger values, use addBigInt.
Signed Integers
Signed integers store values that can be negative. They use two's complement encoding.
const sia = new Sia();
sia.addInt8(-128); // 1 byte — range: -128 to 127
sia.addInt16(-32768); // 2 bytes — range: -32,768 to 32,767
sia.addInt32(-2147483648); // 4 bytes — range: -2^31 to 2^31 - 1
sia.addInt64(-1000000000); // 8 bytes — range: -(2^53 - 1) to 2^53 - 1*
sia.seek(0);
const a = sia.readInt8(); // -128
const b = sia.readInt16(); // -32768
const c = sia.readInt32(); // -2147483648
const d = sia.readInt64(); // -1000000000
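The two's complement encoding mentioned above can be inspected with a short sketch. It uses the standard DataView API rather than Sia, and `int8AsRawByte` is a hypothetical helper name. It writes a signed 8-bit value and reads the same byte back as unsigned to reveal the stored bit pattern.

```typescript
// Illustration only: standard DataView, not Sia's internals.
function int8AsRawByte(value: number): number {
  const view = new DataView(new ArrayBuffer(1));
  view.setInt8(0, value);  // store as a signed 8-bit integer
  return view.getUint8(0); // reinterpret the same byte as unsigned
}

console.log(int8AsRawByte(-128)); // 128 (0x80)
console.log(int8AsRawByte(-1));   // 255 (0xff)
console.log(int8AsRawByte(127));  // 127 (0x7f)
```

This is also why a value written with a signed method should be read with the matching signed method: the bytes are identical, only the interpretation differs.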
When to Use Each Integer Size
| Type | Bytes | Range | Use Case |
|---|---|---|---|
| UInt8 / Int8 | 1 | 0–255 / -128–127 | Flags, enum values, small counters |
| UInt16 / Int16 | 2 | 0–65,535 / -32,768–32,767 | Ports, medium counters, character codes |
| UInt32 / Int32 | 4 | 0–4.29B / -2.15B–2.15B | IDs, timestamps (seconds), counts |
| UInt64 / Int64 | 8 | 0–(2^53 - 1) / -(2^53 - 1)–(2^53 - 1) | Timestamps (ms), large counters, offsets |
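The 2^53 - 1 limit on the 64-bit types comes from JavaScript's number type, not from the 8 bytes on the wire. The following sketch uses only standard JavaScript; `fitsInNumber` is a hypothetical helper you might use to decide between a 64-bit integer method and addBigInt.

```typescript
const maxSafe = Number.MAX_SAFE_INTEGER; // 9007199254740991, i.e. 2^53 - 1

// Hypothetical guard: true when the value can be represented exactly as a number.
function fitsInNumber(value: number): boolean {
  return Number.isSafeInteger(value);
}

console.log(fitsInNumber(Date.now()));    // true: millisecond timestamps are far below the limit
console.log(fitsInNumber(maxSafe));       // true
console.log(fitsInNumber(maxSafe + 1));   // false
console.log(maxSafe + 1 === maxSafe + 2); // true: distinct integers collapse to the same float
```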
BigInt Type
For arbitrary-precision integers that exceed Number.MAX_SAFE_INTEGER, use addBigInt and readBigInt.
const sia = new Sia();
const largeValue = 123456789012345678901234567890n;
sia.addBigInt(largeValue);
sia.seek(0);
const result = sia.readBigInt(); // 123456789012345678901234567890n
BigInt values are stored as a byte array with an 8-bit length prefix (the hex representation is converted to bytes).
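As a rough illustration of "hex representation converted to bytes with an 8-bit length prefix", the sketch below encodes and decodes a non-negative BigInt by hand. The function names are hypothetical, and the big-endian byte order and lack of sign handling are assumptions of this sketch, not a description of Sia's actual wire layout.

```typescript
// Sketch only: non-negative values, big-endian byte order, 1-byte length prefix.
// Byte order and sign handling are assumptions, not Sia's documented layout.
function bigIntToPrefixedBytes(value: bigint): Uint8Array {
  if (value < 0n) {
    throw new RangeError("this sketch handles non-negative values only");
  }
  let hex = value.toString(16);
  if (hex.length % 2 === 1) {
    hex = "0" + hex; // pad to a whole number of bytes
  }
  const byteLength = hex.length / 2;
  if (byteLength > 255) {
    throw new RangeError("value needs more bytes than an 8-bit prefix can describe");
  }
  const out = new Uint8Array(1 + byteLength);
  out[0] = byteLength; // 8-bit length prefix
  for (let i = 0; i < byteLength; i++) {
    out[1 + i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
  }
  return out;
}

function prefixedBytesToBigInt(bytes: Uint8Array): bigint {
  const byteLength = bytes[0];
  let hex = "";
  for (let i = 0; i < byteLength; i++) {
    const b = bytes[1 + i];
    hex += (b < 16 ? "0" : "") + b.toString(16);
  }
  return BigInt("0x" + hex);
}

const big = 123456789012345678901234567890n;
const bigEncoded = bigIntToPrefixedBytes(big);
console.log(bigEncoded[0]);                             // 13: payload length in bytes
console.log(prefixedBytesToBigInt(bigEncoded) === big); // true
```

Under this scheme an 8-bit prefix caps the payload at 255 bytes, which is still far beyond what a 64-bit integer can hold.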
Boolean Type
Booleans are stored as a single byte: 1 for true, 0 for false.
const sia = new Sia();
sia.addBool(true);
sia.addBool(false);
sia.seek(0);
sia.readBool(); // true
sia.readBool(); // false
Booleans are commonly used as flags in serialized structures:
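// Serialize a record with an optional email field
const email: string | null = "alice@example.com"; // null when the field is absent
const hasEmail = email !== null;
sia.addBool(hasEmail); // presence flag, read first by the decoder
if (hasEmail) {
  sia.addString8(email);
}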
String Types
Sia offers multiple string encoding methods, each suited to different data profiles.
ASCII
The fastest encoding option. Supports only ASCII characters (code points 0–127). The string is prefixed with an 8-bit length.
const sia = new Sia();
sia.addAscii8("hello");
sia.seek(0);
sia.readAscii8(); // "hello"
Max length: 255 characters.
UTFZ (Compressed UTF-8)
A compressed UTF-8 encoding via the utfz-lib library. Produces smaller output than standard UTF-8 for many strings, especially short multilingual text. The encoded data is prefixed with an 8-bit length.
const sia = new Sia();
sia.addUtfz("Hallo Welt");
sia.seek(0);
sia.readUtfz(); // "Hallo Welt"
Max encoded length: 255 bytes.
String8 / String16 / String32 / String64
Standard UTF-8 encoding with different length prefix sizes. The suffix indicates the bit width of the length prefix, which determines the maximum string size.
const sia = new Sia();
sia.addString8("short"); // 8-bit prefix → up to 255 bytes
sia.addString16("medium length"); // 16-bit prefix → up to 65,535 bytes
sia.addString32("long text..."); // 32-bit prefix → up to ~4 GB
sia.addString64("huge..."); // 64-bit prefix → up to 2^53 bytes
sia.seek(0);
sia.readString8(); // "short"
sia.readString16(); // "medium length"
sia.readString32(); // "long text..."
sia.readString64(); // "huge..."
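Note that the limits above are counted in encoded bytes, not characters. The sketch below shows the idea with the standard TextEncoder and a hand-written 1-byte prefix; `encodeWithPrefix8` is a hypothetical helper, not Sia's implementation, and how Sia itself reacts to an oversized string is not covered here.

```typescript
// Illustration only: TextEncoder plus a manual 8-bit length prefix.
function encodeWithPrefix8(text: string): Uint8Array {
  const utf8 = new TextEncoder().encode(text);
  if (utf8.length > 255) {
    throw new RangeError(
      "encoded length " + utf8.length + " exceeds the 8-bit prefix limit of 255",
    );
  }
  const out = new Uint8Array(1 + utf8.length);
  out[0] = utf8.length; // the prefix counts bytes, not characters
  out.set(utf8, 1);
  return out;
}

console.log(encodeWithPrefix8("hello").length);      // 6: 1 prefix byte + 5 UTF-8 bytes
console.log(encodeWithPrefix8("h\u00e9llo").length); // 7: "é" takes 2 bytes in UTF-8
```

A string of 200 accented characters would therefore not fit an 8-bit prefix even though it has fewer than 255 characters.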
When to Use Each String Type
| Method | Prefix | Max Size | Best For |
|---|---|---|---|
| addAsciiN | None | Fixed | Fixed-width codes (currency, country) |
| addAscii8 | 8-bit | 255 chars | Identifiers, keys, enum-like values (ASCII only) |
| addAscii16 | 16-bit | 65,535 chars | Longer ASCII strings |
| addUtfz | 8-bit | 255 bytes | Short multilingual strings, labels |
| addString8 | 8-bit | 255 bytes | Typical short strings (names, titles) |
| addString16 | 16-bit | 64 KB | Descriptions, message bodies |
| addString32 | 32-bit | ~4 GB | Large text content, documents |
| addString64 | 64-bit | ~2^53 bytes | Extremely large text (rare) |
Use addAscii8 for strings you know are ASCII-only (identifiers, enum values, HTTP headers). It skips TextEncoder and does a direct byte copy. Use addAsciiN for fixed-width fields where the length is known in advance. Use addString8 as the general-purpose default.
Byte Array Types
Byte arrays store raw binary data (Uint8Array) with a length prefix. The suffix indicates the length prefix size.
const sia = new Sia();
const payload = new Uint8Array([0x01, 0x02, 0x03, 0x04]);
sia.addByteArray8(payload); // 8-bit length prefix
sia.addByteArray16(payload); // 16-bit length prefix
sia.addByteArray32(payload); // 32-bit length prefix
sia.addByteArray64(payload); // 64-bit length prefix
sia.seek(0);
sia.readByteArray8(); // Uint8Array [1, 2, 3, 4]
sia.readByteArray16(); // Uint8Array [1, 2, 3, 4]
sia.readByteArray32(); // Uint8Array [1, 2, 3, 4]
sia.readByteArray64(); // Uint8Array [1, 2, 3, 4]
Raw Byte Arrays (No Length Prefix)
Use addByteArrayN when the length is known at read time (e.g., fixed-size fields like hashes or keys):
const sia = new Sia();
const hash = new Uint8Array(32); // SHA-256 hash is always 32 bytes
sia.addByteArrayN(hash); // writes the 32 bytes with no length prefix
// The reader must supply the length itself
sia.seek(0);
const readHash = sia.readByteArrayN(32);
Zero-Copy Reads with asReference
By default, readByteArray* copies the bytes out of the buffer. Pass true to get a zero-copy subarray reference instead:
// Copy (safe, independent of source buffer)
const copy = sia.readByteArray32();
// Reference (zero-copy, shares memory with source)
const ref = sia.readByteArray32(true);
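The difference between a copy and a reference can be shown with plain typed arrays. This sketch uses only the standard Uint8Array methods `slice` (copy) and `subarray` (view) as stand-ins for the two read modes; it does not call Sia.

```typescript
const source = new Uint8Array([1, 2, 3, 4, 5, 6]);

const copied = source.slice(2, 4);        // new buffer with its own memory
const referenced = source.subarray(2, 4); // view over the same memory as `source`

source[2] = 99; // the underlying buffer changes after the "read"

console.log(copied[0]);     // 3: the copy is unaffected
console.log(referenced[0]); // 99: the reference sees the change
```

A reference is therefore only safe while the source buffer is neither reused nor mutated, which is why it suits short-lived data in hot paths.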
Array Types
Sia supports typed arrays with element serialization callbacks. The suffix indicates the bit width of the element count prefix.
Simple Arrays
const sia = new Sia();
const numbers = [10, 20, 30, 40, 50];
// Serialize: provide an array and a callback to write each element
sia.addArray8(numbers, (s, n) => s.addUInt8(n));
sia.seek(0);
// Deserialize: provide a callback to read each element
const result = sia.readArray8((s) => s.readUInt8());
// [10, 20, 30, 40, 50]
Complex Object Arrays
Array callbacks work with any data type, including nested objects:
interface User {
  name: string;
  age: number;
  active: boolean;
}
const users: User[] = [
  { name: "Alice", age: 30, active: true },
  { name: "Bob", age: 25, active: false },
];
const sia = new Sia();
// Serialize
sia.addArray8(users, (s, user) => {
  s.addString8(user.name).addUInt8(user.age).addBool(user.active);
});
// Deserialize
sia.seek(0);
const decoded = sia.readArray8((s) => ({
  name: s.readString8(),
  age: s.readUInt8(),
  active: s.readBool(),
}));
// [{ name: "Alice", age: 30, active: true }, { name: "Bob", age: 25, active: false }]
Array Size Variants
| Method | Count Prefix | Max Elements |
|---|---|---|
| addArray8 / readArray8 | 8-bit | 255 |
| addArray16 / readArray16 | 16-bit | 65,535 |
| addArray32 / readArray32 | 32-bit | ~4.29 billion |
| addArray64 / readArray64 | 64-bit | ~2^53 |
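One way to read the table above is as a lookup from the largest element count a field can ever hold to the narrowest prefix that fits. The helper below is a hypothetical design-time aid written for this guide, not part of Sia.

```typescript
type PrefixBits = 8 | 16 | 32 | 64;

// Hypothetical helper: narrowest count-prefix width for a given maximum element count.
function smallestPrefixBits(maxCount: number): PrefixBits {
  if (!Number.isSafeInteger(maxCount) || maxCount < 0) {
    throw new RangeError("maxCount must be a non-negative safe integer");
  }
  if (maxCount <= 0xff) return 8;
  if (maxCount <= 0xffff) return 16;
  if (maxCount <= 0xffffffff) return 32;
  return 64;
}

console.log(smallestPrefixBits(5));      // 8
console.log(smallestPrefixBits(256));    // 16
console.log(smallestPrefixBits(70000));  // 32
```

Apply this to the maximum a field may hold, not to each individual array: the reader has to call the same variant as the writer, so the width is a fixed part of the format.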
Choosing the Right Type
When serializing data, pick the smallest type that fits your data. Every extra byte in the length prefix or value is overhead multiplied by every record you serialize.
Size vs. Overhead Trade-Off
| Data | Naive Choice | Optimal Choice | Savings per Record |
|---|---|---|---|
| Age (0–150) | UInt32 (4 bytes) | UInt8 (1 byte) | 3 bytes |
| Country code | String32 (4 + N bytes) | addAscii8 (1 + N bytes) | 3 bytes |
| Short name | String32 (4 + N bytes) | String8 (1 + N bytes) | 3 bytes |
| Enum (0–10) | UInt16 (2 bytes) | UInt8 (1 byte) | 1 byte |
| Timestamp (ms) | String32 (JSON-style) | UInt64 (8 bytes) | ~5 bytes |
| Small list (< 256) | Array32 (4-byte prefix) | Array8 (1-byte prefix) | 3 bytes |
These savings add up quickly. In a message containing 1,000 user records, saving 10 bytes per record saves 10 KB per message.
Best Practices
- Use the smallest integer type that fits your range. Don't use UInt32 for a value that never exceeds 255.
- Use addAscii8 for ASCII-only strings. It is faster than UTF-8 encoding and produces the same output for ASCII content.
- Use addByteArrayN for fixed-size data. When the length is known at read time (hashes, keys, UUIDs), skip the length prefix.
- Pass true to readByteArray* for temporary data. Zero-copy reads avoid allocations in hot paths.
- Use addArray8 for small collections. Most arrays in practice have fewer than 256 elements, so this saves 1–7 bytes on the count prefix.
- Store timestamps as UInt64. Millisecond timestamps fit comfortably in 8 bytes and are much smaller than ISO string representations.
- Use booleans as presence flags for optional fields. Prefix optional data with addBool so the reader knows whether to expect the field.
- Match add* and read* calls exactly. Every serialization must have a matching deserialization in the same order with the same types. There is no schema to catch mismatches at compile time.
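To see why mismatched calls are dangerous, consider what happens when a 16-bit value is read back as two 8-bit values. The sketch uses the standard DataView API rather than Sia, but the same principle applies to any schemaless binary format: the read succeeds and silently returns the wrong data.

```typescript
// Illustration only: standard DataView, not Sia.
const mismatchView = new DataView(new ArrayBuffer(2));
mismatchView.setUint16(0, 1000, true); // writer: one little-endian 16-bit integer

// Matching reader: same type, same position
const matched = mismatchView.getUint16(0, true);

// Mismatched reader: two 8-bit reads over the same bytes
const firstByte = mismatchView.getUint8(0);
const secondByte = mismatchView.getUint8(1);

console.log(matched);               // 1000
console.log(firstByte, secondByte); // 232 3 (0xe8, 0x03): no error, just wrong values
```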
Real-World Example: User Profile Serialization
Here is a complete example combining multiple data types to serialize and deserialize a user profile:
import { Sia } from "@timeleap/sia";
interface UserProfile {
  id: number;
  username: string;
  email: string;
  age: number;
  isVerified: boolean;
  bio: string | null;
  permissions: number[];
  avatar: Uint8Array | null;
  createdAt: number;
}
function serializeProfile(profile: UserProfile): Uint8Array {
  const sia = new Sia();
  sia
    .addUInt32(profile.id) // 4 bytes: numeric ID
    .addAscii8(profile.username) // 1 + N bytes: ASCII username
    .addString8(profile.email) // 1 + N bytes: UTF-8 email
    .addUInt8(profile.age) // 1 byte: age 0–255
    .addBool(profile.isVerified) // 1 byte: boolean flag
    .addBool(profile.bio !== null); // 1 byte: presence flag
  if (profile.bio !== null) {
    sia.addString16(profile.bio); // 2 + N bytes: bio text (may be long)
  }
  sia.addArray8(profile.permissions, (s, perm) => s.addUInt8(perm));
  sia.addBool(profile.avatar !== null);
  if (profile.avatar !== null) {
    sia.addByteArray16(profile.avatar); // 2 + N bytes: avatar data
  }
  sia.addUInt64(profile.createdAt); // 8 bytes: timestamp in ms
  return sia.toUint8Array();
}
function deserializeProfile(data: Uint8Array): UserProfile {
  const sia = new Sia(data);
  const id = sia.readUInt32();
  const username = sia.readAscii8();
  const email = sia.readString8();
  const age = sia.readUInt8();
  const isVerified = sia.readBool();
  const hasBio = sia.readBool();
  const bio = hasBio ? sia.readString16() : null;
  const permissions = sia.readArray8((s) => s.readUInt8());
  const hasAvatar = sia.readBool();
  const avatar = hasAvatar ? sia.readByteArray16() : null;
  const createdAt = sia.readUInt64();
  return {
    id,
    username,
    email,
    age,
    isVerified,
    bio,
    permissions,
    avatar,
    createdAt,
  };
}
This serializes a user profile into roughly 20–30 bytes of overhead plus the variable-length fields, compared to hundreds of bytes for the equivalent JSON representation.
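The overhead figure can be checked by adding up the byte counts annotated in the serializer above. The sketch below does that arithmetic with a hypothetical helper; it counts fixed-width fields, length prefixes, and presence flags, and excludes the variable-length payloads (string bytes, permission elements, avatar bytes).

```typescript
interface ProfileShape {
  hasBio: boolean;
  hasAvatar: boolean;
}

// Hypothetical helper: bytes written besides the variable-length payloads.
function fixedOverheadBytes(shape: ProfileShape): number {
  let total = 0;
  total += 4; // id: UInt32
  total += 1; // username: 8-bit length prefix
  total += 1; // email: 8-bit length prefix
  total += 1; // age: UInt8
  total += 1; // isVerified: Bool
  total += 1; // bio presence flag
  if (shape.hasBio) total += 2; // bio: 16-bit length prefix
  total += 1; // permissions: 8-bit count prefix
  total += 1; // avatar presence flag
  if (shape.hasAvatar) total += 2; // avatar: 16-bit length prefix
  total += 8; // createdAt: UInt64
  return total;
}

console.log(fixedOverheadBytes({ hasBio: false, hasAvatar: false })); // 19
console.log(fixedOverheadBytes({ hasBio: true, hasAvatar: true }));   // 23
```

Each permission adds one more byte on top of this, since the example writes permissions as UInt8 values.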