
quantizeEmbedding

quantizeEmbedding(v: number[] | Float32Array<ArrayBufferLike>): QuantizedEmbedding

Defined in: src/lib/memoryEngine/quantization.ts:53 

Quantize a Float32 embedding (or number[]) into an Int8 vector + scale.

The scale is the maximum absolute value across the input. Every value is mapped linearly into [-127, 127] relative to that scale and rounded, so the largest-magnitude value lands on ±127. A zero vector yields an all-zero Int8Array and a scale of 0.
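The mapping described above can be sketched as follows. This is a minimal illustration, not the code in quantization.ts: the name `quantizeSketch`, the `{ data, scale }` shape of `QuantizedPairSketch`, and the use of `Math.round` for the rounding step are all assumptions inferred from the description.

```typescript
// Hypothetical shape; the page only mentions a `data` vector and a `scale`.
interface QuantizedPairSketch {
  data: Int8Array;
  scale: number;
}

// Sketch of max-abs Int8 quantization as described above (assumed, not the real implementation).
function quantizeSketch(v: number[] | Float32Array): QuantizedPairSketch {
  // The scale is the largest absolute value in the input.
  let scale = 0;
  for (let i = 0; i < v.length; i++) {
    const a = Math.abs(v[i]);
    if (a > scale) scale = a;
  }

  // A new Int8Array is zero-filled, which covers the zero-vector case.
  const data = new Int8Array(v.length);
  if (scale === 0) return { data, scale: 0 };

  // Map each value linearly into [-127, 127]; Math.round is an assumed rounding rule.
  for (let i = 0; i < v.length; i++) {
    data[i] = Math.round((v[i] / scale) * 127);
  }
  return { data, scale };
}

const q = quantizeSketch([1, -4, 2]);
// q.scale === 4, q.data holds [32, -127, 64]
```

Under these assumptions, the value with the largest magnitude always maps to exactly ±127, and the other entries keep their ratio to it up to rounding.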

Parameters

Parameter: v

Type: number[] | Float32Array<ArrayBufferLike>

Description: The embedding to quantize. Either a Float32Array (typical for on-device caches) or a plain number[] (typical for values fresh out of JSON.parse). Plain numbers are read directly without copying into a Float32Array first.

Returns

QuantizedEmbedding

The quantized data and its scale. The returned data has the same length as the input: data.length === v.length.
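Because the length is preserved, each Int8 entry corresponds to one input value, and an approximate float vector can be recovered from the pair. The sketch below assumes the result exposes `data` and `scale` fields and that the inverse of the mapping is `data[i] / 127 * scale`; `QuantizedResultSketch` and `dequantizeSketch` are hypothetical names, not part of the documented API.

```typescript
// Hypothetical result shape, inferred from the description on this page.
interface QuantizedResultSketch {
  data: Int8Array;
  scale: number;
}

// Assumed inverse of the linear mapping: undo the 127 factor, then reapply the scale.
function dequantizeSketch(q: QuantizedResultSketch): Float32Array {
  const out = new Float32Array(q.data.length);
  for (let i = 0; i < q.data.length; i++) {
    out[i] = (q.data[i] / 127) * q.scale;
  }
  return out;
}

// A hand-written quantized value, standing in for a real quantizeEmbedding result.
const approx = dequantizeSketch({ data: Int8Array.from([32, -127, 64]), scale: 4 });
console.log(approx.length); // 3
```

With a scale of 0 this returns all zeros, which matches the zero-vector case.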
