Class ScalarQuantizer


  • public class ScalarQuantizer
    extends java.lang.Object
    Quantizes float vectors into `int8` byte values using scalar quantization. This is a lossy transformation. Scalar quantization first calculates the quantiles of the float vector values, using the configured confidence interval. The resulting [minQuantile, maxQuantile] range is then used to scale the values into the range [0, 127], and each scaled value is rounded to the nearest byte value.

    How Scalar Quantization Works

    The basic mathematical equations behind this are fairly straightforward and based on min/max normalization. Given a float vector `v` and a confidenceInterval `q`, we can calculate the quantiles of the vector values [minQuantile, maxQuantile]. Each value is then converted to a byte and back as follows:

       byte = (float - minQuantile) * 127/(maxQuantile - minQuantile)
       float = (maxQuantile - minQuantile)/127 * byte + minQuantile
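
    The two equations above can be sketched as a pair of helper methods. This is a minimal illustration, not this class's implementation: `QuantizeSketch` and its methods are hypothetical names, and clamping out-of-range values to the quantile bounds before scaling is an assumption made so the result always fits in [0, 127].

```java
final class QuantizeSketch {
    // Hypothetical helper: maps one float to a byte in [0, 127].
    static byte quantize(float value, float minQuantile, float maxQuantile) {
        // Assumption: values outside the quantiles are clamped to the range edges.
        float clamped = Math.max(minQuantile, Math.min(maxQuantile, value));
        float scale = 127f / (maxQuantile - minQuantile);
        // byte = (float - minQuantile) * 127 / (maxQuantile - minQuantile)
        return (byte) Math.round((clamped - minQuantile) * scale);
    }

    // Hypothetical helper: approximate inverse of quantize.
    static float dequantize(byte b, float minQuantile, float maxQuantile) {
        float alpha = (maxQuantile - minQuantile) / 127f;
        // float = (maxQuantile - minQuantile) / 127 * byte + minQuantile
        return alpha * b + minQuantile;
    }

    public static void main(String[] args) {
        float min = -1f;
        float max = 1f;
        float original = 0.3f;
        byte q = quantize(original, min, max);
        float restored = dequantize(q, min, max);
        // The restored value differs from the original by at most alpha / 2.
        System.out.println(original + " -> " + q + " -> " + restored);
    }
}
```

    For values inside [minQuantile, maxQuantile] the round trip is lossy by at most half a bucket width, which is why the transformation is described as lossy.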
     

    This means that the product of two float values (e.g. for dot_product) can be approximated from their byte values as follows:

       float1 * float2 ~= (byte1 * (maxQuantile - minQuantile)/127 + minQuantile) * (byte2 * (maxQuantile - minQuantile)/127 + minQuantile)
       float1 * float2 ~= (byte1 * byte2 * (maxQuantile - minQuantile)^2)/(127^2) + (byte1 * minQuantile * (maxQuantile - minQuantile)/127) + (byte2 * minQuantile * (maxQuantile - minQuantile)/127) + minQuantile^2
       let alpha = (maxQuantile - minQuantile)/127
       float1 * float2 ~= (byte1 * byte2 * alpha^2) + (byte1 * minQuantile * alpha) + (byte2 * minQuantile * alpha) + minQuantile^2
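
    Summing the last line over every dimension gives a dot product that needs only integer arithmetic in the inner loop. The sketch below checks that identity against a plain dequantize-then-multiply reference. `DotProductSketch` and its methods are hypothetical names, and folding the correction terms into byte sums is an illustrative choice rather than a description of how this class applies its corrective offset.

```java
final class DotProductSketch {
    // Dot product from byte values only: the per-component expansion summed over all dimensions.
    static float quantizedDot(byte[] a, byte[] b, float minQuantile, float alpha) {
        int rawDot = 0; // sum of byte1 * byte2
        int sumA = 0;   // sum of byte1
        int sumB = 0;   // sum of byte2
        for (int i = 0; i < a.length; i++) {
            rawDot += a[i] * b[i];
            sumA += a[i];
            sumB += b[i];
        }
        return alpha * alpha * rawDot
            + minQuantile * alpha * (sumA + sumB)
            + a.length * minQuantile * minQuantile;
    }

    // Reference: dequantize each component first, then multiply and sum.
    static float dequantizedDot(byte[] a, byte[] b, float minQuantile, float alpha) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += (alpha * a[i] + minQuantile) * (alpha * b[i] + minQuantile);
        }
        return sum;
    }

    public static void main(String[] args) {
        byte[] a = {10, 100, 127};
        byte[] b = {0, 64, 90};
        float minQuantile = -1f;
        float alpha = (1f - minQuantile) / 127f; // maxQuantile assumed to be 1
        System.out.println(quantizedDot(a, b, minQuantile, alpha));
        System.out.println(dequantizedDot(a, b, minQuantile, alpha));
    }
}
```

    Both methods return the same value up to floating-point rounding; the first keeps the per-dimension work to integer multiply-adds.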
     

    The expansion for square distance is much simpler:

      square_distance = (float1 - float2)^2
      (float1 - float2)^2 ~= (byte1 * alpha + minQuantile - byte2 * alpha - minQuantile)^2
      = (alpha*byte1 + minQuantile)^2 + (alpha*byte2 + minQuantile)^2 - 2*(alpha*byte1 + minQuantile)*(alpha*byte2 + minQuantile)
      this can be simplified to:
      = alpha^2 (byte1 - byte2)^2
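
    Because minQuantile cancels out, the square distance between two quantized vectors reduces to an integer sum scaled once by alpha^2. A minimal sketch of that simplification follows; `SquareDistanceSketch` is a hypothetical name, and the quantile range [-1, 1] in the example is an arbitrary assumption.

```java
final class SquareDistanceSketch {
    // Square distance from byte values: alpha^2 * sum((byte1 - byte2)^2).
    static float quantizedSquareDistance(byte[] a, byte[] b, float alpha) {
        int sum = 0;
        for (int i = 0; i < a.length; i++) {
            int diff = a[i] - b[i];
            sum += diff * diff;
        }
        return alpha * alpha * sum;
    }

    public static void main(String[] args) {
        // With quantiles [-1, 1], bytes 0 and 127 stand for -1 and 1.
        byte[] a = {0, 127};
        byte[] b = {127, 0};
        float alpha = 2f / 127f;
        System.out.println(quantizedSquareDistance(a, b, alpha));
    }
}
```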
     
    • Field Detail

      • SCALAR_QUANTIZATION_SAMPLE_SIZE

        public static final int SCALAR_QUANTIZATION_SAMPLE_SIZE
        See Also:
        Constant Field Values
      • alpha

        private final float alpha
      • scale

        private final float scale
      • minQuantile

        private final float minQuantile
      • maxQuantile

        private final float maxQuantile
      • confidenceInterval

        private final float confidenceInterval
      • random

        private static final java.util.Random random
    • Constructor Detail

      • ScalarQuantizer

        public ScalarQuantizer​(float minQuantile,
                               float maxQuantile,
                               float confidenceInterval)
        Parameters:
        minQuantile - the lower quantile of the distribution
        maxQuantile - the upper quantile of the distribution
        confidenceInterval - The configured confidence interval used to calculate the quantiles.
    • Method Detail

      • quantize

        public float quantize​(float[] src,
                              byte[] dest,
                              VectorSimilarityFunction similarityFunction)
        Quantize a float vector into a byte vector
        Parameters:
        src - the source vector
        dest - the destination vector
        similarityFunction - the similarity function used to calculate the quantile
        Returns:
        the corrective offset that needs to be applied to the score
      • recalculateCorrectiveOffset

        public float recalculateCorrectiveOffset​(byte[] quantizedVector,
                                                 ScalarQuantizer oldQuantizer,
                                                 VectorSimilarityFunction similarityFunction)
        Recalculates the score corrective offset of a vector that was quantized by an old quantizer, given the current quantiles
        Parameters:
        quantizedVector - the old vector
        oldQuantizer - the old quantizer
        similarityFunction - the similarity function used to calculate the quantile
        Returns:
        the new offset
      • deQuantize

        public void deQuantize​(byte[] src,
                               float[] dest)
        Dequantize a byte vector into a float vector
        Parameters:
        src - the source vector
        dest - the destination vector
      • getLowerQuantile

        public float getLowerQuantile()
      • getUpperQuantile

        public float getUpperQuantile()
      • getConfidenceInterval

        public float getConfidenceInterval()
      • getConstantMultiplier

        public float getConstantMultiplier()
      • toString

        public java.lang.String toString()
        Overrides:
        toString in class java.lang.Object
      • reservoirSampleIndices

        static int[] reservoirSampleIndices​(int numFloatVecs,
                                            int sampleSize)
      • sampleVectors

        static float[] sampleVectors​(FloatVectorValues floatVectorValues,
                                     int[] vectorsToTake)
                              throws java.io.IOException
        Throws:
        java.io.IOException
      • fromVectors

        public static ScalarQuantizer fromVectors​(FloatVectorValues floatVectorValues,
                                                  float confidenceInterval,
                                                  int totalVectorCount)
                                           throws java.io.IOException
        Reads the float vector values and calculates the quantiles. If there are fewer float vectors than SCALAR_QUANTIZATION_SAMPLE_SIZE, all of the values are read and used to calculate the quantiles. If there are more float vectors than SCALAR_QUANTIZATION_SAMPLE_SIZE, a random sample of SCALAR_QUANTIZATION_SAMPLE_SIZE vectors is read and used instead.
        Parameters:
        floatVectorValues - the float vector values from which to calculate the quantiles
        confidenceInterval - the confidence interval used to calculate the quantiles
        totalVectorCount - the total number of live float vectors in the index. This is vital for accounting for deleted documents when calculating the quantiles.
        Returns:
        A new ScalarQuantizer instance
        Throws:
        java.io.IOException - if there is an error reading the float vector values
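
        The random sample described above can be drawn in a single pass with reservoir sampling. The sketch below uses the classic Algorithm R as an illustration. It is an assumption that this matches what reservoirSampleIndices does internally; `ReservoirSketch` and `sampleIndices` are hypothetical names.

```java
import java.util.Random;

final class ReservoirSketch {
    // Algorithm R: picks min(total, sampleSize) distinct indices from [0, total) uniformly at random.
    static int[] sampleIndices(int total, int sampleSize, Random random) {
        int k = Math.min(total, sampleSize);
        int[] reservoir = new int[k];
        // Fill the reservoir with the first k indices.
        for (int i = 0; i < k; i++) {
            reservoir[i] = i;
        }
        // Each later index replaces a random slot with probability k / (i + 1).
        for (int i = k; i < total; i++) {
            int j = random.nextInt(i + 1);
            if (j < k) {
                reservoir[j] = i;
            }
        }
        return reservoir;
    }

    public static void main(String[] args) {
        int[] picked = sampleIndices(1000, 5, new Random());
        for (int index : picked) {
            System.out.println(index);
        }
    }
}
```

        When the total is no larger than the sample size, the method returns every index, which mirrors the "read all the values" case.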
      • fromVectors

        static ScalarQuantizer fromVectors​(FloatVectorValues floatVectorValues,
                                           float confidenceInterval,
                                           int totalVectorCount,
                                           int quantizationSampleSize)
                                    throws java.io.IOException
        Throws:
        java.io.IOException
      • getUpperAndLowerQuantile

        static float[] getUpperAndLowerQuantile​(float[] arr,
                                                float confidenceInterval)
        Takes an array of floats, sorted or not, and returns a minimum and maximum value. These values reside at the `(1 - confidenceInterval)/2` and `1 - (1 - confidenceInterval)/2` percentiles respectively. Example: providing floats `[0..100]` and a confidence interval of `0.9` (90%) will return `5` and `95`.
        Parameters:
        arr - array of floats
        confidenceInterval - the configured confidence interval
        Returns:
        lower and upper quantile values
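
        The documented example can be reproduced with a simple sort-based sketch. This is an illustration only: sorting a copy is the simplest way to find the two values and is not necessarily how this method selects them. `QuantileSketch` and the rounding of the cut index are assumptions.

```java
import java.util.Arrays;

final class QuantileSketch {
    // Returns {lower, upper}: the values that trim (1 - confidenceInterval) / 2 of the data from each end.
    static float[] upperAndLowerQuantile(float[] values, float confidenceInterval) {
        if (values.length == 0) {
            throw new IllegalArgumentException("values must not be empty");
        }
        float[] sorted = values.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        // Number of elements to skip at each end (rounding is an assumption).
        int cut = Math.round(sorted.length * (1f - confidenceInterval) / 2f);
        // Keep lower <= upper even for very small confidence intervals.
        cut = Math.min(cut, (sorted.length - 1) / 2);
        return new float[] {sorted[cut], sorted[sorted.length - 1 - cut]};
    }

    public static void main(String[] args) {
        float[] values = new float[101];
        for (int i = 0; i <= 100; i++) {
            values[i] = 100 - i; // deliberately unsorted: 100, 99, ..., 0
        }
        float[] quantiles = upperAndLowerQuantile(values, 0.9f);
        System.out.println(quantiles[0] + " " + quantiles[1]);
    }
}
```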