Module org.apache.lucene.core
Class Lucene99ScalarQuantizedVectorsWriter
- java.lang.Object
  - org.apache.lucene.codecs.FlatVectorsWriter
    - org.apache.lucene.codecs.lucene99.Lucene99ScalarQuantizedVectorsWriter
- All Implemented Interfaces:
java.io.Closeable, java.lang.AutoCloseable, Accountable
public final class Lucene99ScalarQuantizedVectorsWriter extends FlatVectorsWriter
Writes quantized vector values and metadata to index segments.
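As a rough illustration of what "quantized vector values" can mean, the sketch below maps float components onto small integers using a lower and an upper quantile (the same two quantities that appear as parameters of writeMeta). The class name, the helper, and the 0..127 target range are assumptions made for this example; this is not Lucene's ScalarQuantizer.

```java
final class ScalarQuantizationSketch {
    /**
     * Hypothetical helper, not part of Lucene: clamps each component to
     * [lower, upper] and maps it linearly onto the integers 0..127.
     */
    static byte[] quantize(float[] vector, float lower, float upper) {
        if (!(upper > lower)) {
            throw new IllegalArgumentException("upper must be greater than lower");
        }
        byte[] quantized = new byte[vector.length];
        float scale = 127f / (upper - lower); // assumed 7-bit target range
        for (int i = 0; i < vector.length; i++) {
            // Values outside the quantile range are clamped to its edges.
            float clamped = Math.max(lower, Math.min(upper, vector[i]));
            quantized[i] = (byte) Math.round((clamped - lower) * scale);
        }
        return quantized;
    }
}
```

Under these assumptions, components below the lower quantile map to 0 and components above the upper quantile map to 127, which is why the choice of quantiles affects which bucket each float lands in.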
-
-
Nested Class Summary
Nested Classes (Modifier and Type, Class, Description)
(package private) static class Lucene99ScalarQuantizedVectorsWriter.FieldWriter
(package private) static class Lucene99ScalarQuantizedVectorsWriter.FloatVectorWrapper
(package private) static class Lucene99ScalarQuantizedVectorsWriter.MergedQuantizedVectorValues
- Returns a merged view over all the segments' QuantizedByteVectorValues.
private static class Lucene99ScalarQuantizedVectorsWriter.OffsetCorrectedQuantizedByteVectorValues
private static class Lucene99ScalarQuantizedVectorsWriter.QuantizedByteVectorValueSub
private static class Lucene99ScalarQuantizedVectorsWriter.QuantizedFloatVectorValues
(package private) static class Lucene99ScalarQuantizedVectorsWriter.ScalarQuantizedCloseableRandomVectorScorerSupplier
-
Field Summary
Fields (Modifier and Type, Field, Description)
private java.lang.Float confidenceInterval
private java.util.List<Lucene99ScalarQuantizedVectorsWriter.FieldWriter> fields
private boolean finished
private IndexOutput meta
private static float QUANTILE_RECOMPUTE_LIMIT
private IndexOutput quantizedVectorData
private FlatVectorsWriter rawVectorDelegate
private static float REQUANTIZATION_LIMIT
private SegmentWriteState segmentWriteState
private static long SHALLOW_RAM_BYTES_USED
-
Fields inherited from interface org.apache.lucene.util.Accountable
NULL_ACCOUNTABLE
-
-
Constructor Summary
Constructors (Constructor, Description)
Lucene99ScalarQuantizedVectorsWriter(SegmentWriteState state, java.lang.Float confidenceInterval, FlatVectorsWriter rawVectorDelegate)
-
Method Summary
Methods (Modifier and Type, Method, Description)
FlatFieldVectorsWriter<?> addField(FieldInfo fieldInfo, KnnFieldVectorsWriter<?> indexWriter)
- Add a new field for indexing, allowing the user to provide a writer that the flat vectors writer can delegate to if additional indexing logic is required.
void close()
void finish()
- Called once at the end before close.
void flush(int maxDoc, Sorter.DocMap sortMap)
- Flush all buffered data to disk.
private static QuantizedVectorsReader getQuantizedKnnVectorsReader(KnnVectorsReader vectorsReader, java.lang.String fieldName)
private static ScalarQuantizer getQuantizedState(KnnVectorsReader vectorsReader, java.lang.String fieldName)
(package private) static ScalarQuantizer mergeAndRecalculateQuantiles(MergeState mergeState, FieldInfo fieldInfo, float confidenceInterval)
void mergeOneField(FieldInfo fieldInfo, MergeState mergeState)
- Write field for merging.
CloseableRandomVectorScorerSupplier mergeOneFieldToIndex(FieldInfo fieldInfo, MergeState mergeState)
- Write the field for merging, providing a scorer over the newly merged flat vectors.
private Lucene99ScalarQuantizedVectorsWriter.ScalarQuantizedCloseableRandomVectorScorerSupplier mergeOneFieldToIndex(SegmentWriteState segmentWriteState, FieldInfo fieldInfo, MergeState mergeState, ScalarQuantizer mergedQuantizationState)
(package private) static ScalarQuantizer mergeQuantiles(java.util.List<ScalarQuantizer> quantizationStates, java.util.List<java.lang.Integer> segmentSizes, float confidenceInterval)
private ScalarQuantizer mergeQuantiles(FieldInfo fieldInfo, MergeState mergeState)
long ramBytesUsed()
- Return the memory usage of this object in bytes.
(package private) static boolean shouldRecomputeQuantiles(ScalarQuantizer mergedQuantizationState, java.util.List<ScalarQuantizer> quantizationStates)
- Returns true if the quantiles of the merged state are too far from the quantiles of the individual states.
(package private) static boolean shouldRequantize(ScalarQuantizer existingQuantiles, ScalarQuantizer newQuantiles)
- Returns true if the quantiles of the new quantization state are too far from the quantiles of the existing quantization state.
private void writeField(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int maxDoc)
private void writeMeta(FieldInfo field, int maxDoc, long vectorDataOffset, long vectorDataLength, java.lang.Float confidenceInterval, java.lang.Float lowerQuantile, java.lang.Float upperQuantile, DocsWithFieldSet docsWithField)
private static DocsWithFieldSet writeQuantizedVectorData(IndexOutput output, QuantizedByteVectorValues quantizedByteVectorValues)
- Writes the vector values to the output and returns the set of documents that contain vectors.
private void writeQuantizedVectors(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData)
private void writeSortedQuantizedVectors(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int[] ordMap)
private void writeSortingField(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int maxDoc, Sorter.DocMap sortMap)
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.lucene.util.Accountable
getChildResources
-
-
-
-
Field Detail
-
SHALLOW_RAM_BYTES_USED
private static final long SHALLOW_RAM_BYTES_USED
-
QUANTILE_RECOMPUTE_LIMIT
private static final float QUANTILE_RECOMPUTE_LIMIT
- See Also:
- Constant Field Values
-
REQUANTIZATION_LIMIT
private static final float REQUANTIZATION_LIMIT
- See Also:
- Constant Field Values
-
segmentWriteState
private final SegmentWriteState segmentWriteState
-
fields
private final java.util.List<Lucene99ScalarQuantizedVectorsWriter.FieldWriter> fields
-
meta
private final IndexOutput meta
-
quantizedVectorData
private final IndexOutput quantizedVectorData
-
confidenceInterval
private final java.lang.Float confidenceInterval
-
rawVectorDelegate
private final FlatVectorsWriter rawVectorDelegate
-
finished
private boolean finished
-
-
Constructor Detail
-
Lucene99ScalarQuantizedVectorsWriter
Lucene99ScalarQuantizedVectorsWriter(SegmentWriteState state, java.lang.Float confidenceInterval, FlatVectorsWriter rawVectorDelegate) throws java.io.IOException
- Throws:
java.io.IOException
-
-
Method Detail
-
addField
public FlatFieldVectorsWriter<?> addField(FieldInfo fieldInfo, KnnFieldVectorsWriter<?> indexWriter) throws java.io.IOException
Description copied from class: FlatVectorsWriter
Add a new field for indexing, allowing the user to provide a writer that the flat vectors writer can delegate to if additional indexing logic is required.
- Specified by:
addField in class FlatVectorsWriter
- Parameters:
fieldInfo - fieldInfo of the field to add
indexWriter - the writer to delegate to, can be null
- Returns:
a writer for the field
- Throws:
java.io.IOException - if an I/O error occurs when adding the field
-
mergeOneField
public void mergeOneField(FieldInfo fieldInfo, MergeState mergeState) throws java.io.IOException
Description copied from class: FlatVectorsWriter
Write field for merging.
- Overrides:
mergeOneField in class FlatVectorsWriter
- Throws:
java.io.IOException
-
mergeOneFieldToIndex
public CloseableRandomVectorScorerSupplier mergeOneFieldToIndex(FieldInfo fieldInfo, MergeState mergeState) throws java.io.IOException
Description copied from class: FlatVectorsWriter
Write the field for merging, providing a scorer over the newly merged flat vectors. This way any additional merging logic can be implemented by the user of this class.
- Specified by:
mergeOneFieldToIndex in class FlatVectorsWriter
- Parameters:
fieldInfo - fieldInfo of the field to merge
mergeState - mergeState of the segments to merge
- Returns:
a scorer over the newly merged flat vectors, which should be closed as it holds a temporary file handle to read over the newly merged vectors
- Throws:
java.io.IOException - if an I/O error occurs when merging
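Because the returned scorer supplier holds a temporary file handle, callers will usually want to scope it with try-with-resources. The sketch below shows that pattern with a hypothetical stand-in type (TempFileBackedSupplier is invented for this example and is not a Lucene class).

```java
import java.io.Closeable;

final class ScorerSupplierLifecycleSketch {
    /** Hypothetical stand-in for a scorer supplier that holds a temporary file handle. */
    static final class TempFileBackedSupplier implements Closeable {
        boolean closed = false;

        int vectorCount() {
            if (closed) {
                throw new IllegalStateException("supplier already closed");
            }
            return 3; // placeholder value for the sketch
        }

        @Override
        public void close() {
            closed = true; // a real implementation would release its file handle here
        }
    }

    /** Scopes the supplier with try-with-resources so it is closed even if the body throws. */
    static boolean buildThenClose(TempFileBackedSupplier supplier) {
        try (TempFileBackedSupplier s = supplier) {
            s.vectorCount(); // stand-in for any additional merging logic over the merged vectors
        }
        return supplier.closed;
    }
}
```

Any additional merging logic then runs inside the try block, and the handle is released on every exit path.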
-
flush
public void flush(int maxDoc, Sorter.DocMap sortMap) throws java.io.IOException
Description copied from class: FlatVectorsWriter
Flush all buffered data to disk.
- Specified by:
flush in class FlatVectorsWriter
- Throws:
java.io.IOException
-
finish
public void finish() throws java.io.IOException
Description copied from class: FlatVectorsWriter
Called once at the end before close.
- Specified by:
finish in class FlatVectorsWriter
- Throws:
java.io.IOException
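The "called once" contract, together with the private finished field listed above, suggests a simple guard flag. The following is a minimal sketch of that idea only; how this class actually uses the field is an assumption here, and FinishOnceSketch is a hypothetical name.

```java
final class FinishOnceSketch {
    private boolean finished;   // plays the role a "finished" flag might play
    private int footersWritten; // stand-in for whatever finish() writes at the end

    /** Performs end-of-write work exactly once; a second call is rejected. */
    void finish() {
        if (finished) {
            throw new IllegalStateException("already finished");
        }
        finished = true;
        footersWritten++;
    }

    int footersWritten() {
        return footersWritten;
    }
}
```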
-
ramBytesUsed
public long ramBytesUsed()
Description copied from interface: Accountable
Return the memory usage of this object in bytes. Negative values are illegal.
-
writeField
private void writeField(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int maxDoc) throws java.io.IOException
- Throws:
java.io.IOException
-
writeMeta
private void writeMeta(FieldInfo field, int maxDoc, long vectorDataOffset, long vectorDataLength, java.lang.Float confidenceInterval, java.lang.Float lowerQuantile, java.lang.Float upperQuantile, DocsWithFieldSet docsWithField) throws java.io.IOException
- Throws:
java.io.IOException
-
writeQuantizedVectors
private void writeQuantizedVectors(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData) throws java.io.IOException
- Throws:
java.io.IOException
-
writeSortingField
private void writeSortingField(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int maxDoc, Sorter.DocMap sortMap) throws java.io.IOException
- Throws:
java.io.IOException
-
writeSortedQuantizedVectors
private void writeSortedQuantizedVectors(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int[] ordMap) throws java.io.IOException
- Throws:
java.io.IOException
-
mergeQuantiles
private ScalarQuantizer mergeQuantiles(FieldInfo fieldInfo, MergeState mergeState) throws java.io.IOException
- Throws:
java.io.IOException
-
mergeOneFieldToIndex
private Lucene99ScalarQuantizedVectorsWriter.ScalarQuantizedCloseableRandomVectorScorerSupplier mergeOneFieldToIndex(SegmentWriteState segmentWriteState, FieldInfo fieldInfo, MergeState mergeState, ScalarQuantizer mergedQuantizationState) throws java.io.IOException
- Throws:
java.io.IOException
-
mergeQuantiles
static ScalarQuantizer mergeQuantiles(java.util.List<ScalarQuantizer> quantizationStates, java.util.List<java.lang.Integer> segmentSizes, float confidenceInterval)
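This method is undocumented, but its signature (per-segment quantization states plus per-segment sizes) suggests combining quantiles with size-based weights. The sketch below shows one plausible reading, a size-weighted average of the lower and upper quantiles. It is an assumption rather than the documented behavior, it ignores confidenceInterval, and it represents each state as a hypothetical float[]{lower, upper} pair.

```java
import java.util.List;

final class MergeQuantilesSketch {
    /**
     * Hypothetical helper: each state is float[]{lower, upper}.
     * Returns the size-weighted average as float[]{lower, upper}.
     */
    static float[] weightedMerge(List<float[]> states, List<Integer> segmentSizes) {
        if (states.size() != segmentSizes.size()) {
            throw new IllegalArgumentException("one size is required per state");
        }
        float lower = 0f;
        float upper = 0f;
        int total = 0;
        for (int i = 0; i < states.size(); i++) {
            int size = segmentSizes.get(i);
            lower += states.get(i)[0] * size; // larger segments contribute more
            upper += states.get(i)[1] * size;
            total += size;
        }
        if (total == 0) {
            throw new IllegalArgumentException("segments contain no vectors");
        }
        return new float[] {lower / total, upper / total};
    }
}
```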
-
shouldRecomputeQuantiles
static boolean shouldRecomputeQuantiles(ScalarQuantizer mergedQuantizationState, java.util.List<ScalarQuantizer> quantizationStates)
Returns true if the quantiles of the merged state are too far from the quantiles of the individual states.
- Parameters:
mergedQuantizationState - The merged quantization state
quantizationStates - The quantization states of the individual segments
- Returns:
true if the quantiles should be recomputed
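A "too far" check of this kind can be sketched as a comparison of each segment's quantiles against the merged ones, with a limit derived from the merged range. The divisor parameter below is a hypothetical stand-in for whatever role QUANTILE_RECOMPUTE_LIMIT plays; its value and exact use are not given on this page, and the float[]{lower, upper} representation is also an assumption.

```java
import java.util.List;

final class RecomputeCheckSketch {
    /**
     * Hypothetical helper: merged and each per-segment state are float[]{lower, upper}.
     * Returns true if any segment's lower or upper quantile differs from the merged
     * one by more than (mergedUpper - mergedLower) / divisor.
     */
    static boolean tooFar(float[] merged, List<float[]> perSegment, float divisor) {
        float limit = (merged[1] - merged[0]) / divisor;
        for (float[] state : perSegment) {
            if (Math.abs(state[0] - merged[0]) > limit
                || Math.abs(state[1] - merged[1]) > limit) {
                return true;
            }
        }
        return false;
    }
}
```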
-
getQuantizedKnnVectorsReader
private static QuantizedVectorsReader getQuantizedKnnVectorsReader(KnnVectorsReader vectorsReader, java.lang.String fieldName)
-
getQuantizedState
private static ScalarQuantizer getQuantizedState(KnnVectorsReader vectorsReader, java.lang.String fieldName)
-
mergeAndRecalculateQuantiles
static ScalarQuantizer mergeAndRecalculateQuantiles(MergeState mergeState, FieldInfo fieldInfo, float confidenceInterval) throws java.io.IOException
- Throws:
java.io.IOException
-
shouldRequantize
static boolean shouldRequantize(ScalarQuantizer existingQuantiles, ScalarQuantizer newQuantiles)
Returns true if the quantiles of the new quantization state are too far from the quantiles of the existing quantization state. This would imply that floating point values would slightly shift quantization buckets.
- Parameters:
existingQuantiles - The existing quantiles for a segment
newQuantiles - The new quantiles for a segment, could be merged, or fully re-calculated
- Returns:
true if the floating point values should be requantized
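The bucket-shift idea can be illustrated by expressing the tolerance as a fraction of one quantization bucket's width. In this sketch the bucket count (128) and the fractionOfBucket parameter are assumptions chosen for illustration; the page does not state how REQUANTIZATION_LIMIT is applied.

```java
final class RequantizeCheckSketch {
    static final int BUCKETS = 128; // assumed number of quantization buckets

    /**
     * Hypothetical helper: returns true if either quantile moved by more than
     * fractionOfBucket times the width of one bucket under the new quantiles.
     */
    static boolean quantilesMovedTooFar(
            float oldLower, float oldUpper,
            float newLower, float newUpper,
            float fractionOfBucket) {
        float bucketWidth = (newUpper - newLower) / BUCKETS;
        float tolerance = fractionOfBucket * bucketWidth;
        return Math.abs(oldLower - newLower) > tolerance
            || Math.abs(oldUpper - newUpper) > tolerance;
    }
}
```

With a tolerance well below one bucket width, small drifts in the quantiles leave the existing bytes usable, while larger drifts signal that the original floats would land in different buckets.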
-
writeQuantizedVectorData
private static DocsWithFieldSet writeQuantizedVectorData(IndexOutput output, QuantizedByteVectorValues quantizedByteVectorValues) throws java.io.IOException
Writes the vector values to the output and returns the set of documents that contain vectors.
- Throws:
java.io.IOException
-
close
public void close() throws java.io.IOException
- Throws:
java.io.IOException
-
-