Package org.apache.lucene.codecs
The Codec API allows you to customise the way the following pieces of index information are stored:
- Postings lists - see PostingsFormat
- DocValues - see DocValuesFormat
- Stored fields - see StoredFieldsFormat
- Term vectors - see TermVectorsFormat
- Points - see PointsFormat
- FieldInfos - see FieldInfosFormat
- SegmentInfo - see SegmentInfoFormat
- Norms - see NormsFormat
- Live documents - see LiveDocsFormat

Codecs are identified by name through the Java Service Provider Interface. To create your own
codec, extend Codec and pass the new codec's name to the super() constructor:

    public class MyCodec extends Codec {
      public MyCodec() {
        super("MyCodecName");
      }
      ...
    }

You will need to register the Codec class so that the ServiceLoader can find it, by including a
META-INF/services/org.apache.lucene.codecs.Codec file on your classpath that contains the
package-qualified name of your codec.
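
As an illustration of the registration step, the provider-configuration file might look like the fragment below. It assumes a hypothetical class com.example.MyCodec (the package name is not given in the text above). The file lists one fully qualified class name per line; ServiceLoader ignores blank lines and treats # as a comment character.

```
# File: META-INF/services/org.apache.lucene.codecs.Codec
# com.example.MyCodec is a hypothetical class name used for illustration.
com.example.MyCodec
```

With this file on the classpath, the codec should be resolvable by the name passed to super(), for example through Codec.forName("MyCodecName").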

If you just want to customise the PostingsFormat, or use different postings formats for different
fields, then you can register your custom postings format in the same way (in
META-INF/services/org.apache.lucene.codecs.PostingsFormat), and then extend the default codec and
override org.apache.lucene.codecs.luceneMN.LuceneMNCodec#getPostingsFormatForField(String) to
return your custom postings format.

Similarly, if you just want to customise the DocValuesFormat per-field, have a look at
LuceneMNCodec.getDocValuesFormatForField(String).
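
The per-field override pattern described above can be sketched with a small model that uses only the JDK. This is not Lucene code: NamedFormat, DefaultCodecModel, PerFieldCodecModel and PerFieldDemo are hypothetical stand-ins, invented here for a postings format, the default codec and a custom subclass. Only the method name getPostingsFormatForField(String) is taken from the text. The sketch shows the shape of the override: return a custom format for selected fields and defer to the default for all others.

```java
import java.util.Map;

// Hypothetical stand-in for a format that is identified by name
// (a model only, not Lucene's PostingsFormat).
final class NamedFormat {
  private final String name;

  NamedFormat(String name) {
    this.name = name;
  }

  String getName() {
    return name;
  }
}

// Hypothetical stand-in for the default codec: one overridable hook per field.
class DefaultCodecModel {
  private final NamedFormat defaultFormat = new NamedFormat("DefaultPostings");

  // Same shape as getPostingsFormatForField(String): by default every field
  // gets the same format.
  NamedFormat getPostingsFormatForField(String field) {
    return defaultFormat;
  }
}

// Subclass that routes selected fields to a custom format and falls back to
// the default for every other field.
final class PerFieldCodecModel extends DefaultCodecModel {
  private final Map<String, NamedFormat> overrides;

  PerFieldCodecModel(Map<String, NamedFormat> overrides) {
    this.overrides = Map.copyOf(overrides);
  }

  @Override
  NamedFormat getPostingsFormatForField(String field) {
    NamedFormat custom = overrides.get(field);
    return custom != null ? custom : super.getPostingsFormatForField(field);
  }
}

final class PerFieldDemo {
  public static void main(String[] args) {
    DefaultCodecModel codec =
        new PerFieldCodecModel(Map.of("id", new NamedFormat("MyPostingsFormat")));
    System.out.println(codec.getPostingsFormatForField("id").getName());   // MyPostingsFormat
    System.out.println(codec.getPostingsFormatForField("body").getName()); // DefaultPostings
  }
}
```

In a real codec the same override would live in a subclass of the default codec, and the returned format would be one registered through META-INF/services/org.apache.lucene.codecs.PostingsFormat as described above. The equivalent hook for doc values is getDocValuesFormatForField(String).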

Interface Summary

- HnswGraphProvider - An interface that provides an HNSW graph.

Class Summary

- BlockTermState - Holds all state required for PostingsReaderBase to produce a PostingsEnum without re-seeking the terms dict.
- BufferingKnnVectorsWriter - Buffers up pending vector value(s) per doc, then flushes when segment flushes.
- BufferingKnnVectorsWriter.BufferedByteVectorValues
- BufferingKnnVectorsWriter.BufferedFloatVectorValues
- BufferingKnnVectorsWriter.FieldWriter<T>
- BufferingKnnVectorsWriter.SortingByteVectorValues - Sorting FloatVectorValues that iterate over documents in the order of the provided sortMap
- BufferingKnnVectorsWriter.SortingFloatVectorValues - Sorting FloatVectorValues that iterate over documents in the order of the provided sortMap
- Codec - Encodes/decodes an inverted index segment.
- Codec.Holder - This static holder class prevents classloading deadlock by delaying init of default codecs and available codecs until needed.
- CodecUtil - Utility class for reading and writing versioned headers.
- CompetitiveImpactAccumulator - This class accumulates the (freq, norm) pairs that may produce competitive scores.
- CompoundDirectory - A read-only Directory that consists of a view over a compound file.
- CompoundFormat - Encodes/decodes compound files
- DocValuesConsumer - Abstract API that consumes numeric, binary and sorted docvalues.
- DocValuesConsumer.BinaryDocValuesSub - Tracks state of one binary sub-reader that we are merging
- DocValuesConsumer.BitsFilteredTermsEnum
- DocValuesConsumer.MergedTermsEnum - A merged TermsEnum.
- DocValuesConsumer.NumericDocValuesSub - Tracks state of one numeric sub-reader that we are merging
- DocValuesConsumer.SortedDocValuesSub - Tracks state of one sorted sub-reader that we are merging
- DocValuesConsumer.SortedNumericDocValuesSub - Tracks state of one sorted numeric sub-reader that we are merging
- DocValuesConsumer.SortedSetDocValuesSub - Tracks state of one sorted set sub-reader that we are merging
- DocValuesFormat - Encodes/decodes per-document values.
- DocValuesFormat.Holder - This static holder class prevents classloading deadlock by delaying init of doc values formats until needed.
- DocValuesProducer - Abstract API that produces numeric, binary, sorted, sortedset, and sortednumeric docvalues.
- FieldInfosFormat - Encodes/decodes FieldInfos
- FieldsConsumer - Abstract API that consumes terms, doc, freq, prox, offset and payloads postings.
- FieldsProducer - Abstract API that produces terms, doc, freq, prox, offset and payloads postings.
- FilterCodec - A codec that forwards all its method calls to another codec.
- FlatFieldVectorsWriter<T> - Vectors' writer for a field
- FlatVectorsFormat - Encodes/decodes per-document vectors
- FlatVectorsReader - Reads vectors from an index.
- FlatVectorsWriter - Vectors' writer for a field that allows additional indexing logic to be implemented by the caller
- KnnFieldVectorsWriter<T> - Vectors' writer for a field
- KnnVectorsFormat - Encodes/decodes per-document vector and any associated indexing structures required to support nearest-neighbor search
- KnnVectorsFormat.Holder - This static holder class prevents classloading deadlock by delaying init of doc values formats until needed.
- KnnVectorsReader - Reads vectors from an index.
- KnnVectorsWriter - Writes vectors to an index.
- KnnVectorsWriter.ByteVectorValuesSub
- KnnVectorsWriter.MergedVectorValues - View over multiple vector values supporting iterator-style access via DocIdMerger.
- KnnVectorsWriter.MergedVectorValues.MergedByteVectorValues
- KnnVectorsWriter.MergedVectorValues.MergedFloat32VectorValues
- KnnVectorsWriter.VectorValuesSub - Tracks state of one sub-reader that we are merging
- LiveDocsFormat - Format for live/deleted documents
- MultiLevelSkipListReader - This abstract class reads skip lists with multiple levels.
- MultiLevelSkipListWriter - This abstract class writes skip lists with multiple levels.
- MutablePointTree - One leaf PointValues.PointTree whose order of points can be changed.
- NormsConsumer - Abstract API that consumes normalization values.
- NormsConsumer.NumericDocValuesSub - Tracks state of one numeric sub-reader that we are merging
- NormsFormat - Encodes/decodes per-document score normalization values.
- NormsProducer - Abstract API that produces field normalization values
- PointsFormat - Encodes/decodes indexed points.
- PointsReader - Abstract API to visit point values.
- PointsWriter - Abstract API to write points
- PostingsFormat - Encodes/decodes terms, postings, and proximity data.
- PostingsFormat.Holder - This static holder class prevents classloading deadlock by delaying init of postings formats until needed.
- PostingsReaderBase - The core terms dictionaries (BlockTermsReader, BlockTreeTermsReader) interact with a single instance of this class to manage creation of PostingsEnum and PostingsEnum instances.
- PostingsWriterBase - Class that plugs into term dictionaries, such as Lucene90BlockTreeTermsWriter, and handles writing postings.
- PushPostingsWriterBase - Extension of PostingsWriterBase, adding a push API for writing each element of the postings.
- SegmentInfoFormat - Expert: Controls the format of the SegmentInfo (segment metadata file).
- StoredFieldsFormat - Controls the format of stored fields
- StoredFieldsReader - Codec API for reading stored fields.
- StoredFieldsWriter - Codec API for writing stored fields: For every document, StoredFieldsWriter.startDocument() is called, informing the Codec that a new document has started.
- StoredFieldsWriter.StoredFieldsMergeSub
- TermStats - Holder for per-term statistics.
- TermVectorsFormat - Controls the format of term vectors
- TermVectorsReader - Codec API for reading term vectors:
- TermVectorsWriter - Codec API for writing term vectors: For every document, TermVectorsWriter.startDocument(int) is called, informing the Codec how many fields will be written.
- TermVectorsWriter.TermVectorsMergeSub