Class IntersectBlockReader

    • Field Detail

      • NUM_CONSECUTIVELY_REJECTED_TERMS_THRESHOLD

        protected final int NUM_CONSECUTIVELY_REJECTED_TERMS_THRESHOLD
        Threshold that controls when to attempt to jump to a block away.

        The counter numConsecutivelyRejectedTerms is 0 when entering a block. It is incremented each time the automaton rejects a term. When the counter is greater than or equal to this threshold, the next term accepted by the automaton is computed with IntersectBlockReader.AutomatonNextTermCalculator, and the reader jumps to a block away if that next accepted term is greater than the immediate next term in the block.

        A low value, for example 1, improves the performance of automatons requiring many jumps, for example FuzzyQuery and most WildcardQuery. A higher value improves the performance of automatons with few or no jumps, for example PrefixQuery. A threshold of 4 seems to be a good balance.

        See Also:
        Constant Field Values
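
        The counter-and-threshold behavior described above can be illustrated with a minimal, self-contained sketch. It is not the Lucene implementation: the class name, the scanBlock method, the string terms, and the accepts / nextAccepted callbacks are hypothetical stand-ins for the block lines, the automaton, and IntersectBlockReader.AutomatonNextTermCalculator. Resetting the counter on an accepted term is an assumption drawn from the word "consecutively".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

public class RejectedTermsThresholdSketch {

    // Assumed value, taken from the "threshold of 4" remark in the field description.
    static final int THRESHOLD = 4;

    /**
     * Scans the sorted terms of one block (hypothetical model).
     * Returns the accepted terms in order; if a jump away from the block is
     * decided, appends "jump:<target>" and stops scanning.
     */
    static List<String> scanBlock(List<String> blockTerms,
                                  Predicate<String> accepts,
                                  UnaryOperator<String> nextAccepted) {
        List<String> events = new ArrayList<>();
        int numConsecutivelyRejected = 0; // the counter is 0 when entering a block
        for (int i = 0; i < blockTerms.size(); i++) {
            String term = blockTerms.get(i);
            if (accepts.test(term)) {
                numConsecutivelyRejected = 0; // assumption: an accepted term resets the counter
                events.add(term);
                continue;
            }
            numConsecutivelyRejected++;
            boolean hasNextInBlock = i + 1 < blockTerms.size();
            if (numConsecutivelyRejected >= THRESHOLD && hasNextInBlock) {
                // Compute the next term the automaton would accept after this one.
                String target = nextAccepted.apply(term);
                if (target == null) {
                    return events; // no further accepted term: stop scanning
                }
                // Jump only if the target lies beyond the immediate next term in the block.
                if (target.compareTo(blockTerms.get(i + 1)) > 0) {
                    events.add("jump:" + target);
                    return events;
                }
            }
        }
        return events;
    }
}
```

        With a low threshold the sketch computes a jump target after very few rejections, which pays off when most terms are rejected; with a high threshold it keeps scanning linearly, which is cheaper when accepted terms are dense.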
      • automaton

        protected final Automaton automaton
      • finite

        protected final boolean finite
      • commonSuffix

        protected final BytesRef commonSuffix
      • minTermLength

        protected final int minTermLength
      • seekTerm

        protected BytesRef seekTerm
        Set while the reader is seeking to this term; reset to null afterwards.
      • numMatchedBytes

        protected int numMatchedBytes
        Number of bytes accepted by the automaton when validating the current term.
      • states

        protected int[] states
        Automaton states reached when validating the current term, stored at indices 0 to numMatchedBytes - 1.
      • numConsecutivelyRejectedTerms

        protected int numConsecutivelyRejectedTerms
        Counter of the number of consecutively rejected terms. Depending on NUM_CONSECUTIVELY_REJECTED_TERMS_THRESHOLD, this may trigger a jump to a block away.
    • Method Detail

      • getMinTermLength

        protected int getMinTermLength()
        Computes the minimal length of the terms accepted by the automaton. This speeds up the term scanning for automatons accepting a finite language.
      • next

        public BytesRef next()
                      throws java.io.IOException
        Description copied from interface: BytesRefIterator
        Increments the iteration to the next BytesRef in the iterator. Returns the resulting BytesRef or null if the end of the iterator is reached. The returned BytesRef may be re-used across calls to next. After this method returns null, do not call it again: the results are undefined.
        Specified by:
        next in interface BytesRefIterator
        Overrides:
        next in class BlockReader
        Returns:
        the next BytesRef in the iterator or null if the end of the iterator is reached.
        Throws:
        java.io.IOException - If there is a low-level I/O error.
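
        The iteration contract above (loop until null, never call next again afterwards, and copy any value that must outlive the next call) can be sketched without a Lucene dependency. The Ref and RefIterator types below are hypothetical, simplified stand-ins for BytesRef and BytesRefIterator, and the stand-in omits the IOException. With Lucene itself, BytesRef.deepCopyOf is the usual way to take such a copy.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class NextLoopSketch {

    /** Hypothetical, simplified stand-in for BytesRef: a reusable buffer plus a length. */
    static final class Ref {
        byte[] bytes = new byte[32]; // assumption: terms in this sketch fit in 32 bytes
        int length;
    }

    /** Hypothetical stand-in for BytesRefIterator (the checked exception is left out). */
    interface RefIterator {
        Ref next(); // returns null once the iteration is over
    }

    /** Builds an iterator that hands back the same Ref instance on every call. */
    static RefIterator over(String... terms) {
        Ref shared = new Ref();
        int[] position = {0};
        return () -> {
            if (position[0] >= terms.length) {
                return null;
            }
            byte[] src = terms[position[0]++].getBytes(StandardCharsets.UTF_8);
            System.arraycopy(src, 0, shared.bytes, 0, src.length);
            shared.length = src.length;
            return shared; // reused across calls, as the contract permits
        };
    }

    /** Drains the iterator, copying each term before asking for the next one. */
    static List<String> collect(RefIterator it) {
        List<String> copies = new ArrayList<>();
        Ref term;
        while ((term = it.next()) != null) { // stop at the first null and do not call again
            copies.add(new String(term.bytes, 0, term.length, StandardCharsets.UTF_8));
        }
        return copies;
    }
}
```

        Storing the returned Ref itself instead of a copy would leave every list entry pointing at the same buffer, which is why the copy is taken inside the loop.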
      • seekFirstBlock

        protected boolean seekFirstBlock()
                                  throws java.io.IOException
        Throws:
        java.io.IOException
      • nextTermInBlockMatching

        protected BytesRef nextTermInBlockMatching()
                                            throws java.io.IOException
        Finds the next line in the current block whose term is accepted by the automaton, or returns null when the end of the block is reached.
        Returns:
        The next term in the current block that is accepted by the automaton; or null if none.
        Throws:
        java.io.IOException
      • endsWithCommonSuffix

        protected boolean endsWithCommonSuffix​(byte[] termBytes,
                                               int termLength)
        Indicates whether the given term ends with the automaton's common suffix. This makes it possible to quickly skip terms that the automaton would eventually reject.
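
        A suffix check of this kind can be sketched as follows. This is a minimal illustration, not the Lucene code: the class name and the explicit suffix parameter are hypothetical (the documented method takes only termBytes and termLength, and presumably reads the suffix from the commonSuffix field).

```java
public class CommonSuffixSketch {

    /**
     * Returns true if the first termLength bytes of termBytes end with suffix.
     * The suffix parameter is a hypothetical stand-in for the commonSuffix field.
     */
    static boolean endsWithSuffix(byte[] termBytes, int termLength, byte[] suffix) {
        if (termLength < suffix.length) {
            return false; // the term is too short to contain the suffix
        }
        int offset = termLength - suffix.length;
        for (int i = 0; i < suffix.length; i++) {
            if (termBytes[offset + i] != suffix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

        Because the check only compares a few trailing bytes, it is cheap to run before stepping the automaton over the whole term.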
      • nextBlock

        protected boolean nextBlock()
                             throws java.io.IOException
        Opens the next block. Depending on the blockIteration order, it may be the very next block, or a block away that may contain seekTerm.
        Returns:
        true if the next block is opened; false if there are no more blocks and the iteration is over.
        Throws:
        java.io.IOException
      • seekExact

        public boolean seekExact​(BytesRef text)
        Description copied from class: TermsEnum
        Attempts to seek to the exact term, returning true if the term is found. If this returns false, the enum is unpositioned. For some codecs, seekExact may be substantially faster than TermsEnum.seekCeil(org.apache.lucene.util.BytesRef).
        Overrides:
        seekExact in class BlockReader
        Returns:
        true if the term is found; false otherwise, in which case the enum is unpositioned.
      • seekCeil

        public TermsEnum.SeekStatus seekCeil​(BytesRef text)
        Description copied from class: TermsEnum
        Seeks to the specified term, if it exists, or to the next (ceiling) term. Returns SeekStatus to indicate whether exact term was found, a different term was found, or EOF was hit. The target term may be before or after the current term. If this returns SeekStatus.END, the enum is unpositioned.
        Overrides:
        seekCeil in class BlockReader