Class StringMatcher

  • All Implemented Interfaces:
    UnicodeMatcher, UnicodeReplacer

    class StringMatcher
    extends java.lang.Object
    implements UnicodeMatcher, UnicodeReplacer
    An object that matches a fixed input string, implementing the UnicodeMatcher API. This object also implements the UnicodeReplacer API, allowing it to emit the matched text as output. Since the match text may contain flexible match elements, such as UnicodeSets, the emitted text is not the match pattern, but instead a substring of the actual matched text. Following convention, the output text is the leftmost match seen up to this point. A StringMatcher may represent a segment, in which case it has a positive segment number. This affects how the matcher converts itself to a pattern but does not otherwise affect its function. A StringMatcher that is not a segment should not be used as a UnicodeReplacer.
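The idea that the emitted text is the actually matched substring, not the pattern, can be illustrated with a small stand-alone sketch. It does not use the ICU classes: `SegmentSketch`, `matchAt`, and `emit` are hypothetical names, and each flexible match element is modelled as a `java.util.function.IntPredicate` over UTF-16 code units.

```java
import java.util.List;
import java.util.function.IntPredicate;

/** Hypothetical stand-in for a segment matcher; this is not the ICU class. */
final class SegmentSketch {
    private final List<IntPredicate> elements; // one predicate per pattern position
    private int matchStart = -1;               // -1 means "no match recorded"
    private int matchLimit = -1;

    SegmentSketch(List<IntPredicate> elements) {
        this.elements = elements;
    }

    /** Tries to match at start; records the matched range on success. */
    boolean matchAt(CharSequence text, int start) {
        if (start < 0 || start + elements.size() > text.length()) {
            return false;
        }
        for (int i = 0; i < elements.size(); i++) {
            if (!elements.get(i).test(text.charAt(start + i))) {
                return false;
            }
        }
        matchStart = start;
        matchLimit = start + elements.size();
        return true;
    }

    /** Emits the text that was actually matched, or "" if nothing is recorded. */
    String emit(CharSequence text) {
        return matchStart < 0 ? "" : text.subSequence(matchStart, matchLimit).toString();
    }

    /** Forgets any recorded match. */
    void resetMatch() {
        matchStart = -1;
        matchLimit = -1;
    }

    public static void main(String[] args) {
        // Pattern: any vowel followed by the literal 'x'.
        IntPredicate vowel = c -> "aeiou".indexOf(c) >= 0;
        IntPredicate literalX = c -> c == 'x';
        SegmentSketch seg = new SegmentSketch(List.of(vowel, literalX));

        String text = "box";
        seg.resetMatch();
        if (seg.matchAt(text, 1)) {
            // Prints the matched substring of the text, not the pattern itself.
            System.out.println(seg.emit(text));
        }
    }
}
```

Because the first pattern element accepts any vowel, the pattern alone cannot say which character was matched. Only the recorded start and limit offsets can.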
    • Constructor Summary

      Constructors 
      Constructor Description
      StringMatcher(java.lang.String theString, int start, int limit, int segmentNum, RuleBasedTransliterator.Data theData)
      Construct a matcher that matches a substring of the given pattern string.
      StringMatcher(java.lang.String theString, int segmentNum, RuleBasedTransliterator.Data theData)
      Construct a matcher that matches the given pattern string.
    • Method Summary

      Modifier and Type Method Description
      void addMatchSetTo(UnicodeSet toUnionTo)
      Implementation of UnicodeMatcher API.
      void addReplacementSetTo(UnicodeSet toUnionTo)
      Union the set of all characters that may be output by this object into the given set.
      int matches(Replaceable text, int[] offset, int limit, boolean incremental)
      Implementation of UnicodeMatcher API.
      boolean matchesIndexValue(int v)
      Implementation of UnicodeMatcher API.
      int replace(Replaceable text, int start, int limit, int[] cursor)
      Implementation of UnicodeReplacer API.
      void resetMatch()
      Remove any match data.
      java.lang.String toPattern(boolean escapeUnprintable)
      Implementation of UnicodeMatcher API.
      java.lang.String toReplacerPattern(boolean escapeUnprintable)
      Implementation of UnicodeReplacer API.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • pattern

        private java.lang.String pattern
        The text to be matched.
      • matchStart

        private int matchStart
        Start offset, in the match text, of the rightmost match.
      • matchLimit

        private int matchLimit
        Limit offset, in the match text, of the rightmost match.
      • segmentNumber

        private int segmentNumber
        The segment number, 1-based, or 0 if not a segment.
    • Constructor Detail

      • StringMatcher

        public StringMatcher(java.lang.String theString,
                             int segmentNum,
                             RuleBasedTransliterator.Data theData)
        Construct a matcher that matches the given pattern string.
        Parameters:
        theString - the pattern to be matched, possibly containing stand-ins that represent nested UnicodeMatcher objects.
        segmentNum - the segment number from 1..n, or 0 if this is not a segment.
        theData - context object mapping stand-ins to UnicodeMatcher objects.
      • StringMatcher

        public StringMatcher(java.lang.String theString,
                             int start,
                             int limit,
                             int segmentNum,
                             RuleBasedTransliterator.Data theData)
        Construct a matcher that matches a substring of the given pattern string.
        Parameters:
        theString - the pattern to be matched, possibly containing stand-ins that represent nested UnicodeMatcher objects.
        start - first character of theString to be matched
        limit - index after the last character of theString to be matched.
        segmentNum - the segment number from 1..n, or 0 if this is not a segment.
        theData - context object mapping stand-ins to UnicodeMatcher objects.
    • Method Detail

      • matches

        public int matches(Replaceable text,
                           int[] offset,
                           int limit,
                           boolean incremental)
        Implementation of UnicodeMatcher API.
        Specified by:
        matches in interface UnicodeMatcher
        Parameters:
        text - the text to be matched
        offset - on input, the index into text at which to begin matching. On output, the limit of the matched text. The number of matched characters is the output value of offset minus the input value. Offset should always point to the HIGH SURROGATE (leading code unit) of a pair of surrogates, both on entry and upon return.
        limit - the limit index of text to be matched. Greater than offset for a forward direction match, less than offset for a backward direction match. The last character to be considered for matching will be text.charAt(limit-1) in the forward direction or text.charAt(limit+1) in the backward direction.
        incremental - if true, then assume further characters may be inserted at limit and check for partial matching. Otherwise assume the text as given is complete.
        Returns:
        a match degree value indicating a full match, a partial match, or a mismatch. If incremental is false then U_PARTIAL_MATCH should never be returned.
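The offset, limit, and incremental contract described above can be sketched for a plain literal pattern. This is a simplified illustration under stated assumptions, not the ICU implementation: `MatchSketch` and its `Degree` enum are hypothetical, the text is a `CharSequence` rather than a `Replaceable`, nested matchers are not handled, and running out of text is treated as a partial match in both directions.

```java
/** Hypothetical literal matcher illustrating the offset/limit/incremental contract. */
final class MatchSketch {
    enum Degree { MISMATCH, PARTIAL_MATCH, MATCH }

    /**
     * Matches pattern against text, starting at offset[0].
     * A limit greater than offset[0] means forward; a smaller limit means backward.
     * offset[0] is updated only when the whole pattern matches.
     */
    static Degree matches(String pattern, CharSequence text, int[] offset,
                          int limit, boolean incremental) {
        int cursor = offset[0];
        if (limit < cursor) {
            // Backward: compare the pattern right to left. text.charAt(limit + 1)
            // is the last character that may be examined.
            for (int i = pattern.length() - 1; i >= 0; i--) {
                if (cursor <= limit) {
                    return incremental ? Degree.PARTIAL_MATCH : Degree.MISMATCH;
                }
                if (text.charAt(cursor) != pattern.charAt(i)) {
                    return Degree.MISMATCH;
                }
                cursor--;
            }
        } else {
            // Forward: text.charAt(limit - 1) is the last character that may be examined.
            for (int i = 0; i < pattern.length(); i++) {
                if (cursor >= limit) {
                    return incremental ? Degree.PARTIAL_MATCH : Degree.MISMATCH;
                }
                if (text.charAt(cursor) != pattern.charAt(i)) {
                    return Degree.MISMATCH;
                }
                cursor++;
            }
        }
        offset[0] = cursor; // output value: the limit of the matched text
        return Degree.MATCH;
    }

    public static void main(String[] args) {
        int[] offset = {1};
        Degree degree = matches("abc", "xabcx", offset, 5, false);
        // The number of matched code units is the output offset minus the input offset.
        System.out.println(degree + ", matched " + (offset[0] - 1) + " code units");
    }
}
```

Passing `offset` as a one-element array is what lets the method report both a match degree (the return value) and the new position (the updated array element).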
      • toPattern

        public java.lang.String toPattern(boolean escapeUnprintable)
        Implementation of UnicodeMatcher API.
        Specified by:
        toPattern in interface UnicodeMatcher
        Parameters:
        escapeUnprintable - if true, convert unprintable characters to their hex escape representations, \uxxxx or \Uxxxxxxxx. Unprintable characters are those other than U+000A and U+0020..U+007E.
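The escaping rule for escapeUnprintable can be sketched as a stand-alone helper. `EscapeSketch` is a hypothetical name, and two details are assumptions rather than documented behavior: uppercase hex digits, and the four-digit form for BMP code points with the eight-digit form for supplementary ones.

```java
/** Hypothetical helper illustrating the escapeUnprintable rule; not the ICU code. */
final class EscapeSketch {
    /** True for every code point other than U+000A and U+0020..U+007E. */
    static boolean isUnprintable(int codePoint) {
        return !(codePoint == 0x0A || (codePoint >= 0x20 && codePoint <= 0x7E));
    }

    /** Returns s, with unprintable code points hex-escaped when requested. */
    static String escape(String s, boolean escapeUnprintable) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            if (escapeUnprintable && isUnprintable(cp)) {
                if (cp <= 0xFFFF) {
                    // BMP code point: backslash, lowercase u, four hex digits.
                    out.append(String.format("\\u%04X", cp));
                } else {
                    // Supplementary code point: backslash, uppercase U, eight hex digits.
                    out.append(String.format("\\U%08X", cp));
                }
            } else {
                out.appendCodePoint(cp);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A tab is unprintable under this rule; a line feed is not.
        System.out.println(escape("a\tb", true));
    }
}
```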
      • addMatchSetTo

        public void addMatchSetTo(UnicodeSet toUnionTo)
        Implementation of UnicodeMatcher API. Union the set of all characters that may be matched by this object into the given set.
        Specified by:
        addMatchSetTo in interface UnicodeMatcher
        Parameters:
        toUnionTo - the set into which to union the source characters
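A minimal sketch of the union operation, assuming a pattern made of literal characters plus stand-in characters that map to nested character sets. `MatchSetSketch` is hypothetical, `java.util.Set<Integer>` stands in for UnicodeSet, and the private-use character U+E000 is an arbitrary stand-in chosen for illustration.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical illustration of unioning a matcher's characters into a set. */
final class MatchSetSketch {
    /**
     * Adds every character the pattern could match to toUnionTo.
     * A pattern character found in standIns contributes its nested set;
     * any other character is a literal and contributes itself.
     */
    static void addMatchSetTo(String pattern,
                              Map<Character, Set<Integer>> standIns,
                              Set<Integer> toUnionTo) {
        for (int i = 0; i < pattern.length(); i++) {
            char ch = pattern.charAt(i);
            Set<Integer> nested = standIns.get(ch);
            if (nested == null) {
                toUnionTo.add((int) ch);
            } else {
                toUnionTo.addAll(nested);
            }
        }
    }

    public static void main(String[] args) {
        char standIn = (char) 0xE000; // arbitrary stand-in, an assumption of this sketch
        Map<Character, Set<Integer>> standIns =
                Map.of(standIn, Set.of((int) 'x', (int) 'y'));

        Set<Integer> result = new TreeSet<>();
        result.add((int) 'z'); // existing contents are kept: this is a union
        addMatchSetTo("a" + standIn, standIns, result);
        System.out.println(result);
    }
}
```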
      • replace

        public int replace(Replaceable text,
                           int start,
                           int limit,
                           int[] cursor)
        Implementation of UnicodeReplacer API.
        Specified by:
        replace in interface UnicodeReplacer
        Parameters:
        text - the text in which the replacement is performed
        start - inclusive start index of text to be replaced
        limit - exclusive end index of text to be replaced; must be greater than or equal to start
        cursor - output parameter for the cursor position. Not all replacer objects will update this, but in a complete tree of replacer objects, representing the entire output side of a transliteration rule, at least one must update it.
        Returns:
        the number of 16-bit code units in the text replacing the characters at offsets start..(limit-1) in text
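The return value counts UTF-16 code units, not code points, so a supplementary character in the output contributes 2. The sketch below illustrates this with a `StringBuilder` standing in for Replaceable. `ReplaceSketch` is hypothetical, and the cursor parameter is omitted.

```java
/** Hypothetical illustration of the replace return value; not the ICU code. */
final class ReplaceSketch {
    /**
     * Replaces text[start, limit) with output and returns the length of the
     * replacement in 16-bit code units.
     */
    static int replace(StringBuilder text, int start, int limit, String output) {
        if (limit < start) {
            throw new IllegalArgumentException("limit must be >= start");
        }
        text.replace(start, limit, output);
        return output.length(); // String.length() counts UTF-16 code units
    }

    public static void main(String[] args) {
        StringBuilder text = new StringBuilder("hello world");
        int written = replace(text, 0, 5, "bye");
        System.out.println(text + " (" + written + " code units written)");
    }
}
```

When start equals limit, nothing is removed and the output is inserted at that position.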
      • toReplacerPattern

        public java.lang.String toReplacerPattern(boolean escapeUnprintable)
        Implementation of UnicodeReplacer API.
        Specified by:
        toReplacerPattern in interface UnicodeReplacer
        Parameters:
        escapeUnprintable - if true, convert unprintable characters to their hex escape representations, \uxxxx or \Uxxxxxxxx. Unprintable characters are defined by Utility.isUnprintable().
      • resetMatch

        public void resetMatch()
        Remove any match data. This must be called before performing a set of matches with this segment.
      • addReplacementSetTo

        public void addReplacementSetTo(UnicodeSet toUnionTo)
        Union the set of all characters that may be output by this object into the given set.
        Specified by:
        addReplacementSetTo in interface UnicodeReplacer
        Parameters:
        toUnionTo - the set into which to union the output characters