Package com.ibm.icu.text
Class StringMatcher
- java.lang.Object
-
- com.ibm.icu.text.StringMatcher
-
- All Implemented Interfaces:
UnicodeMatcher, UnicodeReplacer
class StringMatcher extends java.lang.Object implements UnicodeMatcher, UnicodeReplacer
An object that matches a fixed input string, implementing the UnicodeMatcher API. This object also implements the UnicodeReplacer API, allowing it to emit the matched text as output. Since the match text may contain flexible match elements, such as UnicodeSets, the emitted text is not the match pattern, but instead a substring of the actual matched text. Following convention, the output text is the leftmost match seen up to this point. A StringMatcher may represent a segment, in which case it has a positive segment number. This affects how the matcher converts itself to a pattern but does not otherwise affect its function. A StringMatcher that is not a segment should not be used as a UnicodeReplacer.
-
-
Field Summary
Fields
Modifier and Type | Field | Description
private RuleBasedTransliterator.Data | data | Context object that maps stand-ins to matcher and replacer objects.
private int | matchLimit | Limit offset, in the match text, of the rightmost match.
private int | matchStart | Start offset, in the match text, of the rightmost match.
private java.lang.String | pattern | The text to be matched.
private int | segmentNumber | The segment number, 1-based, or 0 if not a segment.
-
Fields inherited from interface com.ibm.icu.text.UnicodeMatcher
ETHER, U_MATCH, U_MISMATCH, U_PARTIAL_MATCH
-
-
Constructor Summary
Constructors
Constructor | Description
StringMatcher(java.lang.String theString, int start, int limit, int segmentNum, RuleBasedTransliterator.Data theData) | Construct a matcher that matches a substring of the given pattern string.
StringMatcher(java.lang.String theString, int segmentNum, RuleBasedTransliterator.Data theData) | Construct a matcher that matches the given pattern string.
-
Method Summary
Modifier and Type | Method | Description
void | addMatchSetTo(UnicodeSet toUnionTo) | Implementation of UnicodeMatcher API.
void | addReplacementSetTo(UnicodeSet toUnionTo) | Union the set of all characters that may be output by this object into the given set.
int | matches(Replaceable text, int[] offset, int limit, boolean incremental) | Implement UnicodeMatcher
boolean | matchesIndexValue(int v) | Implement UnicodeMatcher
int | replace(Replaceable text, int start, int limit, int[] cursor) | UnicodeReplacer API
void | resetMatch() | Remove any match data.
java.lang.String | toPattern(boolean escapeUnprintable) | Implement UnicodeMatcher
java.lang.String | toReplacerPattern(boolean escapeUnprintable) | UnicodeReplacer API
-
-
-
Field Detail
-
pattern
private java.lang.String pattern
The text to be matched.
-
matchStart
private int matchStart
Start offset, in the match text, of the rightmost match.
-
matchLimit
private int matchLimit
Limit offset, in the match text, of the rightmost match.
-
segmentNumber
private int segmentNumber
The segment number, 1-based, or 0 if not a segment.
-
data
private final RuleBasedTransliterator.Data data
Context object that maps stand-ins to matcher and replacer objects.
-
-
Constructor Detail
-
StringMatcher
public StringMatcher(java.lang.String theString, int segmentNum, RuleBasedTransliterator.Data theData)
Construct a matcher that matches the given pattern string.
- Parameters:
theString - the pattern to be matched, possibly containing stand-ins that represent nested UnicodeMatcher objects.
segmentNum - the segment number from 1..n, or 0 if this is not a segment.
theData - context object mapping stand-ins to UnicodeMatcher objects.
-
StringMatcher
public StringMatcher(java.lang.String theString, int start, int limit, int segmentNum, RuleBasedTransliterator.Data theData)
Construct a matcher that matches a substring of the given pattern string.
- Parameters:
theString - the pattern to be matched, possibly containing stand-ins that represent nested UnicodeMatcher objects.
start - index of the first character of theString to be matched.
limit - index after the last character of theString to be matched.
segmentNum - the segment number from 1..n, or 0 if this is not a segment.
theData - context object mapping stand-ins to UnicodeMatcher objects.
-
-
Method Detail
-
matches
public int matches(Replaceable text, int[] offset, int limit, boolean incremental)
Implement UnicodeMatcher
- Specified by:
matches in interface UnicodeMatcher
- Parameters:
text - the text to be matched
offset - on input, the index into text at which to begin matching. On output, the limit of the matched text. The number of matched characters is the output value of offset minus the input value. Offset should always point to the HIGH SURROGATE (leading code unit) of a pair of surrogates, both on entry and upon return.
limit - the limit index of text to be matched. Greater than offset for a forward direction match, less than offset for a backward direction match. The last character to be considered for matching will be text.charAt(limit-1) in the forward direction or text.charAt(limit+1) in the backward direction.
incremental - if true, then assume further characters may be inserted at limit and check for partial matching. Otherwise assume the text as given is complete.
- Returns:
a match degree value indicating a full match, a partial match, or a mismatch. If incremental is false then U_PARTIAL_MATCH should never be returned.
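The offset and return-value contract described above can be illustrated with a minimal, self-contained sketch. It is not the ICU implementation: the class LiteralMatchSketch, its method matchesForward, and its three constants are hypothetical names that only mirror U_MISMATCH, U_PARTIAL_MATCH and U_MATCH. It assumes a forward match of a plain literal pattern (no stand-ins, no segments) against a CharSequence instead of a Replaceable, and it compares 16-bit code units without any surrogate handling.

```java
// Hypothetical sketch of the matches() contract for a forward, literal-only match.
final class LiteralMatchSketch {
    // Local stand-ins for the match degree constants; names are assumptions of this sketch.
    static final int MISMATCH = 0;
    static final int PARTIAL_MATCH = 1;
    static final int MATCH = 2;

    /**
     * Tries to match pattern against text starting at offset[0] and stopping before limit.
     * offset[0] is advanced to the end of the matched text only when the whole pattern matches.
     */
    static int matchesForward(String pattern, CharSequence text,
                              int[] offset, int limit, boolean incremental) {
        int cursor = offset[0];
        for (int i = 0; i < pattern.length(); i++) {
            if (cursor >= limit) {
                // The text ended before the pattern did. With incremental matching,
                // more text may still arrive at limit, so report a partial match.
                return incremental ? PARTIAL_MATCH : MISMATCH;
            }
            if (text.charAt(cursor) != pattern.charAt(i)) {
                return MISMATCH;
            }
            cursor++;
        }
        offset[0] = cursor; // output value: limit of the matched text
        return MATCH;
    }

    public static void main(String[] args) {
        int[] offset = {2};
        int degree = matchesForward("cd", "abcdef", offset, 6, false);
        System.out.println("degree=" + degree + ", matched length=" + (offset[0] - 2));
    }
}
```

In this sketch the number of matched code units is the output value of offset[0] minus its input value, and offset[0] is left untouched on a mismatch or a partial match.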
-
toPattern
public java.lang.String toPattern(boolean escapeUnprintable)
Implement UnicodeMatcher
- Specified by:
toPattern in interface UnicodeMatcher
- Parameters:
escapeUnprintable - if true then convert unprintable characters to their hex escape representations, \uxxxx or \Uxxxxxxxx. Unprintable characters are those other than U+000A, U+0020..U+007E.
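The escaping rule stated for escapeUnprintable can be sketched as follows. This is an illustration of the rule as documented here, not ICU code: EscapeSketch and its methods are hypothetical names, the choice of upper-case hex digits is an assumption, and a real pattern generator would also have to handle quoting and stand-ins, which this sketch ignores.

```java
// Hypothetical sketch of hex-escaping unprintable characters in a pattern string.
final class EscapeSketch {
    /** Per the rule above: everything except U+000A and U+0020..U+007E is unprintable. */
    static boolean isUnprintable(int codePoint) {
        return !(codePoint == 0x000A || (codePoint >= 0x0020 && codePoint <= 0x007E));
    }

    /** Replaces each unprintable code point with a backslash-u or backslash-U hex escape. */
    static String escapeUnprintable(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp); // step over both halves of a surrogate pair
            if (!isUnprintable(cp)) {
                out.appendCodePoint(cp);
            } else if (cp <= 0xFFFF) {
                // BMP code point: four hex digits (digit case is an assumption)
                out.append(String.format("\\u%04X", cp));
            } else {
                // Supplementary code point: eight hex digits
                out.append(String.format("\\U%08X", cp));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeUnprintable("a\tb"));
    }
}
```

Iterating by code point rather than by char keeps a supplementary character together, so it is emitted as one eight-digit escape instead of two four-digit ones.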
-
matchesIndexValue
public boolean matchesIndexValue(int v)
Implement UnicodeMatcher
- Specified by:
matchesIndexValue in interface UnicodeMatcher
-
addMatchSetTo
public void addMatchSetTo(UnicodeSet toUnionTo)
Implementation of UnicodeMatcher API. Union the set of all characters that may be matched by this object into the given set.
- Specified by:
addMatchSetTo in interface UnicodeMatcher
- Parameters:
toUnionTo - the set into which to union the source characters
-
replace
public int replace(Replaceable text, int start, int limit, int[] cursor)
UnicodeReplacer API
- Specified by:
replace in interface UnicodeReplacer
- Parameters:
text - the text to be matched
start - inclusive start index of text to be replaced
limit - exclusive end index of text to be replaced; must be greater than or equal to start
cursor - output parameter for the cursor position. Not all replacer objects will update this, but in a complete tree of replacer objects, representing the entire output side of a transliteration rule, at least one must update it.
- Returns:
the number of 16-bit code units in the text replacing the characters at offsets start..(limit-1) in text
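As a rough illustration of how a segment can act as a replacer, the sketch below emits previously matched text over the range start..(limit-1) and returns the length of what it wrote. Everything in it is an assumption made for illustration: SegmentReplaceSketch is a hypothetical class, a StringBuilder stands in for Replaceable, a negative matchStart is taken to mean "no match recorded", and the cursor parameter is omitted because this sketch never updates it.

```java
// Hypothetical sketch: replace a range of text with a previously matched substring.
final class SegmentReplaceSketch {
    /**
     * Replaces text[start, limit) with the substring text[matchStart, matchLimit)
     * and returns the number of 16-bit code units written.
     * A negative matchStart means no match was recorded, so the range is simply deleted.
     */
    static int replace(StringBuilder text, int matchStart, int matchLimit,
                       int start, int limit) {
        // Copy the matched text first, because the edit below may shift indices.
        String matched = matchStart >= 0 ? text.substring(matchStart, matchLimit) : "";
        text.replace(start, limit, matched);
        return matched.length();
    }

    public static void main(String[] args) {
        StringBuilder text = new StringBuilder("abc-XY");
        // Pretend an earlier match recorded "abc" at [0, 3); emit it over "XY" at [4, 6).
        int written = replace(text, 0, 3, 4, 6);
        System.out.println(text + " (" + written + " code units written)");
    }
}
```

The return value counts code units rather than code points, matching the "16-bit code units" wording of the contract above.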
-
toReplacerPattern
public java.lang.String toReplacerPattern(boolean escapeUnprintable)
UnicodeReplacer API
- Specified by:
toReplacerPattern in interface UnicodeReplacer
- Parameters:
escapeUnprintable - if true then convert unprintable characters to their hex escape representations, \uxxxx or \Uxxxxxxxx. Unprintable characters are defined by Utility.isUnprintable().
-
resetMatch
public void resetMatch()
Remove any match data. This must be called before performing a set of matches with this segment.
-
addReplacementSetTo
public void addReplacementSetTo(UnicodeSet toUnionTo)
Union the set of all characters that may be output by this object into the given set.
- Specified by:
addReplacementSetTo in interface UnicodeReplacer
- Parameters:
toUnionTo - the set into which to union the output characters
-
-