Package com.ibm.icu.impl
Class PatternTokenizer
- java.lang.Object
-
- com.ibm.icu.impl.PatternTokenizer
-
public class PatternTokenizer extends java.lang.Object
A simple parsing class for patterns and rules. Handles '...' quotations, \\uxxxx and \\Uxxxxxxxx escapes, and simple syntax. The '' (two quotes) is treated as a single quote, inside or outside a quote.
- Any ignorable characters are ignored in parsing.
- Any syntax characters are broken into separate tokens
- Quote characters can be specified: '...', "...", and \x
- Other characters are treated as literals
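The rules above can be illustrated with a small, standard-library-only sketch. This is not the ICU implementation and does not use PatternTokenizer or UnicodeSet. The class name, the fixed syntax set, the use of whitespace as the ignorable set, and the string tags (in place of the int constants SYNTAX and LITERAL) are all assumptions made for illustration. Backslash escapes are left out, and an unterminated quote simply throws.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified illustration of the quoting rules; not ICU code. */
final class QuoteTokenizerSketch {
    // Assumed syntax characters, chosen for this sketch only.
    private static final String SYNTAX_CHARS = "{}[]#;";

    /** Splits a pattern into tokens tagged "SYNTAX:" or "LITERAL:". */
    static List<String> tokenize(String pattern) {
        List<String> tokens = new ArrayList<>();
        StringBuilder literal = new StringBuilder();
        boolean inQuote = false;
        int i = 0;
        while (i < pattern.length()) {
            char c = pattern.charAt(i);
            if (c == '\'') {
                if (i + 1 < pattern.length() && pattern.charAt(i + 1) == '\'') {
                    // '' is one literal quote, inside or outside a quote.
                    literal.append('\'');
                    i += 2;
                } else {
                    inQuote = !inQuote; // a lone quote opens or closes a quotation
                    i++;
                }
                continue;
            }
            if (inQuote) {
                literal.append(c); // everything inside quotes is literal
            } else if (Character.isWhitespace(c)) {
                // Ignorable character (assumed: whitespace); skipped.
            } else if (SYNTAX_CHARS.indexOf(c) >= 0) {
                flush(literal, tokens);
                tokens.add("SYNTAX:" + c); // each syntax character is its own token
            } else {
                literal.append(c);
            }
            i++;
        }
        if (inQuote) {
            throw new IllegalArgumentException("unterminated quote");
        }
        flush(literal, tokens);
        return tokens;
    }

    /** Emits the pending literal run, if any, as one token. */
    private static void flush(StringBuilder literal, List<String> tokens) {
        if (literal.length() > 0) {
            tokens.add("LITERAL:" + literal);
            literal.setLength(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a {b} 'x y'"));
    }
}
```

In this sketch a quoted syntax character such as '{' becomes part of a literal token, while the same character unquoted becomes a token of its own.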
-
-
Field Summary
Modifier and Type Field Description
private static int
AFTER_QUOTE
static char
BACK_SLASH
static int
BROKEN_ESCAPE
static int
BROKEN_QUOTE
static int
DONE
private UnicodeSet
escapeCharacters
private UnicodeSet
extraQuotingCharacters
private static int
HEX
private UnicodeSet
ignorableCharacters
private static int
IN_QUOTE
private int
limit
static int
LITERAL
private UnicodeSet
needingQuoteCharacters
private static int
NO_QUOTE
private static int
NONE
private static int
NORMAL_QUOTE
private java.lang.String
pattern
static char
SINGLE_QUOTE
private static int
SLASH_START
private int
start
private static int
START_QUOTE
static int
SYNTAX
private UnicodeSet
syntaxCharacters
static int
UNKNOWN
private boolean
usingQuote
private boolean
usingSlash
-
Constructor Summary
Constructor Description
PatternTokenizer()
-
Method Summary
Modifier and Type Method Description
private void
appendEscaped(java.lang.StringBuffer result, int cp)
UnicodeSet
getEscapeCharacters()
UnicodeSet
getExtraQuotingCharacters()
UnicodeSet
getIgnorableCharacters()
int
getLimit()
int
getStart()
UnicodeSet
getSyntaxCharacters()
boolean
isUsingQuote()
boolean
isUsingSlash()
int
next(java.lang.StringBuffer buffer)
java.lang.String
normalize()
java.lang.String
quoteLiteral(java.lang.CharSequence string)
java.lang.String
quoteLiteral(java.lang.String string)
Quote a literal string, using the available settings.
PatternTokenizer
setEscapeCharacters(UnicodeSet escapeCharacters)
Sets the characters to be escaped in literals, in quoteLiteral and normalize, e.g. new UnicodeSet("[^\\u0020-\\u007E]").
PatternTokenizer
setExtraQuotingCharacters(UnicodeSet syntaxCharacters)
Sets the extra characters to be quoted in literals.
PatternTokenizer
setIgnorableCharacters(UnicodeSet ignorableCharacters)
Sets the characters to be ignored in parsing, e.g. new UnicodeSet("[:pattern_whitespace:]").
PatternTokenizer
setLimit(int limit)
PatternTokenizer
setPattern(java.lang.CharSequence pattern)
PatternTokenizer
setPattern(java.lang.String pattern)
PatternTokenizer
setStart(int start)
PatternTokenizer
setSyntaxCharacters(UnicodeSet syntaxCharacters)
Sets the characters to be interpreted as syntax characters in parsing, e.g. new UnicodeSet("[:pattern_syntax:]").
PatternTokenizer
setUsingQuote(boolean usingQuote)
PatternTokenizer
setUsingSlash(boolean usingSlash)
-
-
-
Field Detail
-
ignorableCharacters
private UnicodeSet ignorableCharacters
-
syntaxCharacters
private UnicodeSet syntaxCharacters
-
extraQuotingCharacters
private UnicodeSet extraQuotingCharacters
-
escapeCharacters
private UnicodeSet escapeCharacters
-
usingSlash
private boolean usingSlash
-
usingQuote
private boolean usingQuote
-
needingQuoteCharacters
private transient UnicodeSet needingQuoteCharacters
-
start
private int start
-
limit
private int limit
-
pattern
private java.lang.String pattern
-
SINGLE_QUOTE
public static final char SINGLE_QUOTE
- See Also:
- Constant Field Values
-
BACK_SLASH
public static final char BACK_SLASH
- See Also:
- Constant Field Values
-
NO_QUOTE
private static int NO_QUOTE
-
IN_QUOTE
private static int IN_QUOTE
-
DONE
public static final int DONE
- See Also:
- Constant Field Values
-
SYNTAX
public static final int SYNTAX
- See Also:
- Constant Field Values
-
LITERAL
public static final int LITERAL
- See Also:
- Constant Field Values
-
BROKEN_QUOTE
public static final int BROKEN_QUOTE
- See Also:
- Constant Field Values
-
BROKEN_ESCAPE
public static final int BROKEN_ESCAPE
- See Also:
- Constant Field Values
-
UNKNOWN
public static final int UNKNOWN
- See Also:
- Constant Field Values
-
AFTER_QUOTE
private static final int AFTER_QUOTE
- See Also:
- Constant Field Values
-
NONE
private static final int NONE
- See Also:
- Constant Field Values
-
START_QUOTE
private static final int START_QUOTE
- See Also:
- Constant Field Values
-
NORMAL_QUOTE
private static final int NORMAL_QUOTE
- See Also:
- Constant Field Values
-
SLASH_START
private static final int SLASH_START
- See Also:
- Constant Field Values
-
HEX
private static final int HEX
- See Also:
- Constant Field Values
-
-
Method Detail
-
getIgnorableCharacters
public UnicodeSet getIgnorableCharacters()
-
setIgnorableCharacters
public PatternTokenizer setIgnorableCharacters(UnicodeSet ignorableCharacters)
Sets the characters to be ignored in parsing, e.g. new UnicodeSet("[:pattern_whitespace:]").
- Parameters:
- ignorableCharacters - Characters to be ignored.
- Returns:
- A PatternTokenizer object in which the given characters are specified as ignorable characters.
-
getSyntaxCharacters
public UnicodeSet getSyntaxCharacters()
-
getExtraQuotingCharacters
public UnicodeSet getExtraQuotingCharacters()
-
setSyntaxCharacters
public PatternTokenizer setSyntaxCharacters(UnicodeSet syntaxCharacters)
Sets the characters to be interpreted as syntax characters in parsing, e.g. new UnicodeSet("[:pattern_syntax:]").
- Parameters:
- syntaxCharacters - Characters to be set as syntax characters.
- Returns:
- A PatternTokenizer object in which the given characters are specified as syntax characters.
-
setExtraQuotingCharacters
public PatternTokenizer setExtraQuotingCharacters(UnicodeSet syntaxCharacters)
Sets the extra characters to be quoted in literals.
- Parameters:
- syntaxCharacters - Characters to be set as extra quoting characters.
- Returns:
- A PatternTokenizer object in which the given characters are specified as extra quoting characters.
-
getEscapeCharacters
public UnicodeSet getEscapeCharacters()
-
setEscapeCharacters
public PatternTokenizer setEscapeCharacters(UnicodeSet escapeCharacters)
Sets the characters to be escaped in literals, in quoteLiteral and normalize, e.g. new UnicodeSet("[^\\u0020-\\u007E]").
- Parameters:
- escapeCharacters - Characters to be set as escape characters.
- Returns:
- A PatternTokenizer object in which the given characters are specified as escape characters.
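The example set [^\\u0020-\\u007E] selects every character outside printable ASCII. As a rough illustration of what escaping such characters might look like, here is a standard-library-only sketch. It is not ICU's code: the class and method names are hypothetical, and the output forms (four hex digits after \u for BMP code points, eight after \U for supplementary ones) are assumed from the escapes listed in the class description.

```java
/** Hypothetical illustration of escaping characters outside printable ASCII; not ICU code. */
final class EscapeSketch {
    /** Returns s with every code point outside U+0020..U+007E written as a hex escape. */
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp); // advance by one or two UTF-16 units
            if (cp >= 0x20 && cp <= 0x7E) {
                out.appendCodePoint(cp);                  // printable ASCII stays as-is
            } else if (cp <= 0xFFFF) {
                out.append(String.format("\\u%04X", cp)); // BMP: backslash, u, 4 hex digits
            } else {
                out.append(String.format("\\U%08X", cp)); // supplementary: backslash, U, 8 hex digits
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("caf\u00E9"));
    }
}
```

The sketch does not escape the backslash itself, so its output is only unambiguous for input that contains no backslashes.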
-
isUsingQuote
public boolean isUsingQuote()
-
setUsingQuote
public PatternTokenizer setUsingQuote(boolean usingQuote)
-
isUsingSlash
public boolean isUsingSlash()
-
setUsingSlash
public PatternTokenizer setUsingSlash(boolean usingSlash)
-
getLimit
public int getLimit()
-
setLimit
public PatternTokenizer setLimit(int limit)
-
getStart
public int getStart()
-
setStart
public PatternTokenizer setStart(int start)
-
setPattern
public PatternTokenizer setPattern(java.lang.CharSequence pattern)
-
setPattern
public PatternTokenizer setPattern(java.lang.String pattern)
-
quoteLiteral
public java.lang.String quoteLiteral(java.lang.CharSequence string)
-
quoteLiteral
public java.lang.String quoteLiteral(java.lang.String string)
Quote a literal string, using the available settings. Thus syntax characters, quote characters, and ignorable characters will be put into quotes.
- Parameters:
- string - The literal string to quote.
- Returns:
- The quoted string, in which syntax, quote, and ignorable characters have been placed in quotes.
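To make the quoting behavior concrete, here is a standard-library-only sketch of one way such quoting could work. It is not the ICU implementation: the class name, the fixed syntax set, and the choice of whitespace as the ignorable set are assumptions. Runs of characters that need protection are wrapped in '...', and a single quote is always written as '' since that form is valid inside or outside a quote.

```java
/** Hypothetical illustration of literal quoting; not ICU code. */
final class QuoteLiteralSketch {
    // Assumed syntax characters, chosen for this sketch only.
    private static final String SYNTAX_CHARS = "{}[]#;";

    /** Assumed rule: syntax characters and whitespace must be quoted. */
    static boolean needsQuoting(char c) {
        return SYNTAX_CHARS.indexOf(c) >= 0 || Character.isWhitespace(c);
    }

    /** Wraps runs of characters that need quoting in '...' and doubles single quotes. */
    static String quoteLiteral(String s) {
        StringBuilder out = new StringBuilder();
        boolean inQuote = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\'') {
                out.append("''"); // valid both inside and outside a quotation
            } else if (needsQuoting(c)) {
                if (!inQuote) {
                    out.append('\''); // open a quotation before the first protected character
                    inQuote = true;
                }
                out.append(c);
            } else {
                if (inQuote) {
                    out.append('\''); // close the quotation before ordinary text
                    inQuote = false;
                }
                out.append(c);
            }
        }
        if (inQuote) {
            out.append('\''); // close a quotation left open at the end
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(quoteLiteral("it's {x}"));
    }
}
```

Quoting whole runs rather than single characters keeps the output shorter when several protected characters appear next to each other.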
-
appendEscaped
private void appendEscaped(java.lang.StringBuffer result, int cp)
-
normalize
public java.lang.String normalize()
-
next
public int next(java.lang.StringBuffer buffer)
-
-