Package com.ibm.icu.text
Class TransliteratorIDParser
java.lang.Object
com.ibm.icu.text.TransliteratorIDParser
Parsing component for transliterator IDs. This class contains only
static members; it cannot be instantiated. Methods in this class
parse various ID formats, including the following:
A basic ID, which contains source, target, and variant, but no
filter and no explicit inverse. Examples include
"Latin-Greek/UNGEGN" and "Null".
A single ID, which is a basic ID plus optional filter and optional
explicit inverse. Examples include "[a-zA-Z] Latin-Greek" and
"Lower (Upper)".
A compound ID, which is a sequence of one or more single IDs,
separated by semicolons, with optional forward and reverse global
filters. The global filters are UnicodeSet patterns prepended or
appended to the IDs, separated by semicolons. An appended filter
must be enclosed in parentheses and applies in the reverse
direction.
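To make the compound ID layout concrete, here is a minimal sketch that splits a compound ID into its semicolon-separated elements. It is not the ICU parser: the class and method names (CompoundIdSketch, split) are hypothetical, and the sketch assumes that a semicolon inside square brackets belongs to a UnicodeSet pattern rather than acting as a separator. Escapes and quoting inside patterns are not handled.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; illustrates the compound ID layout only. */
final class CompoundIdSketch {

    /** Splits on ';' outside of [...] and trims each element. */
    static List<String> split(String id) {
        List<String> parts = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        int depth = 0; // nesting depth of '[' ... ']'
        for (int i = 0; i < id.length(); i++) {
            char c = id.charAt(i);
            if (c == '[') {
                depth++;
            } else if (c == ']' && depth > 0) {
                depth--;
            }
            if (c == ';' && depth == 0) {
                addIfNotEmpty(parts, cur);
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        addIfNotEmpty(parts, cur);
        return parts;
    }

    private static void addIfNotEmpty(List<String> parts, StringBuilder cur) {
        String s = cur.toString().trim();
        if (!s.isEmpty()) {
            parts.add(s);
        }
    }

    public static void main(String[] args) {
        // Forward global filter, two single IDs, reverse global filter.
        System.out.println(split("[a-z]; Latin-Greek/UNGEGN; Lower (Upper); ([A-Z])"));
    }
}
```

In this illustrative input the first element plays the role of the forward global filter and the last, parenthesized element the reverse global filter; the two elements in between are single IDs.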
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
(package private) static class | TransliteratorIDParser.SingleID | A structure containing the canonicalized data of a filtered ID, that is, a basic ID optionally with a filter.
private static class | TransliteratorIDParser.Specs | A structure containing the parsed data of a filtered ID, that is, a basic ID optionally with a filter.
-
Field Summary
Fields
Modifier and Type | Field
private static final String | ANY
private static final char | CLOSE_REV
private static final int | FORWARD
private static final char | ID_DELIM
private static final char | OPEN_REV
private static final int | REVERSE
private static final Map<CaseInsensitiveString,String> | SPECIAL_INVERSES
private static final char | TARGET_SEP
private static final char | VARIANT_SEP
-
Constructor Summary
Constructors
TransliteratorIDParser()
-
Method Summary
Modifier and Type | Method | Description
static String[] | IDtoSTV(String id) | Parse an ID into pieces.
(package private) static List<Transliterator> | instantiateList(List<TransliteratorIDParser.SingleID> ids) | Returns the list of Transliterator objects for the given list of SingleID objects.
static boolean | parseCompoundID(String id, int dir, StringBuffer canonID, List<TransliteratorIDParser.SingleID> list, UnicodeSet[] globalFilter) | Parse a compound ID, consisting of an optional forward global filter, a separator, one or more single IDs delimited by separators, and an optional reverse global filter.
static TransliteratorIDParser.SingleID | parseFilterID(String id, int[] pos) | Parse a filter ID, that is, an ID of the general form "[f1] s1-t1/v1", with the filter optional and the variant optional.
private static TransliteratorIDParser.Specs | parseFilterID(String id, int[] pos, boolean allowFilter) | Parse an ID into component pieces.
static UnicodeSet | parseGlobalFilter(String id, int[] pos, int dir, int[] withParens, StringBuffer canonID) | Parse a global filter of the form "[f]" or "([f])", depending on 'withParens'.
static TransliteratorIDParser.SingleID | parseSingleID(String id, int[] pos, int dir) | Parse a single ID, that is, an ID of the general form "[f1] s1-t1/v1 ([f2] s2-t2/v2)", with the parenthesized element optional, the filters optional, and the variants optional.
static void | registerSpecialInverse(String target, String inverseTarget, boolean bidirectional) | Register two targets as being inverses of one another.
private static TransliteratorIDParser.SingleID | specsToID(TransliteratorIDParser.Specs specs, int dir) | Given a Specs object, convert it to a SingleID object.
private static TransliteratorIDParser.SingleID | specsToSpecialInverse(TransliteratorIDParser.Specs specs) | Given a Specs object, return a SingleID representing the special inverse of that ID.
static String | STVtoID(String source, String target, String variant) | Given source, target, and variant strings, concatenate them into a full ID.
-
Field Details
-
ID_DELIM
private static final char ID_DELIM
-
TARGET_SEP
private static final char TARGET_SEP
-
VARIANT_SEP
private static final char VARIANT_SEP
-
OPEN_REV
private static final char OPEN_REV
-
CLOSE_REV
private static final char CLOSE_REV
-
ANY
private static final String ANY
-
FORWARD
private static final int FORWARD
-
REVERSE
private static final int REVERSE
-
SPECIAL_INVERSES
private static final Map<CaseInsensitiveString,String> SPECIAL_INVERSES
-
-
Constructor Details
-
TransliteratorIDParser
TransliteratorIDParser()
-
-
Method Details
-
parseFilterID
public static TransliteratorIDParser.SingleID parseFilterID(String id, int[] pos)
Parse a filter ID, that is, an ID of the general form "[f1] s1-t1/v1", with the filter optional and the variant optional.
Parameters:
id - the id to be parsed
pos - INPUT-OUTPUT parameter. On input, the position of the first character to parse. On output, the position after the last character parsed.
Returns:
a SingleID object, or null if the parse fails
-
parseSingleID
public static TransliteratorIDParser.SingleID parseSingleID(String id, int[] pos, int dir)
Parse a single ID, that is, an ID of the general form "[f1] s1-t1/v1 ([f2] s2-t2/v2)", with the parenthesized element optional, the filters optional, and the variants optional.
Parameters:
id - the id to be parsed
pos - INPUT-OUTPUT parameter. On input, the position of the first character to parse. On output, the position after the last character parsed.
dir - the direction. If the direction is REVERSE, then the SingleID is constructed for the reverse direction.
Returns:
a SingleID object, or null
-
parseGlobalFilter
public static UnicodeSet parseGlobalFilter(String id, int[] pos, int dir, int[] withParens, StringBuffer canonID)
Parse a global filter of the form "[f]" or "([f])", depending on 'withParens'.
Parameters:
id - the pattern to parse
pos - INPUT-OUTPUT parameter. On input, the position of the first character to parse. On output, the position after the last character parsed.
dir - the direction.
withParens - INPUT-OUTPUT parameter. On entry, if withParens[0] is 0, then parens are disallowed. If it is 1, then parens are required. If it is -1, then parens are optional, and withParens[0] will be set to 0 or 1 on return.
canonID - OUTPUT parameter. The pattern for the filter is added to the canonID, either at the end, if dir is FORWARD, or at the start, if dir is REVERSE. The pattern will be enclosed in parentheses if appropriate, and will be suffixed with an ID_DELIM character. May be null.
Returns:
a UnicodeSet object or null. A non-null result indicates a successful parse, regardless of whether the filter applies to the given direction. The caller should discard it if withParens != (dir == REVERSE), that is, if a parenthesized filter was parsed for the FORWARD direction or an unparenthesized one for REVERSE.
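The tri-state withParens convention and the discard rule can be sketched as follows. This is a simplified illustration under stated assumptions, not the ICU code: WithParensSketch, parseFilter, and appliesTo are hypothetical names, the filter is returned as a plain string rather than a UnicodeSet, the whole input is treated as the filter (no pos parameter), and the direction is modeled as a boolean because this page does not give the values of FORWARD and REVERSE.

```java
/** Hypothetical sketch of the tri-state withParens convention; not the ICU code. */
final class WithParensSketch {

    /**
     * Returns the bracketed pattern in s, or null on failure.
     * withParens[0]: 0 = parens disallowed, 1 = parens required,
     * -1 = parens optional (set to 0 or 1 on success to report what was seen).
     */
    static String parseFilter(String s, int[] withParens) {
        String t = s.trim();
        boolean hasParens = t.startsWith("(") && t.endsWith(")");
        if (withParens[0] == 0 && hasParens) {
            return null; // parens present but disallowed
        }
        if (withParens[0] == 1 && !hasParens) {
            return null; // parens required but missing
        }
        String inner = hasParens ? t.substring(1, t.length() - 1).trim() : t;
        if (!inner.startsWith("[") || !inner.endsWith("]")) {
            return null; // not a bracketed set pattern
        }
        withParens[0] = hasParens ? 1 : 0;
        return inner;
    }

    /** Discard rule: a parenthesized filter applies only in reverse, and vice versa. */
    static boolean appliesTo(boolean reverse, int[] withParens) {
        return (withParens[0] == 1) == reverse;
    }

    public static void main(String[] args) {
        int[] withParens = {-1}; // parens optional
        String set = parseFilter("([A-Z])", withParens);
        System.out.println(set + " withParens=" + withParens[0]
                + " keepForForward=" + appliesTo(false, withParens));
    }
}
```

Passing -1 lets one call accept either form and report which one it saw, which is why the caller, not the parser, decides whether the result applies to the requested direction.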
-
parseCompoundID
public static boolean parseCompoundID(String id, int dir, StringBuffer canonID, List<TransliteratorIDParser.SingleID> list, UnicodeSet[] globalFilter)
Parse a compound ID, consisting of an optional forward global filter, a separator, one or more single IDs delimited by separators, and an optional reverse global filter. The separator is a semicolon. The global filters are UnicodeSet patterns. The reverse global filter must be enclosed in parentheses.
Parameters:
id - the pattern to parse
dir - the direction.
canonID - OUTPUT parameter that receives the canonical ID, consisting of canonical IDs for all elements, as returned by parseSingleID(), separated by semicolons. Previous contents are discarded.
list - OUTPUT parameter that receives a list of SingleID objects representing the parsed IDs. Previous contents are discarded.
globalFilter - OUTPUT parameter that receives a reference to a newly created global filter for this ID in this direction, or null if there is none.
Returns:
true if the parse succeeds, that is, if the entire id is consumed without syntax error.
-
instantiateList
static List<Transliterator> instantiateList(List<TransliteratorIDParser.SingleID> ids)
Returns the list of Transliterator objects for the given list of SingleID objects.
Parameters:
ids - list of SingleID objects.
Returns:
the actual transliterators for the list of SingleIDs
-
IDtoSTV
public static String[] IDtoSTV(String id)
Parse an ID into pieces. Take IDs of the form T, T/V, S-T, S-T/V, or S/V-T. If the source is missing, return a source of ANY.
Parameters:
id - the id string, in any of several forms
Returns:
an array of 4 strings: source, target, variant, and isSourcePresent. If the source is not present, ANY will be given as the source, and isSourcePresent will be null. Otherwise isSourcePresent will be non-null. The target may be empty if the id is not well-formed. The variant may be empty.
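The splitting rules described for IDtoSTV can be sketched in a few lines. This is a hypothetical re-creation written from the description above, not the ICU source: the class and method names are illustrative, and the separator and ANY values ('-', '/', "Any") are assumptions inferred from examples such as "Latin-Greek/UNGEGN".

```java
import java.util.Arrays;

/** Hypothetical re-creation of the splitting rules; not the ICU source. */
final class IdToStvSketch {
    // Assumed values, inferred from examples such as "Latin-Greek/UNGEGN".
    private static final char TARGET_SEP = '-';
    private static final char VARIANT_SEP = '/';
    private static final String ANY = "Any";

    /** Returns { source, target, variant, isSourcePresent }. */
    static String[] idToStv(String id) {
        String source = ANY;
        String target;
        String variant;
        boolean isSourcePresent = false;

        int sep = id.indexOf(TARGET_SEP);
        int var = id.indexOf(VARIANT_SEP);
        if (var < 0) {
            var = id.length(); // no variant: treat it as starting at the end
        }

        if (sep < 0) {
            // Forms T and T/V
            target = id.substring(0, var);
            variant = id.substring(var);
        } else if (sep < var) {
            // Forms S-T and S-T/V
            if (sep > 0) {
                source = id.substring(0, sep);
                isSourcePresent = true;
            }
            target = id.substring(sep + 1, var);
            variant = id.substring(var);
        } else {
            // Form S/V-T
            if (var > 0) {
                source = id.substring(0, var);
                isSourcePresent = true;
            }
            variant = id.substring(var, sep);
            target = id.substring(sep + 1);
        }

        if (!variant.isEmpty()) {
            variant = variant.substring(1); // drop the leading '/'
        }
        // A non-null fourth element signals that a source was present.
        return new String[] { source, target, variant, isSourcePresent ? "" : null };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(idToStv("Latin-Greek/UNGEGN")));
        System.out.println(Arrays.toString(idToStv("Null")));
        System.out.println(Arrays.toString(idToStv("Latin/UNGEGN-Greek")));
    }
}
```

The fourth array element carries no text of its own; only its nullness matters, which lets a caller distinguish "Null" from "Any-Null" even though both yield a source of "Any".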
-
STVtoID
public static String STVtoID(String source, String target, String variant)
Given source, target, and variant strings, concatenate them into a full ID. If the source is empty, then "Any" will be used for the source, so the ID will always be of the form s-t/v or s-t.
-
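The concatenation described for STVtoID is the reverse of the splitting step. A minimal sketch, assuming '-' and '/' as the separators and treating a null or empty variant as absent (the null handling is an assumption of this sketch, and StvToIdSketch is a hypothetical name):

```java
/** Hypothetical sketch of building an ID from source, target, and variant. */
final class StvToIdSketch {

    static String stvToId(String source, String target, String variant) {
        StringBuilder id = new StringBuilder(source);
        if (id.length() == 0) {
            id.append("Any"); // an empty source defaults to "Any"
        }
        id.append('-').append(target);
        if (variant != null && !variant.isEmpty()) {
            id.append('/').append(variant);
        }
        return id.toString();
    }

    public static void main(String[] args) {
        System.out.println(stvToId("Latin", "Greek", "UNGEGN"));
        System.out.println(stvToId("", "Null", ""));
    }
}
```

Because an empty source is always replaced, the result never starts with a bare separator, which matches the s-t/v or s-t forms named above.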
registerSpecialInverse
public static void registerSpecialInverse(String target, String inverseTarget, boolean bidirectional)
Register two targets as being inverses of one another. For example, calling registerSpecialInverse("NFC", "NFD", true) causes Transliterator to form the following inverse relationships:
NFC => NFD
Any-NFC => Any-NFD
NFD => NFC
Any-NFD => Any-NFC
(Without the special inverse registration, the inverse of NFC would be NFC-Any.) Note that NFD is shorthand for Any-NFD, but that the presence or absence of "Any-" is preserved.
When bidirectional is true, the relationship is symmetrical; registering (a, b) is equivalent to registering (b, a).
The relevant IDs must still be registered separately as factories or classes.
Only the targets are specified. Special inverses always have the form Any-Target1 <=> Any-Target2. The target should have canonical casing (the casing desired to be produced when an inverse is formed) and should contain no whitespace or other extraneous characters.
Parameters:
target - the target against which to register the inverse
inverseTarget - the inverse of target, that is, Any-target.getInverse() => Any-inverseTarget
bidirectional - if true, register the reverse relation as well, that is, Any-inverseTarget.getInverse() => Any-target
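The registration behavior can be illustrated with a small stand-alone registry. This is a sketch under stated assumptions, not the ICU implementation: SpecialInverseSketch, register, and inverseOf are hypothetical names, and a TreeMap with case-insensitive ordering stands in for the Map<CaseInsensitiveString,String> field listed in the Field Summary.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical registry of special inverses; not the ICU implementation. */
final class SpecialInverseSketch {
    // Case-insensitive keys; values keep the canonical casing that was registered.
    private static final Map<String, String> INVERSES =
            new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

    static void register(String target, String inverseTarget, boolean bidirectional) {
        INVERSES.put(target, inverseTarget);
        if (bidirectional && !target.equalsIgnoreCase(inverseTarget)) {
            INVERSES.put(inverseTarget, target);
        }
    }

    /** Returns the special inverse of id, or null; preserves a leading "Any-". */
    static String inverseOf(String id) {
        String prefix = "Any-";
        boolean hasAny = id.regionMatches(true, 0, prefix, 0, prefix.length());
        String target = hasAny ? id.substring(prefix.length()) : id;
        String inverse = INVERSES.get(target);
        if (inverse == null) {
            return null; // no special inverse registered for this target
        }
        return hasAny ? prefix + inverse : inverse;
    }

    public static void main(String[] args) {
        register("NFC", "NFD", true);
        System.out.println(inverseOf("NFC"));
        System.out.println(inverseOf("Any-NFD"));
    }
}
```

Looking up by target alone and re-attaching "Any-" afterwards is one simple way to keep the presence or absence of the prefix intact, as the description requires.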
-
parseFilterID
private static TransliteratorIDParser.Specs parseFilterID(String id, int[] pos, boolean allowFilter)
Parse an ID into component pieces. Take IDs of the form T, T/V, S-T, S-T/V, or S/V-T. If the source is missing, return a source of ANY.
Parameters:
id - the id string, in any of several forms
pos - INPUT-OUTPUT parameter. On input, pos[0] is the offset of the first character to parse in id. On output, pos[0] is the offset after the last parsed character. If the parse failed, pos[0] will be unchanged.
allowFilter - if true, a UnicodeSet pattern is allowed at any location between specs or delimiters, and is returned in the Specs object.
Returns:
a Specs object, or null if the parse failed. If neither source nor target was seen in the parsed id, then the parse fails. If allowFilter is true, then the parsed filter pattern is returned in the Specs object; otherwise the returned filter reference is null.
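The INPUT-OUTPUT pos convention used by the parse methods on this page can be shown with a toy parser. The sketch below is illustrative only: PosParamSketch and parseWord are hypothetical names, and the "word" grammar (a run of letters or digits) is far simpler than a real transliterator ID.

```java
/** Hypothetical example of the INPUT-OUTPUT position convention. */
final class PosParamSketch {

    /**
     * Parses a run of letters or digits starting at pos[0].
     * On success, advances pos[0] past the run and returns the run.
     * On failure, returns null and leaves pos[0] unchanged.
     */
    static String parseWord(String id, int[] pos) {
        int start = pos[0];
        int i = start;
        while (i < id.length() && Character.isLetterOrDigit(id.charAt(i))) {
            i++;
        }
        if (i == start) {
            return null; // nothing consumed; pos[0] is untouched
        }
        pos[0] = i;
        return id.substring(start, i);
    }

    public static void main(String[] args) {
        int[] pos = {0};
        String id = "Latin-Greek";
        System.out.println(parseWord(id, pos) + " " + pos[0]); // prints "Latin 5"
        System.out.println(parseWord(id, pos) + " " + pos[0]); // prints "null 5"
    }
}
```

A one-element int array is a common Java idiom for a mutable position, and leaving it untouched on failure lets a caller try an alternative parse from the same offset.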
-
specsToID
private static TransliteratorIDParser.SingleID specsToID(TransliteratorIDParser.Specs specs, int dir)
Given a Specs object, convert it to a SingleID object. The Specs object is a less processed parse result. The SingleID object contains information about canonical and basic IDs.
Returns:
a SingleID; never returns null. The returned object always has a 'filter' field of null.
-
specsToSpecialInverse
private static TransliteratorIDParser.SingleID specsToSpecialInverse(TransliteratorIDParser.Specs specs)
Given a Specs object, return a SingleID representing the special inverse of that ID. If there is no special inverse, then return null.
Returns:
a SingleID or null. The returned object always has a 'filter' field of null.
-