Package com.ibm.icu.text
Class TransliteratorIDParser
- java.lang.Object
-
- com.ibm.icu.text.TransliteratorIDParser
-
class TransliteratorIDParser extends java.lang.Object
Parsing component for transliterator IDs. This class contains only static members; it cannot be instantiated. Methods in this class parse various ID formats, including the following:
- A basic ID, which contains source, target, and variant, but no filter and no explicit inverse. Examples include "Latin-Greek/UNGEGN" and "Null".
- A single ID, which is a basic ID plus optional filter and optional explicit inverse. Examples include "[a-zA-Z] Latin-Greek" and "Lower (Upper)".
- A compound ID, which is a sequence of one or more single IDs, separated by semicolons, with optional forward and reverse global filters. The global filters are UnicodeSet patterns prepended or appended to the IDs, separated by semicolons. An appended filter must be enclosed in parentheses and applies in the reverse direction.
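As a rough illustration of how a compound ID breaks down into single IDs, the sketch below splits a string on semicolons that are not inside a bracketed UnicodeSet pattern. The class and method names (CompoundIdSplitSketch, splitTopLevel) are hypothetical and are not part of ICU; the sketch ignores escapes and quoting inside patterns, so it only approximates what the real parser accepts.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of ICU: splits a compound ID into its
// top-level elements without interpreting them.
final class CompoundIdSplitSketch {

    // Splits on ';' only when the separator is outside any [...] pattern.
    static List<String> splitTopLevel(String compoundId) {
        List<String> parts = new ArrayList<>();
        int depth = 0;   // current nesting depth of '[' ... ']'
        int start = 0;   // start offset of the element being scanned
        for (int i = 0; i < compoundId.length(); i++) {
            char c = compoundId.charAt(i);
            if (c == '[') {
                depth++;
            } else if (c == ']' && depth > 0) {
                depth--;
            } else if (c == ';' && depth == 0) {
                addIfNotBlank(parts, compoundId.substring(start, i));
                start = i + 1;
            }
        }
        addIfNotBlank(parts, compoundId.substring(start));
        return parts;
    }

    // Trims the piece and keeps it only if something is left.
    private static void addIfNotBlank(List<String> parts, String piece) {
        String trimmed = piece.trim();
        if (!trimmed.isEmpty()) {
            parts.add(trimmed);
        }
    }
}
```

For example, splitTopLevel("[a-zA-Z] Latin-Greek; Lower (Upper)") yields two elements, each of which is a single ID in the sense described above.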
-
-
Nested Class Summary
Nested Classes Modifier and Type Class Description
(package private) static class
TransliteratorIDParser.SingleID
A structure containing the canonicalized data of a filtered ID, that is, a basic ID optionally with a filter.
private static class
TransliteratorIDParser.Specs
A structure containing the parsed data of a filtered ID, that is, a basic ID optionally with a filter.
-
Field Summary
Fields Modifier and Type Field Description
private static java.lang.String
ANY
private static char
CLOSE_REV
private static int
FORWARD
private static char
ID_DELIM
private static char
OPEN_REV
private static int
REVERSE
private static java.util.Map<CaseInsensitiveString,java.lang.String>
SPECIAL_INVERSES
private static char
TARGET_SEP
private static char
VARIANT_SEP
-
Constructor Summary
Constructors Constructor Description
TransliteratorIDParser()
-
Method Summary
All Methods Modifier and Type Method Description
static java.lang.String[]
IDtoSTV(java.lang.String id)
Parse an ID into pieces.
(package private) static java.util.List<Transliterator>
instantiateList(java.util.List<TransliteratorIDParser.SingleID> ids)
Returns the list of Transliterator objects for the given list of SingleID objects.
static boolean
parseCompoundID(java.lang.String id, int dir, java.lang.StringBuffer canonID, java.util.List<TransliteratorIDParser.SingleID> list, UnicodeSet[] globalFilter)
Parse a compound ID, consisting of an optional forward global filter, a separator, one or more single IDs delimited by separators, and an optional reverse global filter.
static TransliteratorIDParser.SingleID
parseFilterID(java.lang.String id, int[] pos)
Parse a filter ID, that is, an ID of the general form "[f1] s1-t1/v1", with the filters optional, and the variants optional.
private static TransliteratorIDParser.Specs
parseFilterID(java.lang.String id, int[] pos, boolean allowFilter)
Parse an ID into component pieces.
static UnicodeSet
parseGlobalFilter(java.lang.String id, int[] pos, int dir, int[] withParens, java.lang.StringBuffer canonID)
Parse a global filter of the form "[f]" or "([f])", depending on 'withParens'.
static TransliteratorIDParser.SingleID
parseSingleID(java.lang.String id, int[] pos, int dir)
Parse a single ID, that is, an ID of the general form "[f1] s1-t1/v1 ([f2] s2-t3/v2)", with the parenthesized element optional, the filters optional, and the variants optional.
static void
registerSpecialInverse(java.lang.String target, java.lang.String inverseTarget, boolean bidirectional)
Register two targets as being inverses of one another.
private static TransliteratorIDParser.SingleID
specsToID(TransliteratorIDParser.Specs specs, int dir)
Given a Specs object, convert it to a SingleID object.
private static TransliteratorIDParser.SingleID
specsToSpecialInverse(TransliteratorIDParser.Specs specs)
Given a Specs object, return a SingleID representing the special inverse of that ID.
static java.lang.String
STVtoID(java.lang.String source, java.lang.String target, java.lang.String variant)
Given source, target, and variant strings, concatenate them into a full ID.
-
-
-
Field Detail
-
ID_DELIM
private static final char ID_DELIM
- See Also:
- Constant Field Values
-
TARGET_SEP
private static final char TARGET_SEP
- See Also:
- Constant Field Values
-
VARIANT_SEP
private static final char VARIANT_SEP
- See Also:
- Constant Field Values
-
OPEN_REV
private static final char OPEN_REV
- See Also:
- Constant Field Values
-
CLOSE_REV
private static final char CLOSE_REV
- See Also:
- Constant Field Values
-
ANY
private static final java.lang.String ANY
- See Also:
- Constant Field Values
-
FORWARD
private static final int FORWARD
- See Also:
- Constant Field Values
-
REVERSE
private static final int REVERSE
- See Also:
- Constant Field Values
-
SPECIAL_INVERSES
private static final java.util.Map<CaseInsensitiveString,java.lang.String> SPECIAL_INVERSES
-
-
Method Detail
-
parseFilterID
public static TransliteratorIDParser.SingleID parseFilterID(java.lang.String id, int[] pos)
Parse a filter ID, that is, an ID of the general form "[f1] s1-t1/v1", with the filters optional, and the variants optional.
- Parameters:
id - the id to be parsed
pos - INPUT-OUTPUT parameter. On input, the position of the first character to parse. On output, the position after the last character parsed.
- Returns:
- a SingleID object or null if the parse fails
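The INPUT-OUTPUT pos parameter is a one-element int array used as a mutable cursor. The sketch below shows the convention on a toy token reader; parseWord and PosConventionSketch are hypothetical names, not ICU methods. Leaving pos[0] unchanged on failure follows what is documented for the private parseFilterID overload, and applying it here is an assumption.

```java
// Hypothetical illustration of the int[] pos in-out cursor convention.
final class PosConventionSketch {

    // Reads one run of letters or digits starting at pos[0], skipping leading
    // whitespace. On success, pos[0] is moved past the last character read.
    // On failure, null is returned and pos[0] is left untouched.
    static String parseWord(String text, int[] pos) {
        int i = pos[0];
        while (i < text.length() && Character.isWhitespace(text.charAt(i))) {
            i++;
        }
        int start = i;
        while (i < text.length() && Character.isLetterOrDigit(text.charAt(i))) {
            i++;
        }
        if (i == start) {
            return null;      // nothing parsed; caller's cursor is unchanged
        }
        pos[0] = i;           // commit the new position only on success
        return text.substring(start, i);
    }
}
```

A caller can chain several parse steps over the same array, checking each result for null before continuing from the updated pos[0].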
-
parseSingleID
public static TransliteratorIDParser.SingleID parseSingleID(java.lang.String id, int[] pos, int dir)
Parse a single ID, that is, an ID of the general form "[f1] s1-t1/v1 ([f2] s2-t3/v2)", with the parenthesized element optional, the filters optional, and the variants optional.
- Parameters:
id - the id to be parsed
pos - INPUT-OUTPUT parameter. On input, the position of the first character to parse. On output, the position after the last character parsed.
dir - the direction. If the direction is REVERSE then the SingleID is constructed for the reverse direction.
- Returns:
- a SingleID object or null
-
parseGlobalFilter
public static UnicodeSet parseGlobalFilter(java.lang.String id, int[] pos, int dir, int[] withParens, java.lang.StringBuffer canonID)
Parse a global filter of the form "[f]" or "([f])", depending on 'withParens'.
- Parameters:
id - the pattern to parse
pos - INPUT-OUTPUT parameter. On input, the position of the first character to parse. On output, the position after the last character parsed.
dir - the direction.
withParens - INPUT-OUTPUT parameter. On entry, if withParens[0] is 0, then parens are disallowed. If it is 1, then parens are required. If it is -1, then parens are optional, and the return result will be set to 0 or 1.
canonID - OUTPUT parameter. The pattern for the filter added to the canonID, either at the end, if dir is FORWARD, or at the start, if dir is REVERSE. The pattern will be enclosed in parentheses if appropriate, and will be suffixed with an ID_DELIM character. May be null.
- Returns:
- a UnicodeSet object or null. A non-null result indicates a successful parse, regardless of whether the filter applies to the given direction. The caller should discard it if withParens != (dir == REVERSE).
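The three-state withParens parameter can be sketched as follows. This is a simplified stand-in, not the ICU implementation: the hypothetical parseFilterPattern returns the raw pattern text instead of a UnicodeSet, matches brackets by naive depth counting, and omits the dir and canonID handling.

```java
// Hypothetical sketch of the withParens contract: 0 = parens disallowed,
// 1 = parens required, -1 = parens optional (resolved to 0 or 1 on return).
final class WithParensSketch {

    static String parseFilterPattern(String id, int[] pos, int[] withParens) {
        int i = skipSpaces(id, pos[0]);
        boolean sawOpen = i < id.length() && id.charAt(i) == '(';
        if (withParens[0] == 0 && sawOpen) {
            return null;                      // parens present but disallowed
        }
        if (withParens[0] == 1 && !sawOpen) {
            return null;                      // parens required but missing
        }
        if (sawOpen) {
            i = skipSpaces(id, i + 1);
        }
        if (i >= id.length() || id.charAt(i) != '[') {
            return null;                      // no set pattern here
        }
        int setStart = i;
        int depth = 0;
        do {                                  // scan to the matching ']'
            char c = id.charAt(i);
            if (c == '[') {
                depth++;
            } else if (c == ']') {
                depth--;
            }
            i++;
        } while (depth > 0 && i < id.length());
        if (depth != 0) {
            return null;                      // unterminated pattern
        }
        String pattern = id.substring(setStart, i);
        if (sawOpen) {
            i = skipSpaces(id, i);
            if (i >= id.length() || id.charAt(i) != ')') {
                return null;                  // missing closing paren
            }
            i++;
        }
        if (withParens[0] == -1) {
            withParens[0] = sawOpen ? 1 : 0;  // report what was actually seen
        }
        pos[0] = i;                           // commit only on success
        return pattern;
    }

    private static int skipSpaces(String s, int i) {
        while (i < s.length() && Character.isWhitespace(s.charAt(i))) {
            i++;
        }
        return i;
    }
}
```

With withParens[0] set to -1, the caller can compare the resolved value against the direction afterwards, as the return description above suggests.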
-
parseCompoundID
public static boolean parseCompoundID(java.lang.String id, int dir, java.lang.StringBuffer canonID, java.util.List<TransliteratorIDParser.SingleID> list, UnicodeSet[] globalFilter)
Parse a compound ID, consisting of an optional forward global filter, a separator, one or more single IDs delimited by separators, and an optional reverse global filter. The separator is a semicolon. The global filters are UnicodeSet patterns. The reverse global filter must be enclosed in parentheses.
- Parameters:
id - the pattern to parse
dir - the direction.
canonID - OUTPUT parameter that receives the canonical ID, consisting of canonical IDs for all elements, as returned by parseSingleID(), separated by semicolons. Previous contents are discarded.
list - OUTPUT parameter that receives a list of SingleID objects representing the parsed IDs. Previous contents are discarded.
globalFilter - OUTPUT parameter that receives a pointer to a newly created global filter for this ID in this direction, or null if there is none.
- Returns:
- true if the parse succeeds, that is, if the entire id is consumed without syntax error.
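The layout described above (forward global filter first, reverse global filter last and in parentheses) can be pictured with the following sketch. It works on elements that have already been split on semicolons and treats an element as a global filter only if it consists of nothing but a bracketed pattern. GlobalFilterSketch, Parts, and classify are hypothetical names; the real method also validates each single ID and builds the canonical ID, which this sketch does not attempt.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: sorts the elements of a compound ID into a forward
// global filter, a reverse global filter, and the remaining single IDs.
final class GlobalFilterSketch {

    // Simple result holder; not an ICU type.
    static final class Parts {
        String forwardFilter;                       // e.g. "[a-z]", or null
        String reverseFilter;                       // parentheses stripped, or null
        final List<String> singleIds = new ArrayList<>();
    }

    static Parts classify(List<String> elements) {
        Parts parts = new Parts();
        int start = 0;
        int end = elements.size();
        // A leading element that is only a set pattern is the forward filter.
        if (end > 0 && isBareSet(elements.get(0).trim())) {
            parts.forwardFilter = elements.get(0).trim();
            start = 1;
        }
        // A trailing "( [set] )" element is the reverse filter.
        if (end > start) {
            String last = elements.get(end - 1).trim();
            if (last.startsWith("(") && last.endsWith(")")) {
                String inner = last.substring(1, last.length() - 1).trim();
                if (isBareSet(inner)) {
                    parts.reverseFilter = inner;
                    end--;
                }
            }
        }
        for (int i = start; i < end; i++) {
            parts.singleIds.add(elements.get(i).trim());
        }
        return parts;
    }

    // True if the text starts with '[' and ends with ']' (a rough test only).
    private static boolean isBareSet(String s) {
        return s.length() >= 2 && s.charAt(0) == '[' && s.charAt(s.length() - 1) == ']';
    }
}
```

An element such as "[a-zA-Z] Latin-Greek" is not mistaken for a global filter, because it does not end with a closing bracket.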
-
instantiateList
static java.util.List<Transliterator> instantiateList(java.util.List<TransliteratorIDParser.SingleID> ids)
Returns the list of Transliterator objects for the given list of SingleID objects.
- Parameters:
ids - list of SingleID objects.
- Returns:
- Actual transliterators for the list of SingleIDs
-
IDtoSTV
public static java.lang.String[] IDtoSTV(java.lang.String id)
Parse an ID into pieces. Take IDs of the form T, T/V, S-T, S-T/V, or S/V-T. If the source is missing, return a source of ANY.
- Parameters:
id - the id string, in any of several forms
- Returns:
- an array of 4 strings: source, target, variant, and isSourcePresent. If the source is not present, ANY will be given as the source, and isSourcePresent will be null. Otherwise isSourcePresent will be non-null. The target may be empty if the id is not well-formed. The variant may be empty.
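The five accepted forms and the four-element result can be illustrated with a standalone sketch. It is written from the description above rather than copied from ICU, so treat it as an approximation: the separator values '-' and '/' are inferred from the example IDs, and the class and method names (IdToStvSketch, idToStv) are hypothetical.

```java
// Hypothetical re-creation of the documented IDtoSTV behavior.
final class IdToStvSketch {
    private static final String ANY = "Any";       // default source
    private static final char TARGET_SEP = '-';    // assumed from the S-T form
    private static final char VARIANT_SEP = '/';   // assumed from the T/V form

    // Returns { source, target, variant, isSourcePresent }, where the last
    // element is null when the source was missing and "" otherwise.
    static String[] idToStv(String id) {
        String source = ANY;
        String target;
        String variant;
        boolean sourcePresent = false;

        int sep = id.indexOf(TARGET_SEP);
        int var = id.indexOf(VARIANT_SEP);
        if (var < 0) {
            var = id.length();
        }

        if (sep < 0) {
            // Forms T and T/V
            target = id.substring(0, var);
            variant = id.substring(var);
        } else if (sep < var) {
            // Forms S-T and S-T/V
            if (sep > 0) {
                source = id.substring(0, sep);
                sourcePresent = true;
            }
            target = id.substring(sep + 1, var);
            variant = id.substring(var);
        } else {
            // Form S/V-T
            if (var > 0) {
                source = id.substring(0, var);
                sourcePresent = true;
            }
            variant = id.substring(var, sep);
            target = id.substring(sep + 1);
        }

        if (!variant.isEmpty()) {
            variant = variant.substring(1);   // drop the leading '/'
        }
        return new String[] { source, target, variant, sourcePresent ? "" : null };
    }
}
```

Under these assumptions, "Latin-Greek/UNGEGN" and "Latin/UNGEGN-Greek" both split into source "Latin", target "Greek", and variant "UNGEGN", while "Null" yields source "Any" with a null fourth element.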
-
STVtoID
public static java.lang.String STVtoID(java.lang.String source, java.lang.String target, java.lang.String variant)
Given source, target, and variant strings, concatenate them into a full ID. If the source is empty, then "Any" will be used for the source, so the ID will always be of the form s-t/v or s-t.
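A minimal sketch of the concatenation rule, assuming '-' and '/' as the separators (inferred from the s-t/v form) and treating a null variant like an empty one, which is an assumption not stated above. StvToIdSketch and stvToId are hypothetical names.

```java
// Hypothetical re-creation of the documented STVtoID behavior.
final class StvToIdSketch {

    static String stvToId(String source, String target, String variant) {
        StringBuilder id = new StringBuilder(source);
        if (id.length() == 0) {
            id.append("Any");                 // empty source defaults to "Any"
        }
        id.append('-').append(target);
        if (variant != null && !variant.isEmpty()) {
            id.append('/').append(variant);   // variant is appended only if present
        }
        return id.toString();
    }
}
```

With these rules, an empty source and target "Null" give "Any-Null", and the triple ("Latin", "Greek", "UNGEGN") gives "Latin-Greek/UNGEGN".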
-
registerSpecialInverse
public static void registerSpecialInverse(java.lang.String target, java.lang.String inverseTarget, boolean bidirectional)
Register two targets as being inverses of one another. For example, calling registerSpecialInverse("NFC", "NFD", true) causes Transliterator to form the following inverse relationships:
NFC => NFD
Any-NFC => Any-NFD
NFD => NFC
Any-NFD => Any-NFC
(Without the special inverse registration, the inverse of NFC would be NFC-Any.) Note that NFD is shorthand for Any-NFD, but that the presence or absence of "Any-" is preserved.
The relationship is symmetrical; registering (a, b) is equivalent to registering (b, a).
The relevant IDs must still be registered separately as factories or classes.
Only the targets are specified. Special inverses always have the form Any-Target1 <=> Any-Target2. The target should have canonical casing (the casing desired to be produced when an inverse is formed) and should contain no whitespace or other extraneous characters.
- Parameters:
target - the target against which to register the inverse
inverseTarget - the inverse of target, that is Any-target.getInverse() => Any-inverseTarget
bidirectional - if true, register the reverse relation as well, that is, Any-inverseTarget.getInverse() => Any-target
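The registration and lookup behavior described above might be modeled as in the following sketch. It uses a case-insensitive TreeMap in place of the CaseInsensitiveString keys listed in the field summary. SpecialInverseSketch, register, and inverseOf are hypothetical names, and the lookup logic is an assumption based on the NFC/NFD example rather than the ICU code.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of a special-inverse registry keyed by target name.
final class SpecialInverseSketch {
    private static final String ANY_PREFIX = "Any-";

    // Case-insensitive keys; stored values keep their canonical casing.
    private final Map<String, String> inverses =
            new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

    void register(String target, String inverseTarget, boolean bidirectional) {
        inverses.put(target, inverseTarget);
        if (bidirectional && !target.equalsIgnoreCase(inverseTarget)) {
            inverses.put(inverseTarget, target);   // reverse relation as well
        }
    }

    // Returns the special inverse of an ID of the form "T" or "Any-T",
    // preserving the presence or absence of "Any-", or null if none applies.
    String inverseOf(String id) {
        boolean hasAny = id.regionMatches(true, 0, ANY_PREFIX, 0, ANY_PREFIX.length());
        String target = hasAny ? id.substring(ANY_PREFIX.length()) : id;
        if (target.indexOf('-') >= 0) {
            return null;                           // source other than Any
        }
        String inverse = inverses.get(target);
        if (inverse == null) {
            return null;                           // no special inverse known
        }
        return hasAny ? ANY_PREFIX + inverse : inverse;
    }
}
```

After register("NFC", "NFD", true), the sketch maps "NFC" to "NFD" and "Any-NFD" to "Any-NFC", matching the four relationships listed above.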
-
parseFilterID
private static TransliteratorIDParser.Specs parseFilterID(java.lang.String id, int[] pos, boolean allowFilter)
Parse an ID into component pieces. Take IDs of the form T, T/V, S-T, S-T/V, or S/V-T. If the source is missing, return a source of ANY.
- Parameters:
id - the id string, in any of several forms
pos - INPUT-OUTPUT parameter. On input, pos[0] is the offset of the first character to parse in id. On output, pos[0] is the offset after the last parsed character. If the parse failed, pos[0] will be unchanged.
allowFilter - if true, a UnicodeSet pattern is allowed at any location between specs or delimiters, and is returned in the Specs object.
- Returns:
- a Specs object, or null if the parse failed for any reason. If neither source nor target was seen in the parsed id, then the parse fails. If allowFilter is true, then the parsed filter pattern is returned in the Specs object, otherwise the returned filter reference is null.
-
specsToID
private static TransliteratorIDParser.SingleID specsToID(TransliteratorIDParser.Specs specs, int dir)
Given a Specs object, convert it to a SingleID object. The Specs object is a less processed parse result. The SingleID object contains information about canonical and basic IDs.
- Returns:
- a SingleID; never returns null. Returned object always has 'filter' field of null.
-
specsToSpecialInverse
private static TransliteratorIDParser.SingleID specsToSpecialInverse(TransliteratorIDParser.Specs specs)
Given a Specs object, return a SingleID representing the special inverse of that ID. If there is no special inverse then return null.
- Returns:
- a SingleID or null. Returned object always has 'filter' field of null.
-
-