Class TransliteratorIDParser


  • class TransliteratorIDParser
    extends java.lang.Object
    Parsing component for transliterator IDs. This class contains only static members; it cannot be instantiated. Methods in this class parse various ID formats, including the following:

    A basic ID, which contains source, target, and variant, but no filter and no explicit inverse. Examples include "Latin-Greek/UNGEGN" and "Null".

    A single ID, which is a basic ID plus optional filter and optional explicit inverse. Examples include "[a-zA-Z] Latin-Greek" and "Lower (Upper)".

    A compound ID, which is a sequence of one or more single IDs, separated by semicolons, with optional forward and reverse global filters. The global filters are UnicodeSet patterns prepended or appended to the IDs, separated by semicolons. An appended filter must be enclosed in parentheses and applies in the reverse direction.
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      (package private) static class  TransliteratorIDParser.SingleID
      A structure containing the canonicalized data of a filtered ID, that is, a basic ID optionally with a filter.
      private static class  TransliteratorIDParser.Specs
      A structure containing the parsed data of a filtered ID, that is, a basic ID optionally with a filter.
    • Method Summary

      All Methods Static Methods Concrete Methods 
      Modifier and Type Method Description
      static java.lang.String[] IDtoSTV​(java.lang.String id)
      Parse an ID into pieces.
      (package private) static java.util.List<Transliterator> instantiateList​(java.util.List<TransliteratorIDParser.SingleID> ids)
      Returns the list of Transliterator objects for the given list of SingleID objects.
      static boolean parseCompoundID​(java.lang.String id, int dir, java.lang.StringBuffer canonID, java.util.List<TransliteratorIDParser.SingleID> list, UnicodeSet[] globalFilter)
      Parse a compound ID, consisting of an optional forward global filter, a separator, one or more single IDs delimited by separators, and an optional reverse global filter.
      static TransliteratorIDParser.SingleID parseFilterID​(java.lang.String id, int[] pos)
      Parse a filter ID, that is, an ID of the general form "[f1] s1-t1/v1", with the filter optional and the variant optional.
      private static TransliteratorIDParser.Specs parseFilterID​(java.lang.String id, int[] pos, boolean allowFilter)
      Parse an ID into component pieces.
      static UnicodeSet parseGlobalFilter​(java.lang.String id, int[] pos, int dir, int[] withParens, java.lang.StringBuffer canonID)
      Parse a global filter of the form "[f]" or "([f])", depending on 'withParens'.
      static TransliteratorIDParser.SingleID parseSingleID​(java.lang.String id, int[] pos, int dir)
      Parse a single ID, that is, an ID of the general form "[f1] s1-t1/v1 ([f2] s2-t2/v2)", with the parenthesized element optional, the filters optional, and the variants optional.
      static void registerSpecialInverse​(java.lang.String target, java.lang.String inverseTarget, boolean bidirectional)
      Register two targets as being inverses of one another.
      private static TransliteratorIDParser.SingleID specsToID​(TransliteratorIDParser.Specs specs, int dir)
      Given a Specs object, convert it to a SingleID object.
      private static TransliteratorIDParser.SingleID specsToSpecialInverse​(TransliteratorIDParser.Specs specs)
      Given a Specs object, return a SingleID representing the special inverse of that ID.
      static java.lang.String STVtoID​(java.lang.String source, java.lang.String target, java.lang.String variant)
      Given source, target, and variant strings, concatenate them into a full ID.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • TransliteratorIDParser

        TransliteratorIDParser()
    • Method Detail

      • parseFilterID

        public static TransliteratorIDParser.SingleID parseFilterID​(java.lang.String id,
                                                                    int[] pos)
        Parse a filter ID, that is, an ID of the general form "[f1] s1-t1/v1", with the filter optional and the variant optional.
        Parameters:
        id - the id to be parsed
        pos - INPUT-OUTPUT parameter. On input, the position of the first character to parse. On output, the position after the last character parsed.
        Returns:
        a SingleID object or null if the parse fails
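        The INPUT-OUTPUT pos convention used by the parse methods can be illustrated with a minimal, self-contained sketch. The class PosCursorSketch and the helper parseToken are hypothetical names, not part of this class, and leaving pos[0] untouched on failure is an assumption of the sketch rather than documented behavior of this method.

```java
final class PosCursorSketch {
    /**
     * Hypothetical helper: reads a run of letters or digits starting at pos[0].
     * On success pos[0] is moved past the last character consumed.
     * On failure null is returned and pos[0] is left unchanged (sketch assumption).
     */
    static String parseToken(String id, int[] pos) {
        int start = pos[0];
        int i = start;
        while (i < id.length() && Character.isLetterOrDigit(id.charAt(i))) {
            i++;
        }
        if (i == start) {
            return null; // nothing consumed, so the cursor does not move
        }
        pos[0] = i;
        return id.substring(start, i);
    }

    public static void main(String[] args) {
        String id = "Latin-Greek";
        int[] pos = {0};
        String source = parseToken(id, pos); // consumes "Latin"
        pos[0]++;                            // step over the '-' separator
        String target = parseToken(id, pos); // consumes "Greek"
        System.out.println(source + " / " + target + " / end=" + pos[0]);
    }
}
```

        A one-element int array stands in for a by-reference integer, which lets a caller chain several parse steps over the same string without re-slicing it.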
      • parseSingleID

        public static TransliteratorIDParser.SingleID parseSingleID​(java.lang.String id,
                                                                    int[] pos,
                                                                    int dir)
        Parse a single ID, that is, an ID of the general form "[f1] s1-t1/v1 ([f2] s2-t2/v2)", with the parenthesized element optional, the filters optional, and the variants optional.
        Parameters:
        id - the id to be parsed
        pos - INPUT-OUTPUT parameter. On input, the position of the first character to parse. On output, the position after the last character parsed.
        dir - the direction. If the direction is REVERSE then the SingleID is constructed for the reverse direction.
        Returns:
        a SingleID object or null
      • parseGlobalFilter

        public static UnicodeSet parseGlobalFilter​(java.lang.String id,
                                                   int[] pos,
                                                   int dir,
                                                   int[] withParens,
                                                   java.lang.StringBuffer canonID)
        Parse a global filter of the form "[f]" or "([f])", depending on 'withParens'.
        Parameters:
        id - the pattern to parse
        pos - INPUT-OUTPUT parameter. On input, the position of the first character to parse. On output, the position after the last character parsed.
        dir - the direction.
        withParens - INPUT-OUTPUT parameter. On entry, if withParens[0] is 0, then parens are disallowed. If it is 1, then parens are required. If it is -1, then parens are optional, and on exit withParens[0] will be set to 0 or 1 to reflect whether parens were found.
        canonID - OUTPUT parameter. The pattern for the filter added to the canonID, either at the end, if dir is FORWARD, or at the start, if dir is REVERSE. The pattern will be enclosed in parentheses if appropriate, and will be suffixed with an ID_DELIM character. May be null.
        Returns:
        a UnicodeSet object or null. A non-null result indicates a successful parse, regardless of whether the filter applies to the given direction. The caller should discard it if withParens != (dir == REVERSE).
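        The three-state withParens contract can be sketched without any ICU types. The code below is an illustration only: GlobalFilterSketch and parseFilterPattern are hypothetical names, the method returns the raw pattern text rather than a UnicodeSet, and it finds the end of the set with a plain search for ']', so nested or escaped brackets are not handled.

```java
final class GlobalFilterSketch {
    /**
     * Hypothetical stand-in for a global filter parse.
     * withParens[0]: 0 = parens disallowed, 1 = parens required, -1 = optional.
     * When -1 is passed, withParens[0] is overwritten with 0 or 1 on success.
     * Returns the bracketed pattern text, or null if the parse fails.
     */
    static String parseFilterPattern(String id, int[] pos, int[] withParens) {
        int i = pos[0];
        boolean sawParen = i < id.length() && id.charAt(i) == '(';
        if (sawParen && withParens[0] == 0) return null;  // parens not allowed here
        if (!sawParen && withParens[0] == 1) return null; // parens were mandatory
        if (sawParen) i++;

        if (i >= id.length() || id.charAt(i) != '[') return null;
        int close = id.indexOf(']', i); // simplification: first ']' ends the set
        if (close < 0) return null;
        String pattern = id.substring(i, close + 1);
        i = close + 1;

        if (sawParen) {
            if (i >= id.length() || id.charAt(i) != ')') return null;
            i++;
        }
        if (withParens[0] == -1) {
            withParens[0] = sawParen ? 1 : 0; // report what was actually seen
        }
        pos[0] = i;
        return pattern;
    }

    public static void main(String[] args) {
        int[] pos = {0};
        int[] withParens = {-1};
        String pattern = parseFilterPattern("([:Greek:])", pos, withParens);
        System.out.println(pattern + " parens=" + withParens[0] + " end=" + pos[0]);
    }
}
```

        After a call with withParens[0] set to -1, a caller can compare the reported value with the direction to decide whether the filter applies, as the Returns note describes.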
      • parseCompoundID

        public static boolean parseCompoundID​(java.lang.String id,
                                              int dir,
                                              java.lang.StringBuffer canonID,
                                              java.util.List<TransliteratorIDParser.SingleID> list,
                                              UnicodeSet[] globalFilter)
        Parse a compound ID, consisting of an optional forward global filter, a separator, one or more single IDs delimited by separators, and an optional reverse global filter. The separator is a semicolon. The global filters are UnicodeSet patterns. The reverse global filter must be enclosed in parentheses.
        Parameters:
        id - the pattern to parse
        dir - the direction.
        canonID - OUTPUT parameter that receives the canonical ID, consisting of canonical IDs for all elements, as returned by parseSingleID(), separated by semicolons. Previous contents are discarded.
        list - OUTPUT parameter that receives a list of SingleID objects representing the parsed IDs. Previous contents are discarded.
        globalFilter - OUTPUT parameter that receives a newly created global filter for this ID in this direction, or null if there is none.
        Returns:
        true if the parse succeeds, that is, if the entire id is consumed without syntax error.
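        The overall shape of a compound ID can be sketched as a split on semicolons followed by a check of the first and last segments. This is a rough illustration under stated assumptions, not the parsing logic of this method: CompoundIdSketch and its methods are hypothetical, semicolons inside square brackets are treated as part of a UnicodeSet pattern, and escapes and quoting are ignored.

```java
import java.util.ArrayList;
import java.util.List;

final class CompoundIdSketch {
    /** Splits on ';' outside square brackets and trims each segment. */
    static List<String> splitSegments(String id) {
        List<String> segments = new ArrayList<>();
        int depth = 0; // current nesting depth of '[' ... ']'
        int start = 0;
        for (int i = 0; i < id.length(); i++) {
            char c = id.charAt(i);
            if (c == '[') {
                depth++;
            } else if (c == ']' && depth > 0) {
                depth--;
            } else if (c == ';' && depth == 0) {
                addIfNotBlank(segments, id.substring(start, i));
                start = i + 1;
            }
        }
        addIfNotBlank(segments, id.substring(start));
        return segments;
    }

    private static void addIfNotBlank(List<String> out, String s) {
        String trimmed = s.trim();
        if (!trimmed.isEmpty()) {
            out.add(trimmed);
        }
    }

    /** Sketch heuristic: a segment that is nothing but a bracketed set. */
    static boolean looksLikeForwardFilter(String segment) {
        return segment.startsWith("[") && segment.endsWith("]");
    }

    /** Sketch heuristic: a bracketed set wrapped in parentheses. */
    static boolean looksLikeReverseFilter(String segment) {
        return segment.startsWith("([") && segment.endsWith("])");
    }

    public static void main(String[] args) {
        List<String> parts = splitSegments("[:Latin:]; Latin-Greek; NFC; ([:Greek:])");
        for (String part : parts) {
            System.out.println(part
                    + " forward=" + looksLikeForwardFilter(part)
                    + " reverse=" + looksLikeReverseFilter(part));
        }
    }
}
```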
      • instantiateList

        static java.util.List<Transliterator> instantiateList​(java.util.List<TransliteratorIDParser.SingleID> ids)
        Returns the list of Transliterator objects for the given list of SingleID objects.
        Parameters:
        ids - list of SingleID objects.
        Returns:
        Actual transliterators for the list of SingleIDs
      • IDtoSTV

        public static java.lang.String[] IDtoSTV​(java.lang.String id)
        Parse an ID into pieces. Take IDs of the form T, T/V, S-T, S-T/V, or S/V-T. If the source is missing, return a source of ANY.
        Parameters:
        id - the id string, in any of several forms
        Returns:
        an array of 4 strings: source, target, variant, and isSourcePresent. If the source is not present, ANY will be given as the source, and isSourcePresent will be null. Otherwise isSourcePresent will be non-null. The target may be empty if the id is not well-formed. The variant may be empty.
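        The accepted forms (T, T/V, S-T, S-T/V, S/V-T) can be told apart by the relative positions of the first '-' and the first '/'. The following is a minimal sketch of that idea and not necessarily how this method is implemented. IdToStvSketch and idToStv are hypothetical names, the literal "Any" for the default source is an assumption taken from the STVtoID description, and an empty string is used as the non-null isSourcePresent marker.

```java
final class IdToStvSketch {
    /** Returns { source, target, variant, isSourcePresent } for a basic ID. */
    static String[] idToStv(String id) {
        String source = "Any";       // default when no source is given (assumption)
        String target;
        String variant;
        String sourcePresent = null; // stays null when the source is missing

        int sep = id.indexOf('-');   // source-target separator
        int var = id.indexOf('/');   // variant separator
        if (var < 0) {
            var = id.length();
        }

        if (sep < 0) {
            // Form T or T/V
            target = id.substring(0, var);
            variant = id.substring(var);
        } else if (sep < var) {
            // Form S-T or S-T/V
            if (sep > 0) {
                source = id.substring(0, sep);
                sourcePresent = "";
            }
            target = id.substring(sep + 1, var);
            variant = id.substring(var);
        } else {
            // Form S/V-T
            if (var > 0) {
                source = id.substring(0, var);
                sourcePresent = "";
            }
            variant = id.substring(var, sep);
            target = id.substring(sep + 1);
        }

        if (!variant.isEmpty()) {
            variant = variant.substring(1); // drop the leading '/'
        }
        return new String[] { source, target, variant, sourcePresent };
    }

    public static void main(String[] args) {
        for (String id : new String[] { "Null", "Latin-Greek/UNGEGN", "Latin/UNGEGN-Greek" }) {
            String[] stv = idToStv(id);
            System.out.println(id + " => source=" + stv[0] + " target=" + stv[1]
                    + " variant=" + stv[2] + " sourcePresent=" + (stv[3] != null));
        }
    }
}
```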
      • STVtoID

        public static java.lang.String STVtoID​(java.lang.String source,
                                               java.lang.String target,
                                               java.lang.String variant)
        Given source, target, and variant strings, concatenate them into a full ID. If the source is empty, then "Any" will be used for the source, so the ID will always be of the form s-t/v or s-t.
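        The concatenation rule described above is small enough to sketch directly. StvToIdSketch and stvToId are hypothetical names, and treating a null source or variant the same as an empty one is an assumption of the sketch.

```java
final class StvToIdSketch {
    /** Builds "s-t" or "s-t/v", substituting "Any" for an empty source. */
    static String stvToId(String source, String target, String variant) {
        StringBuilder id = new StringBuilder();
        // An empty (or, in this sketch, null) source falls back to "Any".
        id.append(source == null || source.isEmpty() ? "Any" : source);
        id.append('-').append(target);
        // The variant part is only emitted when a variant is present.
        if (variant != null && !variant.isEmpty()) {
            id.append('/').append(variant);
        }
        return id.toString();
    }

    public static void main(String[] args) {
        System.out.println(stvToId("", "NFC", ""));                // Any-NFC
        System.out.println(stvToId("Latin", "Greek", "UNGEGN"));   // Latin-Greek/UNGEGN
    }
}
```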
      • registerSpecialInverse

        public static void registerSpecialInverse​(java.lang.String target,
                                                  java.lang.String inverseTarget,
                                                  boolean bidirectional)
        Register two targets as being inverses of one another. For example, calling registerSpecialInverse("NFC", "NFD", true) causes Transliterator to form the following inverse relationships:
        NFC => NFD
         Any-NFC => Any-NFD
         NFD => NFC
         Any-NFD => Any-NFC
        (Without the special inverse registration, the inverse of NFC would be NFC-Any.) Note that NFD is shorthand for Any-NFD, but that the presence or absence of "Any-" is preserved.

        The relationship is symmetrical; registering (a, b) is equivalent to registering (b, a).

        The relevant IDs must still be registered separately as factories or classes.

        Only the targets are specified. Special inverses always have the form Any-Target1 <=> Any-Target2. The target should have canonical casing (the casing desired to be produced when an inverse is formed) and should contain no whitespace or other extraneous characters.

        Parameters:
        target - the target against which to register the inverse
        inverseTarget - the inverse of target, that is Any-target.getInverse() => Any-inverseTarget
        bidirectional - if true, register the reverse relation as well, that is, Any-inverseTarget.getInverse() => Any-target
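        The inverse relationships listed above can be modeled with a simple lookup table. This sketch is illustrative only: SpecialInverseSketch, register, and inverseOf are hypothetical names, the case-insensitive lookup is an assumption suggested by the note on canonical casing, and variants are not considered.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class SpecialInverseSketch {
    // Maps a lower-cased target to the canonically cased inverse target.
    private static final Map<String, String> INVERSES = new HashMap<>();

    static void register(String target, String inverseTarget, boolean bidirectional) {
        INVERSES.put(target.toLowerCase(Locale.ROOT), inverseTarget);
        if (bidirectional && !target.equalsIgnoreCase(inverseTarget)) {
            INVERSES.put(inverseTarget.toLowerCase(Locale.ROOT), target);
        }
    }

    /**
     * Returns the special inverse of an ID of the form "T" or "Any-T",
     * preserving the presence or absence of "Any-", or null if none is registered.
     */
    static String inverseOf(String id) {
        String prefix = "Any-";
        boolean hasAny = id.regionMatches(true, 0, prefix, 0, prefix.length());
        if (!hasAny && id.indexOf('-') >= 0) {
            return null; // a source other than Any has no special inverse
        }
        String target = hasAny ? id.substring(prefix.length()) : id;
        String inverse = INVERSES.get(target.toLowerCase(Locale.ROOT));
        if (inverse == null) {
            return null;
        }
        return hasAny ? prefix + inverse : inverse;
    }

    public static void main(String[] args) {
        register("NFC", "NFD", true);
        System.out.println("NFC => " + inverseOf("NFC"));
        System.out.println("Any-NFC => " + inverseOf("Any-NFC"));
        System.out.println("NFD => " + inverseOf("NFD"));
        System.out.println("Any-NFD => " + inverseOf("Any-NFD"));
    }
}
```

        Storing only targets in the table mirrors the rule that special inverses always relate Any-Target1 and Any-Target2.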
      • parseFilterID

        private static TransliteratorIDParser.Specs parseFilterID​(java.lang.String id,
                                                                  int[] pos,
                                                                  boolean allowFilter)
        Parse an ID into component pieces. Take IDs of the form T, T/V, S-T, S-T/V, or S/V-T. If the source is missing, return a source of ANY.
        Parameters:
        id - the id string, in any of several forms
        pos - INPUT-OUTPUT parameter. On input, pos[0] is the offset of the first character to parse in id. On output, pos[0] is the offset after the last parsed character. If the parse failed, pos[0] will be unchanged.
        allowFilter - if true, a UnicodeSet pattern is allowed at any location between specs or delimiters, and is returned in the Specs object.
        Returns:
        a Specs object, or null if the parse failed for any reason. If neither source nor target was seen in the parsed id, then the parse fails. If allowFilter is true, then the parsed filter pattern is returned in the Specs object; otherwise the returned filter reference is null.
      • specsToID

        private static TransliteratorIDParser.SingleID specsToID​(TransliteratorIDParser.Specs specs,
                                                                 int dir)
        Given a Specs object, convert it to a SingleID object. The Specs object is a more unprocessed parse result. The SingleID object contains information about canonical and basic IDs.
        Returns:
        a SingleID; never returns null. Returned object always has 'filter' field of null.
      • specsToSpecialInverse

        private static TransliteratorIDParser.SingleID specsToSpecialInverse​(TransliteratorIDParser.Specs specs)
        Given a Specs object, return a SingleID representing the special inverse of that ID. If there is no special inverse then return null.
        Returns:
        a SingleID or null. Returned object always has 'filter' field of null.