Class Utility


  • public final class Utility
    extends java.lang.Object
    • Field Summary

      Fields 
      Modifier and Type Field Description
      private static char APOSTROPHE  
      private static char BACKSLASH  
      (package private) static char[] DIGITS  
      private static char ESCAPE
      The ESCAPE character is used during run-length encoding.
      (package private) static byte ESCAPE_BYTE
      The ESCAPE_BYTE character is used during run-length encoding.
      (package private) static char[] HEX_DIGIT  
      static java.lang.String LINE_SEPARATOR  
      private static int MAGIC_UNSIGNED  
      private static char[] UNESCAPE_MAP  
    • Constructor Summary

      Constructors 
      Constructor Description
      Utility()  
    • Method Summary

      Modifier and Type Method Description
      private static int _digit16​(int c)  
      private static int _digit8​(int c)  
      static int addExact​(int x, int y)
      This implementation is equivalent to Java 8+ Math#addExact(int, int)
      private static <T extends java.lang.Appendable>
      void
      appendEncodedByte​(T buffer, byte value, byte[] state)
      Append a byte to the given Appendable, packing two bytes into each character.
      private static <T extends java.lang.Appendable>
      void
      appendInt​(T buffer, int value)  
      static <T extends java.lang.Appendable>
      T
      appendNumber​(T result, int n, int radix, int minDigits)
      Append a number to the given Appendable in the given radix.
      static <A extends java.lang.Appendable>
      A
      appendTo​(java.lang.CharSequence string, A appendable)
      Appends a CharSequence to an Appendable, converting IOException to ICUUncheckedIOException.
      static void appendToRule​(java.lang.StringBuffer rule, int c, boolean isLiteral, boolean escapeUnprintable, java.lang.StringBuffer quoteBuf)
      Append a character to a rule that is being built up.
      static void appendToRule​(java.lang.StringBuffer rule, UnicodeMatcher matcher, boolean escapeUnprintable, java.lang.StringBuffer quoteBuf)
      Given a matcher reference, which may be null, append its pattern as a literal to the given rule.
      static void appendToRule​(java.lang.StringBuffer rule, java.lang.String text, boolean isLiteral, boolean escapeUnprintable, java.lang.StringBuffer quoteBuf)
      Append the given string to the rule.
      static boolean arrayEquals​(byte[] source, java.lang.Object target)  
      static boolean arrayEquals​(double[] source, java.lang.Object target)
      Convenience utility to compare two double[]s. Ought to be in System.
      static boolean arrayEquals​(int[] source, java.lang.Object target)
      Convenience utility to compare two int[]s. Ought to be in System.
      static boolean arrayEquals​(java.lang.Object[] source, java.lang.Object target)
      Convenience utility to compare two Object[]s.
      static boolean arrayEquals​(java.lang.Object source, java.lang.Object target)
      Convenience utility to compare two Object[]s. Ought to be in System.
      static boolean arrayRegionMatches​(byte[] source, int sourceStart, byte[] target, int targetStart, int len)  
      static boolean arrayRegionMatches​(char[] source, int sourceStart, char[] target, int targetStart, int len)
      Convenience utility to compare two char[]s. Ought to be in System.
      static boolean arrayRegionMatches​(double[] source, int sourceStart, double[] target, int targetStart, int len)
      Convenience utility to compare two arrays of doubles.
      static boolean arrayRegionMatches​(int[] source, int sourceStart, int[] target, int targetStart, int len)
      Convenience utility to compare two int[]s.
      static boolean arrayRegionMatches​(java.lang.Object[] source, int sourceStart, java.lang.Object[] target, int targetStart, int len)
      Convenience utility to compare two Object[]s. Ought to be in System.
      static java.lang.String arrayToRLEString​(byte[] a)
      Construct a string representing a byte array.
      static java.lang.String arrayToRLEString​(char[] a)
      Construct a string representing a char array.
      static java.lang.String arrayToRLEString​(int[] a)
      Construct a string representing an int array.
      static java.lang.String arrayToRLEString​(short[] a)
      Construct a string representing a short array.
      static boolean charSequenceEquals​(java.lang.CharSequence a, java.lang.CharSequence b)
      Returns whether the chars in the two CharSequences are equal.
      static int charSequenceHashCode​(java.lang.CharSequence value)
      Returns a hash code for a CharSequence that is equivalent to calling charSequence.toString().hashCode()
      static <T extends java.lang.Comparable<T>>
      int
      checkCompare​(T a, T b)
      Convenience utility.
      static int checkHash​(java.lang.Object a)
      Convenience utility.
      private static int codePointAndLength​(int c, int length)  
      private static int codePointAndLength​(int c, int start, int limit)  
      static int compareUnsigned​(int source, int target)
      Compares 2 unsigned integers
      static int cpFromCodePointAndLength​(int cpAndLength)  
      private static <T extends java.lang.Appendable>
      void
      encodeRun​(T buffer, byte value, int length, byte[] state)
      Encode a run, possibly a degenerate run (of < 4 values).
      private static <T extends java.lang.Appendable>
      void
      encodeRun​(T buffer, int value, int length)
      Encode a run, possibly a degenerate run (of < 4 values).
      private static <T extends java.lang.Appendable>
      void
      encodeRun​(T buffer, short value, int length)
      Encode a run, possibly a degenerate run (of < 4 values).
      static java.lang.String escape​(java.lang.String s)
      Convert characters outside the range U+0020 to U+007F to Unicode escapes, and convert backslash to a double backslash.
      static <T extends java.lang.Appendable>
      T
      escape​(T result, int c)
      Escapes one code point using \uxxxx notation for U+0000 to U+FFFF and \Uxxxxxxxx for U+10000 and above.
      static <T extends java.lang.Appendable>
      boolean
      escapeUnprintable​(T result, int c)
      Escapes one unprintable code point using \uxxxx notation for U+0000 to U+FFFF and \Uxxxxxxxx for U+10000 and above.
      static java.lang.String format1ForSource​(java.lang.String s)
      Format a String for representation in a source file.
      static java.lang.String formatForSource​(java.lang.String s)
      Format a String for representation in a source file.
      static java.lang.String fromHex​(java.lang.String string, int minLength, java.lang.String separator)
      Parse a list of hex numbers and return a string
      static java.lang.String fromHex​(java.lang.String string, int minLength, java.util.regex.Pattern separator)
      Parse a list of hex numbers and return a string
      (package private) static int getInt​(java.lang.String s, int i)  
      static java.lang.String hex​(byte[] o, int start, int end, java.lang.String separator)  
      static java.lang.String hex​(long ch)
      Convert a char to 4 hex uppercase digits.
      static java.lang.String hex​(long i, int places)
      Supplies a zero-padded hex representation of an integer (without 0x)
      static java.lang.String hex​(java.lang.CharSequence s)
      Convert a string to comma-separated groups of 4 hex uppercase digits.
      static <S extends java.lang.CharSequence>
      java.lang.String
      hex​(S s, int width, S separator)
      Convert a string to groups of hex uppercase digits of the given width, joined by the given separator.
      static <S extends java.lang.CharSequence,​U extends java.lang.CharSequence,​T extends java.lang.Appendable>
      T
      hex​(S s, int width, U separator, boolean useCodePoints, T result)
      Convert a string to separated groups of hex uppercase digits.
      static byte highBit​(int n)
      Find the highest bit in a positive integer.
      static boolean isUnprintable​(int c)
      Return true if the character is NOT printable ASCII.
      static java.lang.String joinStrings​(java.lang.CharSequence delimiter, java.lang.Iterable<? extends java.lang.CharSequence> elements)
      Java 8+ String#join(CharSequence, Iterable) compatible method for Java 7 env.
      static int lengthFromCodePointAndLength​(int cpAndLength)  
      static int lookup​(java.lang.String source, java.lang.String[] target)
      Look up a given string in a string array.
      static boolean parseChar​(java.lang.String id, int[] pos, char ch)
      Parse a single non-whitespace character 'ch', optionally preceded by whitespace.
      static int parseInteger​(java.lang.String rule, int[] pos, int limit)
      Parse an integer at pos, either of the form \d+ or of the form 0x[0-9A-Fa-f]+ or 0[0-7]+, that is, in standard decimal, hex, or octal format.
      static int parseNumber​(java.lang.String text, int[] pos, int radix)
      Parse an unsigned 31-bit integer at the given offset.
      static int parsePattern​(java.lang.String rule, int pos, int limit, java.lang.String pattern, int[] parsedInts)
      Parse a pattern string starting at offset pos.
      static int parsePattern​(java.lang.String pat, Replaceable text, int index, int limit)
      Parse a pattern string within the given Replaceable and a parsing pattern.
      static java.lang.String parseUnicodeIdentifier​(java.lang.String str, int[] pos)
      Parse a Unicode identifier from the given string at the given position.
      static int quotedIndexOf​(java.lang.String text, int start, int limit, java.lang.String setOfChars)
      Returns the index of the first character in a set, ignoring quoted text.
      private static <T extends java.lang.Appendable>
      void
      recursiveAppendNumber​(T result, int n, int radix, int minDigits)
      Append the digits of a positive integer to the given Appendable in the given radix.
      static java.lang.String repeat​(java.lang.String s, int count)
      Utility to duplicate a string count times
      static byte[] RLEStringToByteArray​(java.lang.String s)
      Construct an array of bytes from a run-length encoded string.
      static char[] RLEStringToCharArray​(java.lang.String s)
      Construct an array of chars from a run-length encoded string.
      static int[] RLEStringToIntArray​(java.lang.String s)
      Construct an array of ints from a run-length encoded string.
      static short[] RLEStringToShortArray​(java.lang.String s)
      Construct an array of shorts from a run-length encoded string.
      static boolean sameObjects​(java.lang.Object a, java.lang.Object b)
      Trivial reference equality.
      static boolean shouldAlwaysBeEscaped​(int c)  
      static java.lang.String[] split​(java.lang.String s, char divider)
      Split a string into pieces based on the given divider character
      static void split​(java.lang.String s, char divider, java.lang.String[] output)
      Split a string into pieces based on the given divider character
      static java.lang.String[] splitString​(java.lang.String src, java.lang.String target)  
      static java.lang.String[] splitWhitespace​(java.lang.String src)
      Split the string at runs of ascii whitespace characters.
      static java.lang.String unescape​(java.lang.CharSequence s)
      Convert all escapes in a given string using unescapeAndLengthAt().
      static int unescapeAndLengthAt​(java.lang.CharSequence s, int offset)
      Converts an escape to a code point value.
      private static int unescapeAndLengthAt​(java.lang.CharSequence s, int offset, int length)  
      static java.lang.String unescapeLeniently​(java.lang.CharSequence s)
      Convert all escapes in a given string using unescapeAndLengthAt().
      static java.lang.String valueOf​(int[] source)
      Utility method to take an int[] containing code points and return a string representation with code units.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • ESCAPE

        private static final char ESCAPE
        The ESCAPE character is used during run-length encoding. It signals a run of identical chars.
        See Also:
        Constant Field Values
      • ESCAPE_BYTE

        static final byte ESCAPE_BYTE
        The ESCAPE_BYTE character is used during run-length encoding. It signals a run of identical bytes.
        See Also:
        Constant Field Values
      • LINE_SEPARATOR

        public static java.lang.String LINE_SEPARATOR
      • HEX_DIGIT

        static final char[] HEX_DIGIT
      • UNESCAPE_MAP

        private static final char[] UNESCAPE_MAP
      • DIGITS

        static final char[] DIGITS
    • Constructor Detail

      • Utility

        public Utility()
    • Method Detail

      • arrayEquals

        public static final boolean arrayEquals​(java.lang.Object[] source,
                                                java.lang.Object target)
        Convenience utility to compare two Object[]s. Ought to be in System.
      • arrayEquals

        public static final boolean arrayEquals​(int[] source,
                                                java.lang.Object target)
        Convenience utility to compare two int[]s. Ought to be in System.
      • arrayEquals

        public static final boolean arrayEquals​(double[] source,
                                                java.lang.Object target)
        Convenience utility to compare two double[]s. Ought to be in System.
      • arrayEquals

        public static final boolean arrayEquals​(byte[] source,
                                                java.lang.Object target)
      • arrayEquals

        public static final boolean arrayEquals​(java.lang.Object source,
                                                java.lang.Object target)
        Convenience utility to compare two Object[]s. Ought to be in System.
      • arrayRegionMatches

        public static final boolean arrayRegionMatches​(java.lang.Object[] source,
                                                       int sourceStart,
                                                       java.lang.Object[] target,
                                                       int targetStart,
                                                       int len)
        Convenience utility to compare two Object[]s. Ought to be in System.
        Parameters:
        len - the length to compare. The start indices and start+len must be valid.
      • arrayRegionMatches

        public static final boolean arrayRegionMatches​(char[] source,
                                                       int sourceStart,
                                                       char[] target,
                                                       int targetStart,
                                                       int len)
        Convenience utility to compare two char[]s. Ought to be in System.
        Parameters:
        len - the length to compare. The start indices and start+len must be valid.
      • arrayRegionMatches

        public static final boolean arrayRegionMatches​(int[] source,
                                                       int sourceStart,
                                                       int[] target,
                                                       int targetStart,
                                                       int len)
        Convenience utility to compare two int[]s. Ought to be in System.
        Parameters:
        len - the length to compare. The start indices and start+len must be valid.
      • arrayRegionMatches

        public static final boolean arrayRegionMatches​(double[] source,
                                                       int sourceStart,
                                                       double[] target,
                                                       int targetStart,
                                                       int len)
        Convenience utility to compare two arrays of doubles. Ought to be in System.
        Parameters:
        len - the length to compare. The start indices and start+len must be valid.
      • arrayRegionMatches

        public static final boolean arrayRegionMatches​(byte[] source,
                                                       int sourceStart,
                                                       byte[] target,
                                                       int targetStart,
                                                       int len)
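
        The arrayEquals and arrayRegionMatches overloads above follow one pattern: a null and type check, a length check, then an element-wise comparison over a region. The following is a minimal sketch of that pattern for int[], not the library's code. The class name ArrayCompareSketch is hypothetical, and the null handling (null equals only null) is an assumption.

```java
final class ArrayCompareSketch {
    // Compares len elements starting at the given offsets. The caller must
    // ensure that both start indices and start + len are within bounds.
    static boolean regionMatches(int[] source, int sourceStart,
                                 int[] target, int targetStart, int len) {
        for (int i = 0; i < len; i++) {
            if (source[sourceStart + i] != target[targetStart + i]) {
                return false;
            }
        }
        return true;
    }

    // Assumption: null equals only null, and a target of any other type
    // than int[] never matches.
    static boolean arrayEquals(int[] source, Object target) {
        if (source == null) {
            return target == null;
        }
        if (!(target instanceof int[])) {
            return false;
        }
        int[] t = (int[]) target;
        return source.length == t.length
                && regionMatches(source, 0, t, 0, source.length);
    }
}
```

        Taking the target as Object lets one call site compare against a value whose static type is unknown, at the cost of a runtime type check.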
      • sameObjects

        public static final boolean sameObjects​(java.lang.Object a,
                                                java.lang.Object b)
        Trivial reference equality. This method should help document that we really want == not equals(), and to have a single place to suppress warnings from static analysis tools.
      • checkCompare

        public static <T extends java.lang.Comparable<T>> int checkCompare​(T a,
                                                                           T b)
        Convenience utility. Does null checks on objects, then calls compare.
      • checkHash

        public static int checkHash​(java.lang.Object a)
        Convenience utility. Does null checks on object, then calls hashCode.
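
        A minimal sketch of what null-checked comparison and hashing can look like. The ordering of null (equal to null, before any non-null value) and the hash of null (0) are assumptions for illustration, and NullSafeSketch is a hypothetical class name.

```java
final class NullSafeSketch {
    // Assumption: null compares equal to null and sorts before any
    // non-null value; otherwise defer to compareTo.
    static <T extends Comparable<T>> int checkCompare(T a, T b) {
        if (a == null) {
            return b == null ? 0 : -1;
        }
        return b == null ? 1 : a.compareTo(b);
    }

    // Assumption: null hashes to 0; otherwise defer to hashCode.
    static int checkHash(Object a) {
        return a == null ? 0 : a.hashCode();
    }
}
```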
      • arrayToRLEString

        public static final java.lang.String arrayToRLEString​(int[] a)
        Construct a string representing an int array. Use run-length encoding. A character represents itself, unless it is the ESCAPE character. Then the following notations are possible:
          ESCAPE ESCAPE   ESCAPE literal
          ESCAPE n c      n instances of character c
        Since an encoded run occupies 3 characters, we only encode runs of 4 or more characters. Thus we have n > 0 and n != ESCAPE and n <= 0xFFFF. If we encounter a run where n == ESCAPE, we represent this as:
          c ESCAPE n-1 c
        The ESCAPE value is chosen so as not to collide with commonly seen values.
      • arrayToRLEString

        public static final java.lang.String arrayToRLEString​(short[] a)
        Construct a string representing a short array. Use run-length encoding. A character represents itself, unless it is the ESCAPE character. Then the following notations are possible:
          ESCAPE ESCAPE   ESCAPE literal
          ESCAPE n c      n instances of character c
        Since an encoded run occupies 3 characters, we only encode runs of 4 or more characters. Thus we have n > 0 and n != ESCAPE and n <= 0xFFFF. If we encounter a run where n == ESCAPE, we represent this as:
          c ESCAPE n-1 c
        The ESCAPE value is chosen so as not to collide with commonly seen values.
      • arrayToRLEString

        public static final java.lang.String arrayToRLEString​(char[] a)
        Construct a string representing a char array. Use run-length encoding. A character represents itself, unless it is the ESCAPE character. Then the following notations are possible:
          ESCAPE ESCAPE   ESCAPE literal
          ESCAPE n c      n instances of character c
        Since an encoded run occupies 3 characters, we only encode runs of 4 or more characters. Thus we have n > 0 and n != ESCAPE and n <= 0xFFFF. If we encounter a run where n == ESCAPE, we represent this as:
          c ESCAPE n-1 c
        The ESCAPE value is chosen so as not to collide with commonly seen values.
      • arrayToRLEString

        public static final java.lang.String arrayToRLEString​(byte[] a)
        Construct a string representing a byte array. Use run-length encoding. Two bytes are packed into a single char, with a single extra zero byte at the end if needed. A byte represents itself, unless it is the ESCAPE_BYTE. Then the following notations are possible:
          ESCAPE_BYTE ESCAPE_BYTE   ESCAPE_BYTE literal
          ESCAPE_BYTE n b           n instances of byte b
        Since an encoded run occupies 3 bytes, we only encode runs of 4 or more bytes. Thus we have n > 0 and n != ESCAPE_BYTE and n <= 0xFF. If we encounter a run where n == ESCAPE_BYTE, we represent this as:
          b ESCAPE_BYTE n-1 b
        The ESCAPE_BYTE value is chosen so as not to collide with commonly seen values.
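
        The run-length scheme described for char arrays can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the ESCAPE value 0xA5A5 is assumed (the real constant is private), the class RleSketch is hypothetical, and the sketch omits any length header or other framing the real methods may add.

```java
final class RleSketch {
    // Assumed escape value, for illustration only.
    static final char ESCAPE = '\uA5A5';

    static String encode(char[] a) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < a.length) {
            char c = a[i];
            int n = 1;
            // Measure the run, capped so that n fits in one char.
            while (i + n < a.length && a[i + n] == c && n < 0xFFFF) {
                n++;
            }
            appendRun(out, c, n);
            i += n;
        }
        return out.toString();
    }

    static void appendRun(StringBuilder out, char c, int n) {
        if (n < 4) {
            // Degenerate run: an encoded run costs 3 chars, so emit literals.
            for (int k = 0; k < n; k++) {
                appendLiteral(out, c);
            }
            return;
        }
        if (n == ESCAPE) {
            // A count equal to ESCAPE would read as "ESCAPE ESCAPE", so emit
            // one literal first and encode the remaining n-1 values.
            appendLiteral(out, c);
            n--;
        }
        out.append(ESCAPE).append((char) n).append(c);
    }

    static void appendLiteral(StringBuilder out, char c) {
        if (c == ESCAPE) {
            out.append(ESCAPE); // "ESCAPE ESCAPE" stands for a literal ESCAPE
        }
        out.append(c);
    }

    static char[] decode(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i++);
            if (c != ESCAPE) {
                out.append(c);
                continue;
            }
            char n = s.charAt(i++);
            if (n == ESCAPE) {
                out.append(ESCAPE);
                continue;
            }
            char value = s.charAt(i++);
            for (int k = 0; k < n; k++) {
                out.append(value);
            }
        }
        return out.toString().toCharArray();
    }
}
```

        With this sketch, the six chars "aaaaab" encode to four chars: ESCAPE, the count 5, 'a', and then the literal 'b'.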
      • encodeRun

        private static final <T extends java.lang.Appendable> void encodeRun​(T buffer,
                                                                             int value,
                                                                             int length)
        Encode a run, possibly a degenerate run (of < 4 values).
        Parameters:
        length - The length of the run; must be > 0 && <= 0xFFFF.
      • appendInt

        private static final <T extends java.lang.Appendable> void appendInt​(T buffer,
                                                                             int value)
      • encodeRun

        private static final <T extends java.lang.Appendable> void encodeRun​(T buffer,
                                                                             short value,
                                                                             int length)
        Encode a run, possibly a degenerate run (of < 4 values).
        Parameters:
        length - The length of the run; must be > 0 && <= 0xFFFF.
      • encodeRun

        private static final <T extends java.lang.Appendable> void encodeRun​(T buffer,
                                                                             byte value,
                                                                             int length,
                                                                             byte[] state)
        Encode a run, possibly a degenerate run (of < 4 values).
        Parameters:
        length - The length of the run; must be > 0 && <= 0xFF.
      • appendEncodedByte

        private static final <T extends java.lang.Appendable> void appendEncodedByte​(T buffer,
                                                                                     byte value,
                                                                                     byte[] state)
        Append a byte to the given Appendable, packing two bytes into each character. The state parameter maintains intermediary data between calls.
        Parameters:
        state - A two-element array, with state[0] == 0 if this is the first byte of a pair, or state[0] != 0 if this is the second byte of a pair, in which case state[1] is the first byte.
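
        The two-element state array described above can be sketched like this. The choice to place the first byte of each pair in the high 8 bits is an assumption, as are the class name BytePackSketch and the helper pack; a StringBuilder stands in for the generic Appendable.

```java
final class BytePackSketch {
    // state[0] == 0: value is the first byte of a pair, so store it.
    // state[0] != 0: value is the second byte; state[1] holds the first.
    // Assumption: the first byte lands in the high 8 bits of the char.
    static void appendEncodedByte(StringBuilder buffer, byte value, byte[] state) {
        if (state[0] != 0) {
            char c = (char) (((state[1] & 0xFF) << 8) | (value & 0xFF));
            buffer.append(c);
            state[0] = 0;
        } else {
            state[0] = 1;
            state[1] = value;
        }
    }

    // Hypothetical helper: packs a whole array, padding an odd count with a
    // single zero byte so that the last byte is not lost.
    static String pack(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        byte[] state = new byte[2];
        for (byte b : bytes) {
            appendEncodedByte(sb, b, state);
        }
        if (state[0] != 0) {
            appendEncodedByte(sb, (byte) 0, state);
        }
        return sb.toString();
    }
}
```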
      • RLEStringToIntArray

        public static final int[] RLEStringToIntArray​(java.lang.String s)
        Construct an array of ints from a run-length encoded string.
      • getInt

        static final int getInt​(java.lang.String s,
                                int i)
      • RLEStringToShortArray

        public static final short[] RLEStringToShortArray​(java.lang.String s)
        Construct an array of shorts from a run-length encoded string.
      • RLEStringToCharArray

        public static final char[] RLEStringToCharArray​(java.lang.String s)
        Construct an array of chars from a run-length encoded string.
      • RLEStringToByteArray

        public static final byte[] RLEStringToByteArray​(java.lang.String s)
        Construct an array of bytes from a run-length encoded string.
      • formatForSource

        public static final java.lang.String formatForSource​(java.lang.String s)
        Format a String for representation in a source file. This includes breaking it into lines and escaping characters using octal notation when necessary (control characters and double quotes).
      • format1ForSource

        public static final java.lang.String format1ForSource​(java.lang.String s)
        Format a String for representation in a source file. Like formatForSource but does not do line breaking.
      • escape

        public static final java.lang.String escape​(java.lang.String s)
        Convert characters outside the range U+0020 to U+007F to Unicode escapes, and convert backslash to a double backslash.
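
        A minimal sketch of the escaping rule just described, assuming \uXXXX is used up to U+FFFF and \UXXXXXXXX above it, with uppercase hex digits. EscapeSketch is a hypothetical name and this is not the library's code.

```java
final class EscapeSketch {
    static String escape(String s) {
        StringBuilder buf = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int c = s.codePointAt(i);
            i += Character.charCount(c);
            if (c >= 0x20 && c <= 0x7F) {
                if (c == '\\') {
                    buf.append("\\\\"); // one backslash becomes two
                } else {
                    buf.append((char) c);
                }
            } else if (c <= 0xFFFF) {
                // Assumption: 4 uppercase hex digits after backslash-u.
                buf.append("\\u").append(String.format("%04X", c));
            } else {
                // Assumption: 8 uppercase hex digits after backslash-U.
                buf.append("\\U").append(String.format("%08X", c));
            }
        }
        return buf.toString();
    }
}
```

        Iterating by code point rather than by char keeps a supplementary character as one escape instead of two surrogate escapes.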
      • _digit8

        private static final int _digit8​(int c)
      • _digit16

        private static final int _digit16​(int c)
      • unescapeAndLengthAt

        public static int unescapeAndLengthAt​(java.lang.CharSequence s,
                                              int offset)
        Converts an escape to a code point value. We attempt to parallel the icu4c unescapeAt() function. This function returns an integer with both the code point (bits 28..8) and the length of the escape sequence (bits 7..0). offset+length is the index after the escape sequence.
        Parameters:
        offset - the offset to the character after the backslash.
        Returns:
        the code point and length, or -1 on error.
      • unescapeAndLengthAt

        private static int unescapeAndLengthAt​(java.lang.CharSequence s,
                                               int offset,
                                               int length)
      • codePointAndLength

        private static int codePointAndLength​(int c,
                                              int length)
      • codePointAndLength

        private static int codePointAndLength​(int c,
                                              int start,
                                              int limit)
      • cpFromCodePointAndLength

        public static int cpFromCodePointAndLength​(int cpAndLength)
      • lengthFromCodePointAndLength

        public static int lengthFromCodePointAndLength​(int cpAndLength)
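
        The packed return value of unescapeAndLengthAt (code point in bits 28..8, escape length in bits 7..0) can be built and taken apart with shifts and masks. The sketch below illustrates that layout only; the class and method names are hypothetical.

```java
final class CodePointAndLengthSketch {
    // Packs a code point into bits 28..8 and a length into bits 7..0.
    static int pack(int codePoint, int length) {
        return (codePoint << 8) | (length & 0xFF);
    }

    // Extracts the code point from the upper bits.
    static int codePoint(int cpAndLength) {
        return cpAndLength >> 8;
    }

    // Extracts the escape-sequence length from the low byte.
    static int length(int cpAndLength) {
        return cpAndLength & 0xFF;
    }
}
```

        Because the largest code point, U+10FFFF, needs only 21 bits, every packed value is non-negative, which leaves -1 free as the error result.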
      • unescape

        public static java.lang.String unescape​(java.lang.CharSequence s)
        Convert all escapes in a given string using unescapeAndLengthAt().
        Throws:
        java.lang.IllegalArgumentException - if an invalid escape is seen.
      • unescapeLeniently

        public static java.lang.String unescapeLeniently​(java.lang.CharSequence s)
        Convert all escapes in a given string using unescapeAndLengthAt(). Leave invalid escape sequences unchanged.
      • hex

        public static java.lang.String hex​(long ch)
        Convert a char to 4 hex uppercase digits. E.g., hex('A') => "0041".
      • hex

        public static java.lang.String hex​(long i,
                                           int places)
        Supplies a zero-padded hex representation of an integer (without 0x)
      • hex

        public static java.lang.String hex​(java.lang.CharSequence s)
        Convert a string to comma-separated groups of 4 hex uppercase digits. E.g., hex("AB") => "0041,0042".
      • hex

        public static <S extends java.lang.CharSequence,​U extends java.lang.CharSequence,​T extends java.lang.Appendable> T hex​(S s,
                                                                                                                                           int width,
                                                                                                                                           U separator,
                                                                                                                                           boolean useCodePoints,
                                                                                                                                           T result)
        Convert a string to separated groups of hex uppercase digits. E.g., hex("AB"...) => "0041,0042". Append the output to the given Appendable.
      • hex

        public static java.lang.String hex​(byte[] o,
                                           int start,
                                           int end,
                                           java.lang.String separator)
      • hex

        public static <S extends java.lang.CharSequence> java.lang.String hex​(S s,
                                                                              int width,
                                                                              S separator)
        Convert a string to groups of hex uppercase digits of the given width, joined by the given separator. E.g., hex("AB", 4, ",") => "0041,0042".
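
        The hex overloads amount to zero-padded uppercase hex formatting plus joining. A minimal sketch, assuming non-negative input and one group per code point; HexSketch is a hypothetical class and does not reproduce the library's handling of negative values or of the useCodePoints flag.

```java
final class HexSketch {
    // Zero-pads the uppercase hex form of a non-negative value to at least
    // the requested number of places, without a 0x prefix.
    static String hex(long value, int places) {
        StringBuilder sb = new StringBuilder(
                Long.toHexString(value).toUpperCase(java.util.Locale.ROOT));
        while (sb.length() < places) {
            sb.insert(0, '0');
        }
        return sb.toString();
    }

    // Formats each code point of s and joins the groups with the separator.
    static String hex(CharSequence s, int width, CharSequence separator) {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = Character.codePointAt(s, i);
            i += Character.charCount(cp);
            if (result.length() > 0) {
                result.append(separator);
            }
            result.append(hex(cp, width));
        }
        return result.toString();
    }
}
```

        In this sketch hex('A', 4) yields "0041" and hex("AB", 4, ",") yields "0041,0042", since 'A' is U+0041 and 'B' is U+0042.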
      • split

        public static void split​(java.lang.String s,
                                 char divider,
                                 java.lang.String[] output)
        Split a string into pieces based on the given divider character
        Parameters:
        s - the string to split
        divider - the character on which to split. Occurrences of this character are not included in the output
        output - an array to receive the substrings between instances of divider. It must be large enough on entry to accommodate all output. Adjacent instances of the divider character will place empty strings into output. Before returning, output is padded out with empty strings.
      • split

        public static java.lang.String[] split​(java.lang.String s,
                                               char divider)
        Split a string into pieces based on the given divider character
        Parameters:
        s - the string to split
        divider - the character on which to split. Occurrences of this character are not included in the output
        Returns:
        an array of the substrings between instances of divider. Adjacent instances of the divider character produce empty strings in the array.
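
        A minimal sketch of divider-based splitting in which adjacent dividers yield empty strings. That the text after the last divider is always emitted, even when empty, is an assumption of this sketch; SplitSketch is a hypothetical name.

```java
final class SplitSketch {
    static String[] split(String s, char divider) {
        java.util.List<String> out = new java.util.ArrayList<>();
        int last = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == divider) {
                // The divider itself is not included in the piece.
                out.add(s.substring(last, i));
                last = i + 1;
            }
        }
        // Assumption: the trailing piece is always emitted, even if empty.
        out.add(s.substring(last));
        return out.toArray(new String[0]);
    }
}
```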
      • lookup

        public static int lookup​(java.lang.String source,
                                 java.lang.String[] target)
        Look up a given string in a string array. Returns the index at which the first occurrence of the string was found in the array, or -1 if it was not found.
        Parameters:
        source - the string to search for
        target - the array of zero or more strings in which to look for source
        Returns:
        the index of target at which source first occurs, or -1 if not found
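
        The lookup contract is a first-match linear search, sketched below with a hypothetical class name. The sketch assumes source is non-null.

```java
final class LookupSketch {
    // Returns the index of the first element equal to source, or -1.
    static int lookup(String source, String[] target) {
        for (int i = 0; i < target.length; i++) {
            if (source.equals(target[i])) {
                return i;
            }
        }
        return -1;
    }
}
```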
      • parseChar

        public static boolean parseChar​(java.lang.String id,
                                        int[] pos,
                                        char ch)
        Parse a single non-whitespace character 'ch', optionally preceded by whitespace.
        Parameters:
        id - the string to be parsed
        pos - INPUT-OUTPUT parameter. On input, pos[0] is the offset of the first character to be parsed. On output, pos[0] is the index after the last parsed character. If the parse fails, pos[0] will be unchanged.
        ch - the non-whitespace character to be parsed.
        Returns:
        true if 'ch' is seen preceded by zero or more whitespace characters.
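
        The INPUT-OUTPUT convention for pos can be sketched as follows: the offset advances only on success. Character.isWhitespace is used here as a stand-in for whatever whitespace definition the real method applies, and ParseCharSketch is a hypothetical name.

```java
final class ParseCharSketch {
    static boolean parseChar(String id, int[] pos, char ch) {
        int p = pos[0];
        // Skip optional leading whitespace (assumed definition, see lead-in).
        while (p < id.length() && Character.isWhitespace(id.charAt(p))) {
            p++;
        }
        if (p == id.length() || id.charAt(p) != ch) {
            return false; // failure: pos[0] is left unchanged
        }
        pos[0] = p + 1;   // success: index after the parsed character
        return true;
    }
}
```

        Working on a local copy of the offset and writing it back only on success is what keeps pos[0] unchanged when the parse fails.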
      • parsePattern

        public static int parsePattern​(java.lang.String rule,
                                       int pos,
                                       int limit,
                                       java.lang.String pattern,
                                       int[] parsedInts)
        Parse a pattern string starting at offset pos. Keywords are matched case-insensitively. Spaces may be skipped and may be optional or required. Integer values may be parsed, and if they are, they will be returned in the given array. If successful, the offset of the next non-space character is returned. On failure, -1 is returned.
        Parameters:
        pattern - must only contain lowercase characters, which will match their uppercase equivalents as well. A space character matches one or more required spaces. A '~' character matches zero or more optional spaces. A '#' character matches an integer and stores it in parsedInts, which the caller must ensure has enough capacity.
        parsedInts - array to receive parsed integers. Caller must ensure that parsedInts.length is >= the number of '#' signs in 'pattern'.
        Returns:
        the position after the last character parsed, or -1 if the parse failed
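
        The pattern language described above (lowercase literals, a space for required whitespace, '~' for optional whitespace, '#' for an integer) can be sketched as a single loop over the pattern. This sketch is a simplification under stated assumptions: '#' accepts only decimal digits and ignores overflow, Character.isWhitespace stands in for the real whitespace test, and the names are hypothetical.

```java
final class ParsePatternSketch {
    static int parsePattern(String rule, int pos, int limit,
                            String pattern, int[] parsedInts) {
        int intCount = 0;
        for (int i = 0; i < pattern.length(); i++) {
            char cpat = pattern.charAt(i);
            switch (cpat) {
            case ' ':
                // Required whitespace: at least one, then skip the rest.
                if (pos >= limit || !Character.isWhitespace(rule.charAt(pos))) {
                    return -1;
                }
                pos = skipSpaces(rule, pos, limit);
                break;
            case '~':
                // Optional whitespace: zero or more.
                pos = skipSpaces(rule, pos, limit);
                break;
            case '#': {
                // Assumption: decimal digits only, no overflow handling.
                int start = pos;
                int value = 0;
                while (pos < limit && Character.isDigit(rule.charAt(pos))) {
                    value = value * 10 + Character.digit(rule.charAt(pos), 10);
                    pos++;
                }
                if (pos == start) {
                    return -1;
                }
                parsedInts[intCount++] = value;
                break;
            }
            default:
                // Literal: the pattern is lowercase, the input may be either case.
                if (pos >= limit
                        || Character.toLowerCase(rule.charAt(pos)) != cpat) {
                    return -1;
                }
                pos++;
                break;
            }
        }
        return pos;
    }

    static int skipSpaces(String s, int pos, int limit) {
        while (pos < limit && Character.isWhitespace(s.charAt(pos))) {
            pos++;
        }
        return pos;
    }
}
```

        For example, matching the pattern "range #" against "Range  42;" stores 42 in parsedInts[0] and returns 9, the index of the semicolon.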
      • parsePattern

        public static int parsePattern​(java.lang.String pat,
                                       Replaceable text,
                                       int index,
                                       int limit)
        Parse a pattern string within the given Replaceable and a parsing pattern. Characters are matched literally and case-sensitively except for the following special characters:
          ~   zero or more Pattern_White_Space chars
        If the end of the pattern is reached with all matches along the way, the first unparsed index is returned. Otherwise -1 is returned.
        Parameters:
        pat - pattern that controls parsing
        text - text to be parsed, starting at index
        index - offset to first character to parse
        limit - offset after last character to parse
        Returns:
        index after last parsed character, or -1 on parse failure.
      • parseInteger

        public static int parseInteger​(java.lang.String rule,
                                       int[] pos,
                                       int limit)
        Parse an integer at pos, either of the form \d+ or of the form 0x[0-9A-Fa-f]+ or 0[0-7]+, that is, in standard decimal, hex, or octal format.
        Parameters:
        pos - INPUT-OUTPUT parameter. On input, pos[0] is the offset of the first character to parse. On output, pos[0] is the offset after the last parsed character.
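        The three accepted forms can be illustrated with a standalone sketch. It is an approximation written for this page, not the library's code: the class name ParseIntegerSketch is hypothetical, Character.digit stands in for whatever digit lookup the library uses, and the sketch does not guard against overflow.

```java
final class ParseIntegerSketch {
    /** Parses decimal, 0x-prefixed hex, or 0-prefixed octal digits starting at pos[0]. */
    static int parseInteger(String rule, int[] pos, int limit) {
        int p = pos[0];
        int radix = 10;
        int count = 0; // number of digits consumed, used to detect "nothing parsed"
        if (p + 2 <= limit && rule.regionMatches(true, p, "0x", 0, 2)) {
            p += 2;
            radix = 16;
        } else if (p < limit && rule.charAt(p) == '0') {
            p++;
            count = 1; // the leading zero is itself a valid digit
            radix = 8;
        }
        int value = 0;
        while (p < limit) {
            int d = Character.digit(rule.charAt(p), radix);
            if (d < 0) {
                break;
            }
            value = value * radix + d; // assumption: the input fits in an int
            p++;
            count++;
        }
        if (count > 0) {
            pos[0] = p; // advance only when at least one digit was parsed
        }
        return value;
    }

    public static void main(String[] args) {
        int[] pos = {0};
        System.out.println(parseInteger("0x1F;", pos, 5) + " " + pos[0]);
    }
}
```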
      • parseUnicodeIdentifier

        public static java.lang.String parseUnicodeIdentifier​(java.lang.String str,
                                                              int[] pos)
        Parse a Unicode identifier from the given string at the given position. Return the identifier, or null if there is no identifier.
        Parameters:
        str - the string to parse
        pos - INPUT-OUTPUT parameter. On INPUT, pos[0] is the offset of the first character to examine. It must be less than str.length(), and it must not point to a whitespace character. On OUTPUT, pos[0] is the position after the last parsed character.
        Returns:
        the Unicode identifier, or null if there is no valid identifier at pos[0].
      • recursiveAppendNumber

        private static <T extends java.lang.Appendable> void recursiveAppendNumber​(T result,
                                                                                   int n,
                                                                                   int radix,
                                                                                   int minDigits)
        Append the digits of a positive integer to the given Appendable in the given radix. This is done recursively since it is easiest to generate the low-order digit first, but it must be appended last.
        Parameters:
        result - is the Appendable to append to
        n - is the positive integer
        radix - is the radix, from 2 to 36 inclusive
        minDigits - is the minimum number of digits to append.
      • appendNumber

        public static <T extends java.lang.Appendable> T appendNumber​(T result,
                                                                      int n,
                                                                      int radix,
                                                                      int minDigits)
        Append a number to the given Appendable in the given radix. Standard digits '0'-'9' are used and letters 'A'-'Z' for radices 11 through 36.
        Parameters:
        result - the digits of the number are appended here
        n - the number to be converted to digits; may be negative. If negative, a '-' is prepended to the digits.
        radix - a radix from 2 to 36 inclusive.
        minDigits - the minimum number of digits, not including any '-', to produce. Values less than 2 have no effect. One digit is always emitted regardless of this parameter.
        Returns:
        a reference to result
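        The recursive digit generation and the minDigits padding can be sketched as follows. This is an illustrative approximation rather than the library's code: the class name AppendNumberSketch is hypothetical, a StringBuilder replaces the generic Appendable so that no IOException handling is needed, and a long is used internally so that Integer.MIN_VALUE can be negated safely.

```java
final class AppendNumberSketch {
    private static final char[] DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ".toCharArray();

    static StringBuilder appendNumber(StringBuilder result, int n, int radix, int minDigits) {
        if (radix < 2 || radix > 36) {
            throw new IllegalArgumentException("Illegal radix " + radix);
        }
        long abs = n;
        if (abs < 0) {
            abs = -abs;
            result.append('-'); // the sign does not count toward minDigits
        }
        recursiveAppend(result, abs, radix, minDigits);
        return result;
    }

    // The low-order digit is computed first but appended last, after the recursion returns.
    private static void recursiveAppend(StringBuilder result, long n, int radix, int minDigits) {
        int digit = (int) (n % radix);
        if (n >= radix || minDigits > 1) {
            recursiveAppend(result, n / radix, radix, minDigits - 1);
        }
        result.append(DIGITS[digit]);
    }

    public static void main(String[] args) {
        System.out.println(appendNumber(new StringBuilder(), 255, 16, 4));
    }
}
```

        With these definitions, 255 in radix 16 with minDigits 4 is padded with leading zeros, while minDigits values below 2 leave the output unchanged.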
      • parseNumber

        public static int parseNumber​(java.lang.String text,
                                      int[] pos,
                                      int radix)
        Parse an unsigned 31-bit integer at the given offset. Use UCharacter.digit() to parse individual characters into digits.
        Parameters:
        text - the text to be parsed
        pos - INPUT-OUTPUT parameter. On entry, pos[0] is the offset within text at which to start parsing; it should point to a valid digit. On exit, pos[0] is the offset after the last parsed character. If the parse failed, it will be unchanged on exit. Must be >= 0 on entry.
        radix - the radix in which to parse; must be >= 2 and <= 36.
        Returns:
        a non-negative parsed number, or -1 upon parse failure. Parse fails if there are no digits, that is, if pos[0] does not point to a valid digit on entry, or if the number to be parsed does not fit into a 31-bit unsigned integer.
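        The failure conditions listed above (no digits, or a value that does not fit in 31 bits) can be sketched in a few lines. The class name ParseNumberSketch is hypothetical, Character.digit is used in place of UCharacter.digit so the example needs only the JDK, and the long accumulator is a choice made for this sketch.

```java
final class ParseNumberSketch {
    /** Returns the parsed non-negative value, or -1 on failure (pos[0] is then unchanged). */
    static int parseNumber(String text, int[] pos, int radix) {
        long n = 0; // long accumulator so overflow past 31 bits is easy to detect
        int p = pos[0];
        while (p < text.length()) {
            int d = Character.digit(text.charAt(p), radix);
            if (d < 0) {
                break;
            }
            n = n * radix + d;
            if (n > Integer.MAX_VALUE) {
                return -1; // does not fit into a 31-bit unsigned integer
            }
            p++;
        }
        if (p == pos[0]) {
            return -1; // pos[0] did not point to a valid digit
        }
        pos[0] = p;
        return (int) n;
    }

    public static void main(String[] args) {
        int[] pos = {0};
        System.out.println(parseNumber("7fz", pos, 16) + " " + pos[0]);
    }
}
```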
      • isUnprintable

        public static boolean isUnprintable​(int c)
        Return true if the character is NOT printable ASCII. The tab, line feed, and carriage return characters are considered unprintable.
      • shouldAlwaysBeEscaped

        public static boolean shouldAlwaysBeEscaped​(int c)
        Returns:
        true for control codes and for surrogate and noncharacter code points
      • escapeUnprintable

        public static <T extends java.lang.Appendable> boolean escapeUnprintable​(T result,
                                                                                 int c)
        Escapes one unprintable code point using \uxxxx notation for U+0000 to U+FFFF and \Uxxxxxxxx for U+10000 and above. If the character is printable ASCII, then do nothing and return false. Otherwise, append the escaped notation and return true.
      • escape

        public static <T extends java.lang.Appendable> T escape​(T result,
                                                                int c)
        Escapes one code point using \uxxxx notation for U+0000 to U+FFFF and \Uxxxxxxxx for U+10000 and above.
        Returns:
        result
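        The relationship between isUnprintable, escapeUnprintable, and escape can be sketched as below. This is a standalone approximation: the class name EscapeSketch is hypothetical, printable ASCII is assumed to mean the range U+0020 to U+007E, and uppercase hex digits are an assumption about the output format.

```java
final class EscapeSketch {
    /** Assumption: printable ASCII is the range 0x20..0x7E inclusive. */
    static boolean isUnprintable(int c) {
        return !(c >= 0x20 && c <= 0x7E);
    }

    /** Appends a 4-digit escape for BMP code points and an 8-digit escape otherwise. */
    static StringBuilder escape(StringBuilder result, int c) {
        if (c <= 0xFFFF) {
            result.append(String.format("\\u%04X", c));
        } else {
            result.append(String.format("\\U%08X", c));
        }
        return result;
    }

    /** Escapes c only if it is unprintable; reports whether anything was appended. */
    static boolean escapeUnprintable(StringBuilder result, int c) {
        if (isUnprintable(c)) {
            escape(result, c);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        escapeUnprintable(sb, 'a');     // printable, nothing appended
        escapeUnprintable(sb, 0x1F600); // supplementary code point, 8-digit form
        System.out.println(sb);
    }
}
```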
      • quotedIndexOf

        public static int quotedIndexOf​(java.lang.String text,
                                        int start,
                                        int limit,
                                        java.lang.String setOfChars)
        Returns the index of the first character in a set, ignoring quoted text. For example, in the string "abc'hide'h", the 'h' in "hide" will not be found by a search for "h". Unlike String.indexOf(), this method searches not for a single character, but for any character of the string setOfChars.
        Parameters:
        text - text to be searched
        start - the beginning index, inclusive; 0 <= start <= limit.
        limit - the ending index, exclusive; start <= limit <= text.length().
        setOfChars - string with one or more distinct characters
        Returns:
        Offset of the first character in setOfChars found, or -1 if not found.
        See Also:
        String.indexOf(int)
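        The "abc'hide'h" example can be reproduced with a short standalone sketch. The class name QuotedIndexOfSketch is hypothetical, and the sketch handles only apostrophe-quoted runs; any other escaping rules the library may apply are not modeled here.

```java
final class QuotedIndexOfSketch {
    static int quotedIndexOf(String text, int start, int limit, String setOfChars) {
        for (int i = start; i < limit; i++) {
            char c = text.charAt(i);
            if (c == '\'') {
                // Skip ahead to the closing apostrophe (or to limit if unterminated).
                while (++i < limit && text.charAt(i) != '\'') {
                    // quoted text is ignored
                }
            } else if (setOfChars.indexOf(c) >= 0) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String text = "abc'hide'h";
        System.out.println(quotedIndexOf(text, 0, text.length(), "h"));
    }
}
```

        The search skips the quoted "hide" and reports the final 'h' at index 9.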
      • appendToRule

        public static void appendToRule​(java.lang.StringBuffer rule,
                                        int c,
                                        boolean isLiteral,
                                        boolean escapeUnprintable,
                                        java.lang.StringBuffer quoteBuf)
        Append a character to a rule that is being built up. To flush the quoteBuf to rule, make one final call with isLiteral == true. If there is no final character, pass in (int)-1 as c.
        Parameters:
        rule - the string to append the character to
        c - the character to append, or (int)-1 if none.
        isLiteral - if true, then the given character should not be quoted or escaped. Usually this means it is a syntactic element such as > or $.
        escapeUnprintable - if true, then unprintable characters should be escaped using escapeUnprintable(). These escapes will appear outside of quotes.
        quoteBuf - a buffer which is used to build up quoted substrings. The caller should initially supply an empty buffer, and thereafter should not modify the buffer. The buffer should be cleared out by, at the end, calling this method with a literal character (which may be -1).
      • appendToRule

        public static void appendToRule​(java.lang.StringBuffer rule,
                                        java.lang.String text,
                                        boolean isLiteral,
                                        boolean escapeUnprintable,
                                        java.lang.StringBuffer quoteBuf)
        Append the given string to the rule. Calls the single-character version of appendToRule for each character.
      • appendToRule

        public static void appendToRule​(java.lang.StringBuffer rule,
                                        UnicodeMatcher matcher,
                                        boolean escapeUnprintable,
                                        java.lang.StringBuffer quoteBuf)
        Given a matcher reference, which may be null, append its pattern as a literal to the given rule.
      • compareUnsigned

        public static final int compareUnsigned​(int source,
                                                int target)
        Compares two unsigned integers.
        Parameters:
        source - 32 bit unsigned integer
        target - 32 bit unsigned integer
        Returns:
        0 if equal, 1 if source is greater than target, and -1 otherwise
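        One common way to compare ints as unsigned values is to add a sign-bit bias to both operands and then compare them as signed. The sketch below shows that technique. The class name CompareUnsignedSketch is hypothetical, and the value 0x80000000 for MAGIC_UNSIGNED is an assumption, since the field summary does not state it.

```java
final class CompareUnsignedSketch {
    // Assumed bias: flipping the sign bit maps unsigned order onto signed order.
    private static final int MAGIC_UNSIGNED = 0x80000000;

    static int compareUnsigned(int source, int target) {
        source += MAGIC_UNSIGNED;
        target += MAGIC_UNSIGNED;
        if (source < target) {
            return -1;
        } else if (source > target) {
            return 1;
        }
        return 0;
    }

    public static void main(String[] args) {
        // -1 is 0xFFFFFFFF when read as unsigned, so it is greater than 1.
        System.out.println(compareUnsigned(-1, 1));
    }
}
```

        On Java 8 and later, Integer.compareUnsigned(int, int) gives a result with the same sign.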
      • highBit

        public static final byte highBit​(int n)
        Find the highest bit in a positive integer. This is done by doing a binary search through the bits.
        Parameters:
        n - is the integer
        Returns:
        the bit number of the highest bit, with 0 being the low order bit, or -1 if n is not positive
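        The binary search through the bits can be sketched by halving the search window at each step: 16 bits, then 8, 4, 2, and 1. The class name HighBitSketch is hypothetical and the code is an illustration of the technique, not a copy of the library's source.

```java
final class HighBitSketch {
    static byte highBit(int n) {
        if (n <= 0) {
            return -1; // not a positive integer
        }
        byte bit = 0;
        if (n >= 1 << 16) { n >>= 16; bit += 16; }
        if (n >= 1 << 8)  { n >>= 8;  bit += 8; }
        if (n >= 1 << 4)  { n >>= 4;  bit += 4; }
        if (n >= 1 << 2)  { n >>= 2;  bit += 2; }
        if (n >= 1 << 1)  { bit += 1; }
        return bit;
    }

    public static void main(String[] args) {
        System.out.println(highBit(0x10000));
    }
}
```

        For positive n the result equals 31 - Integer.numberOfLeadingZeros(n).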
      • valueOf

        public static java.lang.String valueOf​(int[] source)
        Utility method to take an int[] containing code points and return a string representation with code units.
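        The difference between code points and code units shows up for supplementary characters, which occupy two UTF-16 code units each. A minimal sketch of the conversion, using a hypothetical class name CodePointsToStringSketch and only JDK calls:

```java
final class CodePointsToStringSketch {
    static String valueOf(int[] source) {
        StringBuilder result = new StringBuilder(source.length);
        for (int codePoint : source) {
            // Appends one char for BMP code points and a surrogate pair otherwise.
            result.appendCodePoint(codePoint);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        String s = valueOf(new int[] {0x41, 0x1F600});
        System.out.println(s.length()); // counts code units, not code points
    }
}
```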
      • repeat

        public static java.lang.String repeat​(java.lang.String s,
                                              int count)
        Utility to duplicate a string count times
        Parameters:
        s - String to be duplicated.
        count - Number of times to duplicate a string.
      • splitString

        public static java.lang.String[] splitString​(java.lang.String src,
                                                     java.lang.String target)
      • splitWhitespace

        public static java.lang.String[] splitWhitespace​(java.lang.String src)
        Split the string at runs of ASCII whitespace characters.
      • fromHex

        public static java.lang.String fromHex​(java.lang.String string,
                                               int minLength,
                                               java.lang.String separator)
        Parse a list of hex numbers and return a string
        Parameters:
        string - String of hex numbers.
        minLength - Minimal length.
        separator - Separator.
        Returns:
        A string from hex numbers.
      • fromHex

        public static java.lang.String fromHex​(java.lang.String string,
                                               int minLength,
                                               java.util.regex.Pattern separator)
        Parse a list of hex numbers and return a string
        Parameters:
        string - String of hex numbers.
        minLength - Minimal length.
        separator - Separator.
        Returns:
        A string from hex numbers.
      • addExact

        public static int addExact​(int x,
                                   int y)
        This implementation is equivalent to Java 8+ Math#addExact(int, int)
        Parameters:
        x - the first value
        y - the second value
        Returns:
        the result
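        Overflow-checked addition in the style of Math.addExact can be sketched with a sign test: the sum has overflowed exactly when both operands have a sign different from the result. The class name AddExactSketch is hypothetical, and the exception message is an assumption.

```java
final class AddExactSketch {
    static int addExact(int x, int y) {
        int r = x + y;
        // Overflow occurred iff x and y both differ in sign from r.
        if (((x ^ r) & (y ^ r)) < 0) {
            throw new ArithmeticException("integer overflow");
        }
        return r;
    }

    public static void main(String[] args) {
        System.out.println(addExact(2, 3));
        try {
            addExact(Integer.MAX_VALUE, 1);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected");
        }
    }
}
```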
      • charSequenceEquals

        public static boolean charSequenceEquals​(java.lang.CharSequence a,
                                                 java.lang.CharSequence b)
        Returns whether the chars in the two CharSequences are equal.
      • charSequenceHashCode

        public static int charSequenceHashCode​(java.lang.CharSequence value)
        Returns a hash code for a CharSequence that is equivalent to calling charSequence.toString().hashCode()
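        Both CharSequence helpers can be sketched without converting to String first. The class name CharSequenceSketch is hypothetical, and the null handling in the equality check is an assumption, since the description above does not specify it. The hash uses the same 31-based polynomial that String.hashCode() is documented to use.

```java
final class CharSequenceSketch {
    static boolean charSequenceEquals(CharSequence a, CharSequence b) {
        if (a == b) {
            return true;
        }
        if (a == null || b == null) {
            return false; // assumption: a null never equals a non-null sequence
        }
        if (a.length() != b.length()) {
            return false;
        }
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    static int charSequenceHashCode(CharSequence value) {
        int hash = 0;
        for (int i = 0; i < value.length(); i++) {
            hash = hash * 31 + value.charAt(i);
        }
        return hash;
    }

    public static void main(String[] args) {
        CharSequence sb = new StringBuilder("hello");
        System.out.println(charSequenceEquals(sb, "hello"));
        System.out.println(charSequenceHashCode(sb) == "hello".hashCode());
    }
}
```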
      • appendTo

        public static <A extends java.lang.Appendable> A appendTo​(java.lang.CharSequence string,
                                                                  A appendable)
        Appends a CharSequence to an Appendable, converting IOException to ICUUncheckedIOException.
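        The exception conversion can be sketched as a try/catch wrapper around Appendable.append. The class name AppendToSketch is hypothetical, and java.io.UncheckedIOException is used here as a JDK-only stand-in for ICUUncheckedIOException, so that the example compiles without the ICU jar.

```java
final class AppendToSketch {
    static <A extends Appendable> A appendTo(CharSequence string, A appendable) {
        try {
            appendable.append(string);
            return appendable;
        } catch (java.io.IOException e) {
            // Stand-in for ICUUncheckedIOException: rethrow as an unchecked exception.
            throw new java.io.UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // No checked exception needs to be declared or caught by the caller.
        StringBuilder sb = appendTo("abc", new StringBuilder("x"));
        System.out.println(sb);
    }
}
```

        Returning the Appendable with its original type lets calls be chained without a cast.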
      • joinStrings

        public static java.lang.String joinStrings​(java.lang.CharSequence delimiter,
                                                   java.lang.Iterable<? extends java.lang.CharSequence> elements)
        A method compatible with Java 8+ String#join(CharSequence, Iterable), for Java 7 environments.
        Parameters:
        delimiter - the delimiter that separates each element
        elements - the elements to join together.
        Returns:
        a new String that is composed of the elements separated by the delimiter
        Throws:
        java.lang.NullPointerException - If delimiter or elements is null
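        A join that avoids the Java 8 API can be sketched with a StringBuilder and a first-element flag. The class name JoinStringsSketch is hypothetical and the exception message is an assumption; only the null checks and the delimiter placement follow the description above.

```java
final class JoinStringsSketch {
    static String joinStrings(CharSequence delimiter, Iterable<? extends CharSequence> elements) {
        if (delimiter == null || elements == null) {
            throw new NullPointerException("Delimiter or elements is null");
        }
        StringBuilder buf = new StringBuilder();
        boolean first = true;
        for (CharSequence element : elements) {
            if (!first) {
                buf.append(delimiter); // delimiter goes between elements only
            }
            first = false;
            buf.append(element);
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinStrings(", ", java.util.Arrays.asList("a", "b", "c")));
    }
}
```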