Class CharsetBOCU1

  • All Implemented Interfaces:
    java.lang.Comparable<java.nio.charset.Charset>

    class CharsetBOCU1
    extends CharsetICU
    • Constructor Detail

      • CharsetBOCU1

        public CharsetBOCU1(java.lang.String icuCanonicalName,
                            java.lang.String javaCanonicalName,
                            java.lang.String[] aliases)
    • Method Detail

      • BOCU1_LENGTH_FROM_PACKED

        private static int BOCU1_LENGTH_FROM_PACKED(int packed)
      • BOCU1_TRAIL_TO_BYTE

        private static int BOCU1_TRAIL_TO_BYTE(int trail)
      • BOCU1_SIMPLE_PREV

        private static int BOCU1_SIMPLE_PREV(int c)
      • bocu1Prev

        private static int bocu1Prev(int c)
        Compute the next "previous" value for differencing from the current code point.
        Parameters:
        c - current code point, 0x3040..0xd7a3 (code points outside this range are handled by BOCU1_PREV, below)
        Returns:
        "previous code point" state value
      • BOCU1_PREV

        private static int BOCU1_PREV(int c)
        Fast version of bocu1Prev() for most scripts.
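
        The relationship between BOCU1_SIMPLE_PREV, bocu1Prev and BOCU1_PREV can be sketched as below. This is a stand-alone illustration, not the ICU4J source: the class name Bocu1PrevSketch and its method names are hypothetical, and the constants (0x40 as the ASCII "previous" value, -10513 as the two-byte negative reach, the script ranges) are assumed from the published BOCU-1 specification rather than confirmed by this page.

```java
// Hypothetical sketch of the "previous value" computation used for differencing.
// Constants are assumptions taken from the BOCU-1 specification, not from this page.
final class Bocu1PrevSketch {
    static final int ASCII_PREV = 0x40;     // assumed initial state, also the offset inside a 128-block
    static final int REACH_NEG_2 = -10513;  // assumed most negative diff that fits in two bytes

    /** Simple rule: the middle of the 128-code-point block that contains c. */
    static int simplePrev(int c) {
        return (c & ~0x7f) + ASCII_PREV;
    }

    /** Special cases, only meaningful for c in 0x3040..0xd7a3. */
    static int bocu1Prev(int c) {
        if (c <= 0x309f) {
            return 0x3070;                  // Hiragana is not 128-aligned
        } else if (0x4e00 <= c && c <= 0x9fa5) {
            return 0x4e00 - REACH_NEG_2;    // CJK ideographs: whole block stays in two-byte reach
        } else if (0xac00 <= c) {
            return (0xd7a3 + 0xac00) / 2;   // Hangul syllables: middle of the block
        } else {
            return simplePrev(c);           // everything else in the range
        }
    }

    /** Fast path: only call bocu1Prev for the range that needs special handling. */
    static int prev(int c) {
        return (c < 0x3040 || c > 0xd7a3) ? simplePrev(c) : bocu1Prev(c);
    }

    public static void main(String[] args) {
        // Latin 'a', Hiragana A, a CJK ideograph, a Hangul syllable
        int[] samples = {0x61, 0x3042, 0x4e2d, 0xac00};
        for (int c : samples) {
            System.out.printf("U+%04X -> prev 0x%04x%n", c, prev(c));
        }
    }
}
```

        The range test in prev() is what makes the fast version cheap for most scripts: small alphabets fall through to a mask and an add, and only the large East Asian blocks pay for the extra comparisons.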
      • DIFF_IS_SINGLE

        private static boolean DIFF_IS_SINGLE(int diff)
        Is a diff value encodable in a single byte?
      • PACK_SINGLE_DIFF

        private static int PACK_SINGLE_DIFF(int diff)
        Encode a diff value in a single byte.
      • DIFF_IS_DOUBLE

        private static boolean DIFF_IS_DOUBLE(int diff)
        Is a diff value encodable in two bytes?
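
        A minimal sketch of how the three diff helpers above fit together. The class name Bocu1DiffSketch and its method names are hypothetical, and the numeric limits (middle byte 0x90, single-byte reach -64..63, two-byte reach -10513..10512) are assumed from the published BOCU-1 specification, not taken from this page.

```java
// Hypothetical sketch of diff classification and single-byte packing.
// All constants are assumptions from the BOCU-1 specification.
final class Bocu1DiffSketch {
    static final int MIDDLE = 0x90;          // assumed lead byte for diff == 0
    static final int REACH_NEG_1 = -64;      // assumed single-byte range
    static final int REACH_POS_1 = 63;
    static final int REACH_NEG_2 = -10513;   // assumed two-byte range
    static final int REACH_POS_2 = 10512;

    /** True if diff fits in one byte. */
    static boolean diffIsSingle(int diff) {
        return REACH_NEG_1 <= diff && diff <= REACH_POS_1;
    }

    /** Byte value for a diff that has already passed diffIsSingle(). */
    static int packSingleDiff(int diff) {
        return MIDDLE + diff;
    }

    /**
     * True if diff fits in at most two bytes. The range includes the
     * single-byte range, so callers are expected to test diffIsSingle() first.
     */
    static boolean diffIsDouble(int diff) {
        return REACH_NEG_2 <= diff && diff <= REACH_POS_2;
    }

    /** Illustrative helper: 1, 2, or 0 meaning "needs three or four bytes". */
    static int shortLength(int diff) {
        if (diffIsSingle(diff)) {
            return 1;
        }
        return diffIsDouble(diff) ? 2 : 0;
    }

    public static void main(String[] args) {
        int prev = 0x40;                     // assumed state after an ASCII letter
        int diff = 'b' - prev;               // difference to the next code point
        if (diffIsSingle(diff)) {
            System.out.printf("diff %d -> byte 0x%02x%n", diff, packSingleDiff(diff));
        }
    }
}
```

        Checking the single-byte case before the two-byte case keeps the common path (text that stays within one small script) down to two comparisons and one addition.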
      • newDecoder

        public java.nio.charset.CharsetDecoder newDecoder()
        Specified by:
        newDecoder in class java.nio.charset.Charset
      • newEncoder

        public java.nio.charset.CharsetEncoder newEncoder()
        Specified by:
        newEncoder in class java.nio.charset.Charset
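
        Because the class is package-private, callers would normally reach newEncoder() and newDecoder() through a java.nio.charset.Charset reference. The sketch below shows a round trip through both. It assumes the charset is registered under the name "BOCU-1" when the ICU4J charset provider is on the classpath, and falls back to UTF-8 otherwise so the example still runs; the class name CharsetRoundTripSketch is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical round-trip sketch using only java.nio.charset.
final class CharsetRoundTripSketch {
    /** Assumption: "BOCU-1" is the registered name; fall back to UTF-8 if it is not available. */
    static Charset pick() {
        return Charset.isSupported("BOCU-1")
                ? Charset.forName("BOCU-1")
                : StandardCharsets.UTF_8;
    }

    /** Encode with a fresh encoder, then decode the bytes with a fresh decoder. */
    static String roundTrip(Charset cs, String text) {
        try {
            ByteBuffer bytes = cs.newEncoder().encode(CharBuffer.wrap(text));
            return cs.newDecoder().decode(bytes).toString();
        } catch (CharacterCodingException e) {
            // Malformed or unmappable input; surfaced unchecked to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Charset cs = pick();
        String text = "Hello, \u4e16\u754c";
        System.out.println(cs.name() + " round trip ok: " + text.equals(roundTrip(cs, text)));
    }
}
```

        Encoders and decoders are stateful and not thread-safe, so the sketch creates a new pair for each call instead of sharing them.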
      • getUnicodeSetImpl

        void getUnicodeSetImpl(UnicodeSet setFillIn,
                               int which)
        Description copied from class: CharsetICU
        Follows ucnv_detectUnicodeSignature() in ucnv.c to detect a signature at the start of the stream, for example U+FEFF (the Unicode BOM/signature character), that can be ignored. Detects Unicode signature byte sequences at the start of the byte stream and returns the number of bytes in the BOM of the indicated Unicode charset; 0 is returned when no Unicode signature is recognized.
        Specified by:
        getUnicodeSetImpl in class CharsetICU