Class CharsetRecog_mbcs

    • Constructor Detail

      • CharsetRecog_mbcs

        CharsetRecog_mbcs()
    • Method Detail

      • getName

        abstract java.lang.String getName()
        Get the IANA name of this charset.
        Specified by:
        getName in class CharsetRecognizer
        Returns:
        the charset name.
      • match

        int match(CharsetDetector det,
                  int[] commonChars)
        Test how well the input text, obtained via the CharsetDetector object, matches this charset.
        Parameters:
        det - The CharsetDetector, which contains the input text to be checked for being in this charset.
        Returns:
        Two values packed into one int:
        bits 0-7: the match confidence, ranging from 0 to 100.
        bits 8-15: the match reason, an enum-like value.
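
        The bit layout described above can be illustrated with a minimal sketch. The class PackedMatch and its methods pack, confidence and reason are hypothetical names chosen for illustration; they are not part of CharsetRecog_mbcs, and the reason value 2 is an arbitrary placeholder.

```java
// Hypothetical helper, not part of the documented API: it only
// illustrates the bit layout described for the return value of match.
final class PackedMatch {
    private PackedMatch() {}

    // Put the confidence (0-100) in bits 0-7 and the reason in bits 8-15.
    static int pack(int confidence, int reason) {
        if (confidence < 0 || confidence > 100) {
            throw new IllegalArgumentException("confidence must be 0-100");
        }
        if (reason < 0 || reason > 0xFF) {
            throw new IllegalArgumentException("reason must fit in 8 bits");
        }
        return (reason << 8) | confidence;
    }

    // Extract bits 0-7.
    static int confidence(int packed) {
        return packed & 0xFF;
    }

    // Extract bits 8-15.
    static int reason(int packed) {
        return (packed >>> 8) & 0xFF;
    }

    public static void main(String[] args) {
        int packed = pack(85, 2); // 2 is an arbitrary placeholder reason
        System.out.println(confidence(packed) + " " + reason(packed));
    }
}
```

        A caller that only needs the confidence can mask the low byte and ignore the rest.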
      • nextChar

        abstract boolean nextChar(CharsetRecog_mbcs.iteratedChar it,
                                  CharsetDetector det)
        Get the next character (however many bytes it is) from the input data. Subclasses for specific charset encodings must implement this function to get characters according to the rules of their encoding scheme. This function is not a method of class iteratedChar only because that would require a lot of extra derived classes, which is awkward.
        Parameters:
        it - The iteratedChar "struct" into which the returned char is placed.
        det - The charset detector, which is needed to get at the input byte data being iterated over.
        Returns:
        True if a character was returned, false at end of input.
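
        The iteration contract described above (fill a "struct", return false at end of input) can be sketched as follows. Everything here is an assumption made for illustration: IteratedChar is a stand-in for the real iteratedChar class, a plain byte[] replaces the CharsetDetector, and the encoding rule (bytes below 0x80 are single-byte, anything else is a lead byte followed by one trail byte) is deliberately simplified and does not describe any specific real charset.

```java
// Hypothetical stand-ins, not the documented classes: they only
// illustrate the "fill the struct, return false at end" contract.
final class NextCharSketch {
    private NextCharSketch() {}

    // Stand-in for the iteratedChar "struct".
    static final class IteratedChar {
        int charValue; // value of the character most recently read
        int nextIndex; // index of the next unread input byte
    }

    // Simplified rule (an assumption): a byte below 0x80 is a complete
    // character; any other byte is a lead byte followed by one trail byte.
    static boolean nextChar(IteratedChar it, byte[] input) {
        if (it.nextIndex >= input.length) {
            return false; // end of input
        }
        int first = input[it.nextIndex++] & 0xFF;
        if (first < 0x80 || it.nextIndex >= input.length) {
            // Single-byte character, or a lead byte truncated at the end.
            it.charValue = first;
            return true;
        }
        int second = input[it.nextIndex++] & 0xFF;
        it.charValue = (first << 8) | second;
        return true;
    }

    // Typical caller loop: pull characters until nextChar reports the end.
    static int countChars(byte[] input) {
        IteratedChar it = new IteratedChar();
        int count = 0;
        while (nextChar(it, input)) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        byte[] sample = {0x41, (byte) 0xA4, (byte) 0xA2, 0x42};
        System.out.println(countChars(sample));
    }
}
```

        Keeping the decoding rule in one overridable function, with the state in a small separate object, mirrors the design reason given in the description: the state holder needs no per-encoding subclasses.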