Class CollationFastLatin


  • public final class CollationFastLatin
    extends java.lang.Object
    • Field Detail

      • VERSION

        public static final int VERSION
        Fast Latin format version (one byte, 0x01..0xFF). Must be incremented for any runtime-incompatible change, in particular for changes to any of the following constants. When the major version number of the main data format changes, this fast Latin version can be reset to 1.
        See Also:
        Constant Field Values
      • CONTRACTION

        static final int CONTRACTION
        Contraction with one fast Latin character. Use INDEX_MASK to find the start of the contraction list after the fixed table. The first entry contains the default mapping. Otherwise use CONTR_CHAR_MASK for the contraction character index (in ascending order). Use CONTR_LENGTH_SHIFT for the length of the entry (1=BAIL_OUT, 2=one CE, 3=two CEs). Also, U+0000 maps to a contraction entry, so that the fast path need not check for NUL termination. It usually maps to a contraction list with only the completely ignorable default value.
        See Also:
        Constant Field Values
      • EXPANSION

        static final int EXPANSION
        An expansion encodes two CEs. Use INDEX_MASK to find the pair of CEs after the fixed table. The higher a mini CE value, the easier it is to process. For expansions and higher, no context needs to be considered.
        See Also:
        Constant Field Values
      • MIN_LONG

        static final int MIN_LONG
        Encodes one CE with a long/low mini primary (there are 128). All potentially-variable primaries must be in this range, to make the short-primary path as fast as possible.
        See Also:
        Constant Field Values
      • MIN_SHORT

        static final int MIN_SHORT
        Encodes one CE with a short/high primary (there are 60), plus a secondary CE if the secondary weight is high. Fast handling: At least all letter primaries should be in this range.
        See Also:
        Constant Field Values
      • MAX_SHORT

        static final int MAX_SHORT
        The highest primary weight is reserved for U+FFFF.
        See Also:
        Constant Field Values
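        The ordering of CONTRACTION, EXPANSION, MIN_LONG and MIN_SHORT suggests that a mini CE can be classified by comparing it against these thresholds, highest first. The following is a minimal sketch of that idea, not the class's actual code. The numeric values are assumptions (they are not stated on this page; consult Constant Field Values), and the class and method names are hypothetical.

```java
// Sketch only: threshold values are assumptions, see "Constant Field Values".
final class MiniCeKindSketch {
    static final int CONTRACTION = 0x400;  // assumed value
    static final int EXPANSION = 0x800;    // assumed value
    static final int MIN_LONG = 0xc00;     // assumed value
    static final int MIN_SHORT = 0x1000;   // assumed value

    /** Classifies a 16-bit mini CE by testing the thresholds from highest to lowest. */
    static String kindOf(int miniCE) {
        if (miniCE >= MIN_SHORT) { return "SHORT_PRIMARY"; }
        if (miniCE >= MIN_LONG) { return "LONG_PRIMARY"; }
        if (miniCE >= EXPANSION) { return "EXPANSION"; }
        if (miniCE >= CONTRACTION) { return "CONTRACTION"; }
        return "SPECIAL";  // anything below CONTRACTION (hypothetical label)
    }

    public static void main(String[] args) {
        int[] samples = {0x1400, 0xc08, 0x805, 0x403, 0};
        for (int ce : samples) {
            System.out.printf("0x%04x -> %s%n", ce, kindOf(ce));
        }
    }
}
```

        Testing the highest threshold first mirrors the remark that higher mini CE values are easier to process: the common case exits after a single comparison.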
      • SEC_OFFSET

        static final int SEC_OFFSET
        Lookup: Add this offset to secondary weights, except for completely ignorable CEs. Must be greater than any special value, e.g., MERGE_WEIGHT. The exact value is not relevant for the format version.
        See Also:
        Constant Field Values
      • TWO_COMMON_SEC_PLUS_OFFSET

        static final int TWO_COMMON_SEC_PLUS_OFFSET
        See Also:
        Constant Field Values
      • TER_OFFSET

        static final int TER_OFFSET
        Lookup: Add this offset to tertiary weights, except for completely ignorable CEs. Must be greater than any special value, e.g., MERGE_WEIGHT. Must be greater than case bits as well, so that with combined case+tertiary weights plus the offset the tertiary bits do not spill over into the case bits. The exact value is not relevant for the format version.
        See Also:
        Constant Field Values
      • TWO_COMMON_TER_PLUS_OFFSET

        static final int TWO_COMMON_TER_PLUS_OFFSET
        See Also:
        Constant Field Values
      • CONTR_CHAR_MASK

        static final int CONTR_CHAR_MASK
        Contraction result first word bits 8..0 contain the second contraction character, as a char index 0..NUM_FAST_CHARS-1. Each contraction list is terminated with a word containing CONTR_CHAR_MASK.
        See Also:
        Constant Field Values
      • CONTR_LENGTH_SHIFT

        static final int CONTR_LENGTH_SHIFT
        Contraction result first word bits 10..9 contain the result length: 1=bail out, 2=one mini CE, 3=two mini CEs.
        See Also:
        Constant Field Values
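        The bit layout described for CONTR_CHAR_MASK and CONTR_LENGTH_SHIFT can be illustrated with a small decoder. This is a sketch under stated assumptions: the mask 0x1ff and the shift 9 are inferred from "bits 8..0" and "bits 10..9" rather than read from Constant Field Values, and the class, the helper names and the extra length mask are hypothetical.

```java
// Sketch only: constants inferred from the documented bit ranges.
final class ContractionWordSketch {
    static final int CONTR_CHAR_MASK = 0x1ff;   // bits 8..0 (inferred)
    static final int CONTR_LENGTH_SHIFT = 9;    // length starts at bit 9 (inferred)
    static final int LENGTH_BITS = 3;           // hypothetical: keeps only bits 10..9

    /** Index of the second contraction character, 0..NUM_FAST_CHARS-1. */
    static int charIndex(int firstWord) {
        return firstWord & CONTR_CHAR_MASK;
    }

    /** Result length: 1 = bail out, 2 = one mini CE, 3 = two mini CEs. */
    static int resultLength(int firstWord) {
        return (firstWord >> CONTR_LENGTH_SHIFT) & LENGTH_BITS;
    }

    /** A list ends at a word whose character bits are all set. */
    static boolean isTerminator(int firstWord) {
        return charIndex(firstWord) == CONTR_CHAR_MASK;
    }

    public static void main(String[] args) {
        int word = (2 << CONTR_LENGTH_SHIFT) | 0x41;  // length 2, char index 0x41
        System.out.println("charIndex=" + charIndex(word)
                + " length=" + resultLength(word)
                + " terminator=" + isTerminator(word));
    }
}
```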
      • BAIL_OUT_RESULT

        public static final int BAIL_OUT_RESULT
        Comparison return value when the regular comparison must be used. The exact value is not relevant for the format version.
        See Also:
        Constant Field Values
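        BAIL_OUT_RESULT acts as a sentinel: a caller tries the fast path first and falls back to the regular comparison when the sentinel comes back. The sketch below illustrates only that calling pattern with self-contained stand-ins. The sentinel value, the toy fastCompare (which accepts only a..z) and the String.compareTo fallback are all hypothetical and are not the collation logic of this class.

```java
// Sketch only: a toy fast path with a bail-out sentinel and a fallback.
final class BailOutSketch {
    static final int BAIL_OUT_RESULT = -2;  // assumed sentinel, distinct from -1, 0, 1

    /** Hypothetical fast-path test: only lowercase ASCII letters qualify. */
    static boolean isFast(char c) {
        return c >= 'a' && c <= 'z';
    }

    /** Returns -1, 0 or 1, or BAIL_OUT_RESULT when a character is not supported. */
    static int fastCompare(CharSequence left, CharSequence right) {
        for (int i = 0; ; ++i) {
            boolean leftDone = i >= left.length();
            boolean rightDone = i >= right.length();
            if (leftDone || rightDone) {
                return leftDone ? (rightDone ? 0 : -1) : 1;
            }
            char l = left.charAt(i);
            char r = right.charAt(i);
            if (!isFast(l) || !isFast(r)) { return BAIL_OUT_RESULT; }
            if (l != r) { return l < r ? -1 : 1; }
        }
    }

    /** Tries the fast path, then falls back to a stand-in regular comparison. */
    static int compare(String left, String right) {
        int result = fastCompare(left, right);
        if (result != BAIL_OUT_RESULT) { return result; }
        return Integer.signum(left.compareTo(right));  // stand-in for the slow path
    }

    public static void main(String[] args) {
        System.out.println(compare("abc", "abd"));
        System.out.println(fastCompare("abc", "ab\u00e9") == BAIL_OUT_RESULT);
        System.out.println(compare("abc", "ab\u00e9"));
    }
}
```

        Keeping the sentinel outside the range of ordinary comparison results lets the caller distinguish "less than" from "not handled" with a single integer return value.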
    • Constructor Detail

      • CollationFastLatin

        private CollationFastLatin()
    • Method Detail

      • getCharIndex

        static int getCharIndex​(char c)
      • getOptions

        public static int getOptions​(CollationData data,
                                     CollationSettings settings,
                                     char[] primaries)
        Computes the options value for the compare functions and writes the precomputed primary weights into primaries. Returns -1 if the Latin fast path is not supported for the data and settings. The capacity of the primaries array must be LATIN_LIMIT.
      • compareUTF16

        public static int compareUTF16​(char[] table,
                                       char[] primaries,
                                       int options,
                                       java.lang.CharSequence left,
                                       java.lang.CharSequence right,
                                       int startIndex)
      • lookup

        private static int lookup​(char[] table,
                                  int c)
      • nextPair

        private static long nextPair​(char[] table,
                                     int c,
                                     int ce,
                                     java.lang.CharSequence s16,
                                     int sIndex)
        In Java, the result is negative if sIndex is to be incremented; apply the '~' operator to recover the actual value. The C++ version modifies sIndex directly instead.
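        The '~' convention can be sketched as follows. Because the pair occupies 32 bits, a long return value leaves the sign free to carry the "increment sIndex" flag. The encode helper is a hypothetical stand-in for nextPair, and the caller-side handling is an assumption about how such a result would typically be consumed, not a copy of the class's code.

```java
// Sketch only: packing an "advance the index" flag into the sign of a long.
final class NextPairSketch {
    /** Hypothetical stand-in for nextPair: returns pair, or ~pair to request ++sIndex. */
    static long encode(int pair, boolean incrementIndex) {
        long result = pair & 0xffffffffL;  // zero-extend so the plain result is non-negative
        return incrementIndex ? ~result : result;
    }

    /** True if the caller should increment its string index. */
    static boolean shouldIncrement(long pairAndInc) {
        return pairAndInc < 0;
    }

    /** Recovers the 32-bit pair whether or not the flag is set. */
    static int pairOf(long pairAndInc) {
        return (int) (pairAndInc < 0 ? ~pairAndInc : pairAndInc);
    }

    public static void main(String[] args) {
        int sIndex = 5;
        long pairAndInc = encode(0x12345678, true);
        if (shouldIncrement(pairAndInc)) {
            ++sIndex;  // Java callers advance the index themselves
        }
        int pair = pairOf(pairAndInc);
        System.out.println("sIndex=" + sIndex + " pair=0x" + Integer.toHexString(pair));
    }
}
```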
      • getPrimaries

        private static int getPrimaries​(int variableTop,
                                        int pair)
      • getSecondariesFromOneShortCE

        private static int getSecondariesFromOneShortCE​(int ce)
      • getSecondaries

        private static int getSecondaries​(int variableTop,
                                          int pair)
      • getCases

        private static int getCases​(int variableTop,
                                    boolean strengthIsPrimary,
                                    int pair)
      • getTertiaries

        private static int getTertiaries​(int variableTop,
                                         boolean withCaseBits,
                                         int pair)
      • getQuaternaries

        private static int getQuaternaries​(int variableTop,
                                           int pair)