Class CollationIterator

    • Constructor Detail

      • CollationIterator

        public CollationIterator​(CollationData d)
        Partially constructs the iterator. In Java, we cache partially constructed iterators and finish their setup when starting to work on text (via reset(boolean) and the setText(numeric, ...) methods of subclasses). This avoids memory allocations for iterators that remain unused.

        In C++, there is only one constructor, and iterators are stack-allocated as needed.

      • CollationIterator

        public CollationIterator​(CollationData d,
                                 boolean numeric)
    • Method Detail

      • equals

        public boolean equals​(java.lang.Object other)
        Overrides:
        equals in class java.lang.Object
      • hashCode

        public int hashCode()
        Overrides:
        hashCode in class java.lang.Object
      • resetToOffset

        public abstract void resetToOffset​(int newOffset)
        Resets the iterator state and sets the position to the specified offset. Subclasses must implement this method and must call either the parent class method or CollationIterator.reset().
      • getOffset

        public abstract int getOffset()
      • nextCE

        public final long nextCE()
        Returns the next collation element.
      • fetchCEs

        public final int fetchCEs()
        Fetches all CEs.
        Returns:
        getCEsLength()
      • setCurrentCE

        final void setCurrentCE​(long ce)
        Overwrites the current CE (the last one returned by nextCE()).
      • previousCE

        public final long previousCE​(UVector32 offsets)
        Returns the previous collation element.
      • getCEsLength

        public final int getCEsLength()
      • getCE

        public final long getCE​(int i)
      • getCEs

        public final long[] getCEs()
      • clearCEs

        final void clearCEs()
      • clearCEsIfNoneRemaining

        public final void clearCEsIfNoneRemaining()
      • nextCodePoint

        public abstract int nextCodePoint()
        Returns the next code point (with post-increment). Public for identical-level comparison and for testing.
      • previousCodePoint

        public abstract int previousCodePoint()
        Returns the previous code point (with pre-decrement). Public for identical-level comparison and for testing.
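
        The post-increment and pre-decrement behavior of nextCodePoint() and previousCodePoint() can be illustrated with a small cursor over UTF-16 text. This is a minimal sketch, not the ICU4J implementation: the class CodePointCursor is hypothetical, and returning -1 at either end of the text is an assumption made for the example.

```java
// Hypothetical illustration; not part of ICU4J.
final class CodePointCursor {
    private final CharSequence text;
    private int pos;  // offset in UTF-16 code units

    CodePointCursor(CharSequence text, int start) {
        this.text = text;
        this.pos = start;
    }

    /** Reads the code point at the current offset, then moves past it (post-increment). */
    int nextCodePoint() {
        if (pos == text.length()) {
            return -1;  // assumed end-of-text value
        }
        int c = Character.codePointAt(text, pos);
        pos += Character.charCount(c);
        return c;
    }

    /** Moves back over the preceding code point, then returns it (pre-decrement). */
    int previousCodePoint() {
        if (pos == 0) {
            return -1;  // assumed start-of-text value
        }
        int c = Character.codePointBefore(text, pos);
        pos -= Character.charCount(c);
        return c;
    }

    int getOffset() {
        return pos;
    }
}
```

        A supplementary code point occupies two code units, so one call moves the offset by two in that case.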
      • reset

        protected final void reset()
      • reset

        protected final void reset​(boolean numeric)
        Resets the state as well as the numeric setting, and completes the initialization. Exists only in Java, where we reset cached CollationIterator instances rather than stack-allocating temporary ones. (See also the constructor comments.)
      • handleNextCE32

        protected long handleNextCE32()
        Returns the next code point and its local CE32 value. Returns Collation.FALLBACK_CE32 at the end of the text (c<0) or when c's CE32 value is to be looked up in the base data (fallback). The code point is used for fallbacks, context and implicit weights. It is ignored when the returned CE32 is not special (e.g., FFFD_CE32). Returns the code point in bits 63..32 (signed) and the CE32 in bits 31..0.
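
        The bit layout described above (code point in bits 63..32, signed; CE32 in bits 31..0) can be sketched as follows. The class and helper names here are hypothetical and chosen for illustration; only the layout itself comes from the description above.

```java
// Hypothetical illustration of the packed long layout; not the ICU4J source.
final class CodePointCe32Pair {
    /** Packs a code point (possibly negative) and a CE32 into one long. */
    static long make(int c, int ce32) {
        // Code point in bits 63..32; CE32 zero-extended into bits 31..0
        // so that a CE32 with its top bit set does not overwrite the code point.
        return ((long) c << 32) | (ce32 & 0xffffffffL);
    }

    /** Extracts the code point; the arithmetic shift preserves its sign. */
    static int codePoint(long pair) {
        return (int) (pair >> 32);
    }

    /** Extracts the CE32 from the low 32 bits. */
    static int ce32(long pair) {
        return (int) pair;
    }

    public static void main(String[] args) {
        long pair = make(0x1F600, 0xC0000123);
        System.out.println(Integer.toHexString(codePoint(pair)));  // prints "1f600"
        System.out.println(Integer.toHexString(ce32(pair)));       // prints "c0000123"
    }
}
```

        Masking the CE32 before the OR matters: without it, sign extension of a negative int would set all of bits 63..32.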
      • makeCodePointAndCE32Pair

        protected long makeCodePointAndCE32Pair​(int c,
                                                int ce32)
      • handleGetTrailSurrogate

        protected char handleGetTrailSurrogate()
        Called when handleNextCE32() returns a LEAD_SURROGATE_TAG for a lead surrogate code unit. Returns the trail surrogate in that case and advances past it, if a trail surrogate follows the lead surrogate. Otherwise returns any other code unit and does not advance.
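
        The contract above (advance only when a trail surrogate actually follows) might look like this for a UTF-16 subclass. This is a sketch under assumptions: TrailSurrogateReader is a hypothetical class, and returning 0 at the end of the text is an assumption, since the description does not say what is returned there.

```java
// Hypothetical illustration; not the ICU4J source.
final class TrailSurrogateReader {
    private final CharSequence text;
    private int pos;  // offset just after the lead surrogate

    TrailSurrogateReader(CharSequence text, int posAfterLead) {
        this.text = text;
        this.pos = posAfterLead;
    }

    /**
     * Returns the following code unit. Advances past it only if it is
     * a trail (low) surrogate; otherwise the position is unchanged.
     */
    char handleGetTrailSurrogate() {
        if (pos == text.length()) {
            return 0;  // assumed value when no code unit follows
        }
        char trail = text.charAt(pos);
        if (Character.isLowSurrogate(trail)) {
            ++pos;
        }
        return trail;
    }

    int getOffset() {
        return pos;
    }
}
```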
      • forbidSurrogateCodePoints

        protected boolean forbidSurrogateCodePoints()
        Returns:
        false if surrogate code points U+D800..U+DFFF map to their own implicit primary weights (for UTF-16), or true if they map to CE(U+FFFD) (for UTF-8)
      • forwardNumCodePoints

        protected abstract void forwardNumCodePoints​(int num)
      • backwardNumCodePoints

        protected abstract void backwardNumCodePoints​(int num)
      • getDataCE32

        protected int getDataCE32​(int c)
        Returns the CE32 from the data trie. Normally the same as data.getCE32(), but overridden in the builder. Call this only when the faster data.getCE32() cannot be used.
      • getCE32FromBuilderData

        protected int getCE32FromBuilderData​(int ce32)
      • appendCEsFromCE32

        protected final void appendCEsFromCE32​(CollationData d,
                                               int c,
                                               int ce32,
                                               boolean forward)
      • isSurrogate

        private static final boolean isSurrogate​(int c)
      • isLeadSurrogate

        protected static final boolean isLeadSurrogate​(int c)
      • isTrailSurrogate

        protected static final boolean isTrailSurrogate​(int c)
      • nextCEFromCE32

        private final long nextCEFromCE32​(CollationData d,
                                          int c,
                                          int ce32)
      • getCE32FromPrefix

        private final int getCE32FromPrefix​(CollationData d,
                                            int ce32)
      • nextSkippedCodePoint

        private final int nextSkippedCodePoint()
      • backwardNumSkipped

        private final void backwardNumSkipped​(int n)
      • nextCE32FromContraction

        private final int nextCE32FromContraction​(CollationData d,
                                                  int contractionCE32,
                                                  java.lang.CharSequence trieChars,
                                                  int trieOffset,
                                                  int ce32,
                                                  int c)
      • nextCE32FromDiscontiguousContraction

        private final int nextCE32FromDiscontiguousContraction​(CollationData d,
                                                               CharsTrie suffixes,
                                                               int ce32,
                                                               int lookAhead,
                                                               int c)
      • previousCEUnsafe

        private final long previousCEUnsafe​(int c,
                                            UVector32 offsets)
        Returns the previous CE when data.isUnsafeBackward(c, isNumeric) is true.
      • appendNumericCEs

        private final void appendNumericCEs​(int ce32,
                                            boolean forward)
        Turns a string of digits (bytes 0..9) into a sequence of CEs that will sort in numeric order. Starts from this ce32's digit value and consumes the following/preceding digits. The digit string must not be empty and must not have leading zeros.
      • appendNumericSegmentCEs

        private final void appendNumericSegmentCEs​(java.lang.CharSequence digits)
        Turns 1..254 digits into a sequence of CEs. Called by appendNumericCEs() for each segment of at most 254 digits.
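
        The division into segments of at most 254 digits can be sketched as a simple chunking step. This shows only the splitting; how each segment is turned into CEs, and any other per-segment handling in the real method, is not shown. The class NumericSegments and its split method are hypothetical names.

```java
// Hypothetical illustration of the 254-digit segmentation; not the ICU4J source.
final class NumericSegments {
    static final int MAX_SEGMENT_DIGITS = 254;

    /** Splits a digit sequence into consecutive segments of at most 254 digits each. */
    static String[] split(CharSequence digits) {
        int n = digits.length();
        // Round up so that a final, shorter segment gets its own slot.
        String[] segments = new String[(n + MAX_SEGMENT_DIGITS - 1) / MAX_SEGMENT_DIGITS];
        for (int i = 0; i < segments.length; i++) {
            int start = i * MAX_SEGMENT_DIGITS;
            int end = Math.min(start + MAX_SEGMENT_DIGITS, n);
            segments[i] = digits.subSequence(start, end).toString();
        }
        return segments;
    }
}
```

        For example, a 600-digit input yields three segments of 254, 254 and 92 digits.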