Class Trie2Writable

  • All Implemented Interfaces:
    java.lang.Iterable<Trie2.Range>

    public class Trie2Writable
    extends Trie2
    • Field Detail

      • UTRIE2_MAX_INDEX_LENGTH

        private static final int UTRIE2_MAX_INDEX_LENGTH
        Maximum length of the runtime index array. Limited by its own 16-bit index values, and by uint16_t UTrie2Header.indexLength. (The actual maximum length is lower, (0x110000>>UTRIE2_SHIFT_2)+UTRIE2_UTF8_2B_INDEX_2_LENGTH+UTRIE2_MAX_INDEX_1_LENGTH.)
        See Also:
        Constant Field Values
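
The "actual maximum length" formula above can be worked through numerically. The sketch below is only an illustration: the three constant values are assumptions recalled from the ICU sources and are not stated on this page, and the class name is hypothetical.

```java
/** Worked arithmetic for the index-length formula; constant values are assumed. */
final class IndexLengthSketch {
    static final int UTRIE2_SHIFT_2 = 5;                    // assumption
    static final int UTRIE2_UTF8_2B_INDEX_2_LENGTH = 0x20;  // assumption
    static final int UTRIE2_MAX_INDEX_1_LENGTH = 0x200;     // assumption

    /** Evaluates (0x110000>>UTRIE2_SHIFT_2) + the two length terms. */
    static int actualMaxIndexLength() {
        return (0x110000 >> UTRIE2_SHIFT_2)
                + UTRIE2_UTF8_2B_INDEX_2_LENGTH
                + UTRIE2_MAX_INDEX_1_LENGTH;
    }

    public static void main(String[] args) {
        int n = actualMaxIndexLength();
        // With the assumed constants this prints 0x8A20 (35360).
        System.out.printf("0x%X (%d)%n", n, n);
        // The result stays below the 16-bit limit mentioned in the description.
        System.out.println(n <= 0xFFFF);
    }
}
```

Under these assumptions the index needs well under the 0xFFFF entries that a 16-bit length field could describe.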
      • UTRIE2_MAX_DATA_LENGTH

        private static final int UTRIE2_MAX_DATA_LENGTH
        Maximum length of the runtime data array. Limited by 16-bit index values that are left-shifted by UTRIE2_INDEX_SHIFT, and by uint16_t UTrie2Header.shiftedDataLength.
        See Also:
        Constant Field Values
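
The first limit named above (16-bit index values left-shifted by UTRIE2_INDEX_SHIFT) implies an upper bound that is easy to compute. This is a sketch only: the shift value of 2 is an assumption not stated on this page, and the class name is hypothetical.

```java
/** Upper bound on data offsets reachable from a shifted 16-bit index value. */
final class DataLengthSketch {
    static final int UTRIE2_INDEX_SHIFT = 2; // assumption, not given on this page

    /** Largest 16-bit index value, left-shifted as the description says. */
    static int maxShiftedIndex() {
        return 0xFFFF << UTRIE2_INDEX_SHIFT;
    }

    public static void main(String[] args) {
        int n = maxShiftedIndex();
        // With the assumed shift this prints 0x3FFFC (262140).
        System.out.printf("0x%X (%d)%n", n, n);
    }
}
```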
      • UNEWTRIE2_INITIAL_DATA_LENGTH

        private static final int UNEWTRIE2_INITIAL_DATA_LENGTH
        See Also:
        Constant Field Values
      • UNEWTRIE2_MEDIUM_DATA_LENGTH

        private static final int UNEWTRIE2_MEDIUM_DATA_LENGTH
        See Also:
        Constant Field Values
      • UNEWTRIE2_INDEX_2_NULL_OFFSET

        private static final int UNEWTRIE2_INDEX_2_NULL_OFFSET
        The null index-2 block, following the gap in the index-2 table.
        See Also:
        Constant Field Values
      • UNEWTRIE2_INDEX_2_START_OFFSET

        private static final int UNEWTRIE2_INDEX_2_START_OFFSET
        The start of allocated index-2 blocks.
        See Also:
        Constant Field Values
      • UNEWTRIE2_DATA_NULL_OFFSET

        private static final int UNEWTRIE2_DATA_NULL_OFFSET
        The null data block. Length 64=0x40 even if UTRIE2_DATA_BLOCK_LENGTH is smaller, to work with 6-bit trail bytes from 2-byte UTF-8.
        See Also:
        Constant Field Values
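
The block length of 64 follows from the shape of 2-byte UTF-8: the trail byte carries six payload bits, so it can select one of 64 entries. The following sketch checks this with the standard library only; the class and method names are hypothetical and nothing here uses the Trie2 classes.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Counts the trail-byte payloads and lead bytes used by 2-byte UTF-8. */
final class TrailByteSketch {
    /** Distinct 6-bit trail payloads seen across U+0080..U+07FF. */
    static int distinctTrailValues() {
        BitSet seen = new BitSet(64);
        for (int c = 0x80; c < 0x800; c++) {
            byte[] utf8 = new String(Character.toChars(c)).getBytes(StandardCharsets.UTF_8);
            // Each code point in this range encodes as 110xxxxx 10yyyyyy.
            if (utf8.length != 2) {
                throw new IllegalStateException("expected a 2-byte sequence");
            }
            seen.set(utf8[1] & 0x3F); // low six bits of the trail byte
        }
        return seen.cardinality();
    }

    /** Distinct lead bytes seen across the same range. */
    static int distinctLeadBytes() {
        BitSet seen = new BitSet(256);
        for (int c = 0x80; c < 0x800; c++) {
            byte[] utf8 = new String(Character.toChars(c)).getBytes(StandardCharsets.UTF_8);
            seen.set(utf8[0] & 0xFF);
        }
        return seen.cardinality();
    }

    public static void main(String[] args) {
        System.out.println(distinctTrailValues()); // 64
        System.out.println(distinctLeadBytes());   // 30
    }
}
```

Thirty lead bytes times 64 trail values gives the 0x780 code points between U+0080 and U+07FF.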
      • UNEWTRIE2_DATA_START_OFFSET

        private static final int UNEWTRIE2_DATA_START_OFFSET
        The start of allocated data blocks.
        See Also:
        Constant Field Values
      • UNEWTRIE2_DATA_0800_OFFSET

        private static final int UNEWTRIE2_DATA_0800_OFFSET
        The start of data blocks for U+0800 and above. Below this offset, compaction uses a block length of 64 for 2-byte UTF-8; from here on, compaction uses UTRIE2_DATA_BLOCK_LENGTH. The data before this offset holds the values for the 0x780 code points beyond ASCII (U+0080..U+07FF).
        See Also:
        Constant Field Values
      • index1

        private int[] index1
      • index2

        private int[] index2
      • data

        private int[] data
      • index2Length

        private int index2Length
      • dataCapacity

        private int dataCapacity
      • firstFreeBlock

        private int firstFreeBlock
      • index2NullOffset

        private int index2NullOffset
      • isCompacted

        private boolean isCompacted
      • map

        private int[] map
      • UTRIE2_DEBUG

        private boolean UTRIE2_DEBUG
    • Constructor Detail

      • Trie2Writable

        public Trie2Writable​(int initialValueP,
                             int errorValueP)
        Create a new, empty, writable Trie2. 32-bit data values are used.
        Parameters:
        initialValueP - the initial value that is set for all code points
        errorValueP - the value for out-of-range code points and illegal UTF-8
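
The roles of the two constructor arguments can be illustrated with a toy model. This is a minimal sketch of the documented behavior, assuming a plain map instead of the real index and data arrays; the class name is hypothetical and the exception thrown by set() is a choice made for the sketch.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model: initial value for unset code points, error value for invalid ones. */
final class InitialAndErrorValueSketch {
    private final int initialValue;
    private final int errorValue;
    private final Map<Integer, Integer> overrides = new HashMap<>();

    InitialAndErrorValueSketch(int initialValue, int errorValue) {
        this.initialValue = initialValue;
        this.errorValue = errorValue;
    }

    /** Stores a value for one valid code point and returns this for chaining. */
    InitialAndErrorValueSketch set(int c, int value) {
        if (c < 0 || c > 0x10FFFF) {
            throw new IllegalArgumentException("invalid code point"); // sketch choice
        }
        overrides.put(c, value);
        return this;
    }

    /** Out-of-range input yields the error value; unset code points the initial value. */
    int get(int c) {
        if (c < 0 || c > 0x10FFFF) {
            return errorValue;
        }
        return overrides.getOrDefault(c, initialValue);
    }

    public static void main(String[] args) {
        InitialAndErrorValueSketch t = new InitialAndErrorValueSketch(7, -1).set('A', 42);
        System.out.println(t.get('A'));      // 42
        System.out.println(t.get('B'));      // 7
        System.out.println(t.get(0x110000)); // -1
    }
}
```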
      • Trie2Writable

        public Trie2Writable​(Trie2 source)
        Create a new build-time (modifiable) Trie2 whose contents are the same as the source Trie2.
        Parameters:
        source - the source Trie2. Its contents will be copied into the new Trie2.
    • Method Detail

      • init

        private void init​(int initialValueP,
                          int errorValueP)
      • isInNullBlock

        private boolean isInNullBlock​(int c,
                                      boolean forLSCP)
      • allocIndex2Block

        private int allocIndex2Block()
      • getIndex2Block

        private int getIndex2Block​(int c,
                                   boolean forLSCP)
      • allocDataBlock

        private int allocDataBlock​(int copyBlock)
      • releaseDataBlock

        private void releaseDataBlock​(int block)
      • isWritableBlock

        private boolean isWritableBlock​(int block)
      • setIndex2Entry

        private void setIndex2Entry​(int i2,
                                    int block)
      • getDataBlock

        private int getDataBlock​(int c,
                                 boolean forLSCP)
        No error checking for illegal arguments.
      • set

        public Trie2Writable set​(int c,
                                 int value)
        Set a value for a code point.
        Parameters:
        c - the code point
        value - the value
      • set

        private Trie2Writable set​(int c,
                                  boolean forLSCP,
                                  int value)
      • uncompact

        private void uncompact()
      • writeBlock

        private void writeBlock​(int block,
                                int value)
      • fillBlock

        private void fillBlock​(int block,
                               int start,
                               int limit,
                               int value,
                               int initialValue,
                               boolean overwrite)
        initialValue is ignored if overwrite is true.
      • setRange

        public Trie2Writable setRange​(int start,
                                      int end,
                                      int value,
                                      boolean overwrite)
        Set a value in a range of code points [start..end]. All code points c with start<=c<=end will get the value if overwrite is true or if the old value is the initial value.
        Parameters:
        start - the first code point to get the value
        end - the last code point to get the value (inclusive)
        value - the value
        overwrite - flag for whether old non-initial values are to be overwritten
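
The effect of the overwrite flag is easiest to see on overlapping ranges. The sketch below models the documented rule with one array slot per code point; it is not the ICU implementation, and the class name is hypothetical.

```java
import java.util.Arrays;

/** Toy model of the documented setRange() rule. */
final class SetRangeSketch {
    private final int initialValue;
    private final int[] values = new int[0x110000]; // one slot per code point

    SetRangeSketch(int initialValue) {
        this.initialValue = initialValue;
        Arrays.fill(values, initialValue);
    }

    /** Writes value to [start..end] where overwrite is true or the slot is still initial. */
    SetRangeSketch setRange(int start, int end, int value, boolean overwrite) {
        if (start < 0 || end > 0x10FFFF || start > end) {
            throw new IllegalArgumentException("invalid code point range"); // sketch choice
        }
        for (int c = start; c <= end; c++) {
            if (overwrite || values[c] == initialValue) {
                values[c] = value;
            }
        }
        return this;
    }

    int get(int c) {
        return values[c];
    }

    public static void main(String[] args) {
        SetRangeSketch t = new SetRangeSketch(0)
                .setRange('A', 'Z', 1, true)
                .setRange('P', 'e', 2, false); // overlaps P..Z
        System.out.println(t.get('A')); // 1
        System.out.println(t.get('P')); // 1, kept because overwrite is false
        System.out.println(t.get('a')); // 2, was still the initial value
    }
}
```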
      • setRange

        public Trie2Writable setRange​(Trie2.Range range,
                                      boolean overwrite)
        Set the values from a Trie2.Range. All code points within the range will get the value if overwrite is true or if the old value is the initial value. Ranges with the lead surrogate flag set will set the alternate lead-surrogate values in the Trie, rather than the code point values. This function is intended to work with the ranges produced when iterating the contents of a source Trie.
        Parameters:
        range - contains the range of code points and the value to be set.
        overwrite - flag for whether old non-initial values are to be overwritten
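
Copying a source trie range by range can be sketched as follows. The Range class here is a hypothetical stand-in for Trie2.Range, and the two arrays are an assumption made to keep the model small; the point is only that ranges flagged as lead-surrogate go to a separate store from ordinary code point ranges.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of setRange(Range, overwrite) with a lead-surrogate flag. */
final class RangeCopySketch {
    /** Hypothetical stand-in for Trie2.Range. */
    static final class Range {
        final int startCodePoint;
        final int endCodePoint;
        final int value;
        final boolean leadSurrogate;

        Range(int start, int end, int value, boolean leadSurrogate) {
            this.startCodePoint = start;
            this.endCodePoint = end;
            this.value = value;
            this.leadSurrogate = leadSurrogate;
        }
    }

    private final int initialValue;
    private final int[] codePointValues = new int[0x110000];
    private final int[] leadUnitValues = new int[0x400]; // U+D800..U+DBFF

    RangeCopySketch(int initialValue) {
        this.initialValue = initialValue;
        Arrays.fill(codePointValues, initialValue);
        Arrays.fill(leadUnitValues, initialValue);
    }

    RangeCopySketch setRange(Range r, boolean overwrite) {
        // Flagged ranges target the alternate lead-surrogate store.
        int[] target = r.leadSurrogate ? leadUnitValues : codePointValues;
        int base = r.leadSurrogate ? 0xD800 : 0;
        if (r.startCodePoint < base || r.startCodePoint > r.endCodePoint
                || r.endCodePoint - base >= target.length) {
            throw new IllegalArgumentException("invalid range"); // sketch choice
        }
        for (int c = r.startCodePoint; c <= r.endCodePoint; c++) {
            if (overwrite || target[c - base] == initialValue) {
                target[c - base] = r.value;
            }
        }
        return this;
    }

    int get(int c) {
        return codePointValues[c];
    }

    int getFromU16SingleLead(char u) {
        return Character.isHighSurrogate(u) ? leadUnitValues[u - 0xD800] : codePointValues[u];
    }

    /** Rebuilds a model from the ranges a source would produce when iterated. */
    static RangeCopySketch copyOf(List<Range> ranges, int initialValue) {
        RangeCopySketch copy = new RangeCopySketch(initialValue);
        for (Range r : ranges) {
            copy.setRange(r, true);
        }
        return copy;
    }

    public static void main(String[] args) {
        List<Range> source = Arrays.asList(
                new Range(0x41, 0x5A, 5, false),
                new Range(0xD800, 0xD8FF, 9, true));
        RangeCopySketch copy = copyOf(source, 0);
        System.out.println(copy.get(0x41));                          // 5
        System.out.println(copy.get(0xD800));                        // 0
        System.out.println(copy.getFromU16SingleLead((char) 0xD800)); // 9
    }
}
```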
      • setForLeadSurrogateCodeUnit

        public Trie2Writable setForLeadSurrogateCodeUnit​(char codeUnit,
                                                         int value)
        Set a value for a UTF-16 code unit. Note that a Trie2 stores separate values for supplementary code points in the lead surrogate range (accessed via the plain set() and get() interfaces) and for lead surrogate code units. The lead surrogate code unit values are set via this function and read by the function getFromU16SingleLead(). For code units outside of the lead surrogate range, this function behaves identically to set().
        Parameters:
        codeUnit - A UTF-16 code unit.
        value - the value to be stored in the Trie2.
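
The separation between code point values and lead surrogate code unit values can be modeled with two independent stores. This is a minimal sketch of the behavior described above, assuming flat arrays rather than the real trie layout; the class name is hypothetical, and the method names simply mirror the ones documented here.

```java
/** Toy model: lead surrogates carry a main value and an alternate code unit value. */
final class LeadSurrogateSketch {
    private final int[] codePointValues = new int[0x110000];
    private final int[] leadUnitValues = new int[0x400]; // U+D800..U+DBFF

    void set(int c, int value) {
        codePointValues[c] = value;
    }

    int get(int c) {
        return codePointValues[c];
    }

    /** Lead surrogates go to the alternate store; everything else behaves like set(). */
    void setForLeadSurrogateCodeUnit(char codeUnit, int value) {
        if (Character.isHighSurrogate(codeUnit)) {
            leadUnitValues[codeUnit - 0xD800] = value;
        } else {
            set(codeUnit, value);
        }
    }

    /** Reads the alternate value for lead surrogates, the main value otherwise. */
    int getFromU16SingleLead(char codeUnit) {
        return Character.isHighSurrogate(codeUnit)
                ? leadUnitValues[codeUnit - 0xD800]
                : get(codeUnit);
    }

    public static void main(String[] args) {
        LeadSurrogateSketch t = new LeadSurrogateSketch();
        char lead = (char) 0xD801;
        t.set(lead, 1);                         // main value
        t.setForLeadSurrogateCodeUnit(lead, 2); // alternate value
        System.out.println(t.get(lead));                  // 1
        System.out.println(t.getFromU16SingleLead(lead)); // 2

        t.setForLeadSurrogateCodeUnit('A', 3);  // same as set('A', 3)
        System.out.println(t.get('A'));                   // 3
        System.out.println(t.getFromU16SingleLead('A'));  // 3
    }
}
```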
      • get

        public int get​(int codePoint)
        Get the value for a code point as stored in the Trie2.
        Specified by:
        get in class Trie2
        Parameters:
        codePoint - the code point
        Returns:
        the value
      • get

        private int get​(int c,
                        boolean fromLSCP)
      • getFromU16SingleLead

        public int getFromU16SingleLead​(char c)
        Get a trie value for a UTF-16 code unit. This function returns the same value as get() if the input character is outside of the lead surrogate range. There are two values stored in a Trie for inputs in the lead surrogate range. This function returns the alternate value, while Trie2.get() returns the main value.
        Specified by:
        getFromU16SingleLead in class Trie2
        Parameters:
        c - the code point or lead surrogate value.
        Returns:
        the value
      • equal_int

        private boolean equal_int​(int[] a,
                                  int s,
                                  int t,
                                  int length)
      • findSameIndex2Block

        private int findSameIndex2Block​(int index2Length,
                                        int otherBlock)
      • findSameDataBlock

        private int findSameDataBlock​(int dataLength,
                                      int otherBlock,
                                      int blockLength)
      • findHighStart

        private int findHighStart​(int highValue)
      • compactData

        private void compactData()
      • compactIndex2

        private void compactIndex2()
      • compactTrie

        private void compactTrie()
      • toTrie2_16

        public Trie2_16 toTrie2_16()
        Produce an optimized, read-only Trie2_16 from this writable Trie. Data values that do not fit in an unsigned 16-bit value will be truncated.
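
What truncation means for a stored 32-bit value can be sketched as below. This assumes that truncation keeps the low 16 bits, as a Java narrowing conversion to char would; the page itself does not spell out the mechanism, and the class and method names are hypothetical.

```java
/** Illustrates keeping only the low 16 bits of a 32-bit data value. */
final class Truncate16Sketch {
    /** Assumed truncation rule: discard everything above bit 15. */
    static int truncateTo16(int value) {
        return value & 0xFFFF;
    }

    public static void main(String[] args) {
        System.out.printf("0x%X%n", truncateTo16(0x1234));     // 0x1234, fits unchanged
        System.out.printf("0x%X%n", truncateTo16(0x12345678)); // 0x5678
        System.out.printf("0x%X%n", truncateTo16(-1));         // 0xFFFF
    }
}
```

Values that must survive conversion intact are better kept within 0..0xFFFF, or converted with toTrie2_32() instead.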
      • toTrie2_32

        public Trie2_32 toTrie2_32()
        Produce an optimized, read-only Trie2_32 from this writable Trie.