Package com.ibm.icu.util
Class CodePointTrie
java.lang.Object
  com.ibm.icu.util.CodePointMap
    com.ibm.icu.util.CodePointTrie
- All Implemented Interfaces:
java.lang.Iterable<CodePointMap.Range>
- Direct Known Subclasses:
CodePointTrie.Fast, CodePointTrie.Small
public abstract class CodePointTrie extends CodePointMap
Immutable Unicode code point trie. A fast, reasonably compact map from Unicode code points (U+0000..U+10FFFF) to integer values. For details see https://icu.unicode.org/design/struct/utrie
This class is not intended for public subclassing.
- See Also:
MutableCodePointTrie
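To make the idea of a fast, compact code point map concrete, here is a minimal two-stage table sketch. It is a teaching model only and not ICU's actual data layout: the class name `TwoStageSketch`, its single-level index, and the choice of 64 code points per block are assumptions made for illustration. It also merges building and lookup into one class, whereas ICU builds with MutableCodePointTrie and looks up in the immutable CodePointTrie.

```java
import java.util.Arrays;

/** Simplified two-stage lookup table; a teaching model, not ICU's layout. */
final class TwoStageSketch {
    static final int SHIFT = 6;                  // assumed: 64 code points per data block
    static final int BLOCK_LENGTH = 1 << SHIFT;
    static final int MASK = BLOCK_LENGTH - 1;
    static final int MAX_CODE_POINT = 0x10FFFF;

    private final int[] index;    // per block: start offset of its data block
    private int[] data;           // offset 0 holds the shared block of initial values
    private int dataLength;
    private final int errorValue;

    TwoStageSketch(int initialValue, int errorValue) {
        this.errorValue = errorValue;
        // All index entries start at 0, so every block shares one data block.
        index = new int[(MAX_CODE_POINT + 1) >> SHIFT];
        data = new int[BLOCK_LENGTH];
        Arrays.fill(data, initialValue);
        dataLength = BLOCK_LENGTH;
    }

    void set(int c, int value) {
        if (c < 0 || c > MAX_CODE_POINT) {
            throw new IllegalArgumentException("code point out of range");
        }
        int block = c >> SHIFT;
        if (index[block] == 0) {
            // First write into this block: give it a private copy of the shared block.
            data = Arrays.copyOf(data, dataLength + BLOCK_LENGTH);
            System.arraycopy(data, 0, data, dataLength, BLOCK_LENGTH);
            index[block] = dataLength;
            dataLength += BLOCK_LENGTH;
        }
        data[index[block] + (c & MASK)] = value;
    }

    /** Range-checked lookup: one index read plus one data read. */
    int get(int c) {
        if (c < 0 || c > MAX_CODE_POINT) {
            return errorValue;
        }
        return data[index[c >> SHIFT] + (c & MASK)];
    }

    public static void main(String[] args) {
        TwoStageSketch map = new TwoStageSketch(0, -1);
        map.set(0x41, 7);
        System.out.println(map.get(0x41));
        System.out.println(map.get(0x110000));
    }
}
```

The space saving comes from untouched blocks all pointing at the same shared data block; only blocks that hold a non-initial value get storage of their own.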
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class com.ibm.icu.util.CodePointMap
CodePointMap.Range, CodePointMap.RangeOption, CodePointMap.StringIterator, CodePointMap.ValueFilter
-
-
Field Summary
Fields (Modifier and Type, Field, Description)
private int[] ascii
private static int ASCII_LIMIT
private static int BMP_INDEX_LENGTH
    The length of the BMP index table.
(package private) static int CP_PER_INDEX_2_ENTRY
    Number of code points per index-2 table entry.
protected CodePointTrie.Data data
    Deprecated. This API is ICU internal only.
protected int dataLength
    Deprecated. This API is ICU internal only.
private int dataNullOffset
    Internal data null block offset, not shifted.
private static int ERROR_VALUE_NEG_DATA_OFFSET
    Offset from dataLength (to be subtracted) for fetching the value returned for out-of-range code points and ill-formed UTF-8/16.
(package private) static int FAST_DATA_BLOCK_LENGTH
    Number of entries in a data block for code points below the fast limit.
private static int FAST_DATA_MASK
    Mask for getting the lower bits for the in-fast-data-block offset.
(package private) static int FAST_SHIFT
private static int HIGH_VALUE_NEG_DATA_OFFSET
    Offset from dataLength (to be subtracted) for fetching the value returned for code points highStart..U+10FFFF.
protected int highStart
    Deprecated. This API is ICU internal only.
private char[] index
(package private) static int INDEX_2_BLOCK_LENGTH
    Number of entries in an index-2 block.
(package private) static int INDEX_2_MASK
    Mask for getting the lower bits for the in-index-2-block offset.
(package private) static int INDEX_3_BLOCK_LENGTH
    Number of entries in an index-3 block.
private static int INDEX_3_MASK
    Mask for getting the lower bits for the in-index-3-block offset.
private int index3NullOffset
    Internal index-3 null block offset.
private static int MAX_UNICODE
(package private) static int NO_DATA_NULL_OFFSET
(package private) static int NO_INDEX3_NULL_OFFSET
    Value for index3NullOffset which indicates that there is no index-3 null block.
private int nullValue
private static int OMITTED_BMP_INDEX_1_LENGTH
    Number of index-1 entries for the BMP.
private static int OPTIONS_DATA_LENGTH_MASK
private static int OPTIONS_DATA_NULL_OFFSET_MASK
private static int OPTIONS_RESERVED_MASK
private static int OPTIONS_VALUE_BITS_MASK
private static int SHIFT_1
    Shift size for getting the index-1 table offset.
(package private) static int SHIFT_1_2
    Difference between two shift sizes, for getting an index-1 offset from an index-2 offset.
private static int SHIFT_2
    Shift size for getting the index-2 table offset.
(package private) static int SHIFT_2_3
    Difference between two shift sizes, for getting an index-2 offset from an index-3 offset.
(package private) static int SHIFT_3
    Shift size for getting the index-3 table offset.
(package private) static int SMALL_DATA_BLOCK_LENGTH
    Number of entries in a small data block.
(package private) static int SMALL_DATA_MASK
    Mask for getting the lower bits for the in-small-data-block offset.
private static int SMALL_INDEX_LENGTH
(package private) static int SMALL_LIMIT
private static int SMALL_MAX
-
Constructor Summary
Constructors (Modifier, Constructor, Description)
private CodePointTrie(char[] index, CodePointTrie.Data data, int highStart, int index3NullOffset, int dataNullOffset)
-
Method Summary
Methods (Modifier and Type, Method, Description)
int asciiGet(int c)
    Returns a trie value for an ASCII code point, without range checking.
protected abstract int cpIndex(int c)
    Deprecated. This API is ICU internal only.
protected int fastIndex(int c)
    Deprecated. This API is ICU internal only.
static CodePointTrie fromBinary(CodePointTrie.Type type, CodePointTrie.ValueWidth valueWidth, java.nio.ByteBuffer bytes)
    Creates a trie from its binary form, stored in the ByteBuffer starting at the current position.
int get(int c)
    Returns the value for a code point as stored in the map, with range checking.
boolean getRange(int start, CodePointMap.ValueFilter filter, CodePointMap.Range range)
    Sets the range object to a range of code points beginning with the start parameter.
abstract CodePointTrie.Type getType()
    Returns the trie type.
CodePointTrie.ValueWidth getValueWidth()
    Returns the number of bits in a trie data value.
private int internalSmallIndex(CodePointTrie.Type type, int c)
private static int maybeFilterValue(int value, int trieNullValue, int nullValue, CodePointMap.ValueFilter filter)
protected int smallIndex(CodePointTrie.Type type, int c)
    Deprecated. This API is ICU internal only.
int toBinary(java.io.OutputStream os)
    Writes a representation of the trie to the output stream.
-
Methods inherited from class com.ibm.icu.util.CodePointMap
getRange, iterator, stringIterator
-
-
-
-
Field Detail
-
MAX_UNICODE
private static final int MAX_UNICODE
- See Also:
- Constant Field Values
-
ASCII_LIMIT
private static final int ASCII_LIMIT
- See Also:
- Constant Field Values
-
FAST_SHIFT
static final int FAST_SHIFT
- See Also:
- Constant Field Values
-
FAST_DATA_BLOCK_LENGTH
static final int FAST_DATA_BLOCK_LENGTH
Number of entries in a data block for code points below the fast limit. 64=0x40. ICU internal.
- See Also:
- Constant Field Values
-
FAST_DATA_MASK
private static final int FAST_DATA_MASK
Mask for getting the lower bits for the in-fast-data-block offset. ICU internal.
- See Also:
- Constant Field Values
-
SMALL_MAX
private static final int SMALL_MAX
- See Also:
- Constant Field Values
-
ERROR_VALUE_NEG_DATA_OFFSET
private static final int ERROR_VALUE_NEG_DATA_OFFSET
Offset from dataLength (to be subtracted) for fetching the value returned for out-of-range code points and ill-formed UTF-8/16.
- See Also:
- Constant Field Values
-
HIGH_VALUE_NEG_DATA_OFFSET
private static final int HIGH_VALUE_NEG_DATA_OFFSET
Offset from dataLength (to be subtracted) for fetching the value returned for code points highStart..U+10FFFF.
- See Also:
- Constant Field Values
-
BMP_INDEX_LENGTH
private static final int BMP_INDEX_LENGTH
The length of the BMP index table. 1024=0x400
- See Also:
- Constant Field Values
-
SMALL_LIMIT
static final int SMALL_LIMIT
- See Also:
- Constant Field Values
-
SMALL_INDEX_LENGTH
private static final int SMALL_INDEX_LENGTH
- See Also:
- Constant Field Values
-
SHIFT_3
static final int SHIFT_3
Shift size for getting the index-3 table offset.
- See Also:
- Constant Field Values
-
SHIFT_2
private static final int SHIFT_2
Shift size for getting the index-2 table offset.
- See Also:
- Constant Field Values
-
SHIFT_1
private static final int SHIFT_1
Shift size for getting the index-1 table offset.
- See Also:
- Constant Field Values
-
SHIFT_2_3
static final int SHIFT_2_3
Difference between two shift sizes, for getting an index-2 offset from an index-3 offset. 5=9-4
- See Also:
- Constant Field Values
-
SHIFT_1_2
static final int SHIFT_1_2
Difference between two shift sizes, for getting an index-1 offset from an index-2 offset. 5=14-9
- See Also:
- Constant Field Values
-
OMITTED_BMP_INDEX_1_LENGTH
private static final int OMITTED_BMP_INDEX_1_LENGTH
Number of index-1 entries for the BMP (4). This part of the index-1 table is omitted from the serialized form.
- See Also:
- Constant Field Values
-
INDEX_2_BLOCK_LENGTH
static final int INDEX_2_BLOCK_LENGTH
Number of entries in an index-2 block. 32=0x20
- See Also:
- Constant Field Values
-
INDEX_2_MASK
static final int INDEX_2_MASK
Mask for getting the lower bits for the in-index-2-block offset.
- See Also:
- Constant Field Values
-
CP_PER_INDEX_2_ENTRY
static final int CP_PER_INDEX_2_ENTRY
Number of code points per index-2 table entry. 512=0x200
- See Also:
- Constant Field Values
-
INDEX_3_BLOCK_LENGTH
static final int INDEX_3_BLOCK_LENGTH
Number of entries in an index-3 block. 32=0x20
- See Also:
- Constant Field Values
-
INDEX_3_MASK
private static final int INDEX_3_MASK
Mask for getting the lower bits for the in-index-3-block offset.
- See Also:
- Constant Field Values
-
SMALL_DATA_BLOCK_LENGTH
static final int SMALL_DATA_BLOCK_LENGTH
Number of entries in a small data block. 16=0x10
- See Also:
- Constant Field Values
-
SMALL_DATA_MASK
static final int SMALL_DATA_MASK
Mask for getting the lower bits for the in-small-data-block offset.
- See Also:
- Constant Field Values
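The shift, block-length, and mask constants above are tied together by simple bit arithmetic. The sketch below recomputes the documented numbers and splits a code point into its per-level offsets. The shift values 4, 9, and 14 come from the "5=9-4" and "5=14-9" notes. `FAST_SHIFT = 6` is inferred from the documented block length of 64, and each mask is assumed to be its block length minus one. The class `TrieShiftSketch` and its `split`/`join` helpers are hypothetical and do not reproduce ICU's actual index lookup.

```java
/** Recomputes the documented constants from the shift sizes; illustrative only. */
final class TrieShiftSketch {
    static final int SHIFT_3 = 4;                       // from "5=9-4"
    static final int SHIFT_2 = 9;                       // from "5=9-4" and "5=14-9"
    static final int SHIFT_1 = 14;                      // from "5=14-9"
    static final int SHIFT_2_3 = SHIFT_2 - SHIFT_3;     // 5
    static final int SHIFT_1_2 = SHIFT_1 - SHIFT_2;     // 5
    static final int FAST_SHIFT = 6;                    // inferred: 1 << 6 == 64

    static final int FAST_DATA_BLOCK_LENGTH = 1 << FAST_SHIFT;            // 64
    static final int BMP_INDEX_LENGTH = 0x10000 >> FAST_SHIFT;            // 1024
    static final int OMITTED_BMP_INDEX_1_LENGTH = 0x10000 >> SHIFT_1;     // 4
    static final int INDEX_2_BLOCK_LENGTH = 1 << SHIFT_1_2;               // 32
    static final int CP_PER_INDEX_2_ENTRY = 1 << SHIFT_2;                 // 512
    static final int INDEX_3_BLOCK_LENGTH = 1 << SHIFT_2_3;               // 32
    static final int SMALL_DATA_BLOCK_LENGTH = 1 << SHIFT_3;              // 16

    // Assumption: each mask is the matching block length minus one.
    static final int INDEX_2_MASK = INDEX_2_BLOCK_LENGTH - 1;
    static final int INDEX_3_MASK = INDEX_3_BLOCK_LENGTH - 1;
    static final int SMALL_DATA_MASK = SMALL_DATA_BLOCK_LENGTH - 1;

    /**
     * Splits a code point into
     * {index-1 offset, in-index-2-block offset, in-index-3-block offset, in-data-block offset}.
     */
    static int[] split(int c) {
        return new int[] {
            c >> SHIFT_1,
            (c >> SHIFT_2) & INDEX_2_MASK,
            (c >> SHIFT_3) & INDEX_3_MASK,
            c & SMALL_DATA_MASK
        };
    }

    /** Inverse of split(): reassembles the code point from its four parts. */
    static int join(int[] parts) {
        return (parts[0] << SHIFT_1) | (parts[1] << SHIFT_2) | (parts[2] << SHIFT_3) | parts[3];
    }

    public static void main(String[] args) {
        int[] parts = split(0x1F600);
        System.out.println(java.util.Arrays.toString(parts));
        System.out.println(Integer.toHexString(join(parts)));
    }
}
```

Splitting U+1F600 this way gives an index-1 offset of 7 and an in-index-2-block offset of 27, with both lower fields equal to 0.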
-
OPTIONS_DATA_LENGTH_MASK
private static final int OPTIONS_DATA_LENGTH_MASK
- See Also:
- Constant Field Values
-
OPTIONS_DATA_NULL_OFFSET_MASK
private static final int OPTIONS_DATA_NULL_OFFSET_MASK
- See Also:
- Constant Field Values
-
OPTIONS_RESERVED_MASK
private static final int OPTIONS_RESERVED_MASK
- See Also:
- Constant Field Values
-
OPTIONS_VALUE_BITS_MASK
private static final int OPTIONS_VALUE_BITS_MASK
- See Also:
- Constant Field Values
-
NO_INDEX3_NULL_OFFSET
static final int NO_INDEX3_NULL_OFFSET
Value for index3NullOffset which indicates that there is no index-3 null block. Bit 15 is unused for this value because this bit is used if the index-3 contains 18-bit indexes.
- See Also:
- Constant Field Values
-
NO_DATA_NULL_OFFSET
static final int NO_DATA_NULL_OFFSET
- See Also:
- Constant Field Values
-
ascii
private final int[] ascii
-
index
private final char[] index
-
data
@Deprecated protected final CodePointTrie.Data data
Deprecated. This API is ICU internal only.
-
dataLength
@Deprecated protected final int dataLength
Deprecated. This API is ICU internal only.
-
highStart
@Deprecated protected final int highStart
Deprecated. This API is ICU internal only.
Start of the last range which ends at U+10FFFF.
-
index3NullOffset
private final int index3NullOffset
Internal index-3 null block offset. Set to an impossibly high value (e.g., 0xffff) if there is no dedicated index-3 null block.
-
dataNullOffset
private final int dataNullOffset
Internal data null block offset, not shifted. Set to an impossibly high value (e.g., 0xfffff) if there is no dedicated data null block.
-
nullValue
private final int nullValue
-
-
Constructor Detail
-
CodePointTrie
private CodePointTrie(char[] index, CodePointTrie.Data data, int highStart, int index3NullOffset, int dataNullOffset)
-
-
Method Detail
-
fromBinary
public static CodePointTrie fromBinary(CodePointTrie.Type type, CodePointTrie.ValueWidth valueWidth, java.nio.ByteBuffer bytes)
Creates a trie from its binary form, stored in the ByteBuffer starting at the current position. Advances the buffer position to just after the trie data. Inverse of toBinary(OutputStream).
The data is copied from the buffer; later modification of the buffer will not affect the trie.
- Parameters:
type
- selects the trie type; this method throws an exception if the type does not match the binary data; use null to accept any type
valueWidth
- selects the number of bits in a data value; this method throws an exception if the valueWidth does not match the binary data; use null to accept any data value width
bytes
- a buffer containing the binary data of a CodePointTrie
- Returns:
- the trie
- See Also:
MutableCodePointTrie(int, int),
MutableCodePointTrie.buildImmutable(CodePointTrie.Type, CodePointTrie.ValueWidth),
toBinary(OutputStream)
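The contract described here (read at the buffer's current position, advance past the data, copy rather than alias, and pair with a writer that reports its byte count) can be illustrated without ICU. The sketch below uses a made-up record format, an int count followed by that many ints, which is not the CodePointTrie binary format. The class `BinaryRoundTripSketch` and its `write`/`read` methods are hypothetical stand-ins for toBinary and fromBinary.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

/** Writer/reader pair with the same contract shape; the format is made up. */
final class BinaryRoundTripSketch {
    /** Writes [count][values...] as big-endian ints and returns the bytes written. */
    static int write(int[] values, OutputStream os) {
        try {
            DataOutputStream out = new DataOutputStream(os);
            out.writeInt(values.length);
            for (int v : values) {
                out.writeInt(v);
            }
            out.flush();
            return out.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Reads one record starting at the buffer's current position.
     * The relative getInt() calls advance the position to just after the record,
     * and the values are copied into a new array, so later changes to the buffer
     * do not affect the result.
     */
    static int[] read(ByteBuffer bytes) {
        int count = bytes.getInt();
        int[] values = new int[count];
        for (int i = 0; i < count; i++) {
            values[i] = bytes.getInt();
        }
        return values;
    }

    public static void main(String[] args) {
        java.io.ByteArrayOutputStream os = new java.io.ByteArrayOutputStream();
        int written = write(new int[] {10, 20}, os);
        ByteBuffer buffer = ByteBuffer.wrap(os.toByteArray());
        int[] values = read(buffer);
        System.out.println(written + " bytes, position " + buffer.position()
                + ", first value " + values[0]);
    }
}
```

Because the position is left just after the record, a caller can store further data behind it in the same buffer and keep reading from there.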
-
getType
public abstract CodePointTrie.Type getType()
Returns the trie type.
- Returns:
- the trie type
-
getValueWidth
public final CodePointTrie.ValueWidth getValueWidth()
Returns the number of bits in a trie data value.
- Returns:
- the number of bits in a trie data value
-
get
public int get(int c)
Returns the value for a code point as stored in the map, with range checking. Returns an implementation-defined error value if c is not in the range 0..U+10FFFF.
- Specified by:
get in class CodePointMap
- Parameters:
c
- the code point
- Returns:
- the map value, or an implementation-defined error value if the code point is not in the range 0..U+10FFFF
-
asciiGet
public final int asciiGet(int c)
Returns a trie value for an ASCII code point, without range checking.
- Parameters:
c
- the input code point; must be U+0000..U+007F
- Returns:
- the ASCII code point's trie value
-
maybeFilterValue
private static final int maybeFilterValue(int value, int trieNullValue, int nullValue, CodePointMap.ValueFilter filter)
-
getRange
public final boolean getRange(int start, CodePointMap.ValueFilter filter, CodePointMap.Range range)
Sets the range object to a range of code points beginning with the start parameter. The range start is the same as the start input parameter (even if there are preceding code points that have the same value). The range end is the last code point such that all those from start to there have the same value. Returns false if start is not 0..U+10FFFF.
Can be used to efficiently iterate over all same-value ranges in a map. (This is normally faster than iterating over code points and calling get() for each one, but may be much slower than a data structure that stores ranges directly.)
If the CodePointMap.ValueFilter parameter is not null, then the value to be delivered is passed through that filter, and the return value is the end of the range where all values are modified to the same actual value. The value is unchanged if that parameter is null.
Example:
    int start = 0;
    CodePointMap.Range range = new CodePointMap.Range();
    while (map.getRange(start, null, range)) {
        int end = range.getEnd();
        int value = range.getValue();
        // Work with the range start..end and its value.
        start = end + 1;
    }
- Specified by:
getRange in class CodePointMap
- Parameters:
start
- range start
filter
- an object that may modify the map data value, or null if the values from the map are to be used unmodified
range
- the range object that will be set to the code point range and value
- Returns:
- true if start is 0..U+10FFFF; otherwise no new range is fetched
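The range semantics above (the range begins exactly at start, extends to the last code point with the same filtered value, and the call returns false for an out-of-range start) can be modeled on a plain array. This is a sketch under stated assumptions: `RangeSketch`, its nested `Range` class, and the use of `IntUnaryOperator` in place of CodePointMap.ValueFilter are all hypothetical. It scans code point by code point, which is exactly the slow approach the real method is meant to avoid.

```java
import java.util.function.IntUnaryOperator;

/** Array-backed model of the getRange contract; not ICU's implementation. */
final class RangeSketch {
    static final int MAX_CODE_POINT = 0x10FFFF;

    /** Minimal stand-in for CodePointMap.Range. */
    static final class Range {
        int start;
        int end;
        int value;
    }

    /** Value at c: values[c] if present, else defaultValue; then the optional filter. */
    private static int valueAt(int[] values, int defaultValue, int c, IntUnaryOperator filter) {
        int v = c < values.length ? values[c] : defaultValue;
        return filter == null ? v : filter.applyAsInt(v);
    }

    /**
     * Sets range to start..end, where end is the last code point such that every
     * code point from start to end has the same (filtered) value.
     * Returns false, leaving range untouched, if start is not 0..U+10FFFF.
     */
    static boolean getRange(int[] values, int defaultValue, int start,
                            IntUnaryOperator filter, Range range) {
        if (start < 0 || start > MAX_CODE_POINT) {
            return false;
        }
        int value = valueAt(values, defaultValue, start, filter);
        int end = start;
        while (end < MAX_CODE_POINT
                && valueAt(values, defaultValue, end + 1, filter) == value) {
            end++;
        }
        range.start = start;
        range.end = end;
        range.value = value;
        return true;
    }

    public static void main(String[] args) {
        int[] values = {0, 0, 5, 5, 5, 0};
        Range range = new Range();
        int start = 0;
        // Same loop shape as the documented example: step to end + 1 each time.
        while (getRange(values, 0, start, null, range)) {
            System.out.println(Integer.toHexString(range.start) + ".."
                    + Integer.toHexString(range.end) + " = " + range.value);
            start = range.end + 1;
        }
    }
}
```

Passing a filter that collapses several stored values into one makes neighboring ranges merge, because the comparison is made on the filtered values.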
-
toBinary
public final int toBinary(java.io.OutputStream os)
Writes a representation of the trie to the output stream. Inverse of fromBinary(com.ibm.icu.util.CodePointTrie.Type, com.ibm.icu.util.CodePointTrie.ValueWidth, java.nio.ByteBuffer).
- Parameters:
os
- the output stream
- Returns:
- the number of bytes written
-
fastIndex
@Deprecated protected final int fastIndex(int c)
Deprecated. This API is ICU internal only.
-
smallIndex
@Deprecated protected final int smallIndex(CodePointTrie.Type type, int c)
Deprecated. This API is ICU internal only.
-
internalSmallIndex
private final int internalSmallIndex(CodePointTrie.Type type, int c)
-
cpIndex
@Deprecated protected abstract int cpIndex(int c)
Deprecated. This API is ICU internal only.
-
-