Package com.ibm.icu.impl
Class IntTrieBuilder
- java.lang.Object
  - com.ibm.icu.impl.TrieBuilder
    - com.ibm.icu.impl.IntTrieBuilder
public class IntTrieBuilder extends TrieBuilder
Builder class to manipulate and generate a trie. This is useful for ICU data in primitive types. Provides a compact way to store information that is indexed by Unicode values, such as character properties, types, keyboard values, etc. This is very useful when you have a block of Unicode data that contains significant values while the rest of the Unicode data is unused in the application, or when you have a lot of redundancy, such as where all 21,000 Han ideographs have the same value. Lookup is also much faster than in a hash table. A trie of any primitive data type serves two purposes:
- Fast access of the indexed values.
- Smaller memory footprint.
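To make the two purposes concrete, the following is a minimal sketch of a two-stage lookup table: an index array maps each block of code points to an offset in a shared data array, and untouched blocks all point at one initial-value block. It is an illustration only and not ICU's implementation; the class name `TwoStageSketch`, the block size of 32, and all method names are assumptions made for the example.

```java
import java.util.Arrays;

/** Simplified two-stage table; hypothetical names, not the ICU classes. */
final class TwoStageSketch {
    static final int SHIFT = 5;                    // assumed block size: 32 code points
    static final int BLOCK_LENGTH = 1 << SHIFT;
    static final int MASK = BLOCK_LENGTH - 1;
    static final int MAX_CODE_POINT = 0x10FFFF;

    // One index entry per block; 0 means "shared block zero" (all initial values).
    private final int[] index = new int[(MAX_CODE_POINT >> SHIFT) + 1];
    private int[] data = new int[BLOCK_LENGTH];
    private int dataLength = BLOCK_LENGTH;

    TwoStageSketch(int initialValue) {
        Arrays.fill(data, initialValue);           // block zero holds the initial value
    }

    void set(int ch, int value) {
        int i = ch >> SHIFT;
        if (index[i] == 0) {                       // first write into this block
            if (dataLength + BLOCK_LENGTH > data.length) {
                data = Arrays.copyOf(data, data.length * 2);
            }
            // Start the new block as a copy of block zero.
            System.arraycopy(data, 0, data, dataLength, BLOCK_LENGTH);
            index[i] = dataLength;
            dataLength += BLOCK_LENGTH;
        }
        data[index[i] + (ch & MASK)] = value;
    }

    /** Two array reads and no hashing: index lookup, then data lookup. */
    int get(int ch) {
        return data[index[ch >> SHIFT] + (ch & MASK)];
    }

    /** Number of data slots in use; grows only for blocks that were written. */
    int dataLength() {
        return dataLength;
    }
}
```

In this sketch, setting a single value allocates one extra 32-entry block, while every other code point keeps resolving to the shared initial-value block.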
Nested Class Summary
-
Nested classes/interfaces inherited from class com.ibm.icu.impl.TrieBuilder
TrieBuilder.DataManipulate
-
-
Field Summary
Fields
protected int[] m_data_
protected int m_initialValue_
private int m_leadUnitValue_
-
Fields inherited from class com.ibm.icu.impl.TrieBuilder
BMP_INDEX_LENGTH_, DATA_BLOCK_LENGTH, DATA_GRANULARITY_, INDEX_SHIFT_, m_dataCapacity_, m_dataLength_, m_index_, m_indexLength_, m_isCompacted_, m_isLatin1Linear_, m_map_, MASK_, MAX_DATA_LENGTH_, MAX_INDEX_LENGTH_, OPTIONS_DATA_IS_32_BIT_, OPTIONS_INDEX_SHIFT_, OPTIONS_LATIN1_IS_LINEAR_, SHIFT_, SURROGATE_BLOCK_COUNT_
-
-
Constructor Summary
Constructors
IntTrieBuilder(int[] aliasdata, int maxdatalength, int initialvalue, int leadunitvalue, boolean latin1linear)
Constructs a build table.
IntTrieBuilder(IntTrieBuilder table)
Copy constructor.
-
Method Summary
private int allocDataBlock()
private void compact(boolean overlap)
Compact a folded build-time trie.
private void fillBlock(int block, int start, int limit, int value, boolean overwrite)
private static int findSameDataBlock(int[] data, int dataLength, int otherBlock, int step)
Find the same data block.
private void fold(TrieBuilder.DataManipulate manipulate)
Fold the normalization data for supplementary code points into a compact area on top of the BMP-part of the trie index, with the lead surrogates indexing this compact area.
private int getDataBlock(int ch)
No error checking for illegal arguments.
int getValue(int ch)
Gets a 32-bit data value from the table data.
int getValue(int ch, boolean[] inBlockZero)
Gets a 32-bit data value from the table data.
IntTrie serialize(TrieBuilder.DataManipulate datamanipulate, Trie.DataManipulate triedatamanipulate)
Serializes the build table with 32-bit data.
int serialize(java.io.OutputStream os, boolean reduceTo16Bits, TrieBuilder.DataManipulate datamanipulate)
Serializes the build table to an output stream.
boolean setRange(int start, int limit, int value, boolean overwrite)
Set a value in a range of code points [start..limit).
boolean setValue(int ch, int value)
Sets a 32-bit data value in the table data.
Methods inherited from class com.ibm.icu.impl.TrieBuilder
equal_int, findSameIndexBlock, findUnusedBlocks, isInZeroBlock
Constructor Detail
-
IntTrieBuilder
public IntTrieBuilder(IntTrieBuilder table)
Copy constructor
-
IntTrieBuilder
public IntTrieBuilder(int[] aliasdata, int maxdatalength, int initialvalue, int leadunitvalue, boolean latin1linear)
Constructs a build table.
Parameters:
aliasdata - data to be filled into the table
maxdatalength - maximum data length allowed in the table
initialvalue - initial data value
latin1linear - whether Latin-1 is to be linear
-
-
Method Detail
-
getValue
public int getValue(int ch)
Gets a 32-bit data value from the table data.
Parameters:
ch - code point for which data is to be retrieved
Returns:
the 32-bit data value
-
getValue
public int getValue(int ch, boolean[] inBlockZero)
Gets a 32-bit data value from the table data.
Parameters:
ch - code point for which data is to be retrieved
inBlockZero - output parameter; inBlockZero[0] is set to true if the code point maps into block zero, otherwise false
Returns:
the 32-bit data value
-
setValue
public boolean setValue(int ch, int value)
Sets a 32-bit data value in the table data.
Parameters:
ch - code point for which data is to be set
value - value to set
Returns:
true if the set is successful; false if the table has already been compacted
-
serialize
public IntTrie serialize(TrieBuilder.DataManipulate datamanipulate, Trie.DataManipulate triedatamanipulate)
Serializes the build table with 32-bit data.
Parameters:
datamanipulate - builder raw fold method implementation
triedatamanipulate - result trie fold method
Returns:
a new trie
-
serialize
public int serialize(java.io.OutputStream os, boolean reduceTo16Bits, TrieBuilder.DataManipulate datamanipulate) throws java.io.IOException
Serializes the build table to an output stream. Compacts the build-time trie after all values are set, and then writes the serialized form onto an output stream. After this, this build-time Trie can only be serialized again and/or closed; no further values can be added. This function is the rough equivalent of utrie_serialize() in ICU4C.
Parameters:
os - the output stream to which the serialized trie will be written. If null, the function still returns the size of the serialized Trie.
reduceTo16Bits - if true, reduce the data size to 16 bits. The resulting serialized form can then be used to create a CharTrie.
datamanipulate - builder raw fold method implementation
Returns:
the number of bytes written to the output stream
Throws:
java.io.IOException
-
setRange
public boolean setRange(int start, int limit, int value, boolean overwrite)
Set a value in a range of code points [start..limit). All code points c with start <= c < limit will get the value if overwrite is true or if the old value is 0.
Parameters:
start - the first code point to get the value
limit - one past the last code point to get the value
value - the value
overwrite - flag for whether old non-initial values are to be overwritten
Returns:
false if a failure occurred (illegal argument or data array overrun)
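The half-open range and the overwrite flag can be illustrated on a plain array. This is a sketch of the documented contract only, not the builder's block-based code; `RangeFillSketch` and its parameters are hypothetical. It assumes that "the old value is 0" means "the old value is still the initial value", with the initial value passed in explicitly.

```java
/** Illustrates setRange semantics on a flat array; hypothetical helper, not ICU code. */
final class RangeFillSketch {
    /**
     * Writes value into data[start..limit). With overwrite == false, only slots
     * that still hold initialValue are changed (assumed reading of the contract).
     * Returns false for an illegal range instead of throwing.
     */
    static boolean setRange(int[] data, int initialValue,
                            int start, int limit, int value, boolean overwrite) {
        if (start < 0 || start > limit || limit > data.length) {
            return false;                          // illegal argument or overrun
        }
        for (int c = start; c < limit; c++) {      // limit itself is excluded
            if (overwrite || data[c] == initialValue) {
                data[c] = value;
            }
        }
        return true;
    }
}
```

For example, filling [2..5) with 9 and then [4..7) with 3 without overwrite leaves slot 4 at 9 and sets slots 5 and 6 to 3.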
-
allocDataBlock
private int allocDataBlock()
-
getDataBlock
private int getDataBlock(int ch)
No error checking for illegal arguments.
Parameters:
ch - code point to look for
Returns:
-1 if no new data block available (out of memory in data array)
-
compact
private void compact(boolean overlap)
Compact a folded build-time trie. The compaction
- removes blocks that are identical with earlier ones
- overlaps adjacent blocks as much as possible (if overlap == true)
- moves blocks in steps of the data granularity
- moves and overlaps blocks that overlap with multiple values in the overlap region
It does not
- try to move and overlap blocks that are not already adjacent
Parameters:
overlap - whether adjacent blocks are to be overlapped
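The first compaction step, removing blocks that are identical with earlier ones, can be sketched as follows. This is a simplified illustration that skips overlapping and granularity handling; `CompactSketch` is a hypothetical name, and the sketch assumes the data length is a multiple of the block length and that every index entry is the start offset of a block.

```java
import java.util.Arrays;

/** Deduplicates identical data blocks; a simplified sketch, not ICU's compact(). */
final class CompactSketch {
    /** Returns the compacted data array and rewrites index[] in place. */
    static int[] compact(int[] data, int[] index, int blockLength) {
        int[] out = new int[data.length];
        int outLength = 0;
        int[] map = new int[data.length / blockLength]; // old block number -> new offset

        for (int start = 0; start < data.length; start += blockLength) {
            int target = -1;
            // Look for an identical block among those already kept.
            for (int prev = 0; prev < outLength && target < 0; prev += blockLength) {
                if (sameBlock(data, start, out, prev, blockLength)) {
                    target = prev;
                }
            }
            if (target < 0) {                      // unique so far: keep it
                System.arraycopy(data, start, out, outLength, blockLength);
                target = outLength;
                outLength += blockLength;
            }
            map[start / blockLength] = target;
        }
        for (int i = 0; i < index.length; i++) {   // redirect index entries
            index[i] = map[index[i] / blockLength];
        }
        return Arrays.copyOf(out, outLength);
    }

    private static boolean sameBlock(int[] a, int aStart, int[] b, int bStart, int length) {
        for (int i = 0; i < length; i++) {
            if (a[aStart + i] != b[bStart + i]) {
                return false;
            }
        }
        return true;
    }
}
```

Lookups through the rewritten index return the same values as before, but duplicate blocks are stored only once.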
-
findSameDataBlock
private static final int findSameDataBlock(int[] data, int dataLength, int otherBlock, int step)
Find the same data block.
Parameters:
data - array
dataLength -
otherBlock -
step -
-
fold
private final void fold(TrieBuilder.DataManipulate manipulate)
Fold the normalization data for supplementary code points into a compact area on top of the BMP-part of the trie index, with the lead surrogates indexing this compact area. Duplicate the index values for lead surrogates: from inside the BMP area, where some may be overridden with folded values, to just after the BMP area, where they can be retrieved for code point lookups.
Parameters:
manipulate - fold implementation
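Folding relies on each lead surrogate standing for a contiguous run of 1024 supplementary code points. The sketch below shows only that UTF-16 arithmetic, which is what lets a lead surrogate act as a key into a per-lead area; it does not reproduce the fold itself, and `LeadSurrogateSketch` with its method names is hypothetical.

```java
/** UTF-16 lead-surrogate arithmetic; illustrative helper, not part of the ICU API. */
final class LeadSurrogateSketch {
    /** Lead surrogate for a supplementary code point (0x10000..0x10FFFF). */
    static char leadOf(int supplementary) {
        // 0xD7C0 == 0xD800 - (0x10000 >> 10)
        return (char) (0xD7C0 + (supplementary >> 10));
    }

    /** First of the 1024 supplementary code points that share the given lead. */
    static int firstCodePointFor(char lead) {
        return ((lead - 0xD800) << 10) + 0x10000;
    }
}
```

All code points from `firstCodePointFor(lead)` through `firstCodePointFor(lead) + 0x3FF` share the same lead surrogate, so per-lead data can describe the whole run.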
-
fillBlock
private void fillBlock(int block, int start, int limit, int value, boolean overwrite)
-
-