Package com.ibm.icu.impl
Class UnicodeMap<T>
- java.lang.Object
- com.ibm.icu.impl.UnicodeMap<T>
- All Implemented Interfaces: StringTransform, Transform<java.lang.String,java.lang.String>, Freezable<UnicodeMap<T>>, java.lang.Cloneable, java.lang.Iterable<java.lang.String>
public final class UnicodeMap<T> extends java.lang.Object implements java.lang.Cloneable, Freezable<UnicodeMap<T>>, StringTransform, java.lang.Iterable<java.lang.String>
Class for mapping Unicode characters and strings to values, optimized for single code points where ranges of code points share the same value. It uses much less storage than a HashMap, and is much faster and more compact than a list of UnicodeSets. The API design mimics Map but can't extend it due to some necessary changes (much as UnicodeSet mimics Set). Note that nulls are not permitted as values; that is, put(x, null) is the same as remove(x).
At this point "" is also not allowed as a key, although that may change.
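The idea of storing one value per range of code points, with null meaning "unmapped", can be illustrated with a small standalone sketch. It is not the UnicodeMap implementation: the class name RangeValueSketch and its methods are hypothetical, it uses a java.util.TreeMap keyed by range start instead of the transitions and values arrays listed in the Field Summary, and it does not merge adjacent ranges that happen to share a value.

```java
import java.util.TreeMap;

/** Hypothetical range-to-value map; an illustration only, not UnicodeMap itself. */
final class RangeValueSketch<V> {
    private static final int MAX_CODE_POINT = 0x10FFFF;

    // Key = first code point of a range; value = value for that range (null = unmapped).
    private final TreeMap<Integer, V> starts = new TreeMap<>();

    RangeValueSketch() {
        starts.put(0, null); // initially a single unmapped range covers everything
    }

    /** Returns the value for a code point in 0..0x10FFFF, or null if it is unmapped. */
    V get(int codePoint) {
        return starts.floorEntry(codePoint).getValue();
    }

    /** Maps every code point in [start, end] to value; a null value unmaps the range. */
    RangeValueSketch<V> putRange(int start, int end, V value) {
        // Remember what applied just past the range so it can be restored at end + 1.
        V after = end < MAX_CODE_POINT ? get(end + 1) : null;
        starts.subMap(start, true, end, true).clear();
        starts.put(start, value);
        if (end < MAX_CODE_POINT) {
            starts.put(end + 1, after);
        }
        return this; // for chaining
    }

    /** Number of stored range starts, to show how little is stored per range. */
    int storedRanges() {
        return starts.size();
    }
}
```

In this sketch, mapping all of 'a' through 'z' to one value adds only two boundary entries, and putting null over part of that range behaves like a removal, which mirrors the put(x, null) rule described above.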
-
-
Nested Class Summary
Nested Classes (Modifier and Type, Class, Description)
- static class UnicodeMap.Composer<T>: Used to add complex values, where the value isn't replaced but in some sense composed.
- static class UnicodeMap.EntryRange<T>: Struct-like class used to iterate over a UnicodeMap in a for loop.
- private class UnicodeMap.EntryRanges
- private class UnicodeMap.EntrySetX
- private class UnicodeMap.IteratorX
-
Field Summary
Fields (Modifier and Type, Field, Description)
- (package private) static boolean ASSERTIONS: For serialization
- private java.util.LinkedHashSet<T> availableValues
- (package private) static boolean DEBUG_WRITE
- private boolean errorOnReset
- (package private) static long GROWTH_GAP
- (package private) static long GROWTH_PERCENT
- private int lastIndex
- private int length
- private boolean locked
- private boolean staleAvailableValues
- private java.util.TreeMap<java.lang.String,T> stringMap
- private int[] transitions
- (package private) T[] values
-
Constructor Summary
Constructors (Constructor, Description)
- UnicodeMap()
- UnicodeMap(UnicodeMap other)
-
Method Summary
Methods (Modifier and Type, Method, Description)
- private int __findIndex(int codepoint)
- private void _checkFind(int codepoint, int value)
- (package private) void _checkInvariants()
- private int _findIndex(int c): Finds an index such that inversionList[i] <= codepoint < inversionList[i+1]. Assumes that 0 <= codepoint <= 0x10FFFF.
- private void _insertGapAt(int index, int count): Add a gap from index to index+count-1.
- private UnicodeMap _put(int codepoint, T value): Associates code point with value.
- private UnicodeMap _putAll(int startCodePoint, int endCodePoint, T value)
- private void _removeAt(int index, int count): Remove the items from index through index+count-1.
- <U extends java.util.Map<T,UnicodeSet>> U addInverseTo(U target): Gets the inverse of this map, adding to the target.
- static boolean areEqual(java.lang.Object a, java.lang.Object b)
- UnicodeMap<T> clear()
- UnicodeMap<T> cloneAsThawed(): Standard clone.
- UnicodeMap<T> composeWith(UnicodeMap<T> other, UnicodeMap.Composer<T> composer)
- UnicodeMap<T> composeWith(UnicodeSet set, T value, UnicodeMap.Composer<T> composer)
- boolean containsKey(int key)
- boolean containsKey(java.lang.String key)
- boolean containsValue(T value)
- java.lang.Iterable<UnicodeMap.EntryRange<T>> entryRanges(): Returns an Iterable over EntryRange, designed for efficient for loops over UnicodeMaps.
- java.lang.Iterable<java.util.Map.Entry<java.lang.String,T>> entrySet()
- boolean equals(java.lang.Object other)
- static int findCommonPrefix(java.lang.String last, java.lang.String s): Utility to find the maximal common prefix of two strings.
- UnicodeMap<T> freeze(): Freezes the object.
- static <T> java.util.Map<T,UnicodeSet> freeze(java.util.Map<T,UnicodeSet> target): Freeze an inverse map.
- T get(int codepoint): Gets the value associated with a given code point.
- T get(java.lang.String value): Gets the value associated with a given string.
- java.util.Collection<T> getAvailableValues(): Old form for compatibility
- <U extends java.util.Collection<T>> U getAvailableValues(U result): Old form for compatibility
- boolean getErrorOnReset()
- java.util.Set<java.lang.String> getNonRangeStrings(): Get the strings that are not in the ranges.
- int getRangeCount(): Get the number of ranges; used for getRangeStart/End.
- int getRangeEnd(int range): Get the end of a range.
- int getRangeStart(int range): Get the start of a range.
- T getRangeValue(int range): Get the value for the range.
- UnicodeSet getSet(T value): Old form for compatibility
- UnicodeSet getSet(T value, UnicodeSet result): Old form for compatibility
- T getValue(int key): Old form for compatibility
- T getValue(java.lang.String key): Old form for compatibility
- int hashCode()
- boolean isEmpty()
- boolean isFrozen(): Determines whether the object has been frozen or not.
- java.util.Iterator<java.lang.String> iterator()
- UnicodeSet keySet(): Returns the keyset consisting of all the keys that would produce (non-null) values.
- UnicodeSet keySet(T value): Returns the keyset consisting of all the keys that would produce the given value.
- UnicodeSet keySet(T value, UnicodeSet result): Returns the keyset consisting of all the keys that would produce the given value.
- UnicodeMap<T> put(int codepoint, T value): Sets the codepoint value.
- UnicodeMap<T> put(java.lang.String string, T value): Sets the value for the given string.
- UnicodeMap<T> putAll(int startCodePoint, int endCodePoint, T value): Adds a range of code points; otherwise like put.
- UnicodeMap<T> putAll(UnicodeMap<T> unicodeMap): Add all the (main) values from a UnicodeMap
- UnicodeMap<T> putAll(UnicodeSet codepoints, T value): Adds a set of code points; otherwise like put.
- UnicodeMap<T> putAll(java.util.Map<? extends java.lang.String,? extends T> map)
- <U extends java.util.Map<java.lang.Integer,T>> U putAllCodepointsInto(U map): Utility for extracting map
- UnicodeMap<T> putAllFiltered(UnicodeMap<T> prop, UnicodeSet filter): Add all the (main) values from a Unicode property
- UnicodeMap<T> putAllIn(java.util.Map<? super java.lang.String,? super T> map): Deprecated.
- <U extends java.util.Map<java.lang.String,T>> U putAllInto(U map): Utility for extracting map
- UnicodeMap<T> putAllInverse(java.util.Map<T,UnicodeSet> source)
- UnicodeMap<T> remove(int key)
- UnicodeMap<T> remove(java.lang.String key)
- UnicodeMap<T> removeAll(UnicodeMap<T> reference)
- UnicodeMap<T> removeAll(UnicodeSet set)
- private UnicodeMap<T> removeRetainAll(UnicodeMap<T> reference, boolean remove)
- UnicodeMap<T> retainAll(UnicodeMap<T> reference)
- UnicodeMap<T> retainAll(UnicodeSet set)
- UnicodeMap<T> setErrorOnReset(boolean errorOnReset): Puts the UnicodeMap into a state whereby new mappings are accepted, but changes to old mappings cause an exception.
- UnicodeMap<T> setMissing(T value): Set the currently unmapped Unicode code points to the given value.
- int size()
- java.util.Set<java.lang.String> stringKeys(): Returns the keys that consist of multiple code points.
- java.lang.String toString()
- java.lang.String toString(java.util.Comparator<T> collected)
- java.lang.String transform(java.lang.String source): Creates a new string from the source string according to the mappings.
- java.util.Set<T> values(): Convenience method
- <U extends java.util.Collection<T>> U values(U result): Returns the list of possible values.
-
-
-
Field Detail
-
ASSERTIONS
static final boolean ASSERTIONS
For serialization
- See Also:
- Constant Field Values
-
GROWTH_PERCENT
static final long GROWTH_PERCENT
- See Also:
- Constant Field Values
-
GROWTH_GAP
static final long GROWTH_GAP
- See Also:
- Constant Field Values
-
length
private int length
-
transitions
private int[] transitions
-
values
T[] values
-
availableValues
private java.util.LinkedHashSet<T> availableValues
-
staleAvailableValues
private transient boolean staleAvailableValues
-
errorOnReset
private transient boolean errorOnReset
-
locked
private transient volatile boolean locked
-
lastIndex
private int lastIndex
-
stringMap
private java.util.TreeMap<java.lang.String,T> stringMap
-
DEBUG_WRITE
static final boolean DEBUG_WRITE
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
UnicodeMap
public UnicodeMap()
-
UnicodeMap
public UnicodeMap(UnicodeMap other)
-
-
Method Detail
-
clear
public UnicodeMap<T> clear()
-
equals
public boolean equals(java.lang.Object other)
- Overrides:
- equals in class java.lang.Object
-
areEqual
public static boolean areEqual(java.lang.Object a, java.lang.Object b)
-
hashCode
public int hashCode()
- Overrides:
- hashCode in class java.lang.Object
-
cloneAsThawed
public UnicodeMap<T> cloneAsThawed()
Standard clone. Warning: as with Collections, this does not do a deep clone.
- Specified by:
- cloneAsThawed in interface Freezable<T>
-
_checkInvariants
void _checkInvariants()
-
_findIndex
private int _findIndex(int c)
Finds an index i such that inversionList[i] <= codepoint < inversionList[i+1]. Assumes that 0 <= codepoint <= 0x10FFFF.
- Parameters:
- c (the code point, called codepoint in the description above)
- Returns:
- the index
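The lookup described for _findIndex can be sketched as a binary search over a sorted array of range starts. This is an illustration under stated assumptions, not the private ICU code: the class InversionListSketch is hypothetical, the array is assumed to be sorted ascending with transitions[0] == 0, and the last entry is treated as open-ended up to 0x10FFFF.

```java
/** Hypothetical helper showing an inversion-list lookup; not the ICU implementation. */
final class InversionListSketch {
    private InversionListSketch() {}

    /**
     * Returns i such that transitions[i] <= codepoint, and either i is the last
     * used index or codepoint < transitions[i + 1].
     * Assumes 0 <= codepoint <= 0x10FFFF, length >= 1, and transitions[0] == 0.
     */
    static int findIndex(int[] transitions, int length, int codepoint) {
        int lo = 0;
        int hi = length - 1;
        while (lo < hi) {
            // Round the midpoint up so that "lo = mid" always makes progress.
            int mid = (lo + hi + 1) >>> 1;
            if (transitions[mid] <= codepoint) {
                lo = mid;       // range start is at or before the code point
            } else {
                hi = mid - 1;   // range starts after the code point
            }
        }
        return lo;
    }
}
```

With transitions {0, 0x41, 0x5B}, every code point from 'A' through 'Z' resolves to index 1, so a parallel values array needs only one slot for the whole range.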
-
_checkFind
private void _checkFind(int codepoint, int value)
-
__findIndex
private int __findIndex(int codepoint)
-
_removeAt
private void _removeAt(int index, int count)
Remove the items from index through index+count-1. Logically reduces the size of the internal arrays.
- Parameters:
- index
- count
-
_insertGapAt
private void _insertGapAt(int index, int count)
Add a gap from index to index+count-1. The values there are undefined, and must be set. Logically grows arrays to accommodate. Actual growth is limited
- Parameters:
- index
- count
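Opening a gap in a partially filled array, as _insertGapAt describes, might look like the following sketch. The class and method names are hypothetical, and the growth policy (50% headroom) is an arbitrary choice for illustration; the real growth behavior, presumably governed by GROWTH_PERCENT and GROWTH_GAP, is not documented on this page.

```java
/** Hypothetical array helper; the growth policy here is an assumption. */
final class GapInsertSketch {
    private GapInsertSketch() {}

    /**
     * Shifts data[index .. length-1] right by count, leaving a gap at
     * index .. index+count-1 whose contents are undefined.
     * Returns the array to use afterwards, which is a new, larger array
     * when the old one has no room. The logical length becomes length + count.
     */
    static int[] insertGapAt(int[] data, int length, int index, int count) {
        int newLength = length + count;
        int[] target = data;
        if (newLength > data.length) {
            // Not enough capacity: allocate with some headroom and copy the prefix.
            target = new int[newLength + (newLength >> 1)];
            System.arraycopy(data, 0, target, 0, index);
        }
        // System.arraycopy handles overlapping regions within the same array.
        System.arraycopy(data, index, target, index + count, length - index);
        return target;
    }
}
```

The caller is expected to fill the gap before reading it, matching the note that the values there are undefined and must be set.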
-
_put
private UnicodeMap _put(int codepoint, T value)
Associates code point with value. Removes any previous association. All code that calls this MUST check for frozen first!
- Parameters:
- codepoint
- value
- Returns:
- this, for chaining
-
_putAll
private UnicodeMap _putAll(int startCodePoint, int endCodePoint, T value)
-
put
public UnicodeMap<T> put(int codepoint, T value)
Sets the codepoint value.
- Parameters:
- codepoint
- value
- Returns:
- this (for chaining)
-
put
public UnicodeMap<T> put(java.lang.String string, T value)
Sets the value for the given string.
- Parameters:
- string
- value
- Returns:
- this (for chaining)
-
putAll
public UnicodeMap<T> putAll(UnicodeSet codepoints, T value)
Adds a set of code points; otherwise like put.
- Parameters:
- codepoints
- value
- Returns:
- this (for chaining)
-
putAll
public UnicodeMap<T> putAll(int startCodePoint, int endCodePoint, T value)
Adds a range of code points; otherwise like put.
- Parameters:
- startCodePoint
- endCodePoint
- value
- Returns:
- this (for chaining)
-
putAll
public UnicodeMap<T> putAll(UnicodeMap<T> unicodeMap)
Add all the (main) values from a UnicodeMap.
- Parameters:
- unicodeMap - the property to add to the map
- Returns:
- this (for chaining)
-
putAllFiltered
public UnicodeMap<T> putAllFiltered(UnicodeMap<T> prop, UnicodeSet filter)
Add all the (main) values from a Unicode property.
- Parameters:
- prop - the property to add to the map
- Returns:
- this (for chaining)
-
setMissing
public UnicodeMap<T> setMissing(T value)
Set the currently unmapped Unicode code points to the given value.
- Parameters:
- value - the value to set
- Returns:
- this (for chaining)
-
keySet
public UnicodeSet keySet(T value, UnicodeSet result)
Returns the keyset consisting of all the keys that would produce the given value. Deposits into result if it is not null. Remember to clear if you just want the new values.
-
keySet
public UnicodeSet keySet(T value)
Returns the keyset consisting of all the keys that would produce the given value.
-
keySet
public UnicodeSet keySet()
Returns the keyset consisting of all the keys that would produce (non-null) values.
-
values
public <U extends java.util.Collection<T>> U values(U result)
Returns the list of possible values. Deposits each non-null value into result. Creates result if it is null. Remember to clear result if you are not appending to an existing collection.
- Parameters:
- result
- Returns:
- result
-
values
public java.util.Set<T> values()
Convenience method
-
get
public T get(int codepoint)
Gets the value associated with a given code point. Returns null if there is no such value.
- Parameters:
- codepoint
- Returns:
- the value
-
get
public T get(java.lang.String value)
Gets the value associated with a given string. Returns null if there is no such value.
- Parameters:
- value
- Returns:
- the value
-
transform
public java.lang.String transform(java.lang.String source)
Creates a new string from the source string according to the mappings. For each code point cp, if getValue(cp) is null, append the character; otherwise append getValue(cp).toString(). TODO: extend to strings
- Specified by:
- transform in interface StringTransform
- Specified by:
- transform in interface Transform<java.lang.String,java.lang.String>
- Parameters:
- source
- Returns:
-
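The per-code-point rule given for transform can be sketched with standard library types only. This is a minimal illustration, assuming a plain java.util.Map from code point to value stands in for the UnicodeMap lookup; the class TransformSketch is hypothetical.

```java
import java.util.Map;

/** Hypothetical stand-in for the transform rule; uses a plain Map instead of UnicodeMap. */
final class TransformSketch {
    private TransformSketch() {}

    /**
     * For each code point in source: if it has no mapping, copy it through;
     * otherwise append the mapped value's toString().
     */
    static String transform(String source, Map<Integer, ?> mapping) {
        StringBuilder out = new StringBuilder(source.length());
        for (int i = 0; i < source.length(); ) {
            int cp = source.codePointAt(i);   // handles surrogate pairs
            Object value = mapping.get(cp);
            if (value == null) {
                out.appendCodePoint(cp);
            } else {
                out.append(value);            // StringBuilder calls toString()
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

Iterating by code point rather than by char keeps supplementary characters intact, which matters because the keys are code points in the range 0 to 0x10FFFF.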
composeWith
public UnicodeMap<T> composeWith(UnicodeMap<T> other, UnicodeMap.Composer<T> composer)
-
composeWith
public UnicodeMap<T> composeWith(UnicodeSet set, T value, UnicodeMap.Composer<T> composer)
-
toString
public java.lang.String toString()
- Overrides:
- toString in class java.lang.Object
-
toString
public java.lang.String toString(java.util.Comparator<T> collected)
-
getErrorOnReset
public boolean getErrorOnReset()
- Returns:
- Returns the errorOnReset value.
-
setErrorOnReset
public UnicodeMap<T> setErrorOnReset(boolean errorOnReset)
Puts the UnicodeMap into a state whereby new mappings are accepted, but changes to old mappings cause an exception.
- Parameters:
- errorOnReset - The errorOnReset to set.
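The behavior described for setErrorOnReset can be pictured with the following sketch. Several details are assumptions, because this page does not state them: the exception type (IllegalStateException is used here as a placeholder), and whether re-putting an identical value counts as a change (here it does not). The class ErrorOnResetSketch is hypothetical and backed by a HashMap.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical write-once-per-key map; exception type and equality rule are assumptions. */
final class ErrorOnResetSketch<V> {
    private final Map<Integer, V> map = new HashMap<>();
    private boolean errorOnReset;

    ErrorOnResetSketch<V> setErrorOnReset(boolean errorOnReset) {
        this.errorOnReset = errorOnReset;
        return this; // for chaining
    }

    boolean getErrorOnReset() {
        return errorOnReset;
    }

    ErrorOnResetSketch<V> put(int codepoint, V value) {
        V old = map.get(codepoint);
        // New mappings pass; changing an existing mapping is rejected when the flag is set.
        if (errorOnReset && old != null && !old.equals(value)) {
            throw new IllegalStateException(
                    "Attempt to reset value for U+" + Integer.toHexString(codepoint).toUpperCase());
        }
        map.put(codepoint, value);
        return this;
    }

    V get(int codepoint) {
        return map.get(codepoint);
    }
}
```

A mode like this is useful while loading data from several sources, where a second, conflicting assignment to the same key usually signals a data error.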
-
isFrozen
public boolean isFrozen()
Description copied from interface: Freezable
Determines whether the object has been frozen or not.
-
freeze
public UnicodeMap<T> freeze()
Description copied from interface: Freezable
Freezes the object.
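The freeze, isFrozen, and cloneAsThawed methods follow a freeze-then-share pattern that can be sketched as below. The class FreezableSketch is hypothetical and backed by a HashMap; throwing UnsupportedOperationException on mutation after freezing is an assumption made for the sketch, since this page does not say what a frozen UnicodeMap does when modified.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical freezable container; the exception on mutation is an assumption. */
final class FreezableSketch {
    private final Map<Integer, String> data = new HashMap<>();
    private volatile boolean frozen;

    boolean isFrozen() {
        return frozen;
    }

    /** Makes this object immutable from now on; returns this for chaining. */
    FreezableSketch freeze() {
        frozen = true;
        return this;
    }

    /** Returns a modifiable copy. The copy is shallow: values are shared. */
    FreezableSketch cloneAsThawed() {
        FreezableSketch copy = new FreezableSketch();
        copy.data.putAll(data);
        return copy;
    }

    FreezableSketch put(int codepoint, String value) {
        // Every mutator checks the frozen flag first.
        if (frozen) {
            throw new UnsupportedOperationException("Attempt to modify frozen object");
        }
        data.put(codepoint, value);
        return this;
    }

    String get(int codepoint) {
        return data.get(codepoint);
    }
}
```

The usual workflow is to build the object, freeze it, share it freely, and call cloneAsThawed only when a modified variant is needed.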
-
findCommonPrefix
public static int findCommonPrefix(java.lang.String last, java.lang.String s)
Utility to find the maximal common prefix of two strings. TODO: fix supplemental support
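A common-prefix computation of this kind can be sketched as follows. It assumes the int result is the prefix length in UTF-16 code units, which this page does not confirm. Unlike the method documented here, whose TODO notes incomplete supplemental support, the sketch backs up one unit when the prefix would otherwise end in the middle of a surrogate pair. The class PrefixSketch is hypothetical.

```java
/** Hypothetical common-prefix helper; the return convention is an assumption. */
final class PrefixSketch {
    private PrefixSketch() {}

    /** Returns the length, in UTF-16 code units, of the longest common prefix of a and b. */
    static int commonPrefixLength(String a, String b) {
        int limit = Math.min(a.length(), b.length());
        int i = 0;
        while (i < limit && a.charAt(i) == b.charAt(i)) {
            i++;
        }
        // If the match ended right after a shared high surrogate whose low surrogate
        // follows in either string, the prefix would split a supplementary character.
        if (i > 0 && Character.isHighSurrogate(a.charAt(i - 1))
                && ((i < a.length() && Character.isLowSurrogate(a.charAt(i)))
                    || (i < b.length() && Character.isLowSurrogate(b.charAt(i))))) {
            i--;
        }
        return i;
    }
}
```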
-
getRangeCount
public int getRangeCount()
Get the number of ranges; used for getRangeStart/End. The ranges together cover all of the single-codepoint keys in the UnicodeMap. Other keys can be retrieved with getNonRangeStrings().
-
getRangeStart
public int getRangeStart(int range)
Get the start of a range. All code points between start and end are in the UnicodeMap's keyset.
-
getRangeEnd
public int getRangeEnd(int range)
Get the end of a range. All code points between start and end are in the UnicodeMap's keyset.
-
getRangeValue
public T getRangeValue(int range)
Get the value for the range.
-
getNonRangeStrings
public java.util.Set<java.lang.String> getNonRangeStrings()
Get the strings that are not in the ranges. Returns null if there are none.
- Returns:
-
containsKey
public boolean containsKey(java.lang.String key)
-
containsKey
public boolean containsKey(int key)
-
containsValue
public boolean containsValue(T value)
-
isEmpty
public boolean isEmpty()
-
putAll
public UnicodeMap<T> putAll(java.util.Map<? extends java.lang.String,? extends T> map)
-
putAllIn
public UnicodeMap<T> putAllIn(java.util.Map<? super java.lang.String,? super T> map)
Deprecated. Utility for extracting map
-
putAllInto
public <U extends java.util.Map<java.lang.String,T>> U putAllInto(U map)
Utility for extracting map
-
putAllCodepointsInto
public <U extends java.util.Map<java.lang.Integer,T>> U putAllCodepointsInto(U map)
Utility for extracting map
-
remove
public UnicodeMap<T> remove(java.lang.String key)
-
remove
public UnicodeMap<T> remove(int key)
-
size
public int size()
-
entrySet
public java.lang.Iterable<java.util.Map.Entry<java.lang.String,T>> entrySet()
-
entryRanges
public java.lang.Iterable<UnicodeMap.EntryRange<T>> entryRanges()
Returns an Iterable over EntryRange, designed for efficient for loops over UnicodeMaps. Caution: for efficiency, the EntryRange may be reused, so the EntryRange may change on each iteration! The value is guaranteed never to be null. The entryRange.string values (non-null) come after all the ranges.
- Returns:
- an Iterable of entry ranges, for use in for loops
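The caution about the reused EntryRange is easy to trip over, so here is a standalone sketch of the pitfall. It does not use UnicodeMap.EntryRange itself: the Range class, its fields, and the ranges and snapshot helpers are hypothetical, and exist only to show why an element from such an iterator must be copied before it is stored.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical iterator that reuses one mutable element, mimicking the documented caution. */
final class ReusedEntrySketch {
    private ReusedEntrySketch() {}

    /** Struct-like element; field names are illustrative only. */
    static final class Range {
        int start;
        int end;
        String value;

        Range copy() {
            Range r = new Range();
            r.start = start;
            r.end = end;
            r.value = value;
            return r;
        }
    }

    /** Iterates over parallel arrays, overwriting and returning the same Range each time. */
    static Iterable<Range> ranges(int[] starts, int[] ends, String[] values) {
        return () -> new Iterator<Range>() {
            private final Range shared = new Range();
            private int i = 0;

            @Override public boolean hasNext() {
                return i < starts.length;
            }

            @Override public Range next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                shared.start = starts[i];
                shared.end = ends[i];
                shared.value = values[i];
                i++;
                return shared; // same object on every call
            }
        };
    }

    /** Safe way to keep the elements: copy each one while it is current. */
    static List<Range> snapshot(Iterable<Range> source) {
        List<Range> out = new ArrayList<>();
        for (Range r : source) {
            out.add(r.copy());
        }
        return out;
    }
}
```

Reading the fields inside the loop body is always safe; only references kept past the current iteration need the copy.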
-
iterator
public java.util.Iterator<java.lang.String> iterator()
- Specified by:
- iterator in interface java.lang.Iterable<T>
-
getValue
public T getValue(java.lang.String key)
Old form for compatibility
-
getValue
public T getValue(int key)
Old form for compatibility
-
getAvailableValues
public java.util.Collection<T> getAvailableValues()
Old form for compatibility
-
getAvailableValues
public <U extends java.util.Collection<T>> U getAvailableValues(U result)
Old form for compatibility
-
getSet
public UnicodeSet getSet(T value)
Old form for compatibility
-
getSet
public UnicodeSet getSet(T value, UnicodeSet result)
Old form for compatibility
-
removeAll
public final UnicodeMap<T> removeAll(UnicodeSet set)
-
removeAll
public final UnicodeMap<T> removeAll(UnicodeMap<T> reference)
-
retainAll
public final UnicodeMap<T> retainAll(UnicodeSet set)
-
retainAll
public final UnicodeMap<T> retainAll(UnicodeMap<T> reference)
-
removeRetainAll
private final UnicodeMap<T> removeRetainAll(UnicodeMap<T> reference, boolean remove)
-
stringKeys
public final java.util.Set<java.lang.String> stringKeys()
Returns the keys that consist of multiple code points.
- Returns:
-
addInverseTo
public <U extends java.util.Map<T,UnicodeSet>> U addInverseTo(U target)
Gets the inverse of this map, adding to the target. Like putAllIn.
- Returns:
-
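Inverting a key-to-value map into a value-to-keys map, in the style of addInverseTo, can be sketched with standard collections. This is an illustration only: a java.util.Map from code point to value stands in for the UnicodeMap, a Set of Integer stands in for UnicodeSet, and the class InverseMapSketch is hypothetical. The generic target parameter mirrors the documented signature, so the caller chooses the map implementation and gets the same type back.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical inverse-map helper using standard collections in place of ICU types. */
final class InverseMapSketch {
    private InverseMapSketch() {}

    /**
     * Adds, for every (codePoint, value) in source, the code point to the set
     * stored under value in target. Existing entries in target are kept and extended.
     */
    static <V, U extends Map<V, Set<Integer>>> U addInverseTo(Map<Integer, V> source, U target) {
        for (Map.Entry<Integer, V> e : source.entrySet()) {
            target.computeIfAbsent(e.getValue(), k -> new TreeSet<>()).add(e.getKey());
        }
        return target;
    }
}
```

Because the method adds to the target rather than replacing it, a caller who wants only the new inverse should pass an empty map.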
freeze
public static <T> java.util.Map<T,UnicodeSet> freeze(java.util.Map<T,UnicodeSet> target)
Freeze an inverse map.
- Parameters:
- target
- Returns:
-
putAllInverse
public UnicodeMap<T> putAllInverse(java.util.Map<T,UnicodeSet> source)
- Parameters:
- source
- Returns:
-
-