Class CollationKey

  • All Implemented Interfaces:
    java.lang.Comparable<CollationKey>

    public final class CollationKey
    extends java.lang.Object
    implements java.lang.Comparable<CollationKey>
    A CollationKey represents a String under the rules of a specific Collator object. Comparing two CollationKeys returns the relative order of the Strings they represent.

    Since the rule set of Collators can differ, the sort orders of the same string under two different Collators might differ. Hence comparing CollationKeys generated from different Collators can give incorrect results.

    Both the method CollationKey.compareTo(CollationKey) and the method Collator.compare(String, String) compare two strings and return their relative order. The performance characteristics of these two approaches can differ. Note that collation keys are often less efficient than comparing the strings directly. For more details, see the ICU User Guide.

    During the construction of a CollationKey, the entire source string is examined and processed into a series of bits, terminated by a null, that is stored in the CollationKey. When CollationKey.compareTo(CollationKey) executes, it performs a bitwise comparison on the bit sequences. This can incur a startup cost when creating the CollationKey, but once the key is created, binary comparisons are fast. This approach is recommended when the same strings are to be compared over and over again.

    On the other hand, implementations of Collator.compare(String, String) can examine and process the strings only until the first characters differing in order. This approach is recommended if the strings are to be compared only once.
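    The trade-off between the two approaches can be sketched as follows. This is a minimal illustration, not the ICU implementation: so that it runs without extra dependencies, it uses the JDK classes java.text.Collator and java.text.CollationKey as stand-ins, which expose the same getCollationKey, compareTo, and getSourceString method names. The class name KeySortSketch and its two helper methods are hypothetical.

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class KeySortSketch {

    // One-off ordering: let the collator compare the strings directly.
    static List<String> sortDirect(Collator collator, List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(collator); // Collator is itself a Comparator
        return sorted;
    }

    // Repeated ordering: build each key once, then compare keys only.
    static List<String> sortWithKeys(Collator collator, List<String> names) {
        CollationKey[] keys = new CollationKey[names.size()];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = collator.getCollationKey(names.get(i));
        }
        Arrays.sort(keys); // natural order, i.e. CollationKey.compareTo
        List<String> sorted = new ArrayList<>(keys.length);
        for (CollationKey key : keys) {
            sorted.add(key.getSourceString());
        }
        return sorted;
    }

    public static void main(String[] args) {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        List<String> names = List.of("Tom", "Dick", "Harry");
        System.out.println(sortDirect(collator, names));
        System.out.println(sortWithKeys(collator, names));
    }
}
```

    Both helpers should produce the same order; which one is faster depends on how often each string takes part in a comparison.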

    More information about the composition of the bit sequence can be found in the user guide.

    The following example shows how CollationKeys can be used to sort a list of Strings.

     // Create an array of CollationKeys for the Strings to be sorted.
     Collator myCollator = Collator.getInstance();
     CollationKey[] keys = new CollationKey[3];
     keys[0] = myCollator.getCollationKey("Tom");
     keys[1] = myCollator.getCollationKey("Dick");
     keys[2] = myCollator.getCollationKey("Harry");
     sort( keys );
     
     //...

     // Inside body of sort routine, compare keys this way
     if( keys[i].compareTo( keys[j] ) > 0 )
         // swap keys[i] and keys[j]

     //...

     // Finally, when we've returned from sort.
     System.out.println( keys[0].getSourceString() );
     System.out.println( keys[1].getSourceString() );
     System.out.println( keys[2].getSourceString() );

    This class is not subclassable.

    See Also:
    Collator, RuleBasedCollator
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      static class  CollationKey.BoundMode
      Options that are used in the API CollationKey.getBound() for getting a CollationKey based on the bound mode requested.
    • Field Summary

      Fields 
      Modifier and Type Field Description
      private int m_hashCode_
      Hash code for the key
      private byte[] m_key_
      Sequence of bytes that represents the sort key
      private int m_length_
      Length of this CollationKey
      private java.lang.String m_source_
      Source string this CollationKey represents
      private static int MERGE_SEPERATOR_
      Collation key merge separator
    • Constructor Summary

      Constructors 
      Modifier Constructor Description
        CollationKey​(java.lang.String source, byte[] key)
      CollationKey constructor.
      private CollationKey​(java.lang.String source, byte[] key, int length)
      Private constructor, takes a length argument so it need not be lazy-evaluated.
        CollationKey​(java.lang.String source, RawCollationKey key)
      CollationKey constructor that forces key to release its internal byte array for adoption.
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      int compareTo​(CollationKey target)
      Compare this CollationKey to another CollationKey.
      boolean equals​(CollationKey target)
      Compare this CollationKey and the argument target CollationKey for equality.
      boolean equals​(java.lang.Object target)
      Compare this CollationKey and the specified Object for equality.
      CollationKey getBound​(int boundType, int noOfLevels)
      Produces a bound for the sort order of a given collation key and a strength level.
      private int getLength()
      Gets the length of the CollationKey
      java.lang.String getSourceString()
      Return the source string that this CollationKey represents.
      int hashCode()
      Returns a hash code for this CollationKey.
      CollationKey merge​(CollationKey source)
      Merges this CollationKey with another.
      byte[] toByteArray()
      Duplicates and returns the value of this CollationKey as a sequence of big-endian bytes terminated by a null.
      • Methods inherited from class java.lang.Object

        clone, finalize, getClass, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • m_key_

        private byte[] m_key_
        Sequence of bytes that represents the sort key
      • m_source_

        private java.lang.String m_source_
        Source string this CollationKey represents
      • m_hashCode_

        private int m_hashCode_
        Hash code for the key
      • m_length_

        private int m_length_
        Length of this CollationKey
      • MERGE_SEPERATOR_

        private static final int MERGE_SEPERATOR_
        Collation key merge separator
        See Also:
        Constant Field Values
    • Constructor Detail

      • CollationKey

        public CollationKey​(java.lang.String source,
                            byte[] key)
        CollationKey constructor. This constructor is given public access, unlike the JDK version, to allow access to users extending the Collator class. See Collator.getCollationKey(String).
        Parameters:
        source - string this CollationKey is to represent
        key - array of bytes that represent the collation order of argument source terminated by a null
        See Also:
        Collator
      • CollationKey

        private CollationKey​(java.lang.String source,
                             byte[] key,
                             int length)
        Private constructor, takes a length argument so it need not be lazy-evaluated. There must be a 00 byte at key[length] and none before.
      • CollationKey

        public CollationKey​(java.lang.String source,
                            RawCollationKey key)
        CollationKey constructor that forces key to release its internal byte array for adoption. key will have a null byte array after this construction.
        Parameters:
        source - string this CollationKey is to represent
        key - RawCollationKey object that represents the collation order of argument source.
        See Also:
        Collator, RawCollationKey
    • Method Detail

      • getSourceString

        public java.lang.String getSourceString()
        Return the source string that this CollationKey represents.
        Returns:
        source string that this CollationKey represents
      • toByteArray

        public byte[] toByteArray()
        Duplicates and returns the value of this CollationKey as a sequence of big-endian bytes terminated by a null.

        If two CollationKeys can be legitimately compared, then one can compare the byte arrays of each to obtain the same result, e.g.

         byte[] key1 = collationkey1.toByteArray();
         byte[] key2 = collationkey2.toByteArray();
         int key, targetkey;
         int i = 0;
         do {
             // Mask with 0xFF so the bytes are compared as unsigned values.
             key = key1[i] & 0xFF;
             targetkey = key2[i] & 0xFF;
             if (key < targetkey) {
                 System.out.println("String 1 is less than string 2");
                 return;
             }
             if (targetkey < key) {
                 System.out.println("String 1 is more than string 2");
                 return;
             }
             i++;
         } while (key != 0 && targetkey != 0);

         System.out.println("Strings are equal.");
         
        Returns:
        CollationKey value in a sequence of big-endian bytes terminated by a null.
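
        The byte-wise comparison described for toByteArray() can also be written with a library call. The sketch below is an illustration under assumptions: it uses the JDK stand-ins java.text.Collator and java.text.CollationKey so that it runs without extra dependencies (the JDK byte arrays are not necessarily null-terminated, which the length-aware Arrays.compareUnsigned, available since Java 9, handles). The class KeyBytesSketch and the helper compareKeyBytes are hypothetical names.

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

public class KeyBytesSketch {

    // Returns -1, 0 or 1 by comparing the raw key bytes as unsigned values.
    static int compareKeyBytes(CollationKey a, CollationKey b) {
        return Integer.signum(Arrays.compareUnsigned(a.toByteArray(), b.toByteArray()));
    }

    public static void main(String[] args) {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        CollationKey k1 = collator.getCollationKey("apple");
        CollationKey k2 = collator.getCollationKey("banana");
        // The sign of the byte comparison should agree with compareTo.
        System.out.println(compareKeyBytes(k1, k2) == Integer.signum(k1.compareTo(k2)));
    }
}
```

        Comparing unsigned values matters because Java bytes are signed: a key byte such as 0x91 would otherwise sort before 0x05.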
      • compareTo

        public int compareTo​(CollationKey target)
        Compare this CollationKey to another CollationKey. The collation rules of the Collator that created this key are applied.

        Note: Comparison between CollationKeys created by different Collators might return incorrect results. See class documentation.

        Specified by:
        compareTo in interface java.lang.Comparable<CollationKey>
        Parameters:
        target - target CollationKey
        Returns:
        an integer value. If the value is less than zero, this CollationKey is less than target; if the value is zero, they are equal; and if the value is greater than zero, this CollationKey is greater than target.
        Throws:
        java.lang.NullPointerException - is thrown if argument is null.
        See Also:
        Collator.compare(String, String)
      • equals

        public boolean equals​(java.lang.Object target)
        Compare this CollationKey and the specified Object for equality. The collation rules of the Collator that created this key are applied.

        See note in compareTo(CollationKey) for warnings about possible incorrect results.

        Overrides:
        equals in class java.lang.Object
        Parameters:
        target - the object to compare to.
        Returns:
        true if the two keys compare as equal, false otherwise.
        Throws:
        java.lang.ClassCastException - is thrown when the argument is not a CollationKey. NullPointerException is thrown when the argument is null.
        See Also:
        compareTo(CollationKey)
      • equals

        public boolean equals​(CollationKey target)
        Compare this CollationKey and the argument target CollationKey for equality. The collation rules of the Collator object which created these objects are applied.

        See note in compareTo(CollationKey) for warnings about possible incorrect results.

        Parameters:
        target - the CollationKey to compare to.
        Returns:
        true if two objects are equal, false otherwise.
        Throws:
        java.lang.NullPointerException - is thrown when the argument is null.
      • hashCode

        public int hashCode()
        Returns a hash code for this CollationKey. The hash value is calculated on the key itself, not the String from which the key was created. Thus if x and y are CollationKeys, then x.hashCode() == y.hashCode() if x.equals(y) is true. This allows language-sensitive comparison in a hash table.
        Overrides:
        hashCode in class java.lang.Object
        Returns:
        the hash value.
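
        Because equal keys share a hash code, keys can serve as hash table keys for language-sensitive grouping. The following is a minimal sketch under assumptions: it uses the JDK stand-ins java.text.Collator and java.text.CollationKey so that it runs without extra dependencies, and the class KeyGroupingSketch and helper groupByKey are hypothetical names. At primary strength, case variants of a word are expected to collapse into one group.

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class KeyGroupingSketch {

    // Groups strings whose collation keys are equal under the given collator.
    static Map<CollationKey, List<String>> groupByKey(Collator collator, List<String> words) {
        Map<CollationKey, List<String>> groups = new HashMap<>();
        for (String word : words) {
            CollationKey key = collator.getCollationKey(word);
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(word);
        }
        return groups;
    }

    public static void main(String[] args) {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        collator.setStrength(Collator.PRIMARY); // compare base letters only
        List<String> words = List.of("smith", "Smith", "SMITH", "jones");
        for (List<String> group : groupByKey(collator, words).values()) {
            System.out.println(group);
        }
    }
}
```

        All keys in one map should come from the same collator, for the reason given in the compareTo(CollationKey) note.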
      • getBound

        public CollationKey getBound​(int boundType,
                                     int noOfLevels)
        Produces a bound for the sort order of a given collation key and a strength level. This API does not attempt to find a bound for the CollationKey String representation, hence null will be returned in its place.

        Resulting bounds can be used to produce a range of strings that are between upper and lower bounds. For example, if bounds are produced for a sortkey of string "smith", strings between upper and lower bounds with primary strength would include "Smith", "SMITH", "sMiTh".

        There are two upper bounds that can be produced. If BoundMode.UPPER is used, the strings matched would be as above. However, if a bound is produced using BoundMode.UPPER_LONG, the above example will also match "Smithsonian" and similar.

        For more on usage, see example in test procedure src/com/ibm/icu/dev/test/collator/CollationAPITest/TestBounds.

        Collation keys produced may be compared using the compare API.

        Parameters:
        boundType - Mode of bound required. It can be BoundMode.LOWER, which produces a lower inclusive bound; BoundMode.UPPER, which produces an upper bound that matches strings of the same length; or BoundMode.UPPER_LONG, which matches strings that have the same starting substring as the source string.
        noOfLevels - Strength levels required in the resulting bound (for most uses, the recommended value is PRIMARY). This strength should be less than the maximum strength of this CollationKey. See the user guide for an explanation of the strength levels a collation key can have.
        Returns:
        the result bounded CollationKey with a valid sort order but a null String representation.
        Throws:
        java.lang.IllegalArgumentException - thrown when the strength level requested is higher than or equal to the strength in this CollationKey. In the case of an Exception, information about the maximum strength to use will be returned in the Exception. The user can then call getBound() again with the appropriate strength.
        See Also:
        CollationKey, CollationKey.BoundMode, Collator.PRIMARY, Collator.SECONDARY, Collator.TERTIARY, Collator.QUATERNARY, Collator.IDENTICAL
      • merge

        public CollationKey merge​(CollationKey source)
        Merges this CollationKey with another. The levels are merged with their corresponding counterparts (primaries with primaries, secondaries with secondaries etc.). Between the values from the same level a separator is inserted.

        This is useful, for example, for combining sort keys from first and last names to sort such pairs. See http://www.unicode.org/reports/tr10/#Merging_Sort_Keys

        The recommended way to achieve "merged" sorting is by concatenating strings with U+FFFE between them. The concatenation has the same sort order as the merged sort keys, but merge(getSortKey(str1), getSortKey(str2)) may differ from getSortKey(str1 + '\uFFFE' + str2). Using strings with U+FFFE may yield shorter sort keys.

        For details about Sort Key Features see https://unicode-org.github.io/icu/userguide/collation/api#sort-key-features

        It is possible to merge multiple sort keys by consecutively merging another one with the intermediate result.

        Only the sort key bytes of the CollationKeys are merged. This API does not attempt to merge the String representations of the CollationKeys, hence null will be returned as the result's String representation.

        Example (uncompressed):

        191B1D 01 050505 01 910505 00
        1F2123 01 050505 01 910505 00
        will be merged as
        191B1D 02 1F2123 01 050505 02 050505 01 910505 02 910505 00
        Parameters:
        source - CollationKey to merge with
        Returns:
        a CollationKey that contains the valid merged sort keys with a null String representation, i.e. new CollationKey(null, merged_sort_keys)
        Throws:
        java.lang.IllegalArgumentException - thrown if source CollationKey argument is null or of 0 length.
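
        The level-by-level merge shown in the uncompressed example can be sketched on raw byte arrays. This is an illustration of the layout only, not the ICU implementation: the class MergeLevelsSketch and its methods are hypothetical, the separator values 01 and 02 and the 00 terminator are taken from the example above, and the sketch assumes both keys have the same number of levels.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MergeLevelsSketch {
    static final int TERMINATOR = 0x00;      // ends a key
    static final int LEVEL_SEPARATOR = 0x01; // between levels of one key
    static final int MERGE_SEPARATOR = 0x02; // between merged values of one level

    // Splits a 00-terminated key into its levels (the bytes between 01 separators).
    static List<byte[]> levels(byte[] key) {
        List<byte[]> out = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < key.length; i++) {
            int b = key[i] & 0xFF;
            if (b == LEVEL_SEPARATOR || b == TERMINATOR) {
                out.add(Arrays.copyOfRange(key, start, i));
                start = i + 1;
                if (b == TERMINATOR) {
                    return out;
                }
            }
        }
        throw new IllegalArgumentException("key is not 00-terminated");
    }

    // Merges primaries with primaries, secondaries with secondaries, and so on.
    static byte[] mergeLevels(byte[] first, byte[] second) {
        List<byte[]> a = levels(first);
        List<byte[]> b = levels(second);
        if (a.size() != b.size()) {
            // Simplification of this sketch, not a documented ICU restriction.
            throw new IllegalArgumentException("sketch requires equal level counts");
        }
        ByteArrayOutputStream merged = new ByteArrayOutputStream();
        for (int level = 0; level < a.size(); level++) {
            if (level > 0) {
                merged.write(LEVEL_SEPARATOR);
            }
            merged.write(a.get(level), 0, a.get(level).length);
            merged.write(MERGE_SEPARATOR);
            merged.write(b.get(level), 0, b.get(level).length);
        }
        merged.write(TERMINATOR);
        return merged.toByteArray();
    }

    public static void main(String[] args) {
        byte[] first = {0x19, 0x1B, 0x1D, 0x01, 0x05, 0x05, 0x05, 0x01, (byte) 0x91, 0x05, 0x05, 0x00};
        byte[] second = {0x1F, 0x21, 0x23, 0x01, 0x05, 0x05, 0x05, 0x01, (byte) 0x91, 0x05, 0x05, 0x00};
        StringBuilder hex = new StringBuilder();
        for (byte b : mergeLevels(first, second)) {
            hex.append(String.format("%02X", b & 0xFF));
        }
        System.out.println(hex);
    }
}
```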
      • getLength

        private int getLength()
        Gets the length of the CollationKey
        Returns:
        length of the CollationKey