Class NumberConverter


  • public class NumberConverter
    extends java.lang.Object

    Implementation of Number to String Conversion algorithm specified by XSL Transformations (XSLT) Version 2.0, W3C Recommendation, 23 January 2007.

    This algorithm differs from that specified in XSLT 1.0 in the following ways:

    • input numbers may be greater than or equal to zero, rather than strictly greater than zero;
    • the format tokens { w, W, Ww } are introduced;
    • an ordinal parameter is introduced to generate ordinal numbers.

    Implementation Defaults and Limitations

    • If language parameter is unspecified (null or empty string), then the value of DEFAULT_LANGUAGE is used, which is defined below as "eng" (English).
    • Only English, French, and Spanish word numerals are supported, and only for values less than one trillion (1,000,000,000,000).
    • Ordinal word numerals are supported for French and Spanish only when less than or equal to ten (10).

    Implementation Notes

    • In order to handle format tokens outside the Unicode BMP, all processing is done in Unicode Scalar Values represented with Integer and Integer[] types. Without affecting behavior, this may be subsequently optimized to use int and int[] types.
    • In order to communicate various sub-parameters, including ordinalization, a features string is employed, which consists of comma-separated tokens, each a name with an optional value, where name and value are separated by an equals sign ('=').
    • Ordinal numbers are selected by specifying a word-based format token in combination with an 'ordinal' feature with no value, in which case the features 'male' and 'female' may be used to specify gender for gender-sensitive languages. For example, the feature string "ordinal,female" selects female ordinals.
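
    As an illustration of the features string described above, the following is a minimal sketch of how such a string might be parsed. The class FeatureStringSketch and its methods are hypothetical and are not part of NumberConverter; trimming whitespace around tokens and skipping empty tokens are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of NumberConverter.
final class FeatureStringSketch {

    // Parses "name[=value],name[=value],..." into name -> value pairs.
    // A feature given without a value maps to the empty string.
    static Map<String, String> parse(String features) {
        Map<String, String> result = new LinkedHashMap<>();
        if (features == null || features.isEmpty()) {
            return result;
        }
        for (String token : features.split(",")) {
            String t = token.trim();          // assumption: surrounding whitespace is ignored
            if (t.isEmpty()) {
                continue;                     // assumption: empty tokens are skipped
            }
            int eq = t.indexOf('=');
            if (eq < 0) {
                result.put(t, "");            // name only, e.g. "ordinal"
            } else {
                result.put(t.substring(0, eq).trim(), t.substring(eq + 1).trim());
            }
        }
        return result;
    }

    // True if the named feature is present, with or without a value.
    static boolean hasFeature(String features, String name) {
        return parse(features).containsKey(name);
    }

    public static void main(String[] args) {
        System.out.println(hasFeature("ordinal,female", "ordinal"));
        System.out.println(hasFeature("ordinal,female", "male"));
    }
}
```

    With the feature string "ordinal,female", this sketch reports 'ordinal' and 'female' as present and 'male' as absent.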

    This work was originally authored by Glenn Adams (gadams@apache.org).

    • Field Detail

      • LETTER_VALUE_ALPHABETIC

        public static final int LETTER_VALUE_ALPHABETIC
        alphabetical
        See Also:
        Constant Field Values
      • LETTER_VALUE_TRADITIONAL

        public static final int LETTER_VALUE_TRADITIONAL
        traditional
        See Also:
        Constant Field Values
      • TOKEN_ALPHANUMERIC

        private static final int TOKEN_ALPHANUMERIC
        alphanumeric token type
        See Also:
        Constant Field Values
      • TOKEN_NONALPHANUMERIC

        private static final int TOKEN_NONALPHANUMERIC
        nonalphanumeric token type
        See Also:
        Constant Field Values
      • DEFAULT_TOKEN

        private static final java.lang.Integer[] DEFAULT_TOKEN
        default token
      • DEFAULT_SEPARATOR

        private static final java.lang.Integer[] DEFAULT_SEPARATOR
        default separator
      • DEFAULT_LANGUAGE

        private static final java.lang.String DEFAULT_LANGUAGE
        default language
        See Also:
        Constant Field Values
      • prefix

        private java.lang.Integer[] prefix
        prefix token
      • suffix

        private java.lang.Integer[] suffix
        suffix token
      • tokens

        private java.lang.Integer[][] tokens
        sequence of tokens, as parsed from format
      • separators

        private java.lang.Integer[][] separators
        sequence of separators, as parsed from format
      • groupingSeparator

        private int groupingSeparator
        grouping separator
      • groupingSize

        private int groupingSize
        grouping size
      • letterValue

        private int letterValue
        letter value
      • features

        private java.lang.String features
        features (feature sub-parameters)
      • language

        private java.lang.String language
        language
      • country

        private java.lang.String country
        country
      • equivalentLanguages

        private static java.lang.String[][] equivalentLanguages
      • supportedAlphabeticSequences

        private static int[][] supportedAlphabeticSequences
      • supportedSpecials

        private static int[][] supportedSpecials
      • englishWordOnes

        private static java.lang.String[] englishWordOnes
        English Word Numerals
      • englishWordTeens

        private static java.lang.String[] englishWordTeens
      • englishWordTens

        private static java.lang.String[] englishWordTens
      • englishWordOthers

        private static java.lang.String[] englishWordOthers
      • englishWordOnesOrd

        private static java.lang.String[] englishWordOnesOrd
      • englishWordTeensOrd

        private static java.lang.String[] englishWordTeensOrd
      • englishWordTensOrd

        private static java.lang.String[] englishWordTensOrd
      • englishWordOthersOrd

        private static java.lang.String[] englishWordOthersOrd
      • frenchWordOnes

        private static java.lang.String[] frenchWordOnes
        French Word Numerals
      • frenchWordTeens

        private static java.lang.String[] frenchWordTeens
      • frenchWordTens

        private static java.lang.String[] frenchWordTens
      • frenchWordOthers

        private static java.lang.String[] frenchWordOthers
      • frenchWordOnesOrdMale

        private static java.lang.String[] frenchWordOnesOrdMale
      • frenchWordOnesOrdFemale

        private static java.lang.String[] frenchWordOnesOrdFemale
      • spanishWordOnes

        private static java.lang.String[] spanishWordOnes
        Spanish Word Numerals
      • spanishWordTeens

        private static java.lang.String[] spanishWordTeens
      • spanishWordTweens

        private static java.lang.String[] spanishWordTweens
      • spanishWordTens

        private static java.lang.String[] spanishWordTens
      • spanishWordHundreds

        private static java.lang.String[] spanishWordHundreds
      • spanishWordOthers

        private static java.lang.String[] spanishWordOthers
      • spanishWordOnesOrdMale

        private static java.lang.String[] spanishWordOnesOrdMale
      • spanishWordOnesOrdFemale

        private static java.lang.String[] spanishWordOnesOrdFemale
      • romanMapping

        private static int[] romanMapping
        Roman (Latin) Numerals
      • romanStandardForms

        private static java.lang.String[] romanStandardForms
      • romanLargeForms

        private static java.lang.String[] romanLargeForms
      • romanNumberForms

        private static java.lang.String[] romanNumberForms
      • hebrewGematriaAlphabeticMap

        private static int[] hebrewGematriaAlphabeticMap
        Gematria (Hebrew) Numerals
      • arabicAbjadiAlphabeticMap

        private static int[] arabicAbjadiAlphabeticMap
        Arabic Numerals
      • arabicHijaiAlphabeticMap

        private static int[] arabicHijaiAlphabeticMap
      • hiraganaGojuonAlphabeticMap

        private static int[] hiraganaGojuonAlphabeticMap
        Kana (Japanese) Numerals
      • katakanaGojuonAlphabeticMap

        private static int[] katakanaGojuonAlphabeticMap
      • thaiAlphabeticMap

        private static int[] thaiAlphabeticMap
        Thai Numerals
    • Constructor Detail

      • NumberConverter

        public NumberConverter​(java.lang.String format,
                               int groupingSeparator,
                               int groupingSize,
                               int letterValue,
                               java.lang.String features,
                               java.lang.String language,
                               java.lang.String country)
                        throws java.lang.IllegalArgumentException
        Construct parameterized number converter.
        Parameters:
        format - format for the page number (may be null or empty, which is treated as null)
        groupingSeparator - grouping separator (if zero, then no grouping separator applies)
        groupingSize - grouping size (if zero or negative, then no grouping size applies)
        letterValue - letter value (must be one of the above letter value enumeration values)
        features - features (feature sub-parameters)
        language - language (may be null or empty, which is treated as null)
        country - country (may be null or empty, which is treated as null)
        Throws:
        java.lang.IllegalArgumentException - if format is not a valid UTF-16 string (e.g., has unpaired surrogate)
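
        The constructor rejects a format that is not valid UTF-16, and the class processes text as Unicode scalar values. The following is a minimal sketch of such a conversion, assuming a strict check for unpaired surrogates. The class ScalarSketch and its method names are hypothetical and do not reflect the actual private implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the actual NumberConverter implementation.
final class ScalarSketch {

    // Converts a UTF-16 string to Unicode scalar values, rejecting unpaired surrogates.
    static Integer[] toScalars(String s) {
        List<Integer> scalars = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                if (i + 1 >= s.length() || !Character.isLowSurrogate(s.charAt(i + 1))) {
                    throw new IllegalArgumentException("unpaired high surrogate at index " + i);
                }
                scalars.add(Character.toCodePoint(c, s.charAt(i + 1)));
                i += 2;                       // consumed a surrogate pair
            } else if (Character.isLowSurrogate(c)) {
                throw new IllegalArgumentException("unpaired low surrogate at index " + i);
            } else {
                scalars.add((int) c);         // BMP character
                i++;
            }
        }
        return scalars.toArray(new Integer[0]);
    }

    // Converts scalar values back to a UTF-16 string.
    static String fromScalars(Integer[] scalars) {
        StringBuilder sb = new StringBuilder();
        for (int cp : scalars) {
            sb.appendCodePoint(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "a" followed by U+1D7CF, a character outside the BMP
        Integer[] scalars = toScalars("a\uD835\uDFCF");
        System.out.println(scalars.length);
    }
}
```

        A format token outside the BMP occupies two chars in a String but a single element in the scalar array, which is why the array form is convenient for token processing.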
    • Method Detail

      • convert

        public java.lang.String convert​(long number)
        Convert a number to string according to conversion parameters.
        Parameters:
        number - number to convert
        Returns:
        string representing converted number
      • convert

        public java.lang.String convert​(java.util.List<java.lang.Long> numbers)
        Convert list of numbers to string according to conversion parameters.
        Parameters:
        numbers - list of numbers to convert
        Returns:
        string representing converted list of numbers
      • parseFormatTokens

        private void parseFormatTokens​(java.lang.String format)
                                throws java.lang.IllegalArgumentException
        Throws:
        java.lang.IllegalArgumentException
      • isAlphaNumeric

        private static boolean isAlphaNumeric​(int c)
      • convertNumbers

        private void convertNumbers​(java.util.List<java.lang.Integer> scalars,
                                    java.util.List<java.lang.Long> numbers)
      • convertNumber

        private java.lang.Integer[] convertNumber​(long number,
                                                  java.lang.Integer[] separator,
                                                  java.lang.Integer[] token)
      • formatNumber

        private java.lang.Integer[] formatNumber​(long number,
                                                 java.lang.Integer[] token)
      • formatNumberAsDecimal

        private java.lang.Integer[] formatNumberAsDecimal​(long number,
                                                          int one,
                                                          int width)
        Format NUMBER as decimal using characters denoting digits that start at ONE, adding one or more (zero) padding characters as needed to fill out field WIDTH.
        Parameters:
        number - to be formatted
        one - unicode scalar value denoting numeric value 1
        width - non-negative integer denoting field width of number, possibly including padding
        Returns:
        formatted number as array of unicode scalars
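
        The following is a minimal sketch of decimal formatting as described above, assuming that the ten digits of the target script are contiguous so that the zero digit is ONE minus one. The class DecimalFormatSketch and its method names are hypothetical; the actual private method may differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; not the actual NumberConverter implementation.
final class DecimalFormatSketch {

    // Formats a non-negative number using the ten digits whose "1" is the scalar ONE.
    // Assumption: digits 0-9 of the script are contiguous, so zero is one - 1.
    static Integer[] formatAsDecimal(long number, int one, int width) {
        if (number < 0) {
            throw new IllegalArgumentException("number must be non-negative");
        }
        int zero = one - 1;
        List<Integer> digits = new ArrayList<>();
        do {
            digits.add(zero + (int) (number % 10)); // least significant digit first
            number /= 10;
        } while (number > 0);
        while (digits.size() < width) {
            digits.add(zero);                       // pad with the script's zero digit
        }
        Collections.reverse(digits);                // most significant digit first
        return digits.toArray(new Integer[0]);
    }

    // Converts scalar values to a String for display.
    static String asString(Integer[] scalars) {
        StringBuilder sb = new StringBuilder();
        for (int cp : scalars) {
            sb.appendCodePoint(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(asString(formatAsDecimal(7, '1', 3)));    // ASCII digits, width 3
        System.out.println(asString(formatAsDecimal(42, 0x0661, 1))); // Arabic-Indic digits
    }
}
```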
      • performGrouping

        private static java.util.List<java.lang.Integer> performGrouping​(java.util.List<java.lang.Integer> sl,
                                                                         int groupingSize,
                                                                         int groupingSeparator)
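
        The source gives no description for this method. The following is a minimal sketch of one plausible behavior, assuming that groups are counted from the right and that, as stated for the constructor parameters, a zero separator or a non-positive size disables grouping. The class GroupingSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the actual NumberConverter implementation.
final class GroupingSketch {

    // Inserts groupingSeparator between groups of groupingSize scalars, counted from the right.
    static List<Integer> group(List<Integer> scalars, int groupingSize, int groupingSeparator) {
        if (groupingSize <= 0 || groupingSeparator == 0) {
            return new ArrayList<>(scalars);  // grouping disabled
        }
        List<Integer> out = new ArrayList<>();
        int n = scalars.size();
        for (int i = 0; i < n; i++) {
            // a separator precedes position i when a whole number of groups remains
            if (i > 0 && (n - i) % groupingSize == 0) {
                out.add(groupingSeparator);
            }
            out.add(scalars.get(i));
        }
        return out;
    }

    // Convenience wrapper operating on strings of BMP characters, for demonstration only.
    static String group(String digits, int groupingSize, int groupingSeparator) {
        List<Integer> in = new ArrayList<>();
        digits.codePoints().forEach(in::add);
        StringBuilder sb = new StringBuilder();
        for (int cp : group(in, groupingSize, groupingSeparator)) {
            sb.appendCodePoint(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(group("1234567", 3, ','));
    }
}
```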
      • formatNumberAsSequence

        private java.lang.Integer[] formatNumberAsSequence​(long number,
                                                           int one,
                                                           int base,
                                                           int[] map)
        Format NUMBER using a sequence of characters that starts at ONE and has radix BASE.
        Parameters:
        number - to be formatted
        one - unicode scalar value denoting start of sequence (numeric value 1)
        base - number of elements in sequence
        map - if non-null, then maps sequence indices to unicode scalars
        Returns:
        formatted number as array of unicode scalars
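
        Alphabetic sequences such as a, b, ..., z, aa, ab are commonly produced with bijective base-N numeration, in which there is no zero digit. The following is a minimal sketch of that technique, assuming NUMBER is at least 1 and that MAP, when present, holds one scalar per sequence index. The class SequenceFormatSketch is hypothetical, and the actual private method may treat zero or the map differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; not the actual NumberConverter implementation.
final class SequenceFormatSketch {

    // Bijective base-N numeration: 1 -> first element, base -> last element,
    // base + 1 -> first element repeated twice, and so on.
    static String formatAsSequence(long number, int one, int base, int[] map) {
        if (number < 1 || base < 1) {
            throw new IllegalArgumentException("number and base must be positive");
        }
        List<Integer> out = new ArrayList<>();
        while (number > 0) {
            int index = (int) ((number - 1) % base);        // 0-based index into the sequence
            out.add(map != null ? map[index] : one + index); // assumption: map[index] is the scalar
            number = (number - 1) / base;
        }
        Collections.reverse(out);                            // most significant element first
        StringBuilder sb = new StringBuilder();
        for (int cp : out) {
            sb.appendCodePoint(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (long n : new long[] {1, 26, 27, 28, 703}) {
            System.out.println(n + " -> " + formatAsSequence(n, 'a', 26, null));
        }
    }
}
```

        With ONE set to 'a' and BASE set to 26, this sketch maps 26 to "z" and 27 to "aa", the usual continuation for alphabetic numbering.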
      • formatNumberAsSpecial

        private java.lang.Integer[] formatNumberAsSpecial​(long number,
                                                          int one)
        Format NUMBER using a special system that starts at ONE.
        Parameters:
        number - to be formatted
        one - unicode scalar value denoting start of system (numeric value 1)
        Returns:
        formatted number as array of unicode scalars
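
        The romanMapping and romanStandardForms fields suggest that Roman numerals are one such special system. The following is a minimal sketch of standard subtractive Roman numeral formatting, limited to the range 1 to 3999. The class RomanSketch is hypothetical; how the class itself handles larger values or alternate forms is not shown in the source.

```java
// Hypothetical sketch; not the actual NumberConverter implementation.
final class RomanSketch {

    // Values in descending order, including the subtractive pairs (900, 400, 90, ...).
    private static final int[] VALUES =
        {1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1};
    private static final String[] SYMBOLS =
        {"M", "CM", "D", "CD", "C", "XC", "L", "XL", "X", "IX", "V", "IV", "I"};

    // Greedy conversion: repeatedly emit the largest symbol that still fits.
    static String toRoman(long number) {
        if (number < 1 || number > 3999) {
            throw new IllegalArgumentException("supported range is 1..3999");
        }
        StringBuilder sb = new StringBuilder();
        long remaining = number;
        for (int i = 0; i < VALUES.length; i++) {
            while (remaining >= VALUES[i]) {
                sb.append(SYMBOLS[i]);
                remaining -= VALUES[i];
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toRoman(1994));
    }
}
```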
      • formatNumberAsWord

        private java.lang.Integer[] formatNumberAsWord​(long number,
                                                       int caseType)
        Format NUMBER as word according to CASETYPE, which must be one of Character.UPPERCASE_LETTER, Character.LOWERCASE_LETTER, or Character.TITLECASE_LETTER. Makes use of this.language to determine language of word.
        Parameters:
        number - to be formatted
        caseType - unicode character type for case conversion
        Returns:
        formatted number as array of unicode scalars
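
        In XSLT 2.0 the format tokens W, w, and Ww request words in upper case, lower case, and title case respectively. The following is a minimal sketch of case conversion driven by the Character type constants named above. The class WordCaseSketch is hypothetical, and the use of Locale.ROOT is an assumption; the actual private methods may apply language-specific rules.

```java
import java.util.Locale;

// Hypothetical sketch; not the actual NumberConverter implementation.
final class WordCaseSketch {

    // Converts a word to upper, lower, or title case according to caseType.
    static String convertWordCase(String word, int caseType) {
        if (word.isEmpty()) {
            return word;
        }
        switch (caseType) {
            case Character.UPPERCASE_LETTER:
                return word.toUpperCase(Locale.ROOT);
            case Character.LOWERCASE_LETTER:
                return word.toLowerCase(Locale.ROOT);
            case Character.TITLECASE_LETTER: {
                // title-case the first code point, lower-case the remainder
                int first = word.codePointAt(0);
                String rest = word.substring(Character.charCount(first));
                return new StringBuilder()
                    .appendCodePoint(Character.toTitleCase(first))
                    .append(rest.toLowerCase(Locale.ROOT))
                    .toString();
            }
            default:
                return word;                  // unknown case type: leave unchanged
        }
    }

    public static void main(String[] args) {
        System.out.println(convertWordCase("twenty", Character.TITLECASE_LETTER));
    }
}
```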
      • isLanguage

        private boolean isLanguage​(java.lang.String iso3Code)
      • isSameLanguage

        private static boolean isSameLanguage​(java.lang.String i3c,
                                              java.lang.String lc)
      • hasFeature

        private static boolean hasFeature​(java.lang.String features,
                                          java.lang.String feature)
      • appendScalars

        private static void appendScalars​(java.util.List<java.lang.Integer> scalars,
                                          java.lang.Integer[] sa)
      • scalarsToString

        private static java.lang.String scalarsToString​(java.util.List<java.lang.Integer> scalars)
      • isPaddedOne

        private static boolean isPaddedOne​(java.lang.Integer[] token)
      • getDecimalValue

        private static int getDecimalValue​(java.lang.Integer scalar)
      • isStartOfDecimalSequence

        private static boolean isStartOfDecimalSequence​(int s)
      • isStartOfAlphabeticSequence

        private static boolean isStartOfAlphabeticSequence​(int s)
      • getSequenceBase

        private static int getSequenceBase​(int s)
      • isStartOfNumericSpecial

        private static boolean isStartOfNumericSpecial​(int s)
      • getSpecialFormatter

        private NumberConverter.SpecialNumberFormatter getSpecialFormatter​(int one,
                                                                           int letterValue,
                                                                           java.lang.String features,
                                                                           java.lang.String language,
                                                                           java.lang.String country)
      • toUpperCase

        private static java.lang.Integer[] toUpperCase​(java.lang.Integer[] sa)
      • toLowerCase

        private static java.lang.Integer[] toLowerCase​(java.lang.Integer[] sa)
      • convertWordCase

        private static java.util.List<java.lang.String> convertWordCase​(java.util.List<java.lang.String> words,
                                                                        int caseType)
      • convertWordCase

        private static java.lang.String convertWordCase​(java.lang.String word,
                                                        int caseType)
      • joinWords

        private static java.lang.String joinWords​(java.util.List<java.lang.String> words,
                                                  java.lang.String separator)