Package com.ibm.icu.impl
Class Normalizer2Impl
- java.lang.Object
-
- com.ibm.icu.impl.Normalizer2Impl
-
public final class Normalizer2Impl extends java.lang.Object
Low-level implementation of the Unicode Normalization Algorithm. For the data structure and details see the documentation at the end of C++ normalizer2impl.h and in the design doc at https://unicode-org.github.io/icu/design/normalization/custom.html
-
-
Nested Class Summary
Nested Classes
Modifier and Type Class Description
static class
Normalizer2Impl.Hangul
private static class
Normalizer2Impl.IsAcceptable
static class
Normalizer2Impl.ReorderingBuffer
Writable buffer that takes care of canonical ordering.
static class
Normalizer2Impl.UTF16Plus
-
Field Summary
Fields
Modifier and Type Field Description
private static int
CANON_HAS_COMPOSITIONS
private static int
CANON_HAS_SET
private static int
CANON_NOT_SEGMENT_STARTER
private static int
CANON_VALUE_MASK
private CodePointTrie
canonIterData
private java.util.ArrayList<UnicodeSet>
canonStartSets
private int
centerNoNoDelta
static int
COMP_1_LAST_TUPLE
static int
COMP_1_TRAIL_LIMIT
static int
COMP_1_TRAIL_MASK
static int
COMP_1_TRAIL_SHIFT
static int
COMP_1_TRIPLE
static int
COMP_2_TRAIL_MASK
static int
COMP_2_TRAIL_SHIFT
private static int
DATA_FORMAT
private VersionInfo
dataVersion
static int
DELTA_SHIFT
static int
DELTA_TCCC_0
static int
DELTA_TCCC_1
static int
DELTA_TCCC_GT_1
static int
DELTA_TCCC_MASK
private java.lang.String
extraData
static int
HAS_COMP_BOUNDARY_AFTER
static int
INERT
private static Normalizer2Impl.IsAcceptable
IS_ACCEPTABLE
static int
IX_EXTRA_DATA_OFFSET
static int
IX_LIMIT_NO_NO
static int
IX_MIN_COMP_NO_MAYBE_CP
static int
IX_MIN_DECOMP_NO_CP
static int
IX_MIN_LCCC_CP
static int
IX_MIN_MAYBE_NO
Two-way mappings; each starts with a character that combines backward.
static int
IX_MIN_MAYBE_NO_COMBINES_FWD
Two-way mappings & compositions.
static int
IX_MIN_MAYBE_YES
static int
IX_MIN_NO_NO
Mappings are comp-normalized.
static int
IX_MIN_NO_NO_COMP_BOUNDARY_BEFORE
Mappings are not comp-normalized but have a comp boundary before.
static int
IX_MIN_NO_NO_COMP_NO_MAYBE_CC
Mappings do not have a comp boundary before.
static int
IX_MIN_NO_NO_EMPTY
Mappings to the empty string.
static int
IX_MIN_YES_NO
Mappings & compositions in [minYesNo..minYesNoMappingsOnly[.
static int
IX_MIN_YES_NO_MAPPINGS_ONLY
Mappings only in [minYesNoMappingsOnly..minNoNo[.
static int
IX_NORM_TRIE_OFFSET
static int
IX_RESERVED3_OFFSET
static int
IX_SMALL_FCD_OFFSET
static int
IX_TOTAL_SIZE
static int
JAMO_L
static int
JAMO_VT
private int
limitNoNo
static int
MAPPING_HAS_CCC_LCCC_WORD
static int
MAPPING_HAS_RAW_MAPPING
static int
MAPPING_LENGTH_MASK
static int
MAX_DELTA
static int
MIN_NORMAL_MAYBE_YES
static int
MIN_YES_YES_WITH_CC
private int
minCompNoMaybeCP
private int
minDecompNoCP
private int
minLcccCP
private int
minMaybeNo
private int
minMaybeNoCombinesFwd
private int
minMaybeYes
private int
minNoNo
private int
minNoNoCompBoundaryBefore
private int
minNoNoCompNoMaybeCC
private int
minNoNoEmpty
private int
minYesNo
private int
minYesNoMappingsOnly
private CodePointTrie.Fast16
normTrie
static int
OFFSET_SHIFT
private static CodePointMap.ValueFilter
segmentStarterMapper
private byte[]
smallFCD
-
Constructor Summary
Constructors
Constructor Description
Normalizer2Impl()
-
Method Summary
Modifier and Type Method Description
void
addCanonIterPropertyStarts(UnicodeSet set)
private void
addComposites(int list, UnicodeSet set)
void
addLcccChars(UnicodeSet set)
void
addPropertyStarts(UnicodeSet set)
private void
addToStartSet(MutableCodePointTrie mutableTrie, int origin, int decompLead)
private int
combine(int list, int trail)
Finds the recomposition result for a forward-combining "lead" character, specified with a pointer to its compositions list, and a backward-combining "trail" character.
boolean
compose(java.lang.CharSequence s, int src, int limit, boolean onlyContiguous, boolean doCompose, Normalizer2Impl.ReorderingBuffer buffer)
void
composeAndAppend(java.lang.CharSequence s, boolean doCompose, boolean onlyContiguous, Normalizer2Impl.ReorderingBuffer buffer)
int
composePair(int a, int b)
int
composeQuickCheck(java.lang.CharSequence s, int src, int limit, boolean onlyContiguous, boolean doSpan)
Very similar to compose(): Make the same changes in both places if relevant.
private void
decompose(int c, int norm16, Normalizer2Impl.ReorderingBuffer buffer)
int
decompose(java.lang.CharSequence s, int src, int limit, Normalizer2Impl.ReorderingBuffer buffer)
void
decompose(java.lang.CharSequence s, int src, int limit, java.lang.StringBuilder dest, int destLengthEstimate)
Decomposes s[src, limit[ and writes the result to dest.
java.lang.Appendable
decompose(java.lang.CharSequence s, java.lang.StringBuilder dest)
void
decomposeAndAppend(java.lang.CharSequence s, boolean doDecompose, Normalizer2Impl.ReorderingBuffer buffer)
private int
decomposeShort(java.lang.CharSequence s, int src, int limit, boolean stopAtCompBoundary, boolean onlyContiguous, Normalizer2Impl.ReorderingBuffer buffer)
Normalizer2Impl
ensureCanonIterData()
Builds the canonical-iterator data for this instance.
private int
findNextCompBoundary(java.lang.CharSequence s, int p, int limit, boolean onlyContiguous)
private int
findNextFCDBoundary(java.lang.CharSequence s, int p, int limit)
private int
findPreviousCompBoundary(java.lang.CharSequence s, int p, boolean onlyContiguous)
private int
findPreviousFCDBoundary(java.lang.CharSequence s, int p)
boolean
getCanonStartSet(int c, UnicodeSet set)
Returns true if there are characters whose decomposition starts with c.
int
getCC(int norm16)
private int
getCCFromNoNo(int norm16)
static int
getCCFromNormalYesOrMaybe(int norm16)
static int
getCCFromYesOrMaybeYes(int norm16)
int
getCCFromYesOrMaybeYesCP(int c)
private int
getCompositionsList(int norm16)
private int
getCompositionsListForComposite(int norm16)
private int
getCompositionsListForDecompYes(int norm16)
int
getCompQuickCheck(int norm16)
private int
getData(int norm16)
private int
getDataForMaybe(int norm16)
private int
getDataForYesOrNo(int norm16)
java.lang.String
getDecomposition(int c)
Gets the decomposition for one code point.
int
getFCD16(int c)
Returns the FCD data for code point c.
private int
getFCD16FromMaybeOrNonZeroCC(int norm16)
int
getFCD16FromNormData(int c)
Gets the FCD value from the regular normalization data.
int
getNorm16(int c)
private int
getPreviousTrailCC(java.lang.CharSequence s, int start, int p)
java.lang.String
getRawDecomposition(int c)
Gets the raw decomposition for one code point.
int
getRawNorm16(int c)
(package private) int
getTrailCCFromCompYesAndZeroCC(int norm16)
private int
hangulLVT()
boolean
hasCompBoundaryAfter(int c, boolean onlyContiguous)
private boolean
hasCompBoundaryAfter(java.lang.CharSequence s, int start, int p, boolean onlyContiguous)
boolean
hasCompBoundaryBefore(int c)
private boolean
hasCompBoundaryBefore(int c, int norm16)
Does c have a composition boundary before it? True if its decomposition begins with a character that has ccc=0 && NFC_QC=Yes (isCompYesAndZeroCC()).
private boolean
hasCompBoundaryBefore(java.lang.CharSequence s, int src, int limit)
boolean
hasDecompBoundaryAfter(int c)
boolean
hasDecompBoundaryBefore(int c)
boolean
hasFCDBoundaryAfter(int c)
boolean
hasFCDBoundaryBefore(int c)
boolean
isAlgorithmicNoNo(int norm16)
boolean
isCanonSegmentStarter(int c)
Returns true if code point c starts a canonical-iterator string segment.
boolean
isCompInert(int c, boolean onlyContiguous)
boolean
isCompNo(int norm16)
private boolean
isCompYesAndZeroCC(int norm16)
boolean
isDecompInert(int c)
private boolean
isDecompNoAlgorithmic(int norm16)
Since formatVersion 5: same as isAlgorithmicNoNo()
boolean
isDecompYes(int norm16)
private boolean
isDecompYesAndZeroCC(int norm16)
boolean
isFCDInert(int c)
private boolean
isHangulLV(int norm16)
private boolean
isHangulLVT(int norm16)
private static boolean
isInert(int norm16)
private static boolean
isJamoL(int norm16)
private static boolean
isJamoVT(int norm16)
private boolean
isMaybe(int norm16)
private boolean
isMaybeYesOrNonZeroCC(int norm16)
private boolean
isMostDecompYesAndZeroCC(int norm16)
A little faster and simpler than isDecompYesAndZeroCC() but does not include the MaybeYes which combine-forward and have ccc=0.
private boolean
isTrailCC01ForCompBoundaryAfter(int norm16)
For FCC: Given norm16 HAS_COMP_BOUNDARY_AFTER, does it have tccc<=1?
Normalizer2Impl
load(java.lang.String name)
Normalizer2Impl
load(java.nio.ByteBuffer bytes)
int
makeFCD(java.lang.CharSequence s, int src, int limit, Normalizer2Impl.ReorderingBuffer buffer)
void
makeFCDAndAppend(java.lang.CharSequence s, boolean doMakeFCD, Normalizer2Impl.ReorderingBuffer buffer)
private int
mapAlgorithmic(int c, int norm16)
private boolean
norm16HasCompBoundaryAfter(int norm16, boolean onlyContiguous)
private boolean
norm16HasCompBoundaryBefore(int norm16)
boolean
norm16HasDecompBoundaryAfter(int norm16)
boolean
norm16HasDecompBoundaryBefore(int norm16)
private void
recompose(Normalizer2Impl.ReorderingBuffer buffer, int recomposeStartIndex, boolean onlyContiguous)
boolean
singleLeadMightHaveNonZeroFCD16(int lead)
Returns true if the single-or-lead code unit given as lead might have non-zero FCD data.
-
-
-
Field Detail
-
IS_ACCEPTABLE
private static final Normalizer2Impl.IsAcceptable IS_ACCEPTABLE
-
DATA_FORMAT
private static final int DATA_FORMAT
- See Also:
- Constant Field Values
-
segmentStarterMapper
private static final CodePointMap.ValueFilter segmentStarterMapper
-
MIN_YES_YES_WITH_CC
public static final int MIN_YES_YES_WITH_CC
- See Also:
- Constant Field Values
-
JAMO_VT
public static final int JAMO_VT
- See Also:
- Constant Field Values
-
MIN_NORMAL_MAYBE_YES
public static final int MIN_NORMAL_MAYBE_YES
- See Also:
- Constant Field Values
-
JAMO_L
public static final int JAMO_L
- See Also:
- Constant Field Values
-
INERT
public static final int INERT
- See Also:
- Constant Field Values
-
HAS_COMP_BOUNDARY_AFTER
public static final int HAS_COMP_BOUNDARY_AFTER
- See Also:
- Constant Field Values
-
OFFSET_SHIFT
public static final int OFFSET_SHIFT
- See Also:
- Constant Field Values
-
DELTA_TCCC_0
public static final int DELTA_TCCC_0
- See Also:
- Constant Field Values
-
DELTA_TCCC_1
public static final int DELTA_TCCC_1
- See Also:
- Constant Field Values
-
DELTA_TCCC_GT_1
public static final int DELTA_TCCC_GT_1
- See Also:
- Constant Field Values
-
DELTA_TCCC_MASK
public static final int DELTA_TCCC_MASK
- See Also:
- Constant Field Values
-
DELTA_SHIFT
public static final int DELTA_SHIFT
- See Also:
- Constant Field Values
-
MAX_DELTA
public static final int MAX_DELTA
- See Also:
- Constant Field Values
-
IX_NORM_TRIE_OFFSET
public static final int IX_NORM_TRIE_OFFSET
- See Also:
- Constant Field Values
-
IX_EXTRA_DATA_OFFSET
public static final int IX_EXTRA_DATA_OFFSET
- See Also:
- Constant Field Values
-
IX_SMALL_FCD_OFFSET
public static final int IX_SMALL_FCD_OFFSET
- See Also:
- Constant Field Values
-
IX_RESERVED3_OFFSET
public static final int IX_RESERVED3_OFFSET
- See Also:
- Constant Field Values
-
IX_TOTAL_SIZE
public static final int IX_TOTAL_SIZE
- See Also:
- Constant Field Values
-
IX_MIN_DECOMP_NO_CP
public static final int IX_MIN_DECOMP_NO_CP
- See Also:
- Constant Field Values
-
IX_MIN_COMP_NO_MAYBE_CP
public static final int IX_MIN_COMP_NO_MAYBE_CP
- See Also:
- Constant Field Values
-
IX_MIN_YES_NO
public static final int IX_MIN_YES_NO
Mappings & compositions in [minYesNo..minYesNoMappingsOnly[.
- See Also:
- Constant Field Values
-
IX_MIN_NO_NO
public static final int IX_MIN_NO_NO
Mappings are comp-normalized.
- See Also:
- Constant Field Values
-
IX_LIMIT_NO_NO
public static final int IX_LIMIT_NO_NO
- See Also:
- Constant Field Values
-
IX_MIN_MAYBE_YES
public static final int IX_MIN_MAYBE_YES
- See Also:
- Constant Field Values
-
IX_MIN_YES_NO_MAPPINGS_ONLY
public static final int IX_MIN_YES_NO_MAPPINGS_ONLY
Mappings only in [minYesNoMappingsOnly..minNoNo[.
- See Also:
- Constant Field Values
-
IX_MIN_NO_NO_COMP_BOUNDARY_BEFORE
public static final int IX_MIN_NO_NO_COMP_BOUNDARY_BEFORE
Mappings are not comp-normalized but have a comp boundary before.
- See Also:
- Constant Field Values
-
IX_MIN_NO_NO_COMP_NO_MAYBE_CC
public static final int IX_MIN_NO_NO_COMP_NO_MAYBE_CC
Mappings do not have a comp boundary before.
- See Also:
- Constant Field Values
-
IX_MIN_NO_NO_EMPTY
public static final int IX_MIN_NO_NO_EMPTY
Mappings to the empty string.
- See Also:
- Constant Field Values
-
IX_MIN_LCCC_CP
public static final int IX_MIN_LCCC_CP
- See Also:
- Constant Field Values
-
IX_MIN_MAYBE_NO
public static final int IX_MIN_MAYBE_NO
Two-way mappings; each starts with a character that combines backward.
- See Also:
- Constant Field Values
-
IX_MIN_MAYBE_NO_COMBINES_FWD
public static final int IX_MIN_MAYBE_NO_COMBINES_FWD
Two-way mappings & compositions.
- See Also:
- Constant Field Values
-
MAPPING_HAS_CCC_LCCC_WORD
public static final int MAPPING_HAS_CCC_LCCC_WORD
- See Also:
- Constant Field Values
-
MAPPING_HAS_RAW_MAPPING
public static final int MAPPING_HAS_RAW_MAPPING
- See Also:
- Constant Field Values
-
MAPPING_LENGTH_MASK
public static final int MAPPING_LENGTH_MASK
- See Also:
- Constant Field Values
-
COMP_1_LAST_TUPLE
public static final int COMP_1_LAST_TUPLE
- See Also:
- Constant Field Values
-
COMP_1_TRIPLE
public static final int COMP_1_TRIPLE
- See Also:
- Constant Field Values
-
COMP_1_TRAIL_LIMIT
public static final int COMP_1_TRAIL_LIMIT
- See Also:
- Constant Field Values
-
COMP_1_TRAIL_MASK
public static final int COMP_1_TRAIL_MASK
- See Also:
- Constant Field Values
-
COMP_1_TRAIL_SHIFT
public static final int COMP_1_TRAIL_SHIFT
- See Also:
- Constant Field Values
-
COMP_2_TRAIL_SHIFT
public static final int COMP_2_TRAIL_SHIFT
- See Also:
- Constant Field Values
-
COMP_2_TRAIL_MASK
public static final int COMP_2_TRAIL_MASK
- See Also:
- Constant Field Values
-
dataVersion
private VersionInfo dataVersion
-
minDecompNoCP
private int minDecompNoCP
-
minCompNoMaybeCP
private int minCompNoMaybeCP
-
minLcccCP
private int minLcccCP
-
minYesNo
private int minYesNo
-
minYesNoMappingsOnly
private int minYesNoMappingsOnly
-
minNoNo
private int minNoNo
-
minNoNoCompBoundaryBefore
private int minNoNoCompBoundaryBefore
-
minNoNoCompNoMaybeCC
private int minNoNoCompNoMaybeCC
-
minNoNoEmpty
private int minNoNoEmpty
-
limitNoNo
private int limitNoNo
-
centerNoNoDelta
private int centerNoNoDelta
-
minMaybeNo
private int minMaybeNo
-
minMaybeNoCombinesFwd
private int minMaybeNoCombinesFwd
-
minMaybeYes
private int minMaybeYes
-
normTrie
private CodePointTrie.Fast16 normTrie
-
extraData
private java.lang.String extraData
-
smallFCD
private byte[] smallFCD
-
canonIterData
private CodePointTrie canonIterData
-
canonStartSets
private java.util.ArrayList<UnicodeSet> canonStartSets
-
CANON_NOT_SEGMENT_STARTER
private static final int CANON_NOT_SEGMENT_STARTER
- See Also:
- Constant Field Values
-
CANON_HAS_COMPOSITIONS
private static final int CANON_HAS_COMPOSITIONS
- See Also:
- Constant Field Values
-
CANON_HAS_SET
private static final int CANON_HAS_SET
- See Also:
- Constant Field Values
-
CANON_VALUE_MASK
private static final int CANON_VALUE_MASK
- See Also:
- Constant Field Values
-
-
Method Detail
-
load
public Normalizer2Impl load(java.nio.ByteBuffer bytes)
-
load
public Normalizer2Impl load(java.lang.String name)
-
addLcccChars
public void addLcccChars(UnicodeSet set)
-
addPropertyStarts
public void addPropertyStarts(UnicodeSet set)
-
addCanonIterPropertyStarts
public void addCanonIterPropertyStarts(UnicodeSet set)
-
ensureCanonIterData
public Normalizer2Impl ensureCanonIterData()
Builds the canonical-iterator data for this instance. This is required before isCanonSegmentStarter(int) or getCanonStartSet(int, UnicodeSet) is called; otherwise those methods crash.
- Returns:
- this
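The ensure-then-query contract described above can be illustrated with a minimal sketch. The class `LazyStartersSketch`, its field, and its toy data are hypothetical and are not part of ICU; the sketch only assumes that the data is built on first request, that the method returns `this` so the call can be chained, and that a query made before the data exists fails.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a class with lazily built lookup data.
final class LazyStartersSketch {
    // Stays null until ensureData() has run at least once.
    private Set<Integer> starters;

    // Builds the data on first use and returns this for chaining.
    synchronized LazyStartersSketch ensureData() {
        if (starters == null) {
            Set<Integer> built = new HashSet<>();
            built.add((int) 'a');  // toy content, for illustration only
            starters = built;
        }
        return this;
    }

    // Throws NullPointerException if ensureData() was never called.
    boolean isStarter(int c) {
        return starters.contains(c);
    }
}
```

With this shape, a caller would write `new LazyStartersSketch().ensureData().isStarter('a')`; the analogous call order for this class is ensureCanonIterData() first, then isCanonSegmentStarter(int) or getCanonStartSet(int, UnicodeSet).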
-
getNorm16
public int getNorm16(int c)
-
getRawNorm16
public int getRawNorm16(int c)
-
getCompQuickCheck
public int getCompQuickCheck(int norm16)
-
isAlgorithmicNoNo
public boolean isAlgorithmicNoNo(int norm16)
-
isCompNo
public boolean isCompNo(int norm16)
-
isDecompYes
public boolean isDecompYes(int norm16)
-
getCC
public int getCC(int norm16)
-
getCCFromNormalYesOrMaybe
public static int getCCFromNormalYesOrMaybe(int norm16)
-
getCCFromYesOrMaybeYes
public static int getCCFromYesOrMaybeYes(int norm16)
-
getCCFromYesOrMaybeYesCP
public int getCCFromYesOrMaybeYesCP(int c)
-
getFCD16
public int getFCD16(int c)
Returns the FCD data for code point c.
- Parameters:
c
- A Unicode code point.
- Returns:
- The lccc(c) in bits 15..8 and tccc(c) in bits 7..0.
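The packed return value can be split with a shift and a mask. The following is a minimal sketch of that bit layout only; the class and method names (`Fcd16Fields`, `leadCC`, `trailCC`, `pack`) are hypothetical helpers, not ICU API.

```java
// Hypothetical helpers for the documented FCD16 layout:
// lccc in bits 15..8, tccc in bits 7..0.
final class Fcd16Fields {
    // Lead combining class: the high byte of the 16-bit value.
    static int leadCC(int fcd16) {
        return (fcd16 >> 8) & 0xff;
    }

    // Trail combining class: the low byte of the 16-bit value.
    static int trailCC(int fcd16) {
        return fcd16 & 0xff;
    }

    // Inverse operation, useful for building test values.
    static int pack(int lccc, int tccc) {
        return ((lccc & 0xff) << 8) | (tccc & 0xff);
    }
}
```

For example, a value of 0xE6E6 carries lccc=230 and tccc=230, which is what one would expect for a combining mark with ccc=230 and no decomposition, such as U+0301. A value of 0 means both classes are zero.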
-
singleLeadMightHaveNonZeroFCD16
public boolean singleLeadMightHaveNonZeroFCD16(int lead)
Returns true if the single-or-lead code unit given as lead might have non-zero FCD data.
-
getFCD16FromNormData
public int getFCD16FromNormData(int c)
Gets the FCD value from the regular normalization data.
-
getFCD16FromMaybeOrNonZeroCC
private int getFCD16FromMaybeOrNonZeroCC(int norm16)
-
getDecomposition
public java.lang.String getDecomposition(int c)
Gets the decomposition for one code point.
- Parameters:
c
- code point
- Returns:
- c's decomposition, if it has one; returns null if it does not have a decomposition
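To make the "decomposition or null" contract concrete, here is a minimal sketch that uses the JDK's `java.text.Normalizer` as a stand-in; it does not call `Normalizer2Impl`. It assumes canonical (NFD) mappings, whereas the mappings of an actual instance depend on the data it was loaded from. The class and method names are hypothetical.

```java
import java.text.Normalizer;

// Stand-in built on the JDK normalizer, not on ICU.
final class DecompositionSketch {
    // Returns the canonical decomposition of c, or null if c has none.
    static String decompositionOrNull(int c) {
        String s = new String(Character.toChars(c));
        String d = Normalizer.normalize(s, Normalizer.Form.NFD);
        // An unchanged string means there is no decomposition to report.
        return d.equals(s) ? null : d;
    }
}
```

Under these assumptions, U+00E9 yields "e" followed by U+0301, while "a" yields null.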
-
getRawDecomposition
public java.lang.String getRawDecomposition(int c)
Gets the raw decomposition for one code point.
- Parameters:
c
- code point
- Returns:
- c's raw decomposition, if it has one; returns null if it does not have a decomposition
-
isCanonSegmentStarter
public boolean isCanonSegmentStarter(int c)
Returns true if code point c starts a canonical-iterator string segment. ensureCanonIterData() must have been called before this method, or else this method will crash.
- Parameters:
c
- A Unicode code point.
- Returns:
- true if c starts a canonical-iterator string segment.
-
getCanonStartSet
public boolean getCanonStartSet(int c, UnicodeSet set)
Returns true if there are characters whose decomposition starts with c. If so, then the set is cleared and then filled with those characters. ensureCanonIterData() must have been called before this method, or else this method will crash.
- Parameters:
c
- A Unicode code point.
set
- A UnicodeSet to receive the characters whose decompositions start with c, if there are any.
- Returns:
- true if there are characters whose decomposition starts with c.
-
decompose
public java.lang.Appendable decompose(java.lang.CharSequence s, java.lang.StringBuilder dest)
-
decompose
public void decompose(java.lang.CharSequence s, int src, int limit, java.lang.StringBuilder dest, int destLengthEstimate)
Decomposes s[src, limit[ and writes the result to dest. limit is an int index into s (the C++ convention of a NULL limit for NUL-terminated input does not apply in Java). destLengthEstimate is the initial dest buffer capacity and can be -1.
-
decompose
public int decompose(java.lang.CharSequence s, int src, int limit, Normalizer2Impl.ReorderingBuffer buffer)
-
decomposeAndAppend
public void decomposeAndAppend(java.lang.CharSequence s, boolean doDecompose, Normalizer2Impl.ReorderingBuffer buffer)
-
compose
public boolean compose(java.lang.CharSequence s, int src, int limit, boolean onlyContiguous, boolean doCompose, Normalizer2Impl.ReorderingBuffer buffer)
-
composeQuickCheck
public int composeQuickCheck(java.lang.CharSequence s, int src, int limit, boolean onlyContiguous, boolean doSpan)
Very similar to compose(): Make the same changes in both places if relevant.
doSpan: spanQuickCheckYes (ignore bit 0 of the return value)
!doSpan: quickCheck
- Returns:
- bits 31..1: spanQuickCheckYes (==s.length() if "yes") and bit 0: set if "maybe"; otherwise, if the span length<s.length() then the quick check result is "no"
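The packed result can be unpacked as described: bit 0 is the "maybe" flag and the remaining bits are the span end. The following minimal sketch follows only that description; the class, enum, and method names are hypothetical, and it assumes the whole sequence was checked (src=0, limit=s.length()).

```java
// Hypothetical decoder for the documented composeQuickCheck() result.
final class ComposeQuickCheckSketch {
    enum QuickCheck { YES, NO, MAYBE }

    // Bits 31..1: end of the span that is known to be normalized.
    static int spanEnd(int result) {
        return result >>> 1;
    }

    // Bit 0: set if the quick check result is "maybe".
    static boolean isMaybe(int result) {
        return (result & 1) != 0;
    }

    // Maps the packed value to a three-way answer for a string of the given length.
    static QuickCheck interpret(int result, int length) {
        if (isMaybe(result)) {
            return QuickCheck.MAYBE;
        }
        return spanEnd(result) == length ? QuickCheck.YES : QuickCheck.NO;
    }
}
```

When doSpan is true, only `spanEnd` is meaningful and bit 0 should be ignored, as noted above.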
-
composeAndAppend
public void composeAndAppend(java.lang.CharSequence s, boolean doCompose, boolean onlyContiguous, Normalizer2Impl.ReorderingBuffer buffer)
-
makeFCD
public int makeFCD(java.lang.CharSequence s, int src, int limit, Normalizer2Impl.ReorderingBuffer buffer)
-
makeFCDAndAppend
public void makeFCDAndAppend(java.lang.CharSequence s, boolean doMakeFCD, Normalizer2Impl.ReorderingBuffer buffer)
-
hasDecompBoundaryBefore
public boolean hasDecompBoundaryBefore(int c)
-
norm16HasDecompBoundaryBefore
public boolean norm16HasDecompBoundaryBefore(int norm16)
-
hasDecompBoundaryAfter
public boolean hasDecompBoundaryAfter(int c)
-
norm16HasDecompBoundaryAfter
public boolean norm16HasDecompBoundaryAfter(int norm16)
-
isDecompInert
public boolean isDecompInert(int c)
-
hasCompBoundaryBefore
public boolean hasCompBoundaryBefore(int c)
-
hasCompBoundaryAfter
public boolean hasCompBoundaryAfter(int c, boolean onlyContiguous)
-
isCompInert
public boolean isCompInert(int c, boolean onlyContiguous)
-
hasFCDBoundaryBefore
public boolean hasFCDBoundaryBefore(int c)
-
hasFCDBoundaryAfter
public boolean hasFCDBoundaryAfter(int c)
-
isFCDInert
public boolean isFCDInert(int c)
-
isMaybe
private boolean isMaybe(int norm16)
-
isMaybeYesOrNonZeroCC
private boolean isMaybeYesOrNonZeroCC(int norm16)
-
isInert
private static boolean isInert(int norm16)
-
isJamoL
private static boolean isJamoL(int norm16)
-
isJamoVT
private static boolean isJamoVT(int norm16)
-
hangulLVT
private int hangulLVT()
-
isHangulLV
private boolean isHangulLV(int norm16)
-
isHangulLVT
private boolean isHangulLVT(int norm16)
-
isCompYesAndZeroCC
private boolean isCompYesAndZeroCC(int norm16)
-
isDecompYesAndZeroCC
private boolean isDecompYesAndZeroCC(int norm16)
-
isMostDecompYesAndZeroCC
private boolean isMostDecompYesAndZeroCC(int norm16)
A little faster and simpler than isDecompYesAndZeroCC() but does not include the MaybeYes which combine-forward and have ccc=0.
-
isDecompNoAlgorithmic
private boolean isDecompNoAlgorithmic(int norm16)
Since formatVersion 5: same as isAlgorithmicNoNo()
-
getCCFromNoNo
private int getCCFromNoNo(int norm16)
-
getTrailCCFromCompYesAndZeroCC
int getTrailCCFromCompYesAndZeroCC(int norm16)
-
mapAlgorithmic
private int mapAlgorithmic(int c, int norm16)
-
getDataForYesOrNo
private int getDataForYesOrNo(int norm16)
-
getDataForMaybe
private int getDataForMaybe(int norm16)
-
getData
private int getData(int norm16)
-
getCompositionsListForDecompYes
private int getCompositionsListForDecompYes(int norm16)
- Returns:
- index into extraData, or -1
-
getCompositionsListForComposite
private int getCompositionsListForComposite(int norm16)
- Returns:
- index into maybeYesCompositions
-
getCompositionsList
private int getCompositionsList(int norm16)
- Parameters:
c
- code point must have compositions
- Returns:
- index into maybeYesCompositions
-
decomposeShort
private int decomposeShort(java.lang.CharSequence s, int src, int limit, boolean stopAtCompBoundary, boolean onlyContiguous, Normalizer2Impl.ReorderingBuffer buffer)
-
decompose
private void decompose(int c, int norm16, Normalizer2Impl.ReorderingBuffer buffer)
-
combine
private int combine(int list, int trail)
Finds the recomposition result for a forward-combining "lead" character, specified with a pointer to its compositions list, and a backward-combining "trail" character.
If the lead and trail characters combine, then this function returns the following "compositeAndFwd" value:
Bits 21..1 composite character
Bit 0 set if the composite is a forward-combining starter
otherwise it returns -1.
The compositions list has (trail, compositeAndFwd) pair entries, encoded as either pairs or triples of 16-bit units. The last entry has the high bit of its first unit set.
The list is sorted by ascending trail characters (there are no duplicates). A linear search is used.
See normalizer2impl.h for a more detailed description of the compositions list format.
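The "compositeAndFwd" layout and the sorted linear search can be sketched as follows. This is a simplified illustration, not the real data format: the list here is a hypothetical flat `int[]` of (trail, compositeAndFwd) pairs, whereas the actual list packs entries into pairs or triples of 16-bit units as described in normalizer2impl.h. All names are assumptions.

```java
// Hypothetical, simplified model of a compositions list lookup.
final class CompositeAndFwdSketch {
    // Packs the composite into bits 21..1 and the forward-combining flag into bit 0.
    static int pack(int composite, boolean combinesForward) {
        return (composite << 1) | (combinesForward ? 1 : 0);
    }

    // Extracts the composite code point from a non-negative compositeAndFwd value.
    static int composite(int compositeAndFwd) {
        return compositeAndFwd >> 1;
    }

    // True if the composite is itself a forward-combining starter.
    static boolean combinesForward(int compositeAndFwd) {
        return (compositeAndFwd & 1) != 0;
    }

    // Linear search over {trail0, value0, trail1, value1, ...} sorted by ascending trail.
    // Returns the compositeAndFwd value, or -1 if lead and trail do not combine.
    static int combine(int[] list, int trail) {
        for (int i = 0; i + 1 < list.length; i += 2) {
            if (list[i] == trail) {
                return list[i + 1];
            }
            if (list[i] > trail) {
                break;  // sorted list: trail cannot appear later
            }
        }
        return -1;
    }
}
```

As a usage example, a toy list for the lead character "a" could hold `{0x0300, pack(0x00E0, false), 0x0308, pack(0x00E4, true)}`; looking up trail U+0308 then yields a value whose composite is U+00E4. The flag values in this toy list are chosen for illustration.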
-
addComposites
private void addComposites(int list, UnicodeSet set)
- Parameters:
list
- some character's compositions list
set
- recursively receives the composites from these compositions
-
recompose
private void recompose(Normalizer2Impl.ReorderingBuffer buffer, int recomposeStartIndex, boolean onlyContiguous)
-
composePair
public int composePair(int a, int b)
-
hasCompBoundaryBefore
private boolean hasCompBoundaryBefore(int c, int norm16)
Does c have a composition boundary before it? True if its decomposition begins with a character that has ccc=0 && NFC_QC=Yes (isCompYesAndZeroCC()). As a shortcut, this is true if c itself has ccc=0 && NFC_QC=Yes (isCompYesAndZeroCC()) so we need not decompose.
-
norm16HasCompBoundaryBefore
private boolean norm16HasCompBoundaryBefore(int norm16)
-
hasCompBoundaryBefore
private boolean hasCompBoundaryBefore(java.lang.CharSequence s, int src, int limit)
-
norm16HasCompBoundaryAfter
private boolean norm16HasCompBoundaryAfter(int norm16, boolean onlyContiguous)
-
hasCompBoundaryAfter
private boolean hasCompBoundaryAfter(java.lang.CharSequence s, int start, int p, boolean onlyContiguous)
-
isTrailCC01ForCompBoundaryAfter
private boolean isTrailCC01ForCompBoundaryAfter(int norm16)
For FCC: Given norm16 HAS_COMP_BOUNDARY_AFTER, does it have tccc<=1?
-
findPreviousCompBoundary
private int findPreviousCompBoundary(java.lang.CharSequence s, int p, boolean onlyContiguous)
-
findNextCompBoundary
private int findNextCompBoundary(java.lang.CharSequence s, int p, int limit, boolean onlyContiguous)
-
findPreviousFCDBoundary
private int findPreviousFCDBoundary(java.lang.CharSequence s, int p)
-
findNextFCDBoundary
private int findNextFCDBoundary(java.lang.CharSequence s, int p, int limit)
-
getPreviousTrailCC
private int getPreviousTrailCC(java.lang.CharSequence s, int start, int p)
-
addToStartSet
private void addToStartSet(MutableCodePointTrie mutableTrie, int origin, int decompLead)
-
-