Class Edits
- java.lang.Object
  - com.ibm.icu.text.Edits
public final class Edits extends java.lang.Object
Records lengths of string edits but not replacement text. Supports replacements, insertions, deletions in linear progression. Does not support moving/reordering of text.

There are two types of edits: change edits and no-change edits. Add edits to instances of this class using addReplace(int, int) (for change edits) and addUnchanged(int) (for no-change edits). Change edits are retained with full granularity, whereas adjacent no-change edits are always merged together. In no-change edits, there is a one-to-one mapping between code points in the source and destination strings.

After all edits have been added, instances of this class should be considered immutable, and an Edits.Iterator can be used for queries.

There are four flavors of Edits.Iterator:
- getFineIterator() retains full granularity of change edits.
- getFineChangesIterator() retains full granularity of change edits, and when calling next() on the iterator, skips over no-change edits (unchanged regions).
- getCoarseIterator() treats adjacent change edits as a single edit. (Adjacent no-change edits are automatically merged during the construction phase.)
- getCoarseChangesIterator() treats adjacent change edits as a single edit, and when calling next() on the iterator, skips over no-change edits (unchanged regions).
For example, consider the string "abcßDeF", which case-folds to "abcssdef". This string has the following fine edits:
- abc ⇨ abc (no-change)
- ß ⇨ ss (change)
- D ⇨ d (change)
- e ⇨ e (no-change)
- F ⇨ f (change)
and the following coarse edits (note how adjacent change edits are merged together):
- abc ⇨ abc (no-change)
- ßD ⇨ ssd (change)
- e ⇨ e (no-change)
- F ⇨ f (change)
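The relationship between the fine and coarse edits listed above can be sketched in plain Java. This is a minimal stand-in model, not the ICU implementation: the class `CoarseEditsSketch`, its nested `Edit` type, and the `coarsen` and `fineEditsOfExample` helpers are hypothetical names introduced only for illustration, and like Edits they track lengths only, never text.

```java
import java.util.ArrayList;
import java.util.List;

public class CoarseEditsSketch {
    /** Hypothetical stand-in for one edit: lengths only, no replacement text. */
    public static final class Edit {
        public final boolean change;
        public final int oldLength;
        public final int newLength;

        public Edit(boolean change, int oldLength, int newLength) {
            this.change = change;
            this.oldLength = oldLength;
            this.newLength = newLength;
        }

        @Override
        public String toString() {
            return (change ? "change " : "no-change ") + oldLength + "->" + newLength;
        }
    }

    /** Merges each run of adjacent change edits into a single change edit. */
    public static List<Edit> coarsen(List<Edit> fine) {
        List<Edit> coarse = new ArrayList<>();
        for (Edit e : fine) {
            int last = coarse.size() - 1;
            if (e.change && last >= 0 && coarse.get(last).change) {
                Edit prev = coarse.get(last);
                coarse.set(last, new Edit(true,
                        prev.oldLength + e.oldLength,
                        prev.newLength + e.newLength));
            } else {
                coarse.add(e);
            }
        }
        return coarse;
    }

    /** The fine edits for "abcßDeF" case-folded to "abcssdef". */
    public static List<Edit> fineEditsOfExample() {
        List<Edit> fine = new ArrayList<>();
        fine.add(new Edit(false, 3, 3)); // abc -> abc
        fine.add(new Edit(true, 1, 2));  // ß -> ss
        fine.add(new Edit(true, 1, 1));  // D -> d
        fine.add(new Edit(false, 1, 1)); // e -> e
        fine.add(new Edit(true, 1, 1));  // F -> f
        return fine;
    }

    public static void main(String[] args) {
        for (Edit e : coarsen(fineEditsOfExample())) {
            System.out.println(e);
        }
    }
}
```

In this model the two adjacent change edits (ß ⇨ ss and D ⇨ d) collapse into one change edit of old length 2 and new length 3, which corresponds to ßD ⇨ ssd.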
The "fine changes" and "coarse changes" iterators will step through only the change edits when their Edits.Iterator.next() methods are called. They are identical to the non-change iterators when their Edits.Iterator.findSourceIndex(int) or Edits.Iterator.findDestinationIndex(int) methods are used to walk through the string.

For examples of how to use this class, see the test TestCaseMapEditsIteratorDocs in UCharacterCaseTest.java.
-
-
Nested Class Summary
Nested Classes (modifier and type, class, description):
- static class Edits.Iterator
  Access to the list of edits.
-
Field Summary
Fields (modifier and type, field):
- private char[] array
- private int delta
- private int length
- private static int LENGTH_IN_1TRAIL
- private static int LENGTH_IN_2TRAIL
- private static int MAX_SHORT_CHANGE
- private static int MAX_SHORT_CHANGE_NEW_LENGTH
- private static int MAX_SHORT_CHANGE_OLD_LENGTH
- private static int MAX_UNCHANGED
- private static int MAX_UNCHANGED_LENGTH
- private int numChanges
- private static int SHORT_CHANGE_NUM_MASK
- private static int STACK_CAPACITY
-
Constructor Summary
Constructors (constructor, description):
- Edits()
  Constructs an empty object.
-
Method Summary
Methods (modifier and type, method, description):
- void addReplace(int oldLength, int newLength)
  Adds a change edit: a record for a text replacement/insertion/deletion.
- void addUnchanged(int unchangedLength)
  Adds a no-change edit: a record for an unchanged segment of text.
- private void append(int r)
- Edits.Iterator getCoarseChangesIterator()
  Returns an Iterator for coarse-grained change edits (adjacent change edits are treated as one).
- Edits.Iterator getCoarseIterator()
  Returns an Iterator for coarse-grained change and no-change edits (adjacent change edits are treated as one).
- Edits.Iterator getFineChangesIterator()
  Returns an Iterator for fine-grained change edits (full granularity of change edits is retained).
- Edits.Iterator getFineIterator()
  Returns an Iterator for fine-grained change and no-change edits (full granularity of change edits is retained).
- private boolean growArray()
- boolean hasChanges()
- private int lastUnit()
- int lengthDelta()
  How much longer is the new text compared with the old text?
- Edits mergeAndAppend(Edits ab, Edits bc)
  Merges the two input Edits and appends the result to this object.
- int numberOfChanges()
- void reset()
  Resets the data but may not release memory.
- private void setLastUnit(int last)
-
-
-
Field Detail
-
MAX_UNCHANGED_LENGTH
private static final int MAX_UNCHANGED_LENGTH
- See Also:
- Constant Field Values
-
MAX_UNCHANGED
private static final int MAX_UNCHANGED
- See Also:
- Constant Field Values
-
MAX_SHORT_CHANGE_OLD_LENGTH
private static final int MAX_SHORT_CHANGE_OLD_LENGTH
- See Also:
- Constant Field Values
-
MAX_SHORT_CHANGE_NEW_LENGTH
private static final int MAX_SHORT_CHANGE_NEW_LENGTH
- See Also:
- Constant Field Values
-
SHORT_CHANGE_NUM_MASK
private static final int SHORT_CHANGE_NUM_MASK
- See Also:
- Constant Field Values
-
MAX_SHORT_CHANGE
private static final int MAX_SHORT_CHANGE
- See Also:
- Constant Field Values
-
LENGTH_IN_1TRAIL
private static final int LENGTH_IN_1TRAIL
- See Also:
- Constant Field Values
-
LENGTH_IN_2TRAIL
private static final int LENGTH_IN_2TRAIL
- See Also:
- Constant Field Values
-
STACK_CAPACITY
private static final int STACK_CAPACITY
- See Also:
- Constant Field Values
-
array
private char[] array
-
length
private int length
-
delta
private int delta
-
numChanges
private int numChanges
-
-
Method Detail
-
reset
public void reset()
Resets the data but may not release memory.
-
setLastUnit
private void setLastUnit(int last)
-
lastUnit
private int lastUnit()
-
addUnchanged
public void addUnchanged(int unchangedLength)
Adds a no-change edit: a record for an unchanged segment of text. Normally called from inside ICU string transformation functions, not user code.
-
addReplace
public void addReplace(int oldLength, int newLength)
Adds a change edit: a record for a text replacement/insertion/deletion. Normally called from inside ICU string transformation functions, not user code.
-
append
private void append(int r)
-
growArray
private boolean growArray()
-
lengthDelta
public int lengthDelta()
How much longer is the new text compared with the old text?
- Returns:
- new length minus old length
-
hasChanges
public boolean hasChanges()
- Returns:
- true if there are any change edits
-
numberOfChanges
public int numberOfChanges()
- Returns:
- the number of change edits
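How lengthDelta(), hasChanges() and numberOfChanges() relate to the recorded edits can be illustrated with a small tally. This is a sketch under stated assumptions, not the ICU implementation: `EditsTallySketch` and `exampleTally` are hypothetical names, the class keeps only two counters instead of the real compact encoding, and the choice to ignore an empty replacement is an assumption of the sketch.

```java
public class EditsTallySketch {
    private int delta;       // new length minus old length, summed over change edits
    private int numChanges;  // number of change edits recorded so far

    /** Records an unchanged segment; it affects neither the delta nor the count. */
    public void addUnchanged(int unchangedLength) {
        if (unchangedLength < 0) {
            throw new IllegalArgumentException("negative length: " + unchangedLength);
        }
    }

    /** Records a change edit and updates both counters. */
    public void addReplace(int oldLength, int newLength) {
        if (oldLength < 0 || newLength < 0) {
            throw new IllegalArgumentException("negative length");
        }
        if (oldLength == 0 && newLength == 0) {
            return; // assumption of this sketch: an empty replacement is not counted
        }
        numChanges++;
        delta += newLength - oldLength;
    }

    public int lengthDelta() { return delta; }

    public boolean hasChanges() { return numChanges != 0; }

    public int numberOfChanges() { return numChanges; }

    /** Tally for "abcßDeF" case-folded to "abcssdef", using its fine edits. */
    public static EditsTallySketch exampleTally() {
        EditsTallySketch t = new EditsTallySketch();
        t.addUnchanged(3);   // abc
        t.addReplace(1, 2);  // ß -> ss
        t.addReplace(1, 1);  // D -> d
        t.addUnchanged(1);   // e
        t.addReplace(1, 1);  // F -> f
        return t;
    }

    public static void main(String[] args) {
        EditsTallySketch t = exampleTally();
        System.out.println("lengthDelta=" + t.lengthDelta()
                + " hasChanges=" + t.hasChanges()
                + " numberOfChanges=" + t.numberOfChanges());
    }
}
```

For the example string the tally reports a length delta of 1 (7 source code units become 8) and three change edits, since the sketch counts changes at fine granularity.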
-
getCoarseChangesIterator
public Edits.Iterator getCoarseChangesIterator()
Returns an Iterator for coarse-grained change edits (adjacent change edits are treated as one). Can be used to perform simple string updates. Skips no-change edits.
- Returns:
- an Iterator that merges adjacent changes.
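The "simple string updates" use case can be sketched without ICU on the classpath. The following is an illustrative model only: `ApplyChangesSketch`, `applyChanges` and `example` are hypothetical names, and each edit is given as a plain `{isChange, oldLength, newLength}` triple standing in for the values an Edits.Iterator would report. Because the edits hold no replacement text, the new text for each changed span is copied from the destination string.

```java
public class ApplyChangesSketch {
    /**
     * Turns a copy of the source string into the destination string by visiting
     * the edits left to right and replacing only the changed spans.
     * Each edit is {isChange (0 or 1), oldLength, newLength}.
     */
    public static String applyChanges(String source, String destination, int[][] edits) {
        StringBuilder text = new StringBuilder(source);
        // Everything before destIndex in text already matches the destination,
        // so the current span starts at destIndex in both strings.
        int destIndex = 0;
        for (int[] e : edits) {
            boolean change = e[0] != 0;
            int oldLength = e[1];
            int newLength = e[2];
            if (change) {
                text.replace(destIndex, destIndex + oldLength,
                        destination.substring(destIndex, destIndex + newLength));
            }
            // No-change edits are skipped, as a "changes" iterator would do.
            destIndex += newLength;
        }
        return text.toString();
    }

    /** Coarse edits for "abcßDeF" case-folded to "abcssdef". */
    public static String example() {
        int[][] coarse = {
            {0, 3, 3}, // abc -> abc (no-change)
            {1, 2, 3}, // ßD -> ssd (change)
            {0, 1, 1}, // e -> e (no-change)
            {1, 1, 1}, // F -> f (change)
        };
        return applyChanges("abc\u00DFDeF", "abcssdef", coarse);
    }

    public static void main(String[] args) {
        System.out.println(example()); // prints "abcssdef"
    }
}
```

Coarse edits suit this kind of update because adjacent changes are applied in one replacement instead of several.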
-
getCoarseIterator
public Edits.Iterator getCoarseIterator()
Returns an Iterator for coarse-grained change and no-change edits (adjacent change edits are treated as one). Can be used to perform simple string updates.
- Returns:
- an Iterator that merges adjacent changes.
-
getFineChangesIterator
public Edits.Iterator getFineChangesIterator()
Returns an Iterator for fine-grained change edits (full granularity of change edits is retained). Can be used for modifying styled text. Skips no-change edits.
- Returns:
- an Iterator that separates adjacent changes.
-
getFineIterator
public Edits.Iterator getFineIterator()
Returns an Iterator for fine-grained change and no-change edits (full granularity of change edits is retained). Can be used for modifying styled text.
- Returns:
- an Iterator that separates adjacent changes.
-
mergeAndAppend
public Edits mergeAndAppend(Edits ab, Edits bc)
Merges the two input Edits and appends the result to this object.

Consider two string transformations (for example, normalization and case mapping) where each records Edits in addition to writing an output string.
Edits ab reflect how substrings of input string a map to substrings of intermediate string b.
Edits bc reflect how substrings of intermediate string b map to substrings of output string c.
This function merges ab and bc such that the additional edits recorded in this object reflect how substrings of input string a map to substrings of output string c.

If unrelated Edits are passed in where the output string of the first has a different length than the input string of the second, then an IllegalArgumentException is thrown.
- Parameters:
ab
- reflects how substrings of input string a map to substrings of intermediate string b.
bc
- reflects how substrings of intermediate string b map to substrings of output string c.
- Returns:
- this, with the merged edits appended
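The length precondition described above can be illustrated with a short sketch. It does not reproduce the merge itself. It models only the compatibility check between ab and bc and the net length change from a to c. `MergeLengthCheckSketch` and its methods are hypothetical names, and each edit is reduced to an `{oldLength, newLength}` pair.

```java
public class MergeLengthCheckSketch {
    /** Total length of the input string covered by the edits. */
    public static int sourceLength(int[][] edits) {
        int total = 0;
        for (int[] e : edits) {
            total += e[0];
        }
        return total;
    }

    /** Total length of the output string covered by the edits. */
    public static int destinationLength(int[][] edits) {
        int total = 0;
        for (int[] e : edits) {
            total += e[1];
        }
        return total;
    }

    /**
     * Returns the length of c minus the length of a, or throws if the output
     * of ab and the input of bc have different lengths (unrelated edits).
     */
    public static int chainedLengthDelta(int[][] ab, int[][] bc) {
        int bFromAb = destinationLength(ab);
        int bFromBc = sourceLength(bc);
        if (bFromAb != bFromBc) {
            throw new IllegalArgumentException(
                    "ab produces length " + bFromAb + " but bc consumes length " + bFromBc);
        }
        return destinationLength(bc) - sourceLength(ab);
    }

    public static void main(String[] args) {
        int[][] ab = {{3, 3}, {1, 2}, {3, 3}};         // 7 units in, 8 units out
        int[][] bc = {{5, 5}, {1, 1}, {1, 1}, {1, 1}}; // 8 units in, 8 units out
        System.out.println(chainedLengthDelta(ab, bc)); // prints 1
    }
}
```

In this model, passing a bc that consumes 7 units after an ab that produces 8 triggers the IllegalArgumentException, mirroring the documented behavior for unrelated Edits.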
-
-