Package com.ibm.icu.charset
Class CharsetEncoderICU
java.lang.Object
java.nio.charset.CharsetEncoder
com.ibm.icu.charset.CharsetEncoderICU
- Direct Known Subclasses:
CharsetASCII.CharsetEncoderASCII, CharsetBOCU1.CharsetEncoderBOCU, CharsetCompoundText.CharsetEncoderCompoundText, CharsetHZ.CharsetEncoderHZ, CharsetISCII.CharsetEncoderISCII, CharsetISO2022.CharsetEncoderISO2022CN, CharsetISO2022.CharsetEncoderISO2022JP, CharsetISO2022.CharsetEncoderISO2022KR, CharsetLMBCS.CharsetEncoderLMBCS, CharsetMBCS.CharsetEncoderMBCS, CharsetSCSU.CharsetEncoderSCSU, CharsetUTF16.CharsetEncoderUTF16, CharsetUTF32.CharsetEncoderUTF32, CharsetUTF7.CharsetEncoderUTF7, CharsetUTF8.CharsetEncoderUTF8
An abstract class that provides framework methods for the encoding operations of concrete subclasses.
In the future, this class will contain API that implements the converter semantics of ICU4C.
-
Field Summary
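Fields
Modifier and Type | Field | Description
private static final CharBuffer | EMPTY |
(package private) byte[] | errorBuffer |
(package private) int | errorBufferLength |
(package private) static final int | EXT_MAX_UCHARS |
(package private) CharsetCallback.Encoder | fromCharErrorBehaviour |
(package private) int | fromUChar32 |
(package private) Object | fromUContext |
(package private) int | fromUnicodeStatus | these are for encodeLoopICU
(package private) char[] | invalidUCharBuffer |
(package private) int | invalidUCharLength |
(package private) static final char | MISSING_CHAR_MARKER |
private CharsetCallback.Encoder | onMalformedInput |
private CharsetCallback.Encoder | onUnmappableInput |
(package private) char[] | preFromUArray |
(package private) int | preFromUBegin |
(package private) int | preFromUFirstCP |
(package private) int | preFromULength |
(package private) boolean | useFallback |
(package private) boolean | useSubChar1 |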
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
(package private) CoderResult | cbFromUWriteSub(CharsetEncoderICU encoder, CharBuffer source, ByteBuffer target, IntBuffer offsets) |
(package private) CoderResult | cbFromUWriteUChars(CharsetEncoderICU encoder, CharBuffer source, ByteBuffer target, IntBuffer offsets) |
(package private) final CoderResult | encode(CharBuffer source, ByteBuffer target, IntBuffer offsets, boolean flush) |
protected CoderResult | encodeLoop(CharBuffer in, ByteBuffer out) | Encodes one or more chars.
(package private) abstract CoderResult | encodeLoop(CharBuffer source, ByteBuffer target, IntBuffer offsets, boolean flush) |
(package private) int | fromUCountPending() |
private void | fromUnicodeReset() |
(package private) final CoderResult | fromUnicodeWithCallback(CharBuffer source, ByteBuffer target, IntBuffer offsets, boolean flush) |
(package private) static final CoderResult | fromUWriteBytes(CharsetEncoderICU cnv, byte[] bytesArray, int bytesBegin, int bytesLength, ByteBuffer out, IntBuffer offsets, int sourceIndex) |
private static CharsetCallback.Encoder | getCallback(CodingErrorAction action) |
static int | getMaxBytesForString(int length, int maxCharSize) | Calculates the size of a buffer for conversion from Unicode to a charset.
(package private) final CoderResult | handleSurrogates(char[] sourceArray, int sourceIndex, int sourceLimit, char lead) | Same as handleSurrogates(CharBuffer, char), but with arrays.
(package private) final CoderResult | handleSurrogates(CharBuffer source, char lead) | Handles a common situation where a character has been read and it may be a lead surrogate followed by a trail surrogate.
protected CoderResult | implFlush(ByteBuffer out) | Flushes any characters saved in the converter's internal buffer and resets the converter.
protected void | implOnMalformedInput(CodingErrorAction newAction) | Sets the action to be taken if an illegal sequence is encountered.
protected void | implOnUnmappableCharacter(CodingErrorAction newAction) | Sets the action to be taken if an unmappable character is encountered.
protected void | implReset() | Resets the from-Unicode mode of the converter.
boolean | isFallbackUsed() | Reports whether this encoder is allowed to use fallbacks. A fallback mapping converts a Unicode code point sequence to a byte sequence, but the encoded byte sequence round-trip converts to a different Unicode code point sequence.
(package private) static final boolean | isFromUUseFallback(boolean iUseFallback, int c) | Use fallbacks from Unicode to codepage when useFallback is set or for private-use code points.
(package private) final boolean | isFromUUseFallback(int c) |
boolean | isLegalReplacement(byte[] repl) | Overrides the superclass method.
private static final boolean | isUnicodePrivateUse(int c) |
final float | maxCharsPerByte() | Returns the maxCharsPerByte value for the Charset that created this encoder.
void | setFallbackUsed(boolean usesFallback) | Sets whether this encoder can use fallbacks.
final void | setFromUCallback(CoderResult err, CharsetCallback.Encoder newCallback, Object newContext) | Sets the callback encoder method and context to be used if an illegal sequence is encountered.
final void | setFromUContext(Object newContext) | Sets fromUContext used in callbacks.
private final void | setSourcePosition(CharBuffer source) |
Methods inherited from class java.nio.charset.CharsetEncoder
averageBytesPerChar, canEncode, canEncode, charset, encode, encode, flush, implReplaceWith, malformedInputAction, maxBytesPerChar, onMalformedInput, onUnmappableCharacter, replacement, replaceWith, reset, unmappableCharacterAction
-
Field Details
-
MISSING_CHAR_MARKER
static final char MISSING_CHAR_MARKER -
errorBuffer
byte[] errorBuffer -
errorBufferLength
int errorBufferLength -
fromUnicodeStatus
int fromUnicodeStatus
These are for encodeLoopICU.
-
fromUChar32
int fromUChar32 -
useSubChar1
boolean useSubChar1 -
useFallback
boolean useFallback -
EXT_MAX_UCHARS
static final int EXT_MAX_UCHARS -
preFromUFirstCP
int preFromUFirstCP -
preFromUArray
char[] preFromUArray -
preFromUBegin
int preFromUBegin -
preFromULength
int preFromULength -
invalidUCharBuffer
char[] invalidUCharBuffer -
invalidUCharLength
int invalidUCharLength -
fromUContext
Object fromUContext -
onUnmappableInput
private CharsetCallback.Encoder onUnmappableInput -
onMalformedInput
private CharsetCallback.Encoder onMalformedInput -
fromCharErrorBehaviour
CharsetCallback.Encoder fromCharErrorBehaviour -
EMPTY
private static final CharBuffer EMPTY -
-
Constructor Details
-
CharsetEncoderICU
CharsetEncoderICU(CharsetICU cs, byte[] replacement)
-
-
Method Details
-
isFallbackUsed
public boolean isFallbackUsed()
Reports whether this encoder is allowed to use fallbacks. A fallback mapping converts a Unicode code point sequence to a byte sequence, but the encoded byte sequence round-trip converts to a different Unicode code point sequence.
- Returns:
- true if the converter uses fallback mappings, false otherwise.
-
setFallbackUsed
public void setFallbackUsed(boolean usesFallback)
Sets whether this encoder can use fallbacks.
- Parameters:
usesFallback - true if the user wants the converter to take advantage of the fallback mapping, false otherwise.
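The round-trip property that defines a fallback mapping can be checked with the standard JDK alone. The following is a minimal sketch, not ICU code: `RoundTripSketch` and `roundTrips` are hypothetical names, and the JDK's built-in encoders substitute a replacement byte rather than apply ICU fallback tables, so this only illustrates what "round trips to a different sequence" means.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of ICU4J.
final class RoundTripSketch {
    /** Returns true if encoding text with cs and decoding the bytes again reproduces text exactly. */
    static boolean roundTrips(String text, Charset cs) {
        // String.getBytes(Charset) replaces unmappable characters with the charset's replacement bytes.
        byte[] encoded = text.getBytes(cs);
        String decoded = new String(encoded, cs);
        return decoded.equals(text);
    }

    public static void main(String[] args) {
        // U+00E9 exists in ISO-8859-1, so the text survives the round trip.
        System.out.println(roundTrips("caf\u00E9", StandardCharsets.ISO_8859_1)); // true
        // US-ASCII cannot represent U+00E9, so the decoded text differs from the input.
        System.out.println(roundTrips("caf\u00E9", StandardCharsets.US_ASCII));   // false
    }
}
```

A converter with fallbacks enabled would typically fail this check for exactly those characters that were encoded through a fallback mapping.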
-
isFromUUseFallback
final boolean isFromUUseFallback(int c) -
isFromUUseFallback
static final boolean isFromUUseFallback(boolean iUseFallback, int c)
Use fallbacks from Unicode to codepage when useFallback is set or for private-use code points.
-
isUnicodePrivateUse
private static final boolean isUnicodePrivateUse(int c) -
implOnMalformedInput
protected void implOnMalformedInput(CodingErrorAction newAction)
Sets the action to be taken if an illegal sequence is encountered.
- Overrides:
implOnMalformedInput in class CharsetEncoder
- Parameters:
newAction - action to be taken
- Throws:
IllegalArgumentException
-
implOnUnmappableCharacter
protected void implOnUnmappableCharacter(CodingErrorAction newAction)
Sets the action to be taken if an unmappable character is encountered.
- Overrides:
implOnUnmappableCharacter in class CharsetEncoder
- Parameters:
newAction - action to be taken
- Throws:
IllegalArgumentException
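These two hooks are invoked when a caller changes the error actions through the public `onMalformedInput` and `onUnmappableCharacter` methods inherited from `CharsetEncoder`. The sketch below shows the three standard actions using a JDK US-ASCII encoder; it assumes nothing about ICU internals, and `ErrorActionSketch` and `encodeAscii` are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of ICU4J.
final class ErrorActionSketch {
    /**
     * Encodes text to US-ASCII with the given action for both error kinds and
     * returns the bytes decoded back to a String, or null if the encoder reported an error.
     */
    static String encodeAscii(String text, CodingErrorAction action) {
        CharsetEncoder enc = StandardCharsets.US_ASCII.newEncoder()
                .onMalformedInput(action)          // triggers implOnMalformedInput
                .onUnmappableCharacter(action);    // triggers implOnUnmappableCharacter
        try {
            ByteBuffer bytes = enc.encode(CharBuffer.wrap(text));
            return StandardCharsets.US_ASCII.decode(bytes).toString();
        } catch (CharacterCodingException e) {
            return null; // REPORT surfaces the problem as an exception
        }
    }

    public static void main(String[] args) {
        String input = "a\u20ACb"; // U+20AC has no US-ASCII mapping
        System.out.println(encodeAscii(input, CodingErrorAction.REPLACE)); // a?b
        System.out.println(encodeAscii(input, CodingErrorAction.IGNORE));  // ab
        System.out.println(encodeAscii(input, CodingErrorAction.REPORT));  // null
    }
}
```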
-
setFromUCallback
public final void setFromUCallback(CoderResult err, CharsetCallback.Encoder newCallback, Object newContext)
Sets the callback encoder method and context to be used if an illegal sequence is encountered. You would normally call this twice, once for the malformed-input error and once for the unmappable-character error. In that case, newContext should remain the same, since using a different newContext each time will negate the last one used.
- Parameters:
err - CoderResult
newCallback - CharsetCallback.Encoder
newContext - Object
-
setFromUContext
public final void setFromUContext(Object newContext)
Sets fromUContext used in callbacks.
- Parameters:
newContext - Object
- Throws:
IllegalArgumentException - The object is an illegal argument for UContext.
-
getCallback
private static CharsetCallback.Encoder getCallback(CodingErrorAction action) -
implFlush
protected CoderResult implFlush(ByteBuffer out)
Flushes any characters saved in the converter's internal buffer and resets the converter.
- Overrides:
implFlush in class CharsetEncoder
- Parameters:
out - the buffer that receives the flushed bytes
- Returns:
- result of the flushing action, which completes the encoding of all input. Returns CoderResult.UNDERFLOW if the action succeeds.
-
implReset
protected void implReset()
Resets the from-Unicode mode of the converter.
- Overrides:
implReset in class CharsetEncoder
-
fromUnicodeReset
private void fromUnicodeReset() -
encodeLoop
protected CoderResult encodeLoop(CharBuffer in, ByteBuffer out)
Encodes one or more chars. By default, the converter stops and reports an error if one is encountered in the input stream. To set different behaviour, use CharsetEncoder.onMalformedInput().
- Specified by:
encodeLoop in class CharsetEncoder
- Parameters:
in - buffer to encode
out - buffer to populate with the encoded result
- Returns:
- result of the encoding action. Returns CoderResult.UNDERFLOW if the encoding action succeeds or more input is needed to complete it.
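Callers never invoke encodeLoop directly; it is driven by the public `encode(CharBuffer, ByteBuffer, boolean)` and `flush(ByteBuffer)` methods of `CharsetEncoder`, which hand back UNDERFLOW or OVERFLOW. The following sketch shows that calling protocol with a deliberately small output buffer. It uses only JDK classes; `ChunkedEncodeSketch`, `encode`, and `drain` are hypothetical names, and the minimum chunk size of 8 is an assumption chosen so that every call can make progress with common charsets.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;

// Hypothetical helper, not part of ICU4J.
final class ChunkedEncodeSketch {
    static byte[] encode(String text, Charset cs, int chunkSize) {
        if (chunkSize < 8) {
            // Assumption: 8 bytes is enough room for one encoded character in common charsets.
            throw new IllegalArgumentException("chunkSize must be at least 8");
        }
        CharsetEncoder enc = cs.newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        CharBuffer in = CharBuffer.wrap(text);
        ByteBuffer out = ByteBuffer.allocate(chunkSize);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();

        CoderResult result;
        do {
            // OVERFLOW means the output buffer filled up before the input was consumed.
            result = enc.encode(in, out, true);
            drain(out, sink);
        } while (result.isOverflow());

        do {
            // flush() lets stateful encoders write any trailing bytes.
            result = enc.flush(out);
            drain(out, sink);
        } while (result.isOverflow());

        return sink.toByteArray();
    }

    /** Copies the bytes written so far into sink and empties the buffer for reuse. */
    private static void drain(ByteBuffer out, ByteArrayOutputStream sink) {
        out.flip();
        sink.write(out.array(), out.arrayOffset() + out.position(), out.remaining());
        out.clear();
    }
}
```

With REPLACE set for both error kinds, the only results the loop has to handle are UNDERFLOW and OVERFLOW; with REPORT, the caller would also need to check `result.isError()`.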
-
encodeLoop
abstract CoderResult encodeLoop(CharBuffer source, ByteBuffer target, IntBuffer offsets, boolean flush) -
encode
final CoderResult encode(CharBuffer source, ByteBuffer target, IntBuffer offsets, boolean flush) -
fromUnicodeWithCallback
final CoderResult fromUnicodeWithCallback(CharBuffer source, ByteBuffer target, IntBuffer offsets, boolean flush) -
isLegalReplacement
public boolean isLegalReplacement(byte[] repl)
Overrides the superclass method.
- Overrides:
isLegalReplacement in class CharsetEncoder
-
fromUWriteBytes
static final CoderResult fromUWriteBytes(CharsetEncoderICU cnv, byte[] bytesArray, int bytesBegin, int bytesLength, ByteBuffer out, IntBuffer offsets, int sourceIndex) -
fromUCountPending
int fromUCountPending() -
setSourcePosition
private final void setSourcePosition(CharBuffer source)
- Parameters:
source -
-
cbFromUWriteSub
CoderResult cbFromUWriteSub(CharsetEncoderICU encoder, CharBuffer source, ByteBuffer target, IntBuffer offsets) -
cbFromUWriteUChars
CoderResult cbFromUWriteUChars(CharsetEncoderICU encoder, CharBuffer source, ByteBuffer target, IntBuffer offsets) -
handleSurrogates
final CoderResult handleSurrogates(CharBuffer source, char lead)
Handles a common situation where a character has been read and it may be a lead surrogate followed by a trail surrogate. This method can change the source position and will modify fromUChar32.
If null is returned, then a surrogate pair was read successfully, the code point is stored in fromUChar32, and fromUChar32 should be reset (to 0) after being read.
- Parameters:
source - The encoding source.
lead - A character that may be the first in a surrogate pair.
- Returns:
CoderResult.malformedForLength(1) or CoderResult.UNDERFLOW if there is a problem, or null if there isn't.
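The contract described above can be sketched with JDK classes only. This is an illustration of the documented behaviour, not the ICU4J implementation: `SurrogateSketch`, `handle`, and the `codePoint` field (standing in for fromUChar32) are hypothetical, and leaving the source position untouched on failure is an assumption.

```java
import java.nio.CharBuffer;
import java.nio.charset.CoderResult;

// Hypothetical illustration, not part of ICU4J.
final class SurrogateSketch {
    /** Stands in for fromUChar32: holds the last character or code point examined. */
    int codePoint;

    /**
     * Returns null if lead plus the next char in source form a surrogate pair,
     * otherwise a CoderResult describing why the pair could not be read.
     */
    CoderResult handle(CharBuffer source, char lead) {
        codePoint = lead;
        if (!Character.isHighSurrogate(lead)) {
            // Not a lead surrogate (for example an unpaired trail surrogate).
            return CoderResult.malformedForLength(1);
        }
        if (!source.hasRemaining()) {
            // The trail surrogate may arrive with the next chunk of input.
            return CoderResult.UNDERFLOW;
        }
        char trail = source.get(source.position()); // absolute get: peek without consuming
        if (!Character.isLowSurrogate(trail)) {
            return CoderResult.malformedForLength(1);
        }
        source.position(source.position() + 1);     // consume the trail surrogate
        codePoint = Character.toCodePoint(lead, trail);
        return null;
    }
}
```

Returning null for success lets a caller write `if ((cr = handle(source, c)) != null) return cr;` and then continue with the combined code point.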
-
handleSurrogates
final CoderResult handleSurrogates(char[] sourceArray, int sourceIndex, int sourceLimit, char lead)
Same as handleSurrogates(CharBuffer, char), but with arrays. As an added requirement, the calling method must also increment the index if this method returns null.
- Parameters:
source - The encoding source.
lead - A character that may be the first in a surrogate pair.
- Returns:
CoderResult.malformedForLength(1) or CoderResult.UNDERFLOW if there is a problem, or null if there isn't.
-
maxCharsPerByte
public final float maxCharsPerByte()
Returns the maxCharsPerByte value for the Charset that created this encoder.
- Returns:
- maxCharsPerByte
-
getMaxBytesForString
public static int getMaxBytesForString(int length, int maxCharSize)
Calculates the size of a buffer for conversion from Unicode to a charset. The calculated size is guaranteed to be sufficient for this conversion. It takes into account initial and final non-character bytes that are output by some converters. It does not take into account callbacks which output more than one charset character sequence per call, like escape callbacks. The default (substitution) callback only outputs one charset character sequence.
- Parameters:
length - Number of chars to be converted.
maxCharSize - Return value from maxBytesPerChar for the converter that will be used.
- Returns:
- Size of a buffer that will be large enough to hold the output bytes.
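A sizing rule of this kind can be sketched as "pad the char count, then multiply by the per-char maximum". The code below is an assumption-laden illustration rather than the ICU4J source: the padding of 10 chars is modeled on ICU4C's UCNV_GET_MAX_BYTES_FOR_STRING macro and is assumed here, and `BufferSizeSketch`, `maxBytesForString`, and `encodeWithSizedBuffer` are hypothetical names. It uses a JDK encoder so that it runs without ICU on the classpath.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.util.Arrays;

// Hypothetical helper, not part of ICU4J.
final class BufferSizeSketch {
    /** Assumed allowance for initial and final non-character bytes such as a BOM or escape sequences. */
    static final int ASSUMED_PADDING_CHARS = 10;

    static int maxBytesForString(int length, int maxCharSize) {
        return (length + ASSUMED_PADDING_CHARS) * maxCharSize;
    }

    /** Encodes text in a single pass into a buffer sized up front. */
    static byte[] encodeWithSizedBuffer(String text, Charset cs) {
        CharsetEncoder enc = cs.newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        int maxCharSize = (int) Math.ceil(enc.maxBytesPerChar());
        ByteBuffer out = ByteBuffer.allocate(maxBytesForString(text.length(), maxCharSize));
        enc.encode(CharBuffer.wrap(text), out, true); // buffer is large enough, so no OVERFLOW handling here
        enc.flush(out);
        return Arrays.copyOf(out.array(), out.position());
    }
}
```

Sizing the buffer up front trades some memory for a simpler call sequence, since the overflow loop is no longer needed for the default substitution behaviour.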
-