Package com.ibm.icu.charset
Class CharsetCESU8
java.lang.Object
java.nio.charset.Charset
com.ibm.icu.charset.CharsetICU
com.ibm.icu.charset.CharsetUTF8
com.ibm.icu.charset.CharsetCESU8
- All Implemented Interfaces:
Comparable<Charset>
This class sets isCESU8 to true in its superclass, CharsetUTF8, and lets the Charset framework open
the CESU-8 variant of the UTF-8 converter without extra setup work. CESU-8 encodes and decodes each supplementary
character (U+10000 and above) as 6 bytes, one 3-byte sequence per UTF-16 surrogate, instead of the 4 bytes that standard UTF-8 uses.
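The 6-byte versus 4-byte difference can be illustrated with a minimal sketch. The class Cesu8Sketch and its methods encodeCesu8 and hex are hypothetical names written for this example only; they are a hand-written illustration of the CESU-8 byte layout using only the JDK, not ICU's converter and not part of com.ibm.icu.charset.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of ICU4J.
class Cesu8Sketch {

    // Encodes a string as CESU-8 by treating every UTF-16 code unit,
    // including each half of a surrogate pair, as its own character.
    static byte[] encodeCesu8(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i); // one UTF-16 code unit
            if (c < 0x80) {
                out.write(c);                          // 1 byte
            } else if (c < 0x800) {
                out.write(0xC0 | (c >> 6));            // 2 bytes
                out.write(0x80 | (c & 0x3F));
            } else {
                // 3 bytes; surrogates land here, so a pair yields 3 + 3 = 6 bytes
                out.write(0xE0 | (c >> 12));
                out.write(0x80 | ((c >> 6) & 0x3F));
                out.write(0x80 | (c & 0x3F));
            }
        }
        return out.toByteArray();
    }

    // Formats bytes as space-separated uppercase hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // U+1F600 is a supplementary character (surrogate pair D83D DE00).
        String s = new String(Character.toChars(0x1F600));
        System.out.println(hex(encodeCesu8(s)));                       // ED A0 BD ED B8 80
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_8)));   // F0 9F 98 80
    }
}
```

For characters in the Basic Multilingual Plane the two encodings produce identical bytes; they differ only for supplementary characters.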
-
Nested Class Summary
Nested classes/interfaces inherited from class com.ibm.icu.charset.CharsetUTF8
CharsetUTF8.CharsetDecoderUTF8, CharsetUTF8.CharsetEncoderUTF8
Field Summary
Fields inherited from class com.ibm.icu.charset.CharsetICU
codepage, conversionType, hasFromUnicodeFallback, hasToUnicodeFallback, icuCanonicalName, maxBytesPerChar, maxCharsPerByte, minBytesPerChar, name, options, platform, ROUNDTRIP_AND_FALLBACK_SET, ROUNDTRIP_SET, subChar, subChar1, subCharLen, unicodeMask
Constructor Summary
Constructors
Constructor: CharsetCESU8(String icuCanonicalName, String javaCanonicalName, String[] aliases)
Method Summary
Modifier and Type: (package private) void
Method: getUnicodeSetImpl(UnicodeSet setFillIn, int which)
Description: This follows ucnv.c method ucnv_detectUnicodeSignature() to detect the start of the stream for example U+FEFF (the Unicode BOM/signature character) that can be ignored.
Methods inherited from class com.ibm.icu.charset.CharsetUTF8
newDecoder, newEncoder
Methods inherited from class com.ibm.icu.charset.CharsetICU
contains, forNameICU, getCharset, getCompleteUnicodeSet, getNonSurrogateUnicodeSet, getUnicodeSet, isFixedWidth, isSurrogate
Methods inherited from class java.nio.charset.Charset
aliases, availableCharsets, canEncode, compareTo, decode, defaultCharset, displayName, displayName, encode, encode, equals, forName, forName, hashCode, isRegistered, isSupported, name, toString
-
Constructor Details
-
CharsetCESU8
-
-
Method Details
-
getUnicodeSetImpl
Description copied from class: CharsetICU
This follows ucnv.c method ucnv_detectUnicodeSignature() to detect the start of the stream, for example U+FEFF (the Unicode BOM/signature character), which can be ignored. Detects Unicode signature byte sequences at the start of the byte stream and returns the number of bytes of the BOM of the indicated Unicode charset. 0 is returned when no Unicode signature is recognized.
- Overrides:
getUnicodeSetImpl in class CharsetUTF8
-