Package com.ibm.icu.impl.breakiter
Class LSTMBreakEngine
java.lang.Object
com.ibm.icu.impl.breakiter.DictionaryBreakEngine
com.ibm.icu.impl.breakiter.LSTMBreakEngine
All Implemented Interfaces:
LanguageBreakEngine
-
Nested Class Summary
Nested Classes
Modifier and Type    Class    Description
(package private) class
static enum
(package private) class
static enum
static class
(package private) class
Nested classes/interfaces inherited from class com.ibm.icu.impl.breakiter.DictionaryBreakEngine
DictionaryBreakEngine.DequeI, DictionaryBreakEngine.PossibleWord -
Field Summary
Fields
Modifier and Type    Field    Description
private final LSTMBreakEngine.LSTMData    fData
private int    fScript
private final LSTMBreakEngine.Vectorizer    fVectorizer
private static final byte    MIN_WORD
private static final byte    MIN_WORD_SPAN
Fields inherited from class com.ibm.icu.impl.breakiter.DictionaryBreakEngine
fSet -
Constructor Summary
Constructors -
Method Summary
Modifier and Type    Method    Description
private static void    addDotProductTo(float[] a, float[][] b, float[] result)
private static void    addHadamardProductTo(float[] a, float[] b, float[] result)
private static void    addTo(float[] a, float[] result)
private float[]    compute(float[][] W, float[][] U, float[] B, float[] x, float[] h, float[] c)
static LSTMBreakEngine    create(int script, LSTMBreakEngine.LSTMData data)
static LSTMBreakEngine.LSTMData    createData(int script)
static LSTMBreakEngine.LSTMData    createData(UResourceBundle bundle)
private static String    defaultLSTM(int script)
int    divideUpDictionaryRange(CharacterIterator fIter, int rangeStart, int rangeEnd, DictionaryBreakEngine.DequeI foundBreaks, boolean isPhraseBreaking)    Divide up a range of known dictionary characters handled by this break engine.
private static void    hadamardProductTo(float[] a, float[] result)
boolean    handles(int c)
int    hashCode()
private static float[]    make1DArray(int[] data, int start, int d1)
private static float[][]    make2DArray(int[] data, int start, int d1, int d2)
private LSTMBreakEngine.Vectorizer
private static int    maxIndex(float[] data)
private static void    sigmoid(float[] result, int start, int length)
private static void    tanh(float[] result, int start, int length)
Methods inherited from class com.ibm.icu.impl.breakiter.DictionaryBreakEngine
findBreaks, setCharacters
-
Field Details
-
MIN_WORD
private static final byte MIN_WORD
-
MIN_WORD_SPAN
private static final byte MIN_WORD_SPAN
-
fData
private final LSTMBreakEngine.LSTMData fData
-
fScript
private int fScript -
fVectorizer
private final LSTMBreakEngine.Vectorizer fVectorizer
-
-
Constructor Details
-
LSTMBreakEngine
-
-
Method Details
-
make2DArray
private static float[][] make2DArray(int[] data, int start, int d1, int d2) -
make1DArray
private static float[] make1DArray(int[] data, int start, int d1) -
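Both helpers turn a slice of packed `int[]` model data into float arrays. The following is a minimal sketch of one plausible decoding, not the ICU implementation. It assumes each `int` carries the raw IEEE 754 bit pattern of a `float` and that 2-D data is stored row by row; the class name `ArrayDecodeSketch` is hypothetical.

```java
final class ArrayDecodeSketch {
    // Assumption: data[start .. start + d1) holds float bit patterns.
    static float[] make1DArray(int[] data, int start, int d1) {
        float[] result = new float[d1];
        for (int i = 0; i < d1; i++) {
            result[i] = Float.intBitsToFloat(data[start + i]);
        }
        return result;
    }

    // Assumption: d1 rows of d2 values each, stored consecutively (row-major).
    static float[][] make2DArray(int[] data, int start, int d1, int d2) {
        float[][] result = new float[d1][d2];
        int pos = start;
        for (int i = 0; i < d1; i++) {
            for (int j = 0; j < d2; j++) {
                result[i][j] = Float.intBitsToFloat(data[pos++]);
            }
        }
        return result;
    }
}
```

Under these assumptions, a caller would pass the same `data` array to both helpers and advance `start` by `d1` or `d1 * d2` after each call.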
makeVectorizer
-
hashCode
public int hashCode() -
handles
public boolean handles(int c)
Specified by:
handles in interface LanguageBreakEngine
Overrides:
handles in class DictionaryBreakEngine
Parameters:
c - A Unicode codepoint value
Returns:
true if the engine can handle this character, false otherwise
-
addDotProductTo
private static void addDotProductTo(float[] a, float[][] b, float[] result) -
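The signature suggests a vector-matrix product accumulated into `result`. A minimal sketch under that reading, assuming `b` has `a.length` rows and `result.length` columns so that `result[j] += sum over i of a[i] * b[i][j]`; the orientation of `b` and the class name `DotProductSketch` are assumptions, not confirmed by this page.

```java
final class DotProductSketch {
    // Assumption: b is indexed as b[i][j] with i over a and j over result.
    static void addDotProductTo(float[] a, float[][] b, float[] result) {
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < result.length; j++) {
                result[j] += a[i] * b[i][j];
            }
        }
    }

    public static void main(String[] args) {
        float[] a = {1f, 2f};
        float[][] b = {{1f, 2f, 3f}, {4f, 5f, 6f}};
        float[] result = {10f, 0f, 0f};
        addDotProductTo(a, b, result);
        // prints [19.0, 12.0, 15.0]
        System.out.println(java.util.Arrays.toString(result));
    }
}
```

Accumulating into an existing array lets a caller preload `result` with a bias vector before adding the products.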
addTo
private static void addTo(float[] a, float[] result) -
hadamardProductTo
private static void hadamardProductTo(float[] a, float[] result) -
addHadamardProductTo
private static void addHadamardProductTo(float[] a, float[] b, float[] result) -
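The three element-wise helpers (`addTo`, `hadamardProductTo`, `addHadamardProductTo`) can be read off their names. A minimal sketch, assuming all arrays have equal length and that each method updates `result` in place; the class name `ElementwiseSketch` is hypothetical.

```java
final class ElementwiseSketch {
    // result[i] += a[i]
    static void addTo(float[] a, float[] result) {
        for (int i = 0; i < result.length; i++) {
            result[i] += a[i];
        }
    }

    // result[i] *= a[i]  (Hadamard, i.e. element-wise, product)
    static void hadamardProductTo(float[] a, float[] result) {
        for (int i = 0; i < result.length; i++) {
            result[i] *= a[i];
        }
    }

    // result[i] += a[i] * b[i]
    static void addHadamardProductTo(float[] a, float[] b, float[] result) {
        for (int i = 0; i < result.length; i++) {
            result[i] += a[i] * b[i];
        }
    }
}
```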
sigmoid
private static void sigmoid(float[] result, int start, int length) -
tanh
private static void tanh(float[] result, int start, int length) -
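`sigmoid` and `tanh` take a `start` and `length`, which suggests they apply the activation in place to one slice of a larger array. A minimal sketch under that assumption, using the standard definitions sigmoid(x) = 1 / (1 + e^(-x)) and `Math.tanh`; the class name `ActivationSketch` is hypothetical.

```java
final class ActivationSketch {
    // Applies the logistic function to result[start .. start + length).
    static void sigmoid(float[] result, int start, int length) {
        for (int i = start; i < start + length; i++) {
            result[i] = (float) (1.0 / (1.0 + Math.exp(-result[i])));
        }
    }

    // Applies the hyperbolic tangent to result[start .. start + length).
    static void tanh(float[] result, int start, int length) {
        for (int i = start; i < start + length; i++) {
            result[i] = (float) Math.tanh(result[i]);
        }
    }
}
```

Operating on a slice lets several gate vectors share one array, with a different activation applied to each segment.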
maxIndex
private static int maxIndex(float[] data) -
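`maxIndex` reads as an argmax over `data`. A minimal sketch, assuming a non-empty array and that the first maximum wins on ties (the tie-breaking rule is an assumption); the class name `MaxIndexSketch` is hypothetical.

```java
final class MaxIndexSketch {
    // Returns the index of the largest element; earliest index on ties.
    static int maxIndex(float[] data) {
        int index = 0;
        for (int i = 1; i < data.length; i++) {
            if (data[i] > data[index]) {
                index = i;
            }
        }
        return index;
    }
}
```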
compute
private float[] compute(float[][] W, float[][] U, float[] B, float[] x, float[] h, float[] c) -
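The parameter names (`W`, `U`, `B`, `x`, `h`, `c`) match a standard LSTM cell step. The sketch below shows that textbook step and is not taken from the ICU source. It assumes `n = c.length` hidden units, `W` of shape `[x.length][4n]`, `U` of shape `[n][4n]`, `B` of length `4n`, gates laid out in the order input, forget, candidate, output, and that `c` is updated in place while the new hidden state is returned. The class name `LstmCellSketch` is hypothetical.

```java
final class LstmCellSketch {
    // One LSTM step: updates c in place and returns the new hidden state.
    static float[] compute(float[][] W, float[][] U, float[] B,
                           float[] x, float[] h, float[] c) {
        int n = c.length;
        float[] gates = B.clone();       // start from the bias, length 4 * n
        addDotProductTo(x, W, gates);    // + x . W
        addDotProductTo(h, U, gates);    // + h . U
        float[] newH = new float[n];
        for (int k = 0; k < n; k++) {
            // Assumed gate layout: [input | forget | candidate | output]
            float in = sigmoid(gates[k]);
            float forget = sigmoid(gates[n + k]);
            float candidate = (float) Math.tanh(gates[2 * n + k]);
            float out = sigmoid(gates[3 * n + k]);
            c[k] = forget * c[k] + in * candidate;
            newH[k] = out * (float) Math.tanh(c[k]);
        }
        return newH;
    }

    private static float sigmoid(float v) {
        return (float) (1.0 / (1.0 + Math.exp(-v)));
    }

    // result[j] += sum over i of a[i] * b[i][j]
    private static void addDotProductTo(float[] a, float[][] b, float[] result) {
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < result.length; j++) {
                result[j] += a[i] * b[i][j];
            }
        }
    }
}
```

With all-zero weights and bias, every sigmoid gate evaluates to 0.5 and the candidate to 0, so the cell state is simply halved on each step, which makes a convenient sanity check.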
divideUpDictionaryRange
public int divideUpDictionaryRange(CharacterIterator fIter, int rangeStart, int rangeEnd, DictionaryBreakEngine.DequeI foundBreaks, boolean isPhraseBreaking)
Description copied from class: DictionaryBreakEngine
Divide up a range of known dictionary characters handled by this break engine.
Specified by:
divideUpDictionaryRange in class DictionaryBreakEngine
Parameters:
fIter - A CharacterIterator representing the text
rangeStart - The start of the range of dictionary characters
rangeEnd - The end of the range of dictionary characters
foundBreaks - Output of break positions. Positions are pushed. Pre-existing contents of the output stack are unaltered.
Returns:
The number of breaks found
-
createData
-
defaultLSTM
-
createData
-
create
-