Package com.ibm.icu.text
Class UnicodeSetSpanner
- java.lang.Object
-
- com.ibm.icu.text.UnicodeSetSpanner
-
public class UnicodeSetSpanner extends java.lang.Object

A helper class used to count, replace, and trim CharSequences based on UnicodeSet matches. An instance is immutable (and thus thread-safe) iff the source UnicodeSet is frozen.

Note: The counting, deletion, and replacement depend on alternating a UnicodeSet.SpanCondition with its inverse. That is, the code spans, then spans for the inverse, then spans, and so on. For the inverse, the following mapping is used:

    UnicodeSet.SpanCondition.SIMPLE         →  UnicodeSet.SpanCondition.NOT_CONTAINED
    UnicodeSet.SpanCondition.CONTAINED      →  UnicodeSet.SpanCondition.NOT_CONTAINED
    UnicodeSet.SpanCondition.NOT_CONTAINED  →  UnicodeSet.SpanCondition.SIMPLE

    SIMPLE          xxx[ab]cyyy
    CONTAINED       xxx[abc]yyy
    NOT_CONTAINED   [xxx]ab[cyyy]

So here is what happens when you alternate:

    start           |xxxabcyyy
    NOT_CONTAINED   xxx|abcyyy
    CONTAINED       xxxabc|yyy
    NOT_CONTAINED   xxxabcyyy|

The entire string is traversed.
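The alternation described above can be illustrated with a small pure-JDK model. This is only a sketch, not ICU code: it assumes a set of single characters (here `a`, `b`, `c`, chosen for illustration) instead of a real UnicodeSet, so it does not reproduce the differences between SIMPLE and CONTAINED that arise when the set holds strings. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Toy model of span alternation; single chars only, not ICU's implementation. */
final class AlternatingSpans {
    /** Returns the end of the run starting at 'start' whose chars are (contained) or are not in 'set'. */
    static int span(CharSequence s, int start, Set<Character> set, boolean contained) {
        int i = start;
        while (i < s.length() && set.contains(s.charAt(i)) == contained) {
            i++;
        }
        return i;
    }

    /** Alternates a non-matching span with a matching span and records every boundary reached. */
    static List<Integer> boundaries(CharSequence s, Set<Character> set) {
        List<Integer> result = new ArrayList<>();
        int pos = 0;
        boolean contained = false; // begin with the inverse (non-matching) span
        result.add(pos);
        while (pos < s.length()) {
            pos = span(s, pos, set, contained);
            result.add(pos);
            contained = !contained; // switch to the opposite condition
        }
        return result;
    }

    public static void main(String[] args) {
        // Boundaries fall at 0, 3, 6 and 9: |xxx|abc|yyy|
        System.out.println(boundaries("xxxabcyyy", Set.of('a', 'b', 'c')));
    }
}
```

Because each span starts exactly where the previous one ended, no character is skipped, which is why the whole string ends up traversed.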
-
-
Nested Class Summary
Nested Classes

static class UnicodeSetSpanner.CountMethod
    Options for replaceFrom and countIn to control how to treat each matched span.
static class UnicodeSetSpanner.TrimOption
    Options for the trim() method.
-
Constructor Summary
Constructors

UnicodeSetSpanner(UnicodeSet source)
    Create a spanner from a UnicodeSet.
-
Method Summary
int countIn(java.lang.CharSequence sequence)
    Returns the number of matching characters found in a character sequence, counting by CountMethod.MIN_ELEMENTS using SpanCondition.SIMPLE.
int countIn(java.lang.CharSequence sequence, UnicodeSetSpanner.CountMethod countMethod)
    Returns the number of matching characters found in a character sequence, using SpanCondition.SIMPLE.
int countIn(java.lang.CharSequence sequence, UnicodeSetSpanner.CountMethod countMethod, UnicodeSet.SpanCondition spanCondition)
    Returns the number of matching characters found in a character sequence.
java.lang.String deleteFrom(java.lang.CharSequence sequence)
    Delete all the matching spans in sequence, using SpanCondition.SIMPLE. The code alternates spans; see the class doc for UnicodeSetSpanner for a note about boundary conditions.
java.lang.String deleteFrom(java.lang.CharSequence sequence, UnicodeSet.SpanCondition spanCondition)
    Delete all matching spans in sequence, according to the spanCondition.
boolean equals(java.lang.Object other)
UnicodeSet getUnicodeSet()
    Returns the UnicodeSet used for processing.
int hashCode()
java.lang.String replaceFrom(java.lang.CharSequence sequence, java.lang.CharSequence replacement)
    Replace all matching spans in sequence by the replacement, counting by CountMethod.MIN_ELEMENTS using SpanCondition.SIMPLE.
java.lang.String replaceFrom(java.lang.CharSequence sequence, java.lang.CharSequence replacement, UnicodeSetSpanner.CountMethod countMethod)
    Replace all matching spans in sequence by replacement, according to the CountMethod, using SpanCondition.SIMPLE.
java.lang.String replaceFrom(java.lang.CharSequence sequence, java.lang.CharSequence replacement, UnicodeSetSpanner.CountMethod countMethod, UnicodeSet.SpanCondition spanCondition)
    Replace all matching spans in sequence by replacement, according to the countMethod and spanCondition.
java.lang.CharSequence trim(java.lang.CharSequence sequence)
    Returns a trimmed sequence (using CharSequence.subSequence()) that omits matching elements at the start and end of the string, using TrimOption.BOTH and SpanCondition.SIMPLE.
java.lang.CharSequence trim(java.lang.CharSequence sequence, UnicodeSetSpanner.TrimOption trimOption)
    Returns a trimmed sequence (using CharSequence.subSequence()) that omits matching elements at the start or end of the string, using the trimOption and SpanCondition.SIMPLE.
java.lang.CharSequence trim(java.lang.CharSequence sequence, UnicodeSetSpanner.TrimOption trimOption, UnicodeSet.SpanCondition spanCondition)
    Returns a trimmed sequence (using CharSequence.subSequence()) that omits matching elements at the start or end of the string, depending on the trimOption and spanCondition.
-
-
-
Constructor Detail
-
UnicodeSetSpanner
public UnicodeSetSpanner(UnicodeSet source)
Create a spanner from a UnicodeSet. For speed and safety, the UnicodeSet should be frozen. However, this class can be used with a non-frozen version to avoid the cost of freezing.
- Parameters:
source - the original UnicodeSet
-
-
Method Detail
-
getUnicodeSet
public UnicodeSet getUnicodeSet()
Returns the UnicodeSet used for processing. It is frozen iff the original was.
- Returns:
- the construction set.
-
equals
public boolean equals(java.lang.Object other)
- Overrides:
equals in class java.lang.Object
-
hashCode
public int hashCode()
- Overrides:
hashCode in class java.lang.Object
-
countIn
public int countIn(java.lang.CharSequence sequence)
Returns the number of matching characters found in a character sequence, counting by CountMethod.MIN_ELEMENTS using SpanCondition.SIMPLE. The code alternates spans; see the class doc for UnicodeSetSpanner for a note about boundary conditions.
- Parameters:
sequence - the sequence to count characters in
- Returns:
- the count. Zero if there are none.
-
countIn
public int countIn(java.lang.CharSequence sequence, UnicodeSetSpanner.CountMethod countMethod)
Returns the number of matching characters found in a character sequence, using SpanCondition.SIMPLE. The code alternates spans; see the class doc for UnicodeSetSpanner for a note about boundary conditions.
- Parameters:
sequence - the sequence to count characters in
countMethod - whether to treat an entire span as a match, or individual elements as matches
- Returns:
- the count. Zero if there are none.
-
countIn
public int countIn(java.lang.CharSequence sequence, UnicodeSetSpanner.CountMethod countMethod, UnicodeSet.SpanCondition spanCondition)
Returns the number of matching characters found in a character sequence. The code alternates spans; see the class doc for UnicodeSetSpanner for a note about boundary conditions.
- Parameters:
sequence - the sequence to count characters in
countMethod - whether to treat an entire span as a match, or individual elements as matches
spanCondition - the spanCondition to use. SIMPLE or CONTAINED means only count the elements in the span; NOT_CONTAINED is the reverse.
WARNING: when a UnicodeSet contains strings, there may be unexpected behavior in edge cases.
- Returns:
- the count. Zero if there are none.
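The difference between counting whole spans and counting individual elements can be sketched with a pure-JDK model. This is an illustration under stated assumptions, not ICU's implementation: the set holds single characters only (so the string edge cases mentioned in the warning do not arise), and the enum values `PER_SPAN` and `PER_ELEMENT` are hypothetical stand-ins for the two CountMethod choices described in the parameters.

```java
import java.util.Set;

/** Toy model of counting matches; single chars only, not ICU's implementation. */
final class CountModel {
    /** Hypothetical stand-ins for "entire span as a match" vs "individual elements as matches". */
    enum Mode { PER_SPAN, PER_ELEMENT }

    static int countIn(CharSequence s, Set<Character> set, Mode mode) {
        int count = 0;
        int i = 0;
        while (i < s.length()) {
            if (!set.contains(s.charAt(i))) {
                i++; // inside a non-matching span: nothing to count
                continue;
            }
            int start = i;
            while (i < s.length() && set.contains(s.charAt(i))) {
                i++; // extend the matching span as far as it goes
            }
            count += (mode == Mode.PER_SPAN) ? 1 : (i - start);
        }
        return count;
    }

    public static void main(String[] args) {
        Set<Character> ab = Set.of('a', 'b');
        // "abacatbab" has three matching spans: "aba", "a", "bab"
        System.out.println(countIn("abacatbab", ab, Mode.PER_SPAN));    // 3
        System.out.println(countIn("abacatbab", ab, Mode.PER_ELEMENT)); // 7
    }
}
```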
-
deleteFrom
public java.lang.String deleteFrom(java.lang.CharSequence sequence)
Delete all the matching spans in sequence, using SpanCondition.SIMPLE. The code alternates spans; see the class doc for UnicodeSetSpanner for a note about boundary conditions.
- Parameters:
sequence - the CharSequence to delete matching spans from.
- Returns:
- modified string.
-
deleteFrom
public java.lang.String deleteFrom(java.lang.CharSequence sequence, UnicodeSet.SpanCondition spanCondition)
Delete all matching spans in sequence, according to the spanCondition. The code alternates spans; see the class doc for UnicodeSetSpanner for a note about boundary conditions.
- Parameters:
sequence - the CharSequence to delete matching spans from.
spanCondition - specify whether to modify the matching spans (CONTAINED or SIMPLE) or the non-matching spans (NOT_CONTAINED)
- Returns:
- modified string.
-
replaceFrom
public java.lang.String replaceFrom(java.lang.CharSequence sequence, java.lang.CharSequence replacement)
Replace all matching spans in sequence by the replacement, counting by CountMethod.MIN_ELEMENTS using SpanCondition.SIMPLE. The code alternates spans; see the class doc for UnicodeSetSpanner for a note about boundary conditions.
- Parameters:
sequence - the CharSequence to replace matching spans in.
replacement - replacement sequence. To delete, use ""
- Returns:
- modified string.
-
replaceFrom
public java.lang.String replaceFrom(java.lang.CharSequence sequence, java.lang.CharSequence replacement, UnicodeSetSpanner.CountMethod countMethod)
Replace all matching spans in sequence by replacement, according to the CountMethod, using SpanCondition.SIMPLE. The code alternates spans; see the class doc for UnicodeSetSpanner for a note about boundary conditions.
- Parameters:
sequence - the CharSequence to replace matching spans in.
replacement - replacement sequence. To delete, use ""
countMethod - whether to treat an entire span as a match, or individual elements as matches
- Returns:
- modified string.
-
replaceFrom
public java.lang.String replaceFrom(java.lang.CharSequence sequence, java.lang.CharSequence replacement, UnicodeSetSpanner.CountMethod countMethod, UnicodeSet.SpanCondition spanCondition)
Replace all matching spans in sequence by replacement, according to the countMethod and spanCondition. The code alternates spans; see the class doc for UnicodeSetSpanner for a note about boundary conditions.
- Parameters:
sequence - the CharSequence to replace matching spans in.
replacement - replacement sequence. To delete, use ""
countMethod - whether to treat an entire span as a match, or individual elements as matches
spanCondition - specify whether to modify the matching spans (CONTAINED or SIMPLE) or the non-matching spans (NOT_CONTAINED)
- Returns:
- modified string.
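How the count method changes the result of a replacement can be sketched in plain Java. This is a simplified model rather than ICU code: it assumes a set of single characters, and the `wholeSpan` flag is a hypothetical stand-in for the choice between treating an entire span as one match and treating each element as a match.

```java
import java.util.Set;

/** Toy model of span replacement; single chars only, not ICU's implementation. */
final class ReplaceModel {
    static String replaceFrom(CharSequence s, Set<Character> set,
                              CharSequence replacement, boolean wholeSpan) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (!set.contains(c)) {
                out.append(c); // non-matching text is copied through unchanged
                i++;
                continue;
            }
            int start = i;
            while (i < s.length() && set.contains(s.charAt(i))) {
                i++; // find the end of the matching span
            }
            // One replacement per span, or one per matched element.
            int copies = wholeSpan ? 1 : (i - start);
            for (int k = 0; k < copies; k++) {
                out.append(replacement);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Set<Character> ab = Set.of('a', 'b');
        System.out.println(replaceFrom("abacatbab", ab, "-", true));  // -c-t-
        System.out.println(replaceFrom("abacatbab", ab, "-", false)); // ---c-t---
        System.out.println(replaceFrom("abacatbab", ab, "", true));   // ct
    }
}
```

With an empty replacement both modes give the same output, which matches the note that "" can be used to delete.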
-
trim
public java.lang.CharSequence trim(java.lang.CharSequence sequence)
Returns a trimmed sequence (using CharSequence.subSequence()) that omits matching elements at the start and end of the string, using TrimOption.BOTH and SpanCondition.SIMPLE. For example:
    new UnicodeSetSpanner(new UnicodeSet("[ab]")).trim("abacatbab")
... returns "cat".
- Parameters:
sequence - the sequence to trim
- Returns:
- a subsequence
-
trim
public java.lang.CharSequence trim(java.lang.CharSequence sequence, UnicodeSetSpanner.TrimOption trimOption)
Returns a trimmed sequence (using CharSequence.subSequence()) that omits matching elements at the start or end of the string, using the trimOption and SpanCondition.SIMPLE. For example:
    new UnicodeSetSpanner(new UnicodeSet("[ab]")).trim("abacatbab", TrimOption.LEADING)
... returns "catbab".
- Parameters:
sequence - the sequence to trim
trimOption - LEADING, TRAILING, or BOTH
- Returns:
- a subsequence
-
trim
public java.lang.CharSequence trim(java.lang.CharSequence sequence, UnicodeSetSpanner.TrimOption trimOption, UnicodeSet.SpanCondition spanCondition)
Returns a trimmed sequence (using CharSequence.subSequence()) that omits matching elements at the start or end of the string, depending on the trimOption and spanCondition. For example:
    new UnicodeSetSpanner(new UnicodeSet("[ab]")).trim("abacatbab", TrimOption.LEADING, SpanCondition.SIMPLE)
... returns "catbab".
- Parameters:
sequence - the sequence to trim
trimOption - LEADING, TRAILING, or BOTH
spanCondition - SIMPLE, CONTAINED or NOT_CONTAINED
- Returns:
- a subsequence
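The trimming behavior shown in the examples can be modeled in a few lines of plain Java. This sketch is not ICU's implementation: it assumes a set of single characters, and the `Side` enum is a hypothetical stand-in that borrows the LEADING, TRAILING, and BOTH names from the trimOption parameter.

```java
import java.util.Set;

/** Toy model of trimming by a character set; single chars only, not ICU's implementation. */
final class TrimModel {
    /** Hypothetical stand-in for the trim option. */
    enum Side { LEADING, TRAILING, BOTH }

    static CharSequence trim(CharSequence s, Set<Character> set, Side side) {
        int start = 0;
        int end = s.length();
        if (side != Side.TRAILING) {
            while (start < end && set.contains(s.charAt(start))) {
                start++; // drop matching chars at the front
            }
        }
        if (side != Side.LEADING) {
            while (end > start && set.contains(s.charAt(end - 1))) {
                end--; // drop matching chars at the back
            }
        }
        return s.subSequence(start, end);
    }

    public static void main(String[] args) {
        Set<Character> ab = Set.of('a', 'b');
        System.out.println(trim("abacatbab", ab, Side.BOTH));     // cat
        System.out.println(trim("abacatbab", ab, Side.LEADING));  // catbab
        System.out.println(trim("abacatbab", ab, Side.TRAILING)); // abacat
    }
}
```

Matching characters in the middle of the text (the `a` in "cat") are left alone, since only the leading and trailing spans are removed.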
-
-