java.lang.Object
org.apache.lucene.analysis.icu.segmentation.BreakIteratorWrapper

final class BreakIteratorWrapper extends Object
Wraps RuleBasedBreakIterator, making object reuse convenient and emitting a rule status for emoji sequences.
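The reuse pattern described above can be sketched without ICU on the classpath. The following is a minimal illustration, not the Lucene implementation: `ReusableBreaker`, `toBufferIndex`, and `ReusableBreakerDemo` are hypothetical names, and the JDK's `java.text.BreakIterator` stands in for `com.ibm.icu.text.RuleBasedBreakIterator`. It assumes boundaries are reported relative to the start of the slice passed to `setText`; this page does not state that for the real class.

```java
import java.text.BreakIterator;
import java.util.Locale;

/** Hypothetical stand-in for the wrapper pattern; not the Lucene class. */
final class ReusableBreaker {
  private final BreakIterator delegate; // assumption: JDK iterator instead of ICU's
  private int start;                    // offset of the current slice in the caller's buffer

  ReusableBreaker(BreakIterator delegate) {
    this.delegate = delegate;
  }

  /** Points the same iterator at a new slice, so one instance serves many buffers. */
  void setText(char[] text, int start, int length) {
    this.start = start;
    // Copying into a String keeps the sketch short; a char[]-backed
    // CharacterIterator would avoid the allocation.
    delegate.setText(new String(text, start, length));
  }

  /** Advances to the next boundary (slice-relative), or BreakIterator.DONE at the end. */
  int next() {
    return delegate.next();
  }

  /** Returns the current boundary (slice-relative). */
  int current() {
    return delegate.current();
  }

  /** Converts a slice-relative boundary to an index into the caller's buffer. */
  int toBufferIndex(int boundary) {
    return start + boundary;
  }
}

final class ReusableBreakerDemo {
  public static void main(String[] args) {
    ReusableBreaker breaker = new ReusableBreaker(BreakIterator.getWordInstance(Locale.ROOT));
    char[] buffer = "xxhello worldyy".toCharArray();
    breaker.setText(buffer, 2, 11); // the slice "hello world"
    int begin = breaker.current();
    for (int end = breaker.next(); end != BreakIterator.DONE; begin = end, end = breaker.next()) {
      // Print each segment between consecutive boundaries.
      System.out.println(new String(buffer, breaker.toBufferIndex(begin), end - begin));
    }
  }
}
```

Calling `setText` again with a different buffer or slice reuses the same delegate, which is the convenience the class description refers to.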
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    (package private) static final com.ibm.icu.text.UnicodeSet
    EMOJI
     
    (package private) static final com.ibm.icu.text.UnicodeSet
    EMOJI_RK
     
    private final com.ibm.icu.text.RuleBasedBreakIterator
    rbbi
     
    private int
    start
     
    private int
    status
     
    private char[]
    text
     
    private final CharArrayIterator
    textIterator
     
  • Constructor Summary

    Constructors
    Constructor
    Description
    BreakIteratorWrapper(com.ibm.icu.text.RuleBasedBreakIterator rbbi)
     
  • Method Summary

    Modifier and Type
    Method
    Description
    private int
    calcStatus(int current, int next)
    Returns the current rule status for the text between breaks.
    (package private) int
    current()
     
    (package private) int
    getRuleStatus()
     
    private boolean
    isEmoji(int current, int next)
    Returns true if the current text represents an emoji character or sequence.
    (package private) int
    next()
     
    (package private) void
    setText(char[] text, int start, int length)
     

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • textIterator

      private final CharArrayIterator textIterator
    • rbbi

      private final com.ibm.icu.text.RuleBasedBreakIterator rbbi
    • text

      private char[] text
    • start

      private int start
    • status

      private int status
    • EMOJI_RK

      static final com.ibm.icu.text.UnicodeSet EMOJI_RK
    • EMOJI

      static final com.ibm.icu.text.UnicodeSet EMOJI
  • Constructor Details

    • BreakIteratorWrapper

      BreakIteratorWrapper(com.ibm.icu.text.RuleBasedBreakIterator rbbi)
  • Method Details

    • current

      int current()
    • getRuleStatus

      int getRuleStatus()
    • next

      int next()
    • calcStatus

      private int calcStatus(int current, int next)
      Returns the current rule status for the text between breaks (determines the token type).
    • isEmoji

      private boolean isEmoji(int current, int next)
      Returns true if the current text represents an emoji character or sequence.
    • setText

      void setText(char[] text, int start, int length)
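
The relationship between calcStatus and isEmoji documented above can be illustrated with a rough sketch. Everything here is an assumption made for illustration: `EmojiStatusSketch`, the status constants, the `defaultStatus` parameter, and the two hard-coded code point checks are hypothetical stand-ins for the undocumented contents of EMOJI and EMOJI_RK. In particular, the sketch assumes EMOJI_RK holds characters such as digits, `#`, and `*` that count as emoji only when a variation selector (U+FE0F) or combining keycap (U+20E3) follows; this page does not confirm that.

```java
/** Hypothetical sketch of span classification; the sets below are crude stand-ins. */
final class EmojiStatusSketch {
  static final int STATUS_OTHER = 0; // hypothetical status values, not ICU's
  static final int STATUS_EMOJI = 1;

  /** Assumption: characters that are emoji only as part of a longer sequence. */
  static boolean inKeycapBaseSet(int cp) {
    return cp == '#' || cp == '*' || (cp >= '0' && cp <= '9');
  }

  /** Assumption: a rough block-based approximation of an emoji set. */
  static boolean inEmojiSet(int cp) {
    return inKeycapBaseSet(cp)
        || (cp >= 0x1F300 && cp <= 0x1FAFF)
        || (cp >= 0x2600 && cp <= 0x27BF);
  }

  /** Returns true if text[begin, end) starts with an emoji or an emoji sequence. */
  static boolean isEmoji(char[] text, int begin, int end) {
    if (begin >= end) {
      return false;
    }
    int cp = Character.codePointAt(text, begin, end);
    if (!inEmojiSet(cp)) {
      return false;
    }
    if (!inKeycapBaseSet(cp)) {
      return true;
    }
    // Ambiguous base character: require evidence that a sequence follows.
    int trailer = begin + Character.charCount(cp);
    return trailer < end && (text[trailer] == 0xFE0F || text[trailer] == 0x20E3);
  }

  /** Overrides the iterator's status for emoji spans, otherwise passes it through. */
  static int calcStatus(char[] text, int begin, int end, int defaultStatus) {
    return isEmoji(text, begin, end) ? STATUS_EMOJI : defaultStatus;
  }
}
```

Unlike the private methods on this page, which take break positions (current, next), the sketch takes the buffer and absolute indices explicitly so that it stays self-contained.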