class CorpusProcessor::Token

The internal representation of a token.

Tokens are extracted from the original corpus. Each token is a single word or punctuation mark.

Each token also carries a category, which originates from the tagging in the corpus.

Attributes

category[RW]

@return [Symbol] the type of the {Token}. It should be a valid category from {Categories}.
word[RW]

@return [String] the word from text. It shouldn’t contain spaces.

Public Class Methods

new(word = '', category = nil)

@param word [String] the word from text. It shouldn’t contain spaces.

@param category [Symbol] the type of the {Token}. It should be a valid category from {Categories}.
# File lib/corpus-processor/token.rb, line 20
def initialize word = '', category = nil
  self.word     = word
  self.category = category
end
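
A minimal usage sketch of the constructor and the two read-write attributes. So that it runs without the gem installed, it defines a stand-in CorpusProcessor::Token that mirrors the listing above (the enclosing module and the attr_accessor line are assumptions). The sample word and the :person category are hypothetical and are not taken from {Categories}.

```ruby
# Stand-in mirroring the documented class, so the sketch runs on its own.
# The module wrapper and attr_accessor are assumptions based on the [RW] attributes.
module CorpusProcessor
  class Token
    attr_accessor :word, :category

    def initialize(word = '', category = nil)
      self.word     = word
      self.category = category
    end
  end
end

# Both arguments are optional and default to '' and nil.
blank = CorpusProcessor::Token.new

# :person is a hypothetical category used only for illustration.
token = CorpusProcessor::Token.new('Maria', :person)

# The attributes are read-write, so a token can be retagged after creation.
token.category = nil
```

Note that the constructor, as listed, does not validate its arguments: the "no spaces" and "valid category" constraints are conventions the caller is expected to follow.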

Public Instance Methods

==(other)

Determine equality of two {Token}s. Two tokens are equal when both their word and their category are equal.

@param other [Token] the other {Token} to test.

# File lib/corpus-processor/token.rb, line 28
def ==(other)
  word == other.word && category == other.category
end
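
The comparison above can be sketched as follows. The class here is a stand-in that copies the two listed methods (the module wrapper and attr_accessor are assumptions), and the :location category is hypothetical.

```ruby
# Stand-in copying the listed methods, so the sketch runs without the gem.
module CorpusProcessor
  class Token
    attr_accessor :word, :category

    def initialize(word = '', category = nil)
      self.word     = word
      self.category = category
    end

    def ==(other)
      word == other.word && category == other.category
    end
  end
end

a = CorpusProcessor::Token.new('Lisbon', :location) # hypothetical category
b = CorpusProcessor::Token.new('Lisbon', :location)
c = CorpusProcessor::Token.new('Lisbon')            # same word, no category

same_value    = (a == b)     # true: word and category both match
same_object   = a.equal?(b)  # false: two distinct objects
differing_tag = (a == c)     # false: categories differ
lists_match   = ([a] == [b]) # true: Array#== compares elements with ==
```

Two caveats follow from the listing. The method calls other.word directly, so comparing against an object that does not respond to word (for example nil) raises NoMethodError instead of returning false. And since only == is shown, hash-based operations such as Array#uniq or Hash key lookup would, in this stand-in, still treat a and b as different objects.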