class CorpusProcessor::Generators::StanfordNer
Generator for Stanford NER corpora.
Produces a corpus in the format used for Stanford NER training.
Public Class Methods
new(categories = CorpusProcessor::Categories.default)
@param categories [Hash] the category definitions loaded by
{CorpusProcessor::Categories}.
# File lib/corpus-processor/generators/stanford_ner.rb, line 8
def initialize categories = CorpusProcessor::Categories.default
  @categories = categories.fetch :output
end
Public Instance Methods
generate(tokens)
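Generates the corpus from tokens: one line per token, holding the word
and its output category separated by a tab, with a trailing newline.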
@param tokens [Array<CorpusProcessor::Token>] the tokens from which
the corpus is generated.
@return [String] the generated corpus.
# File lib/corpus-processor/generators/stanford_ner.rb, line 17
def generate tokens
  tokens.map { |token|
    "#{ token.word }\t#{ @categories[token.category] }"
  }.join("\n") + "\n"
end
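The following is a minimal, self-contained sketch of what generate produces. It does not load the gem. `Token`, `CATEGORIES`, and `StanfordNerSketch` are hypothetical stand-ins that mirror the two methods shown above. The category names, and the "O" default label for untagged tokens, are assumptions chosen for illustration; the real mapping comes from CorpusProcessor::Categories.default.

```ruby
# Hypothetical stand-in for CorpusProcessor::Token; only the two
# attributes that generate reads (word and category) are modelled.
Token = Struct.new(:word, :category)

# Hypothetical :output mapping (assumption). Unknown categories fall
# back to "O" here; the real defaults may differ.
output_labels = Hash.new("O")
output_labels[:person]   = "PERSON"
output_labels[:location] = "LOCATION"
CATEGORIES = { output: output_labels }

# Mirrors the documented generator: the constructor keeps only the
# :output mapping, and generate emits one "word<TAB>label" line per token.
class StanfordNerSketch
  def initialize(categories = CATEGORIES)
    # Hash#fetch raises KeyError when :output is missing.
    @categories = categories.fetch(:output)
  end

  def generate(tokens)
    tokens.map { |token|
      "#{token.word}\t#{@categories[token.category]}"
    }.join("\n") + "\n"
  end
end

tokens = [
  Token.new("Alice",   :person),
  Token.new("visited", nil),
  Token.new("Paris",   :location)
]

corpus = StanfordNerSketch.new.generate(tokens)
print corpus
```

Because the constructor uses Hash#fetch, passing a categories hash without an :output key fails immediately with a KeyError rather than producing blank labels later. Note also that a category absent from the mapping yields an empty label unless the mapping hash defines a default, as the sketch does.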