class Mani::Tokenizer

This class provides class methods that split a string into static text and sequence tokens.

Constants

ESCAPE_CHARACTER

The escape character

LITERAL_CLOSE_DELIMITER

The escaped form of the close delimiter; strip_comment_delimiters replaces it with SEQUENCE_CLOSE_DELIMITER

LITERAL_OPEN_DELIMITER

The escaped form of the open delimiter; strip_comment_delimiters replaces it with SEQUENCE_OPEN_DELIMITER

SEQUENCE_CLOSE

The pattern to match the end of a sequence

SEQUENCE_CLOSE_DELIMITER

The delimiter signifying the end of a sequence

SEQUENCE_OPEN

The pattern to match the start of a sequence

SEQUENCE_OPEN_DELIMITER

The delimiter signifying the start of a sequence

Public Class Methods

get_tokens(text)

Retrieves the tokens comprising the supplied text.

@param [String] text The text to tokenize
@return [Array] The tokens, each a [type, value] pair whose type is :static or :sequence

# File lib/mani/tokenizer.rb, line 43
def self.get_tokens(text)
  tokenize StringScanner.new(text), []
end
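
get_tokens wraps the text in a StringScanner, and tokenize then relies on three scanner calls: scan_until, check_until and rest. The sketch below shows how those standard-library calls behave on a small input. The helper name scanner_walkthrough and the `{{` pattern are illustrative assumptions; the real value of SEQUENCE_OPEN is not shown on this page.

```ruby
require 'strscan'

# Hypothetical helper: runs the scanner calls on `text` and records what each
# returns. `pattern` stands in for SEQUENCE_OPEN (an assumed value).
def scanner_walkthrough(text, pattern)
  scanner = StringScanner.new(text)

  # check_until looks ahead: it returns the text up to and including the
  # first match but leaves the scan pointer where it was.
  peeked = scanner.check_until(pattern)
  pos_after_peek = scanner.pos

  # scan_until returns the same text and moves the pointer past the match.
  scanned = scanner.scan_until(pattern)
  pos_after_scan = scanner.pos

  # rest returns everything after the pointer. A further scan_until finds no
  # match, so it returns nil and leaves the pointer alone.
  { peeked: peeked, pos_after_peek: pos_after_peek,
    scanned: scanned, pos_after_scan: pos_after_scan,
    rest: scanner.rest, next_match: scanner.scan_until(pattern) }
end

result = scanner_walkthrough('abc{{def', /\{\{/)
# result[:peeked] and result[:scanned] are both "abc{{", the position moves
# from 0 to 5, result[:rest] is "def" and result[:next_match] is nil.
```

This is why tokenize can call check_until to test for a closing delimiter before committing to a sequence: the look-ahead does not consume any input.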

strip_comment_delimiters(text)

Replaces each literal (escaped) open and close delimiter in the supplied text with the corresponding sequence delimiter.

@param [String] text The text
@return [String] The text with the literal delimiters replaced

# File lib/mani/tokenizer.rb, line 51
def self.strip_comment_delimiters(text)
  text
    .gsub(LITERAL_OPEN_DELIMITER, SEQUENCE_OPEN_DELIMITER)
    .gsub(LITERAL_CLOSE_DELIMITER, SEQUENCE_CLOSE_DELIMITER)
end
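
A minimal sketch of the substitution, using a stand-in module. The module name DelimiterSketch and all four delimiter values are assumptions made for illustration; the real constants live in lib/mani/tokenizer.rb and may differ.

```ruby
# Stand-in module with assumed delimiter values (not Mani's actual constants).
module DelimiterSketch
  SEQUENCE_OPEN_DELIMITER  = '{{'
  SEQUENCE_CLOSE_DELIMITER = '}}'
  # Single-quoted, so each literal is a backslash followed by two braces.
  LITERAL_OPEN_DELIMITER   = '\{{'
  LITERAL_CLOSE_DELIMITER  = '\}}'

  # Same two-step gsub chain as the documented method.
  def self.strip_comment_delimiters(text)
    text
      .gsub(LITERAL_OPEN_DELIMITER, SEQUENCE_OPEN_DELIMITER)
      .gsub(LITERAL_CLOSE_DELIMITER, SEQUENCE_CLOSE_DELIMITER)
  end
end

DelimiterSketch.strip_comment_delimiters('Type \{{name\}} here')
# => "Type {{name}} here"
```

Because gsub is given String patterns rather than regular expressions, the delimiters are matched literally and need no regex escaping.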

tokenize(scanner, tokens)

Recursively scans the string within the supplied scanner to produce a list of tokens.

@param [StringScanner] scanner The string scanner
@param [Array] tokens The tokens collected so far
@return [Array] The tokens

# File lib/mani/tokenizer.rb, line 63
def self.tokenize(scanner, tokens)
  match = scanner.scan_until SEQUENCE_OPEN
  unless match
    # No further opening delimiter: the remainder is static text.
    static = strip_comment_delimiters scanner.rest
    tokens.concat [[:static, static]] unless static.empty?
    return tokens
  end

  if scanner.check_until SEQUENCE_CLOSE
    # The text before the opening delimiter is static.
    static = strip_comment_delimiters match.chomp(SEQUENCE_OPEN_DELIMITER)
    tokens.concat [[:static, static]] unless static.empty?

    # The text up to the closing delimiter is the sequence.
    match = scanner.scan_until SEQUENCE_CLOSE
    match.chomp! SEQUENCE_CLOSE_DELIMITER

    sequence = strip_comment_delimiters match
    tokens.concat [[:sequence, sequence]] unless sequence.empty?

    tokenize scanner, tokens
  else
    # Unterminated sequence: treat everything left as static text.
    static = strip_comment_delimiters(match + scanner.rest)
    tokens.concat [[:static, static]]
  end
end
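
To see the whole flow on a concrete input, here is a loop-based sketch that follows the same branches as the recursive tokenize above. It is not the library's implementation: the module name TokenizerSketch, the helper unescape, and every constant value (including the backslash look-behind in the two patterns) are assumptions chosen for illustration.

```ruby
require 'strscan'

# Stand-in module; all constant values are assumed, not Mani's.
module TokenizerSketch
  OPEN_DELIMITER  = '{{'
  CLOSE_DELIMITER = '}}'
  LITERAL_OPEN    = '\{{'
  LITERAL_CLOSE   = '\}}'
  # Match a delimiter only when it is not preceded by a backslash.
  SEQUENCE_OPEN   = /(?<!\\)\{\{/
  SEQUENCE_CLOSE  = /(?<!\\)\}\}/

  # Plays the role of strip_comment_delimiters.
  def self.unescape(text)
    text.gsub(LITERAL_OPEN, OPEN_DELIMITER).gsub(LITERAL_CLOSE, CLOSE_DELIMITER)
  end

  # Iterative variant of the recursive tokenize.
  def self.get_tokens(text)
    scanner = StringScanner.new(text)
    tokens = []
    loop do
      before = scanner.scan_until(SEQUENCE_OPEN)
      unless before
        # No more opening delimiters: the remainder is static text.
        static = unescape(scanner.rest)
        tokens << [:static, static] unless static.empty?
        return tokens
      end
      unless scanner.check_until(SEQUENCE_CLOSE)
        # Unterminated sequence: the opener and the rest become static text.
        tokens << [:static, unescape(before + scanner.rest)]
        return tokens
      end
      static = unescape(before.chomp(OPEN_DELIMITER))
      tokens << [:static, static] unless static.empty?
      sequence = unescape(scanner.scan_until(SEQUENCE_CLOSE).chomp(CLOSE_DELIMITER))
      tokens << [:sequence, sequence] unless sequence.empty?
    end
  end
end

TokenizerSketch.get_tokens('Hello {{ENTER}} literal \{{braces\}} and {{unclosed')
# => [[:static, "Hello "], [:sequence, "ENTER"],
#     [:static, " literal {{braces}} and {{unclosed"]]
```

Under these assumed delimiters, the escaped braces stay inside the static token and are unescaped, while the unterminated `{{unclosed` falls through to the final static token rather than raising an error.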