class SearchLingo::Tokenizer

Tokenizer breaks down a query string into individual tokens.

Tokenizer.new 'foo'
Tokenizer.new 'foo "bar baz"'
Tokenizer.new 'foo "bar baz" froz: quux'

Constants

COMPOUND_TOKEN

Pattern for matching a compound token (a term with an optional modifier).

DELIMITER

Pattern for matching the delimiter between tokens.

SIMPLE_TOKEN

Pattern for matching a simple token (a term without a modifier).
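As a rough illustration of how three such patterns could cooperate, the sketch below defines hypothetical regular expressions that fit the descriptions above and uses them with Ruby's StringScanner. The pattern bodies and the PatternSketch module are assumptions made for this example, not the values defined in the gem.

```ruby
require 'strscan'

# Hypothetical stand-in module; the pattern bodies are assumptions that
# merely fit the descriptions above, not the gem's actual definitions.
module PatternSketch
  # A quoted phrase or a run of non-whitespace characters.
  SIMPLE_TOKEN = /"[^"]*"|[[:graph:]]+/
  # An optional "modifier:" prefix followed by a simple token.
  COMPOUND_TOKEN = /(?:[[:graph:]]+:\s*)?#{SIMPLE_TOKEN}/
  # A possibly empty run of whitespace between tokens.
  DELIMITER = /[[:space:]]*/

  # Splits a query into raw token strings using the patterns above.
  def self.split(query)
    scanner = StringScanner.new query
    tokens = []
    until scanner.eos?
      scanner.skip DELIMITER
      token = scanner.scan COMPOUND_TOKEN
      break unless token # only trailing delimiter remained

      tokens << token
    end
    tokens
  end
end

PatternSketch.split 'foo "bar baz" froz: quux'
# => ["foo", "\"bar baz\"", "froz: quux"]
```

Under these assumed patterns the quoted phrase stays together and the modifier travels with its term.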

Attributes

scanner[R]

Public Instance Methods

each() { |token| ... }

Iterates over the query string. If called with a block, it yields each token. If called without a block, it returns an Enumerator.

# File lib/search_lingo/tokenizer.rb, line 38
def each
  return to_enum(__callee__) unless block_given?

  yield self.next until scanner.eos?
end
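The block-or-Enumerator behaviour comes from the to_enum(__callee__) idiom. The sketch below shows that idiom in isolation on a hypothetical WordList class, which iterates over a plain array rather than a scanner. It is meant only to show what callers can do with either form, not how Tokenizer is implemented internally.

```ruby
# Hypothetical stand-in class (not part of the gem) that mirrors the
# block-or-Enumerator shape of the each method shown above.
class WordList
  def initialize(words)
    @words = words
  end

  def each
    # __callee__ is :each here, so this returns an Enumerator over each.
    return to_enum(__callee__) unless block_given?

    @words.each { |word| yield word }
  end
end

words = WordList.new %w[foo bar baz]

# Block form: each element is yielded in turn.
seen = []
words.each { |word| seen << word }

# Blockless form: the Enumerator can be chained with Enumerable methods.
upcased = words.each.map(&:upcase)   # => ["FOO", "BAR", "BAZ"]
indexed = words.each.with_index.to_a # => [["foo", 0], ["bar", 1], ["baz", 2]]
```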
next()

Returns a Token for the next token in the query string. Raises StopIteration when the end of the query string is reached.

# File lib/search_lingo/tokenizer.rb, line 47
def next
  scanner.skip DELIMITER
  token = scanner.scan COMPOUND_TOKEN
  raise StopIteration unless token

  Token.new token
end
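Because next signals exhaustion with StopIteration, a caller can drive it from Kernel#loop, which rescues that exception and ends the loop quietly. The sketch below uses a hypothetical NextSketch class with simplified patterns (both are assumptions) and returns plain strings instead of Token objects.

```ruby
require 'strscan'

# Hypothetical stand-in for the documented next method. The patterns are
# simplified assumptions and tokens are returned as plain strings.
class NextSketch
  DELIMITER = /[[:space:]]*/
  TOKEN = /[[:graph:]]+/

  def initialize(query)
    @scanner = StringScanner.new query
  end

  def next
    @scanner.skip DELIMITER
    token = @scanner.scan TOKEN
    raise StopIteration unless token

    token
  end
end

# Collects every token; Kernel#loop rescues StopIteration and stops.
def collect_tokens(query)
  tokenizer = NextSketch.new query
  tokens = []
  loop { tokens << tokenizer.next }
  tokens
end

collect_tokens 'foo bar ' # => ["foo", "bar"]
```

Note that in this sketch a query ending in whitespace still terminates cleanly: the final call skips the delimiter, finds nothing to scan, and raises StopIteration.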
simplify()

Rewinds the scanner to the start of the last returned token and rescans from there with SIMPLE_TOKEN, returning a Token for the simple token it matches.

# File lib/search_lingo/tokenizer.rb, line 60
def simplify
  scanner.unscan
  Token.new scanner.scan SIMPLE_TOKEN
end
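The rewind-and-rescan step relies on StringScanner#unscan, which restores the scan pointer to where it was before the most recent match. The sketch below shows the effect on a compound token, using a hypothetical SimplifySketch module whose patterns are assumptions rather than the gem's definitions.

```ruby
require 'strscan'

# Hypothetical stand-in module; the patterns are assumptions for
# illustration, not the gem's actual constants.
module SimplifySketch
  SIMPLE = /"[^"]*"|[[:graph:]]+/
  COMPOUND = /(?:[[:graph:]]+:\s*)?#{SIMPLE}/

  # Scans a compound token, rewinds, then rescans as a simple token.
  # Returns [compound match, simple match, unscanned remainder].
  def self.demo(query)
    scanner = StringScanner.new query
    compound = scanner.scan COMPOUND
    scanner.unscan # pointer returns to where the compound match began
    simple = scanner.scan SIMPLE
    [compound, simple, scanner.rest]
  end
end

SimplifySketch.demo 'froz: quux'
# => ["froz: quux", "froz:", " quux"]
```

With these assumed patterns, the text first consumed as one compound token is re-read as a shorter simple token, leaving the remainder for later calls.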