module Spellr::TokenRegexps

Constants

AFTER_KEY_SKIPS
ALPHA_SEP_RE
BACKSLASH_ESCAPE_RE

TODO: hex escapes, e.g. \xAA. TODO: language-aware escapes.

HEX_RE
KEY_DATA_URL
KEY_GTM_RE
KEY_HYPERWALLET_RE
KEY_PATTERNS_RE
KEY_SENDGRID_RE
KEY_SHA1
KEY_SHA512
LEFTOVER_NON_WORD_BITS_RE
LOWER_CASE_RE
NOT_EVEN_NON_WORDS_RE

NON WORDS

NUM_SEP_RE
OTHER_CASE_RE

For characters in [:alpha:] that aren't in [:lower:] or [:upper:], e.g. Arabic.
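As an illustration of the idea above, here is a minimal sketch that assumes a character-class intersection. `OTHER_CASE_SKETCH` is a hypothetical name, and the pattern is not necessarily how `OTHER_CASE_RE` itself is defined.

```ruby
# Hypothetical sketch: letters that are alphabetic but have no case.
# `&&` intersects [:alpha:] with "anything that is not lower or upper".
OTHER_CASE_SKETCH = /[[:alpha:]&&[^[:lower:][:upper:]]]+/.freeze

arabic = "\u0645\u0631\u062D\u0628\u0627" # an Arabic word, written with escapes

OTHER_CASE_SKETCH.match?(arabic)  # caseless letters match
OTHER_CASE_SKETCH.match?('hello') # lower-case letters do not
OTHER_CASE_SKETCH.match?('HELLO') # upper-case letters do not
```

Scripts without a case distinction fall through both `[:lower:]` and `[:upper:]`, which is presumably why a separate pattern is needed for them.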

POSSIBLE_KEY_RE
REPEATED_SINGLE_LETTERS_RE
SEQUENTIAL_LETTERS_RE

There's got to be a better way of writing this

SHELL_COLOR_ESCAPE_RE
SKIPS
SPELLR_DISABLE_RE
SPELLR_ENABLE_RE
SPELLR_LINE_DISABLE_RE
TERM_RE
THREE_CHUNK_RE
TITLE_CASE_RE
[Word], [Word]Word [Word]'s [Word]n't
UPPER_CASE_RE
[WORD], [WORD]Word [WORD]N'T

[WORD]'S [WORD]'s [WORD]s

URL_ENCODED_ENTITIES_RE
URL_FRAGMENT
URL_HOSTNAME

Literal \ so that I can match on domains in regexps. No one cares but me.

URL_IP_ADDRESS
URL_PATH
URL_PORT
URL_QUERY
URL_QUERY_PART
URL_RE
URL_REST

A URL can be any valid hostname. It must have at least one of a scheme, userinfo, or path. It may have those and any of the others, plus a port, a query, or a fragment.
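The rule above can be sketched as a union of three alternatives, one for each required part. Every constant below is a simplified, hypothetical stand-in chosen for illustration. None of them are the module's actual `URL_*` patterns.

```ruby
# Simplified, hypothetical components (NOT Spellr's real patterns).
SCHEME   = %r{[a-z][a-z0-9+.-]*://}i
USERINFO = /[\w.%+-]+(?::[\w.%+-]*)?@/
HOSTNAME = /(?:[a-z0-9-]+\.)+[a-z]{2,}/i
PORT     = /:\d+/
PATH     = %r{/[^\s?#]*}
QUERY    = /\?[^\s#]*/
FRAGMENT = /#\S*/

# A hostname plus at least one of scheme, userinfo, or path;
# port, query, and fragment are optional extras.
SKETCH_URL_RE = /
  (?:
    #{SCHEME}#{USERINFO}?#{HOSTNAME}#{PORT}?#{PATH}?
    |
    #{USERINFO}#{HOSTNAME}#{PORT}?#{PATH}?
    |
    #{HOSTNAME}#{PORT}?#{PATH}
  )
  #{QUERY}?#{FRAGMENT}?
/x.freeze

# Helper for the sketch: true only when the whole string looks like a URL.
def sketch_url?(string)
  /\A#{SKETCH_URL_RE}\z/.match?(string)
end

sketch_url?('https://example.com') # scheme present
sketch_url?('example.com/path')    # path present
sketch_url?('user@example.com')    # userinfo present
sketch_url?('example.com')         # bare hostname: not treated as a URL
```

Requiring one of the three parts keeps a bare `word.word` token from being skipped as a URL, which seems to be the point of the heuristic.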

URL_SCHEME

I didn't want to do this myself, BUT I need something to heuristically match on, and it's difficult.

URL_USERINFO

Public Instance Methods

min_alpha_re()

This is in a method because the minimum word length stuff was throwing it off. TODO: move to config maybe?

# File lib/spellr/token_regexps.rb, line 85
def min_alpha_re
  @min_alpha_re ||= Regexp.union(
    /[A-Z][a-z]{#{Spellr.config.word_minimum_length - 1}}/,
    /[a-z]{#{Spellr.config.word_minimum_length}}/,
    /[A-Z]{#{Spellr.config.word_minimum_length}}/
  ).freeze
end
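To see what the union above matches without loading Spellr, here is a standalone sketch that takes the minimum length as an argument. `min_alpha_sketch` and its `min_length` parameter are hypothetical stand-ins for the method and for `Spellr.config.word_minimum_length`.

```ruby
# Hypothetical standalone version of the pattern built by min_alpha_re.
# min_length stands in for Spellr.config.word_minimum_length.
def min_alpha_sketch(min_length)
  Regexp.union(
    /[A-Z][a-z]{#{min_length - 1}}/, # Title case: one upper, then lowers
    /[a-z]{#{min_length}}/,          # all lower case
    /[A-Z]{#{min_length}}/           # all upper case
  ).freeze
end

re = min_alpha_sketch(3)

re.match?('Foo') # title-case run of 3
re.match?('foo') # lower-case run of 3
re.match?('FOO') # upper-case run of 3
re.match?('ab')  # too short
re.match?('aBc') # no single-case run of 3
```

Because the length is interpolated when the regexp is built, the result depends on the configuration at the time of the first call. That is consistent with the note about keeping it in a memoized method rather than a constant.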