module Spellr::TokenRegexps
Constants
- AFTER_KEY_SKIPS
- ALPHA_SEP_RE
- BACKSLASH_ESCAPE_RE
TODO: hex escapes e.g. \xAA. TODO: language-aware escapes
- HEX_RE
- KEY_DATA_URL
- KEY_GTM_RE
- KEY_HYPERWALLET_RE
- KEY_PATTERNS_RE
- KEY_SENDGRID_RE
- KEY_SHA1
- KEY_SHA512
- LEFTOVER_NON_WORD_BITS_RE
- LOWER_CASE_RE
- NOT_EVEN_NON_WORDS_RE
NON WORDS
- NUM_SEP_RE
- OTHER_CASE_RE
for characters in [:alpha:] that aren't in [:lower:] or [:upper:] e.g. Arabic
- POSSIBLE_KEY_RE
- REPEATED_SINGLE_LETTERS_RE
- SEQUENTIAL_LETTERS_RE
There's got to be a better way of writing this
- SHELL_COLOR_ESCAPE_RE
- SKIPS
- SPELLR_DISABLE_RE
- SPELLR_ENABLE_RE
- SPELLR_LINE_DISABLE_RE
- TERM_RE
- THREE_CHUNK_RE
- TITLE_CASE_RE
[Word], [Word]Word [Word]'s [Wordn't]
- UPPER_CASE_RE
[WORD] [WORD]Word [WORDN'T] [WORD]'S [WORD]'s [WORD]s
- URL_ENCODED_ENTITIES_RE
- URL_FRAGMENT
- URL_HOSTNAME
A literal \ so that I can match on domains in regexps. No one cares but me.
- URL_IP_ADDRESS
- URL_PATH
- URL_PORT
- URL_QUERY
- URL_QUERY_PART
- URL_RE
- URL_REST
A URL can be any valid hostname. It must have at least one of a scheme, userinfo, or path; it may have those and any of the others, as well as a port, a query, or a fragment.
- URL_SCHEME
I didn't want to do this myself, but I need something to heuristically match on, and it's difficult.
- URL_USERINFO
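The note on OTHER_CASE_RE above describes characters that are in [:alpha:] but in neither [:lower:] nor [:upper:]. A minimal sketch of that idea follows. The pattern and the name `other_case` are assumptions for illustration, not the library's actual definition; the sketch uses Ruby's character class intersection (`&&`) with POSIX bracket expressions.

```ruby
# Hypothetical pattern (an assumption, not the definition of OTHER_CASE_RE):
# one or more alphabetic characters that are neither lower case nor upper case.
other_case = /[[:alpha:]&&[^[:lower:][:upper:]]]+/

# "\u0645\u0631\u062D\u0628\u0627" is an Arabic word; Arabic letters have no case.
arabic = "\u0645\u0631\u062D\u0628\u0627"

matches = "#{arabic} Hello".scan(other_case)

# Only the caseless run is picked up; "Hello" is left to the cased patterns.
puts matches.length
puts matches.first == arabic
```

Cased scripts such as Latin fall through to the lower, upper, and title case patterns, so a pattern like this only has to pick up what those leave behind.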
Public Instance Methods
min_alpha_re()
This is in a method because the minimum word length stuff was throwing it off. TODO: move to config maybe?
# File lib/spellr/token_regexps.rb, line 85
def min_alpha_re
  @min_alpha_re ||= Regexp.union(
    /[A-Z][a-z]{#{Spellr.config.word_minimum_length - 1}}/,
    /[a-z]{#{Spellr.config.word_minimum_length}}/,
    /[A-Z]{#{Spellr.config.word_minimum_length}}/
  ).freeze
end
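To see what the union in min_alpha_re accepts, here is a standalone sketch. The helper name `min_alpha_re_for` and its `min_length` argument are hypothetical; they stand in for `Spellr.config.word_minimum_length` so the example runs without the gem, and the memoization is left out.

```ruby
# Hypothetical standalone variant of min_alpha_re (an assumption for illustration):
# the minimum length is passed in rather than read from Spellr.config.
def min_alpha_re_for(min_length)
  Regexp.union(
    /[A-Z][a-z]{#{min_length - 1}}/, # Title case: one capital, then lower case
    /[a-z]{#{min_length}}/,          # all lower case
    /[A-Z]{#{min_length}}/           # all upper case
  ).freeze
end

re = min_alpha_re_for(3)

puts re.match?("Foo") # title case run of 3
puts re.match?("foo") # lower case run of 3
puts re.match?("FOO") # upper case run of 3
puts re.match?("Fo")  # too short for every alternative
puts re.match?("aB")  # mixed case, no run of 3 in any one style
```

Because the original method memoizes with `||=`, the interpolated length is fixed the first time it is called; a change to the configured minimum after that point would presumably not be reflected.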