class Twitter::TwitterText::Regex

A collection of regular expressions for parsing Tweet text. The regular expression list is frozen at load time to ensure immutability. These regular expressions are used throughout the TwitterText classes. Special care has been taken to make sure these regular expressions work with Tweets in all languages.

Constants

CTRL_CHARS
DIRECTIONAL_CHARACTERS
DOMAIN_VALID_CHARS
HASHTAG
HASHTAG_LETTERS_AND_MARKS

Generated from unicode_regex/unicode_regex_groups.scala; more inclusive than Ruby's \p{L}\p{M}

HASHTAG_LETTERS_NUMERALS
HASHTAG_LETTERS_NUMERALS_SET
HASHTAG_LETTERS_SET
HASHTAG_NUMERALS

Generated from unicode_regex/unicode_regex_groups.scala; more inclusive than Ruby's \p{Nd}

HASHTAG_SPECIAL_CHARS
INVALID_CHARACTERS

Characters not allowed in Tweets

LATIN_ACCENTS

Latin accented characters. Excludes 0xd7 from the range (the multiplication sign, confusable with “x”). Also excludes 0xf7, the division sign.
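The exclusion described above can be sketched as follows. The constant name SKETCH_LATIN_ACCENTS and the restriction to the U+00C0 to U+00FF block are assumptions made for illustration; the library's actual LATIN_ACCENTS may cover additional ranges.

```ruby
# Hypothetical pattern, not the library's LATIN_ACCENTS definition.
# Covers U+00C0..U+00FF while skipping U+00D7 (multiplication sign)
# and U+00F7 (division sign) by splitting the range around them.
SKETCH_LATIN_ACCENTS = /[\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF]/

accented = "\u00E9" # e with acute accent
times    = "\u00D7" # multiplication sign
divide   = "\u00F7" # division sign

puts accented.match?(SKETCH_LATIN_ACCENTS) # accented letter is inside the ranges
puts times.match?(SKETCH_LATIN_ACCENTS)    # multiplication sign is skipped
puts divide.match?(SKETCH_LATIN_ACCENTS)   # division sign is skipped
```

Splitting one contiguous range into three keeps the two arithmetic signs out of the character class without needing a separate negative check.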

PUNCTUATION_CHARS
RTL_CHARACTERS
SPACE_CHARS
TLDS
UNICODE_SPACES

Space is more than %20; U+3000, for example, is the full-width space used with Kanji. Provides a shorthand to access both the list of characters and a pattern suitable for use with String#split.

Taken from: ActiveSupport::Multibyte::Handlers::UTF8Handler::UNICODE_WHITESPACE
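A minimal sketch of the idea, assuming a small hypothetical subset of whitespace code points: the same list yields both the characters themselves and a pattern for String#split. The names SKETCH_SPACES and SKETCH_SPACE_PATTERN are illustrative and are not part of the library.

```ruby
# Hypothetical subset of Unicode whitespace code points; the real
# UNICODE_SPACES list is longer.
SKETCH_SPACES = [
  0x0020, # ordinary space
  0x00A0, # no-break space
  0x3000  # ideographic (full-width) space
].map { |code_point| [code_point].pack('U') }.freeze

# Character class built from the list, matching one or more spaces.
SKETCH_SPACE_PATTERN = /[#{SKETCH_SPACES.join}]+/

words = "hello\u3000world foo".split(SKETCH_SPACE_PATTERN)
p words # => ["hello", "world", "foo"]
```

Deriving the pattern from the list keeps the two in sync, so adding a code point to the list automatically extends what the split pattern recognises.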

Public Class Methods

[](key)

Returns the regular expression for a given key, or nil if the key is not a known symbol.

# File lib/twitter-text/regex.rb, line 376
def self.[](key)
  REGEXEN[key]
end
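The lookup behaviour can be sketched with a self-contained stand-in. The module RegexSketch, its keys, and its patterns are all hypothetical and much simpler than the library's own; only the frozen REGEXEN hash and the [] accessor mirror the source shown above.

```ruby
# Hypothetical stand-in for Twitter::TwitterText::Regex.
module RegexSketch
  REGEXEN = {}

  # Illustrative patterns only; the library's real patterns differ.
  REGEXEN[:simple_hashtag] = /\#[[:alnum:]_]+/
  REGEXEN[:simple_mention] = /@[A-Za-z0-9_]{1,20}/

  # Freeze the list once everything is registered, so later code
  # cannot add, remove, or replace entries.
  REGEXEN.freeze

  # Return the pattern for a key, or nil for an unknown key.
  def self.[](key)
    REGEXEN[key]
  end
end

p 'Learning #ruby today'[RegexSketch[:simple_hashtag]] # => "#ruby"
p RegexSketch[:no_such_key]                            # => nil
```

Because the hash is frozen, an attempt such as RegexSketch::REGEXEN[:new_key] = /x/ raises a FrozenError instead of silently changing shared patterns.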