class Twitter::TwitterText::Regex
A collection of regular expressions for parsing Tweet text. The regular expression list is frozen at load time to ensure immutability. These regular expressions are used throughout the TwitterText
classes. Special care has been taken to make sure these regular expressions work with Tweets in all languages.
Constants
- CTRL_CHARS
- DIRECTIONAL_CHARACTERS
- DOMAIN_VALID_CHARS
- HASHTAG
- HASHTAG_LETTERS_AND_MARKS
Generated from unicode_regex/unicode_regex_groups.scala, more inclusive than Ruby's \p{L}\p{M}
- HASHTAG_LETTERS_NUMERALS
- HASHTAG_LETTERS_NUMERALS_SET
- HASHTAG_LETTERS_SET
- HASHTAG_NUMERALS
Generated from unicode_regex/unicode_regex_groups.scala, more inclusive than Ruby's \p{Nd}
- HASHTAG_SPECIAL_CHARS
- INVALID_CHARACTERS
Characters not allowed in Tweets
- LATIN_ACCENTS
Latin accented characters. Excludes 0xd7 from the range (the multiplication sign, confusable with “x”). Also excludes 0xf7, the division sign.
- PUNCTUATION_CHARS
- RTL_CHARACTERS
- SPACE_CHARS
- TLDS
- UNICODE_SPACES
Space is more than %20; U+3000, for example, is the full-width space used with Kanji. Provides a shorthand to access both the list of characters and a pattern suitable for use with String#split.
Taken from: ActiveSupport::Multibyte::Handlers::UTF8Handler::UNICODE_WHITESPACE
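The UNICODE_SPACES note describes keeping both a list of space characters and a pattern derived from it for String#split. A minimal sketch of that idea follows. The SketchSpaces module, the SPLIT_PATTERN name, and the abbreviated code-point list are assumptions made for illustration and are not the library's actual definitions.

```ruby
# Hypothetical stand-in: the code points below are an illustrative subset,
# not the list that twitter-text actually uses.
module SketchSpaces
  UNICODE_SPACES = [
    0x0020, # ASCII space
    0x00A0, # no-break space
    0x2003, # em space
    0x3000  # ideographic (full-width) space
  ].map { |code_point| [code_point].pack('U*') }.freeze

  # Character class matching a run of one or more of the spaces above.
  SPLIT_PATTERN = /[#{Regexp.escape(UNICODE_SPACES.join)}]+/
end

# Split on ASCII and non-ASCII spaces alike.
words = "Tokyo\u3000Tower hello\u00A0world".split(SketchSpaces::SPLIT_PATTERN)
```

Building the pattern from the character list keeps the two in sync: adding a code point to the list automatically extends what the split pattern matches.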
Public Class Methods
Returns the regular expression for a given key. If the key is not a known symbol, nil is returned.
# File lib/twitter-text/regex.rb, line 376
def self.[](key)
  REGEXEN[key]
end
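The lookup above can be illustrated without the gem installed. The following is a sketch only: SketchRegex and its two patterns are hypothetical stand-ins, far simpler than the library's real expressions, and exist just to show a frozen hash exposed through a class-level [] method that returns nil for unknown keys.

```ruby
# Hypothetical stand-in for the lookup pattern; the keys and patterns here
# are illustrative and are not the ones defined by twitter-text.
module SketchRegex
  REGEXEN = {
    hashtag: /#\w+/,
    mention: /@\w+/
  }.freeze

  # Return the pattern stored under +key+, or nil for an unknown key.
  def self.[](key)
    REGEXEN[key]
  end
end

hashtags = "Learning #ruby and #regex".scan(SketchRegex[:hashtag])
unknown  = SketchRegex[:no_such_key]
```

Because an unknown key yields nil rather than raising, callers that accept arbitrary keys may want to check the result before passing it to methods such as String#scan.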