module ICU::Util::String

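Helper methods for converting strings: checking for and converting to UTF-8, changing case when accented characters are present, and transliterating accented characters to ASCII.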

Constants

ACCENTED_CHARS
LOWER_CHARS
UNACCENTED_CHARS
UPPER_CHARS

Public Class Methods

capitalize(str)

Capitalize a UTF-8 string that might contain accented characters.

# File lib/icu_name/util.rb, line 44
def self.capitalize(str)
  return str.capitalize if str.ascii_only? || !str.match(/\A(.)(.*)\z/)
  upcase($1) + downcase($2)
end
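
The following is a minimal sketch of how the method above behaves, not the gem's own code. The module name CapitalizeSketch and the two short tables LOWER and UPPER are hypothetical stand-ins, since the real LOWER_CHARS and UPPER_CHARS values are not listed on this page.

```ruby
# Illustrative stand-in for the module above; not the gem's actual code.
module CapitalizeSketch
  # Hypothetical, abbreviated tables. The real LOWER_CHARS and UPPER_CHARS
  # values are not shown in this documentation.
  LOWER = "áéíóú"
  UPPER = "ÁÉÍÓÚ"

  def self.upcase(str)
    str.upcase.tr(LOWER, UPPER)
  end

  def self.downcase(str)
    str.downcase.tr(UPPER, LOWER)
  end

  def self.capitalize(str)
    match = str.match(/\A(.)(.*)\z/)
    # ASCII-only strings (including the empty string) and strings the
    # pattern cannot match fall back to String#capitalize.
    return str.capitalize if str.ascii_only? || match.nil?
    # First character upcased, remainder downcased.
    upcase(match[1]) + downcase(match[2])
  end
end

puts CapitalizeSketch.capitalize("éAMONN")   # Éamonn
puts CapitalizeSketch.capitalize("ÓRLA")     # Órla
puts CapitalizeSketch.capitalize("o'brien")  # O'brien
```

Because "." does not match a newline, a non-ASCII string that contains one fails the pattern and is handed to String#capitalize instead.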

downcase(str)

Downcase a UTF-8 string that might contain accented characters.

# File lib/icu_name/util.rb, line 37
def self.downcase(str)
  str = str.downcase
  return str if str.ascii_only?
  str.tr(UPPER_CHARS, LOWER_CHARS)
end

is_utf8(str)

Return true if the string's bytes are valid UTF-8 and false otherwise. The check runs on a copy, so the encoding of the original string is not changed.

# File lib/icu_name/util.rb, line 14
def self.is_utf8(str)
  dup = str.dup
  dup.force_encoding("UTF-8")
  dup.valid_encoding?
end
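
As a rough illustration of the check above, the sketch below repeats the same three steps in a standalone module (Utf8CheckSketch is a hypothetical name, used so the example runs without the gem) and feeds it one valid and one invalid byte sequence.

```ruby
# Standalone copy of the logic shown above, for illustration only.
module Utf8CheckSketch
  def self.is_utf8(str)
    copy = str.dup                # leave the caller's string untouched
    copy.force_encoding("UTF-8")  # relabel the bytes; nothing is transcoded
    copy.valid_encoding?          # true only if the bytes form valid UTF-8
  end
end

valid   = "\xC3\xA9".b  # the two-byte UTF-8 sequence for "é", tagged as binary
invalid = "\xE9".b      # "é" as a single Latin-1 byte, which is not valid UTF-8

puts Utf8CheckSketch.is_utf8(valid)    # true
puts Utf8CheckSketch.is_utf8(invalid)  # false
puts Utf8CheckSketch.is_utf8("plain")  # true
puts valid.encoding == Encoding::BINARY  # true: the input keeps its own tag
```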

to_utf8(str)

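Try to convert any string to UTF-8. Bytes that are already valid UTF-8 are only relabeled. Otherwise, a string tagged ASCII-8BIT or UTF-8 is assumed to be Windows-1252 before it is transcoded.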

# File lib/icu_name/util.rb, line 21
def self.to_utf8(str)
  utf8 = is_utf8(str)
  dup = str.dup
  return dup.force_encoding("UTF-8") if utf8
  dup.force_encoding("Windows-1252") if dup.encoding.name.match(/^(ASCII-8BIT|UTF-8)$/)
  dup.encode("UTF-8")
end
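
The conversion above can be sketched as follows. This is an approximation under stated assumptions, not the gem's code: ToUtf8Sketch is a hypothetical name, and the sketch compares encoding objects (Encoding::BINARY is Ruby's alias for ASCII-8BIT) where the original matches on the encoding name.

```ruby
# Illustrative stand-in for the method above.
module ToUtf8Sketch
  def self.to_utf8(str)
    copy = str.dup
    # Already valid UTF-8 bytes: just relabel the copy.
    if str.dup.force_encoding("UTF-8").valid_encoding?
      return copy.force_encoding("UTF-8")
    end
    # Binary or mislabeled UTF-8 input is guessed to be Windows-1252.
    if [Encoding::BINARY, Encoding::UTF_8].include?(copy.encoding)
      copy.force_encoding("Windows-1252")
    end
    copy.encode("UTF-8")  # transcode from whatever the tag now says
  end
end

binary = "caf\xE9".b                               # Windows-1252 bytes, binary tag
latin1 = "caf\xE9".b.force_encoding("ISO-8859-1")  # correctly tagged Latin-1

puts ToUtf8Sketch.to_utf8(binary)     # café
puts ToUtf8Sketch.to_utf8(latin1)     # café
puts ToUtf8Sketch.to_utf8("\x80".b)   # €
```

The last call shows why the guess is Windows-1252 and not Latin-1: byte 0x80 is the euro sign in Windows-1252.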

transliterate(str)

Transliterate Latin-1 accented characters to ASCII.

# File lib/icu_name/util.rb, line 50
def self.transliterate(str)
  return str.dup if str.ascii_only?
  str.tr(ACCENTED_CHARS, UNACCENTED_CHARS)
end
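
A minimal sketch of the same table-driven approach is shown below. TransliterateSketch and its two short tables are hypothetical, because the contents of ACCENTED_CHARS and UNACCENTED_CHARS are not listed on this page.

```ruby
# Illustrative stand-in for the method above.
module TransliterateSketch
  # Hypothetical, abbreviated tables: position n in ACCENTED maps to
  # position n in UNACCENTED.
  ACCENTED   = "ÁÉÍÓÚáéíóúñç"
  UNACCENTED = "AEIOUaeiounc"

  def self.transliterate(str)
    return str.dup if str.ascii_only?  # nothing to replace; still a new string
    str.tr(ACCENTED, UNACCENTED)
  end
end

puts TransliterateSketch.transliterate("Éamonn Ó Brádaigh")  # Eamonn O Bradaigh
puts TransliterateSketch.transliterate("Muñoz")              # Munoz
puts TransliterateSketch.transliterate("Smith")              # Smith
```

Since String#tr replaces one character with one character, a table of this kind cannot expand a letter such as "ß" into "ss".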

upcase(str)

Upcase a UTF-8 string that might contain accented characters.

# File lib/icu_name/util.rb, line 30
def self.upcase(str)
  str = str.upcase
  return str if str.ascii_only?
  str.tr(LOWER_CHARS, UPPER_CHARS)
end
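
Both upcase and downcase pass the two tables to String#tr, which pairs characters by position, so the approach presumably relies on LOWER_CHARS and UPPER_CHARS having the same length and order. The sketch below shows what happens when they do not. The helper aligned_tables? is hypothetical and is not part of ICU::Util::String.

```ruby
# Hypothetical helper: true when the two tr tables have one target
# character for every source character.
def aligned_tables?(from, to)
  from.length == to.length
end

puts "éèê".tr("éèê", "ÉÈÊ")          # ÉÈÊ
# A shorter target table is padded with its last character, so "ê"
# silently becomes "È" here.
puts "éèê".tr("éèê", "ÉÈ")           # ÉÈÈ

puts aligned_tables?("éèê", "ÉÈÊ")   # true
puts aligned_tables?("éèê", "ÉÈ")    # false
```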