module UTF8Util
Constants
- REPLACEMENT_CHAR
Uses '?' instead of the Unicode replacement character (U+FFFD), since U+FFFD takes 3 bytes in UTF-8 and can increase the string size when many replacements are made.
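The size difference behind this choice can be checked directly. The snippet below is only an illustration; the variable names are hypothetical and not part of UTF8Util.

```ruby
# Compare the UTF-8 byte cost of the two candidate replacement characters.
# Variable names here are illustrative only.
unicode_replacement = "\uFFFD" # Unicode replacement character
ascii_replacement   = "?"      # the character described for REPLACEMENT_CHAR

puts unicode_replacement.bytesize
puts ascii_replacement.bytesize
```

Each invalid byte replaced with U+FFFD would therefore grow the string by two bytes, while '?' keeps the length unchanged.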
Public Class Methods
clean(str)
Replace invalid UTF-8 character sequences with a replacement character
Returns a copy of this String as valid UTF-8.
# File lib/resque/vendor/utf8_util.rb, line 17
def self.clean(str)
  clean!(str.dup)
end
clean!(str)
Replace invalid UTF-8 character sequences with a replacement character
Returns self as valid UTF-8.
# File lib/resque/vendor/utf8_util.rb, line 9
def self.clean!(str)
  return str if str.encoding.to_s == "UTF-8"
  str.force_encoding("binary").encode("UTF-8", :invalid => :replace, :undef => :replace, :replace => REPLACEMENT_CHAR)
end
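A minimal usage sketch follows. To keep it runnable on its own, it redefines the module locally from the two listings on this page, and it assumes REPLACEMENT_CHAR is the string "?" as the constant's description suggests. The sample strings are made up for illustration.

```ruby
# Local copy of the module, reproduced from the listings on this page so the
# example runs without Resque installed.
module UTF8Util
  REPLACEMENT_CHAR = "?" # assumed value, based on the constant's description

  def self.clean!(str)
    return str if str.encoding.to_s == "UTF-8"
    str.force_encoding("binary").encode("UTF-8", :invalid => :replace, :undef => :replace, :replace => REPLACEMENT_CHAR)
  end

  def self.clean(str)
    clean!(str.dup)
  end
end

# A binary-tagged string containing a lone Latin-1 byte (0xE9).
raw = "caf\xE9".b
cleaned = UTF8Util.clean(raw)

puts cleaned                 # the 0xE9 byte is replaced with "?"
puts cleaned.encoding        # UTF-8
puts cleaned.valid_encoding? # true
puts raw.encoding            # still binary: clean works on a copy

# A string already tagged as UTF-8 takes the early return and is handed back
# untouched, even though its bytes are not valid UTF-8.
tagged = "caf\xE9".dup
same = UTF8Util.clean!(tagged)
puts same.equal?(tagged)     # true
puts same.valid_encoding?    # false
```

Two things are worth noting from the listing itself. Because the string is first re-tagged as binary, every byte at or above 0x80 is replaced, including bytes that belong to valid multi-byte UTF-8 sequences. And since the early return only checks the encoding tag, strings tagged UTF-8 are never validated.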