module Addressable::IDNA
Constants
- ACE_MAX_LENGTH
- ACE_PREFIX
- COMPOSITION_TABLE
- PUNYCODE_BASE
- PUNYCODE_DAMP
- PUNYCODE_DELIMITER
- PUNYCODE_INITIAL_BIAS
- PUNYCODE_INITIAL_N
- PUNYCODE_MAXINT
- PUNYCODE_PRINT_ASCII
- PUNYCODE_SKEW
- PUNYCODE_TMAX
- PUNYCODE_TMIN
- UNICODE_DATA
This is a sparse Unicode table. Codepoints without entries are assumed to have the value: [0, 0, nil, nil, nil, nil, nil]
- UNICODE_DATA_CANONICAL
- UNICODE_DATA_COMBINING_CLASS
- UNICODE_DATA_COMPATIBILITY
- UNICODE_DATA_EXCLUSION
- UNICODE_DATA_LOWERCASE
- UNICODE_DATA_TITLECASE
- UNICODE_DATA_UPPERCASE
- UNICODE_MAX_LENGTH
- UNICODE_TABLE
This module is loosely based on idn_actionmailer by Mick Staugaard, the unicode library by Yoshida Masato, and the punycode implementation by Kazuhiro Nishiyama. Most of the code was copied verbatim, with some reformatting and some translation from C.
Without their code to work from as a base, we’d all still be relying on the presence of libidn, which nobody ever seems to have installed.
Original sources:
  github.com/staugaard/idn_actionmailer
  www.yoshidam.net/Ruby.html#unicode
  rubyforge.org/frs/?group_id=2550
- UTF8_REGEX
- UTF8_REGEX_MULTIBYTE
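The sparse-table convention noted for UNICODE_DATA above (missing codepoints fall back to a fixed default row) can be sketched as follows. This is only an illustration: SAMPLE_TABLE, DEFAULT_ROW, and sample_row are hypothetical names, and the contents of the sample row are made up rather than taken from the real table.

```ruby
# Default row for codepoints without an entry, as given in the
# UNICODE_DATA description.
DEFAULT_ROW = [0, 0, nil, nil, nil, nil, nil].freeze

# Hypothetical one-entry table; the row contents are illustrative only.
SAMPLE_TABLE = {
  0x41 => [0, 0, nil, nil, nil, 0x61, nil].freeze
}.freeze

# Look up a codepoint, falling back to the default row when it is absent.
def sample_row(codepoint)
  SAMPLE_TABLE.fetch(codepoint, DEFAULT_ROW)
end

present = sample_row(0x41)     # explicit entry
missing = sample_row(0x1F600)  # no entry, so the default row is returned
```

Keeping only the non-default rows is what lets the table stay small relative to the full Unicode range.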
Public Class Methods
# File lib/addressable/idna/native.rb, line 42
def self.to_ascii(value)
  value.to_s.split('.', -1).map do |segment|
    if segment.size > 0 && segment.size < 64
      IDN::Idna.toASCII(segment, IDN::Idna::ALLOW_UNASSIGNED)
    elsif segment.size >= 64
      segment
    else
      ''
    end
  end.join('.')
end
# File lib/addressable/idna/native.rb, line 54
def self.to_unicode(value)
  value.to_s.split('.', -1).map do |segment|
    if segment.size > 0 && segment.size < 64
      IDN::Idna.toUnicode(segment, IDN::Idna::ALLOW_UNASSIGNED)
    elsif segment.size >= 64
      segment
    else
      ''
    end
  end.join('.')
end
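Both listings above share the same per-label structure: split on dots with a limit of -1 so trailing empty labels survive, convert labels of 1 to 63 characters, and pass everything else through. A minimal sketch of that structure, assuming a hypothetical helper map_labels whose block stands in for the IDN::Idna call (which needs the native extension and is not used here):

```ruby
# Hypothetical helper; the block replaces the native IDN conversion.
def map_labels(value)
  value.to_s.split('.', -1).map do |segment|
    if segment.size > 0 && segment.size < 64
      yield segment   # labels of 1..63 characters are converted
    elsif segment.size >= 64
      segment         # over-long labels pass through untouched
    else
      ''              # empty labels are preserved
    end
  end.join('.')
end

# The trailing dot survives because split is called with a limit of -1.
result = map_labels('Example.COM.') { |label| label.downcase }
# => "example.com."
```

Without the -1 limit, String#split would drop the trailing empty string and the final dot of a fully qualified name would be lost.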
@deprecated Use {String#unicode_normalize(:nfkc)} instead
# File lib/addressable/idna/native.rb, line 34
def unicode_normalize_kc(value)
  value.to_s.unicode_normalize(:nfkc)
end
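Since the deprecated method only delegates to String#unicode_normalize, the replacement call can be used directly. A small sketch of what NFKC normalization does to two compatibility characters (the sample strings are chosen for illustration):

```ruby
ligature  = "\uFB01"  # LATIN SMALL LIGATURE FI
fullwidth = "\uFF21"  # FULLWIDTH LATIN CAPITAL LETTER A

# NFKC applies compatibility decomposition followed by canonical composition.
plain_fi = ligature.unicode_normalize(:nfkc)   # => "fi"
plain_a  = fullwidth.unicode_normalize(:nfkc)  # => "A"
```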
Private Class Methods
# File lib/addressable/idna/pure.rb, line 140
def self.lookup_unicode_lowercase(codepoint)
  codepoint_data = UNICODE_DATA[codepoint]
  (codepoint_data ?
    (codepoint_data[UNICODE_DATA_LOWERCASE] || codepoint) :
    codepoint)
end
Bias adaptation method
# File lib/addressable/idna/pure.rb, line 488
def self.punycode_adapt(delta, numpoints, firsttime)
  delta = firsttime ? delta / PUNYCODE_DAMP : delta >> 1
  # delta >> 1 is a faster way of doing delta / 2
  delta += delta / numpoints
  difference = PUNYCODE_BASE - PUNYCODE_TMIN

  k = 0
  while delta > (difference * PUNYCODE_TMAX) / 2
    delta /= difference
    k += PUNYCODE_BASE
  end

  k + (difference + 1) * delta / (delta + PUNYCODE_SKEW)
end
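To make the bias adaptation concrete, here is a standalone sketch of the same arithmetic with two worked calls. The constant values are the Punycode parameters from RFC 3492; this page does not list the values of the PUNYCODE_* constants, so treating them as equal is an assumption. The name adapt and the sample arguments are illustrative.

```ruby
# Parameter values from RFC 3492; assumed (not confirmed by this page)
# to match the module's PUNYCODE_* constants.
BASE = 36
TMIN = 1
TMAX = 26
SKEW = 38
DAMP = 700

def adapt(delta, numpoints, firsttime)
  # The first delta is damped heavily; later deltas are halved.
  delta = firsttime ? delta / DAMP : delta >> 1
  delta += delta / numpoints
  difference = BASE - TMIN

  k = 0
  # Repeatedly divide until delta falls within the threshold range.
  while delta > (difference * TMAX) / 2
    delta /= difference
    k += BASE
  end

  k + (difference + 1) * delta / (delta + SKEW)
end

first_bias = adapt(700, 1, true)       # => 1
later_bias = adapt(100_000, 5, false)  # => 92
```

In the second call, delta becomes 60000 after halving and adding delta / numpoints, the loop runs twice (leaving delta at 48 and k at 72), and the final term contributes 20.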
# File lib/addressable/idna/pure.rb, line 456
def self.punycode_basic?(codepoint)
  codepoint < 0x80
end
# File lib/addressable/idna/native.rb, line 28
def self.punycode_decode(value)
  IDN::Punycode.decode(value.to_s)
end
Returns the numeric value of a basic codepoint (for use in representing integers) in the range 0 to base - 1, or PUNYCODE_BASE if codepoint does not represent a value.
# File lib/addressable/idna/pure.rb, line 474
def self.punycode_decode_digit(codepoint)
  if codepoint - 48 < 10
    codepoint - 22
  elsif codepoint - 65 < 26
    codepoint - 65
  elsif codepoint - 97 < 26
    codepoint - 97
  else
    PUNYCODE_BASE
  end
end
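The comparisons in the listing above follow the shape of the C reference code in RFC 3492, which relies on unsigned arithmetic. With Ruby's signed integers, a codepoint below 48 (for example the hyphen, 45) would satisfy the first branch. The sketch below is not the library's code; it is a hypothetical variant with explicit ranges that matches the documented contract for every input. NO_DIGIT stands in for PUNYCODE_BASE, with the value 36 taken from RFC 3492 as an assumption.

```ruby
NO_DIGIT = 36 # stands in for PUNYCODE_BASE (value assumed from RFC 3492)

# Hypothetical range-checked variant of the digit decoder.
def decode_digit(codepoint)
  case codepoint
  when 48..57  then codepoint - 22  # '0'..'9' map to 26..35
  when 65..90  then codepoint - 65  # 'A'..'Z' map to 0..25
  when 97..122 then codepoint - 97  # 'a'..'z' map to 0..25
  else NO_DIGIT                     # not a Punycode digit
  end
end

decode_digit('a'.ord)  # => 0
decode_digit('9'.ord)  # => 35
decode_digit('-'.ord)  # => 36
```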
# File lib/addressable/idna/pure.rb, line 461
def self.punycode_delimiter?(codepoint)
  codepoint == PUNYCODE_DELIMITER
end
# File lib/addressable/idna/native.rb, line 24
def self.punycode_encode(value)
  IDN::Punycode.encode(value.to_s)
end
# File lib/addressable/idna/pure.rb, line 466
def self.punycode_encode_digit(d)
  d + 22 + 75 * ((d < 26) ? 1 : 0)
end
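The arithmetic in punycode_encode_digit is compact, so it may help to see the full mapping it produces. The sketch below copies the expression into a standalone method (the name encode_digit is illustrative) and enumerates all 36 digit values:

```ruby
# Same expression as the listing above, as a standalone method.
def encode_digit(d)
  # Values 0..25 land on 'a'..'z' (97..122); values 26..35 on '0'..'9' (48..57).
  d + 22 + 75 * ((d < 26) ? 1 : 0)
end

alphabet = (0...36).map { |d| encode_digit(d).chr }.join
# => "abcdefghijklmnopqrstuvwxyz0123456789"
```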
Unicode aware downcase method.
@api private
@param [String] input The input string.
@return [String] The downcased result.
# File lib/addressable/idna/pure.rb, line 132
def self.unicode_downcase(input)
  input = input.to_s unless input.is_a?(String)
  unpacked = input.unpack("U*")
  unpacked.map! { |codepoint| lookup_unicode_lowercase(codepoint) }
  return unpacked.pack("U*")
end
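The unpack, map, pack pattern used above can be tried in isolation. This is a minimal sketch, assuming a hypothetical three-entry lowercase map (SAMPLE_LOWERCASE) in place of the module's UNICODE_DATA lookup; codepoints without an entry are returned unchanged, as in lookup_unicode_lowercase.

```ruby
# Hypothetical lowercase map covering three codepoints only.
SAMPLE_LOWERCASE = {
  0x41  => 0x61,   # A => a
  0xC9  => 0xE9,   # É => é
  0x3A9 => 0x3C9   # Ω => ω
}.freeze

def sample_downcase(input)
  codepoints = input.to_s.unpack('U*')  # UTF-8 string to codepoint array
  codepoints.map! { |cp| SAMPLE_LOWERCASE.fetch(cp, cp) }
  codepoints.pack('U*')                 # codepoint array back to UTF-8
end

lowered = sample_downcase("A\u00C9\u03A9z")
# => "aéωz"
```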