class HexaPDF::Font::CMap
Represents a CMap, a mapping from character codes to CIDs (character IDs) or to their Unicode value.
See: PDF1.7 s9.7.5, s9.10.3; Adobe Technical Notes #5014 and #5411
Attributes
The name of the CMap.
The ordering part of the CMap version.
The registry part of the CMap version.
The supplement part of the CMap version.
The writing mode of the CMap: 0 for horizontal, 1 for vertical writing.
Public Class Methods
Returns a string containing a ToUnicode CMap that represents the given code to Unicode codepoint mapping.
See: Writer#create_to_unicode_cmap
# File lib/hexapdf/font/cmap.rb, line 83
def self.create_to_unicode_cmap(mapping)
  Writer.new.create_to_unicode_cmap(mapping)
end
Creates a new CMap object by parsing a predefined CMap with the given name.
Raises an error if the given CMap is not found.
# File lib/hexapdf/font/cmap.rb, line 63
def self.for_name(name)
  return @cmap_cache[name] if @cmap_cache.key?(name)

  file = File.join(CMAP_DIR, name)
  if File.exist?(file)
    @cmap_cache[name] = parse(File.read(file, encoding: ::Encoding::UTF_8))
  else
    raise HexaPDF::Error, "No CMap named '#{name}' found"
  end
end
Creates a new CMap object.
# File lib/hexapdf/font/cmap.rb, line 106
def initialize
  @codespace_ranges = []
  @cid_mapping = {}
  @cid_range_mappings = []
  @unicode_mapping = {}
end
Returns true if the given name specifies a predefined CMap.
# File lib/hexapdf/font/cmap.rb, line 56
def self.predefined?(name)
  File.exist?(File.join(CMAP_DIR, name))
end
Public Instance Methods
Adds an individual mapping from character code to CID.
# File lib/hexapdf/font/cmap.rb, line 166
def add_cid_mapping(code, cid)
  @cid_mapping[code] = cid
end
Adds a CID range, mapping character codes from start_code to end_code to CIDs starting with start_cid.
# File lib/hexapdf/font/cmap.rb, line 172
def add_cid_range(start_code, end_code, start_cid)
  @cid_range_mappings << [start_code..end_code, start_cid]
end
Adds a codespace range using an array of ranges for the individual bytes.
This means that the first range is checked against the first byte, the second range against the second byte and so on.
# File lib/hexapdf/font/cmap.rb, line 125
def add_codespace_range(first, *rest)
  @codespace_ranges << [first, rest]
end
Adds a mapping from character code to Unicode string in UTF-8 encoding.
# File lib/hexapdf/font/cmap.rb, line 191
def add_unicode_mapping(code, string)
  @unicode_mapping[code] = string
end
Parses the string and returns all character codes.
An error is raised if the string contains invalid bytes.
# File lib/hexapdf/font/cmap.rb, line 132
def read_codes(string)
  codes = []
  bytes = string.each_byte

  loop do
    byte = bytes.next
    code = 0

    found = @codespace_ranges.any? do |first_byte_range, rest_ranges|
      next unless first_byte_range.cover?(byte)

      code = (code << 8) + byte
      valid = rest_ranges.all? do |range|
        begin
          byte = bytes.next
        rescue StopIteration
          raise HexaPDF::Error, "Missing bytes while reading codes via CMap"
        end
        code = (code << 8) + byte
        range.cover?(byte)
      end

      codes << code if valid
    end

    unless found
      raise HexaPDF::Error, "Invalid byte while reading codes via CMap: #{byte}"
    end
  end

  codes
end
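To illustrate how codespace ranges drive the splitting of a byte string into codes, here is a minimal sketch. It assumes a stand-in class, `MiniCodespace` (hypothetical, not part of HexaPDF), that copies the logic of `add_codespace_range` and `read_codes` shown above and raises `ArgumentError` in place of `HexaPDF::Error`, so it runs without the library. The two ranges are made-up values chosen for the example.

```ruby
# Hypothetical stand-in mirroring the source above; not the HexaPDF class itself.
class MiniCodespace
  def initialize
    @codespace_ranges = []
  end

  # One range per byte: the first range is for the first byte, and so on.
  def add_codespace_range(first, *rest)
    @codespace_ranges << [first, rest]
  end

  def read_codes(string)
    codes = []
    bytes = string.each_byte

    # Kernel#loop ends quietly when bytes.next raises StopIteration.
    loop do
      byte = bytes.next
      code = 0

      found = @codespace_ranges.any? do |first_byte_range, rest_ranges|
        next unless first_byte_range.cover?(byte)

        code = (code << 8) + byte
        valid = rest_ranges.all? do |range|
          begin
            byte = bytes.next
          rescue StopIteration
            # ArgumentError is an assumption of this sketch (source: HexaPDF::Error).
            raise ArgumentError, "Missing bytes while reading codes via CMap"
          end
          code = (code << 8) + byte
          range.cover?(byte)
        end

        codes << code if valid
      end

      raise ArgumentError, "Invalid byte while reading codes via CMap: #{byte}" unless found
    end

    codes
  end
end

codespace = MiniCodespace.new
codespace.add_codespace_range(0x00..0x80)             # one-byte codes
codespace.add_codespace_range(0x81..0x9f, 0x40..0xfc) # two-byte codes

data = [0x41, 0x81, 0x40, 0x42].pack("C*")
p codespace.read_codes(data) # => [65, 33088, 66], i.e. 0x41, 0x8140, 0x42
```

The first byte selects the codespace range, and each further range consumes one more byte, so 0x81 followed by 0x40 is read as the single code 0x8140.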
Returns the CID for the given character code, or 0 if no mapping was found.
# File lib/hexapdf/font/cmap.rb, line 177
def to_cid(code)
  cid = @cid_mapping.fetch(code, -1)
  if cid == -1
    @cid_range_mappings.reverse_each do |range, start_cid|
      if range.cover?(code)
        cid = start_cid + code - range.first
        break
      end
    end
  end
  (cid == -1 ? 0 : cid)
end
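The lookup order in the source above can be sketched as follows: individual mappings are consulted first, then ranges from the most recently added to the oldest, and 0 is the fallback. `MiniCIDMap` is a hypothetical stand-in written for this example (not part of HexaPDF), and the codes and CIDs are made-up values.

```ruby
# Hypothetical stand-in mirroring the CID lookup shown above; not the HexaPDF class itself.
class MiniCIDMap
  def initialize
    @cid_mapping = {}
    @cid_range_mappings = []
  end

  def add_cid_mapping(code, cid)
    @cid_mapping[code] = cid
  end

  def add_cid_range(start_code, end_code, start_cid)
    @cid_range_mappings << [start_code..end_code, start_cid]
  end

  def to_cid(code)
    # Individual mappings take precedence over ranges.
    return @cid_mapping[code] if @cid_mapping.key?(code)

    # Ranges added later are checked first, so they override earlier ones.
    @cid_range_mappings.reverse_each do |range, start_cid|
      return start_cid + code - range.first if range.cover?(code)
    end

    0 # no mapping found
  end
end

cmap = MiniCIDMap.new
cmap.add_cid_range(0x20, 0x7e, 1)   # added first
cmap.add_cid_range(0x41, 0x5a, 500) # added later, overlaps the first range
cmap.add_cid_mapping(0x41, 9000)    # individual mapping

p cmap.to_cid(0x41) # => 9000 (individual mapping)
p cmap.to_cid(0x42) # => 501  (later range: 500 + 1)
p cmap.to_cid(0x61) # => 66   (first range: 1 + 0x61 - 0x20)
p cmap.to_cid(0x80) # => 0    (unmapped)
```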
Returns the Unicode string in UTF-8 encoding for the given character code, or nil if no mapping was found.
# File lib/hexapdf/font/cmap.rb, line 197
def to_unicode(code)
  unicode_mapping[code]
end
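As a rough illustration of how the Unicode side might be used to turn a list of codes into text, here is a small sketch. `MiniUnicodeMap` is a hypothetical stand-in (not part of HexaPDF) that stores mappings in a plain hash the way `add_unicode_mapping` does above. Substituting U+FFFD for unmapped codes is a choice made for this sketch, not something the source prescribes.

```ruby
# Hypothetical stand-in for the Unicode side of a CMap; not the HexaPDF class itself.
class MiniUnicodeMap
  def initialize
    @unicode_mapping = {}
  end

  def add_unicode_mapping(code, string)
    @unicode_mapping[code] = string
  end

  # Returns nil for codes without a mapping, like the method above.
  def to_unicode(code)
    @unicode_mapping[code]
  end
end

cmap = MiniUnicodeMap.new
cmap.add_unicode_mapping(0x01, "A")
cmap.add_unicode_mapping(0x02, "fi") # the value is a string, so it may hold several characters

# Replace unmapped codes with U+FFFD (an assumption of this sketch).
text = [0x01, 0x02, 0x03].map { |code| cmap.to_unicode(code) || "\uFFFD" }.join
puts text
```

Because the lookup returns nil rather than raising, the caller decides how unmapped codes are handled.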