class HexaPDF::Font::TrueTypeWrapper
This class wraps a generic TrueType font object and provides the methods needed for working with the font in a PDF context.
TrueType fonts can be represented in two ways in PDF: As a simple font with Subtype TrueType or as a composite font using a Type2 CIDFont. The wrapper only supports the composite font case because:
- By using a composite font more than 256 characters can be encoded with one font object.
- Fonts for vertical writing can potentially be used.
- The PDF specification recommends using a composite font (see PDF1.7 s9.9 at the end).
Additionally, TrueType fonts are always embedded.
Attributes
Returns the PDF object associated with the wrapper.
Returns the wrapped TrueType font object.
Public Class Methods
Creates a new object wrapping the TrueType font for the PDF document.
The optional argument pdf_object can be used to set the PDF font object that this wrapper should be associated with. If no object is set, a suitable one is automatically created.
If subset is true, the font is subset.
# File lib/hexapdf/font/true_type_wrapper.rb, line 126
def initialize(document, font, pdf_object: nil, subset: true)
  @wrapped_font = font
  @missing_glyph_callable = document.config['font.on_missing_glyph']
  @subsetter = (subset ? HexaPDF::Font::TrueType::Subsetter.new(font) : nil)
  @cmap = font[:cmap].preferred_table
  if @cmap.nil?
    raise HexaPDF::Error, "No mapping table for Unicode characters found for TTF " \
      "font #{font.full_name}"
  end
  @pdf_object = pdf_object || create_pdf_object(document)
  @id_to_glyph = {}
  @codepoint_to_glyph = {}
  @encoded_glyphs = {}
end
Public Instance Methods
Returns an array of glyph objects representing the characters in the UTF-8 encoded string.
# File lib/hexapdf/font/true_type_wrapper.rb, line 177
def decode_utf8(str)
  str.codepoints.map! do |c|
    @codepoint_to_glyph[c] ||=
      begin
        if (gid = @cmap[c])
          glyph(gid, +'' << c)
        else
          @missing_glyph_callable.call(+'' << c, font_type, @wrapped_font)
        end
      end
  end
end
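The lookup-with-fallback pattern used by decode_utf8 can be tried in isolation. The following is only a sketch under stated assumptions: `decode_with_cache` is a hypothetical helper, a plain Hash stands in for the font's cmap table, and small arrays stand in for the glyph objects and for whatever the configured missing-glyph callable would return.

```ruby
# Hypothetical stand-in for decode_utf8: maps each codepoint to a "glyph",
# caching the result per codepoint so repeated characters are resolved once.
def decode_with_cache(str, cmap, cache)
  str.codepoints.map do |c|
    cache[c] ||= if (gid = cmap[c])
                   [:glyph, gid]      # codepoint found in the mapping table
                 else
                   [:missing, c]      # placeholder for the missing-glyph result
                 end
  end
end

cmap = {0x48 => 43, 0x69 => 76}       # assumed mapping: "H" and "i" only
cache = {}
glyphs = decode_with_cache("Hi!", cmap, cache)
p glyphs
```

Because the cache is keyed by codepoint, decoding the same string again reuses the stored entries instead of consulting the mapping table.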
Encodes the glyph and returns the code string.
# File lib/hexapdf/font/true_type_wrapper.rb, line 191
def encode(glyph)
  (@encoded_glyphs[glyph.id] ||=
    begin
      if glyph.kind_of?(InvalidGlyph)
        raise HexaPDF::Error, "Glyph for #{glyph.str.inspect} missing"
      end
      if @subsetter
        [[@subsetter.use_glyph(glyph.id)].pack('n'), glyph]
      else
        [[glyph.id].pack('n'), glyph]
      end
    end)[0]
end
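The two-byte codes returned by encode can be illustrated with plain Ruby. This is a minimal sketch, not HexaPDF's implementation: `encode_glyph_id` and `use_glyph` are hypothetical helpers, and the subset numbering shown here (new IDs handed out in order of first use, with glyph 0 kept at ID 0) is an assumption made purely for illustration.

```ruby
# Hypothetical helper: pack a glyph ID as a 16-bit big-endian code ('n').
def encode_glyph_id(gid)
  [gid].pack('n')
end

# Hypothetical stand-in for a subsetter: hands out new glyph IDs in order of
# first use (assumed behaviour, for illustration only).
def use_glyph(map, gid)
  map[gid] ||= map.size
end

subset_map = {0 => 0}                  # keep glyph 0 (.notdef) at ID 0
full_code = encode_glyph_id(0x0141)    # without subsetting: the original glyph ID
subset_code = encode_glyph_id(use_glyph(subset_map, 0x0141))

p full_code.bytes                      # high byte first
p subset_code.bytes
```

Since every code is two bytes wide, a single font object can address far more than the 256 characters available to a simple font.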
Returns the type of the font, i.e. :TrueType.
# File lib/hexapdf/font/true_type_wrapper.rb, line 145
def font_type
  :TrueType
end
Returns a Glyph object for the given glyph ID.
The optional argument str should be the string representation of the glyph. Only use it if it is known.
Note: Although this method is public, it should normally not be used by application code!
# File lib/hexapdf/font/true_type_wrapper.rb, line 165
def glyph(id, str = nil)
  @id_to_glyph[id] ||=
    begin
      if id >= 0 && id < @wrapped_font[:maxp].num_glyphs
        Glyph.new(@wrapped_font, id, str || (+'' << (@cmap.gid_to_code(id) || 0xFFFD)))
      else
        @missing_glyph_callable.call("\u{FFFD}", font_type, @wrapped_font)
      end
    end
end
Returns the scaling factor for converting font units into PDF units.
# File lib/hexapdf/font/true_type_wrapper.rb, line 150
def scaling_factor
  @scaling_factor ||= 1000.0 / @wrapped_font[:head].units_per_em
end
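As a worked example of the conversion, the following sketch applies the same formula to made-up numbers. `scaling_factor_for` is a hypothetical helper, and the values 2048 and 1024 are assumptions chosen for illustration rather than data from a particular font.

```ruby
# Hypothetical helper mirroring the formula used by scaling_factor.
def scaling_factor_for(units_per_em)
  1000.0 / units_per_em
end

# Assumed example values: a font designed on a 2048-unit em square and a
# glyph whose advance width is 1024 font units.
factor = scaling_factor_for(2048)
pdf_width = 1024 * factor
puts pdf_width
```

A font whose em square is already 1000 units gets a factor of 1.0, so its metrics pass through unchanged.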
Returns true if the wrapped TrueType font will be subset.
# File lib/hexapdf/font/true_type_wrapper.rb, line 155
def subset?
  !@subsetter.nil?
end
Private Instance Methods
Adds the /DW and /W fields to the CIDFont dictionary.
# File lib/hexapdf/font/true_type_wrapper.rb, line 300
def complete_width_information(dict)
  default_width = glyph(3, " ").width.to_i
  widths = @encoded_glyphs.reject {|_, v| v[1].width == default_width }.map do |id, v|
    [(@subsetter ? @subsetter.subset_glyph_id(id) : id), v[1].width]
  end.sort!
  dict[:DescendantFonts].first.set_widths(widths, default_width: default_width)
end
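The idea behind the two fields is that only widths differing from the default need to be listed individually. A rough sketch with made-up data follows; `non_default_widths` is a hypothetical helper and the Hash of glyph ID to width is an assumed simplification of the wrapper's internal bookkeeping.

```ruby
# Hypothetical helper: keep only the glyphs whose width differs from the
# default and return them as [glyph_id, width] pairs sorted by glyph ID.
def non_default_widths(widths_by_gid, default_width)
  widths_by_gid.reject {|_, width| width == default_width }.sort
end

# Assumed example data: glyph ID => width in PDF glyph space units.
widths_by_gid = {12 => 600, 3 => 250, 7 => 250, 5 => 722}
widths = non_default_widths(widths_by_gid, 250)
p widths
```

Glyphs 3 and 7 share the default width in this example, so they are covered by the default value and drop out of the explicit list.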
Creates a Type0 font object representing the TrueType font.
The returned font object contains only information available at creation time, so it lacks glyph-specific attributes such as widths. The missing information is added before the PDF document gets written.
# File lib/hexapdf/font/true_type_wrapper.rb, line 212
def create_pdf_object(document)
  fd = document.add({Type: :FontDescriptor,
                     FontName: @wrapped_font.font_name.intern,
                     FontWeight: @wrapped_font.weight,
                     Flags: 0,
                     FontBBox: @wrapped_font.bounding_box.map {|m| m * scaling_factor },
                     ItalicAngle: @wrapped_font.italic_angle || 0,
                     Ascent: @wrapped_font.ascender * scaling_factor,
                     Descent: @wrapped_font.descender * scaling_factor,
                     StemV: @wrapped_font.dominant_vertical_stem_width})
  if @wrapped_font[:'OS/2'].version >= 2
    fd[:CapHeight] = @wrapped_font.cap_height * scaling_factor
    fd[:XHeight] = @wrapped_font.x_height * scaling_factor
  else # estimate values
    # Estimate as per https://www.microsoft.com/typography/otspec/os2.htm#ch
    fd[:CapHeight] = if @cmap[0x0048] # H
                       @wrapped_font[:glyf][@cmap[0x0048]].y_max * scaling_factor
                     else
                       @wrapped_font.ascender * 0.8 * scaling_factor
                     end
    # Estimate as per https://www.microsoft.com/typography/otspec/os2.htm#xh
    fd[:XHeight] = if @cmap[0x0078] # x
                     @wrapped_font[:glyf][@cmap[0x0078]].y_max * scaling_factor
                   else
                     @wrapped_font.ascender * 0.5 * scaling_factor
                   end
  end
  fd.flag(:fixed_pitch) if @wrapped_font[:post].is_fixed_pitch? ||
    @wrapped_font[:hhea].num_of_long_hor_metrics == 1
  fd.flag(:italic) if @wrapped_font[:'OS/2'].selection_include?(:italic) ||
    @wrapped_font[:'OS/2'].selection_include?(:oblique)
  fd.flag(:symbolic)
  cid_font = document.add({Type: :Font, Subtype: :CIDFontType2,
                           BaseFont: fd[:FontName], FontDescriptor: fd,
                           CIDSystemInfo: {Registry: "Adobe", Ordering: "Identity",
                                           Supplement: 0},
                           CIDToGIDMap: :Identity})
  dict = document.add({Type: :Font, Subtype: :Type0, BaseFont: cid_font[:BaseFont],
                       Encoding: :"Identity-H", DescendantFonts: [cid_font]})
  dict.font_wrapper = self
  document.register_listener(:complete_objects) do
    update_font_name(dict)
    embed_font(dict, document)
    complete_width_information(dict)
    create_to_unicode_cmap(dict, document)
  end
  dict
end
Creates the /ToUnicode CMap and updates the font dictionary so that text extraction works correctly.
# File lib/hexapdf/font/true_type_wrapper.rb, line 310
def create_to_unicode_cmap(dict, document)
  stream = HexaPDF::StreamData.new do
    mapping = @encoded_glyphs.keys.map! do |id|
      # Using 0xFFFD as mentioned in Adobe #5411, last line before section 1.5
      [(@subsetter ? @subsetter.subset_glyph_id(id) : id), @cmap.gid_to_code(id) || 0xFFFD]
    end.sort_by!(&:first)
    HexaPDF::Font::CMap.create_to_unicode_cmap(mapping)
  end
  stream_obj = document.add({}, stream: stream)
  stream_obj.set_filter(:FlateDecode)
  dict[:ToUnicode] = stream_obj
end
Embeds the font.
# File lib/hexapdf/font/true_type_wrapper.rb, line 286
def embed_font(dict, document)
  if @subsetter
    data = @subsetter.build_font
    length = data.size
    stream = HexaPDF::StreamData.new(length: length) { data }
  else
    length = @wrapped_font.io.size
    stream = HexaPDF::StreamData.new(@wrapped_font.io, length: length)
  end
  font = document.add({Length1: length, Filter: :FlateDecode}, stream: stream)
  dict[:DescendantFonts].first[:FontDescriptor][:FontFile2] = font
end
Updates the font name with a unique tag if the font is subset.
# File lib/hexapdf/font/true_type_wrapper.rb, line 268
def update_font_name(dict)
  return unless @subsetter
  tag = +''
  data = @encoded_glyphs.each_with_object(''.b) {|(id, v), s| s << id.to_s << v[0] }
  hash = Digest::MD5.hexdigest(data << @wrapped_font.font_name).to_i(16)
  while hash != 0 && tag.length < 6
    hash, mod = hash.divmod(UPPERCASE_LETTERS.length)
    tag << UPPERCASE_LETTERS[mod]
  end
  name = (tag << "+" << @wrapped_font.font_name).intern
  dict[:BaseFont] = name
  dict[:DescendantFonts].first[:BaseFont] = name
  dict[:DescendantFonts].first[:FontDescriptor][:FontName] = name
end
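The tag derivation can be reproduced outside the wrapper. In this sketch `subset_tag` is a hypothetical helper, the letter table is assumed to be the 26 letters A to Z (the definition of UPPERCASE_LETTERS is not shown on this page), and the seed string and font name are made up.

```ruby
require 'digest/md5'

# Hypothetical helper: derive a six-letter tag from an MD5 digest by
# repeatedly taking the remainder modulo the size of the letter table.
def subset_tag(seed)
  letters = ('A'..'Z').to_a            # assumed content of UPPERCASE_LETTERS
  hash = Digest::MD5.hexdigest(seed).to_i(16)
  tag = +''
  while hash != 0 && tag.length < 6
    hash, mod = hash.divmod(letters.length)
    tag << letters[mod]
  end
  tag
end

# Made-up seed and font name, for illustration only.
name = "#{subset_tag('example glyph data')}+ExampleFont"
puts name
```

The tag depends only on the seed, so the same set of used glyphs yields the same prefixed name on every run, while different subsets of one font very likely receive different names.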