class HexaPDF::Font::Type1::AFMParser

Parses files in the AFM file format.

Note that this is not a full AFM parser: it implements only what is needed to parse the AFM files of the 14 PDF core fonts. If need be, it should be adaptable to other AFM files.

For information on the AFM file format, see Adobe Technical Note #5004, Adobe Font Metrics File Format Specification Version 4.1, available at the Adobe website.

How Parsing Works

AFM is a line-oriented format. Each line consists of one or more values of the supported types (string, name, number, integer, array, boolean). Values are separated by whitespace characters (space, newline, tab), except for the string type, which consumes everything up to the end of the line.

This parser reads the input line by line. Each type-parsing function parses a value from the front of the current line and then removes the parsed part, including trailing whitespace characters, from the line.
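The consume-from-the-front approach can be illustrated with a minimal sketch. The `LineBuffer` class and its `take_*` methods are hypothetical names used only for this illustration; they are not part of HexaPDF, and the sample lines are made up in the style of a core font AFM file.

```ruby
# Hypothetical helper, not part of HexaPDF: it mimics the "parse a value from
# the front, then drop the parsed part" approach described above.
class LineBuffer
  def initialize(line)
    @line = line.dup
  end

  # Takes the first whitespace-delimited token together with the whitespace
  # that follows it, and returns the token without that whitespace.
  def take_name
    token = @line[/\S+\s*/].to_s
    @line[0, token.size] = ''
    token.strip
  end

  # A number is a name converted to a float.
  def take_number
    take_name.to_f
  end

  # A string value consumes everything up to the end of the line.
  def take_string
    rest = @line.strip
    @line = +''
    rest
  end
end

buffer = LineBuffer.new("FontBBox -168 -218 1000 898\n")
command = buffer.take_name
bbox = Array.new(4) { buffer.take_number }

name_buffer = LineBuffer.new("FullName Times Roman\n")
name_command = name_buffer.take_name
full_name = name_buffer.take_string
```

Because each call shortens the buffer, the caller never tracks an offset: the next value is always at the front of what remains.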

Public Class Methods

new(io)

Creates a new parser for the given IO stream.

# File lib/hexapdf/font/type1/afm_parser.rb, line 78
def initialize(io)
  @io = io
end

parse(filename) → font_metrics
parse(io) → font_metrics

Parses the IO or file and returns a FontMetrics object.

# File lib/hexapdf/font/type1/afm_parser.rb, line 69
def self.parse(source)
  if source.respond_to?(:read)
    new(source).parse
  else
    File.open(source) {|file| new(file).parse }
  end
end
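The dispatch on `respond_to?(:read)` is what lets the same entry point accept either an IO-like object or a file name. The following is a small sketch of that pattern in isolation; `FirstLineReader` is a hypothetical stand-in that returns the first line instead of a FontMetrics object, so it runs without HexaPDF installed.

```ruby
require 'stringio'
require 'tempfile'

# Hypothetical stand-in for a parser: it only returns the first line it reads.
class FirstLineReader
  # Accepts anything that responds to #read, or else a file name.
  def self.parse(source)
    if source.respond_to?(:read)
      new(source).parse
    else
      File.open(source) {|file| new(file).parse }
    end
  end

  def initialize(io)
    @io = io
  end

  def parse
    @io.readline.chomp
  end
end

# An in-memory IO object is used directly.
from_io = FirstLineReader.parse(StringIO.new("StartFontMetrics 4.1\n"))

# A path is opened first and closed again when the block returns.
from_file = nil
Tempfile.create('sample.afm') do |file|
  file.write("StartFontMetrics 4.1\n")
  file.flush
  from_file = FirstLineReader.parse(file.path)
end
```

With the block form of `File.open`, the file handle is closed even if parsing raises an error.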

Public Instance Methods

parse()

Parses the AFM file and returns a FontMetrics object.

# File lib/hexapdf/font/type1/afm_parser.rb, line 83
def parse
  @metrics = FontMetrics.new
  sections = []
  each_line do
    case (command = parse_name)
    when /\AStart/
      sections.push(command)
      case command
      when 'StartCharMetrics' then parse_character_metrics
      when 'StartKernPairs' then parse_kerning_pairs
      end
    when /\AEnd/
      sections.pop
      break if sections.empty? && command == 'EndFontMetrics'
    else
      if sections.empty?
        parse_global_font_information(command.to_sym)
      end
    end
  end

  if @metrics.bounding_box && !@metrics.descender
    @metrics.descender = @metrics.bounding_box[1]
  end
  if @metrics.bounding_box && !@metrics.ascender
    @metrics.ascender = @metrics.bounding_box[3]
  end

  @metrics
end
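The final step of the listing above derives missing ascender and descender values from the font bounding box. A minimal sketch of that fallback follows; the `Metrics` struct and the `apply_bbox_fallback` method are hypothetical names for this illustration, and the numbers are sample values.

```ruby
# Hypothetical stand-in for the metrics object; only the fields used here.
Metrics = Struct.new(:bounding_box, :ascender, :descender)

# Fills in missing vertical metrics from the bounding box, which is stored as
# [lower-left x, lower-left y, upper-right x, upper-right y]. Values that are
# already set are left untouched.
def apply_bbox_fallback(metrics)
  return metrics unless metrics.bounding_box
  metrics.descender ||= metrics.bounding_box[1]
  metrics.ascender ||= metrics.bounding_box[3]
  metrics
end

without_values = apply_bbox_fallback(Metrics.new([-168, -218, 1000, 898], nil, nil))
with_values = apply_bbox_fallback(Metrics.new([-168, -218, 1000, 898], 683, -217))
```

In the first case both values come from the bounding box; in the second the explicitly given values are kept.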

Private Instance Methods

each_line() { || ... }

Iterates over all the lines in the IO, yielding every time a line has been read into the internal buffer.

# File lib/hexapdf/font/type1/afm_parser.rb, line 183
def each_line
  read_line
  unless parse_name == 'StartFontMetrics'
    raise HexaPDF::Error, "The AFM file has to start with StartFontMetrics, not #{@line}"
  end
  until @io.eof?
    read_line
    yield
  end
end

parse_boolean()

Parses the boolean at the start of the line.

# File lib/hexapdf/font/type1/afm_parser.rb, line 226
def parse_boolean
  parse_name == 'true'
end

parse_character_metrics()

Parses the character metrics in a StartCharMetrics section.

It is assumed that the StartCharMetrics name has already been parsed from the line.

# File lib/hexapdf/font/type1/afm_parser.rb, line 148
def parse_character_metrics
  parse_integer.times do
    read_line
    char = CharacterMetrics.new
    if @line =~ /C (\S+) ; WX (\S+) ; N (\S+) ; B (\S+) (\S+) (\S+) (\S+) ;((?: L \S+ \S+ ;)+)?/
      char.code = $1.to_i
      char.width = $2.to_f
      char.name = $3.to_sym
      char.bbox = [$4.to_i, $5.to_i, $6.to_i, $7.to_i]
      if $8
        @metrics.ligature_pairs[char.name] = {}
        $8.scan(/L (\S+) (\S+)/).each do |name, ligature|
          @metrics.ligature_pairs[char.name][name.to_sym] = ligature.to_sym
        end
      end
    end
    @metrics.character_metrics[char.name] = char if char.name
    @metrics.character_metrics[char.code] = char if char.code != -1
  end
end
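To make the regular expression easier to follow, here is a sketch that applies the same pattern to a single line. The sample line is illustrative and written in the layout the pattern expects; the constant and variable names are chosen for this example only.

```ruby
# Same pattern as in parse_character_metrics above.
CHAR_METRICS_RE = /C (\S+) ; WX (\S+) ; N (\S+) ; B (\S+) (\S+) (\S+) (\S+) ;((?: L \S+ \S+ ;)+)?/

# Illustrative line: code, width, glyph name, bounding box, two ligature entries.
line = "C 102 ; WX 333 ; N f ; B 20 0 383 683 ; L i fi ; L l fl ;"

match = CHAR_METRICS_RE.match(line)
code = match[1].to_i
width = match[2].to_f
name = match[3].to_sym
bbox = match[4..7].map(&:to_i)

# The optional eighth group holds all "L successor ligature ;" entries.
ligatures = match[8].to_s.scan(/L (\S+) (\S+)/).map do |successor, ligature|
  [successor.to_sym, ligature.to_sym]
end.to_h
```

Each ligature entry maps the name of the following glyph to the name of the ligature glyph that replaces the pair.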

parse_global_font_information(command)

Parses a global font information line for the given command (a symbol).

It is assumed that the command name has already been parsed from the line.

Note that writing direction metrics are also processed here since the standard 14 core fonts' AFM files don't have an extra StartDirection section.

# File lib/hexapdf/font/type1/afm_parser.rb, line 122
def parse_global_font_information(command)
  case command
  when :FontName then @metrics.font_name = parse_string
  when :FullName then @metrics.full_name = parse_string
  when :FamilyName then @metrics.family_name = parse_string
  when :CharacterSet then @metrics.character_set = parse_string
  when :EncodingScheme then @metrics.encoding_scheme = parse_string
  when :Weight then @metrics.weight = parse_string
  when :FontBBox
    @metrics.bounding_box = [parse_number, parse_number, parse_number, parse_number]
  when :CapHeight then @metrics.cap_height = parse_number
  when :XHeight then @metrics.x_height = parse_number
  when :Ascender then @metrics.ascender = parse_number
  when :Descender then @metrics.descender = parse_number
  when :StdHW then @metrics.dominant_horizontal_stem_width = parse_number
  when :StdVW then @metrics.dominant_vertical_stem_width = parse_number
  when :UnderlinePosition then @metrics.underline_position = parse_number
  when :UnderlineThickness then @metrics.underline_thickness = parse_number
  when :ItalicAngle then @metrics.italic_angle = parse_number
  when :IsFixedPitch then @metrics.is_fixed_pitch = parse_boolean
  end
end

parse_integer()

Parses the integer at the start of the line.

# File lib/hexapdf/font/type1/afm_parser.rb, line 216
def parse_integer
  parse_name.to_i
end

parse_kerning_pairs()

Parses the kerning pairs in a StartKernPairs section.

It is assumed that the StartKernPairs name has already been parsed from the line.

# File lib/hexapdf/font/type1/afm_parser.rb, line 172
def parse_kerning_pairs
  parse_integer.times do
    read_line
    if @line =~ /KPX (\S+) (\S+) (\S+)/
      (@metrics.kerning_pairs[$1.to_sym] ||= {})[$2.to_sym] = $3.to_i
    end
  end
end
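The nested hash built by this method can be sketched on its own. The `KPX` lines below are illustrative sample data, and a plain local hash stands in for the kerning pairs stored on the metrics object.

```ruby
kerning_pairs = {}

# Illustrative KPX lines: first glyph name, second glyph name, x adjustment.
lines = ["KPX A y -92", "KPX A w -92", "KPX B A -35"]

lines.each do |line|
  match = line.match(/KPX (\S+) (\S+) (\S+)/)
  next unless match
  # The outer key is the first glyph; the inner hash maps the second glyph
  # to the kerning amount.
  (kerning_pairs[match[1].to_sym] ||= {})[match[2].to_sym] = match[3].to_i
end

adjustment = kerning_pairs.dig(:A, :y)
missing = kerning_pairs.dig(:A, :z)
```

Looking up a pair is then two hash accesses, and `dig` returns `nil` for pairs without a kerning entry.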

parse_name()

Parses and returns the name at the start of the line, with whitespace stripped.

# File lib/hexapdf/font/type1/afm_parser.rb, line 200
def parse_name
  result = @line[/\S+\s*/].to_s
  @line[0, result.size] = ''
  result.strip!
  result
end

parse_number()

Parses the floating-point number at the start of the line.

# File lib/hexapdf/font/type1/afm_parser.rb, line 221
def parse_number
  parse_name.to_f
end

parse_string()

Returns the rest of the line, with whitespace stripped.

# File lib/hexapdf/font/type1/afm_parser.rb, line 208
def parse_string
  @line.strip!
  line = @line
  @line = ''
  line
end

read_line()

Reads the next line into the current line variable.

# File lib/hexapdf/font/type1/afm_parser.rb, line 195
def read_line
  @line = @io.readline
end