class HexaPDF::Serializer

Knows how to serialize Ruby objects for a PDF file.

For normal serialization purposes, the serialize or serialize_to_io methods should be used. However, if the type of the object to be serialized is known, a specialized serialization method like serialize_float can be used.

Additionally, an object for encrypting strings and streams while serializing can be set via the encrypter= method. The assigned object has to respond to encrypt_string(str, ind_obj) (where the string is part of the indirect object; returns the encrypted string) and encrypt_stream(stream) (returns a fiber that represents the encrypted stream).
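
The interface described above can be sketched with a pass-through object. This is only an illustration under stated assumptions: PassThroughEncrypter and FakeStream are hypothetical names that are not part of HexaPDF, nothing is actually encrypted, and the stand-in stream only mimics the stream_encoder method that the serializer source calls on stream objects.

```ruby
# Hypothetical encrypter: satisfies the two-method interface without encrypting.
class PassThroughEncrypter
  # Returns the (here unchanged) string that belongs to the indirect object.
  def encrypt_string(str, _ind_obj)
    str.dup
  end

  # Returns a fiber that yields the (here unchanged) stream data.
  def encrypt_stream(stream)
    stream.stream_encoder
  end
end

# Hypothetical stand-in for a stream object; only stream_encoder is modelled.
FakeStream = Struct.new(:payload) do
  def stream_encoder
    data = payload
    Fiber.new { data }
  end
end

encrypter = PassThroughEncrypter.new
encrypter.encrypt_string("secret", nil)                    # => "secret"
encrypter.encrypt_stream(FakeStream.new("bytes")).resume   # => "bytes"
```

An object like this could then be assigned via encrypter= on a serializer instance.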

How This Class Works

The main public interface consists of the serialize and serialize_to_io methods. Both accept an object; serialize returns its serialized form, whereas serialize_to_io writes that form to the given IO. During serialization, the object is accessible to the individual serialization methods via the @object instance variable (useful if the object is a composed object).

Internally, the #__serialize method is used for invoking the correct serialization method based on the class of a given object. It is also used for serializing individual parts of a composed object.

Therefore the serializer contains one serialization method for each class it needs to serialize. The naming scheme of these methods is based on the class name: The full class name is converted to lowercase, the namespace separator '::' is replaced with a single underscore and the string “serialize_” is then prepended.

Examples:

NilClass                 => serialize_nilclass
TrueClass                => serialize_trueclass
HexaPDF::Object          => serialize_hexapdf_object

If no serialization method for a specific class is found, the ancestor classes are tried.
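
The naming scheme and the ancestor fallback can be sketched as follows. This is a simplified illustration, not the class's actual implementation: MiniSerializer and the Demo classes are hypothetical, and the sketch raises ArgumentError when no method matches.

```ruby
# Hypothetical stand-in classes, not part of HexaPDF.
module Demo
  class Temperature
    attr_reader :degrees

    def initialize(degrees)
      @degrees = degrees
    end
  end

  class Celsius < Temperature; end
end

class MiniSerializer
  def initialize
    # Derive the method name from the class name (lowercase, '::' becomes '_'),
    # walk the ancestors until a method exists, and cache the result per class.
    @dispatcher = Hash.new do |cache, klass|
      cache[klass] = klass.ancestors
                          .map { |a| "serialize_#{a.name.to_s.downcase.gsub('::', '_')}" }
                          .find { |name| respond_to?(name, true) }
    end
  end

  def serialize(obj)
    method_name = @dispatcher[obj.class]
    raise ArgumentError, "No serialization method for #{obj.class}" unless method_name
    send(method_name, obj)
  end

  private

  def serialize_nilclass(_obj)
    "null"
  end

  def serialize_demo_temperature(obj)
    obj.degrees.to_s
  end
end

serializer = MiniSerializer.new
serializer.serialize(nil)                    # uses serialize_nilclass
serializer.serialize(Demo::Celsius.new(21))  # falls back to serialize_demo_temperature
```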

See: PDF1.7 s7.3

Attributes

encrypter[RW]

The encrypter to use for encrypting strings and streams. If nil, strings and streams are not encrypted.

Default: nil

Public Class Methods

new()

Creates a new Serializer object.

# File lib/hexapdf/serializer.rb, line 89
def initialize
  @dispatcher = {
    Hash => 'serialize_hash',
    Array => 'serialize_array',
    Symbol => 'serialize_symbol',
    String => 'serialize_string',
    Integer => 'serialize_integer',
    Float => 'serialize_float',
    Time => 'serialize_time',
    TrueClass => 'serialize_trueclass',
    FalseClass => 'serialize_falseclass',
    NilClass => 'serialize_nilclass',
    HexaPDF::Reference => 'serialize_hexapdf_reference',
    HexaPDF::Object => 'serialize_hexapdf_object',
    HexaPDF::Stream => 'serialize_hexapdf_stream',
    HexaPDF::Dictionary => 'serialize_hexapdf_object',
    HexaPDF::PDFArray => 'serialize_hexapdf_object',
    HexaPDF::Rectangle => 'serialize_hexapdf_object',
  }
  @dispatcher.default_proc = lambda do |h, klass|
    h[klass] = if klass <= HexaPDF::Stream
                 "serialize_hexapdf_stream"
               elsif klass <= HexaPDF::Object
                 "serialize_hexapdf_object"
               else
                 method = nil
                 klass.ancestors.each do |ancestor_klass|
                   name = ancestor_klass.name.to_s.downcase
                   name.gsub!(/::/, '_')
                   method = "serialize_#{name}"
                   break if respond_to?(method, true)
                 end
                 method
               end
  end
  @encrypter = false
  @io = nil
  @object = nil
  @in_object = false
end

Public Instance Methods

serialize(obj)

Returns the serialized form of the given object.

For developers: While the object is serialized, methods can use the instance variable @object to obtain information about or use the object in case it is a composed object.

# File lib/hexapdf/serializer.rb, line 134
def serialize(obj)
  @object = obj
  __serialize(obj)
ensure
  @object = nil
end

serialize_array(obj)

Serializes an Array object.

See: PDF1.7 s7.3.6

# File lib/hexapdf/serializer.rb, line 231
def serialize_array(obj)
  str = +"["
  index = 0
  while index < obj.size
    tmp = __serialize(obj[index])
    str << " " unless BYTE_IS_DELIMITER[tmp.getbyte(0)] ||
      BYTE_IS_DELIMITER[str.getbyte(-1)]
    str << tmp
    index += 1
  end
  str << "]"
end
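
The spacing rule in the loop above can be tried in isolation. The sketch below is an approximation: BYTE_IS_DELIMITER is not shown in this section, so the delimiter set used here (PDF delimiter characters plus whitespace) is an assumption, and delimiter? and join_tokens are hypothetical helper names.

```ruby
# Assumed delimiter set: PDF delimiter characters plus whitespace bytes.
DELIMITER_BYTES = "()<>[]{}/% \t\r\n\f\0".bytes.freeze

def delimiter?(byte)
  DELIMITER_BYTES.include?(byte)
end

# Joins already serialized tokens, inserting a space only where two
# neighbouring tokens would otherwise run together.
def join_tokens(tokens)
  str = +"["
  tokens.each do |token|
    str << " " unless delimiter?(token.getbyte(0)) || delimiter?(str.getbyte(-1))
    str << token
  end
  str << "]"
end

join_tokens(%w[1 2 3])           # => "[1 2 3]"
join_tokens(%w[1 /Name (str) 2]) # => "[1/Name(str)2]"
```

Omitting the separator next to a delimiter keeps the output compact without changing how the tokens are parsed.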

serialize_date(obj)

See: serialize_time

# File lib/hexapdf/serializer.rb, line 296
def serialize_date(obj)
  serialize_time(obj.to_time)
end

serialize_datetime(obj)

See: serialize_time

# File lib/hexapdf/serializer.rb, line 301
def serialize_datetime(obj)
  serialize_time(obj.to_time)
end

serialize_falseclass(_obj)

Serializes the false value.

See: PDF1.7 s7.3.2

# File lib/hexapdf/serializer.rb, line 168
def serialize_falseclass(_obj)
  "false"
end

serialize_float(obj)

Serializes a Float object.

See: PDF1.7 s7.3.3

# File lib/hexapdf/serializer.rb, line 192
def serialize_float(obj)
  if -0.0001 < obj && obj < 0.0001 && obj != 0
    sprintf("%.6f", obj)
  elsif obj.finite?
    obj.round(6).to_s
  else
    raise HexaPDF::Error, "Can't serialize special floating point number #{obj}"
  end
end
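
The special case for very small values exists because Ruby's Float#to_s switches to exponent notation below 0.0001, which is not valid PDF number syntax. A standalone sketch of the same logic, where pdf_float is a hypothetical name and ArgumentError stands in for HexaPDF::Error:

```ruby
def pdf_float(value)
  if -0.0001 < value && value < 0.0001 && value != 0
    # Float#to_s would produce exponent notation here, so use fixed notation.
    sprintf("%.6f", value)
  elsif value.finite?
    value.round(6).to_s
  else
    raise ArgumentError, "Can't serialize special floating point number #{value}"
  end
end

0.00001.to_s           # => "1.0e-05"
pdf_float(0.00001)     # => "0.000010"
pdf_float(3.14159265)  # => "3.141593"
pdf_float(1.5)         # => "1.5"
```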

serialize_hash(obj)

Serializes a Hash object (i.e. a PDF dictionary object).

See: PDF1.7 s7.3.7

# File lib/hexapdf/serializer.rb, line 247
def serialize_hash(obj)
  str = +"<<"
  obj.each do |k, v|
    next if v.nil? || (v.respond_to?(:null?) && v.null?)
    str << serialize_symbol(k)
    tmp = __serialize(v)
    str << " " unless BYTE_IS_DELIMITER[tmp.getbyte(0)] ||
      BYTE_IS_DELIMITER[str.getbyte(-1)]
    str << tmp
  end
  str << ">>"
end

serialize_integer(obj)

Serializes an Integer object.

See: PDF1.7 s7.3.3

# File lib/hexapdf/serializer.rb, line 185
def serialize_integer(obj)
  obj.to_s
end

serialize_nilclass(_obj)

Serializes the nil value.

See: PDF1.7 s7.3.9

# File lib/hexapdf/serializer.rb, line 154
def serialize_nilclass(_obj)
  "null"
end

serialize_numeric(obj)

Serializes a Numeric object (either Integer or Float).

This method should be used for cases where it is known that the object is either an Integer or a Float.

See: PDF1.7 s7.3.3

# File lib/hexapdf/serializer.rb, line 178
def serialize_numeric(obj)
  obj.kind_of?(Integer) ? obj.to_s : serialize_float(obj)
end

serialize_string(obj)

Serializes a String object.

See: PDF1.7 s7.3.4

# File lib/hexapdf/serializer.rb, line 265
def serialize_string(obj)
  obj = if @encrypter && @object.kind_of?(HexaPDF::Object) && @object.indirect?
          encrypter.encrypt_string(obj, @object)
        elsif obj.encoding != Encoding::BINARY
          if obj.match?(/[^ -~\t\r\n]/)
            "\xFE\xFF".b << obj.encode(Encoding::UTF_16BE).force_encoding(Encoding::BINARY)
          else
            obj.b
          end
        else
          obj.dup
        end
  obj.gsub!(/[()\\\r]/n, STRING_ESCAPE_MAP)
  "(#{obj})"
end
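
The unencrypted branches can be sketched on their own: binary strings are copied as-is, strings with characters outside printable ASCII (plus tab, CR and LF) are converted to UTF-16BE with a byte order mark, and everything else is reinterpreted as binary. The sketch leaves out encryption, pdf_literal_string is a hypothetical name, and the escape table is an assumption because STRING_ESCAPE_MAP is not shown in this section.

```ruby
# Assumed escape table; the real STRING_ESCAPE_MAP is not shown here.
ESCAPES = {"(" => "\\(", ")" => "\\)", "\\" => "\\\\", "\r" => "\\r"}.freeze

def pdf_literal_string(str)
  bytes = if str.encoding == Encoding::BINARY
            str.dup
          elsif str.match?(/[^ -~\t\r\n]/)
            # Non-ASCII text: UTF-16BE prefixed with the byte order mark.
            "\xFE\xFF".b << str.encode(Encoding::UTF_16BE).force_encoding(Encoding::BINARY)
          else
            str.b
          end
  bytes.gsub!(/[()\\\r]/n, ESCAPES)
  "(#{bytes})"
end

pdf_literal_string("a(b)")          # => "(a\\(b\\))"
pdf_literal_string("\u00e4").bytes  # => [40, 254, 255, 0, 228, 41]
```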

serialize_symbol(obj)

Serializes a Symbol object (i.e. a PDF name object).

See: PDF1.7 s7.3.5

# File lib/hexapdf/serializer.rb, line 216
def serialize_symbol(obj)
  NAME_CACHE[obj] ||=
    begin
      str = obj.to_s.dup.force_encoding(Encoding::BINARY)
      str.gsub!(NAME_REGEXP, NAME_SUBSTS)
      str.empty? ? "/ " : "/#{str}"
    end
end

serialize_time(obj)

The ISO PDF specification and Adobe's PDF reference differ with respect to the supported date format. When converting to a date string, a format suitable for both is output.

See: PDF1.7 s7.9.4, ADB1.7 3.8.3

# File lib/hexapdf/serializer.rb, line 285
def serialize_time(obj)
  zone = obj.strftime("%z'")
  if zone == "+0000'"
    zone = ''
  else
    zone[3, 0] = "'"
  end
  serialize_string(obj.strftime("D:%Y%m%d%H%M%S#{zone}"))
end
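
The time zone handling can be followed step by step in a standalone sketch. pdf_date is a hypothetical name, and unlike the method above it returns the bare date string without wrapping it in a PDF string.

```ruby
def pdf_date(time)
  zone = time.strftime("%z'")  # for example "+0100'"
  if zone == "+0000'"
    zone = ''                  # UTC: the offset is left out entirely
  else
    zone[3, 0] = "'"           # "+0100'" becomes "+01'00'"
  end
  time.strftime("D:%Y%m%d%H%M%S#{zone}")
end

pdf_date(Time.utc(2024, 1, 2, 3, 4, 5))            # => "D:20240102030405"
pdf_date(Time.new(2024, 1, 2, 3, 4, 5, "+01:00"))  # => "D:20240102030405+01'00'"
```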

serialize_to_io(obj, io)

Serializes the given object and writes it to the IO.

Also see: serialize

# File lib/hexapdf/serializer.rb, line 144
def serialize_to_io(obj, io)
  @io = io
  @io << serialize(obj).freeze
ensure
  @io = nil
end

serialize_trueclass(_obj)

Serializes the true value.

See: PDF1.7 s7.3.2

# File lib/hexapdf/serializer.rb, line 161
def serialize_trueclass(_obj)
  "true"
end

Private Instance Methods

__serialize(obj)

Invokes the correct serialization method for the object.

# File lib/hexapdf/serializer.rb, line 368
def __serialize(obj)
  send(@dispatcher[obj.class], obj)
end

serialize_hexapdf_object(obj)

Uses serialize_hexapdf_reference if it is an indirect object, otherwise just serializes the object's value.

# File lib/hexapdf/serializer.rb, line 309
def serialize_hexapdf_object(obj)
  if obj.indirect? && (obj != @object || @in_object)
    serialize_hexapdf_reference(obj)
  else
    @in_object ||= (obj == @object)
    str = __serialize(obj.value)
    @in_object = false if obj == @object
    str
  end
end
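
The interplay of @object and @in_object decides whether an indirect object is written out in full or as a reference: only the top-level object passed to serialize is expanded, and only once. The sketch below models this with hypothetical stand-ins (PdfObj, RefSketch) that are not part of HexaPDF; it compares objects by identity, which is a simplification.

```ruby
# Hypothetical stand-in for a PDF object; an oid of 0 means "direct object".
PdfObj = Struct.new(:oid, :gen, :value) do
  def indirect?
    oid != 0
  end
end

class RefSketch
  def serialize(obj)
    @object = obj
    @in_object = false
    emit(obj)
  ensure
    @object = nil
  end

  private

  def emit(obj)
    case obj
    when PdfObj
      if obj.indirect? && (!obj.equal?(@object) || @in_object)
        "#{obj.oid} #{obj.gen} R"            # nested or repeated: reference only
      else
        @in_object ||= obj.equal?(@object)   # now inside the top-level object
        str = emit(obj.value)
        @in_object = false if obj.equal?(@object)
        str
      end
    when Array
      "[" + obj.map { |element| emit(element) }.join(" ") + "]"
    else
      obj.to_s
    end
  end
end

child = PdfObj.new(2, 0, 42)
parent = PdfObj.new(1, 0, [child, 7])
RefSketch.new.serialize(parent)  # => "[2 0 R 7]"
RefSketch.new.serialize(child)   # => "42"
```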

serialize_hexapdf_reference(obj)

See: PDF1.7 s7.3.10

# File lib/hexapdf/serializer.rb, line 321
def serialize_hexapdf_reference(obj)
  "#{obj.oid} #{obj.gen} R"
end

serialize_hexapdf_stream(obj)

Serializes the stream's dictionary and its stream.

See: PDF1.7 s7.3.8

# File lib/hexapdf/serializer.rb, line 328
def serialize_hexapdf_stream(obj)
  if !obj.indirect?
    raise HexaPDF::Error, "Can't serialize PDF stream without object identifier"
  elsif obj != @object || @in_object
    return serialize_hexapdf_reference(obj)
  end

  @in_object = true

  fiber = if @encrypter
            encrypter.encrypt_stream(obj)
          else
            obj.stream_encoder
          end

  if @io && fiber.respond_to?(:length) && fiber.length >= 0
    obj.value[:Length] = fiber.length
    @io << serialize_hash(obj.value)
    @io << "stream\n"
    while fiber.alive? && (data = fiber.resume)
      @io << data.freeze
    end
    @io << "\nendstream"
    @in_object = false

    nil
  else
    data = Filter.string_from_source(fiber)
    obj.value[:Length] = data.size

    str = serialize_hash(obj.value)
    @in_object = false

    str << "stream\n"
    str << data
    str << "\nendstream"
  end
end
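
When an IO is available and the length is known up front, the stream data is written chunk by chunk instead of being collected into one string. A reduced sketch of that loop: write_stream_body is a hypothetical helper, the dictionary is left out, and the example fiber signals its end by returning nil, so the sketch does not need the alive? check used above.

```ruby
require 'stringio'

# Hypothetical helper: pulls chunks from the fiber and writes them to the IO.
def write_stream_body(io, fiber)
  io << "stream\n"
  while (data = fiber.resume)
    io << data
  end
  io << "\nendstream"
end

# Example source fiber: yields two chunks, then returns nil to signal the end.
fiber = Fiber.new do
  Fiber.yield("Hello ")
  Fiber.yield("World")
  nil
end

io = StringIO.new
write_stream_body(io, fiber)
io.string  # => "stream\nHello World\nendstream"
```

Writing each chunk as it is produced keeps memory usage independent of the stream size.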