class HexaPDF::Object
Objects of the PDF object system.
Overview
A PDF object is like a normal object but with an additional *object identifier* consisting of an object number and a generation number. If the object number is zero, then the PDF object represents a direct object. Otherwise the object identifier uniquely identifies this object as an indirect object and can be used for referencing it (from possibly multiple places).
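The distinction between direct and indirect objects can be sketched with a small stand-in. `ObjectId` below is a hypothetical struct used only for illustration, not a HexaPDF class; it assumes nothing beyond the rule stated above that an object number of zero means "direct".

```ruby
# ObjectId is a stand-in for the identifier carried by a PDF object.
ObjectId = Struct.new(:oid, :gen) do
  # An object number of zero marks a direct object.
  def indirect?
    oid != 0
  end
end

direct = ObjectId.new(0, 0)
indirect = ObjectId.new(12, 0)

puts direct.indirect?
puts indirect.indirect?
```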
Furthermore, a PDF object may have an associated stream. However, this stream is only accessible if the subclass Stream is used.
A PDF object should be connected to a PDF document, otherwise some methods may not work.
Most PDF objects in a PDF document are represented by subclasses of this class that provide additional functionality.
The methods hash and eql? are implemented so that objects of this class can be used as hash keys. Furthermore, the implementation is compatible with that of Reference, i.e. the hash of a PDF Object is the same as the hash of its corresponding Reference object.
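What this compatibility means in practice can be illustrated with two stand-in classes that share the same `hash` and `eql?` logic. `FakeObject`, `FakeReference` and `OidGenIdentity` are hypothetical names for this sketch and are not part of HexaPDF; only the two method bodies follow the implementation shown on this page.

```ruby
# Stand-in classes for illustration only; these are not HexaPDF's own classes.
module OidGenIdentity
  # Two values are interchangeable as hash keys if oid and gen match.
  def eql?(other)
    other.respond_to?(:oid) && oid == other.oid &&
      other.respond_to?(:gen) && gen == other.gen
  end

  def hash
    oid.hash ^ gen.hash
  end
end

class FakeReference
  include OidGenIdentity
  attr_reader :oid, :gen

  def initialize(oid, gen)
    @oid = oid
    @gen = gen
  end
end

class FakeObject
  include OidGenIdentity
  attr_reader :oid, :gen, :value

  def initialize(value, oid:, gen:)
    @value = value
    @oid = oid
    @gen = gen
  end
end

obj = FakeObject.new({Type: :Page}, oid: 5, gen: 0)
ref = FakeReference.new(5, 0)

lookup = {obj => "page data"}
puts lookup[ref]                            # entry stored under the object
puts lookup.key?(FakeReference.new(6, 0))   # different object number
```

Because both classes compute the hash from the object and generation numbers alone, a reference can be used to look up an entry that was stored under the object itself.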
Allowed PDF Object Values
The PDF specification knows of the following object types:
- Boolean (mapped to true and false)
- Integer (mapped to Integer objects)
- Real (mapped to Float objects)
- String (mapped to String objects with UTF-8 or binary encoding)
- Name (mapped to Symbol objects)
- Array (mapped to Array objects)
- Dictionary (mapped to Hash objects)
- Stream (mapped to the Stream class, which is a Dictionary with the associated stream data)
- Null (mapped to nil)
- Indirect Object (mapped to this class)
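The mapping in the list above can be made concrete with a dictionary written entirely in plain Ruby values. The keys and values below are illustrative data chosen for this sketch (`Hidden` and `Title`, for instance, are made up), and no HexaPDF class is involved.

```ruby
# A PDF dictionary written with plain Ruby values (illustrative data only).
page_like = {
  Type: :Page,                 # Name       -> Symbol
  Rotate: 90,                  # Integer    -> Integer
  UserUnit: 1.5,               # Real       -> Float
  MediaBox: [0, 0, 595, 842],  # Array      -> Array
  Hidden: false,               # Boolean    -> true / false
  Title: "Cover",              # String     -> String
  Parent: nil,                 # Null       -> nil
  Resources: {Font: {}},       # Dictionary -> Hash
}

page_like.each do |key, value|
  puts "#{key}: #{value.class}"
end
```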
So working with PDF objects in HexaPDF is rather straightforward since the common Ruby objects can be used for most things, i.e. wrapping a plain Ruby object in an object of this class is not necessary (except if it should become an indirect object).
There are also some additional data structures built from these primitive ones. For example, Time objects are represented as specially formatted string objects and conversion from and to the string representation is handled automatically.
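As a rough illustration of such a "specially formatted string", the PDF date format starts with `D:` followed by the date and time digits and a timezone marker. The helper below is a hypothetical sketch using only Ruby's standard `Time#strftime`; it is not HexaPDF's conversion code and handles only the UTC case.

```ruby
# Formats a Time as a PDF date string in UTC ("D:YYYYMMDDHHmmSSZ").
# pdf_date is a hypothetical helper, not a HexaPDF method.
def pdf_date(time)
  time.getutc.strftime("D:%Y%m%d%H%M%SZ")
end

puts pdf_date(Time.utc(2024, 1, 2, 3, 4, 5))
```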
Important: Users of HexaPDF may use other plain Ruby objects but then there is no guarantee that everything will work correctly, especially when using collection types other than arrays and hashes.
See: HexaPDF::Dictionary, HexaPDF::Stream, HexaPDF::Reference, HexaPDF::Document
See: PDF1.7 s7.3.10, s7.3.8
Attributes
The wrapped HexaPDF::PDFData value.
This attribute is not part of the public API!
Sets the associated PDF document.
Sets whether the object has to be an indirect object once it is written.
Public Class Methods
Creates a deep copy of the given object which retains the references to indirect objects.
  # File lib/hexapdf/object.rb, line 128
  def self.deep_copy(object)
    case object
    when Hash
      object.transform_values {|value| deep_copy(value) }
    when Array
      object.map {|o| deep_copy(o) }
    when HexaPDF::Object
      (object.indirect? || object.must_be_indirect? ? object : deep_copy(object.value))
    when HexaPDF::Reference
      object
    else
      object.dup
    end
  end
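The effect of the recursive copy on plain collections can be tried without HexaPDF. The function below is a simplified stand-in (`plain_deep_copy` is a hypothetical name) that keeps only the Hash, Array and fallback branches; the branches for HexaPDF::Object and HexaPDF::Reference, which preserve indirect objects, are deliberately left out.

```ruby
# Simplified stand-in covering only the Hash, Array and fallback branches;
# the HexaPDF::Object and HexaPDF::Reference branches are left out.
def plain_deep_copy(object)
  case object
  when Hash then object.transform_values {|value| plain_deep_copy(value) }
  when Array then object.map {|item| plain_deep_copy(item) }
  else object.dup
  end
end

original = {Kids: [{Name: "first"}], Count: 1}
copy = plain_deep_copy(original)
copy[:Kids][0][:Name] << " (copy)"   # mutate the nested string of the copy

puts original[:Kids][0][:Name]
puts copy[:Kids][0][:Name]
```

Mutating a nested value of the copy leaves the original untouched because every container and leaf value was duplicated on the way down.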
Creates a new PDF object wrapping the value.
The value can be a PDFData object, in which case it is used directly. If it is a PDF Object, then its data is used. Otherwise the value object is used as is. In all cases, the oid, gen and stream values may be overridden by the corresponding keyword arguments.
  # File lib/hexapdf/object.rb, line 159
  def initialize(value, document: nil, oid: nil, gen: nil, stream: nil)
    @data = case value
            when PDFData then value
            when Object then value.data
            else PDFData.new(value)
            end
    @data.oid = oid if oid
    @data.gen = gen if gen
    @data.stream = stream if stream
    self.document = document
    self.must_be_indirect = false
    after_data_change
  end
Public Instance Methods
Compares this object to another object.
If the other object does not respond to oid or gen, nil is returned. Otherwise objects are ordered first by object number and then by generation number.
  # File lib/hexapdf/object.rb, line 312
  def <=>(other)
    return nil unless other.respond_to?(:oid) && other.respond_to?(:gen)
    (oid == other.oid ? gen <=> other.gen : oid <=> other.oid)
  end
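The resulting order can be checked with a stand-in that carries only the two numbers. `Ident` is a hypothetical struct for this sketch; its comparison method copies the logic shown above.

```ruby
# Ident is a stand-in for anything that responds to oid and gen.
Ident = Struct.new(:oid, :gen) do
  def <=>(other)
    return nil unless other.respond_to?(:oid) && other.respond_to?(:gen)
    (oid == other.oid ? gen <=> other.gen : oid <=> other.oid)
  end
end

# Array#sort relies on <=>, so object number wins and generation breaks ties.
sorted = [Ident.new(3, 0), Ident.new(1, 1), Ident.new(1, 0)].sort
p sorted.map(&:to_a)

# Comparing with something that has no oid/gen yields nil.
p(Ident.new(1, 0) <=> "not a PDF object")
```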
Caches and returns the given value or the value of the block under the given cache key. If there is already a cached value for the key and update is false, it is just returned.
Set update to true to force an update of the cached value.
This uses Document#cache internally.
  # File lib/hexapdf/object.rb, line 292
  def cache(key, value = Document::UNSET, update: false, &block)
    document.cache(@data, key, value, update: update, &block)
  end
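The caching behavior described above (return the stored value unless `update` is set) can be sketched with a small stand-in. `TinyCache` is a hypothetical class written for this illustration; the implementation of Document#cache is not shown on this page and may differ, for example in how entries are keyed per object.

```ruby
# TinyCache is a hypothetical stand-in; Document#cache is not reproduced here.
class TinyCache
  UNSET = Object.new

  def initialize
    @store = {}
  end

  # Returns the cached value unless the key is missing or update is true,
  # in which case the given value (or the block result) is stored first.
  def cache(key, value = UNSET, update: false)
    return @store[key] if @store.key?(key) && !update
    @store[key] = (value.equal?(UNSET) ? yield : value)
  end

  def cached?(key)
    @store.key?(key)
  end
end

calls = 0
store = TinyCache.new
first = store.cache(:width) { calls += 1; 42 }
second = store.cache(:width) { calls += 1; 99 }              # block is not run
third = store.cache(:width, update: true) { calls += 1; 7 }  # forced refresh

p [first, second, third, calls]
```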
Returns true if there is a cached value for the given key.
This uses Document#cached? internally.
  # File lib/hexapdf/object.rb, line 299
  def cached?(key)
    document.cached?(@data, key)
  end
Clears the cache for this object.
  # File lib/hexapdf/object.rb, line 304
  def clear_cache
    document.clear_cache(@data)
  end
Makes a deep copy of the source PDF object and resets the object identifier.
  # File lib/hexapdf/object.rb, line 276
  def deep_copy
    obj = dup
    obj.instance_variable_set(:@data, @data.dup)
    obj.data.oid = 0
    obj.data.gen = 0
    obj.data.stream = @data.stream.dup if @data.stream.kind_of?(String)
    obj.data.value = self.class.deep_copy(@data.value)
    obj
  end
Returns the associated PDF document.
If no document is associated, an error is raised.
  # File lib/hexapdf/object.rb, line 207
  def document
    @document || raise(HexaPDF::Error, "No document associated with this object (#{inspect})")
  end
Returns true if a PDF document is associated.
  # File lib/hexapdf/object.rb, line 212
  def document?
    !@document.nil?
  end
Returns true if the other object references the same PDF object as this object.
  # File lib/hexapdf/object.rb, line 324
  def eql?(other)
    other.respond_to?(:oid) && oid == other.oid && other.respond_to?(:gen) && gen == other.gen
  end
Returns the generation number of the PDF object.
  # File lib/hexapdf/object.rb, line 184
  def gen
    data.gen
  end
Sets the generation number of the PDF object.
  # File lib/hexapdf/object.rb, line 189
  def gen=(gen)
    data.gen = gen
  end
Computes the hash value based on the object and generation numbers.
  # File lib/hexapdf/object.rb, line 329
  def hash
    oid.hash ^ gen.hash
  end
Returns true if the object is an indirect object (i.e. has an object number unequal to zero).
  # File lib/hexapdf/object.rb, line 218
  def indirect?
    oid != 0
  end
Returns true if the object must be an indirect object once it is written.
  # File lib/hexapdf/object.rb, line 223
  def must_be_indirect?
    @must_be_indirect
  end
Returns true if the object represents the PDF null object.
  # File lib/hexapdf/object.rb, line 243
  def null?
    value.nil?
  end
Returns the object number of the PDF object.
  # File lib/hexapdf/object.rb, line 174
  def oid
    data.oid
  end
Sets the object number of the PDF object.
  # File lib/hexapdf/object.rb, line 179
  def oid=(oid)
    data.oid = oid
  end
Returns the type (symbol) of the object.
Since the type system is implemented in such a way as to allow exchanging implementations of specific types, the class of an object can't be reliably used for determining the actual type.
However, the Type and Subtype fields can easily be used for this. Subclasses for PDF objects that don't have such fields may use a unique name that has to begin with XX (see PDF1.7 sE.2) and therefore doesn't clash with names defined by the PDF specification.
For basic objects this always returns :Unknown.
  # File lib/hexapdf/object.rb, line 238
  def type
    :Unknown
  end
Validates the object, optionally correcting problems when the option auto_correct is set, and returns true if the object is deemed valid and false otherwise.
If a block is given, it is called on validation problems with a problem description and whether the problem is automatically correctable. The third argument to the block is usually this object but may be another object if during auto-correction a new object was created and validated.
The validation routine itself has to be implemented in the perform_validation method; see its documentation for more information.
Note: Even if the return value is true there may be problems since HexaPDF doesn't currently implement the full PDF spec. However, if the return value is false, there is certainly a problem!
  # File lib/hexapdf/object.rb, line 265
  def validate(auto_correct: true)
    result = true
    perform_validation do |msg, correctable, object|
      yield(msg, correctable, object || self) if block_given?
      result = false unless correctable
      return false unless auto_correct
    end
    result
  end
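How the return value and the block interact can be tried with a stand-in whose validation routine always reports one correctable and one uncorrectable problem. `SampleObject` is a hypothetical class for this sketch; only the body of `validate` is taken from the listing above, and the two problem messages are made up for the demonstration.

```ruby
# SampleObject is a stand-in that reports two problems; it is not a HexaPDF class.
class SampleObject
  def validate(auto_correct: true)
    result = true
    perform_validation do |msg, correctable, object|
      yield(msg, correctable, object || self) if block_given?
      result = false unless correctable
      return false unless auto_correct   # stop at the first problem
    end
    result
  end

  private

  def perform_validation
    yield("Object must be an indirect object", true)   # correctable
    yield("Length of /SomeKey is invalid", false)      # not correctable
  end
end

problems = []
valid = SampleObject.new.validate {|msg, correctable| problems << [msg, correctable] }
p valid
p problems.length

# Without auto-correction the first reported problem ends validation.
early = []
valid_early = SampleObject.new.validate(auto_correct: false) {|msg, _| early << msg }
p valid_early
p early.length
```

With auto-correction enabled, both problems reach the block and the result is false because one of them is not correctable. With `auto_correct: false`, validation returns false right after the first reported problem.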
Returns the object value.
  # File lib/hexapdf/object.rb, line 194
  def value
    data.value
  end
Sets the object value. Unlike in initialize the value is used as is!
  # File lib/hexapdf/object.rb, line 199
  def value=(val)
    data.value = val
    after_data_change
  end
Private Instance Methods
This method is called whenever the value or the stream of the wrapped PDFData structure is changed.
A subclass implementing this method has to call super! Otherwise things might not work properly.
  # File lib/hexapdf/object.rb, line 344
  def after_data_change
  end
Returns the configuration object of the PDF document.
  # File lib/hexapdf/object.rb, line 348
  def config
    document.config
  end
Validates the basic object properties.
Implementation Hint for Subclasses
A subclass needs to call the super method so that the validation routines of the superclasses are also performed!
When the validation routine finds that the object is invalid, it has to yield a problem description and whether the problem can be corrected. An optional third argument may contain the object that gets validated if it is different from this object (may happen when auto-correction is used).
After yielding, the problem has to be corrected if it is correctable. If it is not correctable and leaving it uncorrected would lead to exceptions, the method has to return early.
Here is a sample validation routine for a dictionary object type:
  def perform_validation
    super

    if value[:SomeKey].length != 7
      yield("Length of /SomeKey is invalid")
      # No need to return early here because following check doesn't rely on /SomeKey
    end

    if value[:OtherKey] % 2 == 0
      yield("/OtherKey needs to contain an odd number of elements")
    end
  end
  # File lib/hexapdf/object.rb, line 381
  def perform_validation(&block)
    # Validate that the object is indirect if #must_be_indirect? is +true+.
    if must_be_indirect? && !indirect?
      yield("Object must be an indirect object", true)
      document.add(self)
    end
    validate_nested(value, &block)
  end
Validates all nested values of the object, i.e. values inside collection objects.
  # File lib/hexapdf/object.rb, line 392
  def validate_nested(obj, &block)
    if obj.kind_of?(HexaPDF::Object) && !obj.indirect?
      obj.validate(&block)
    elsif obj.kind_of?(Hash)
      obj.each_value {|val| validate_nested(val, &block) }
    elsif obj.kind_of?(Array)
      obj.each {|val| validate_nested(val, &block) }
    end
  end
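The traversal pattern used here, descending into Hash values and Array elements, can be tried on plain data. `each_nested` is a hypothetical helper for this sketch that mirrors only the traversal; instead of validating direct HexaPDF::Object values, it yields every non-collection value it reaches.

```ruby
# each_nested is a hypothetical helper that mirrors the traversal order only.
def each_nested(obj, &block)
  case obj
  when Hash then obj.each_value {|val| each_nested(val, &block) }
  when Array then obj.each {|val| each_nested(val, &block) }
  else yield(obj)
  end
end

visited = []
each_nested({Kids: [1, [2, 3]], Count: 4, Info: {Title: "x"}}) {|leaf| visited << leaf }
p visited
```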