class HexaPDF::Type::FileSpecification
Represents a file specification dictionary.
File specifications are used to refer to other files or URLs from within a PDF file. Simple file specifications are just strings. However, they are automatically converted on access to a full file specification to provide a unified interface.
Working with File Specifications
A file specification may refer to a file or a URL. This can easily be checked with url?. Independent of whether the file specification refers to a URL or a file, the path method returns the “best” usable path for it.
Modifying a file specification should be done via the path= and url= methods, as they ensure that no obsolescent entries are used and that the file specification is consistent.
Finally, since embedded files in a PDF document are always linked to a file specification, it is useful to provide embedding/unembedding operations in this class; see embed and unembed.
See: PDF1.7 s7.11
Public Instance Methods
Embeds the given file or IO stream into the PDF file, sets the path accordingly and returns the created stream object.
If a file is given, the name option defaults to the basename of the file. However, if an IO object is given, the name argument is mandatory.
If there already was a file embedded for this file specification, it is unembedded first.
The embedded file stream automatically uses the FlateDecode filter for compressing the embedded file.
Options:
- name: The name that should be used as path value and when registering.
- register: Specifies whether the embedded file will be added to the EmbeddedFiles name tree under the name. If the name is already taken, its value is overwritten.
The file has to be available until the PDF document gets written because reading and writing is done lazily.
# File lib/hexapdf/type/file_specification.rb, line 184
def embed(file_or_io, name: nil, register: true)
  name ||= File.basename(file_or_io) if file_or_io.kind_of?(String)
  if name.nil?
    raise ArgumentError, "The name argument is mandatory when given an IO object"
  end

  unembed
  self.path = name

  self[:EF] ||= {}
  ef_stream = self[:EF][:UF] = self[:EF][:F] = document.add({Type: :EmbeddedFile})
  stat = if file_or_io.kind_of?(String)
           File.stat(file_or_io)
         elsif file_or_io.respond_to?(:stat)
           file_or_io.stat
         end
  if stat
    ef_stream[:Params] = {Size: stat.size, CreationDate: stat.ctime, ModDate: stat.mtime}
  end
  ef_stream.set_filter(:FlateDecode)
  ef_stream.stream = HexaPDF::StreamData.new(file_or_io)

  if register
    (document.catalog[:Names] ||= {})[:EmbeddedFiles] ||= {}
    document.catalog[:Names][:EmbeddedFiles].add_entry(name, self)
  end

  ef_stream
end
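The name-defaulting rule described for embed can be tried in isolation. The following is a minimal standalone sketch, not part of HexaPDF: resolve_embed_name is a hypothetical helper that only mirrors the first lines of the method shown above, assuming a String argument is a file path and anything else is an IO-like object.

```ruby
require 'stringio'

# Hypothetical helper (not a HexaPDF method): mirrors how embed picks the name.
def resolve_embed_name(file_or_io, name = nil)
  # A String is treated as a file path, so its basename is the default name.
  name ||= File.basename(file_or_io) if file_or_io.kind_of?(String)
  if name.nil?
    raise ArgumentError, "The name argument is mandatory when given an IO object"
  end
  name
end

resolve_embed_name("/tmp/data/report.pdf")           # => "report.pdf"
resolve_embed_name(StringIO.new("abc"), "notes.txt") # => "notes.txt"
```

Calling the helper with an IO object and no name raises ArgumentError, which is the behavior the description above calls "mandatory".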
Returns true if this file specification contains an embedded file.
See: embedded_file_stream
# File lib/hexapdf/type/file_specification.rb, line 143
def embedded_file?
  key?(:EF) && !self[:EF].empty?
end
Returns the embedded file associated with this file specification, or nil if this file specification references no embedded file.
If there are multiple possible embedded files, the /EF fields are searched in the following order and the first one with a value is used: /UF, /F, /Unix, /Mac, /DOS.
# File lib/hexapdf/type/file_specification.rb, line 152
def embedded_file_stream
  return unless key?(:EF)
  ef = self[:EF]
  ef[:UF] || ef[:F] || ef[:Unix] || ef[:Mac] || ef[:DOS]
end
Returns the path for the referenced file or URL. An empty string is returned if no file specification string is set.
If multiple file specification strings are available, the fields are searched in the following order and the first one with a value is used: /UF, /F, /Unix, /Mac, /DOS.
The encoding of the returned path string is either UTF-8 (for /UF) or BINARY (for /F, /Unix, /Mac and /DOS).
# File lib/hexapdf/type/file_specification.rb, line 107
def path
  tmp = (self[:UF] || self[:F] || self[:Unix] || self[:Mac] || self[:DOS] || '').dup
  tmp.gsub!(/\\\//, "/") # PDF1.7 s7.11.2.1 but / in filename is interpreted as separator!
  tmp.tr!("\\", "/") # always use slashes instead of back-slashes!
  tmp
end
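To see the lookup order and the separator normalization at work without a PDF document, the logic can be sketched on a plain Hash. This is an illustration only: best_path and LOOKUP_ORDER are hypothetical names, and the Hash merely stands in for the file specification dictionary.

```ruby
# Hypothetical stand-in for the lookup performed by path; a plain Hash
# replaces the file specification dictionary.
LOOKUP_ORDER = [:UF, :F, :Unix, :Mac, :DOS].freeze

def best_path(fields)
  # Take the first field that has a value, or an empty string if none is set.
  raw = LOOKUP_ORDER.map { |key| fields[key] }.compact.first || ''
  # Unescape "\/" first, then turn every remaining backslash into a slash.
  raw.gsub(/\\\//, "/").tr("\\", "/")
end

best_path({F: "docs\\report.pdf"})     # => "docs/report.pdf"
best_path({UF: "a.txt", F: "b.txt"})   # => "a.txt"
best_path({})                          # => ""
```

The sketch returns a new string instead of mutating a copy in place, which keeps it free of the nil return values of gsub! and tr! when nothing changes.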
Sets the file specification string to the given filename.
Since the /Unix, /Mac and /DOS fields are obsolescent, only the /F and /UF fields are set.
# File lib/hexapdf/type/file_specification.rb, line 117
def path=(filename)
  self[:UF] = self[:F] = filename
  delete(:FS)
  delete(:Unix)
  delete(:Mac)
  delete(:DOS)
end
Deletes any embedded file streams associated with this file specification. A possible entry in the EmbeddedFiles name tree is also deleted.
# File lib/hexapdf/type/file_specification.rb, line 216
def unembed
  return unless key?(:EF)
  self[:EF].each {|_, ef_stream| document.delete(ef_stream) }
  if document.catalog.key?(:Names) && document.catalog[:Names].key?(:EmbeddedFiles)
    tree = document.catalog[:Names][:EmbeddedFiles]
    tree.each_entry.find_all {|_, spec| spec == self }.each do |(name, _)|
      tree.delete_entry(name)
    end
  end
end
Sets the file specification string to the given URL and updates the file system entry appropriately.
The provided URL needs to be in an RFC1738 compliant string representation. If not, an error is raised.
# File lib/hexapdf/type/file_specification.rb, line 130
def url=(url)
  begin
    URI(url)
  rescue URI::InvalidURIError => e
    raise HexaPDF::Error, e
  end
  self.path = url
  self[:FS] = :URL
end
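The validation step relies on Ruby's standard URI() conversion. The sketch below isolates that check; valid_uri_string? is a hypothetical helper, not a HexaPDF method, and it returns a boolean where the method above raises HexaPDF::Error.

```ruby
require 'uri'

# Hypothetical helper: true if Ruby's URI() can parse the string.
def valid_uri_string?(url)
  URI(url)
  true
rescue URI::InvalidURIError
  false
end

valid_uri_string?("https://example.com/file.pdf") # => true
valid_uri_string?("http://example.com/a b")       # => false (unescaped space)
```

Note that URI() also accepts relative references such as "report.pdf", so this check catches malformed strings only; it does not verify that a scheme or host is present.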
Returns true if this file specification references a URL and not a file.
# File lib/hexapdf/type/file_specification.rb, line 95
def url?
  self[:FS] == :URL
end