module Archive::Zip::Entry
The Archive::Zip::Entry mixin provides classes with methods implementing many of the common features of all entry types. Some of these methods, such as dump_local_file_record and dump_central_file_record, are required by Archive::Zip in order to store the entry into an archive. Those should be left alone. Others, such as ftype and mode=, are expected to be overridden to provide sensible information for the new entry type.
A class using this mixin must provide 2 methods: extract and dump_file_data. extract should be a public method with the following signature:
def extract(options = {}) ... end
This method should extract the contents of the entry to the filesystem. options should be an optional Hash containing a mapping of option names to option values. Please refer to Archive::Zip::Entry::File#extract, Archive::Zip::Entry::Symlink#extract, and Archive::Zip::Entry::Directory#extract for examples of the options currently supported.
dump_file_data should be a private method with the following signature:
def dump_file_data(io) ... end
This method should use the write method of io to write all file data. io will be a writable, IO-like object.
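As a minimal sketch of the two required methods, the class below keeps its contents in memory. InMemoryEntry and its :file_path option are hypothetical names chosen for illustration, and the mixin itself is deliberately left out so that the example runs without the library installed; a real entry type would also include Archive::Zip::Entry.

```ruby
require 'stringio'
require 'tmpdir'

# Hypothetical entry type that keeps its contents in memory. A real
# implementation would also `include Archive::Zip::Entry`; the mixin is
# omitted here so the sketch is self-contained.
class InMemoryEntry
  attr_reader :zip_path

  def initialize(zip_path, contents)
    @zip_path = zip_path
    @contents = contents
  end

  # Public: writes the entry's contents to the filesystem.
  # :file_path is an assumed option naming the destination; it defaults to
  # the entry's zip path.
  def extract(options = {})
    file_path = options.fetch(:file_path, @zip_path)
    ::File.binwrite(file_path, @contents)
    true
  end

  private

  # Private: streams all file data to io using only io.write.
  def dump_file_data(io)
    io.write(@contents)
  end
end

entry = InMemoryEntry.new('greeting.txt', "hello\n")
Dir.mktmpdir do |dir|
  entry.extract(:file_path => ::File.join(dir, 'greeting.txt'))
end
```

Keeping dump_file_data private matches the signature described above: only the record-dumping code is expected to call it.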
The class methods from_file and parse are factories for creating the 3 kinds of concrete entries currently implemented: File, Directory, and Symlink. While it is possible to create new archives using custom entry implementations, it is not possible to load those same entries from the archive since the parse factory method does not know about them. Patches to support new entry types are welcome.
Constants
- CFHRecord
- FLAG_DATA_DESCRIPTOR_FOLLOWS
When this flag is set in the general purpose flags, it indicates that the data descriptor record for a local file record is located after the entry's file data.
- FLAG_ENCRYPTED
When this flag is set in the general purpose flags, it indicates that the entry's file data is encrypted using the original (weak) algorithm.
- LFHRecord
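The two flag constants are bit masks over the general purpose flags field. The sketch below shows how such masks are typically tested. The numeric values are an assumption taken from the ZIP specification (bit 0 for encryption, bit 3 for a trailing data descriptor), since this page does not list them, and describe_flags is a hypothetical helper.

```ruby
# Assumed values, following the ZIP specification's bit assignments.
FLAG_ENCRYPTED               = 0b0001
FLAG_DATA_DESCRIPTOR_FOLLOWS = 0b1000

# Hypothetical helper: reports which of the two flags are set.
def describe_flags(general_purpose_flags)
  {
    :encrypted               => (general_purpose_flags & FLAG_ENCRYPTED) != 0,
    :data_descriptor_follows => (general_purpose_flags & FLAG_DATA_DESCRIPTOR_FOLLOWS) != 0
  }
end

flags = FLAG_ENCRYPTED | FLAG_DATA_DESCRIPTOR_FOLLOWS
p describe_flags(flags)
```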
Attributes
- atime
The last accessed time.
- comment
The comment associated with this entry.
- compression_codec
The selected compression codec.
- encryption_codec
The selected encryption codec.
- expected_data_descriptor
An Archive::Zip::DataDescriptor instance which should contain the expected CRC32 checksum, compressed size, and uncompressed size for the file data. When not nil, this is used by extract to confirm that the data extraction was successful.
- gid
The group ID of the owner of this entry.
- mode
The file mode/permission bits for this entry.
- mtime
The last modified time.
- password
The password used with the encryption codec to encrypt or decrypt the file data for an entry.
- raw_data
The raw, possibly compressed and/or encrypted file data for an entry.
- uid
The user ID of the owner of this entry.
- zip_path
The path for this entry in the ZIP archive.
Public Class Methods
Cleans up and returns zip_path by eliminating . and .. references, leading and trailing /'s, and runs of /'s.
# File lib/archive/zip/entry.rb, line 94
def self.expand_path(zip_path)
  result = []
  source = zip_path.split('/')

  source.each do |e|
    next if e.empty? || e == '.'

    if e == '..' && ! (result.last.nil? || result.last == '..') then
      result.pop
    else
      result.push(e)
    end
  end
  result.shift while result.first == '..'

  result.join('/')
end
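To see the normalization rules in action without loading the library, the sketch below copies the same logic into a standalone function. The name expand_zip_path is hypothetical; in the library the method is Archive::Zip::Entry.expand_path.

```ruby
# Standalone copy of the path-cleaning logic, renamed so that it does not
# depend on Archive::Zip being installed.
def expand_zip_path(zip_path)
  result = []
  zip_path.split('/').each do |e|
    # Skip empty segments (from leading or repeated slashes) and '.'.
    next if e.empty? || e == '.'

    if e == '..' && !(result.last.nil? || result.last == '..')
      result.pop       # '..' cancels the previous real segment.
    else
      result.push(e)
    end
  end
  # Any '..' that could not be cancelled is dropped from the front.
  result.shift while result.first == '..'
  result.join('/')
end

puts expand_zip_path('/a/./b//../c/')  # prints "a/c"
puts expand_zip_path('../../x')        # prints "x"
puts expand_zip_path('.').inspect      # prints ""
```

The last case matters in practice: a path that expands to the empty string is what zip_path= rejects with ArgumentError.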
Creates a new Entry based upon a file, symlink, or directory. file_path points to the source item. options is a Hash optionally containing the following:
- :zip_path
The path for the entry in the archive where `/' is the file separator character. This defaults to the basename of file_path if unspecified.
- :follow_symlinks
When set to true (the default), symlinks are treated as the files or directories to which they point.
- :compression_codec
Specifies a proc, lambda, or class. If a proc or lambda is used, it must take a single argument containing a zip entry and return a compression codec class to be instantiated and used with the entry. Otherwise, a compression codec class must be specified directly. When unset, the default compression codec for each entry type is used.
- :encryption_codec
Specifies a proc, lambda, or class. If a proc or lambda is used, it must take a single argument containing a zip entry and return an encryption codec class to be instantiated and used with the entry. Otherwise, an encryption codec class must be specified directly. When unset, the default encryption codec for each entry type is used.
Raises Archive::Zip::EntryError if processing the given file path results in a file not found error.
# File lib/archive/zip/entry.rb, line 136
def self.from_file(file_path, options = {})
  zip_path = options.has_key?(:zip_path) ?
    expand_path(options[:zip_path]) :
    ::File.basename(file_path)
  follow_symlinks = options.has_key?(:follow_symlinks) ?
    options[:follow_symlinks] :
    true

  # Avoid repeatedly stat'ing the file by storing the stat structure once.
  begin
    stat = follow_symlinks ?
      ::File.stat(file_path) :
      ::File.lstat(file_path)
  rescue Errno::ENOENT
    if ::File.symlink?(file_path) then
      raise Zip::EntryError,
        "symlink at `#{file_path}' points to a non-existent file `#{::File.readlink(file_path)}'"
    else
      raise Zip::EntryError, "no such file or directory `#{file_path}'"
    end
  end

  # Ensure that zip paths for directories end with '/'.
  if stat.directory? then
    zip_path += '/'
  end

  # Instantiate the entry.
  if stat.symlink? then
    entry = Entry::Symlink.new(zip_path)
    entry.link_target = ::File.readlink(file_path)
  elsif stat.file? then
    entry = Entry::File.new(zip_path)
    entry.file_path = file_path
  elsif stat.directory? then
    entry = Entry::Directory.new(zip_path)
  else
    raise Zip::EntryError,
      "unsupported file type `#{stat.ftype}' for file `#{file_path}'"
  end

  # Set the compression and encryption codecs.
  unless options[:compression_codec].nil? then
    if options[:compression_codec].kind_of?(Proc) then
      entry.compression_codec = options[:compression_codec][entry].new
    else
      entry.compression_codec = options[:compression_codec].new
    end
  end
  unless options[:encryption_codec].nil? then
    if options[:encryption_codec].kind_of?(Proc) then
      entry.encryption_codec = options[:encryption_codec][entry].new
    else
      entry.encryption_codec = options[:encryption_codec].new
    end
  end

  # Set the entry's metadata.
  entry.uid = stat.uid
  entry.gid = stat.gid
  entry.mtime = stat.mtime
  entry.atime = stat.atime
  entry.mode = stat.mode

  entry
end
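The :compression_codec and :encryption_codec options accept either a class or a proc that picks a class per entry. The sketch below isolates that dispatch. StoreCodec, DeflateCodec, FakeEntry, and instantiate_codec are all hypothetical stand-ins so that the example runs on its own; they are not part of the library.

```ruby
# Hypothetical stand-ins for codec classes and a zip entry.
class StoreCodec; end
class DeflateCodec; end
FakeEntry = Struct.new(:zip_path)

# Mirrors the option handling in from_file: a Proc is called with the entry
# and must return a class; anything else is treated as the class itself.
def instantiate_codec(option, entry)
  return nil if option.nil?

  codec_class = option.kind_of?(Proc) ? option[entry] : option
  codec_class.new
end

# Store already-compressed images, deflate everything else.
by_extension = lambda do |entry|
  entry.zip_path.end_with?('.png') ? StoreCodec : DeflateCodec
end

p instantiate_codec(by_extension, FakeEntry.new('logo.png')).class
p instantiate_codec(by_extension, FakeEntry.new('notes.txt')).class
p instantiate_codec(DeflateCodec, FakeEntry.new('logo.png')).class
```

Because lambdas are Procs in Ruby, the single kind_of?(Proc) check covers both forms named in the option description.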
Creates a new, uninitialized Entry instance using the Store compression method. The zip path is initialized to zip_path. raw_data, if specified, must be a readable, IO-like object containing possibly compressed/encrypted file data for the entry. It is intended to be used primarily by the parse class method.
# File lib/archive/zip/entry.rb, line 473
def initialize(zip_path, raw_data = nil)
  self.zip_path = zip_path
  self.mtime = Time.now
  self.atime = @mtime
  self.uid = nil
  self.gid = nil
  self.mode = 0777
  self.comment = ''
  self.expected_data_descriptor = nil
  self.compression_codec = Zip::Codec::Store.new
  self.encryption_codec = Zip::Codec::NullEncryption.new
  @raw_data = raw_data
  self.password = nil
  @extra_fields = []
end
Creates and returns a new entry object by parsing from the current position of io. io must be a readable, IO-like object which is positioned at the start of a central file record following the signature for that record.
NOTE: For now io MUST be seekable.
Currently, the only entry objects returned are instances of Archive::Zip::Entry::File, Archive::Zip::Entry::Directory, and Archive::Zip::Entry::Symlink. Any other kind of entry will be mapped into an instance of Archive::Zip::Entry::File.
Raises Archive::Zip::EntryError for any other errors related to processing the entry.
# File lib/archive/zip/entry.rb, line 217
def self.parse(io)
  # Parse the central file record and then use the information found there
  # to locate and parse the corresponding local file record.
  cfr = parse_central_file_record(io)
  next_record_position = io.pos
  io.seek(cfr.local_header_position)
  unless IOExtensions.read_exactly(io, 4) == LFH_SIGNATURE then
    raise Zip::EntryError, 'bad local file header signature'
  end
  lfr = parse_local_file_record(io, cfr.compressed_size)

  # Check to ensure that the contents of the central file record and the
  # local file record which are supposed to be duplicated are in fact the
  # same.
  compare_file_records(lfr, cfr)

  begin
    # Load the correct compression codec.
    compression_codec = Codec.create_compression_codec(
      cfr.compression_method,
      cfr.general_purpose_flags
    )
  rescue Zip::Error => e
    raise Zip::EntryError, "`#{cfr.zip_path}': #{e.message}"
  end

  begin
    # Load the correct encryption codec.
    encryption_codec = Codec.create_encryption_codec(
      cfr.general_purpose_flags
    )
  rescue Zip::Error => e
    raise Zip::EntryError, "`#{cfr.zip_path}': #{e.message}"
  end

  # Set up a data descriptor with expected values for later comparison.
  expected_data_descriptor = DataDescriptor.new(
    cfr.crc32,
    cfr.compressed_size,
    cfr.uncompressed_size
  )

  # Create the entry.
  expanded_path = expand_path(cfr.zip_path)
  io_window = IOWindow.new(io, io.pos, cfr.compressed_size)
  if cfr.zip_path[-1..-1] == '/' then
    # This is a directory entry.
    entry = Entry::Directory.new(expanded_path, io_window)
  elsif (cfr.external_file_attributes >> 16) & 0770000 == 0120000 then
    # This is a symlink entry.
    entry = Entry::Symlink.new(expanded_path, io_window)
  else
    # Anything else is a file entry.
    entry = Entry::File.new(expanded_path, io_window)
  end

  # Set the expected data descriptor so that extraction can be verified.
  entry.expected_data_descriptor = expected_data_descriptor
  # Record the compression codec.
  entry.compression_codec = compression_codec
  # Record the encryption codec.
  entry.encryption_codec = encryption_codec
  # Set some entry metadata.
  entry.mtime = cfr.mtime
  # Only set mode bits for the entry if the external file attributes are
  # Unix-compatible.
  if cfr.made_by_version & 0xFF00 == 0x0300 then
    entry.mode = cfr.external_file_attributes >> 16
  end
  entry.comment = cfr.comment
  cfr.extra_fields.each { |ef| entry.add_extra_field(ef) }
  lfr.extra_fields.each { |ef| entry.add_extra_field(ef) }

  # Return to the beginning of the next central directory record.
  io.seek(next_record_position)

  entry
end
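The choice between Directory, Symlink, and File in parse depends only on the stored zip path and the high word of the external file attributes. The sketch below reproduces that decision as a standalone function; entry_kind is a hypothetical name, and the mask and type values are copied from the listing above.

```ruby
# Hypothetical helper mirroring the type selection performed by parse.
def entry_kind(zip_path, external_file_attributes)
  if zip_path.end_with?('/')
    # A trailing slash marks a directory entry.
    :directory
  elsif (external_file_attributes >> 16) & 0770000 == 0120000
    # The Unix mode in the high word carries the symlink file type bits.
    :symlink
  else
    # Anything else is treated as a regular file.
    :file
  end
end

p entry_kind('docs/', 0)                    # prints :directory
p entry_kind('current', 0120777 << 16)      # prints :symlink
p entry_kind('readme.txt', 0100644 << 16)   # prints :file
```

Note that the directory test uses the path as stored in the archive, before expand_path strips the trailing slash.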
Private Class Methods
Compares the local and the central file records found in lfr and cfr respectively. Raises Archive::Zip::EntryError if the comparison fails.
# File lib/archive/zip/entry.rb, line 436
def self.compare_file_records(lfr, cfr)
  # Exclude the extra fields from the comparison since some implementations,
  # such as InfoZip, are known to have differences in the extra fields used
  # in local file records vs. central file records.
  if lfr.zip_path != cfr.zip_path then
    raise Zip::EntryError, "zip path differs between local and central file records: `#{lfr.zip_path}' != `#{cfr.zip_path}'"
  end
  if lfr.extraction_version != cfr.extraction_version then
    raise Zip::EntryError, "`#{cfr.zip_path}': extraction version differs between local and central file records"
  end
  if lfr.crc32 != cfr.crc32 then
    raise Zip::EntryError, "`#{cfr.zip_path}': CRC32 differs between local and central file records"
  end
  if lfr.compressed_size != cfr.compressed_size then
    raise Zip::EntryError, "`#{cfr.zip_path}': compressed size differs between local and central file records"
  end
  if lfr.uncompressed_size != cfr.uncompressed_size then
    raise Zip::EntryError, "`#{cfr.zip_path}': uncompressed size differs between local and central file records"
  end
  if lfr.general_purpose_flags != cfr.general_purpose_flags then
    raise Zip::EntryError, "`#{cfr.zip_path}': general purpose flag differs between local and central file records"
  end
  if lfr.compression_method != cfr.compression_method then
    raise Zip::EntryError, "`#{cfr.zip_path}': compression method differs between local and central file records"
  end
  if lfr.mtime != cfr.mtime then
    raise Zip::EntryError, "`#{cfr.zip_path}': last modified time differs between local and central file records"
  end
end
Parses the extra fields for central file records and returns an array of extra field objects. bytes must be a String containing all of the extra field data to be parsed.
# File lib/archive/zip/entry.rb, line 397
def self.parse_central_extra_fields(bytes)
  BinaryStringIO.open(bytes) do |io|
    extra_fields = []
    while ! io.eof? do
      begin
        header_id, data_size = IOExtensions.read_exactly(io, 4).unpack('vv')
        data = IOExtensions.read_exactly(io, data_size)
      rescue ::EOFError
        raise EntryError, 'insufficient data available'
      end

      extra_fields << ExtraField.parse_central(header_id, data)
    end
    extra_fields
  end
end
Parses a central file record and returns a CFHRecord instance containing the parsed data. io must be a readable, IO-like object which is positioned at the start of a central file record following the signature for that record.
# File lib/archive/zip/entry.rb, line 302
def self.parse_central_file_record(io)
  cfr = CFHRecord.new

  cfr.made_by_version,
  cfr.extraction_version,
  cfr.general_purpose_flags,
  cfr.compression_method,
  dos_mtime,
  cfr.crc32,
  cfr.compressed_size,
  cfr.uncompressed_size,
  file_name_length,
  extra_fields_length,
  comment_length,
  cfr.disk_number_start,
  cfr.internal_file_attributes,
  cfr.external_file_attributes,
  cfr.local_header_position = IOExtensions.read_exactly(
    io,
    42
  ).unpack('vvvvVVVVvvvvvVV')

  cfr.zip_path = IOExtensions.read_exactly(io, file_name_length)
  cfr.extra_fields = parse_central_extra_fields(
    IOExtensions.read_exactly(io, extra_fields_length)
  )
  cfr.comment = IOExtensions.read_exactly(io, comment_length)

  # Convert from MSDOS time to Unix time.
  cfr.mtime = DOSTime.new(dos_mtime).to_time

  cfr
rescue EOFError
  raise Zip::EntryError, 'unexpected end of file'
end
Parses the extra fields for local file records and returns an array of extra field objects. bytes must be a String containing all of the extra field data to be parsed.
# File lib/archive/zip/entry.rb, line 417
def self.parse_local_extra_fields(bytes)
  BinaryStringIO.open(bytes) do |io|
    extra_fields = []
    while ! io.eof? do
      begin
        header_id, data_size = IOExtensions.read_exactly(io, 4).unpack('vv')
        data = IOExtensions.read_exactly(io, data_size)
      rescue ::EOFError
        raise EntryError, 'insufficient data available'
      end

      extra_fields << ExtraField.parse_local(header_id, data)
    end
    extra_fields
  end
end
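Both extra field parsers walk the same wire layout: a 2-byte little-endian header ID, a 2-byte little-endian data size, then that many bytes of data. The sketch below splits a raw extra field block into [header_id, data] pairs using only StringIO. split_extra_fields is a hypothetical helper, and it stops short of building ExtraField objects, which the library does through ExtraField.parse_central and ExtraField.parse_local.

```ruby
require 'stringio'

# Hypothetical helper: splits raw extra field bytes into [header_id, data]
# pairs without interpreting the data.
def split_extra_fields(bytes)
  io = StringIO.new(bytes.b)
  fields = []
  until io.eof?
    header = io.read(4)
    if header.nil? || header.bytesize < 4
      raise EOFError, 'insufficient data available'
    end
    # 'vv' unpacks two 16-bit little-endian unsigned integers.
    header_id, data_size = header.unpack('vv')

    data = io.read(data_size)
    if data.nil? || data.bytesize < data_size
      raise EOFError, 'insufficient data available'
    end
    fields << [header_id, data]
  end
  fields
end

# Two sample fields: one with five bytes of data and one with none.
raw = [0x5455, 5].pack('vv') + "\x03abcd" + [0x7855, 0].pack('vv')
split_extra_fields(raw).each do |header_id, data|
  puts format('0x%04x: %d byte(s)', header_id, data.bytesize)
end
```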
Parses a local file record and returns a LFHRecord instance containing the parsed data. io must be a readable, IO-like object which is positioned at the start of a local file record following the signature for that record.
If the record to be parsed is flagged to have a trailing data descriptor record, expected_compressed_size must be set to an integer counting the number of bytes of compressed data to skip in order to find the trailing data descriptor record, and io must be seekable by providing pos and pos= methods.
# File lib/archive/zip/entry.rb, line 346
def self.parse_local_file_record(io, expected_compressed_size = nil)
  lfr = LFHRecord.new

  lfr.extraction_version,
  lfr.general_purpose_flags,
  lfr.compression_method,
  dos_mtime,
  lfr.crc32,
  lfr.compressed_size,
  lfr.uncompressed_size,
  file_name_length,
  extra_fields_length = IOExtensions.read_exactly(io, 26).unpack('vvvVVVVvv')

  lfr.zip_path = IOExtensions.read_exactly(io, file_name_length)
  lfr.extra_fields = parse_local_extra_fields(
    IOExtensions.read_exactly(io, extra_fields_length)
  )

  # Convert from MSDOS time to Unix time.
  lfr.mtime = DOSTime.new(dos_mtime).to_time

  if lfr.general_purpose_flags & FLAG_DATA_DESCRIPTOR_FOLLOWS > 0 then
    saved_pos = io.pos
    io.pos += expected_compressed_size
    # Because the ZIP specification has a history of murkiness, some
    # libraries create trailing data descriptor records with a preceding
    # signature while others do not.
    # This handles both cases.
    possible_signature = IOExtensions.read_exactly(io, 4)
    if possible_signature == DD_SIGNATURE then
      lfr.crc32,
      lfr.compressed_size,
      lfr.uncompressed_size = IOExtensions.read_exactly(io, 12).unpack('VVV')
    else
      lfr.crc32 = possible_signature.unpack('V')[0]
      lfr.compressed_size,
      lfr.uncompressed_size = IOExtensions.read_exactly(io, 8).unpack('VV')
    end
    io.pos = saved_pos
  end

  lfr
rescue EOFError
  raise Zip::EntryError, 'unexpected end of file'
end
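The trailing data descriptor may or may not start with a signature, so the parser reads 4 bytes and decides afterwards how to interpret them. The sketch below isolates that step. read_data_descriptor is a hypothetical helper, and the signature bytes are an assumption taken from the ZIP specification (0x08074b50 stored little-endian), since the value of DD_SIGNATURE is not shown on this page.

```ruby
require 'stringio'

# Assumed value: the optional data descriptor signature from the ZIP
# specification, as it appears on disk.
DD_SIGNATURE = "PK\x07\x08".b

# Hypothetical helper: returns [crc32, compressed_size, uncompressed_size]
# from io, which must be positioned at the start of the data descriptor.
def read_data_descriptor(io)
  first = io.read(4)
  if first == DD_SIGNATURE
    # Signature present: the three 32-bit fields follow it.
    io.read(12).unpack('VVV')
  else
    # No signature: the 4 bytes already read are the CRC32.
    [first.unpack('V')[0], *io.read(8).unpack('VV')]
  end
end

fields = [0xDEADBEEF, 120, 300].pack('VVV')
p read_data_descriptor(StringIO.new(DD_SIGNATURE + fields))
p read_data_descriptor(StringIO.new(fields))
```

Both calls yield the same three values. The heuristic cannot distinguish an unsigned descriptor whose CRC32 happens to equal the signature, which is an inherent ambiguity of the format rather than of this sketch.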
Public Instance Methods
Adds extra_field as an extra field specification to both the central file record and the local file record of this entry.
If extra_field is an instance of Archive::Zip::Entry::ExtraField::ExtendedTimestamp, the values of that field are used to set mtime and atime for this entry. If extra_field is an instance of Archive::Zip::Entry::ExtraField::Unix, the values of that field are used to set mtime, atime, uid, and gid for this entry.
# File lib/archive/zip/entry.rb, line 558
def add_extra_field(extra_field)
  # Try to find an extra field with the same header ID already in the list
  # and merge the new one with that if one exists; otherwise, add the new
  # one to the list.
  existing_extra_field = @extra_fields.find do |ef|
    ef.header_id == extra_field.header_id
  end
  if existing_extra_field.nil? then
    @extra_fields << extra_field
  else
    extra_field = existing_extra_field.merge(extra_field)
  end

  # Set some attributes of this entry based on the settings in select types
  # of extra fields.
  if extra_field.kind_of?(ExtraField::ExtendedTimestamp) then
    self.mtime = extra_field.mtime unless extra_field.mtime.nil?
    self.atime = extra_field.atime unless extra_field.atime.nil?
  elsif extra_field.kind_of?(ExtraField::Unix) then
    self.mtime = extra_field.mtime unless extra_field.mtime.nil?
    self.atime = extra_field.atime unless extra_field.atime.nil?
    self.uid = extra_field.uid unless extra_field.uid.nil?
    self.gid = extra_field.gid unless extra_field.gid.nil?
  end
  self
end
Returns false.
# File lib/archive/zip/entry.rb, line 546
def directory?
  false
end
Writes the central file record for this entry to io, a writable, IO-like object which provides a write method. Returns the number of bytes written.
NOTE: This method should only be called by Archive::Zip.
# File lib/archive/zip/entry.rb, line 694
def dump_central_file_record(io)
  bytes_written = 0

  # Assume that no trailing data descriptor will be necessary.
  need_trailing_data_descriptor = false
  begin
    io.pos
  rescue Errno::ESPIPE
    # A trailing data descriptor is required for non-seekable IO.
    need_trailing_data_descriptor = true
  end
  if encryption_codec.class == Codec::TraditionalEncryption then
    # HACK:
    # According to the ZIP specification, a trailing data descriptor should
    # only be required when writing to non-seekable IO, but InfoZIP
    # *always* does this when using traditional encryption even though it
    # will also write the data descriptor in the usual place if possible.
    # Failure to emulate InfoZIP in this behavior will prevent InfoZIP
    # compatibility with traditionally encrypted entries.
    need_trailing_data_descriptor = true
  end

  # Set the general purpose flags.
  general_purpose_flags = compression_codec.general_purpose_flags
  general_purpose_flags |= encryption_codec.general_purpose_flags
  if need_trailing_data_descriptor then
    general_purpose_flags |= FLAG_DATA_DESCRIPTOR_FOLLOWS
  end

  # Select the minimum ZIP specification version needed to extract this
  # entry.
  version_needed_to_extract = compression_codec.version_needed_to_extract
  if encryption_codec.version_needed_to_extract > version_needed_to_extract then
    version_needed_to_extract = encryption_codec.version_needed_to_extract
  end

  # Write the data.
  bytes_written += io.write(CFH_SIGNATURE)
  bytes_written += io.write(
    [
      version_made_by,
      version_needed_to_extract,
      general_purpose_flags,
      compression_codec.compression_method,
      mtime.to_dos_time.to_i
    ].pack('vvvvV')
  )
  bytes_written += @data_descriptor.dump(io)
  extra_field_data = central_extra_field_data
  bytes_written += io.write(
    [
      zip_path.bytesize,
      extra_field_data.length,
      comment.length,
      0,
      internal_file_attributes,
      external_file_attributes,
      @local_file_record_position
    ].pack('vvvvvVV')
  )
  bytes_written += io.write(zip_path)
  bytes_written += io.write(extra_field_data)
  bytes_written += io.write(comment)

  bytes_written
end
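Both record-dumping methods decide whether a trailing data descriptor is needed by probing io.pos and treating Errno::ESPIPE as a sign of non-seekable IO. The sketch below shows that probe on its own; needs_trailing_data_descriptor? is a hypothetical helper, and the pipe behavior shown assumes a POSIX-like platform.

```ruby
require 'stringio'

# Hypothetical helper: true when io cannot report a position, which is the
# case the listing above handles by setting FLAG_DATA_DESCRIPTOR_FOLLOWS.
def needs_trailing_data_descriptor?(io)
  io.pos
  false
rescue Errno::ESPIPE
  true
end

reader, writer = IO.pipe
begin
  p needs_trailing_data_descriptor?(writer)        # a pipe is not seekable
  p needs_trailing_data_descriptor?(StringIO.new)  # an in-memory buffer is
ensure
  reader.close
  writer.close
end
```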
Writes the local file record for this entry to io, a writable, IO-like object which provides a write method. local_file_record_position is the offset within io at which writing will begin. This is used so that when writing to a non-seekable IO object it is possible to avoid calling the pos method of io. Returns the number of bytes written.
NOTE: This method should only be called by Archive::Zip.
# File lib/archive/zip/entry.rb, line 592
def dump_local_file_record(io, local_file_record_position)
  @local_file_record_position = local_file_record_position
  bytes_written = 0

  # Assume that no trailing data descriptor will be necessary.
  need_trailing_data_descriptor = false
  begin
    io.pos
  rescue Errno::ESPIPE
    # A trailing data descriptor is required for non-seekable IO.
    need_trailing_data_descriptor = true
  end
  if encryption_codec.class == Codec::TraditionalEncryption then
    # HACK:
    # According to the ZIP specification, a trailing data descriptor should
    # only be required when writing to non-seekable IO, but InfoZIP
    # *always* does this when using traditional encryption even though it
    # will also write the data descriptor in the usual place if possible.
    # Failure to emulate InfoZIP in this behavior will prevent InfoZIP
    # compatibility with traditionally encrypted entries.
    need_trailing_data_descriptor = true
    # HACK:
    # The InfoZIP implementation of traditional encryption requires that
    # the last modified file time be used as part of the encryption header.
    # This is a deviation from the ZIP specification.
    encryption_codec.mtime = mtime
  end

  # Set the general purpose flags.
  general_purpose_flags = compression_codec.general_purpose_flags
  general_purpose_flags |= encryption_codec.general_purpose_flags
  if need_trailing_data_descriptor then
    general_purpose_flags |= FLAG_DATA_DESCRIPTOR_FOLLOWS
  end

  # Select the minimum ZIP specification version needed to extract this
  # entry.
  version_needed_to_extract = compression_codec.version_needed_to_extract
  if encryption_codec.version_needed_to_extract > version_needed_to_extract then
    version_needed_to_extract = encryption_codec.version_needed_to_extract
  end

  # Write the data.
  bytes_written += io.write(LFH_SIGNATURE)
  extra_field_data = local_extra_field_data
  bytes_written += io.write(
    [
      version_needed_to_extract,
      general_purpose_flags,
      compression_codec.compression_method,
      mtime.to_dos_time.to_i,
      0,
      0,
      0,
      zip_path.bytesize,
      extra_field_data.length
    ].pack('vvvVVVVvv')
  )
  bytes_written += io.write(zip_path)
  bytes_written += io.write(extra_field_data)

  # Pipeline a compressor into an encryptor, write all the file data to the
  # compressor, and get a data descriptor from it.
  encryption_codec.encryptor(io, password) do |e|
    compression_codec.compressor(e) do |c|
      dump_file_data(c)
      c.close(false)
      @data_descriptor = DataDescriptor.new(
        c.data_descriptor.crc32,
        c.data_descriptor.compressed_size + encryption_codec.header_size,
        c.data_descriptor.uncompressed_size
      )
    end
    e.close(false)
  end
  bytes_written += @data_descriptor.compressed_size

  # Write the trailing data descriptor if necessary.
  if need_trailing_data_descriptor then
    bytes_written += io.write(DD_SIGNATURE)
    bytes_written += @data_descriptor.dump(io)
  end

  begin
    # Update the data descriptor located before the compressed data for the
    # entry.
    saved_position = io.pos
    io.pos = @local_file_record_position + 14
    @data_descriptor.dump(io)
    io.pos = saved_position
  rescue Errno::ESPIPE
    # Ignore a failed attempt to update the data descriptor.
  end

  bytes_written
end
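The offset of 14 used when patching the data descriptor follows from the header layout: 4 bytes of signature, then three 16-bit fields and one 32-bit time field (4 + 2 + 2 + 2 + 4 = 14). The sketch below writes a simplified local file record for uncompressed data and patches the descriptor in place the same way. write_stored_record is a hypothetical helper; the signature bytes and the version value 10 are assumptions taken from the ZIP specification, and the zero timestamp is a placeholder.

```ruby
require 'stringio'
require 'zlib'

# Assumed value: the local file header signature from the ZIP specification.
LFH_SIGNATURE = "PK\x03\x04".b

# Hypothetical helper: writes a local file record for stored (uncompressed)
# data, then seeks back to patch in the CRC32 and sizes.
def write_stored_record(io, zip_path, data)
  start = io.pos
  io.write(LFH_SIGNATURE)
  io.write(
    [
      10,                 # version needed to extract (assumed 1.0 for Store)
      0,                  # general purpose flags
      0,                  # compression method: Store
      0,                  # DOS timestamp placeholder
      0, 0, 0,            # CRC32 and sizes, patched below
      zip_path.bytesize,
      0                   # no extra fields
    ].pack('vvvVVVVvv')
  )
  io.write(zip_path)
  io.write(data)

  # The descriptor fields start 14 bytes into the record.
  finish = io.pos
  io.pos = start + 14
  io.write([Zlib.crc32(data), data.bytesize, data.bytesize].pack('VVV'))
  io.pos = finish

  finish - start
end

io = StringIO.new(''.b)
written = write_stored_record(io, 'a.txt', 'hello')
crc32, compressed_size, uncompressed_size = io.string[14, 12].unpack('VVV')
puts "#{written} bytes, sizes #{compressed_size}/#{uncompressed_size}"
```

On non-seekable IO this patch step is impossible, which is why the listing above falls back to a trailing data descriptor in that case.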
Returns false.
# File lib/archive/zip/entry.rb, line 536
def file?
  false
end
Returns the file type of this entry as the symbol :unknown.
Override this in concrete subclasses to return an appropriate symbol.
# File lib/archive/zip/entry.rb, line 531
def ftype
  :unknown
end
Returns false.
# File lib/archive/zip/entry.rb, line 541
def symlink?
  false
end
Sets the path in the archive for this entry to zip_path after passing it through Archive::Zip::Entry.expand_path and ensuring that the result is not empty.
# File lib/archive/zip/entry.rb, line 521
def zip_path=(zip_path)
  @zip_path = Archive::Zip::Entry.expand_path(zip_path)
  if @zip_path.empty? then
    raise ArgumentError, "zip path expands to empty string"
  end
end
Private Instance Methods
# File lib/archive/zip/entry.rb, line 767
def central_extra_field_data
  @central_extra_field_data = @extra_fields.collect do |extra_field|
    extra_field.dump_central
  end.join
end
# File lib/archive/zip/entry.rb, line 773
def dummy
  # Add fields for time data if available.
  unless mtime.nil? && atime.nil? then
    @central_extra_field_data +=
      ExtraField::ExtendedTimestamp.new(mtime, atime, nil).dump_central
  end

  # Add fields for user and group ownerships if available.
  unless uid.nil? || gid.nil? || mtime.nil? || atime.nil? then
    @central_extra_field_data += ExtraField::Unix.new(
      mtime, atime, uid, gid
    ).dump_central
  end
end
# File lib/archive/zip/entry.rb, line 798
def external_file_attributes
  # Put Unix attributes into the high word and DOS attributes into the low
  # word.
  (mode << 16) + (directory? ? 0x10 : 0)
end
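The external file attributes pack two things into one 32-bit value: the Unix mode in the high word and a DOS attribute byte in the low word, where 0x10 marks a directory. The sketch below reproduces that packing as a standalone function and reverses it the way parse does, by shifting right 16 bits. pack_external_attributes is a hypothetical name.

```ruby
# Hypothetical standalone version of the attribute packing shown above.
def pack_external_attributes(mode, directory)
  # Unix mode goes in the high word, the DOS directory bit in the low word.
  (mode << 16) + (directory ? 0x10 : 0)
end

attrs = pack_external_attributes(040755, true)
puts format('mode: %o', attrs >> 16)             # prints "mode: 40755"
puts format('dos:  0x%02x', attrs & 0xFFFF)      # prints "dos:  0x10"
```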
# File lib/archive/zip/entry.rb, line 794
def internal_file_attributes
  0x0000
end
# File lib/archive/zip/entry.rb, line 788
def local_extra_field_data
  @local_extra_field_data = @extra_fields.collect do |extra_field|
    extra_field.dump_local
  end.join
end
# File lib/archive/zip/entry.rb, line 763
def version_made_by
  0x0314
end